I am an assistant professor (UD) at the Web & Media group at the Computer Science department of the Vrije Universiteit Amsterdam (VU). I am also a senior research fellow at Netherlands Institute for Sound and Vision. In my research, I combine (Semantic) Web technologies with Human-Computer Interaction, Knowledge Representation and Information Extraction to tackle research challenges in various domains. These include Cultural Heritage, Digital Humanities and ICT for Development (ICT4D). More information on these projects can be found on this site or through my CV .
On 21 and 22 March, researchers from VU’s Web and Media group attended ICT.OPEN, the principal ICT research conference in the Netherlands. Here over 500 scientists from all ICT research disciplines & interested researchers from industry come together to learn from each other, share ideas and network. The conference featured some great keynote speeches, including one from Nissan’s Erik Vinkhuyzen on the role of anthropological and sociological research to develop better self-driving cars. Barbara Terhal from Aachen University gave a challenging, but well-presented talk on the challenges regarding robustness for quantum computing.
As last year, the Web and Media group this year was well represented through multiple oral presentations with accompanying posters and demonstrations :
- Oana Inel, Carlos Martinez and Victor de Boer presented DIVEplus. Oana did such a good job presenting the project in the main programme (see Oana’s DIVE+@ICTOpen2017 slides), through the demo and in front of a poster that the poster was selected as best Poster in the SIKS track.
- Benjamin Timmermans, Tobias Kuhn and Tibor Vermeij presented the Controcurator project with a demonstration and poster presentation. In the demo the ControCurator human-machine framework for identifying controversy in multimodal data is shown.
- Tobias Kuhn discussed “Genuine Semantic Publishing” in the Computer Science track on the first day. His slides can be found here. After the talk there was a very interesting discussion about the role of the narrative writing process and how it would relate to semantic publishing.
- Ronald Siebes and Victor de Boer then discussed how Big and Linked Data technologies developed in the Big Data Europe project are used to deliver pharmacological web-services for drug discovery. You can read more in Ronald’s blog post.
- Benjamin Timmermans and Zoltan Zslavik also presented the CrowdTruth demonstrator, which is shown in this short demonstrator video.
- Sabrina Sauer presented the MediaNow project with a nice poster titled MediaNow – using a living lab method to understand media professionals’ exploratory search.
— Victor de Boer (@victordeboer) March 21, 2017
Last week, BigDataEurope was present at the principal ICT research conference in the Netherlands, ICT.OPEN, where over 500 scientists from all ICT research disciplines & interested researchers from industry come together to learn from each other, share ideas and network.
This is the first time that the NWO, the Netherlands Organisation for Scientific Research, added the “Health” track, a recognition of the increased importance of ICT in the domain of diagnosis, drug discovery and health-care. We presented a short paper written by Ronald Siebes, Victor de Boer, Bryn Williams-Jones, Kiera McNeice and Stian Soiland-Reyes covering the current state of the SC1 “Health, demographic change and well-being” pilot which implements the Open PHACTS functionality on the Big Data Europe infrastructure.
We succeeded to demonstrate the ease of use and practical value of the SC1 pilot for researchers in the domain of Drug Discovery and developers of Big Linked Data solutions and are looking forward to further strengthen our collaboration with the various. The paper was accepted as a poster presentation but also selected for an oral presentation at the “Health & ICT” track.
For those curious about the Big Data Europe technology stack and who rather view videos than read descriptions and documentation, we have started a youtube video channel where BDE researchers explain the how, why and what of the BDE stack. Embedded below is a short clip of Hajira Jabeen explaining how BDE enables someone to get started with Big Data. More clips are available on the channel.
As a teaser for our upcoming ICT4D students. Have a look at this nice video that André Baart made
The Netherlands Institute for Sound and Vision (NISV) archives Dutch broadcast TV and makes it available to researchers, professionals and the general public. One subset are the Polygoonjournaals (Public News broadcasts) that are published under open licenses as part of the OpenImages platform. NISV is also interested in exploring new ways and technologies to make interaction with the material easier and to increase exposure to their archives. In this context, Rudy explored two options.
One part of the research was the autonomous colorization of old black-and-white video footage using Neural Networks. Rudy used a pre-trained NN (Zhang et al 2016) that is able to colorize black and white images. Rudy developed a program to split videos into frames, colorize the individual frames using the NN and then ‘stitch’ them back together into colorized videos. The stunning results were very well received by NISV employees. Examples are shown below.
In the other part of his research, Rudy investigated to what extent the existing news broadcast corpus, with a voice-overs from the famous Philip Bloemendal can be used to develop a modern text-to-speech engine with his voice. To do so he have mainly focused on natural language processing and the determination to what extent the language used by Bloemendal in the 1970s is still comparable enough to contemporary Dutch.
Rudy used precompiled automatic speech recognition (ASR) results to match words to sounds and developed a slot-and-filler text-to-speech system based on this. To increase the limited vocabulary, he implemented a number of strategies, including term-expansion through the use of Open Dutch Wordnet and smart decompounding (this mostly works for Dutch, mapping ‘sinterklaasoptocht’ to ‘sinterklaas’ and ‘optocht’. The different strategies were compared to a baseline. Rudy found that a combination of the two resulted in the best performance (see figure). For more information:
On 9 December 2016, the second workshop for the Big Data Europe Health, Demographic Change and Wellbeing societal challenge was held in Brussels. The aim of this workshop was to highlight progress from the BigDataEurope project in building the foundations of a generically applicable big data platform which can be applied across all Horizon 2020 societal challenges. This workshop specifically focused on health, and showcased our first pilot’s application to early bioscience research data.
The workshop had 15 participants, from within the health domain and outside it, including many participants from the European Commission. Together we discussed different perspectives on how we may use appropriate H2020 instruments and work programmes to better integrate the ecosystem of linked data repositories, data management services and virtual collaboration environments to increase the pace of knowledge sharing in health.
The workshop featured presentations from BDE’s Simon Scerri and Aad Versteden on the general goals and progress of the BigDataEurope project and the BDE infrastructure respectively. After lunch, Ronald Siebes (BDE / VU Amsterdam) presented the first pilot in this specific domain. More information on that pilot can be found here. An extensive round-table discussion followed, in which possible options for new applications and connections were considered.
One question raised was whether the generic BDE infrastructure can be used by European SMEs. The fact that the BDE infrastructure is completely Open Source, very easy to install and features intuitive interface components makes re-use relatively simple even for smaller institutions and companies.
A significant part of the discussion focussed on possible new use cases for expanding the scope of the pilot. One suggestion was to look at post-hoc integration of clinical data, which represents a typical problem of data ‘variance’. This would require integrating information from different versions of medical questionnaires, which may be recorded or stored in different ways. Data provenance is also a key concern, as keeping a trail of what has happened to clinical data is crucial to tracking patients’ histories. Once integrated, this data could then be mined to identify biases or data patterns.
Finally, the workshop participants discussed potential connections to other European projects. Here many projects were mentioned including the MIDAS project, the Big-O project on childhood obesity, the PULSE projects and IMI / IMI2 projects including EMIF. We will be seeking collaborations with these projects and will continue to develop new and interesting Big Data use cases in this domain in the coming year.
For its 10th anniversary, the Web Science Trust organized an event Webscience@10. For this event, a Webscience@10 TV channel was launched to showcase different research and education initatives around the world. On behalf of the VU Network Institute and W4RA, we submitted our Web of Voices video as well as a short introduction to the W4RA team.
Niels Ockeloen’s paper on Data2Documents was awarded the first Bob Wielinga memorial award for best research paper at the 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW2016). “Data 2 Documents: Modular and distributive content management in RDF” was authored by Niels Ockeloen, Victor de Boer, Tobias Kuhn and Guus Schreiber from the Web and Media group.. The paper describes Niels’ PhD. work on a method for creating human readable web documents out of machine readable Linked Data, focussing on modularity and re-use. You can view the slides for Niels’ presentation slides here.
In modern day research, dissemination is key and it is therefore always nice to see research results being shared with the public in new and unforseen ways. Our work within the Web for Regreening in Africa (W4RA) is now part of a exhibition in the Museon museum in the Hague. The exhibition focuses on the United Nations’ 17 Sustainable Development Goals (SDGs) for 2030. As content partner of the Museon, the W4RA programme of VU has contributed ideas, visuals and texts for the exposition related to SDG No. 15, entitled “Plant in het Zand” (Plant in the sand).
From the press release: Land degradation and desertification are increasing due to both natural and human causes, including climate change and population pressures. Areas can no longer meet the needs of their populations, with famine and poverty as a result. There are various solutions, but regreening – the natural (re)generation and protection of trees by local farmers themselves – is a highly successful one. Belts of trees act as windbreaks, helping to stop soil blowing away, keeping it moist for longer, and providing a micro-climate that is better for people, animals and plants. Trees also provide food and many other economically useful products.
Within the W4RA programme, we integrate local ICT web and mobile app innovations to support local knowledge sharing around regreening efforts.