A Look Back at the 2nd BDE Workshop on Big Data in Health, Demographic Change and Wellbeing

[reblogged from Big-Data-Europe.eu]

On 9 December 2016, the second workshop for the Big Data Europe Health, Demographic Change and Wellbeing societal challenge was held in Brussels. The aim of this workshop was to highlight progress from the BigDataEurope project in building the foundations of a generically applicable big data platform which can be applied across all Horizon 2020 societal challenges. This workshop specifically focused on health, and showcased our first pilot’s application to early bioscience research data.

The workshop in full effect

The workshop had 15 participants, from within the health domain and outside it, including many participants from the European Commission. Together we discussed different perspectives on how we may use appropriate H2020 instruments and work programmes to better integrate the ecosystem of linked data repositories, data management services and virtual collaboration environments to increase the pace of knowledge sharing in health.

The workshop featured presentations from BDE’s Simon Scerri and Aad Versteden on the general goals and progress of the BigDataEurope project and the BDE infrastructure respectively. After lunch, Ronald Siebes (BDE / VU Amsterdam) presented the first pilot in this specific domain. More information on that pilot can be found here. An extensive round-table discussion followed, in which possible options for new applications and connections were considered.

Snapshot of the SC1 pilot interface, as presented by Ronald Siebes

One question raised was whether the generic BDE infrastructure can be used by European SMEs. The fact that the BDE infrastructure is completely Open Source, very easy to install and features intuitive interface components makes re-use relatively simple even for smaller institutions and companies.

A significant part of the discussion focussed on possible new use cases for expanding the scope of the pilot. One suggestion was to look at post-hoc integration of clinical data, which represents a typical problem of data ‘variance’. This would require integrating information from different versions of medical questionnaires, which may be recorded or stored in different ways. Data provenance is also a key concern, as keeping a trail of what has happened to clinical data is crucial to tracking patients’ histories. Once integrated, this data could then be mined to identify biases or data patterns.

Finally, the workshop participants discussed potential connections to other European projects. Here many projects were mentioned including the MIDAS project, the Big-O project on childhood obesity, the PULSE projects and IMI / IMI2 projects including EMIF. We will be seeking collaborations with these projects and will continue to develop new and interesting Big Data use cases in this domain in the coming year.

More images can be found below: BDE Health Workshop SC 1.2

Web of Voices and W4RA video at the Webscience@10 TV Channel

For its 10th anniversary, the Web Science Trust organized an event, Webscience@10. For this event, a Webscience@10 TV channel was launched to showcase different research and education initiatives around the world. On behalf of the VU Network Institute and W4RA, we submitted our Web of Voices video as well as a short introduction to the W4RA team.

You can watch the ~10 hours of video content at http://www.webscience.org/webscience10/tv-channel-webscience10/. You can find us (listed under Netwerk Institute Amsterdam) at 2h31mins.

Installing and Running the First Big-Data-Europe Health Pilot

[This blog post is reblogged from big-data-europe.eu and written by Ronald Siebes and Victor de Boer]

As previously announced, the pilot implementation for the Big-Data-Europe platform for Societal Challenge 1 (the Health domain) replicates the functionality of the Open PHACTS Discovery Platform. The Open PHACTS platform is built for researchers in drug discovery. It uses databases of physicochemical and pharmacological properties stored in an RDF triple store. This interconnected, interoperable data is exposed through a Linked Data API, and the system caches query results via a Memcached module. In the context of the SC1 pilot, most of the platform's functionality has now been successfully replicated via Docker containers on the BDE infrastructure.
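To give a feel for how an API layer with a query cache fits together, here is a minimal Python sketch of the cache-aside pattern. It is an illustration only, not the Open PHACTS code: the class name, the `/compound` endpoint, the example URI and the `fake_triple_store` function are all hypothetical, and a plain dictionary stands in for Memcached so the example runs without any services.

```python
import hashlib
import json


class CachedLinkedDataAPI:
    """Cache-aside wrapper: check the cache first, query the store on a miss."""

    def __init__(self, run_query, cache=None):
        # run_query is any callable that answers a request from the triple store.
        self._run_query = run_query
        # A dict stands in for Memcached here (assumption for the sketch).
        self._cache = {} if cache is None else cache
        self.misses = 0

    @staticmethod
    def _key(endpoint, params):
        # Same endpoint and parameters always produce the same cache key.
        raw = endpoint + "?" + json.dumps(params, sort_keys=True)
        return hashlib.sha1(raw.encode("utf-8")).hexdigest()

    def get(self, endpoint, params):
        key = self._key(endpoint, params)
        if key in self._cache:
            return self._cache[key]
        # Cache miss: run the (expensive) query and remember the result.
        self.misses += 1
        result = self._run_query(endpoint, params)
        self._cache[key] = result
        return result


def fake_triple_store(endpoint, params):
    # Hypothetical placeholder for a SPARQL query against the RDF store.
    return {"endpoint": endpoint, "params": params, "results": ["example-binding"]}


api = CachedLinkedDataAPI(fake_triple_store)
first = api.get("/compound", {"uri": "http://example.org/compound/1"})
second = api.get("/compound", {"uri": "http://example.org/compound/1"})
```

The second call is answered from the cache, so the store is queried only once. In a real deployment the dictionary would be replaced by a client for the caching service.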

The Open PHACTS platform architecture

Please do try this at home! The pilot can be installed on Linux (through Docker Compose) or Windows (through Docker Toolbox). Installation instructions are available on the pilot’s GitHub page. By design, the technology itself is independent of the domain. Once you are familiar with the code and have it running yourself, you should have enough experience to upload your own Linked Data and create your own API.

A look back at Downscale2016

On 29 August, the 4th International Workshop on Downscaling the Semantic Web (Downscale2016) was held as a full-day workshop in Amsterdam, co-located with the ICT4S conference. The workshop attracted 12 participants and we received 4 invited paper contributions, which were presented and discussed in the morning session (slides can be found below). These papers describe issues regarding the sustainability of ICT4D approaches, specific downscaled solutions for two ICT4D use cases and a system for distributed publishing and consuming of Linked Data. The afternoon session was reserved for demonstrations and discussions. An introduction to the Kasadaka platform was followed by an in-depth how-to on developing voice-based information services using Linked Data. The papers and the descriptions of the demos are gathered in the proceedings (published online at figshare: doi:10.6084/m9.figshare.3827052.v1).

Downscale2016 participants (photo: Kim Bosman)

During the discussions, the issue of sustainability was addressed. Different dimensions of sustainability were discussed (technical, economic, social and environmental). The participants agreed that a holistic approach is needed for successful and sustainable ICT4D and that most of these dimensions were indeed present in the four presentations and the design of the Kasadaka platform. There remains a question of how different architectural solutions for services (centralized, decentralized, cloud services) relate to each other in terms of sustainability, and when each of them is the most suitable choice. Discussion then moved towards different technical opportunities for green power supplies, including solar panels.

The main presentations and slides are listed below:

  • Downscale2016 introduction (Victor and Anna) (slides)
  • Jari Ferguson and Kim Bosman. The Kasadaka Weather Forecast Service (slides)
  • Aske Robenhagen and Bart Aulbers. The Mali Milk Service – a voice based platform for enabling farmer networking and connections with buyers. (slides)
  • Anna Bon, Jaap Gordijn et al. A Structured Model-Based Approach To Preview Sustainability in ICT4D (slides)
  • Mihai Gramada and Christophe Gueret. Low profile data sharing with the Entity Registry System (ERS) (slides)

MSc project: Low-Bandwidth Semantic Web

[This post is based on the Information Sciences MSc. thesis by Onno Valkering]

To make widespread knowledge sharing possible in rural areas in developing countries, the notion of the Web has to be downscaled based on the specific low-resource infrastructure in place. In this paper, we introduce SPARQL over SMS, a solution for exchanging RDF data in which HTTP is substituted by SMS to enable Web-like exchange of data over cellular networks.

SPARQL over SMS architecture

The solution uses converters that take outgoing SPARQL queries sent over HTTP and convert them into SMS messages sent to phone numbers (see architecture image). On the receiver-side, the messages are converted back to standard SPARQL requests.

The converters use various data compression strategies to ensure optimal use of the SMS bandwidth. These include both zip-based compression and the removal of redundant data through the use of common background vocabularies. The thesis presents the design and implementation of the solution, along with evaluations of the different data compression methods.
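The conversion steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code from the thesis: the 160-character message size, the `NN/NN:` header format, the `§` tokens and the two-entry `SHARED_VOCAB` table are all made up for the example. It shortens well-known namespace IRIs using a vocabulary that both sides share, compresses the result with zlib, encodes it as text, and splits it into numbered SMS-sized messages that the receiver can reassemble in any order.

```python
import base64
import zlib

SMS_SIZE = 160                   # assumed characters available per message
HEADER_SIZE = 6                  # "01/03:" style sequence header
CHUNK = SMS_SIZE - HEADER_SIZE

# Background vocabulary known to sender and receiver (hypothetical token scheme).
SHARED_VOCAB = {
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#": "§0",
    "http://www.w3.org/2000/01/rdf-schema#": "§1",
}


def shorten(text):
    # Replace long, well-known IRIs by short tokens before compressing.
    for iri, token in SHARED_VOCAB.items():
        text = text.replace(iri, token)
    return text


def expand(text):
    # Inverse of shorten(): restore the full IRIs on the receiving side.
    for iri, token in SHARED_VOCAB.items():
        text = text.replace(token, iri)
    return text


def to_sms(query):
    """Turn a SPARQL query string into a list of numbered SMS bodies."""
    packed = zlib.compress(shorten(query).encode("utf-8"), 9)
    text = base64.b64encode(packed).decode("ascii")
    chunks = [text[i:i + CHUNK] for i in range(0, len(text), CHUNK)]
    total = len(chunks)
    return [f"{n:02d}/{total:02d}:{chunk}" for n, chunk in enumerate(chunks, start=1)]


def from_sms(messages):
    """Reassemble the original query from SMS bodies, in any arrival order."""
    parts = {}
    for message in messages:
        header, chunk = message.split(":", 1)
        parts[int(header.split("/")[0])] = chunk
    text = "".join(parts[i] for i in sorted(parts))
    return expand(zlib.decompress(base64.b64decode(text)).decode("utf-8"))


query = (
    "SELECT ?s ?label WHERE { "
    "?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?t . "
    "?s <http://www.w3.org/2000/01/rdf-schema#label> ?label }"
)
messages = to_sms(query)
restored = from_sms(list(reversed(messages)))
```

The sequence header is what lets the receiver cope with messages arriving out of order. A production converter would also need to handle lost messages and queries that contain the token characters themselves.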

Test setup with two Kasadakas

The application is validated in two real-world ICT for Development (ICT4D) cases that both use the Kasadaka platform: 1) An extension of the DigiVet application allows sending information related to veterinary symptoms and diagnoses across different distributed systems. 2) An extension of the RadioMarche application involves retrieving and adding current offerings in the market information system, including the phone numbers of the advertisers.

For more information:

  • Download Onno’s Thesis. A version of the thesis is currently under review.
  • The slides for Onno’s presentation are also available: Onno Valkering
  • View the application code at https://github.com/onnovalkering/sparql-over-sms


Connecting collections across national borders

Items from two collections shown side-by-side

As audiovisual archives are digitizing their collections and making these collections available online, the need arises to also establish connections between different collections and to allow for cross-collection search and browsing. Structured vocabularies can be used as connecting points by aligning thesauri from different institutions. The project “Gemeenschappelijke Thesaurus voor Uniforme Ontsluiting” was funded by the Taalunie (a cross-national organization focusing on the Dutch language) and executed by the Netherlands Institute for Sound and Vision and the Flemish VIAA archive. It involved a case study where partial collections of the two archives were connected by aligning their thesauri. This involved the conversion of the VRT thesaurus to the SKOS format and linking it to Sound and Vision’s GTAA thesaurus.

CultuurLINK screenshot

The interactive alignment tool CultuurLINK, made by Dutch company Spinque, was used to align the two thesauri (see the screenshot above).
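To illustrate what the result of aligning two SKOS thesauri looks like, here is a small Python sketch. It is not how CultuurLINK works internally (the post does not describe its matching strategies); it simply matches concepts whose labels are equal after normalisation and emits `skos:exactMatch` links. The concept URIs and labels are invented examples under `example.org`.

```python
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def normalise(label):
    # Lower-case and collapse whitespace so trivial differences do not block a match.
    return " ".join(label.lower().split())


def align(source, target):
    """Link concepts from source to target when their normalised labels are equal."""
    by_label = {normalise(label): uri for uri, label in target.items()}
    links = []
    for uri, label in source.items():
        match = by_label.get(normalise(label))
        if match is not None:
            links.append((uri, SKOS_EXACT_MATCH, match))
    return links


def to_ntriples(links):
    # Serialise the links as N-Triples lines, one statement per link.
    return [f"<{s}> <{p}> <{o}> ." for s, p, o in links]


# Hypothetical excerpts of the two thesauri: concept URI -> preferred label.
vrt = {
    "http://example.org/vrt/1": "Fietsen",
    "http://example.org/vrt/2": "Verkiezingen ",
}
gtaa = {
    "http://example.org/gtaa/a": "fietsen",
    "http://example.org/gtaa/b": "schaatsen",
}

links = align(vrt, gtaa)
lines = to_ntriples(links)
```

Exact label matching is only the simplest possible strategy. An interactive tool lets a curator combine and review several strategies, which matters for terms that are spelled alike but mean different things in the two collections.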


The links between the collections can be explored using a cross-collection browser, also built by Spinque. This allows users to search and explore connections between the two collections. Unfortunately, the collections are not publicly available so the demonstrator is password-protected, but a publicly accessible screencast (below) shows the functionalities.

The full report can be accessed through the VIAA site. There, you can also find a blog post in Dutch.

Update: a paper about this has been accepted for publication:

  • Victor de Boer, Matthias Priem, Michiel Hildebrand, Nico Verplancke, Arjen de Vries and Johan Oomen. Exploring Audiovisual Archives through Aligned Thesauri. To appear in Proceedings of 10th Metadata and Semantics Research Conference. [Draft PDF]

CLARIN video showcases Dutch Ships and Sailors project

The CLARIN framework commissioned the production of dissemination videos showcasing the outcomes of the individual CLARIN projects. One of these projects was the Dutch Ships and Sailors project, a collaboration between VU Computer Science, VU Humanities and the Huygens Institute for National History. In this project, we developed a heterogeneous linked data cloud connecting many different maritime databases. This data cloud allows for new types of integrated browsing and new historical research questions. In the video, we (Victor de Boer together with historians Jur Leinenga and Rik Hoekstra) explain how the data cloud was formed and how it can be used by maritime historians.

CLARIN Dutch Ships & Sailors from CLARIN-NL (Dutch, with Dutch or English subtitles)

See also other DSS-related posts on this website.


One-off lecture Social Web

Around 40 students joined this year’s “bachelor’s for a day” for the VU IMM programme. As in previous years, I gave a 45-minute lecture and ran a hands-on session around “The Social Web”. Each year I do a non-scientific survey of Social Web use among the attendees, most of whom are 17 years old. This year’s outcome:

  • Everybody still uses Facebook (even though for the last couple of years there have been some murmurs about abandoning it).
  • Everybody uses WhatsApp. No surprise there.
  • More than half of the students use Snapchat.
  • About 1/4 of students use LinkedIn.
  • About 1/8 of students actively use Twitter (at least one post in the last 3 months).
  • Most students have heard of Hyves, but no one has ever used it.
  • Almost no one has heard of Second Life 🙂
  • No one has heard of Schoolbank.nl.

You can find my slides below. The hands-on session can be found here.



CultuurLINK Linking Award

Happy and surprised to find the first (and so far only) CultuurLINK Linking Award in my mailbox yesterday! I checked with the nice people over at Spinque.com and it turns out it was a token of appreciation for being a prolific CultuurLINK user 🙂

I think the vocabulary alignment tool is great and easy to work with, so I can recommend it to anyone with a SKOS vocabulary who wants to match it with any of the major cultural thesauri in the ‘Hub’. Thanks to the people at Spinque for the great tool and the nice gesture!


BDE Webinar: Big Data and the 7 Societal Challenges. Out-of-the-box technology for the future

As the Big Data Europe project enters its second year, we’re doing everything we can to make it as simple as possible to get acquainted with the platform under development, and to facilitate future deployments of our platform to support your Big Data pipelines.

We are therefore happy to introduce this quarterly series of technical webinars, where you can keep track of progress related to our technical developments and demonstrators in each of the seven societal challenges, ask questions, and provide valuable feedback. In addition, we will also cover other important developments in the area which are not necessarily related to our project.

Online Webinar: 02-03-2016, 14:00-15:00 CET

In the first webinar in this series, you will learn about:

  • the requirements we collected from the 7 Societal Challenges we are addressing
  • the technical building blocks of our Big Data Platform
  • how the above will be provided as a generic instance for customisation
  • an introduction to the 7 selected Pilot partners and the expected outcome

The one-hour webinar is run by the Big Data Europe project and features presentations from the experts responsible for the architecture, the implementation and the upcoming roll-out of the pilots. The audience will be given a chance to interact, and the top questions will be answered by one of our dedicated technical and domain experts.

Registration is free. Click here to register now!

We are looking forward to your participation.
