This year, we organized the SEMANTiCS2021 conference in Amsterdam. Due to the ongoing COVID-19 restrictions, we opted for a hybrid conference. And hybrid it was! With 200 onsite and 264 online tickets sold, this was as much a mix between online and onsite as it was a mix between industry and academia. The research track consisted of 19 papers, and the industry track was made up of 24 presentations. With four wonderful keynote speakers, a poster session and various special tracks and workshops, this was quite a full programme!
As far as I am concerned, a true success! See my Twitter-generated impression below.
This Monday, Accenture and the UN organized the Knowledge Graphs for Social Good workshop, part of the Knowledge Graph conference. My submission to this workshop, “Knowledge Graphs for the Rural Poor”, was about ICT for Development research previously done within the FP7 VOICES project in collaboration with students. In the contribution, we argue that there are three challenges in making Knowledge Graphs relevant and accessible for the rural poor:
Make KGs usable in low-resource, low-connectivity contexts;
Make KGs accessible for users with various (cultural) backgrounds and levels of literacy;
Develop knowledge sharing cases and applications relevant for the rural poor.
I contributed to a (Dutch) article written in response to the new European AI guidelines. The article, written by members of the Cultural AI lab, argues that we need both cultural data and cultural understanding to build truly responsible AI. It has been published in the Public Spaces blog.
This year’s SEMANTiCS conference was a weird one. Like so many other conferences, we had to improvise to deal with the COVID-19 restrictions around travel and event organization. With the help of many people behind the scenes, including the wonderful program chairs Paul Groth and Eva Blomqvist, we did have a relatively normal reviewing process for the Research and Innovation track. In the end, 8 papers were accepted for publication in this year’s proceedings. The authors were then asked to present their work in pre-recorded videos. These were shown in a very nice webinar, together with contributions from industry. All in all, we feel this downscaled version of SEMANTiCS was quite successful.
It is so nice when two often very distinct research lines come together. In my case, Digital Humanities and ICT for Development rarely meet directly. But they sure did come together when Gossa Lô started with her Master AI thesis. Gossa, a long-time collaborator in the W4RA team, chose to focus on the opportunities for Machine Learning and Natural Language Processing for West-African folk tales. Her research involved constructing a corpus of West-African folk tales, performing various classification and text generation experiments, and even included a field trip to Ghana to elicit information about folk tale structures. The work, done as part of an internship at Bolesian.ai, resulted in a beautiful Master AI thesis, which was awarded a very high grade.
As a follow-up, we decided to try to rewrite the thesis into an article and submit it to a DH or ICT4D journal. This proved more difficult. Both DH and ICT4D are very multidisciplinary in nature, and the combination of both proved a bit too much for many journals, with our article being either too technical, not technical enough, or too much out of scope.
The paper examines how machine learning (ML) and natural language processing (NLP) can be used to identify, analyze, and generate West African folk tales. Two corpora of West African and Western European folk tales were compiled and used in three experiments on cross-cultural folk tale analysis:
In the first experiment, on text generation, two types of deep learning text generators are built and trained on the West African corpus. We show that although the generated texts vary in their semantic and syntactic coherence, each of them contains West African features.
The second experiment further examines the distinction between the West African and Western European folk tales by comparing the performance of an LSTM (acc. 0.79) with a BoW classifier (acc. 0.93), indicating that the two corpora can be clearly distinguished in terms of vocabulary. An interactive t-SNE visualization of a hybrid classifier (acc. 0.85) highlights the culture-specific words for both.
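To make the bag-of-words idea concrete, here is a minimal sketch of a vocabulary-based classifier. It is not the classifier used in the paper (which is not specified here): it assumes a multinomial Naive Bayes model with add-one smoothing, and the four toy sentences, the label names and the function names are all invented for illustration.

```python
import math
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(PUNCT) for w in text.lower().split())
    return [t for t in tokens if t]

def train(docs):
    """Count words per label. docs is a list of (text, label) pairs."""
    word_counts = {}          # label -> Counter of word frequencies
    doc_counts = Counter()    # label -> number of training documents
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab

def predict(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids log(0).
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data, NOT taken from the corpora described above.
TOY_DOCS = [
    ("the spider tricked the hyena near the village", "west_african"),
    ("the tortoise and the spider shared yams in the village", "west_african"),
    ("the princess met a wolf in the castle forest", "western_european"),
    ("the king gave the princess a golden castle", "western_european"),
]

model = train(TOY_DOCS)
print(predict("a spider came to the village", model))
```

Because such a model only looks at word frequencies and ignores word order, a high accuracy suggests that vocabulary alone carries most of the signal, which is the point made above.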
The third experiment describes an ML analysis of narrative structures. Classifiers trained on parts of folk tales according to the three-act structure are quite capable of distinguishing these parts (acc. 0.78). Common n-grams extracted from these parts not only underline cross-cultural distinctions in narrative structures, but also show the overlap between verbal and written West African narratives.
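The n-gram part of this analysis can be sketched as follows. This is only an illustration under stated assumptions: the paper's actual method for dividing a tale into acts is not described here, so the sketch simply cuts each tale into three roughly equal token spans, and the function names and example tales are hypothetical.

```python
from collections import Counter

def split_into_acts(tokens, n_acts=3):
    """Split a token list into n_acts contiguous, roughly equal parts.

    Assumption: equal-length thirds stand in for the three-act structure.
    """
    size, remainder = divmod(len(tokens), n_acts)
    acts, start = [], 0
    for i in range(n_acts):
        end = start + size + (1 if i < remainder else 0)
        acts.append(tokens[start:end])
        start = end
    return acts

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def common_ngrams_per_act(tales, n=2, top_k=3):
    """Count n-grams separately for each act across all tales."""
    counters = [Counter() for _ in range(3)]
    for tale in tales:
        tokens = tale.lower().split()
        for counter, act in zip(counters, split_into_acts(tokens)):
            counter.update(ngrams(act, n))
    return [counter.most_common(top_k) for counter in counters]

# Hypothetical miniature "tales", invented for this example only.
TOY_TALES = [
    "long ago there was a spider and that is all",
    "long ago there lived a hare and that is all",
]

for act_number, top in enumerate(common_ngrams_per_act(TOY_TALES), start=1):
    print(act_number, top)
```

Comparing the most frequent n-grams per act between corpora is one simple way to surface recurring opening and closing formulas.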
The Virtual Human Rights Lawyer is a joint project of Vrije Universiteit Amsterdam and the Netherlands Office of the Public International Law & Policy Group to help victims of serious human rights violations obtain access to justice at the international level. It enables users to find out how and where they can access existing global and regional human rights mechanisms in order to obtain some form of redress for the human rights violations they face or have faced.
Rudy Marsman, Victor de Boer, Themistoklis Karavellas, Johan Oomen. New life for old media: Investigations into Speech Synthesis and Deep Learning-based colorization for audiovisual archives. Extended Abstract proceedings of NEM summit 2017.
Update: the slides as presented by Johan Oomen at NEM