Yesterday I had the honour and pleasure of giving one of the keynote speeches at the 1st workshop of Semantic AI, co-located with SEMANTiCS2022 in Vienna. In my talk “Knowledge Graphs for impactful Data Science in the Digital Humanities and IOT domain”, I talked about challenges and lessons learned in various projects where 1) Knowledge Graphs, 2) Machine Learning and 3) User Contexts interact in interesting ways. The slides for my talk can be found below.
I am happy and proud to announce that I will join Marieke van Erp and Laura Hollink as co-director of the Cultural AI lab. The lab brings together researchers from various research institutes and heritage organizations to investigate both how AI can be used to address various humanities and heritage challenges and how we can use methods, theories and insights from the cultural domain to make better, fairer, more inclusive and diverse AI.
I am very excited about this and look forward to the wonderful research collaborations!
Our abstract “Using the SAREF ontology for interoperability and machine learning in a Smart Home environment” was accepted for presentation at the ICT Open conference on 6-7 April 2022 in Amsterdam. In the abstract, we outline the current and future research VU and TNO are conducting in the context of the InterConnect project, specifically around the construction of IOT knowledge graphs, machine learning and rule-based applications. We look forward to presenting it in April.
Last week, I was invited to give a guest lecture at the University of Development Studies in Tamale, Ghana. Vrije Universiteit has a very interesting and fruitful collaboration with this great university. In my presentation “Knowledge Graphs for Social Good”, I introduced the principles and practice of knowledge graphs and their role in AI. I also talked about how knowledge graphs can be (and are) used for social impact. Finally, I discussed four challenges we encountered in our own efforts to make knowledge graphs meaningful for rural users in the Global South.
We expect the recording to be shared. For now, the slides are embedded below and can be downloaded from Google Slides.
For several courses, I made a set of video lectures around Linked Data principles and practice, specifically in the context of Digital Humanities.
This contains videos showing:
- The principles of Linked Data
- The RDF-Turtle syntax
- Making RDF using OntoRefine
- The hands-on exercises with Dutch Ships and Sailors
This also includes a sub-tutorial on using GraphDB, OntoRefine and SPARQL to:
- Download and install GraphDB
- Get some interesting data in CSV
- Convert to triples using OntoRefine
- Find potential links in DBpedia
- Link your data using SPARQL -> import data
- Try out interesting SPARQL queries
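To give a feel for what the "convert to triples" step produces, here is a minimal sketch in plain Python. It is not how OntoRefine works internally, and it is no substitute for the tutorial videos. The `http://example.org/` namespace, the column names and the two sample rows are all made up for illustration; the helper names `literal` and `csv_to_ntriples` are hypothetical too.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace and made-up sample rows, for illustration only.
BASE = "http://example.org/"

SAMPLE_CSV = """name,type,built
Ship Alpha,fluit,1752
Ship Beta,snauw,1748
"""


def literal(value: str) -> str:
    """Serialize a plain string literal in N-Triples syntax."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def csv_to_ntriples(text: str, key: str = "name") -> list[str]:
    """Turn each CSV row into one subject, with one triple per non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        # The subject IRI is derived from the key column of the row.
        subject = f"<{BASE}ship/{quote(row[key].replace(' ', '_'))}>"
        for column, value in row.items():
            if value:
                predicate = f"<{BASE}prop/{quote(column)}>"
                triples.append(f"{subject} {predicate} {literal(value)} .")
    return triples


if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE_CSV):
        print(line)
```

The printed lines are N-Triples, which a triple store such as GraphDB can import. In practice, OntoRefine lets you define this kind of mapping interactively and choose proper vocabularies instead of the placeholder predicates used here.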
This year, we organized the SEMANTiCS2021 conference in Amsterdam. Due to the ongoing COVID-19 restrictions, we opted for a hybrid conference. And hybrid it was! With 200 onsite and 264 online tickets sold, this was as much a mix between online and onsite as it was a mix between industry and academia. The research track consisted of 19 papers, and the industry track was made up of 24 presentations. With four wonderful keynote speakers, a poster session and various special tracks and workshops, this was quite a full programme!
As far as I am concerned, a true success! See my Twitter-generated impression below.
This Monday, Accenture and the UN organized the Knowledge Graphs for Social Good workshop, part of the Knowledge Graph conference. My submission to this workshop, “Knowledge Graphs for the Rural Poor”, was about ICT for Development research previously done within the FP7 VOICES project in collaboration with students. In the contribution, we argue that there are three challenges in making Knowledge Graphs relevant and accessible for the Rural Poor:
- Make KGs usable in low-resource, low-connectivity contexts
- Make KGs accessible for users with various (cultural) backgrounds and levels of literacy
- Develop knowledge sharing cases and applications relevant for the rural poor
The paper was based on previous work which can be found in these papers. More information can also be found elsewhere on this blog.
- Linked data for the international aid transparency initiative (project with Kasper Brandt)
- Guéret et al. Let’s “Downscale” Linked Data. (2014) [IEEE Link]
- de Boer et al. A Dialogue with Linked Data – Voice-Based Access to Market Data in the Sahel (2013) [Draft PDF]
- Valkering et al. The semantic web in an SMS (2016) [Draft PDF]
- Baart, A. et al. A voice service development platform to bridge the web’s digital divide (2018). [Link INSTICC]
- Ali, F. Machine-to-machine communication in rural conditions. Realising KasadakaNet. (Master Thesis VU Information Science)
This year’s SEMANTiCS conference was a weird one. Like so many other conferences, we had to improvise to deal with the COVID-19 restrictions around travel and event organization. With the help of many people behind the scenes, including the wonderful program chairs Paul Groth and Eva Blomqvist, we did have a relatively normal reviewing process for the Research and Innovation track. In the end, 8 papers were accepted for publication in this year’s proceedings. The authors were then asked to present their work in pre-recorded videos. These were shown in a very nice webinar, together with contributions from industry. All in all, we feel this downscaled version of SEMANTiCS was quite successful.
The Open Access proceedings are published in the Springer LNCS series and are now available at https://www.springer.com/gp/book/9783030598327
All presentation videos can be watched at https://2020-eu.semantics.cc/ (program/recordings->videos).
And stay tuned for announcements of SEMANTiCS 2021!!
It is so nice when two often very distinct research lines come together. In my case, Digital Humanities and ICT for Development rarely meet directly. But they sure did come together when Gossa Lô started with her Master AI thesis. Gossa, a long-time collaborator in the W4RA team, chose to focus on the opportunities for Machine Learning and Natural Language Processing for West-African folk tales. Her research involved constructing a corpus of West-African folk tales and performing various classification and text generation experiments, and even included a field trip to Ghana to elicit information about folk tale structures. The work, done as part of an internship at Bolesian.ai, resulted in a beautiful Master AI thesis, which was awarded a very high grade.
As a follow-up, we decided to try to rewrite the thesis into an article and submit it to a DH or ICT4D journal. This proved more difficult. Both DH and ICT4D are very multidisciplinary in nature and the combination of both proved a bit too much for many journals, with our article being either too technical, not technical enough, or too far out of scope.
But now, the article “Exploring West African Folk Narrative Texts Using Machine Learning” has been published (Open Access) in a special issue of Information on Digital Humanities!
The paper examines how machine learning (ML) and natural language processing (NLP) can be used to identify, analyze, and generate West African folk tales. Two corpora of West African and Western European folk tales were compiled and used in three experiments on cross-cultural folk tale analysis:
- In the text generation experiment, two types of deep learning text generators are built and trained on the West African corpus. We show that although the texts range between semantic and syntactic coherence, each of them contains West African features.
- The second experiment further examines the distinction between the West African and Western European folk tales by comparing the performance of an LSTM (acc. 0.79) with a BoW classifier (acc. 0.93), indicating that the two corpora can be clearly distinguished in terms of vocabulary. An interactive t-SNE visualization of a hybrid classifier (acc. 0.85) highlights the culture-specific words for both.
- The third experiment describes an ML analysis of narrative structures. Classifiers trained on parts of folk tales according to the three-act structure are quite capable of distinguishing these parts (acc. 0.78). Common n-grams extracted from these parts not only underline cross-cultural distinctions in narrative structures, but also show the overlap between verbal and written West African narratives.
All resources, including data and code, can be found at https://github.com/GossaLo/afr-neural-folktales
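To illustrate what a bag-of-words (BoW) classifier like the one in the second experiment does in principle, here is a minimal sketch using a multinomial naive Bayes model in plain Python. This is not the classifier from the paper (see the repository above for the actual code), and the four training sentences are made-up toy examples, not taken from the corpora. The function names `tokenize`, `train` and `predict` are hypothetical.

```python
import math
from collections import Counter

# Made-up toy sentences, NOT taken from the paper's corpora.
TRAIN = [
    ("the spider tricked the hyena near the baobab", "west_african"),
    ("the hare and the hyena shared the yam", "west_african"),
    ("the princess met a wolf in the castle forest", "western_european"),
    ("the king gave the miller a castle", "western_european"),
]


def tokenize(text):
    """Lowercase and split on whitespace; word order is discarded later."""
    return text.lower().split()


def train(examples):
    """Count words per class for a multinomial naive Bayes model."""
    word_counts = {}
    doc_counts = Counter()
    vocab = set()
    for text, label in examples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocab.add(token)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        counts = word_counts[label]
        total = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((counts[token] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TRAIN)
    print(predict(model, "hyena and spider"))
    print(predict(model, "king of the castle"))
```

Because such a model only looks at which words occur and how often, a high accuracy for it suggests that vocabulary alone separates the classes, which is the kind of conclusion drawn from the BoW result above.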