The latest edition of the trade publication E-Data & Research features a nice article (in Dutch) about our research on knowledge graphs for maritime history. Thanks to Mathilde Jansen and of course my collaborators Stijn Schouten and Marieke van Erp! The image below shows the print article; the article can also be found online here.
A Polyvocal and Contextualised Semantic Web
[This post is the text of a 1-minute pitch at the IWDS symposium for our poster “A Polyvocal and Contextualised Semantic Web”, which was published as the paper: Erp, Marieke van, and Victor de Boer. “A Polyvocal and Contextualised Semantic Web.” European Semantic Web Conference. Springer, Cham, 2021.]
Knowledge graphs are a popular way of representing and sharing data, information and knowledge in many domains on the Semantic Web. However, these knowledge graphs often represent singular (biased) views on the world, which can lead to unwanted bias in AI systems that use this data. We therefore identify a need for a more polyvocal Semantic Web.
So. How do we get there?
- We need perspective-aware methods for identifying existing polyvocality in datasets and for acquiring it from text or users.
- We need data models and patterns to represent polyvocal data, information and knowledge.
- We need visualisations and tools to make the polyvocal knowledge accessible and usable for a wide variety of users, including domain experts or laypersons with varying backgrounds.
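To make the second point a bit more concrete, here is a minimal sketch of what a perspective-aware data structure could look like. It is not the model from the paper: the `Statement` class, the `perspective` field and the example values are all made up for illustration. The idea is simply that every claim carries the voice that asserts it, so that disagreement between voices stays visible instead of being flattened into one view.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One claim about a subject, attributed to the voice that makes it."""
    subject: str
    predicate: str
    value: str
    perspective: str  # hypothetical label for who asserts this claim


def voices_per_claim(statements):
    """Group asserted values for each (subject, predicate) by perspective."""
    grouped = defaultdict(dict)
    for s in statements:
        key = (s.subject, s.predicate)
        grouped[key].setdefault(s.perspective, []).append(s.value)
    return dict(grouped)


def contested(statements):
    """Return the (subject, predicate) pairs on which perspectives disagree."""
    result = []
    for key, by_voice in voices_per_claim(statements).items():
        distinct = {frozenset(values) for values in by_voice.values()}
        if len(distinct) > 1:
            result.append(key)
    return result


# Invented example data: two voices describe the same object differently.
statements = [
    Statement("object-42", "description", "ceremonial mask", "museum-catalogue"),
    Statement("object-42", "description", "ancestral portrait", "source-community"),
    Statement("object-42", "material", "wood", "museum-catalogue"),
]

print(contested(statements))
```

In a real knowledge graph the same idea would more likely be expressed with named graphs or reified statements; the sketch only shows why the attribution needs to be part of the data.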
In the Cultural AI Lab, we investigate these challenges in several interrelated research projects, but we cannot, and should not, do this alone. We are looking for more voices to join us!
Modeling Ontologies for Individual Artists
[This post presents research done by Daan Raven in the context of his Master Project Information Sciences]
There is a long tradition in the Cultural Heritage domain of representing structured, machine-interoperable knowledge with semantic methods and tools. However, research into developing and using ontologies specific to the works of individual artists is persistently lacking. Such knowledge graphs would improve access to heritage information by making reasoning and inferencing possible. In his research, Daan Raven developed and applied a re-usable method, building on the ‘Methontology’ method for ontology development. We describe the steps of specification, conceptualization, integration, implementation and evaluation in a case study concerning ceramic-glass sculptor Barbara Nanning.
This work was presented at Digital Humanities Benelux 2021. The abstract and presentation as well as other digital resources related to the project can be found below:
- DHBenelux abstract (pdf)
- DHBenelux presentation (pdf)
- Knowledge Graph on Github (~15K triples)
- SPARQL endpoint: https://semanticweb.cs.vu.nl/test/sparql
Below are some examples of competency questions with pointers to SPARQL queries in YASGUI.
Which artworks in the Verre Églomisé collection of Nanning are currently stored in her private collection? | https://api.triplydb.com/s/wKZG4UFq5 |
Show me a timeline of all processes that require the use of an annealing kiln | https://api.triplydb.com/s/j4Qk0tHzK |
Show me all process steps that require the use of an annealing kiln and that have a landing page | https://api.triplydb.com/s/N5mo4uTM3 |
Show me (in Gallery) all objects made by “Jiří Pačinek Glass Lindava” (person in Wikidata) | https://api.triplydb.com/s/C6LsEgiZF |
Show me (in Geo) the locations of creation steps for various works (uses geonames) | https://api.triplydb.com/s/THTkhOYjd |
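If you would rather query the endpoint from a script than through YASGUI, the sketch below shows how a request URL can be put together. It assumes the endpoint listed above accepts standard SPARQL 1.1 Protocol GET requests (a `query` parameter) and that it is still online; neither is guaranteed. The query itself is a generic placeholder and does not use the vocabulary of the Nanning knowledge graph. The helper names `build_query_url` and `extract_query` are mine.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as listed in this post; availability is an assumption.
ENDPOINT = "https://semanticweb.cs.vu.nl/test/sparql"

# Generic placeholder query, not one of the competency questions.
GENERIC_QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10
""".strip()


def build_query_url(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET URL with a percent-encoded query."""
    return f"{endpoint}?{urlencode({'query': query})}"


def extract_query(url):
    """Decode the SPARQL query back out of a request URL (round-trip check)."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = build_query_url(ENDPOINT, GENERIC_QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client, typically with an `Accept: application/sparql-results+json` header to get JSON results back.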
Student-supported project in the news
It was great to see that one of this year’s Digital Humanities in Practice projects led to a conversation between the students in that project, Helene Ayar and Edith Brooks, their external supervisors Willemien Sanders (UU) and Mari Wigham (NISV), and an advisor for another project, André Krouwel (VU). That conversation resulted in original research and the CLARIAH MediaSuite data story “‘Who’s speaking?’- Politicians and parties in the media during the Dutch election campaign 2021”, in which the content of news programmes was analysed for politicians’ names, their gender and party affiliation.
The results are very interesting and subsequently appeared on Dutch news site NOS.nl, showing that right-wing politicians are represented more on radio and TV: “Onderzoek: Rechts domineert de verkiezingscampagne op radio en tv” (“Research: The right dominates the election campaign on radio and TV”). Well done and congratulations!

Digital Humanities in Practice 2020-2021
This year’s edition of the VU Digital Humanities in Practice course was of course a virtual one. In this course, students of the Minor Digital Humanities and Social Analytics put everything that they have learned in that minor into practice, tackling a real-world DH or Social Analytics challenge. As in previous years, this year we had wonderful projects provided and supervised by colleagues from various institutes. We had projects related to the Odissei and Clariah research infrastructures, projects supervised by KNAW-HUC and Stadsarchief Amsterdam, and projects from Utrecht University, UvA, Leiden University and our own Vrije Universiteit. We had a project related to Kieskompas and even a project supervised by researchers from Bologna University. A wide variety of challenges, datasets and domains! We would like to thank all the supervisors and the students for making this course a success.
The compilation video below shows all the projects’ results. It combines 2-minute videos produced by each of the 10 student groups.
After a very nice virtual poster session, everybody got to vote on the Best Poster Award. The winners are group 3, whose video you can also see in the video above. Below we list all the projects and the external supervisors.
1 | Extracting named entities from Social Science data | ODISSEI project / VU CS – Ronald Siebes |
2 | Gender bias data story in the Media Suite | CLARIAH project / UU / NISV – Mari Wigham, Willemien Sanders |
3 | Food & Sustainability | KNAW-HUC – Marieke van Erp |
4 | Visualizing Political Opinion (Kieskompas) | Kieskompas – André Krouwel |
5 | Kickstarting the HTR revolution | UU – Auke Rijpma |
6 | Reconstructing the international crew and ships of the Dutch West India Company | Stadsarchief Amsterdam – Pauline van den Heuvel |
7 | Enriching audiovisual encyclopedias | NISV – Jesse de Vos |
8 | Using Social Media to Uncover How Patients Cope | LIACS Leiden – Anne Dirkson |
9 | Covid-19 Communities | UvA – Julia Noordegraaf, Tobias Blanke, Leon van Wissen |
10 | Visualizing named graphs | Uni Bologna – Marilena Daquino |
Linked Data Scopes
At this year’s Metadata and Semantics Research Conference (MTSR2020), I just presented our work on Linked Data Scopes: an ontology to describe data manipulation steps. The paper was co-authored with Ivette Bonestroo, one of our Digital Humanities minor students, as well as Rik Hoekstra and Marijn Koolen from KNAW-HUC. The paper builds on earlier work by the latter two co-authors and was conducted in the context of the CLARIAH-plus project.

With the rise of data-driven methods in the humanities, it becomes necessary to develop reusable and consistent methodological patterns for dealing with the various data manipulation steps. This increases the transparency and replicability of the research. Data scopes present a qualitative framework for such methodological steps. In this work we present a Linked Data model to represent and share Data Scopes. The model consists of a central Data scope element, with linked elements for data Selection, Linking, Modeling, Normalisation and Classification. We validate the model by representing the data scope for 24 articles from two domains: Humanities and Social Science.
The ontology can be accessed at http://biktorrr.github.io/datascope/ .
You can run live SPARQL queries on the extracted examples, represented as instances of this ontology, at https://semanticweb.cs.vu.nl/test/query
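To give a feel for the shape of the model (one central data scope with linked step elements), here is a small sketch that writes such a structure out as N-Triples. Please note that the namespace `http://example.org/datascope#` and the names `DataScope`, `hasStep` and `description` are placeholders I made up for this sketch; they are not the terms of the published ontology, which you should look up at the address above. The example descriptions are invented as well.

```python
EX = "http://example.org/datascope#"  # placeholder namespace, NOT the real ontology
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The five step types named in the paper's model.
STEP_TYPES = ("Selection", "Linking", "Modeling", "Normalisation", "Classification")


class Literal(str):
    """Marks a triple object as a plain string literal rather than an IRI."""


def data_scope_triples(scope_id, steps):
    """Build triples for one data scope; `steps` maps step type -> description."""
    scope = f"{EX}{scope_id}"
    triples = [(scope, RDF_TYPE, f"{EX}DataScope")]
    for step_type, description in steps.items():
        if step_type not in STEP_TYPES:
            raise ValueError(f"unknown step type: {step_type}")
        step = f"{scope}-{step_type.lower()}"
        triples.append((scope, f"{EX}hasStep", step))  # hypothetical property
        triples.append((step, RDF_TYPE, f"{EX}{step_type}"))
        triples.append((step, f"{EX}description", Literal(description)))
    return triples


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Invented example: a scope with two of the five step types filled in.
triples = data_scope_triples("scope1", {
    "Selection": "Newspaper articles from 1900 to 1950",
    "Normalisation": "Place names mapped to a gazetteer",
})
print(to_ntriples(triples))
```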
You can watch a pre-recorded video of my presentation below, or check out the slides here [pdf].
Listening to AI: ARIAS workshop report
Last week, I attended the second workshop of the ARIAS working group on AI and the Arts. ARIAS is a platform for research on Arts and Sciences and as such seeks to build a bridge between these disciplines. The new working group is looking at the interplay between Arts and AI specifically. Interestingly, this is not only about using AI to make art, but also about exploring what art can do for AI (research). The workshop fell under the ARIAS theme “Art of Listening to the Matter” and consisted of a number of keynote talks and workshop presentations/discussions.

UvA university professor Tobias Blanke kicked off the meeting with an interesting overview of the different ‘schools’ of AI and how they relate to the humanities. Quite interesting was the talk by Sabine Niederer (a professor of visual methodologies at HvA) and Andy Dockett. They presented the results of an experiment feeding Climate Fiction (cli-fi) texts to the famous GPT algorithm. The results were then aggregated, filtered and visualized in a number of risograph-like pamphlets.
My favourite talk of the day was by writer and critic Flavia Dzodan. Her talk was quite incendiary as it presented a post-colonial perspective on the whole notion of data science. Her point was that data science only truly started with the ‘discoveries’ of the Americas, the subsequent slave trade and the counting of people that it required. She then proceeded by pointing out some of the more nefarious examples of identification, classification and other data-driven ways of dealing with humans, especially those from marginalized groups. Her activist/artistic angle on this problem was to me quite interesting as it tied together themes around representation and participation that appear in the field of ICT4D and those found in AI and (Digital) Humanities. Food for thought at least.
The afternoon was reserved for talks from three artists who wanted to highlight various views on AI and art. Femke Dekker, S. de Jager and Martina Raponi all showed various art projects that in some way used AI technology and reflected on the practice and philosophical implications. Here again, GPT popped up a number of times, along with other methods of visual analysis and generative models.
Linked Art Provenance
In the past year, together with Ingrid Vermeulen (VU Amsterdam) and Chris Dijkshoorn (Rijksmuseum Amsterdam), I had the pleasure to supervise two students from VU, Babette Claassen and Jeroen Borst, who participated in a Network Institute Academy Assistant project around art provenance and digital methods. The growing number of datasets and digital services around art-historical information presents new opportunities for conducting provenance research at scale. The Linked Art Provenance project investigated to what extent it is possible to trace provenance of art works using online data sources.

In the interdisciplinary project, Babette (Art Market Studies) and Jeroen (Artificial Intelligence) collaborated to create a workflow model, shown below, to integrate provenance information from various online sources such as the Getty provenance index. This included an investigation of the potential use of automatic extraction of structured data from these online sources.
This model was validated through a case study, in which we investigated whether we can capture information from selected sources about an auction (1804), during which the paintings from the former collection of Pieter Cornelis van Leyden (1732-1788) were dispersed. An example work, the Lacemaker, is shown above. Interviews with various art historians validated the produced workflow model.

The workflow model also provides a basic guideline for provenance research and together with the Linked Open Data process can possibly answer relevant research questions for studies in the history of collecting and the art market.
More information can be found in the final report.
Digital Humanities in Practice 2018/2019
Last Friday, the students of the class of 2018/2019 of the course Digital Humanities and Social Analytics in Practice presented the results of their capstone internship project. This course and project is the final element of the Digital Humanities and Social Analytics minor programme in which students from very different backgrounds gain skills and knowledge about the interdisciplinary topic.

The course took the form of a 4-week internship at an organization working with humanities or social science data and challenges. Student groups were asked to use these skills and knowledge to address a research challenge. Projects ranged from cleaning, indexing, visualizing and analyzing humanities data sets to searching for bias in news coverage of political topics. The students showed their competences not only in their research work but also in communicating this research through great posters.
The complete list of student projects and collaborating institutions is below:
- “An eventful 80 years’ war” at Rijksmuseum identifying and mapping historical events from various sources.
- An investigation into the use of structured vocabularies also at the Rijksmuseum
- “Collecting and Modelling Event WW2 from Wikipedia and Wikidata” in collaboration with Netwerk Oorlogsbronnen (see poster image below)
- A project in which a search index was built for Development documents governed by the NICC foundation.
- “EviDENce: Ego Documents Events modelliNg – how individuals recall mass violence” – in collaboration with KNAW Humanities Cluster (HUC)
- “Historical Ecology” – where students searched for mentions of animals in historical newspapers – also with KNAW-HUC
- Project MIGRANT: Mobilities and connection project in collaboration with KNAW-HUC and Huygens ING
- Capturing Bias with media data analysis – an internal project at VU looking at identifying media bias
- Locating the CTA Archive Amsterdam, where a geolocation service and search tool were built
- Linking Knowledge Graphs of Symbolic Music with the Web – also an internal project at VU working with Albert Merono


Architectural Digital Humanities student projects
In the context of our ArchiMediaL project on Digital Architectural History, a number of student projects explored opportunities and challenges around enriching the colonialarchitecture.eu dataset. This dataset lists buildings and sites in countries outside of Europe that at the time were ruled by Europeans (1850-1970).
Patrick Brouwer wrote his IMM bachelor thesis “Crowdsourcing architectural knowledge: Experts versus non-experts” about the differences in annotation styles between architecture historical experts and non-expert crowd annotators. The data suggests that although crowdsourcing is a viable option for annotating this type of content, expert annotations were of a higher quality than those of non-experts. The image below shows a screenshot of the user study survey.
Rouel de Romas also looked at crowdsourcing, but focused more on the user interaction and the interface involved in crowdsourcing. In his thesis “Enriching the metadata of European colonial maps with crowdsourcing” he, like Patrick, used the Accurator platform, developed by Chris Dijkshoorn. A screenshot is seen below. The results corroborate the previous study: in most cases the annotations provided by the participants do meet the requirements provided by the architectural historian; thus, crowdsourcing is an effective method to enrich the metadata of European colonial maps.