Digital Humanities in Practice 2020-2021

This year’s edition of the VU Digital Humanities in Practice course was of course a virtual one. In this course, students of the Minor Digital Humanities and Social Analytics put everything that they have learned in that minor in practice, tackling a real-world DH or Social Analytics challenge. As in previous years, this year we had wonderful projects provided and supervised by colleagues from various institutes. We had projects related to the Odissei and Clariah research infrastructures, projects supervised by KNAW-HUC, Stadsarchief Amsterdam, projects from Utrecht University, UvA, Leiden University and our own Vrije Universiteit. We had a project related to Kieskompas and even a project supervised by researchers from Bologna University. A wide variety of challenges, datasets and domains! We would like to thank all the supervisors and the students on making this course a success.

The compilation video below shows all the projects’ results. It combines 2-minute videos produced by each of the 10 student groups.

After a very nice virtual poster session, everybody got to vote on the Best Poster Award. The winners are group 3, whose video you can also see in the video above. Below we list all the projects and the external supervisors.

1Extracting named entities from Social Science data.ODISSEI project / VU CS – Ronald Siebes
2Gender bias data story in the Media SuiteCLARIAH project / UU / NISV –  Mari Wigham Willemien Sanders
3Food & SustainabilityKNAW-HUC –  Marieke van Erp
4Visualizing Political Opinion (kieskompas)Kieskompas – Andre Krouwel
5Kickstarting the HTR revolutionUU – Auke Rijpma
6Reconstructing the international crew and ships of the Dutch West India CompanyStadsarchief Amsterdam – Pauline van den Heuvel
7Enriching audiovisual encyclopediasNISV – Jesse de Vos
8Using Social Media to Uncover How Patients CopeLIACS Leiden – Anne Dirkson
9Covid-19 CommunitiesUvA – Julia Noordegraaf, Tobias Blanke, Leon van Wissen
10Visualizing named graphsUni Bologna – Marilena Daquino

Share This:

Linked Data Scopes

At this year’s Metadata and Semantics Research Conference (MTSR2020), I just presented our work on Linked Data Scopes: an ontology to describe data manipulation steps. The paper was co-authored with Ivette Bonestroo, one of our Digital Humanities minor students as well as Rik Hoekstra and Marijn Koolen from KNAW-HUC. The paper builds on earlier work by the latter two co-authors and was conducted in the context of the CLARIAH-plus project.

This figure shows envisioned use of the ontology: scholarly output is not only the research paper, but also an explicit data scope. This data scope includes (references to) datasets.

With the rise of data driven methods in the humanities, it becomes necessary to develop reusable and consistent methodological patterns for dealing with the various data manipulation steps. This increases transparency, replicability of the research. Data scopes present a qualitative framework for such methodological steps. In this work we present a Linked Data model to represent and share Data Scopes. The model consists of a central Data scope element, with linked elements for data Selection, Linking, Modeling, Normalisation and Classification. We validate the model by representing the data scope for 24 articles from two domains: Humanities and Social Science.

The ontology can be accessed at .

You can do live sparql queries on the extracted examples as instances of this ontology at

You can watch a pre-recorded video of my presentation below. Or you can check out the slides here [pdf]

Share This:

Listening to AI: ARIAS workshop report

Last week, I attended the second workshop of the ARIAS working group of AI and the Arts. ARIAS is a platform for research on Arts and Sciences and as such seeks to build a bridge between these disciplines. The new working group is looking at the interplay between Arts and AI specifically. Interestingly, this is not only about using AI to make art, but also to explore what art can do for AI (research). The workshop fell under the thematic theme for ARIAS “Art of Listening to the Matter” and consisted of a number of keynote talks and workshop presentations/discussions.

The workshop at the super-hip Butcher’s Tears in Amsterdam, note the 1.5m COVID-distance.

UvA university professor Tobias Blanke kicked off the meeting with an interesting overview of the different ‘schools’ of AI and how they relate to the humanities. Quite interesting was the talk by Sabine Niederer (a professor of visual methodologies at HvA) and Andy Dockett . They presented the results of an experiment feeding Climate Fiction (cli-fi) texts to the famous GPT algorithm. The results were then aggregated, filtered and visualized in a number of rizoprint-like pamflets.

My favourite talk of the day was by writer and critic Flavia Dzodan. Her talk was quite incendiary as it presented a post-colonial perspective on the whole notion of data science. Her point being that data science only truly started with the ‘discoveries’ of the Americas, the subsequent slave-trade and the therefor required counting of people. She then proceeded by pointing out some of the more nefarious examples of identification, classification and other data-driven ways of dealing with humans, especially those from marginalized groups. Her activist/artistic angle to this problem was to me quite interesting as it tied together themes around representation, participation that appear in the field of ICT4D and those found in AI and (Digital) Humanities. Food for thought at least.

The afternoon was reserved for talks from three artists that wanted to highlight various views on AI and art. Femke Dekker, S. de Jager and Martina Raponi all showed various art projects that in some way used AI technology and reflected on the practice and philosophical implications. Again, here GPT popped up a number of times, but also other methods of visual analysis and generative models.

Share This:

Linked Art Provenance

In the past year, together with Ingrid Vermeulen (VU Amsterdam) and Chris Dijkshoorn (Rijksmuseum Amsterdam), I had the pleasure to supervise two students from VU, Babette Claassen and Jeroen Borst, who participated in a Network Institute Academy Assistant project around art provenance and digital methods. The growing number of datasets and digital services around art-historical information presents new opportunities for conducting provenance research at scale. The Linked Art Provenance project investigated to what extent it is possible to trace provenance of art works using online data sources.

Caspar Netscher, the Lacemaker, 1662, oil on canvas. London: the Wallace Collection, P237

In the interdisciplinary project, Babette (Art Market Studies) and Jeroen (Artificial Intelligence) collaborated to create a workflow model, shown below, to integrate provenance information from various online sources such as the Getty provenance index. This included an investigation of potential usage of automatic information extraction of structured data of these online sources.

This model was validated through a case study, where we investigate whether we can capture information from selected sources about an auction (1804), during which the paintings from the former collection of Pieter Cornelis van Leyden (1732-1788) were dispersed. An example work , the Lacemaker, is shown above. Interviews with various art historian validated the produced workflow model.

The workflow model also provides a basic guideline for provenance research and together with the Linked Open Data process can possibly answer relevant research questions for studies in the history of collecting and the art market.

More information can be found in the Final report

Share This:

Digital Humanities in Practice 2018/2019

Last friday, the students of the class of 2018/2019 of the course Digital Humanities and Social Analytics in Practice presented the results of their capstone internship project. This course and project is the final element of the Digital Humanities and Social Analytics minor programme in which students from very different backgrounds gain skills and knowledge about the interdisciplinary topic.

Poster presentation of the DHiP projects

The course took the form of a 4-week internship at an organization working with humanities or social science data and challenges and student groups were asked to use these skills and knowledge to address a research challenge. Projects ranged from cleaning, indexing, visualizing and analyzing humanities data sets to searching for bias in news coverage of political topics. The students showed their competences not only in their research work but also in communicating this research through great posters.

The complete list of student projects and collaborating institutions is below:

  • “An eventful 80 years’ war” at Rijksmuseum identifying and mapping historical events from various sources.
  • An investigation into the use of structured vocabularies also at the Rijksmuseum
  • “Collecting and Modelling Event WW2 from Wikipedia and Wikidata” in collaboration with Netwerk Oorlogsbronnen (see poster image below)
  • A project where an search index for Development documents governed by the NICC foundation was built.
  • “EviDENce: Ego Documents Events modelliNg – how individuals recall mass violence” – in collaboration with KNAW Humanities Cluster (HUC)
  • “Historical Ecology” – where students searched for mentions of animals in historical newspapers – also with KNAW-HUC
  • Project MIGRANT: Mobilities and connection project in collaboration with KNAW-HUC and Huygens ING
  • Capturing Bias with media data analysis – an internal project at VU looking at indentifying media bias
  • Locating the CTA Archive Amsterdam where a geolocation service and search tool was built
  • Linking Knowledge Graphs of Symbolic Music with the Web – also an internal project at VU working with Albert Merono
One of the posters visualizing the events and persons related to the occupation of the Netherlands in WW2
Update: The student posters are now online at

Share This:

Architectural Digital Humanities student projects

In the context of our ArchiMediaL project on Digital Architectural History, a number of student projects explored opportunities and challenges around enriching the dataset. This dataset lists buildings and sites in countries outside of Europe that at the time were ruled by Europeans (1850-1970).

Patrick Brouwer wrote his IMM bachelor thesis “Crowdsourcing architectural knowledge: Experts versus non-experts” about the differences in annotation styles between architecture historical experts  and non-expert crowd annotators. The data suggests that although crowdsourcing is a viable option for annotating this type of content. Also, expert annotations were of a higher quality than those of non-experts. The image below shows a screenshot of the user study survey.

Rouel de Romas also looked at crowdsourcing , but focused more on the user interaction and the interface involved in crowdsourcing. In his thesis “Enriching the metadata of European colonial maps with crowdsourcing”  he -like Patrick- used the Accurator platform, developed by Chris Dijkshoorn. A screenshot is seen below.  The results corroborate the previous study that the in most cases the annotations provided by the participants do meet the requirements provided by the architectural historian; thus, crowdsourcing is an effective method to enrich the metadata of European colonial maps.

Finally, Gossa Lo looked at automatic enrichment using OCR techniques on textual documents for her Mini-Master projcet. She created a specific pipeline for this, which can be seen in the image below. Her code and paper are available on this Github page:

Share This:

The Benefits of Linking Metadata for Internal and External users of an Audiovisual Archive

[This post describes the Master Project work of Information Science students Tim de Bruyn and John Brooks and is based on their theses]

Audiovisual archives adopt structured vocabularies for their metadata management. With Semantic Web and Linked Data now becoming more and more stable and commonplace technologies, organizations are looking now at linking these vocabularies to external sources, for example those of Wikidata, DBPedia or GeoNames.

However, the benefits of such endeavors to the organizations are generally underexplored. For their master project research, done in the form of an internship at the Netherlands Institute for Sound and Vision (NISV), Tim de Bruyn and John Brooks conducted a case study into the benefits of linking the “Common Thesaurus for Audiovisual Archives(or GTAA) and the general-purpose dataset Wikidata. In their approach, they identified various use cases for user groups that are both internal (Tim) as well as external (John) to the organization. Not only were use cases identified and matched to a partial alignment of GTAA and Wikidata, but several proof of concept prototypes that address these use cases were developed. 


For the internal users, three cases were elaborated, including a calendar service where personnel receive notifications when an author of a work has passed away 70 years ago, thereby changing copyright status of the work. This information is retrieved from the Wikidata page of the author, aligned with the GTAA entry (see fig 1 above).

A second internal case involves the new ‘story platform’ of NISV. Here Tim implemented a prototype enduser application to find stories related to the one currently shown to the user, based on persons occuring in that story (fig 2).

The external cases centered around the users of the CLARIAH Media Suite. For this extension, several humanities researchers were interviewed to identify worthwile extensions with Wikidata information. Based on the outcomes of these interviews, John Brooks developed the Wikidata retrieval service (fig 3).

The research presented in the two theses are a good example of User-Centric Data Science, where affordances provided by data linkages are aligned with various user needs. The various tools were evaluated with end users to ensure they match their actual needs. The research was reported in a research paper which will be presented at the MTSR2018 conference: (Victor de Boer, Tim de Bruyn, John Brooks, Jesse de Vos. The Benefits of Linking Metadata for Internal and External users of an Audiovisual Archive. To appear in Proceedings of MTSR 2018 [Draft PDF])

Find out more:

See my slides for the MTSR presentation below


Share This:

Testimonials Digital Humanities minor at DHBenelux2018

At the DHBenelux 2018 conference, students from the VU minor “Digital Humanities and Social Analytics” presented their final DH in Practice work. In this video, the students talk about their experience in the minor and the internship projects. We also meet other participants of the conference talking about the need for interdisciplinary research.


Share This:

DIVE+ in Europeana Insight

This months’ edition of Europeana Insight features articles from this year’s LODLAM Challenge finalists, which include the winner: DIVE+. The online article “DIVE+: EXPLORING INTEGRATED LINKED MEDIA” discusses the DIVE+ User studies, data enrichment, exploratory interface and impact on the cultural heritage domain.

The paper was co-authored by Victor de Boer, Oana Inel, Lora Aroyo, Chiel van den Akker, Susane Legene, Carlos Martinez, Werner Helmich, Berber Hagendoorn, Sabrina Sauer, Jaap Blom, Liliana Melgar and Johan Oomen

Screenshot of the Europeana Insight article

Share This: