DHBenelux2023 trip report

Two weeks ago, I visited the 2023 edition of the Digital Humanities Benelux conference in Brussels. It turned out this was the 10th anniversary edition, which goes to show that the Luxembourgian, Belgian and Dutch DH community is alive and kicking! This year's gathering at the Royal Library of Belgium brought together humanities and computer science researchers and practitioners from the BeNeLux and beyond. Participants got to meet interesting tools, datasets and use cases, all the while critically assessing issues around perspective, representation and bias in each.

On the workshop day, I attended part of a tutorial organized by people from Göttingen University on the use of Linked Data for historical data. They presented an OpenRefine- and Wikidata-centric pipeline that also included a batch Wikidata editing tool, QuickStatements (https://quickstatements.toolforge.org/).
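To give an idea of what batch editing with QuickStatements looks like, here is a minimal sketch that turns a few records into commands in the tab-separated QuickStatements V1 format. This is not the pipeline shown in the tutorial: the function name, the record layout and the example label are my own assumptions for illustration. P31 (instance of) and Q5 (human) are real Wikidata identifiers.

```python
def to_quickstatements(records):
    """Turn simple record dicts into QuickStatements V1 commands (tab-separated)."""
    lines = []
    for record in records:
        # CREATE starts a new item; LAST refers to the item just created.
        lines.append("CREATE")
        # "Len" sets the English label; string values go in double quotes.
        # Assumption: labels contain no double quotes (no escaping is done here).
        lines.append('LAST\tLen\t"{}"'.format(record["label"]))
        for prop, value in record["claims"]:
            lines.append("LAST\t{}\t{}".format(prop, value))
    return "\n".join(lines)


# Hypothetical record, e.g. one row exported from a cleaned-up spreadsheet.
records = [
    {"label": "Example Person", "claims": [("P31", "Q5")]},
]
print(to_quickstatements(records))
```

The resulting text could then be pasted into the QuickStatements batch window, after checking it by hand.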

The second half of that day I attended a workshop on the Kiara tool presented by the people behind the Dharpa project. The basic premise of the tool makes a lot of sense: while many DH people use Python notebooks, it is not always clear what operations specific blocks of code map to. Reusing other people's code becomes difficult and reusing existing data transformation code is not trivial. Kiara's solution is an environment in which pre-defined, well-documented modules are made available so that users can easily find, select and combine modules for data transformation. For any DH infrastructure, one has to decide how much flexibility to offer users. My hunch is that this limited set of operations will not be enough for arbitrary DH-Data Science pipelines and that the full flexibility provided by Python notebooks will be needed. Nevertheless, we have to keep thinking about how infrastructures can support pipeline transparency and reusability, and cater to less digitally literate users.

On the first day of the main conference, Roeland Ordelman presented our own work on the CLARIAH MediaSuite: “Towards ‘Stakeholder Readiness’ in the CLARIAH Media Suite: Future-Proofing an Audio-Visual Research Infrastructure”. This talk was preceded by a very interesting talk from Loren Verreyen, who worked with a digital dataset of program guides (I know of similar datasets archived at Beeld and Geluid). Unfortunately, the much-awaited third talk on the Distracted Boyfriend meme was cancelled.

Interesting talks on the first day included a presentation by Paavo Van der Eecken on capturing uncertainty in manually annotating images. This work, “Thinking Outside of the Bounding Box: A Reconsideration of the Application of Computational Tools on Uncertain Humanities Data”, and its main premise that disagreement is a valuable signal are reminiscent of the CrowdTruth approach.

A very nice duo-presentation was given by Daria Kondakova and Jakob Kohler on “Messy Myths: Applying Linked Open Data to Study Mythological Narratives”. This paper uses the theoretical framework of Zgol to back up the concept of hylemes to analyze mythological texts. Such hylemes are triple-like statements (subject-verb-object) that describe events in text. In the context of the project, these hylemes were then converted to full-blown Linked Open Data to allow for linking and comparing versions of myths. A research prototype can be found here: https://dareiadareia-messy-myths.streamlit.app/
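As a rough illustration of how subject-verb-object statements can become Linked Data, the sketch below serialises hylemes as N-Triples and compares two versions of a story. It is not the project's actual model: the example.org namespace, the function names and the toy hylemes are all made up for this post.

```python
from urllib.parse import quote

# Hypothetical namespace; a real project would mint its own IRIs.
BASE = "https://example.org/myth/"


def iri(local_name):
    """Build an N-Triples IRI term from a plain-text name."""
    return "<{}{}>".format(BASE, quote(local_name.strip().replace(" ", "_")))


def hyleme_to_ntriple(subject, verb, obj):
    """Serialise one subject-verb-object hyleme as a single N-Triples line."""
    return "{} {} {} .".format(iri(subject), iri(verb), iri(obj))


def shared_hylemes(version_a, version_b):
    """Return the hylemes that occur in both versions of a myth."""
    return set(version_a) & set(version_b)


# Invented toy data, only to show the mechanics.
version_a = [("hero", "slays", "dragon"), ("hero", "finds", "treasure")]
version_b = [("hero", "slays", "dragon"), ("hero", "loses", "treasure")]

for hyleme in version_a:
    print(hyleme_to_ntriple(*hyleme))
print(shared_hylemes(version_a, version_b))
```

Comparing plain tuples only finds exact matches; linking versions in which the same figure appears under different names would need an extra alignment step.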

The GLOBALISE project was also present at the conference with a presentation about the East-Asian shipping vocabulary and a poster.

https://twitter.com/victordeboer/status/1664279204823986184

At the poster session, I had the pleasure to present a poster from students of the VU DH minor and their supervisors on a tool to identify and link occupations in biographical descriptions.

VU DH Minor students’ poster https://twitter.com/victordeboer/status/1664199079251832832

The keynote by Patricia Murrieta-Flores from Lancaster University introduced the concept of Cosmovision with respect to the archiving and enrichment of (colonial) heritage objects from Mesoamerica. This concept of Cosmovision is closely related to our polyvocality aims, and the connection to computer vision is inspiring, if very challenging.

It is great to see that DHBenelux continues to be a very open and engaging community of humanities and computer science people, bringing together datasets, tools, challenges and methods.

News article on Knowledge Graphs for Maritime history

In the latest edition of the trade publication E-Data & Research, a nice article (in Dutch) about our research on knowledge graphs for maritime history has been published. Thanks to Mathilde Jansen and of course my collaborators Stijn Schouten and Marieke van Erp! The image below shows the print article; the article can also be found online.

A Polyvocal and Contextualised Semantic Web

[This post is the text of a 1-minute pitch at the IWDS symposium for our poster “A Polyvocal and Contextualised Semantic Web”, which was published as the paper: Erp, Marieke van, and Victor de Boer. “A Polyvocal and Contextualised Semantic Web.” European Semantic Web Conference. Springer, Cham, 2021.]

Knowledge graphs are a popular way of representing and sharing data, information and knowledge in many domains on the Semantic Web. However, these knowledge graphs often represent singular, biased views on the world, which can lead to unwanted bias in AI using this data. We therefore identify a need for a more polyvocal Semantic Web.

So. How do we get there?

  1. We need perspective-aware methods for identifying existing polyvocality in datasets and for acquiring it from text or users.
  2. We need data models and patterns to represent polyvocal data, information and knowledge.
  3. We need visualisations and tools to make the polyvocal knowledge accessible and usable for a wide variety of users, including domain experts or laypersons with varying backgrounds.
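
To make the second point a bit more concrete, here is a small sketch of one possible pattern: every statement carries the perspective it comes from, so that conflicting values can sit side by side instead of one overwriting the other. This is an illustration only and not the model from the paper; the class, the function and the example data are invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One claim about a subject, tagged with the voice that makes it."""
    subject: str
    predicate: str
    value: str
    perspective: str


def contested(statements):
    """Find subject/predicate pairs for which different values are claimed.

    Returns {(subject, predicate): {value: {perspectives...}}}.
    """
    by_key = defaultdict(dict)
    for s in statements:
        by_key[(s.subject, s.predicate)].setdefault(s.value, set()).add(s.perspective)
    return {key: values for key, values in by_key.items() if len(values) > 1}


# Invented example: two voices describe the same object differently.
statements = [
    Statement("object:1", "label", "dagger", "museum catalogue"),
    Statement("object:1", "label", "kris", "source community"),
    Statement("object:1", "material", "steel", "museum catalogue"),
]
print(contested(statements))
```

In RDF, a similar effect could be achieved with named graphs or reified statements; which pattern works best is exactly the kind of question the second point raises.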

In the Cultural AI Lab, we investigate these challenges in several interrelated research projects, but we cannot, and should not, do it alone, and we are looking for more voices to join us!

Modeling Ontologies for Individual Artists

[This post presents research done by Daan Raven in the context of his Master Project Information Sciences]

There is a long tradition in the Cultural Heritage domain of representing knowledge in structured, machine-interoperable form using semantic methods and tools. However, research into developing and using ontologies specific to the works of individual artists is still lacking. Knowledge graphs built on such ontologies would improve access to heritage information by making reasoning and inferencing possible. In his research, Daan Raven developed and applied a re-usable method, building on the ‘Methontology’ method for ontology development. We describe the steps of specification, conceptualization, integration, implementation and evaluation in a case study concerning ceramic-glass sculptor Barbara Nanning.

This work was presented at Digital Humanities Benelux 2021. The abstract and presentation as well as other digital resources related to the project can be found below:

Below are some examples of competency questions with pointers to SPARQL queries in YASGUI.

  • Which artworks in the Verre Églomisé collection of Nanning are currently stored in her private collection? https://api.triplydb.com/s/wKZG4UFq5
  • Show me a timeline of all processes that require the use of an annealing kiln: https://api.triplydb.com/s/j4Qk0tHzK
  • Show me all process steps that require the use of an annealing kiln and that have a landing page: https://api.triplydb.com/s/N5mo4uTM3
  • Show me (in Gallery) all objects made by “Jiří Pačinek Glass Lindava” (person in Wikidata): https://api.triplydb.com/s/C6LsEgiZF
  • Show me (in Geo) the locations of creation steps for various works (uses geonames): https://api.triplydb.com/s/THTkhOYjd

Student-supported project in the news

It was great to see that one of this year’s Digital Humanities in Practice projects led to a conversation between the students in that project, Helene Ayar and Edith Brooks, their external supervisors Willemien Sanders (UU) and Mari Wigham (NISV), and an advisor for another project, André Krouwel (VU). That conversation resulted in original research and the CLARIAH MediaSuite data story “‘Who’s speaking?’- Politicians and parties in the media during the Dutch election campaign 2021”, in which the content of news programmes was analysed for politicians’ names, their gender and party affiliation.

The results are very interesting and subsequently appeared on Dutch news site NOS.nl, showing that right-wing politicians are more represented on radio and tv: “Onderzoek: Rechts domineert de verkiezingscampagne op radio en tv” (“Research: the right dominates the election campaign on radio and tv”). Well done and congratulations!

Digital Humanities in Practice 2020-2021

This year’s edition of the VU Digital Humanities in Practice course was of course a virtual one. In this course, students of the Minor Digital Humanities and Social Analytics put everything that they have learned in that minor into practice, tackling a real-world DH or Social Analytics challenge. As in previous years, this year we had wonderful projects provided and supervised by colleagues from various institutes. We had projects related to the Odissei and Clariah research infrastructures, projects supervised by KNAW-HUC and Stadsarchief Amsterdam, and projects from Utrecht University, UvA, Leiden University and our own Vrije Universiteit. We had a project related to Kieskompas and even a project supervised by researchers from Bologna University. A wide variety of challenges, datasets and domains! We would like to thank all the supervisors and the students for making this course a success.

The compilation video below shows all the projects’ results. It combines 2-minute videos produced by each of the 10 student groups.

After a very nice virtual poster session, everybody got to vote on the Best Poster Award. The winners are group 3, whose video you can also see in the video above. Below we list all the projects and the external supervisors.

  1. Extracting named entities from Social Science data. ODISSEI project / VU CS – Ronald Siebes
  2. Gender bias data story in the Media Suite. CLARIAH project / UU / NISV – Mari Wigham, Willemien Sanders
  3. Food & Sustainability. KNAW-HUC – Marieke van Erp
  4. Visualizing Political Opinion (kieskompas). Kieskompas – Andre Krouwel
  5. Kickstarting the HTR revolution. UU – Auke Rijpma
  6. Reconstructing the international crew and ships of the Dutch West India Company. Stadsarchief Amsterdam – Pauline van den Heuvel
  7. Enriching audiovisual encyclopedias. NISV – Jesse de Vos
  8. Using Social Media to Uncover How Patients Cope. LIACS Leiden – Anne Dirkson
  9. Covid-19 Communities. UvA – Julia Noordegraaf, Tobias Blanke, Leon van Wissen
  10. Visualizing named graphs. Uni Bologna – Marilena Daquino

Linked Data Scopes

At this year’s Metadata and Semantics Research Conference (MTSR2020), I just presented our work on Linked Data Scopes: an ontology to describe data manipulation steps. The paper was co-authored with Ivette Bonestroo, one of our Digital Humanities minor students as well as Rik Hoekstra and Marijn Koolen from KNAW-HUC. The paper builds on earlier work by the latter two co-authors and was conducted in the context of the CLARIAH-plus project.

This figure shows the envisioned use of the ontology: scholarly output is not only the research paper, but also an explicit data scope. This data scope includes (references to) datasets.

With the rise of data-driven methods in the humanities, it becomes necessary to develop reusable and consistent methodological patterns for dealing with the various data manipulation steps. This increases the transparency and replicability of the research. Data scopes present a qualitative framework for such methodological steps. In this work we present a Linked Data model to represent and share Data Scopes. The model consists of a central Data scope element, with linked elements for data Selection, Linking, Modeling, Normalisation and Classification. We validate the model by representing the data scope for 24 articles from two domains: Humanities and Social Science.
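
The shape of the model, a central data scope with linked step elements, can be sketched in a few lines of Python. Note that the `ex:` prefix and the property names (`usesDataset`, `hasStep`) are placeholders I made up for this sketch; the actual class and property names are defined in the ontology itself.

```python
from dataclasses import dataclass, field

# The five step types named in the paper abstract.
STEP_TYPES = ("Selection", "Linking", "Modeling", "Normalisation", "Classification")
EX = "ex:"  # hypothetical prefix, not the ontology's real namespace


@dataclass
class DataScope:
    """A central data scope with datasets and typed manipulation steps."""
    identifier: str
    datasets: list = field(default_factory=list)
    steps: list = field(default_factory=list)  # (step_type, description) pairs

    def add_step(self, step_type, description):
        if step_type not in STEP_TYPES:
            raise ValueError("unknown step type: {}".format(step_type))
        self.steps.append((step_type, description))

    def to_triples(self):
        """Flatten the scope into (subject, predicate, object) tuples."""
        scope = EX + self.identifier
        triples = [(scope, "rdf:type", EX + "DataScope")]
        for dataset in self.datasets:
            triples.append((scope, EX + "usesDataset", dataset))
        for index, (step_type, description) in enumerate(self.steps, start=1):
            step = "{}_step{}".format(scope, index)
            triples.append((scope, EX + "hasStep", step))
            triples.append((step, "rdf:type", EX + step_type))
            triples.append((step, "rdfs:comment", description))
        return triples


# Invented example scope for a single article.
scope = DataScope("article1", datasets=["ex:newspaper_corpus"])
scope.add_step("Selection", "Kept only articles from one decade")
scope.add_step("Normalisation", "Harmonised spelling of person names")
for triple in scope.to_triples():
    print(triple)
```

Writing the steps down in this explicit form is what makes them shareable alongside the paper.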

The ontology can be accessed at http://biktorrr.github.io/datascope/

You can run live SPARQL queries on the extracted examples, represented as instances of this ontology, at https://semanticweb.cs.vu.nl/test/query

You can watch a pre-recorded video of my presentation below, or check out the slides [pdf].

Listening to AI: ARIAS workshop report

Last week, I attended the second workshop of the ARIAS working group on AI and the Arts. ARIAS is a platform for research on Arts and Sciences and as such seeks to build a bridge between these disciplines. The new working group is looking at the interplay between Arts and AI specifically. Interestingly, this is not only about using AI to make art, but also about exploring what art can do for AI (research). The workshop fell under the ARIAS theme “Art of Listening to the Matter” and consisted of a number of keynote talks and workshop presentations/discussions.

The workshop at the super-hip Butcher’s Tears in Amsterdam; note the 1.5m COVID-distance.

UvA university professor Tobias Blanke kicked off the meeting with an interesting overview of the different ‘schools’ of AI and how they relate to the humanities. Quite interesting was the talk by Sabine Niederer (a professor of visual methodologies at HvA) and Andy Dockett. They presented the results of an experiment feeding Climate Fiction (cli-fi) texts to the famous GPT algorithm. The results were then aggregated, filtered and visualized in a number of risoprint-like pamphlets.

My favourite talk of the day was by writer and critic Flavia Dzodan. Her talk was quite incendiary as it presented a post-colonial perspective on the whole notion of data science. Her point was that data science only truly started with the ‘discoveries’ of the Americas, the subsequent slave trade and the counting of people that this required. She then proceeded by pointing out some of the more nefarious examples of identification, classification and other data-driven ways of dealing with humans, especially those from marginalized groups. Her activist/artistic angle on this problem was to me quite interesting as it tied together themes around representation and participation that appear in the field of ICT4D and those found in AI and (Digital) Humanities. Food for thought at least.

The afternoon was reserved for talks from three artists that wanted to highlight various views on AI and art. Femke Dekker, S. de Jager and Martina Raponi all showed various art projects that in some way used AI technology and reflected on the practice and philosophical implications. Again, here GPT popped up a number of times, but also other methods of visual analysis and generative models.

Linked Art Provenance

In the past year, together with Ingrid Vermeulen (VU Amsterdam) and Chris Dijkshoorn (Rijksmuseum Amsterdam), I had the pleasure to supervise two students from VU, Babette Claassen and Jeroen Borst, who participated in a Network Institute Academy Assistant project around art provenance and digital methods. The growing number of datasets and digital services around art-historical information presents new opportunities for conducting provenance research at scale. The Linked Art Provenance project investigated to what extent it is possible to trace provenance of art works using online data sources.

Caspar Netscher, the Lacemaker, 1662, oil on canvas. London: the Wallace Collection, P237

In the interdisciplinary project, Babette (Art Market Studies) and Jeroen (Artificial Intelligence) collaborated to create a workflow model, shown below, to integrate provenance information from various online sources such as the Getty provenance index. This included an investigation into the potential of automatically extracting structured data from these online sources.

This model was validated through a case study, in which we investigated whether we could capture information from selected sources about an auction (1804) during which the paintings from the former collection of Pieter Cornelis van Leyden (1732-1788) were dispersed. An example work, the Lacemaker, is shown above. Interviews with various art historians validated the produced workflow model.

The workflow model also provides a basic guideline for provenance research and together with the Linked Open Data process can possibly answer relevant research questions for studies in the history of collecting and the art market.

More information can be found in the final report.

Digital Humanities in Practice 2018/2019

Last Friday, the students of the class of 2018/2019 of the course Digital Humanities and Social Analytics in Practice presented the results of their capstone internship project. This course and project is the final element of the Digital Humanities and Social Analytics minor programme, in which students from very different backgrounds gain skills and knowledge about the interdisciplinary topic.

Poster presentation of the DHiP projects

The course took the form of a 4-week internship at an organization working with humanities or social science data and challenges. Student groups were asked to use these skills and knowledge to address a research challenge. Projects ranged from cleaning, indexing, visualizing and analyzing humanities data sets to searching for bias in news coverage of political topics. The students showed their competences not only in their research work but also in communicating this research through great posters.

The complete list of student projects and collaborating institutions is below:

  • “An eventful 80 years’ war” at Rijksmuseum identifying and mapping historical events from various sources.
  • An investigation into the use of structured vocabularies also at the Rijksmuseum
  • “Collecting and Modelling Event WW2 from Wikipedia and Wikidata” in collaboration with Netwerk Oorlogsbronnen (see poster image below)
  • A project where a search index for Development documents governed by the NICC foundation was built.
  • “EviDENce: Ego Documents Events modelliNg – how individuals recall mass violence” – in collaboration with KNAW Humanities Cluster (HUC)
  • “Historical Ecology” – where students searched for mentions of animals in historical newspapers – also with KNAW-HUC
  • Project MIGRANT: Mobilities and connection project in collaboration with KNAW-HUC and Huygens ING
  • Capturing Bias with media data analysis – an internal project at VU looking at identifying media bias
  • Locating the CTA Archive Amsterdam where a geolocation service and search tool was built
  • Linking Knowledge Graphs of Symbolic Music with the Web – also an internal project at VU working with Albert Merono

One of the posters visualizing the events and persons related to the occupation of the Netherlands in WW2

Update: The student posters are now online at https://github.com/biktorrr/dhip2019posters
