Capturing Polyvocality of Cultural Heritage Events Through Crowdsourcing

[This post is based on Mohamad Fernanda’s Master Information Science thesis]

Cultural heritage event annotation often lacks diverse perspectives, resulting in incomplete or biased historical records. Master Information Science student Mohamad Fernanda’s research addresses that gap by examining how crowdsourcing can support polyvocality—bringing a broader range of viewpoints into the annotation process. His central research question is: How can annotations for cultural heritage events be sourced effectively and ethically to achieve polyvocality?

To explore this, he conducted a study in which he:

  1. gathered qualitative survey responses from 22 participants across three groups with different cultural backgrounds: a) native Dutch individuals, b) native Indonesians, and c) people of Dutch-Indonesian heritage;
  2. investigated how Large Language Models can recognize and synthesize polyvocal data.

The findings show that crowdsourcing can successfully capture multiple perspectives, resulting in a richer and more nuanced historical narrative. While LLMs offer promising support for analyzing such data, their use demands careful oversight and ethical consideration. Overall, this study demonstrates that a collaborative, ethically informed approach—combining crowdsourcing with LLM assistance—can help produce more balanced and representative accounts of cultural history.

The figure below shows the results of the coding done by the LLM of choice (Gemini 2.0 Flash). It recognizes five themes in the participants’ responses to questions about the representation of colonialism in Dutch museums:

Figure from Fernanda (2025)

The thesis, including the exact surveys and prompts used, can be found below.


UI for Polyvocal Provenance Reporting

[This post is based on Bella Abelardo’s Master Information Science thesis, “Designing a User Interface for Provenance Reporting of Objects with Colonial Heritage”]

Bella’s thesis addresses a critical challenge in cultural institutions: representing multiple perspectives for colonial heritage items. Current systems often create a “singular truth” in provenance reports, and unstructured data hinders discoverability.

Bella’s goal was to create a user interface to help provenance researchers holistically document the “polyvocal knowledge” often present in colonial heritage objects. Her research intended to explore improvements to the popular TMS content management system. To this end, she conducted interviews with various domain experts to gather design requirements and built a prototype, CultureSource.

Two figures showing the lo-fi design of the improved user interface (imgs: B. Abelardo)

The evaluation showed CultureSource’s potential to help researchers document multiple perspectives. Bella’s research provides key requirements—standardization, multiple perspectives, usability, and data management—for future user interfaces aimed at documenting complex, multi-layered histories.


SEMMES keynote: more than one side to the story

I was honored to be asked to give the keynote address for the 2nd edition of the Workshop on Semantic Methods for Events and Stories (SEMMES), at ESWC2024. I talked about work on polyvocality in cultural heritage knowledge graphs:

There is more than one side to every story. This common saying is not only true for works of fiction. In the global data space that is the Semantic Web, views and perspectives from different people, organizations and cultures should be available. I identify three challenges towards such a polyvocal Semantic Web. I will talk about ways to identify various voices, to model different perspectives and to make these perspectives available to end users. I will give examples from the cultural heritage domain, both in how semantic technologies can be of use to make available various perspectives on people, objects and events there but also how insights from the domain can help to shape the polyvocal Semantic Web.

You can find my slides below.


A Polyvocal and Contextualised Semantic Web

[This post is the text of a 1-minute pitch at the IWDS symposium for our poster “A Polyvocal and Contextualised Semantic Web”, which was published as the paper: Erp, Marieke van, and Victor de Boer. “A Polyvocal and Contextualised Semantic Web.” European Semantic Web Conference. Springer, Cham, 2021.]

Knowledge graphs are a popular way of representing and sharing data, information and knowledge in many domains on the Semantic Web. However, these knowledge graphs often represent singular, biased views on the world, which can lead to unwanted bias in AI using this data. We therefore identify a need for a more polyvocal Semantic Web.

So. How do we get there?

  1. We need perspective-aware methods for identifying existing polyvocality in datasets and for acquiring it from text or users.
  2. We need data models and patterns to represent polyvocal data, information and knowledge.
  3. We need visualisations and tools to make the polyvocal knowledge accessible and usable for a wide variety of users, including domain experts or laypersons with varying backgrounds.
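
To make the second point a bit more concrete, here is a minimal sketch in Python of what a perspective-aware record could look like. It is not the data model from the paper or from any Cultural AI Lab project: the `Statement` class, its field names, the `group_by_perspective` helper and the example data are all invented for illustration. The only idea it shows is that each claim carries the voice it comes from, so that conflicting claims about the same object can sit side by side instead of being merged into one "truth".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One claim about a subject, attributed to the voice that made it."""
    subject: str      # the object, person or event being described
    claim: str        # what is said about it
    perspective: str  # who says it (hypothetical label)
    source: str       # where the claim was recorded (hypothetical label)


def group_by_perspective(statements, subject):
    """Collect all claims about one subject, keyed by perspective."""
    voices = defaultdict(list)
    for s in statements:
        if s.subject == subject:
            voices[s.perspective].append(s.claim)
    return dict(voices)


# Invented example data, for illustration only.
statements = [
    Statement("object-1", "acquired during an expedition",
              "museum catalogue", "inventory card"),
    Statement("object-1", "taken without consent",
              "community of origin", "oral history interview"),
    Statement("object-2", "donated by a collector",
              "museum catalogue", "inventory card"),
]

# Both voices on object-1 are kept; neither overwrites the other.
voices = group_by_perspective(statements, "object-1")
for perspective, claims in voices.items():
    print(perspective, "->", claims)
```

In a real knowledge graph the same idea would more likely be expressed with provenance or named-graph patterns than with plain Python objects; the sketch only illustrates the shape of the problem.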

In the Cultural AI Lab, we investigate these challenges in several interrelated research projects, but we cannot, and should not, do it alone. We are looking for more voices to join us!
