Matching B Corporations via a Hybrid Information Retrieval Framework

[This post is based on Yitong Tang’s Master Information Science thesis, conducted in an internship project with 2CoolMonkeys B.V.]

In her master’s thesis, Yitong Tang presents the design, implementation, and evaluation of a hybrid information retrieval system that integrates structured filtering with LLMs to enhance partner matching within the B Corporations ecosystem. B Corporations are companies that have undergone verification to meet high standards of social and environmental performance, transparency, and accountability. As value-driven business networks continue to grow, organizations increasingly face the challenge of identifying suitable strategic partners within vast, unstructured datasets.

The thesis introduces a partner-matching system built on a hybrid information retrieval framework. Following a Design Science Research methodology, Tang iteratively developed and refined the system across four prototypes. The final architecture combines the efficiency of a traditional search engine (Elasticsearch) for structured data filtering with the deep semantic capabilities of an advanced LLM used to interpret unstructured text, such as company descriptions and mission statements.
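To make the two-stage idea concrete, here is a minimal sketch of "filter first, then rank by meaning". It is not the thesis implementation: the `Company` fields, the sample data, and all function names are hypothetical, the in-memory filter stands in for an Elasticsearch query, and the word-overlap score is only a placeholder for the LLM's semantic judgement.

```python
from dataclasses import dataclass


@dataclass
class Company:
    # Hypothetical record layout; the real B Corp data may look different.
    name: str
    country: str
    industry: str
    description: str


def structured_filter(companies, country=None, industry=None):
    """Stage 1: exact-match filtering on structured fields.

    Stand-in for a search-engine filter; constraints set to None are ignored.
    """
    return [
        c for c in companies
        if (country is None or c.country == country)
        and (industry is None or c.industry == industry)
    ]


def semantic_score(query, text):
    """Stage 2 placeholder: Jaccard overlap of lowercase word sets.

    In a real hybrid system an LLM would score relevance here instead.
    """
    q, t = set(query.lower().split()), set(text.lower().split())
    if not q or not t:
        return 0.0
    return len(q & t) / len(q | t)


def match_partners(companies, query, country=None, industry=None, top_k=3):
    """Filter on structured fields, then rank the survivors by text similarity."""
    candidates = structured_filter(companies, country, industry)
    ranked = sorted(
        candidates,
        key=lambda c: semantic_score(query, c.description),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up sample data, for illustration only.
COMPANIES = [
    Company("GreenPack", "NL", "packaging", "compostable packaging for food retailers"),
    Company("SolarRoof", "NL", "energy", "rooftop solar installations for small businesses"),
    Company("EcoWrap", "DE", "packaging", "recycled paper packaging for online shops"),
    Company("FairBean", "NL", "food", "fair trade coffee sourced from cooperatives"),
]

results = match_partners(COMPANIES, "compostable packaging for food", country="NL")
print([c.name for c in results])
```

The cheap structured filter narrows the candidate set before the more expensive semantic step runs, which is the general motivation for this kind of hybrid design.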

Screenshot of the recommender system (see the demonstration on YouTube).

To evaluate the system, she employed a mixed-methods approach that included quantitative performance metrics—precision, recall, and response time—as well as qualitative interviews with B Corp stakeholders. The findings show that the final hybrid model significantly outperforms baseline approaches, offering more accurate, contextually relevant, and well-reasoned recommendations. Its strengths are particularly evident in interpreting complex natural-language queries and incorporating geographical constraints.
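For readers unfamiliar with the quantitative metrics, the sketch below shows how precision and recall are commonly computed for a single query. The post does not state the exact definitions used in the thesis (for example, whether a cutoff such as precision@k was applied), so this is the textbook set-based version, with made-up company identifiers.

```python
def precision_recall(recommended, relevant):
    """Set-based precision and recall for one query.

    precision = correct recommendations / all recommendations
    recall    = correct recommendations / all relevant items
    Returns 0.0 for a metric whose denominator would be zero.
    """
    recommended_set = set(recommended)
    relevant_set = set(relevant)
    hits = len(recommended_set & relevant_set)
    precision = hits / len(recommended_set) if recommended_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: the system recommends four companies,
# while a human judge considers three companies relevant.
recommended = ["A", "B", "C", "D"]
relevant = ["A", "C", "E"]
precision, recall = precision_recall(recommended, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four recommendations are relevant, and two of the three relevant companies were found.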

Overall, this research contributes a validated and transferable blueprint for developing intelligent partner-matching systems within private or domain-specific knowledge bases. It demonstrates that combining traditional search technologies with modern LLMs provides a powerful approach for transforming raw organizational data into meaningful and actionable partnership opportunities—an approach with broad potential for mission-driven and complex business networks.

The thesis can be found below. A demonstration video can be found at https://www.youtube.com/watch?v=tcowqxA41p4

Capturing Polyvocality of Cultural Heritage Events Through Crowdsourcing

[This post is based on Mohamad Fernanda’s Master Information Science thesis]

Cultural heritage event annotation often lacks diverse perspectives, resulting in incomplete or biased historical records. Master Information Science student Mohamad Fernanda’s research addresses that gap by examining how crowdsourcing can support polyvocality—bringing a broader range of viewpoints into the annotation process. His central research question is: How can annotations for cultural heritage events be sourced effectively and ethically to achieve polyvocality?

To explore this, he conducted a study in which he:

  1. gathered qualitative survey responses from 22 participants across three groups with different cultural backgrounds: a) native Dutch individuals, b) native Indonesians, and c) people of Dutch-Indonesian heritage;
  2. investigated how Large Language Models can recognize and synthesize polyvocal data.

The findings show that crowdsourcing can successfully capture multiple perspectives, resulting in a richer and more nuanced historical narrative. While LLMs offer promising support for analyzing such data, their use demands careful oversight and ethical consideration. Overall, this study demonstrates that a collaborative, ethically informed approach—combining crowdsourcing with LLM assistance—can help produce more balanced and representative accounts of cultural history.

The figure below shows the results of the coding done by the LLM of choice (Gemini 2.0 Flash). It recognizes five themes in the participants’ responses to questions about the representation of colonialism in Dutch museums:

Figure from Fernanda (2025)

The thesis, including the exact surveys and prompts used, can be found below.
