Matching B Corporations via a Hybrid Information Retrieval Framework

[This post is based on Yitong Tang’s Master Information Science thesis, conducted in an internship project with 2CoolMonkeys B.V.]

In her master’s thesis, Yitong Tang presents the design, implementation, and evaluation of a hybrid information retrieval system that integrates structured filtering with LLMs to enhance partner matching within the B Corporations ecosystem. B Corporations are companies that have undergone verification to meet high standards of social and environmental performance, transparency, and accountability. As value-driven business networks continue to grow, organizations increasingly face the challenge of identifying suitable strategic partners within vast, unstructured datasets.

This thesis introduces a partner-matching system built on a hybrid information retrieval framework. Following a Design Science Research methodology, she iteratively developed and refined the system across four prototypes. The final architecture combines the efficiency of a traditional search engine (Elasticsearch) for structured data filtering with the deep semantic capabilities of an advanced LLM used to interpret unstructured text, such as company descriptions and mission statements.
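The two-stage idea can be illustrated with a minimal sketch, which is not the thesis implementation. It assumes hypothetical company fields (`country`, `industry`, `description`) and uses a simple token-overlap function as a stand-in for the LLM relevance judgement. The query body follows the Elasticsearch bool/filter query DSL, but it is only built here, not sent to a cluster; the filter is applied locally so the example runs on its own.

```python
from typing import Callable

def build_filter_query(country: str, industry: str, size: int = 50) -> dict:
    """Build an Elasticsearch-style bool query with exact-match filters only.
    The field names are assumptions made for this illustration."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"country": country}},
                    {"term": {"industry": industry}},
                ]
            }
        },
    }

def apply_filter(companies: list[dict], country: str, industry: str) -> list[dict]:
    """Local stand-in for running the structured filter against an index."""
    return [c for c in companies
            if c["country"] == country and c["industry"] == industry]

def token_overlap(query: str, text: str) -> float:
    """Placeholder scorer: share of query words found in the text.
    In a hybrid system an LLM-based relevance score would go here."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def rerank(candidates: list[dict], query: str,
           score_fn: Callable[[str, str], float], top_k: int = 3) -> list[str]:
    """Order the filtered candidates by semantic score, highest first."""
    scored = [(score_fn(query, c["description"]), c["name"]) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]

# Made-up example data, for illustration only.
companies = [
    {"name": "GreenPack", "country": "NL", "industry": "packaging",
     "description": "compostable packaging for food brands"},
    {"name": "BoxWorks", "country": "NL", "industry": "packaging",
     "description": "cardboard boxes for logistics"},
    {"name": "EcoWrap", "country": "DE", "industry": "packaging",
     "description": "compostable packaging for food"},
]

es_query = build_filter_query("NL", "packaging")
candidates = apply_filter(companies, "NL", "packaging")
ranking = rerank(candidates, "compostable food packaging", token_overlap)
```

The structured filter narrows the candidate set cheaply, so the more expensive semantic scorer only has to look at the companies that already satisfy the hard constraints.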

Screenshot of the recommender system (see demonstration on YouTube).

To evaluate the system, she employed a mixed-methods approach that included quantitative performance metrics—precision, recall, and response time—as well as qualitative interviews with B Corp stakeholders. The findings show that the final hybrid model significantly outperforms baseline approaches, offering more accurate, contextually relevant, and well-reasoned recommendations. Its strengths are particularly evident in interpreting complex natural-language queries and incorporating geographical constraints.
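For readers unfamiliar with the two accuracy metrics, the sketch below shows how precision and recall are commonly computed for a single query. The company labels and the set of relevant partners are made up for illustration and do not come from the thesis.

```python
def precision_recall(recommended: list[str], relevant: list[str]) -> tuple[float, float]:
    """Precision: share of recommended items that are relevant.
    Recall: share of relevant items that were recommended."""
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    return precision, recall

# Hypothetical query result: four recommendations, three truly relevant partners.
precision, recall = precision_recall(["A", "B", "C", "D"], ["A", "C", "E"])
```

Here two of the four recommendations are relevant, and two of the three relevant partners were found.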

Overall, this research contributes a validated and transferable blueprint for developing intelligent partner-matching systems within private or domain-specific knowledge bases. It demonstrates that combining traditional search technologies with modern LLMs provides a powerful approach for transforming raw organizational data into meaningful and actionable partnership opportunities—an approach with broad potential for mission-driven and complex business networks.

The thesis can be found below. A demonstration video can be found at https://www.youtube.com/watch?v=tcowqxA41p4

MSc. Project Roy Hoeymans: Effective Recommendation in Knowledge Portals – the SKYbrary case study

[This post was written by Roy Hoeymans. It describes his MSc. project.]

In this master project, which I carried out externally at DNV-GL, I built a recommender system for knowledge portals. Recommender systems are pieces of software that suggest related items to a user. My research focuses on the application of such a system in knowledge portals. A knowledge portal is an online single point of access to information or knowledge on a specific subject. Examples of knowledge portals are SKYbrary (www.skybrary.aero) and Navipedia (www.navipedia.org).

Part of this project was a case study on SKYbrary, a knowledge portal on the subject of aviation safety. In this project I looked at the types of data that are typically available to knowledge portals. I used user navigation pattern data, which I retrieved via the Google Analytics API, and the text of the articles to create a user-navigation-based and a content-based algorithm. The user-navigation-based algorithm uses an item association formula, and the content-based algorithm uses a tf-idf weighting scheme to calculate content similarity between articles. Because each type of algorithm has its own disadvantages, I also developed a hybrid algorithm that combines the two.
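The three ingredients can be sketched roughly as follows. This is an illustration, not the code from the thesis: the post does not state the exact item association formula or how the two scores are combined, so the sketch assumes a simple co-occurrence ratio over navigation sessions and a weighted sum with a hypothetical weight `alpha`. The article texts and sessions are made up.

```python
import math
from collections import Counter

def tfidf_vectors(docs: dict[str, str]) -> dict[str, dict[str, float]]:
    """Map each article id to a sparse tf-idf vector {term: weight}."""
    tokens = {a: text.lower().split() for a, text in docs.items()}
    n = len(docs)
    # Document frequency: in how many articles does each term occur?
    df = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for a, toks in tokens.items():
        tf = Counter(toks)
        vectors[a] = {t: (c / len(toks)) * math.log(n / df[t])
                      for t, c in tf.items()}
    return vectors

def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def association(sessions: list[set[str]], a: str, b: str) -> float:
    """Assumed association measure: share of sessions visiting `a`
    that also visited `b`."""
    with_a = [s for s in sessions if a in s]
    if not with_a:
        return 0.0
    return sum(1 for s in with_a if b in s) / len(with_a)

def hybrid(content_score: float, nav_score: float, alpha: float = 0.5) -> float:
    """Assumed combination: weighted sum of the two scores."""
    return alpha * content_score + (1 - alpha) * nav_score

# Made-up articles and navigation sessions.
docs = {
    "runway_incursion": "runway incursion aircraft runway safety",
    "runway_excursion": "runway excursion aircraft landing safety",
    "bird_strike": "bird strike engine wildlife hazard",
}
sessions = [
    {"runway_incursion", "runway_excursion"},
    {"runway_incursion", "bird_strike"},
    {"runway_incursion", "runway_excursion"},
]

vecs = tfidf_vectors(docs)
content = cosine(vecs["runway_incursion"], vecs["runway_excursion"])
nav = association(sessions, "runway_incursion", "runway_excursion")
score = hybrid(content, nav, alpha=0.5)
```

A combination like this lets navigation data compensate when two related articles share little vocabulary, and lets content similarity cover articles with too few visits to produce reliable navigation statistics.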

Screenshot of the demo application

To see which type of algorithm was the most effective, I conducted a survey among the content editors of SKYbrary, who are domain experts on the subject. Each question in the survey showed an article and then recommendations for that article. The respondent was then asked to rate each recommended article on a scale from 1 (completely irrelevant) to 5 (very relevant). The results of the survey showed that the hybrid algorithm performs better than the user-navigation-based algorithm, with a statistically significant difference. No such difference was found between the hybrid algorithm and the content-based algorithm, however. Future work might include a more extensive or different type of evaluation.

In addition to the research on the algorithms, I also developed a demo application that the content editors of SKYbrary can use to show recommendations for a selected article and algorithm.

For more information, view Roy Hoeymans’ Thesis Presentation [pdf] or read the thesis [Academia].
