Exploring Culinary Links with NLP and Knowledge Graphs

[This post is based on Nour al Assali’s bachelor AI thesis]

Nour’s research explores the use of Natural Language Processing (NLP) and Knowledge Graphs to investigate the historical connections and cultural exchanges within global cuisines. The thesis “Flavours of History: Exploring Historical and Cultural Connections Through Ingredient Analysis Using NLP and Knowledge Graphs” describes a method for analyzing ingredient usage patterns across various cuisines by processing a dataset of recipes. Its goal is to trace the diffusion and integration of ingredients into different culinary traditions. The primary aim is to establish a digital framework for addressing questions related to culinary history and cultural interactions.

The methodology involves applying NLP to preprocess recipe data, focusing on extracting and normalizing ingredient names. The pipeline contains steps for stop word removal, tokenization, lemmatization, character replacements, and similar cleanup operations.
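To make the preprocessing steps concrete, here is a minimal sketch of such a normalization pipeline using only the Python standard library. It is not the code from the thesis: the stop word list, the character replacements, and the crude plural-stripping rule that stands in for a real lemmatizer are all assumptions chosen for illustration.

```python
import re
import unicodedata

# Hypothetical stop words (units and preparation terms); not taken from the thesis.
STOP_WORDS = {"of", "a", "the", "and", "cup", "cups", "tbsp", "tsp",
              "fresh", "chopped", "large"}

# Hypothetical character replacements applied before tokenization.
CHAR_REPLACEMENTS = {"&": " and ", "-": " "}


def strip_accents(text):
    """Remove diacritics, e.g. 'jalapeño' becomes 'jalapeno'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def naive_lemma(token):
    """Crude plural stripping; a stand-in for a real lemmatizer."""
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("oes") and len(token) > 4:
        return token[:-2]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def normalize_ingredient(raw):
    """Lowercase, replace characters, tokenize, drop stop words, lemmatize."""
    text = strip_accents(raw.lower())
    for old, new in CHAR_REPLACEMENTS.items():
        text = text.replace(old, new)
    tokens = re.findall(r"[a-z]+", text)  # keeps letters only, drops quantities
    kept = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(naive_lemma(t) for t in kept)


print(normalize_ingredient("2 cups chopped Tomatoes"))
print(normalize_ingredient("Jalapeño peppers"))
```

A production pipeline would more likely rely on a library lemmatizer and a curated stop word list; the rule-based version above only shows the order of the steps.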

With the results, a Knowledge Graph is constructed to map relationships between ingredients, recipes, and cuisines. The approach also includes visualizing these connections, with an interactive map and other tools designed to provide insights into the data and answer key research questions. The figure below shows a visualisation of top ingredients per cuisine.
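A graph of this kind can be sketched as a set of subject-predicate-object triples. The example below is an illustration only: the toy recipes, the predicate names `belongsToCuisine` and `hasIngredient`, and the `top_ingredients` helper are assumptions, not the schema or code used in the thesis.

```python
from collections import Counter, defaultdict

# Hypothetical toy recipes, invented for illustration.
RECIPES = [
    {"name": "caprese salad", "cuisine": "italian",
     "ingredients": ["tomato", "basil", "mozzarella"]},
    {"name": "pesto", "cuisine": "italian",
     "ingredients": ["basil", "pine nut", "olive oil"]},
    {"name": "baklava", "cuisine": "turkish",
     "ingredients": ["pistachio", "honey", "phyllo"]},
]


def build_triples(recipes):
    """Turn recipe records into (subject, predicate, object) triples."""
    triples = set()
    for recipe in recipes:
        triples.add((recipe["name"], "belongsToCuisine", recipe["cuisine"]))
        for ingredient in recipe["ingredients"]:
            triples.add((recipe["name"], "hasIngredient", ingredient))
    return triples


def top_ingredients(triples, n=3):
    """Count how often each ingredient occurs in the recipes of each cuisine."""
    cuisine_of = {s: o for s, p, o in triples if p == "belongsToCuisine"}
    counts = defaultdict(Counter)
    for s, p, o in triples:
        if p == "hasIngredient":
            counts[cuisine_of[s]][o] += 1
    return {cuisine: c.most_common(n) for cuisine, c in counts.items()}


graph = build_triples(RECIPES)
print(top_ingredients(graph, n=1)["italian"])
```

Storing the data as triples keeps the sketch close to how RDF-style knowledge graphs are usually represented, so the same query could later be expressed against a real triple store.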

Case studies on ingredients such as pistachios, tomatoes, basil, olives, and cardamom illustrate distinct usage patterns and origins. The findings reveal that certain ingredients associated with specific regions, like pistachios, basil, and tomatoes, have gained widespread international popularity, while others, such as olives and cardamom, maintain strong ties to their places of origin. This research underscores the influence of historical trade routes and cultural exchanges on contemporary culinary practices and offers a digital foundation for future investigations into culinary history and food culture.

The code and dataset used in this research are available on GitHub: https://github.com/Nour-alasali/BPAI. The complete thesis can be found below.


Thesis writing guidelines

As supervisor of many MSc and BSc theses, I find myself giving writing tips and guidelines quite often. Inspired by Jan van Gemert’s guidelines, I compiled my own document with tips and guidelines for writing a CS/AI/IS bachelor or master thesis. These are things that I personally care about, and other lecturers might have different ideas. Also, this is by no means a complete list, and I will treat it as a living document. You can find it here: https://tinyurl.com/victorthesiswriting


Multitasking Behaviour and Gaze-Following Technology for Workplace Video-Conferencing

[This post was written by Eveline van Everdingen and describes her M.Sc. project]

Working with multiple monitors is very common in the workplace nowadays. A second monitor can improve work efficiency and structure and give a better overview of a job. Dual monitors are even used in business video-conferencing. Although the purpose of dual monitor use might be clear to the multitasker, this behaviour is not always perceived as positive by their video-conferencing partners.

Gaze direction of the multitasker with the focus on the primary monitor (left), on the dual monitor (middle) or in between two monitors when switching (right).

Results show that multitasking on a dual screen or mobile device is rated as less polite and less acceptable than doing something else on the same screen. Although the multitasker might be involved with the meeting, he or she seems less engaged with it, resulting in negative perceptions.

Effect of technology on politeness of multitasking

Improving the sense of eye contact might result in a better video-conferencing experience with the multitasker. Therefore, a gaze-following tool with two webcams was designed (code available at https://github.com/een450/MasterProject ). When the multitasker switches to the dual screen, another webcam catches the frontal view of the multitasker. Indeed, participants rate the multitasking behaviour as more polite and acceptable with the dynamic view of the multitasker. However, the sense of eye contact is not rated significantly more positively with this experimental design.

These results show that gaze-following webcam technology can successfully improve collaboration in dual-monitor multitasking.

For more information, read Eveline’s thesis [pdf] or visit the project’s figshare page.

Example of a video presented to the experiment participants.


MSc. Project: The search for credibility in news articles and tweets

[This post was written by Marc Jacobs and describes his MSc Thesis research]

Nowadays the world no longer relies only on traditional news sources like newspapers, television and radio. Social media, such as Twitter, are claiming a key position here, thanks to their fast publishing speed and large number of items. As one may suspect, the credibility of this unrated news becomes questionable. My Master’s thesis focuses on determining measurable features (such as retweets, likes or number of Wikipedia entities) in newsworthy tweets and online news articles.

Credibility framework pyramid

The gathering of the credibility features consisted of two parts: a theoretical part and a practical part. First, a theoretical credibility framework was built using recent studies about credibility on the Web. Next, Ubuntu was booted, Python was started, and news articles and tweets, including metadata, were mined. The news items were analysed, and, based on the credibility framework, features were extracted. Further information retrieval techniques (website scraping, regular expressions, NLTK, IR APIs) were used to extract additional features, extending the coverage of the credibility framework.
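As a rough illustration of the regular-expression part of such feature extraction, the sketch below counts a few surface features in a tweet's text. The feature names, the patterns, and the `surface_features` function are assumptions made for this example; they are not the features or code from the thesis.

```python
import re

# Hypothetical patterns for simple surface features of a tweet.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")   # lookbehind skips e-mail addresses
HASHTAG_RE = re.compile(r"(?<!\w)#\w+")   # lookbehind skips URL fragments


def surface_features(text):
    """Return counts of URLs, mentions, hashtags and exclamation marks."""
    return {
        "n_urls": len(URL_RE.findall(text)),
        "n_mentions": len(MENTION_RE.findall(text)),
        "n_hashtags": len(HASHTAG_RE.findall(text)),
        "n_exclamations": text.count("!"),
        "n_chars": len(text),
    }


# Invented example text, not taken from the thesis dataset.
tweet = "Breaking: @bbc reports flood in #Utrecht! https://example.org/a"
print(surface_features(tweet))
```

Features such as retweets and likes would come from the tweet metadata rather than from the text, so they are left out of this sketch.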

The data processing and experimentation pipeline

The last step in this research was to present the features to the crowd in an experimental design, using the crowdsourcing platform Crowdflower. The correlation between each specific feature and the credibility of the tweet or news article was calculated. The results were compared to find the differences and similarities between tweets and articles.
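The correlation step can be sketched as follows. The post does not say which correlation measure was used, so Pearson's r is assumed here, and the `pearson_r` function and the numbers in the example are invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented values: one feature (e.g. a retweet count) per item, and the
# mean credibility rating the crowd gave to that item.
feature_values = [0, 3, 10, 25, 40]
crowd_ratings = [2.1, 2.4, 3.0, 3.8, 4.5]
print(round(pearson_r(feature_values, crowd_ratings), 3))
```

If the crowd ratings are ordinal, a rank-based measure such as Spearman's rho may be a better fit; the structure of the computation stays the same.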

The highly correlated credibility features (which include the number of matches with Wikipedia entries) may be used in the future to construct credibility algorithms that automatically assess the credibility of newsworthy tweets or news articles and, hopefully, help to filter reliable news from the impenetrable pile of data on the Internet.

Read all the details in Marc’s thesis
