Multitasking Behaviour and Gaze-Following Technology for Workplace Video-Conferencing.

[This post was written by Eveline van Everdingen and describes her M.Sc. project]

Working with multiple monitors is very common in the workplace nowadays. A second monitor can increase work efficiency, add structure and give a better overview of a job. Dual monitors are even used in business video-conferencing. Although the purpose of dual-monitor use might be clear to the multitasker, this behaviour is not always perceived as positive by their video-conferencing partners.

Gaze direction of the multitasker with the focus on the primary monitor (left), on the dual monitor (middle) or in between two monitors when switching (right).

Results show that multitasking on a second screen or mobile device is rated as less polite and less acceptable than doing something else on the same screen. Although the multitasker might be involved with the meeting, he or she seems less engaged with it, resulting in negative perceptions.

Effect of technology on politeness of multitasking

Improving the sense of eye contact might result in a better video-conferencing experience with the multitasker. Therefore, a gaze-following tool with two webcams was designed (code available at https://github.com/een450/MasterProject ). When the multitasker switches to the second screen, another webcam captures the frontal view of the multitasker. Indeed, participants rate the multitasking behaviour as more polite and acceptable with this dynamic view of the multitasker. The sense of eye contact, however, is not rated significantly more positively with this experimental design.
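The switching idea can be sketched in a few lines of Python. This is only an illustration and not the code from the repository above: the camera labels, the yaw angles and the hysteresis margin are assumptions made for the example, and the head-yaw estimate is assumed to come from some external head-pose or gaze tracker.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    name: str        # hypothetical label: the monitor this webcam sits on
    yaw_deg: float   # assumed horizontal angle of that monitor, seen from the user


def pick_camera(head_yaw_deg, cameras, current, margin_deg=5.0):
    """Return the camera that gives the most frontal view of the user.

    The current camera is kept unless another one is closer to the head
    direction by more than margin_deg. This avoids flickering while the
    user's head is somewhere between the two monitors.
    """
    best = min(cameras, key=lambda c: abs(c.yaw_deg - head_yaw_deg))
    if best is current:
        return current
    gain = abs(current.yaw_deg - head_yaw_deg) - abs(best.yaw_deg - head_yaw_deg)
    return best if gain > margin_deg else current


primary = Camera("primary monitor", yaw_deg=0.0)
secondary = Camera("second monitor", yaw_deg=40.0)

# Simulated head-yaw readings (degrees) while the user turns to the
# second monitor and back again.
active = primary
selected = []
for yaw in [0.0, 10.0, 22.0, 35.0, 40.0, 18.0, 5.0]:
    active = pick_camera(yaw, [primary, secondary], active)
    selected.append(active.name)
print(selected)
```

The margin is a design choice: without it, the view would jump back and forth whenever the head direction hovers near the midpoint between the two screens.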

These results show that gaze-following webcam technology can help improve collaboration in dual-monitor multitasking.

For more information, read Eveline’s thesis [pdf] or visit the project’s figshare page.

Example of a video presented to the experiment participants.


MSc Project: The Implications of Using Linked Data when Connecting Heterogeneous User Information

[This post describes Karl Lundfall’s MSc Thesis research and is adapted from his thesis]

In the realm of database technologies, the reign of SQL is slowly coming to an end with the advent of many NoSQL (Not Only SQL) alternatives. Linked Data in the form of RDF is one of these, and is regarded as highly effective for connecting datasets. In this thesis, we looked into how the choice of database can affect the development, maintenance, and quality of a product by revising a solution for the social enterprise Text to Change Mobile (TTC).

TTC is a non-governmental organization equipping customers in developing countries with high-quality information and important knowledge they could not acquire for themselves. TTC offers mobile-based solutions such as SMS and call services and focuses on projects that bring about social change in line with the company’s values.

We took a real-world system for linking datasets that was based on a much more mainstream NoSQL technology and revised it to use Linked Data instead. The result (see the figure on the left) was a more modular system living up to many of the promises of RDF.

Overview of the Linked Data-enabled tool to connect multiple heterogeneous databases developed in the context of this Msc Project.

On the other hand, we also found that for this use case there are some obstacles to adopting Linked Data. We saw indicators that more momentum needs to build up before RDF is mature enough to be easily applied to use cases like this. The implementation we present demonstrates a different flavor of Linked Data than the common scenario of publishing data for public reuse, and by applying the technology in business contexts we might be able to expand the possibilities of Linked Data.

As a by-product of the research, a Node.js module for Prolog communication with ClioPatria was developed and made available at https://www.npmjs.com/package/prolog-db . This module might illustrate that new applications using RDF could help create a snowball effect, in which improved quality of RDF-powered applications attracts even more practitioners.

Read more in Karl’s MSc. Thesis 


MSc. Project: The search for credibility in news articles and tweets

[This post was written by Marc Jacobs and describes his MSc Thesis research]

Nowadays the world no longer relies only on traditional news sources like newspapers, television and radio. Social media, such as Twitter, are claiming a key position here, thanks to their fast publishing speed and large number of items. As one may suspect, the credibility of this unrated news becomes questionable. My Master’s thesis focuses on determining measurable features (such as retweets, likes or number of Wikipedia entities) in newsworthy tweets and online news articles.

Credibility framework pyramid


The gathering of the credibility features consisted of two parts: a theoretical part and a practical part. First, a theoretical credibility framework was built using recent studies about credibility on the Web. Next, Ubuntu was booted, Python was started, and news articles and tweets, including metadata, were mined. The news items were analysed and, based on the credibility framework, features were extracted. Further information retrieval techniques (website scraping, regular expressions, NLTK, IR APIs) were used to extract additional features, extending the coverage of the credibility framework.

The data processing and experimentation pipeline

The last step in this research was to present the features to the crowd in an experimental design, using the crowdsourcing platform Crowdflower. The correlation between each feature and the credibility of the tweet or news article was then calculated, and the results were compared to find the differences and similarities between tweets and articles.
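To make the correlation step concrete, here is a minimal sketch in Python. The post does not say which correlation measure was used, so Pearson's coefficient is an assumption here, and the numbers are made up purely for illustration (they are not data from the thesis).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented example: one feature value per tweet (number of retweets) and
# the mean credibility rating the crowd gave to that tweet.
retweets = [0, 2, 5, 9, 14]
credibility = [1.0, 2.0, 2.5, 4.0, 4.5]

r = pearson(retweets, credibility)
print(f"correlation between retweets and credibility: {r:.3f}")
```

A coefficient close to 1 or -1 marks a feature as strongly related to perceived credibility, while a value near 0 suggests the feature carries little signal on its own.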

The highly correlated credibility features (which include the number of matches with Wikipedia entries) may be used in the future to construct credibility algorithms that automatically assess the credibility of newsworthy tweets or news articles and, hopefully, help to filter reliable news from the impenetrable pile of data on the Internet.

Read all the details in Marc’s thesis


MSc. project: Requirements and design for a Business Intelligence system for SMEs

[This post was written by Arnold Kraakman and describes his MSc Thesis research]

This master project was written as an advisory report for the construction company and contractor K. Dekker B.V. and deals with Business Intelligence. Business Intelligence (BI) is a term that refers to information which can be used to make business decisions. The thesis answers the question of what options are available for K. Dekker to implement BI within two years from the moment of writing. The research was done through semi-structured interviews and data mining. The interviews were used to draw up a requirements list based on feedback from the final users, and with this list a concept dashboard was made which could be used by K. Dekker. A BI dashboard is one way for the company to put its information to use and eventually implement Business Intelligence.

concept dashboard – project result in detail

Screenshot #1 shows an overview of the currently running projects, with the financial forecast. Most interviewees did not know which projects K. Dekker B.V. was currently running. Screenshot #2 shows the project characteristics and their financial result; this was the biggest must-have on the requirements list. A construction project has different characteristics: for example, a bridge built in Noord-Holland with a specific tender procedure and a specific contract form (for example: “design the whole project and build it as well” instead of only building it). Those characteristics could influence the final financial profit.

concept dashboard – project overview

The thesis includes specific recommendations for K. Dekker to realize BI within two years. This list is also generalized to Small and Medium-sized Enterprises (SMEs). One recommendation is to write work instructions for the ERP software, so that everyone knows what information has to be entered into the system and how. If data is entered incorrectly, decisions based on it could be incorrect as well. It is also recommended to make a project manager responsible for all the entered information. This will lead to better and more accurate information and therefore to more reliable business decisions.

You can download the thesis here: arnold_kraakman_final_thesis


Msc. Project: Linking Maritime Datasets to Dutch Ships and Sailors Cloud – Case studies on Archangelvaart and Elbing

[This post was written by Jeroen Entjes and describes his Msc Thesis research]

The Dutch maritime supremacy during the Dutch Golden Age has had a profound influence on the modern Netherlands and possibly other places around the globe. As such, much historic research has been done on the matter, facilitated by the thorough documentation that many ports kept of their shipping. As more and more of this documentation is digitized, new ways of exploring the data are created.

Screenshot showing an entry from the Elbing website

This master project uses one such way. Building on the Dutch Ships and Sailors project, digitized maritime datasets have been converted to RDF and published as Linked Data. Linked Data refers to structured data on the web that is published and interlinked according to a set of standards. The conversion was based on requirements for this data, set up with historians from the Huygens ING Institute, who provided the datasets. The datasets chosen were those of Archangel and Elbing, as these offer information on the Dutch Baltic trade, the cradle of the Dutch merchant navy that sailed the world during the Dutch Golden Age.
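As a rough idea of what such a conversion involves, the sketch below turns one tabular record into RDF triples in N-Triples syntax using plain Python. It is not the conversion used in the thesis: the namespace `http://example.org/dss/`, the `Voyage` class, the field names and the sample record are all placeholders invented for this example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/dss/"  # placeholder namespace, not the real vocabulary


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Serialise a value as a plain string literal with basic escaping."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def record_to_ntriples(record):
    """Convert one record (a dict with an 'id' key) into N-Triples lines.

    Every other non-empty field becomes one triple whose predicate is the
    field name appended to the placeholder namespace, so field names are
    assumed to be safe to use inside an IRI.
    """
    subject = iri(f"{BASE}voyage/{record['id']}")
    lines = [f"{subject} {iri(RDF_TYPE)} {iri(BASE + 'Voyage')} ."]
    for key in sorted(record):
        value = record[key]
        if key == "id" or value is None or value == "":
            continue
        lines.append(f"{subject} {iri(BASE + key)} {literal(value)} .")
    return lines


# Invented sample record, only to show the shape of the output.
sample = {"id": "42", "shipName": "De Hoop", "destination": "Archangel", "year": 1750}
triples = record_to_ntriples(sample)
print("\n".join(triples))
```

Once records from different datasets are expressed with shared IRIs (for the same ship, port or captain, say), they can be loaded into one triple store and queried together, which is what makes linking to an existing cloud possible.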

Along with requirements for the data, the historians were also interviewed to gather research questions that combined datasets could help solve. The goal of this research was to see if additional datasets could be linked to the existing Dutch Ships and Sailors cloud and if such a conversion could help solve the research questions the historians were interested in.
Data visualization showing shipping volume of different datasets.

As part of this research, the datasets have been converted to RDF and published as Linked Data as an addition to the Dutch Ships and Sailors cloud, and a set of interactive data visualizations has been made to answer the historians’ research questions. Based on the conversion, a set of recommendations is made on how to convert new datasets and add them to the Dutch Ships and Sailors cloud. All data representations and conversions have been evaluated by historians to assess their effectiveness.

The data visualizations can be found at http://www.entjes.nl/jeroen/thesis/. Jeroen’s thesis can be found here: Msc. Thesis Jeroen Entjes


MSc. Project Roy Hoeymans: Effective Recommendation in Knowledge Portals – the SKYbrary case study

[This post was written by Roy Hoeymans. It describes his MSc. project]

In this master project, which I have done externally at DNV-GL, I have built a recommender system for knowledge portals. Recommender systems are pieces of software that provide suggestions for related items to a user. My research focuses on the application of a recommender system in knowledge portals. A knowledge portal is an online single point of access to information or knowledge on a specific subject. Examples of knowledge portals are SKYbrary (www.skybrary.aero) or Navipedia (www.navipedia.org).

Part of this project was a case study on SKYbrary, a knowledge portal on the subject of aviation safety. In this project I looked at the types of data that are typically available to knowledge portals. I used user navigation pattern data, which I retrieved via the Google Analytics API, and the text of the articles to create a user-navigation-based and a content-based algorithm. The user-navigation-based algorithm uses an item association formula and the content-based algorithm uses a tf-idf weighting scheme to calculate content similarity between articles. Because both types of algorithm have their own disadvantages, I also developed a hybrid algorithm that combines the two.
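The content-based part can be illustrated with a small Python sketch of tf-idf weighting and cosine similarity. This is not the thesis implementation: the whitespace tokenisation, the toy article texts and the weighted-sum `hybrid_score` are assumptions made for the example, and the navigation-based score is simply taken as a given number because the item association formula is not spelled out in this post.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: tf-idf weight} vector."""
    tokenised = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenised)
    doc_freq = Counter()
    for tokens in tokenised.values():
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = {}
    for name, tokens in tokenised.items():
        term_freq = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is all zero)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_score(content_sim, navigation_sim, alpha=0.5):
    """Assumed combination: a weighted sum of the two similarity scores."""
    return alpha * content_sim + (1 - alpha) * navigation_sim


# Toy article texts, invented for illustration only.
articles = {
    "runway incursion": "runway incursion aircraft runway safety",
    "runway excursion": "runway excursion aircraft overrun safety",
    "bird strike": "bird strike engine wildlife hazard",
}
vectors = tfidf_vectors(articles)
sim_related = cosine(vectors["runway incursion"], vectors["runway excursion"])
sim_unrelated = cosine(vectors["runway incursion"], vectors["bird strike"])
print(sim_related > sim_unrelated)
```

A content-based score like this works for new articles that nobody has visited yet, whereas a navigation-based score reflects what readers actually do; combining them is one way to offset the weaknesses of each.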

Screenshot of the demo application

To see which type of algorithm was the most effective, I conducted a survey among the content editors of SKYbrary, who are domain experts on the subject. Each question in the survey showed an article and then recommendations for that article. The respondent was then asked to rate each recommended article on a scale from 1 (completely irrelevant) to 5 (very relevant). The results of the survey showed that the hybrid algorithm is better than the user-navigation-based algorithm, with a statistically significant difference. However, no difference was found between the hybrid algorithm and the content-based algorithm. Future work might include a more extensive or different type of evaluation.

In addition to the research I have done on the algorithms, I have also developed a demo application which the content editors of SKYbrary can use to show recommendations for a selected article and algorithm.

For more information, view Roy Hoeymans’ Thesis Presentation [pdf] or read the thesis [Academia].


Two TPDL papers accepted!

Today, the TPDL (International Conference on Theory and Practice of Digital Libraries) results came in and both papers on which I am a co-author got accepted. Today is a good day 🙂

In the first paper, we present work done during my stay at the Netherlands Institute for Sound and Vision on automatic term extraction from subtitles. The interesting thing about this paper is that it mainly looks at how these algorithms function in a ‘real’ context, that is, within a larger media ecosystem. The paper was co-authored with Roeland Ordelman and Josefien Schuurman.

Screenshot of the QHP tool

I am also one of the co-authors of the second paper. In the paper “Supporting Exploration of Historical Perspectives across Collections”, we present an exploratory search application that highlights different perspectives on World War II across collections (including Verrijkt Koninkrijk). The project is funded by an Amsterdam Data Science seed project and is joint work with Daan Odijk, research assistants Cristina Gârbacea and Thomas Schoegje, VU/CWI colleagues Laura Hollink and Jacco van Ossenbruggen and historian Kees Ribbens (NIOD). You can read more about it on Daan’s blog.


A Sugar Activity for Subsistence Farmers

[reblogged from http://worldwidesemanticweb.org/2015/03/06/a-sugar-activity-for-subsistence-farmers/ This post is written by Tom Jansen]

Screenshot of the Sugar activity (Tom Jansen)

Subsistence farming, or subsistence agriculture, is a form of farming where farmers mainly focus on growing enough food to be self-sufficient. This type of farming is very common, especially in African countries, where people are very dependent on home-grown food. Subsistence farming in these countries has a lot of untapped potential. Improving the farming skills of the farmers could make a significant contribution to the reduction of hunger. Unfortunately, farmers often have not had enough agricultural education to grow their own food optimally. To help these farmers, I developed an activity that aims to improve their farming skills. The application helps the farmers to identify diseases of their crops and animals and presents them with ways to manage the diseases and prevent them in the future. Giving them an opportunity to manage diseases of their crops and livestock means giving them an opportunity to improve their harvest. A bigger harvest could be a substantial contribution to a better way of living for farmers in West Africa, among other places.

The activity is Sugar-based and is therefore well suited to the XO laptops that are commonly used in West Africa. The activity revolves around a database with a lot of information about diseases of crops and livestock. When the farmer opens the activity, he is led through two menus of options. When the right crop or livestock is selected, a list of diseases is shown, with identification details for each disease. When the farmer notices that the description of a disease is very similar to what is happening to his crops or livestock, he clicks on that disease. Once the choice is made, another window pops up showing the information the farmer needs to manage and prevent the disease.
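The flow described above (two menus, a disease list, then advice) can be pictured as lookups in a nested data structure. The Python sketch below only illustrates that flow; it is not taken from the activity's code, and every name and text in it is placeholder content invented for the example.

```python
# Placeholder content only: the real activity ships its own database.
DATABASE = {
    "crops": {
        "example crop": [
            {
                "disease": "example disease A",
                "identification": "placeholder description of the visible symptoms",
                "management": "placeholder advice on managing the disease",
                "prevention": "placeholder advice on preventing the disease",
            },
        ],
    },
    "livestock": {
        "example animal": [
            {
                "disease": "example disease B",
                "identification": "placeholder description of the visible symptoms",
                "management": "placeholder advice on managing the disease",
                "prevention": "placeholder advice on preventing the disease",
            },
        ],
    },
}


def first_menu():
    """First menu: the top-level categories."""
    return sorted(DATABASE)


def second_menu(category):
    """Second menu: the crops or animals within the chosen category."""
    return sorted(DATABASE[category])


def disease_list(category, item):
    """Diseases for the chosen item, each with its identification text."""
    return [(entry["disease"], entry["identification"])
            for entry in DATABASE[category][item]]


def advice(category, item, disease):
    """Management and prevention information for the chosen disease."""
    for entry in DATABASE[category][item]:
        if entry["disease"] == disease:
            return {"management": entry["management"],
                    "prevention": entry["prevention"]}
    raise KeyError(disease)


print(first_menu())
print(second_menu("crops"))
print(disease_list("crops", "example crop"))
print(advice("crops", "example crop", "example disease A"))
```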

Right now it is only possible to access the database and read the information inside it. The activity would be improved if farmers could not only read the database, but also change existing information and add new information to it. This way the information, and thus the quality of the activity, could be improved without any help from outside.

The activity can be found on the following page (containing all the code): https://github.com/WorldWideSemanticWeb/farming-activity

Read the full report here: Helping Subsistence Farmers in West Africa


Linked Data for International Aid Transparency Initiative

In August 2013, VU Msc. student Kasper Brandt finished his thesis on developing, implementing and testing a Linked Data model for the International Aid Transparency Initiative (IATI). Now, more than a year later, that work was accepted for publication in the Journal on Data Semantics. We are very happy with this excellent result.

Model fragment

IATI is a multi-stakeholder initiative that seeks to improve the transparency of development aid and to that end developed an open standard for the publication of aid information. Hundreds of NGOs and governments have registered with the IATI registry by publishing their aid activities in this XML standard. Taking the IATI model as an input, we have created a Linked Data model based on requirements elicited from qualitative interviews using an iterative requirements engineering methodology. We have converted the IATI open data from a central registry to Linked Data and linked it to various other datasets such as World Bank indicators and DBpedia information. This dataset is made available for re-use at http://semanticweb.cs.vu.nl/iati .

Screenshot of an application bringing together information from multiple datasets

To demonstrate the added value of this Linked Data approach, we have created several applications which combine the information from the IATI dataset and the datasets it was linked to. As a result, we have shown that creating Linked Data for the IATI dataset and linking it to other datasets gives valuable new insights into aid transparency. Based on actual information needs of IATI users, we were able to show that linking IATI data adds significant value to the data and is able to fulfill the needs of IATI users.

A draft of the paper can be found here.


Master Project Esra Atesçelik: Cluster Analysis Applied to Europeana

[This post was written by Esra Atesçelik. It describes her MSc. project supervised by Antoine Isaac and myself]

Digital libraries and aggregators such as Europeana provide access to millions of Cultural Heritage Objects (CHOs). Europeana is one of the libraries that does not maintain collection-level metadata. By clustering objects that share common information, Europeana could derive collection-level information and use it to organize results and help users.

Karola Torkos - Cluster earrings (click to view on Flickr)

In this project we want to show how we can cluster the objects from Europeana datasets. We also aim to find the best way of clustering Europeana metadata and the best parameter settings for clustering. We apply various clustering methods to Europeana metadata and aim to propose the clustering technique that is most appropriate for grouping Europeana CHOs. In the experiments we evaluated the cluster results manually, at both a qualitative and a quantitative level.

The results of the experiments showed that it is difficult to determine the best parameter settings and the best clustering method based only on a number of experiments. However, we have shown a way to cluster Europeana objects which may be useful for Europeana.

View Esra’s presentation [pdf] and her thesis [pdf]