[This post is written by André Baart and describes his MSc thesis]
While internet usage in the developing world is still low, the adoption of simple mobile phones is widespread. A way to offer the advantages of the internet to these populations is voice-based information systems. The KasaDaka voice-services platform is aimed at providing voice-services in the context of ICT for Development (ICT4D). The platform is based on a Raspberry Pi and a GSM modem, which enables affordable voice-service hosting, using the locally available GSM network. The platform takes into account the special requirements of the ICT4D context, such as limited internet connectivity and low literacy rates.
This research focuses on lowering the barrier to entry of voice-service development by reducing the skill set needed to do so. A Voice Service Development Kit (VSDK) is developed that allows the development of voice-services by deploying and customizing provided building blocks. These building blocks each represent a type of interaction that is often found in voice-services (for example a menu, user voice input or the playback of a message). The researcher argues that the simplification of voice-service development is an essential step towards sustainable voice-services in the ICT4D context, as this increases the potential number of local voice-service developers, removing the dependency on foreign (and thus expensive) developers and engineers. This simplification should ideally be achieved by providing a graphical interface to voice-service development.
The VSDK was evaluated during the ICT4D course at the Vrije Universiteit Amsterdam, where students built applications for various ICT4D use-cases using the VSDK. Afterwards a survey was conducted, which provided insight into the students’ experiences with voice-service development and the VSDK. From the results of the evaluation it is concluded that the building-block approach to voice-service development used in the VSDK is successful for the development of simple voice-services. It allows newcomers to (voice-service) development to quickly develop (simple) voice-services from a graphical interface, without requiring programming experience.
The VSDK combined with the existing KasaDaka platform provides a good solution to the hosting and development of voice-services in the ICT4D context.
More details can be found in the complete thesis. A slide deck is included below. You can find the VSDK code on Andre’s GitHub: http://github.com/abaart/KasaDaka-VSDK
This post describes the MSc theses of Ana-Liza Tjon-a-Pauw and Josien Jansen.
As a semantic web researcher, it is sometimes hard not to see ontologies and triples in aspects of my private life. In this case, through my contacts with dancers and choreographers, I have long been interested in exploring knowledge representation for dance. After a few failed attempts to get a research project funded, I decided to let enthusiastic MSc. students have a go at continuing this exploration. This year, two Information Sciences students, Josien Jansen and Ana-Liza Tjon-a-Pauw, were willing to take up this challenge, with great success. With their background as dancers, they not only had the necessary background knowledge but also access to dancers who could act as study and test subjects.
The questions of the two projects were therefore: 1) How can we model and represent dance in a sensible manner so that computers can make sense of choreographies, and 2) How can we communicate those choreographies to the dancers?
Josien’s thesis addressed the first question, investigating to what extent choreographers can be supported by semi-automatic analysis of choreographies through the generation of new creative choreography elements. She conducted an online questionnaire among 54 choreographers. The results show that a significant subgroup is willing to use an automatic choreography assistant in their creative process. She further identified requirements for such an assistant, including the semantic levels at which it should operate and communicate with the end-users. The requirements were used for the design of a choreography assistant, “Dancepiration”, which we implemented as a mobile application. The tool allows choreographers to enter (parts of) a choreography and uses multiple strategies for generating creative variations in three dance styles. Josien evaluated the tool in a user study in which we tested a) random variations and b) variations based on semantic distance in a dance ontology. The results show that the latter variant is better received by participants. We furthermore identified many differences between the dance styles in the extent to which the assistant supports creativity.
In her thesis, Ana-Liza dove deeper into the human-computer interaction side of the story. Where Josien had classical ballet and modern dance as background and focus, Ana-Liza looked at Dancehall and Hip-Hop dance styles. For her project, Ana-Liza developed four prototypes that could communicate pieces of computer-generated choreography to dancers through Textual Descriptions, 2-D Animations, 3-D Animations, and Audio Descriptions. Each of these presentation methods has its own advantages and disadvantages, so Ana-Liza conducted an extensive user survey with seven domain experts (dancers). Despite the relatively small group of users, there was a clear preference for the 3-D animations. Based on the results, Ana-Liza also designed an interactive choreography assistant (IDCAT).
The combined theses formed the basis of a scientific article on dance representation and communication that was accepted for publication in the renowned ACE entertainment conference, co-authored by us and co-supervisor Frank Nack.
The Netherlands Institute for Sound and Vision (NISV) archives Dutch broadcast TV and makes it available to researchers, professionals and the general public. One subset is the Polygoonjournaals (public news broadcasts), which are published under open licenses as part of the OpenImages platform. NISV is also interested in exploring new ways and technologies to make interaction with the material easier and to increase exposure to their archives. In this context, Rudy explored two options.
One part of the research was the autonomous colorization of old black-and-white video footage using Neural Networks. Rudy used a pre-trained NN (Zhang et al 2016) that is able to colorize black and white images. Rudy developed a program to split videos into frames, colorize the individual frames using the NN and then ‘stitch’ them back together into colorized videos. The stunning results were very well received by NISV employees. Examples are shown below.
In the other part of his research, Rudy investigated to what extent the existing news broadcast corpus, with voice-overs by the famous Philip Bloemendal, can be used to develop a modern text-to-speech engine with his voice. To do so, he mainly focused on natural language processing and on determining to what extent the language used by Bloemendal in the 1970s is still comparable enough to contemporary Dutch.
Rudy used precompiled automatic speech recognition (ASR) results to match words to sounds and developed a slot-and-filler text-to-speech system based on this. To increase the limited vocabulary, he implemented a number of strategies, including term expansion through the use of Open Dutch Wordnet and smart decompounding (this mostly works for Dutch, mapping ‘sinterklaasoptocht’ to ‘sinterklaas’ and ‘optocht’). The different strategies were compared to a baseline. Rudy found that a combination of the two resulted in the best performance (see figure).
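To give an idea of what decompounding involves, here is a minimal Python sketch. It is not Rudy's implementation: the function name `decompound`, the `min_part` threshold and the greedy longest-prefix strategy are assumptions made for illustration, and the vocabulary stands in for the set of words for which recorded audio is available.

```python
def decompound(word, vocabulary, min_part=3):
    """Split `word` into words that all occur in `vocabulary`.

    Returns a list of parts, or None when no complete split exists.
    Illustrative only: linking elements such as the Dutch "-s-" or
    "-en-" between parts are not handled.
    """
    word = word.lower()
    if word in vocabulary:
        return [word]
    # Try the longest candidate head first, then recurse on the tail.
    for i in range(len(word) - min_part, min_part - 1, -1):
        head, tail = word[:i], word[i:]
        if head in vocabulary:
            rest = decompound(tail, vocabulary, min_part)
            if rest is not None:
                return [head] + rest
    return None


# Hypothetical vocabulary of words with known pronunciations.
known_words = {"sinterklaas", "optocht"}
print(decompound("sinterklaasoptocht", known_words))
```

A text-to-speech system of this kind could then concatenate the audio for each returned part when the full compound is missing from its vocabulary.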
[This post is based on the Information Sciences MSc. thesis by Onno Valkering]
To make widespread knowledge sharing possible in rural areas in developing countries, the notion of the Web has to be downscaled based on the specific low-resource infrastructure in place. In this paper, we introduce SPARQL over SMS, a solution for exchanging RDF data in which HTTP is substituted by SMS to enable Web-like exchange of data over cellular networks.
The solution uses converters that take outgoing SPARQL queries sent over HTTP and convert them into SMS messages sent to phone numbers (see architecture image). On the receiver-side, the messages are converted back to standard SPARQL requests.
The converters use various data compression strategies to ensure optimal use of the SMS bandwidth. These include both zip-based compression and the removal of redundant data through the use of common background vocabularies. The thesis presents the design and implementation of the solution, along with evaluations of the different data compression methods.
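The conversion idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the converter from the thesis: the `NN/TT:` part header, the use of Base64 to keep the compressed bytes within SMS-safe characters, and the function names are all hypothetical, and the vocabulary-based compression is left out.

```python
import base64
import zlib

SMS_LENGTH = 160    # characters in one GSM 7-bit text message
HEADER_LENGTH = 6   # hypothetical "01/03:" part header (at most 99 parts)


def query_to_sms(query):
    """Compress a SPARQL query and split it into SMS-sized text parts."""
    compressed = zlib.compress(query.encode("utf-8"), 9)
    payload = base64.b64encode(compressed).decode("ascii")
    size = SMS_LENGTH - HEADER_LENGTH
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)]
    total = len(chunks)
    return [f"{n:02d}/{total:02d}:{chunk}" for n, chunk in enumerate(chunks, 1)]


def sms_to_query(messages):
    """Reassemble the parts (in any order) into the original query."""
    ordered = sorted(messages, key=lambda m: int(m[:2]))
    payload = "".join(m[HEADER_LENGTH:] for m in ordered)
    return zlib.decompress(base64.b64decode(payload)).decode("utf-8")


query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
parts = query_to_sms(query)
print(len(parts), sms_to_query(parts) == query)
```

Note that for very short queries the compression and Base64 overhead can make the payload longer than the original text, which is one reason why comparing different compression strategies, as the thesis does, is worthwhile.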
The application is validated in two real-world ICT for Development (ICT4D) cases that both use the Kasadaka platform: 1) An extension of the DigiVet application allows sending information related to veterinary symptoms and diagnoses across different distributed systems. 2) An extension of the RadioMarche application involves the retrieval and adding of current offerings in the market information system, including the phone number of the advertisers.
For more information:
Download Onno’s Thesis. A version of the thesis is currently under review.
[This post is based on Andre Baart’s B.Sc. thesis. The text is mostly written by him]
In developing (rural) communities, the adoption of mobile phones is widespread. This allows information to be offered to these communities through voice-based services. This research explores the possibilities of creating a flexible framework (Kasadaka) for hosting voice services in rural communities. The context of the developing world poses special requirements, which have been taken into account in this research. The framework creates a voice service that incorporates dynamic data from a data store. The framework allows for a low-effort adaptation to new and changing use cases. The service is hosted on cheap, low-powered hardware and is connected to the local GSM network through a dongle. We validated the working and flexibility of the framework by adapting it to a new use case. Setting up this new voice server was possible in less than one hour, proving that it is suitable for rapid prototyping. This framework enables further research into the effects and possibilities of hosting voice-based information services in the developing world. The image below shows the different components and the dataflow between these components when a call is made. Read more in Andre Baart’s thesis (pdf).
All information on how to get started with Kasadaka can be found on the project’s GitHub page: https://github.com/abaart/KasaDaka
Text in italics only takes place when setting up the call.
Asterisk receives the call from the GSM dongle, answers the call, and connects it to VXI. Asterisk receives the user’s input and forwards it to VXI.
VXI requests the configured VoiceXML document from Apache. Together with the request, it sends the user input.
Apache runs the Python program (based on Flask), in which data from the triple store has to be read or written. Python sends the SPARQL query to ClioPatria.
ClioPatria runs the query on the data present, and sends the result of the query back to the Python program.
Python renders the VoiceXML template. The dynamic data is now inserted in the VoiceXML document, and it is sent back to VXI.
VXI starts interpreting the VoiceXML document. In the document there are references to audio files. It sends requests to Apache for the referenced files.
Apache sends a request for the file to the file system.
The file is read from the file system.
Apache responds with the requested audio files.
VXI puts all the audio files in the correct order and plays them back sequentially, sending the audio to the GSM dongle.
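The step in which Python inserts the dynamic data into the VoiceXML document can be sketched roughly as follows. This is a simplified illustration using only the Python standard library rather than the actual Flask templates of Kasadaka; the function name `render_voicexml` and the audio paths are hypothetical, and the list of audio URLs stands in for the result of the SPARQL query.

```python
from string import Template
from xml.sax.saxutils import quoteattr

# Minimal VoiceXML skeleton; the prompts are filled in per request.
VXML_TEMPLATE = Template("""<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
$prompts
    </block>
  </form>
</vxml>
""")


def render_voicexml(audio_urls):
    """Render a VoiceXML document that plays the given audio files in order."""
    prompts = "\n".join(
        f"      <prompt><audio src={quoteattr(url)}/></prompt>"
        for url in audio_urls
    )
    return VXML_TEMPLATE.substitute(prompts=prompts)


# Hypothetical audio references, as they might come out of the triple store.
print(render_voicexml(["/audio/welcome.wav", "/audio/offer_1.wav"]))
```

The interpreter then fetches each referenced audio file and plays them back in the order in which they appear in the document.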
In the context of the Observe project and Lukas Hulsbergen’s thesis, we developed the interactive game/web toy “IetsNieuws”. In the game, participants are asked to do voice-overs for Sound and Vision’s OpenImages videos. One player takes on the role of a newscaster, while the other player remixes news footage. Based on their performance, the players are presented with an achievement screen.
Because of the limited game explanation, players created their own style of play leading to “emergent gameplay”. An experiment was done to examine whether players experience the relationship between each other when playing the game in the presence of an audience as competitive or cooperative. The results of the observations during the experiment and feedback through a questionnaire show that the subjects saw the other player as a team player and not as an opponent.
During the workshop, which was attended by around 25 AOPP members from all over Mali, we followed up on the results of a previous workshop in 2015, where we co-developed a number of use cases around improving the lives of rural farmers in Mali. Specifically, we developed two prototype services accessible using simple mobile phones:
An online marketplace for seeds. Farmers can call in to the system to place offerings of seeds or browse current offers of seeds of various quality levels in a specific region.
A chicken vaccination service. For this service, an extension worker can register newly born chickens in the system. The system keeps an administration of when farmers need to vaccinate their chickens against specific diseases. The system then calls the farmer and plays a reminder message in his/her language.
These services were developed on Kasadaka, the cheap and low-resource rapid-prototyping platform for knowledge-rich and voice-accessible services. During the workshop we were able to further test the Kasadaka in the field. A field trip to local farmers and a milk cooperative in nearby Ouelessebougou gave us further context and information on how these services can support locals (see also the video embedded below). Chris van Aart from 2coolmonkeys demonstrated his progress on the Senepedia wiki and two Android applications that allow farmers and organizers to use geo-services to count cows, trees or other objects in the field.
In addition to these two services, we also presented seven services on the Kasadaka, developed by students of the VUA ICT4D M.Sc. course. These included a weather information service, two veterinary services, general-purpose knowledge sharing platforms, farmer alert services and a milk market. These services were all very well received and allowed the workshop participants to really see the full potential of voice-enabled information services.
The presentation below shows more information, my personal highlights from the workshop (hence the title) as well as feedback received on the seven student projects.
[This post describes Karl Lundfall‘s MSc Thesis research and is adapted from his thesis]
In the realm of database technologies, the reign of SQL is slowly coming to an end with the advent of many NoSQL (Not Only SQL) alternatives. Linked Data in the form of RDF is one of these, and is regarded as highly effective for connecting datasets. In this thesis, we looked into how the choice of database can affect the development, maintenance, and quality of a product by revising a solution for the social enterprise Text to Change Mobile (TTC).
TTC is a non-governmental organization equipping customers in developing countries with high-quality information and important knowledge they could not acquire for themselves. TTC offers mobile-based solutions such as SMS and call services and focuses on projects implying a social change coherent with the values shared by the company.
We revised a real-world system for linking datasets that was based on a much more mainstream NoSQL technology, altering the approach to use Linked Data instead. The result (see the figure on the left) was a more modular system living up to many of the promises of RDF.
On the other hand, we also found that there are some obstacles to adopting Linked Data for this use case. We saw indicators that more momentum needs to build up for RDF to mature enough to be easily applied to use cases like this. The implementation we present demonstrates a different flavor of Linked Data than the common scenario of publishing data for public reuse, and by applying the technology in business contexts we might be able to expand the possibilities of Linked Data.
As a by-product of the research, a Node.js module for Prolog communication with Cliopatria was developed and made available at https://www.npmjs.com/package/prolog-db . This module might illustrate that new applications using RDF could contribute to creating a snowball effect of improved quality in RDF-powered applications, attracting even more practitioners.
[This post was written by Marc Jacobs and describes his MSc Thesis research]
Nowadays the world no longer relies only on traditional news sources like newspapers, television and radio. Social media, such as Twitter, are claiming a key position here, thanks to their fast publishing speed and large number of items. As one may suspect, the credibility of this unrated news becomes questionable. My Master's thesis focuses on determining measurable features (such as retweets, likes or number of Wikipedia entities) in newsworthy tweets and online news articles.
The gathering of the credibility features consisted of two parts: a theoretical and a practical part. First, a theoretical credibility framework was built using recent studies about credibility on the Web. Next, Ubuntu was booted, Python was started, and news articles and tweets, including metadata, were mined. The news items were analysed and, based on the credibility framework, features were extracted. Additional information retrieval techniques (website scraping, regular expressions, NLTK, IR APIs) were used to extract additional features, so the coverage of the credibility framework was extended.
The last step in this research was to present the features to the crowd in an experimental design, using the crowdsourcing platform Crowdflower. The correlation between a specific feature and the credibility of the tweet or news article has been calculated. The results have been compared to find the differences and similarities between tweets and articles.
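As an illustration of this last step, the sketch below computes a correlation between one feature and the crowd's credibility ratings. The post does not state which correlation measure was used, so the Pearson coefficient is an assumption here, and the numbers in the example are made up purely to show the call.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up example: retweet counts and mean crowd credibility ratings.
retweets = [3, 10, 25, 40, 120]
ratings = [2.1, 2.8, 3.0, 3.9, 4.2]
print(round(pearson(retweets, ratings), 2))
```

Computing such a coefficient per feature, separately for tweets and for articles, gives two rankings of features that can then be compared.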
The highly correlated credibility features (which include the number of matches with Wikipedia entries) may be used in the future for the construction of credibility algorithms that automatically assess the credibility of newsworthy tweets or news articles and, hopefully, help to filter reliable news from the impenetrable pile of data on the Internet.
[This post was written by Roy Hoeymans. It describes his MSc. project ]
In this master project, which I have done externally at DNV-GL, I have built a recommender system for knowledge portals. Recommender systems are pieces of software that provide suggestions for related items to a user. My research focuses on the application of a recommender system in knowledge portals. A knowledge portal is an online single point of access to information or knowledge on a specific subject. Examples of knowledge portals are SKYbrary (www.skybrary.aero) or Navipedia (www.navipedia.org).
Part of this project was a case study on SKYbrary, a knowledge portal on the subject of aviation safety. In this project I looked at the types of data that are typically available to knowledge portals. I used user navigation pattern data, which I retrieved via the Google Analytics API, and the text of the articles to create a user-navigation based and a content based algorithm. The user-navigation based algorithm uses an item association formula and the content based algorithm uses a tf-idf weighting scheme to calculate content similarity between articles. Because both types of algorithm have their separate disadvantages, I also developed a hybrid algorithm that combines these two.
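To make the content-based part more concrete, here is a minimal tf-idf sketch in Python. It is not the code used in the project: the whitespace tokenisation, the exact weighting variant (relative term frequency times log inverse document frequency) and the use of cosine similarity are assumptions, and the three example texts are invented.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse tf-idf vector (a dict of term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented article texts, for illustration only.
articles = [
    "runway incursion at airport",
    "runway incursion during landing",
    "fuel management in flight",
]
vectors = tfidf_vectors(articles)
print(cosine(vectors[0], vectors[1]) > cosine(vectors[0], vectors[2]))
```

Ranking all other articles by their similarity to the selected article yields the content-based recommendations; a hybrid could combine this score with a navigation-based score.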
To see which type of algorithm was the most effective, I conducted a survey among the content editors of SKYbrary, who are domain experts on the subject. Each question in the survey showed an article and then recommendations for that article. The respondent was then asked to rate each recommended article on a scale from 1 (completely irrelevant) to 5 (very relevant). The results of the survey showed that the hybrid algorithm is, with a statistically significant difference, better than the user-navigation based algorithm. However, no difference was found between the hybrid algorithm and the content-based algorithm. Future work might include a more extensive or different type of evaluation.
In addition to the research I have done on the algorithms, I have also developed a demo application that the content editors of SKYbrary can use to show recommendations for a selected article and algorithm.