A Voice Service Development Kit for the Kasadaka platform

[This post is written by André Baart and describes his MSc thesis]

While internet usage in the developing world is still low, the adoption of simple mobile phones is widespread. Voice-based information systems are one way to offer the advantages of the internet to these populations. The KasaDaka voice-services platform is aimed at providing voice-services in the context of ICT for Development (ICT4D). The platform is based on a Raspberry Pi and a GSM modem, which enables affordable voice-service hosting using the locally available GSM network. The platform takes into account the special requirements of the ICT4D context, such as limited internet connectivity and low literacy rates.

This research focuses on lowering the barrier to entry of voice-service development by reducing the skill set it requires. A Voice Service Development Kit (VSDK) is developed that allows voice-services to be built by deploying and customizing provided building blocks. Each building block represents a type of interaction that is often found in voice-services (for example a menu, user voice input, or the playback of a message). The researcher argues that simplifying voice-service development is an essential step towards sustainable voice-services in the ICT4D context, as it increases the potential number of local voice-service developers and removes the dependency on foreign (and thus expensive) developers and engineers. This simplification should ideally be achieved by providing a graphical interface to voice-service development.
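To make the building-block idea concrete, here is a minimal Python sketch of a voice-service composed from two block types, a message and a menu. It is an illustration only and not the VSDK's actual data model: the class names `Message` and `Menu`, the `run_call` function, and the audio file names are all hypothetical, and the voice-input block is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Union


@dataclass
class Message:
    """Hypothetical block: plays a recording, then moves to the next block."""
    audio: str
    next_block: Optional[str] = None


@dataclass
class Menu:
    """Hypothetical block: plays a prompt and routes on the key pressed."""
    audio: str
    options: Dict[str, str] = field(default_factory=dict)  # key -> block name


Block = Union[Message, Menu]


def run_call(blocks: Dict[str, Block], start: str, keys: Iterator[str]) -> List[str]:
    """Simulate one call and return the audio files played, in order."""
    played: List[str] = []
    current: Optional[str] = start
    while current is not None:
        block = blocks[current]
        played.append(block.audio)
        if isinstance(block, Menu):
            # An unknown key (or no key press at all) ends the call.
            current = block.options.get(next(keys, ""))
        else:
            current = block.next_block
    return played


# A made-up service: a welcome message followed by a two-option menu.
service: Dict[str, Block] = {
    "welcome": Message("welcome.wav", next_block="main_menu"),
    "main_menu": Menu("main_menu.wav", {"1": "weather", "2": "prices"}),
    "weather": Message("weather.wav"),
    "prices": Message("prices.wav"),
}

print(run_call(service, "welcome", iter(["2"])))
# ['welcome.wav', 'main_menu.wav', 'prices.wav']
```

In this sketch the developer only fills in a dictionary of blocks, which is the kind of configuration a graphical interface could generate without any programming.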

The VSDK was evaluated during the ICT4D course at the Vrije Universiteit Amsterdam, where students built applications for various ICT4D use-cases using the VSDK. Afterwards, a survey was conducted, which provided insight into the students’ experiences with voice-service development and the VSDK. From the results of the evaluation, it is concluded that the building-block approach to voice-service development used in the VSDK is successful for the development of simple voice-services. It allows newcomers to (voice-service) development to quickly develop (simple) voice-services from a graphical interface, without requiring programming experience.

The VSDK combined with the existing KasaDaka platform provides a good solution to the hosting and development of voice-services in the ICT4D context.

More details can be found in the complete thesis. A slide deck is included below. You can find the VSDK code on André’s GitHub: http://github.com/abaart/KasaDaka-VSDK

 


MSc Project: The Implications of Using Linked Data when Connecting Heterogeneous User Information

[This post describes Karl Lundfall‘s MSc Thesis research and is adapted from his thesis]

In the realm of database technologies, the reign of SQL is slowly coming to an end with the advent of many NoSQL (Not Only SQL) alternatives. Linked Data in the form of RDF is one of these, and is regarded as highly effective for connecting datasets. In this thesis, we looked into how the choice of database can affect the development, maintenance, and quality of a product by revising a solution for the social enterprise Text to Change Mobile (TTC).

TTC is a non-governmental organization equipping customers in developing countries with high-quality information and important knowledge they could not acquire for themselves. TTC offers mobile-based solutions such as SMS and call services, and focuses on projects that bring about social change in line with the values shared by the company.

We revised a real-world system for linking datasets, originally based on a much more mainstream NoSQL technology, by altering the approach to use Linked Data instead. The result (see the figure on the left) was a more modular system living up to many of the promises of RDF.

Overview of the Linked Data-enabled tool to connect multiple heterogeneous databases developed in the context of this Msc Project.
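To illustrate the general idea of connecting heterogeneous user records through Linked Data, here is a minimal Python sketch. It is not the tool built in the thesis: the triples, the `ex:`, `survey:` and `sms:` prefixes, the helper names `match` and `link_by_phone`, and the choice of a phone number as the shared key are all assumptions made up for this example. Only `owl:sameAs` is an existing, standard property.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Two invented datasets that describe the same person in different ways.
survey_db: List[Triple] = [
    ("survey:resp42", "ex:phone", "+256700000001"),
    ("survey:resp42", "ex:answered", "survey:q3"),
]
sms_db: List[Triple] = [
    ("sms:user7", "ex:phone", "+256700000001"),
    ("sms:user7", "ex:subscribedTo", "sms:healthTips"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def link_by_phone(a: List[Triple], b: List[Triple]) -> List[Triple]:
    """Emit owl:sameAs links between subjects that share a phone number."""
    links: List[Triple] = []
    for subj_a, _, phone in match(a, p="ex:phone"):
        for subj_b, _, _ in match(b, p="ex:phone", o=phone):
            links.append((subj_a, "owl:sameAs", subj_b))
    return links


print(link_by_phone(survey_db, sms_db))
# [('survey:resp42', 'owl:sameAs', 'sms:user7')]
```

Because both datasets are expressed as plain triples, the linking step does not need to know either source schema in advance, which hints at the modularity mentioned above.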

On the other hand, we also found some obstacles to adopting Linked Data for this use case. We saw indicators that more momentum needs to build up before RDF matures enough to be easily applied to use cases like this. The implementation we present demonstrates a different flavor of Linked Data than the common scenario of publishing data for public reuse, and by applying the technology in business contexts we might be able to expand the possibilities of Linked Data.

As a by-product of the research, a Node.js module for Prolog communication with ClioPatria was developed and made available at https://www.npmjs.com/package/prolog-db . This module might illustrate that new applications using RDF could help create a snowball effect, in which improved quality in RDF-powered applications attracts even more practitioners.

Read more in Karl’s MSc. Thesis 


MSc. Project: The search for credibility in news articles and tweets

[This post was written by Marc Jacobs and describes his MSc Thesis research]

Nowadays the world no longer relies only on traditional news sources like newspapers, television and radio. Social media, such as Twitter, are claiming a key position here, thanks to their fast publishing speed and large number of items. As one may suspect, the credibility of this unrated news becomes questionable. My Master's thesis focuses on determining measurable features (such as retweets, likes or number of Wikipedia entities) in newsworthy tweets and online news articles.

Credibility framework pyramid


The gathering of the credibility features consisted of two parts: a theoretical and a practical part. First, a theoretical credibility framework was built using recent studies about credibility on the Web. Next, Ubuntu was booted, Python was started, and news articles and tweets, including metadata, were mined. The news items were analysed and, based on the credibility framework, features were extracted. Further information retrieval techniques (website scraping, regular expressions, NLTK, IR APIs) were used to extract additional features, extending the coverage of the credibility framework.
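As an illustration of the regular-expression part of such a pipeline, the following Python sketch counts a few surface features in the text of a tweet. It is not the code used in the thesis: the function name `extract_features`, the specific features, and the example tweet are assumptions chosen for this example.

```python
import re
from typing import Dict, Union

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def extract_features(text: str) -> Dict[str, Union[int, bool]]:
    """Count simple surface features in a tweet or article snippet."""
    # Remove URLs first so that '#fragment' parts are not counted as hashtags.
    text_without_urls = URL_RE.sub(" ", text)
    return {
        "num_urls": len(URL_RE.findall(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text_without_urls)),
        "num_mentions": len(MENTION_RE.findall(text_without_urls)),
        "num_words": len(text.split()),
        "has_question_mark": "?" in text_without_urls,
    }


# Invented example tweet.
tweet = "Flooding reported in #Accra, details at https://example.org/news#top via @reporter"
print(extract_features(tweet))
```

Features taken from metadata, such as retweets and likes, would come from the mined records rather than from the text itself.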

The data processing and experimentation pipeline

The last step in this research was to present the features to the crowd in an experimental design, using the crowdsourcing platform CrowdFlower. The correlation between each specific feature and the credibility of the tweet or news article was then calculated. The results were compared to find the differences and similarities between tweets and articles.
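The correlation step can be sketched as follows. The post does not say which correlation measure was used, so the Pearson coefficient here is an assumption, and the feature values and credibility ratings are invented numbers used purely for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented data: Wikipedia-entity matches per item and the mean
# credibility rating (1 to 5) that crowd workers gave the same item.
wikipedia_matches = [0, 1, 1, 3, 4, 6]
crowd_credibility = [1.8, 2.5, 2.1, 3.6, 3.9, 4.4]

r = pearson(wikipedia_matches, crowd_credibility)
print(f"correlation between feature and credibility: {r:.2f}")
```

Running the same function for every feature, once on tweets and once on articles, gives two sets of coefficients that can be compared side by side.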

The highly correlated credibility features (which include the number of matches with Wikipedia entries) may be used in the future to construct credibility algorithms that automatically assess the credibility of newsworthy tweets or news articles and, hopefully, help to filter reliable news from the impenetrable pile of data on the Internet.

Read all the details in Marc’s thesis
