The CLARIN framework commissioned the production of dissemination videos showcasing the outcomes of the individual CLARIN projects. One of these projects was the Dutch Ships and Sailors project, a collaboration between VU Computer Science, VU humanities and the Huygens Institute for National History. In this project, we developed a heterogeneous linked data cloud connecting many different maritime databases. This data cloud allows for new types of integrated browsing and new historical research questions. In the video, we (Victor de Boer together with historians Jur Leinenga and Rik Hoekstra) explain how the data cloud was formed and how it can be used by maritime historians.
This is a nice companion piece to the more technical description of the dataset, which was published in the proceedings of ISWC 2014. The new version places more emphasis on the general setup of the project and on its considerations and innovations from a historical point of view.
Since submission of this ‘mid-term project description’, the DSS data cloud has been expanding, and the ‘development’ version of the triple store now hosts six datasets thanks to the work of Jeroen Entjes (see the datacloud figure).
[This post was written by Jeroen Entjes and describes his MSc thesis research]
Dutch maritime supremacy during the Dutch Golden Age has had a profound influence on the modern Netherlands and possibly on other places around the globe. As such, much historical research has been done on the matter, facilitated by the thorough records that many ports kept of their shipping. As more and more of these records are digitized, new ways of exploring the data become available.
This master project uses one such way. Building on the Dutch Ships and Sailors project, digitized maritime datasets have been converted to RDF and published as Linked Data. Linked Data refers to structured data on the web that is published and interlinked according to a set of standards. The conversion was based on requirements for the data, drawn up together with historians from the Huygens ING Institute, which provided the datasets. The datasets chosen were those of Archangel and Elbing, as these offer information on the Dutch Baltic trade, the cradle of the Dutch merchant navy that sailed the world during the Dutch Golden Age.
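As a rough illustration of what such a conversion involves, the following Python sketch turns one tabular voyage record into N-Triples lines. It is only a sketch under stated assumptions: the example.org namespace, the property names and the sample record are hypothetical and are not taken from the actual DSS vocabulary or from the Archangel and Elbing data.

```python
# Hypothetical namespace; the real DSS cloud uses its own URIs and vocabulary.
BASE = "http://example.org/dss/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record_id, record):
    """Convert one flat record (a dict) into a list of N-Triples lines."""
    subject = f"<{BASE}voyage/{record_id}>"
    lines = [f"{subject} <{RDF_TYPE}> <{BASE}vocab/Voyage> ."]
    for field, value in sorted(record.items()):
        if value is None or value == "":
            continue  # skip empty cells instead of emitting empty literals
        predicate = f"<{BASE}vocab/{field}>"
        lines.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return lines


# Invented sample record, for illustration only.
sample = {"shipName": "De Hoop", "destination": "Archangel", "year": 1750, "captain": ""}
triples = record_to_ntriples("1", sample)
print("\n".join(triples))
```

In practice a library such as rdflib would handle serialization and typed literals; the point here is only that each cell of a source table becomes one subject-predicate-object statement.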
Along with the requirements for the data, the historians were also interviewed to gather research questions that the combined datasets could help answer. The goal of this research was to see whether additional datasets could be linked to the existing Dutch Ships and Sailors cloud and whether such a conversion could help answer the research questions the historians were interested in.
Data visualization showing shipping volume of different datasets.
As part of this research, the datasets have been converted to RDF and published as Linked Data as an addition to the Dutch Ships and Sailors cloud, and a set of interactive data visualizations has been made to answer the historians' research questions. Based on the conversion, a set of recommendations is given on how to convert new datasets and add them to the Dutch Ships and Sailors cloud. All data representations and conversions have been evaluated by historians to assess their effectiveness.
This year’s third issue of E-Data and Research magazine features an article about the Dutch Ships and Sailors project. The article (in Dutch) describes how our project provides new ways of interacting with Dutch maritime data. So far, four datasets are present in the DSS data cloud, but we are currently extending the cloud with two new datasets. More on that later…
In the same issue, there is an article about the workshop around newspaper data as provided by the National Library. This includes a picture of me presenting the DIVE project.
Who knew publishing Open Data could be so rewarding? The good people at DANS sent me a cake because I was the first to publish research data under OpenAccess (read more about this on OpenAccess.nl). This data was the result of a very small research project.
The goal of the “Diepere Maritieme Data” (DMD) project was to enrich the CLARIN Dutch Ships and Sailors (DSS) Linked Data cloud with links from DSS records to scans of the original archival documents from which the data was digitized. Specifically, we enriched the subset “Noordelijke Monsterrollen Database” (Northern Muster Rolls Database), created by historian Jurjen Leinenga, which was converted to an RDF dataset within the DSS project (Persistent Identifier: urn:nbn:nl:ui:13-czhm-ug, URL: https://easy.dans.knaw.nl/ui/datasets/id/easy-dataset:57617).
Linking historical datasets and making them available on the Web has increasingly become a subject of research in the field of digital humanities. In the Netherlands, history is intimately related to maritime activity, which has been essential to the development of the economic, social and cultural aspects of Dutch society. Being such an important sector, it has been well documented by shipping companies, governments, newspapers and other institutions.
In this master project we assume that, given the importance of maritime activity in everyday life in the 19th and 20th centuries, announcements of the departures and arrivals of ships, as well as mentions of accidents or other events, can be found in newspapers.
We have taken a two-stage approach: first, a heuristic-based method for record linkage, and then machine-learning algorithms for article classification, used for filtering in combination with domain features. Evaluation of the linking method has shown that certain domain features were indicative of mentions of ships in newspapers. Moreover, the classifiers scored near-perfect precision in predicting ship-related articles.
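The shape of such a two-stage pipeline can be sketched in a few lines of Python. This is a minimal illustration and not the method from the thesis: the record fields, the one-year slack, the sample data and the keyword list are all hypothetical, and the second stage uses a simple keyword count as a stand-in for a trained classifier.

```python
import re

# Hypothetical domain vocabulary; a trained classifier would replace this list.
MARITIME_TERMS = {"captain", "arrived", "departed", "cargo", "harbour", "sailed"}


def candidate_links(ships, articles, slack=1):
    """Stage 1 (heuristic): pair a ship with an article if the ship name occurs
    as a whole word and the article year lies in the ship's active period."""
    pairs = []
    for ship in ships:
        pattern = re.compile(r"\b" + re.escape(ship["name"]) + r"\b", re.IGNORECASE)
        for art in articles:
            in_period = ship["first_year"] - slack <= art["year"] <= ship["last_year"] + slack
            if in_period and pattern.search(art["text"]):
                pairs.append((ship["id"], art["id"]))
    return pairs


def looks_maritime(article, min_hits=2):
    """Stage 2 (stand-in): count maritime terms instead of running a classifier."""
    words = set(re.findall(r"[a-z]+", article["text"].lower()))
    return len(words & MARITIME_TERMS) >= min_hits


def link(ships, articles):
    """Keep only the candidate pairs whose article passes the stage 2 filter."""
    by_id = {a["id"]: a for a in articles}
    return [(s, a) for s, a in candidate_links(ships, articles) if looks_maritime(by_id[a])]


# Invented sample data, for illustration only.
ships = [{"id": "s1", "name": "Hoop", "first_year": 1850, "last_year": 1860}]
articles = [
    {"id": "a1", "year": 1855, "text": "The Hoop, captain Jansen, arrived with cargo from Riga."},
    {"id": "a2", "year": 1855, "text": "A hoop was found in the market square."},
    {"id": "a3", "year": 1900, "text": "The Hoop sailed and arrived safely."},
]
links = link(ships, articles)
print(links)
```

The second article shows why the filtering stage matters: a common word that happens to be a ship name passes the name-and-date heuristic but is dropped once the article content is taken into account.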
Enriching historical ship records with links to newspaper archives is significant for the digital history community, since it connects two datasets that would otherwise have required extensive annotation work and many hours of manual effort to align. Our work is part of the Dutch Ships and Sailors Linked Data Cloud project. Check out Andrea’s thesis [pdf].
5000+ links from people in the BiographyNet RDF data to people in the Rijksmuseum RDF data.
2 links from Dutch Ships and Sailors to Rijksmuseum collections
61 links from Dutch Ships and Sailors ranks to CEDAR HISCO ‘occupation’ URIs
1320 links of CEDAR municipalities (by Amsterdamse Code) to gemeentegeschiedenis.nl municipalities
33 links of ICONCLASS (used by Rijksmuseum) to HISCO occupations
We hope to expand this datacloud in the near future and show the added value of such an interconnected digital history cloud for historical research and the general public. You can read more at Albert’s blog or on Ivo Zandhuis’ blog, Hic Sunt Leones.
Last week saw the kickoff of the new Clarin NL-funded project “Dutch Ships and Sailors” (*). This project will run for one year and gives me the opportunity to work with historians from both VU and Huygens ING on applying Linked Data principles to Dutch maritime-historical data. From the official description:
As a sea-faring nation, a large portion of Dutch history is found on the water. However, much of the digitized historical source material is still scattered across many databases and archives. This curation and demonstrator project aims to bring together the rich maritime historical data preserved in the many different databases. We propose a (semantic) web-based infrastructure that will house various maritime-historical datasets. We will provide a tool chain and methodology for converting legacy datasets. The infrastructure includes common vocabularies to normalize and enrich existing data. Links are established between the datasets and to other relevant datasets on the Web. Although the infrastructure will be set up to facilitate 25+ identified datasets, we initially populate the infrastructure with four selected datasets. These will allow us to investigate two case studies in order to answer the historical research question “To what extent did patterns of shipping and recruitment in the Dutch maritime sector change over the course of the 18th and 19th centuries?”
(*) the project’s official title is Dutch Ships and Seamen, but we think this is potentially less problematic 🙂