2013/06/24

Linking and enriching information, some examples of using open data

Genealogical research is about people, family relationships and family history. The purpose of Genealogie Online is to publish this information. Genealogists can, in a simple way, show the results of their research to others. They also get feedback and insights. To provide those insights, the genealogical information is linked to other data sources. This article describes the use of open data to put genealogical data in context.

Open linked data

According to the Open Data Handbook open data is:

data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.

More and more organisations make their information available as open data. Organisations such as Wikipedia and the Dutch meteorological institute KNMI already provide open data, and others are following their example. Even the European Union has adopted open data! This will create many new possibilities, including opportunities for new websites and mobile apps.

Linked data is about relating data sets to each other. If you can link data sets together, the fun begins! On Genealogie Online, information is linked to other information through three elements:

  • the surname
  • the date
  • the place

About the surname

In genealogy you come across many different surnames. Genealogie Online supports genealogists on this topic via the About the surname page; see, for example, the About the surname Hollestelle page.

This page consists of information about the surname which is in part aggregated from all published data on Genealogie Online. There are also links to external sites with more information about that name, such as the (Dutch) Who (re)searches who? page.

Surnames can be written in different ways, especially if you look at them over the centuries. The open data source that helps with this challenge comes from the Zeeland Archives (in the person of Leo Hollestelle): a list of variants per surname. I also included this resource in a Dutch presentation for archivists, Please give me your data, to show that such apparently simple lists can be of great value to others! On Genealogie Online I use this list of variants on the About the surname page and in the search engine.
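To illustrate how a list of variants can feed a search engine, here is a minimal sketch in Python. It is not the actual Genealogie Online implementation; the variant groups, the function names and the idea of expanding a query into all known spellings are assumptions made for the example.

```python
# Hypothetical variant groups, made up for illustration; the real list
# from the Zeeland Archives is not reproduced here.
VARIANT_GROUPS = [
    ["Hollestelle", "Hollestel", "Hollestellen"],
    ["Jansen", "Janssen", "Janse"],
]


def build_variant_index(groups):
    """Map each lower-cased spelling to the full set of spellings in its group."""
    index = {}
    for group in groups:
        spellings = set(group)
        for name in group:
            index[name.lower()] = spellings
    return index


def expand_surname(name, index):
    """Return all known spellings of a surname, or only the name itself if unknown."""
    return sorted(index.get(name.lower(), {name}))


INDEX = build_variant_index(VARIANT_GROUPS)

# A search for one spelling can then be run against every spelling in the group.
print(expand_surname("hollestel", INDEX))
print(expand_surname("Visser", INDEX))
```

With an index like this, a query for one spelling can be widened to all spellings in the same group before the records are searched.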

About the day

Another element for which information can be collected is the date or year. For a long time Genealogie Online has shown information from Wikipedia about the date of a birth, marriage or death, such as information about the government, the royal house and other historic events. This information is now also shown on the About the day pages.

The Royal Netherlands Meteorological Institute (KNMI) provides weather data which goes back to 1701. This way, you can show what the weather was on the day ancestors married!

Recently two new sources were added to the About the day page to give an impression of the period: art from the Rijksmuseum and old newsreels.

The Rijksmuseum offers information about its art as open data. Besides the metadata, the images can also be used. So now you can, for example, show which art was made in 1880.

[Image: painting by Willem Roelofs, made in 1880. Source: Rijksmuseum]

Open Images is an open media platform that offers online access to audiovisual archive material to stimulate creative reuse. Among the items it offers are Polygoon newsreels. Based on the date, the Polygoon newsreel from (about) that time can now be shown; see, for example, the About the day Tuesday March 4, 1941 page.
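Finding the newsreel from (about) a given date comes down to a nearest-date lookup. The sketch below shows one way to do that in Python; the catalogue entries, titles and function name are invented for the example and do not reflect the Open Images data or API.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical catalogue of (release date, title), sorted by date.
NEWSREELS = [
    (date(1941, 2, 24), "Newsreel A"),
    (date(1941, 3, 3), "Newsreel B"),
    (date(1941, 3, 10), "Newsreel C"),
]


def closest_newsreel(day, newsreels):
    """Return the (date, title) entry closest to `day`, or None for an empty list."""
    if not newsreels:
        return None
    dates = [released for released, _ in newsreels]
    pos = bisect_left(dates, day)
    # Only the entries just before and at the insertion point can be closest.
    candidates = newsreels[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda item: abs((item[0] - day).days))


print(closest_newsreel(date(1941, 3, 4), NEWSREELS))
```

Because the list is sorted, only the two neighbours of the requested date need to be compared, which keeps the lookup cheap even for a long catalogue.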

About the town

A third element present in genealogical data is the place name. Genealogie Online makes use of an open data set of international geographical information supplied by Geonames.

As described in Genealogy and place names, this dataset is used to check place names: did the genealogist write the name correctly, and can the place be uniquely identified? If so, Geonames provides information such as longitude and latitude, and links to Wikipedia for more information.
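The check described above (is the name known, and does it point to exactly one place?) can be sketched in Python as follows. This is an illustration only: the small in-memory gazetteer, the placeholder IDs, the approximate coordinates and the function name are assumptions, not the Geonames data or the way Genealogie Online queries it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    geoname_id: int  # placeholder IDs, not real Geonames IDs
    name: str
    country: str
    latitude: float  # approximate, for illustration only
    longitude: float


# Tiny made-up gazetteer standing in for the Geonames dataset.
GAZETTEER = [
    Place(1, "Gouda", "NL", 52.02, 4.71),
    Place(2, "Bergen", "NL", 52.67, 4.70),
    Place(3, "Bergen", "NO", 60.39, 5.32),
]


def resolve_place(name, gazetteer):
    """Classify a place name as 'unique', 'ambiguous' or 'unknown'."""
    wanted = name.strip().lower()
    matches = [place for place in gazetteer if place.name.lower() == wanted]
    if len(matches) == 1:
        return "unique", matches
    if matches:
        return "ambiguous", matches
    return "unknown", matches


for query in ("Gouda", "Bergen", "Goudaa"):
    status, places = resolve_place(query, GAZETTEER)
    print(query, status, [p.geoname_id for p in places])
```

Only a "unique" result yields a single identifier to hang further information on; an "ambiguous" or "unknown" result is a signal to ask the genealogist for a correction or more detail.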

This information is used on the About the town page, see for example the About the town Gouda page. Based on the identifying Geonames ID extra information can be collected (via DBpedia) about the town, like a descriptive text and photo.

[Image: Gouda vanuit de lucht (Gouda from the air). Source: Wikipedia Commons, page Gouda]

Genealogie Online tries, with the help of its users, to link towns to archives. A nice open data source for this is the (Dutch) Archief Wiki. By linking archives to the towns they hold material about, genealogists can be directed to the right archive based on the place name.

Another nice source shown on the About the town pages is provided by rijksmonumenten.info, which in turn gets its data from the Cultural Heritage Agency of the Netherlands, Wikipedia and Flickr. This dataset can (among other things) be searched by longitude and latitude. This results in images of national monuments around that position!
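Searching by longitude and latitude amounts to selecting the items within some distance of a point. The Python sketch below uses the haversine formula for that. The monument list, the coordinates and the 2 km radius are made up for the example; it does not show how rijksmonumenten.info itself performs the search.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical monuments as (name, latitude, longitude).
MONUMENTS = [
    ("Monument A", 52.011, 4.711),
    ("Monument B", 52.020, 4.700),
    ("Monument C", 52.370, 4.890),
]


def monuments_near(lat, lon, monuments, radius_km=2.0):
    """Return the names of monuments within radius_km, nearest first."""
    hits = [(haversine_km(lat, lon, m_lat, m_lon), name)
            for name, m_lat, m_lon in monuments]
    return [name for dist, name in sorted(hits) if dist <= radius_km]


print(monuments_near(52.01, 4.71, MONUMENTS))
```

The coordinates of the town (as supplied by a source such as Geonames) serve as the centre point, so no place name matching is needed on the monument side.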

Open data, new possibilities, new insights

This article gives some examples of how Genealogie Online uses open data to offer context to genealogists.

The nice thing is that we're only at the beginning of the open data movement. The more organisations, including archives (like trendsetter Archief Leiden), offer open data, the more opportunities there will be. Of course you have to watch out for copyright and privacy issues, and IT systems have to support it, but these are manageable issues.

Open data can lead to more insight, new functionality and more economic activity!