Linking and enriching information, some examples of using open data

Genealogical research is about people, family relationships and family history. The purpose of Genealogie Online is to publish this information. Genealogists can, in a simple way, show the results of their research to others. They also get feedback and insights. To provide those insights, the genealogical information is linked to other data sources. This article describes the use of open data to put genealogical data in context.

Open linked data

According to the Open Data Handbook open data is:

data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.

More and more organisations make the information they hold available as open data. Organisations such as Wikipedia and the Dutch meteorological institute KNMI already provide open data, and other organisations are following their example. Even the European Union has adopted open data! This will create many new possibilities and opportunities for new websites and mobile apps.

Linked data is about relating data to each other. If you can link data sets together, the fun begins! On Genealogie Online, information is linked to other information through three elements:

  • the surname
  • the date
  • the place

About the surname

In genealogy you come across different surnames. Genealogie Online supports the genealogists on this topic via the About the surname page, see for example the About the surname Hollestelle page.

This page consists of information about the surname which is in part aggregated from all published data on Genealogie Online. There are also links to external sites with more information about that name, such as the (Dutch) Who (re)searches who? page.

Surnames can be written in different ways, especially when you look at them over the centuries. An open data source that helps with this challenge comes from the Zeeland Archives (in the person of Leo Hollestelle): a list of variants per surname. I also included this resource in a Dutch presentation for archivists, Please give me your data, to show that these apparently simple lists can be of great value to others! On Genealogie Online I use this list of variants on the About the surname page and in the search engine.
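
A minimal sketch of how a list of variants could be used in a search, assuming the list is available as groups of equivalent spellings. The spellings, the grouping format and the function names are made up for illustration; they are not taken from the Zeeland Archives data or from the Genealogie Online code.

```python
# Hypothetical variants list: each group holds spellings treated as one surname.
# The spellings are illustrative only, not the real Zeeland Archives list.
VARIANT_GROUPS = [
    {"Hollestelle", "Hollestel", "Hollestelles"},
    {"Jansen", "Janssen", "Janse"},
]


def build_index(groups):
    """Map every lower-cased spelling to the full set of its variants."""
    index = {}
    for group in groups:
        for name in group:
            index[name.lower()] = group
    return index


def expand_surname(query, index):
    """Return all spellings to search for; fall back to the query itself."""
    cleaned = query.strip()
    return sorted(index.get(cleaned.lower(), {cleaned}))


INDEX = build_index(VARIANT_GROUPS)
print(expand_surname("hollestel", INDEX))
```

A search for one spelling is expanded to all known spellings before the records are queried, so a person is found regardless of how the name was written in a particular source.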

About the day

Another element for which information can be collected is the date or year. For a long time Genealogie Online has shown information from Wikipedia about the date of a birth, marriage or death, such as information about the government, the royal house and other historic events. This information is now also shown on the About the day pages.

The Royal Netherlands Meteorological Institute (KNMI) provides weather data which goes back to 1701. This way, you can show what the weather was on the day ancestors married!

Recently two new sources were added to the About the day page to give an impression of the period: art from the Rijksmuseum and old newsreels.

The Rijksmuseum offers information about its art as open data. Besides the metadata, the images can also be used. So now you can, for example, show which art was made in 1880.

Image: painting by Willem Roelofs made in 1880. Source: Rijksmuseum

Open Images is an open media platform which offers online access to audiovisual archive material to stimulate creative reuse. One of the collections it offers is the Polygoon newsreels. Based on the date, the Polygoon newsreel from (about) that time can now be shown; see for example the About the day Tuesday March 4, 1941 page.
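
Picking the newsreel from "about" a given date can be sketched as a nearest-date lookup. This is only an illustration under the assumption that each newsreel has a known release date; the titles and dates below are invented placeholders, not real Open Images records.

```python
from datetime import date

# Invented placeholder records: (release date, title).
NEWSREELS = [
    (date(1941, 2, 24), "Newsreel A"),
    (date(1941, 3, 3), "Newsreel B"),
    (date(1941, 3, 17), "Newsreel C"),
]


def closest_newsreel(target, newsreels):
    """Return the (date, title) pair whose date lies nearest to target."""
    return min(newsreels, key=lambda item: abs((item[0] - target).days))


print(closest_newsreel(date(1941, 3, 4), NEWSREELS))
```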

About the town

A third element present in genealogical data is the place name. Genealogie Online makes use of an open data set of international geographical information supplied by Geonames.

As written in Genealogy and place names, this dataset is used to check place names: did the genealogist write the name correctly, and can the place be uniquely identified? If so, Geonames provides information such as longitude and latitude, and links to Wikipedia for more information.
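
The check described above (is the name known, and does it identify exactly one place?) can be sketched as follows. This is a simplified illustration against a tiny hand-made gazetteer, not a call to the Geonames service; the IDs are placeholders, the coordinates are rough, and the names `Place` and `check_place` are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    place_id: int      # placeholder ID, not a real Geonames ID
    name: str
    region: str
    latitude: float
    longitude: float


# Tiny hand-made gazetteer; coordinates are rough and for illustration only.
GAZETTEER = [
    Place(1, "Gouda", "Zuid-Holland", 52.01, 4.71),
    Place(2, "Hoorn", "Noord-Holland", 52.64, 5.06),
    Place(3, "Hoorn", "Friesland", 53.40, 5.35),
]


def check_place(name, gazetteer):
    """Return (status, matches); status is 'unique', 'ambiguous' or 'unknown'."""
    wanted = name.strip().casefold()
    matches = [p for p in gazetteer if p.name.casefold() == wanted]
    if not matches:
        return "unknown", matches      # probably misspelled or not in the dataset
    if len(matches) == 1:
        return "unique", matches       # coordinates and links can be attached
    return "ambiguous", matches        # the genealogist has to pick one


status, matches = check_place("gouda", GAZETTEER)
print(status, [(p.latitude, p.longitude) for p in matches])
```

Only a "unique" result yields a single ID to which coordinates and further information can be attached; the other two outcomes need attention from the genealogist.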

This information is used on the About the town page, see for example the About the town Gouda page. Based on the identifying Geonames ID extra information can be collected (via DBpedia) about the town, like a descriptive text and photo.

Image: Gouda vanuit de lucht.jpg. Source: Wikimedia Commons, page Gouda

Genealogie Online tries, with the help of its users, to link the towns to archives. A nice open data source is the (Dutch) Archief Wiki. By linking the archives to the towns they have material about, genealogists can be directed to the right archive based on the place name.

Another nice source shown on the About the town pages is provided by rijksmonumenten.info, which in turn gets its data from the Cultural Heritage Agency of the Netherlands, Wikipedia and Flickr. This dataset can (among other things) be searched by longitude and latitude. This results in images of national monuments around that position!
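
What a search by longitude and latitude does can be sketched as a distance filter. The service performs this search itself; the code below only illustrates the idea with the haversine formula on invented monument records, and the function names and the 2 km radius are assumptions.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def monuments_near(lat, lon, monuments, max_km=2.0):
    """Return (distance in km, name) pairs within max_km, nearest first."""
    hits = []
    for name, m_lat, m_lon in monuments:
        distance = haversine_km(lat, lon, m_lat, m_lon)
        if distance <= max_km:
            hits.append((round(distance, 2), name))
    return sorted(hits)


# Invented records: (name, latitude, longitude).
MONUMENTS = [
    ("Monument A", 52.011, 4.711),
    ("Monument B", 52.500, 5.000),
]

print(monuments_near(52.01, 4.71, MONUMENTS))
```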

Open data, new possibilities, new insights

This article gives some examples of how Genealogie Online uses open data to offer context to genealogists.

The nice thing is that we’re only at the beginning of the open data movement. The more organisations, including archives (like trendsetter Archief Leiden), offer open data, the more opportunities there will be. Of course you have to watch out for copyright and privacy issues, and IT systems have to support it, but these are manageable issues.

Open data can lead to more insight, new functionality, more economic activity!


Development of the pedigree-timeline

A timeline is a nice representation of events in time. A timeline can also be a useful tool, because it lets you view the data from a different perspective and so gain new insights. For genealogists, the timeline can be useful too! For a while now, I have had the idea to combine the timeline with a pedigree chart.


A timeline is a graphical representation of a chronological sequence of events or time periods. This view has the form of a bar and has timestamps with inscriptions or captions.

You can create timelines yourself through services like TimeToast, TimeRime or Tijdbalk.nl (of which you see an image below). On these websites you have to enter the data manually.


The timeline in genealogy

On Genealogie Online the timeline is used to give more insight into the life of a person. Below is an example of Willem Frederik Lamoraal Boissevain. The red rectangle depicts the life of the person; below it, the lifespans of grandparents, parents, brothers and sisters, and children are placed in the timeline. It reflects what the person experienced in terms of births and deaths, and who lived at the same time.


The pedigree chart
A pedigree chart is a representation of all direct ancestors in the male and female lines of a person.

Although you can show birth and death dates in both textual and graphical pedigree charts, it is difficult to see the overlap in lifetimes. This is where the idea of combining the pedigree chart with the timeline hit me.

The pedigree-timeline

The pedigree-timeline shows both the relationships between child and parents and the lifespans of all the ancestors. The following image was the first sketch of a pedigree-timeline.


Although the combination is correct, this representation is somewhat hard to read. Because the proband is on the left and also represents the most recent time, the bars start at death and end at birth. Not logical ... so let’s turn it around!

The first prototype


The first sketch was made in a drawing program, but a prototype followed (image shown above). This was a working prototype of the pedigree-timeline, one you can view in the browser.

In this prototype there are also several types of bars (not present in the first sketch). For some ancestors you might not know the date of birth or death; this is shown with a striped beginning or end. The pedigree-timeline may also be based on a living person, or some ancestors may still be alive. In that case, the "lifespan bar" ends with a triangle. Finally, the bars are coloured pink or blue to indicate the sex.
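
The bar types described above can be summarised as a small decision function. This is a sketch of the rules as stated in the text, not the prototype's actual code; the `Ancestor` record, the style labels and the mapping of pink to female and blue to male are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ancestor:
    name: str
    sex: str                      # "F" or "M"
    birth_year: Optional[int]     # None when unknown
    death_year: Optional[int]     # None when unknown or still alive
    living: bool = False


def bar_style(person):
    """Decide how the lifespan bar of one person is drawn."""
    # Unknown birth: striped beginning.
    start = "solid" if person.birth_year is not None else "striped"
    # Living person: triangle; unknown death: striped end.
    if person.living:
        end = "triangle"
    elif person.death_year is not None:
        end = "solid"
    else:
        end = "striped"
    # Assumed colour convention: pink for female, blue for male.
    colour = "pink" if person.sex == "F" else "blue"
    return {"start": start, "end": end, "colour": colour}


print(bar_style(Ancestor("Example ancestor", "F", 1850, None)))
```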

Second prototype

In a pedigree chart you can easily distinguish the generations. In the first sketch and prototype of the pedigree-timeline this was less visible, and you missed it. This was fixed in a second prototype by using a separate colour for each generation.


Data requirements

To create a pedigree chart you only need information about the names and the child-parent relationships. A pedigree-timeline requires more: information about birth / baptism and death / funeral. Only this lifespan information can be put as a bar in the timeline. The timeline can handle approximate dates, but you have to be able to estimate lifespans.

This is an additional challenge, which became very apparent when I tried to generate the data for the pedigree-timeline. I try to generate this dataset from a GEDCOM file. If no date of birth is known, we have to make an educated guess to find a minimal year of birth. This can be done by looking at the wedding and assuming the person was at least 18 years old at that date, or by looking at the dates of birth of the children. A similar set of estimation rules has been drawn up to determine dates of death.

Based on these estimates, approximate lifespan bars can be put in the pedigree-timeline. When no dates are available and none can be guessed, the person is not shown in the pedigree-timeline!
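
The birth-year rules above could be sketched like this. The minimum age of 18 at the wedding comes from the text; applying the same minimum age to the birth of the first child is an assumption, as are the function name and its parameters. The rules for dates of death are not spelled out here, so they are left out.

```python
MIN_ADULT_AGE = 18  # from the text for weddings; reused for parenthood as an assumption


def estimate_birth_year(birth_year=None, wedding_year=None, child_birth_years=()):
    """Return (year, is_estimate), or None when nothing can be guessed."""
    if birth_year is not None:
        return birth_year, False
    candidates = []
    if wedding_year is not None:
        candidates.append(wedding_year - MIN_ADULT_AGE)
    if child_birth_years:
        candidates.append(min(child_birth_years) - MIN_ADULT_AGE)
    if not candidates:
        return None  # no bar can be drawn for this person
    # Every candidate is a "born no later than" bound, so keep the earliest one.
    return min(candidates), True


print(estimate_birth_year(wedding_year=1850, child_birth_years=[1848, 1845]))
```

A result marked as an estimate would be drawn with a striped beginning, and a `None` result means the person is left out of the chart.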

The future of the pedigree-timeline

The development of a new type of genealogical graph, from idea through sketches to working prototypes has been fun, hence this article.

The technology is not completely finished yet; for example, it doesn’t work smoothly on tablets. But I hope that this new genealogical chart will soon be available on Genealogie Online. Then you’ll find the pedigree-timeline next to the 'pedigree on the map' chart!


And who knows, maybe other family tree websites and programs will also include the pedigree-timeline…

This article is a translation of Ontwikkeling van de kwartiertijdbalk.