2014/12/31

Analysing 635M lines of GEDCOM

The GEDCOM parser of Genealogie Online needed a rewrite. The code base had grown out of proportion, resulting in inefficient code and cumbersome maintenance.

A big difference between the start of coding the GEDCOM parser and now is the number of GEDCOM files available: nearly 7 thousand. This gave me the opportunity to do some analysis (and more testing)!

 

Analysis of versions and character sets

First of all, the headers of all these GEDCOM files were examined to get a feel for which GEDCOM grammars and character sets were used.

GEDCOM version*   Count
5.5               6,339
(undefined)         248
5.5.1               245
v.1.0.01 Beta        12
5.3                   5
4.0                   2
2.0                   1
4                     1
5.01                  1
Total             6,854

* The GEDCOM version as presented in HEAD > GEDC > VERS. I did not check whether the content actually conformed to the presented grammar version. I did manually check the 5.3, 4, etc. versions; at first glance they seemed to be just GEDCOM 5.5.
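
Reading these two header values could look roughly like the sketch below. It is a minimal illustration, not the code used for Genealogie Online: the function names are made up for this example, and it assumes the file has already been decoded into text lines. It tracks the tag path so that HEAD > GEDC > VERS is not confused with HEAD > SOUR > VERS or HEAD > CHAR > VERS.

```python
from typing import Iterable, Optional, Tuple


def parse_line(line: str) -> Optional[Tuple[int, str, str]]:
    """Split a GEDCOM line into (level, tag, value); None if it cannot be parsed."""
    parts = line.strip().lstrip("\ufeff").split(" ", 2)
    if len(parts) < 2 or not parts[0].isdigit():
        return None
    value = parts[2] if len(parts) > 2 else ""
    return int(parts[0]), parts[1], value


def header_info(lines: Iterable[str]) -> dict:
    """Return the advertised GEDCOM version and character set of the HEAD record."""
    info = {"version": "(undefined)", "charset": "(undefined)"}
    path = []  # tags from level 0 down to the current line
    for raw in lines:
        parsed = parse_line(raw)
        if parsed is None:
            continue
        level, tag, value = parsed
        if level == 0 and tag != "HEAD":
            break  # the header has ended, no need to read the rest of the file
        del path[level:]
        path.append(tag)
        if path == ["HEAD", "GEDC", "VERS"] and value:
            info["version"] = value
        elif path == ["HEAD", "CHAR"] and value:
            info["charset"] = value
    return info
```

Because the loop stops at the first record after HEAD, only a handful of lines per file need to be read, which matters when scanning thousands of files.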


The fact that only 3.6% of the GEDCOM files identify themselves as 5.5.1 surprised me, as this is regarded as the current de facto standard.

It must be noted that a large portion of the GEDCOM files were produced by Dutch family tree programs. But, as can be seen on the Used family tree programs page on Genealogie Online (click on a program name to expand its statistics), only Legacy, MacFamilyTree, Ahnenblatt, PhpGedView and RootsMagic advertise their GEDCOM with the 5.5.1 label.

For the GEDCOM parser the conclusion was clear: support GEDCOM 5.5 (and 5.5rev) and 5.5.1 files.

Character set   Count
ANSI            4,395
UTF-8           1,269
ANSEL             692
ASCII             312
(undefined)        95
WINDOWS            27
IBMPC              26
MACINTOSH          21
IBM WINDOWS        11
UNICODE             5
windows-1251        1
Total           6,854


The number of files claiming to be UTF-8 is funny, because UTF-8 was only introduced in GEDCOM 5.5.1. Yet 1,269 files claimed to be UTF-8, while only 245 files claimed to be GEDCOM 5.5.1. This puts the low 3.6% in another perspective…

Fortunately, I could re-use the code from the old GEDCOM parser that correctly handles character sets and encoding (it was a solid piece of code).

Note: Tim Forsythe publishes similar stats from GigaTrees, which paints a more American picture (for example: 14.4% GEDCOM 5.5.1).

 

Analysis of actual use

The old GEDCOM parser also included support for invalid GEDCOM tags and custom GEDCOM tags. Although I wrote the article “GEDCOM files which don’t adhere to the GEDCOM standard shouldn’t be allowed to be called GEDCOM”, for Genealogie Online I’m more forgiving. I want to present the genealogical data of my users and don’t want to bother them too much with the fact that their family tree program isn’t producing valid GEDCOM. But which of the invalid and custom tags should the new GEDCOM parser support?

I decided to read all the GEDCOM files and count the tag-sequence uses. This resulted in a CSV file which looks like:

INDI-BIRT-AGE,45
INDI-BIRT-AGNC,1820
INDI-BIRT-DATE-ANC,162
INDI-BIRT-DATE-NOTE,172764
INDI-BIRT-DATE-NOTE-CONT,11311
INDI-BIRT-DATE-SOUR,39752
INDI-BIRT-DATE-SOUR-DATE,15951
INDI-BIRT-DATE-SOUR-ITEM,16825
INDI-BIRT-DATE-SOUR-PAGE,486
INDI-BIRT-DATE-SOUR-ROLE,36055
INDI-BIRT-DOCTOR,1
INDI-BIRT-EMAIL,1
INDI-BIRT-FAMC,1223
INDI-BIRT-LABL,4092
INDI-BIRT-LATI,49730
INDI-BIRT-LONG,49730
INDI-BIRT-MOON,37
INDI-BIRT-NOTE,720806
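
The counting step can be sketched as follows. This is a minimal illustration under assumptions, not the script that produced the file above: the function names are invented for this example, input is assumed to be decoded text lines, and every prefix of a sequence is counted as well (the real file may have been built differently).

```python
import csv
import io
from collections import Counter


def count_tag_sequences(lines, counts=None):
    """Count tag sequences such as INDI-BIRT-DATE in one GEDCOM file.

    Pass the same Counter again to accumulate counts over many files.
    """
    counts = Counter() if counts is None else counts
    path = []  # tags from level 0 down to the current line
    for raw in lines:
        parts = raw.strip().lstrip("\ufeff").split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # not a parseable GEDCOM line
        level = int(parts[0])
        if parts[1].startswith("@") and len(parts) > 2:
            # Record lines look like "0 @I1@ INDI": skip the cross-reference id.
            tag = parts[2].split(" ", 1)[0]
        else:
            tag = parts[1]
        del path[level:]
        path.append(tag)
        counts["-".join(path)] += 1
    return counts


def to_csv(counts):
    """Serialise the counts as 'sequence,count' rows, sorted by sequence."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for sequence in sorted(counts):
        writer.writerow([sequence, counts[sequence]])
    return out.getvalue()
```

Keeping only the current tag path in memory means each file is processed in a single pass, regardless of its size.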

The next step in the analysis was visualisation of this file. I opted for my favourite JavaScript module, D3.js, which provides a cool collapsible tree. The result is available to all those interested on the GEDCOM tag usage page (also downloadable and re-usable under a CC-BY license).
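
A tree visualisation needs nested data rather than flat rows. The sketch below shows one way to turn the tag-sequence rows into a nested structure; it is an assumption-laden illustration, not the conversion actually used for the tag usage page. The `name`, `count` and `children` field names and the root label are choices made for this example (`children` is the property D3's hierarchy helper reads by default).

```python
import json


def build_tree(rows, root_name="GEDCOM"):
    """Nest (sequence, count) rows such as ("INDI-BIRT-AGE", 45) into a tree."""
    root = {"name": root_name, "count": 0, "children": []}
    for sequence, count in rows:
        node = root
        for tag in sequence.split("-"):
            # Reuse the child for this tag if it exists, otherwise create it.
            child = next((c for c in node["children"] if c["name"] == tag), None)
            if child is None:
                child = {"name": tag, "count": 0, "children": []}
                node["children"].append(child)
            node = child
        node["count"] = count  # the count belongs to the full sequence
    return root


def to_json(rows):
    """Serialise the nested tree so a browser-side script can load it."""
    return json.dumps(build_tree(rows), indent=1)
```

Intermediate nodes that never appear as a row of their own keep a count of 0, which makes it easy to spot them in the output.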


The colour of the node indicates if the tag-sequence is valid under the GEDCOM 5.5 grammar (red > 83.7%) or not (grey > 16.2%). This visualisation aspect is not completely accurate as not all GEDCOM files are version 5.5 (the actual version wasn't taken into account).

These tag trees give a good picture of usage. If an invalid or custom tag is used a lot, I would look into implementing support for it in the GEDCOM parser.

For fun I also made selections for the top-10 programs used by Genealogie Online users. This way, you can see which program has more or less invalid/custom tags…

For my own reference I made tag trees for GEDCOM 5.5 (the “2 January 1996” version; this was hindered by the fact that «NOTE_STRUCTURE» references «SOURCE_CITATION» and vice versa, thus introducing a loop!) and GEDCOM 5.5.1.
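
To illustrate the loop: a naive expansion of the grammar into tag sequences would recurse forever, so the expansion has to stop when it meets a structure it is already expanding. The sketch below shows that idea on a heavily trimmed excerpt of the two structures; the `GRAMMAR` dictionary and the `expand` function are hypothetical names for this example, and this is not necessarily how the tag trees above were generated.

```python
# Trimmed excerpt of the "2 January 1996" grammar: structure -> (tag, children).
# Children in << >> refer to another structure, all others are plain tags.
GRAMMAR = {
    "NOTE_STRUCTURE": ("NOTE", ["CONC", "CONT", "<<SOURCE_CITATION>>"]),
    "SOURCE_CITATION": ("SOUR", ["PAGE", "QUAY", "<<NOTE_STRUCTURE>>"]),
}


def expand(structure, prefix=(), active=()):
    """Yield tag sequences for a structure, cutting off mutual references."""
    if structure in active:
        return  # loop detected: this structure is already being expanded
    tag, children = GRAMMAR[structure]
    path = prefix + (tag,)
    yield "-".join(path)
    for child in children:
        if child.startswith("<<"):
            yield from expand(child.strip("<>"), path, active + (structure,))
        else:
            yield "-".join(path + (child,))
```

With this guard, `list(expand("NOTE_STRUCTURE"))` ends at NOTE-SOUR-QUAY instead of continuing with NOTE-SOUR-NOTE-SOUR and so on.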

The data used for all of these tag trees is also downloadable in CSV and JSON format under a CC-BY license.

The end result, besides nice visualizations, is a leaner, more robust and more complete GEDCOM parser for Genealogie Online! Users will notice better support for, and presentation of, sources and notes, and for some programs the use of RIN for identification of persons.

Which GEDCOM 5.5 grammar is correct?

The GEDCOM 5.5 standard is described in a PDF document prepared by the Family History Department of The Church of Jesus Christ of Latter-day Saints dated 2 January 1996 (which in two days is 19 years ago).

When you Google for the GEDCOM 5.5 grammar you usually end up on the HTML version by Paul McBride which he himself calls “unofficial” (or you find the grammar files of Gedcom.pm by Paul Johnson). But over the years no one seemed to have noticed that the HTML version has a slightly different date “2 January 1996 [Revised 10 January 1996]” and differences in grammar!

Errata Sheet

Although the PDF document includes an Errata Sheet, it seems there are others. When you dig into the archives of the Internet you can find references to an Errata Sheet dated 10 January 1996 which was faxed to some people.
A GEDCOM 5.5 Errata Sheet dated 10 January 1996 supposedly contains corrections to pages 23, 24, 25, 26, 29, 29, 29, 33, 34, 39, 57, 79, and 85.
Unfortunately, this document has not hit the Internet yet, so we can’t say for sure that the “10 January 1996” version by McBride is based on this Errata Sheet.
Some of the differences in the GEDCOM 5.5 grammar between the “2 January 1996” and “Revised 10 January 1996” versions are small (typos) but some are big (see the diff below)!

Big questions

I think the “Revised 10 January 1996” version - let's call this version GEDCOM 5.5rev - is used a lot, mainly because the HTML version is more accessible. But should we consider this an official version? In my opinion: no (because it is not an official LDS publication).

If there was an Errata Sheet dated 10 January 1996, why didn’t the LDS publish it (in PDF form, online), and why didn’t they release a new GEDCOM version, which they should have, considering some of the changes are big?

A draft of version 5.5.1 was only published on 2 October 1999 (see FamilySearch GEDCOM Specifications by Tamura Jones for a complete overview of specifications). This document contains a section which enumerates the differences with the previous version. But some of the changes relative to the “2 January 1996” version that you can see in the “Revised 10 January 1996” version weren’t mentioned in this section. I guess the LDS were internally uncertain too about which GEDCOM 5.5 grammar was the correct one.

GEDCOM 5.5 Grammar Diff

Below is a comparison between the Record Structures and Substructures of the Lineage-Linked Form (the Primitive elements of the Lineage-Linked Form are the same) between the “2 January 1996” and “Revised 10 January 1996” versions. I only focussed on the grammar, not the rest of the text in the specification. Orange highlighting means a small difference, yellow highlighting indicates a big difference. The table can also be downloaded in PDF format.


Lineage-Linked GEDCOM Form's grammar 5.5 Lineage-Linked GEDCOM Form's grammar 5.5
LDS/PDF version, dated 2 January 1996 McBride/HTML version, revised 10 January 1996
LINEAGE_LINKED_GEDCOM:= LINEAGE_LINKED_GEDCOM:=
0 <<HEADER>> {1:1} 0 <<HEADER>> {1:1}
0 <<SUBMISSION_RECORD>> {0:1} 0 <<SUBMISSION_RECORD>> {0:1}
0 <<RECORD>> {1:M} 0 <<RECORD>> {1:M}
0 TRLR {1:1} 0 TRLR {1:1}
HEADER:= HEADER:=
n HEAD {1:1} n HEAD {1:1}
+1 SOUR <APPROVED_SYSTEM_ID> {1:1} +1 SOUR <APPROVED_SYSTEM_ID> {1:1}
+2 VERS <VERSION_NUMBER> {0:1} +2 VERS <VERSION_NUMBER> {0:1}
+2 NAME <NAME_OF_PRODUCT> {0:1} +2 NAME <NAME_OF_PRODUCT> {0:1}
+2 CORP <NAME_OF_BUSINESS> {0:1} +2 CORP <NAME_OF_BUSINESS> {0:1}
+3 <<ADDRESS_STRUCTURE>> {0:1} +3 <<ADDRESS_STRUCTURE>> {0:1}
+2 DATA <NAME_OF_SOURCE_DATA> {0:1} +2 DATA <NAME_OF_SOURCE_DATA> {0:1}
+3 DATE <PUBLICATION_DATE> {0:1} +3 DATE <PUBLICATION_DATE> {0:1}
+3 COPR <COPYRIGHT_SOURCE_DATA> {0:1} +3 COPR <COPYRIGHT_SOURCE_DATA> {0:1}
+1 DEST <RECEIVING_SYSTEM_NAME> {0:1*} +1 DEST <RECEIVING_SYSTEM_NAME> {0:1*}
+1 DATE <TRANSMISSION_DATE> {0:1} +1 DATE <TRANSMISSION_DATE> {0:1}
+2 TIME <TIME_VALUE> {0:1} +2 TIME <TIME_VALUE> {0:1}
+1 SUBM @XREF:SUBM@ {1:1} +1 SUBM @<XREF:SUBM>@ {1:1}
+1 SUBN @XREF:SUBN@ {0:1} +1 SUBN @<XREF:SUBN>@ {0:1}
+1 FILE <FILE_NAME> {0:1} +1 FILE <FILE_NAME> {0:1}
+1 COPR <COPYRIGHT_GEDCOM_FILE> {0:1} +1 COPR <COPYRIGHT_GEDCOM_FILE> {0:1}
+1 GEDC {1:1} +1 GEDC {1:1}
+2 VERS <VERSION_NUMBER> {1:1} +2 VERS <VERSION_NUMBER> {1:1}
+2 FORM <GEDCOM_FORM> {1:1} +2 FORM <GEDCOM_FORM> {1:1}
+1 CHAR <CHARACTER_SET> {1:1} +1 CHAR <CHARACTER_SET> {1:1}
+2 VERS <VERSION_NUMBER> {0:1} +2 VERS <VERSION_NUMBER> {0:1}
+1 LANG <LANGUAGE_OF_TEXT> {0:1} +1 LANG <LANGUAGE_OF_TEXT> {0:1}
+1 PLAC {0:1} +1 PLAC {0:1}
+2 FORM <PLACE_HIERARCHY> {1:1} +2 FORM <PLACE_HIERARCHY> {1:1}
+1 NOTE <GEDCOM_CONTENT_DESCRIPTION> {0:1} +1 NOTE <GEDCOM_CONTENT_DESCRIPTION> {0:1}
+2 [CONT|CONC] <GEDCOM_CONTENT_DESCRIPTION> {0:M} +2 [CONT|CONC] <GEDCOM_CONTENT_DESCRIPTION> {0:M}
RECORD:= RECORD:=
[ [
n <<FAM_RECORD>> {1:1} n <<FAM_RECORD>> {1:1}
| |
n <<INDIVIDUAL_RECORD>> {1:1} n <<INDIVIDUAL_RECORD>> {1:1}
| |
n <<MULTIMEDIA_RECORD>> {1:M} n <<MULTIMEDIA_RECORD>> {1:M}
| |
n <<NOTE_RECORD>> {1:1} n <<NOTE_RECORD>> {1:1}
| |
n <<REPOSITORY_RECORD>> {1:1} n <<REPOSITORY_RECORD>> {1:1}
| |
n <<SOURCE_RECORD>> {1:1} n <<SOURCE_RECORD>> {1:1}
| |
n <<SUBMITTER_RECORD>> {1:1} n <<SUBMITTER_RECORD>> {1:1}
] ]
FAM_RECORD:= FAM_RECORD:=
n @<XREF:FAM>@ FAM {1:1} n @<XREF:FAM>@ FAM {1:1}
+1 <<FAMILY_EVENT_STRUCTURE>> {0:M} +1 <<FAMILY_EVENT_STRUCTURE>> {0:M}
+2 HUSB {0:1} +2 HUSB {0:1}
+3 AGE <AGE_AT_EVENT> {1:1} +3 AGE <AGE_AT_EVENT> {1:1}
+2 WIFE {0:1} +2 WIFE {0:1}
+3 AGE <AGE_AT_EVENT> {1:1} +3 AGE <AGE_AT_EVENT> {1:1}
+1 HUSB @<XREF:INDI>@ {0:1} +1 HUSB @<XREF:INDI>@ {0:1}
+1 WIFE @<XREF:INDI>@ {0:1} +1 WIFE @<XREF:INDI>@ {0:1}
+1 CHIL @<XREF:INDI>@ {0:M} +1 CHIL @<XREF:INDI>@ {0:M}
+1 NCHI <COUNT_OF_CHILDREN> {0:1} +1 NCHI <COUNT_OF_CHILDREN> {0:1}
+1 SUBM @<XREF:SUBM>@ {0:M} +1 SUBM @<XREF:SUBM>@ {0:M}
+1 <<LDS_SPOUSE_SEALING>> {0:M} +1 <<LDS_SPOUSE_SEALING>> {0:M}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+2 <<NOTE_STRUCTURE>> {0:M}
+2 <<MULTIMEDIA_LINK>> {0:M}
+1 <<MULTIMEDIA_LINK>> {0:M} +1 <<MULTIMEDIA_LINK>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
+1 REFN <USER_REFERENCE_NUMBER> {0:M} +1 REFN <USER_REFERENCE_NUMBER> {0:M}
+2 TYPE <USER_REFERENCE_TYPE> {0:1} +2 TYPE <USER_REFERENCE_TYPE> {0:1}
+1 RIN <AUTOMATED_RECORD_ID> {0:1} +1 RIN <AUTOMATED_RECORD_ID> {0:1}
+1 <<CHANGE_DATE>> {0:1} +1 <<CHANGE_DATE>> {0:1}
INDIVIDUAL_RECORD:= INDIVIDUAL_RECORD:=
n @XREF:INDI@ INDI {1:1} n @<XREF:INDI>@ INDI {1:1}
+1 RESN <RESTRICTION_NOTICE> {0:1} +1 RESN <RESTRICTION_NOTICE> {0:1}
+1 <<PERSONAL_NAME_STRUCTURE>> {0:M} +1 <<PERSONAL_NAME_STRUCTURE>> {0:M}
+1 SEX <SEX_VALUE> {0:1} +1 SEX <SEX_VALUE> {0:1}
+1 <<INDIVIDUAL_EVENT_STRUCTURE>> {0:M} +1 <<INDIVIDUAL_EVENT_STRUCTURE>> {0:M}
+1 <<INDIVIDUAL_ATTRIBUTE_STRUCTURE>> {0:M} +1 <<INDIVIDUAL_ATTRIBUTE_STRUCTURE>> {0:M}
+1 <<LDS_INDIVIDUAL_ORDINANCE>> {0:M} +1 <<LDS_INDIVIDUAL_ORDINANCE>> {0:M}
+1 <<CHILD_TO_FAMILY_LINK>> {0:M} +1 <<CHILD_TO_FAMILY_LINK>> {0:M}
+1 <<SPOUSE_TO_FAMILY_LINK>> {0:M} +1 <<SPOUSE_TO_FAMILY_LINK>> {0:M}
+1 SUBM @<XREF:SUBM>@ {0:M} +1 SUBM @<XREF:SUBM>@ {0:M}
+1 <<ASSOCIATION_STRUCTURE>> {0:M} +1 <<ASSOCIATION_STRUCTURE>> {0:M}
+1 ALIA @<XREF:INDI>@ {0:M} +1 ALIA @<XREF:INDI>@ {0:M}
+1 ANCI @<XREF:SUBM>@ {0:M} +1 ANCI @<XREF:SUBM>@ {0:M}
+1 DESI @<XREF:SUBM>@ {0:M} +1 DESI @<XREF:SUBM>@ {0:M}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+1 <<MULTIMEDIA_LINK>> {0:M} +1 <<MULTIMEDIA_LINK>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
+1 RFN <PERMANENT_RECORD_FILE_NUMBER> {0:1} +1 RFN <PERMANENT_RECORD_FILE_NUMBER> {0:1}
+1 AFN <ANCESTRAL_FILE_NUMBER> {0:1} +1 AFN <ANCESTRAL_FILE_NUMBER> {0:1}
+1 REFN <USER_REFERENCE_NUMBER> {0:M} +1 REFN <USER_REFERENCE_NUMBER> {0:M}
+2 TYPE <USER_REFERENCE_TYPE> {0:1} +2 TYPE <USER_REFERENCE_TYPE> {0:1}
+1 RIN <AUTOMATED_RECORD_ID> {0:1} +1 RIN <AUTOMATED_RECORD_ID> {0:1}
+1 <<CHANGE_DATE>> {0:1} +1 <<CHANGE_DATE>> {0:1}
MULTIMEDIA_RECORD:= MULTIMEDIA_RECORD:=
n @XREF:OBJE@ OBJE {1:1} n @<XREF:OBJE>@ OBJE {1:1}
+1 FORM <MULTIMEDIA_FORMAT> {1:1} +1 FORM <MULTIMEDIA_FORMAT> {1:1}
+1 TITL <DESCRIPTIVE_TITLE> {0:1} +1 TITL <DESCRIPTIVE_TITLE> {0:1}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
+1 <<SOURCE_CITATION>> {0:M}
+1 BLOB {1:1} +1 BLOB {1:1}
+2 CONT <ENCODED_MULTIMEDIA_LINE> {1:M} +2 CONT <ENCODED_MULTIMEDIA_LINE> {1:M}
+1 OBJE @<XREF:OBJE>@ /* chain to continued object */ {0:1} +1 OBJE @<XREF:OBJE>@ /* chain to continued object */ {0:1}
+1 REFN <USER_REFERENCE_NUMBER> {0:M} +1 REFN <USER_REFERENCE_NUMBER> {0:M}
+2 TYPE <USER_REFERENCE_TYPE> {0:1} +2 TYPE <USER_REFERENCE_TYPE> {0:1}
+1 RIN <AUTOMATED_RECORD_ID> {0:1} +1 RIN <AUTOMATED_RECORD_ID> {0:1}
+1 <<CHANGE_DATE>> {0:1} +1 <<CHANGE_DATE>> {0:1}
NOTE_RECORD:= NOTE_RECORD:=
n @<XREF:NOTE>@ NOTE <SUBMITTER_TEXT> {1:1} n @<XREF:NOTE>@ NOTE <SUBMITTER_TEXT> {1:1}
+1 [ CONC | CONT] <SUBMITTER_TEXT> {0:M} +1 [ CONC | CONT] <SUBMITTER_TEXT> {0:M}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+1 REFN <USER_REFERENCE_NUMBER> {0:M} +1 REFN <USER_REFERENCE_NUMBER> {0:M}
+2 TYPE <USER_REFERENCE_TYPE> {0:1} +2 TYPE <USER_REFERENCE_TYPE> {0:1}
+1 RIN <AUTOMATED_RECORD_ID> {0:1} +1 RIN <AUTOMATED_RECORD_ID> {0:1}
+1 <<CHANGE_DATE>> {0:1} +1 <<CHANGE_DATE>> {0:1}
REPOSITORY_RECORD:= REPOSITORY_RECORD:=
n @<XREF:REPO>@ REPO {1:1} n @<XREF:REPO>@ REPO {1:1}
+1 NAME <NAME_OF_REPOSITORY> {0:1} +1 NAME <NAME_OF_REPOSITORY> {0:1}
+1 <<ADDRESS_STRUCTURE>> {0:1} +1 <<ADDRESS_STRUCTURE>> {0:1}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
+1 REFN <USER_REFERENCE_NUMBER> {0:M} +1 REFN <USER_REFERENCE_NUMBER> {0:M}
+2 TYPE <USER_REFERENCE_TYPE> {0:1} +2 TYPE <USER_REFERENCE_TYPE> {0:1}
+1 RIN <AUTOMATED_RECORD_ID> {0:1} +1 RIN <AUTOMATED_RECORD_ID> {0:1}
+1 <<CHANGE_DATE>> {0:1} +1 <<CHANGE_DATE>> {0:1}
SOURCE_RECORD:= SOURCE_RECORD:=
n @<XREF:SOUR>@ SOUR {1:1} n @<XREF:SOUR>@ SOUR {1:1}
+1 DATA {0:1} +1 DATA {0:1}
+2 EVEN <EVENTS_RECORDED> {0:M} +2 EVEN <EVENTS_RECORDED> {0:M}
+3 DATE <DATE_PERIOD> {0:1} +3 DATE <DATE_PERIOD> {0:1}
+3 PLAC <SOURCE_JURISDICTION_PLACE> {0:1} +3 PLAC <SOURCE_JURISDICTION_PLACE> {0:1}
+2 AGNC <RESPONSIBLE_AGENCY> {0:1} +2 AGNC <RESPONSIBLE_AGENCY> {0:1}
+2 <<NOTE_STRUCTURE>> {0:M} +2 <<NOTE_STRUCTURE>> {0:M}
+1 AUTH <SOURCE_ORIGINATOR> {0:1} +1 AUTH <SOURCE_ORIGINATOR> {0:1}
+2 [CONT|CONC] <SOURCE_ORIGINATOR> {0:M} +2 [CONT|CONC] <SOURCE_ORIGINATOR> {0:M}
+1 TITL <SOURCE_DESCRIPTIVE_TITLE> {0:1} +1 TITL <SOURCE_DESCRIPTIVE_TITLE> {0:1}
+2 [CONT|CONC] <SOURCE_DESCRIPTIVE_TITLE> {0:M} +2 [CONT|CONC] <SOURCE_DESCRIPTIVE_TITLE> {0:M}
+1 ABBR <SOURCE_FILED_BY_ENTRY> {0:1} +1 ABBR <SOURCE_FILED_BY_ENTRY> {0:1}
+1 PUBL <SOURCE_PUBLICATION_FACTS> {0:1} +1 PUBL <SOURCE_PUBLICATION_FACTS> {0:1}
+2 [CONT|CONC] <SOURCE_PUBLICATION_FACTS> {0:M} +2 [CONT|CONC] <SOURCE_PUBLICATION_FACTS> {0:M}
+1 TEXT <TEXT_FROM_SOURCE> {0:1} +1 TEXT <TEXT_FROM_SOURCE> {0:1}
+2 [CONT|CONC] <TEXT_FROM_SOURCE> {0:M} +2 [CONT|CONC] <TEXT_FROM_SOURCE> {0:M}
+1 <<SOURCE_REPOSITORY_CITATION>> {0:1} +1 <<SOURCE_REPOSITORY_CITATION>> {0:1}
+1 <<MULTIMEDIA_LINK>> {0:M} +1 <<MULTIMEDIA_LINK>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
+1 REFN <USER_REFERENCE_NUMBER> {0:M} +1 REFN <USER_REFERENCE_NUMBER> {0:M}
+2 TYPE <USER_REFERENCE_TYPE> {0:1} +2 TYPE <USER_REFERENCE_TYPE> {0:1}
+1 RIN <AUTOMATED_RECORD_ID> {0:1} +1 RIN <AUTOMATED_RECORD_ID> {0:1}
+1 <<CHANGE_DATE>> {0:1} +1 <<CHANGE_DATE>> {0:1}
SUBMISSION_RECORD:= SUBMISSION_RECORD:=
n @XREF:SUBN@ SUBN {1:1] n @<XREF:SUBN>@ SUBN {1:1]
+1 SUBM @XREF:SUBM@ {0:1} +1 SUBM @<XREF:SUBM>@ {0:1}
+1 FAMF <NAME_OF_FAMILY_FILE> {0:1} +1 FAMF <NAME_OF_FAMILY_FILE> {0:1}
+1 TEMP <TEMPLE_CODE> {0:1} +1 TEMP <TEMPLE_CODE> {0:1}
+1 ANCE <GENERATIONS_OF_ANCESTORS> {0:1} +1 ANCE <GENERATIONS_OF_ANCESTORS> {0:1}
+1 DESC <GENERATIONS_OF_DESCENDANTS> {0:1} +1 DESC <GENERATIONS_OF_DESCENDANTS> {0:1}
+1 ORDI <ORDINANCE_PROCESS_FLAG> {0:1} +1 ORDI <ORDINANCE_PROCESS_FLAG> {0:1}
+1 RIN <AUTOMATED_RECORD_ID> {0:1} +1 RIN <AUTOMATED_RECORD_ID> {0:1}
SUBMITTER_RECORD:= SUBMITTER_RECORD:=
n @<XREF:SUBM>@ SUBM {1:1} n @<XREF:SUBM>@ SUBM {1:1}
+1 NAME <SUBMITTER_NAME> {1:1} +1 NAME <SUBMITTER_NAME> {1:1}
+1 <<ADDRESS_STRUCTURE>> {0:1} +1 <<ADDRESS_STRUCTURE>> {0:1}
+1 <<MULTIMEDIA_LINK>> {0:M} +1 <<MULTIMEDIA_LINK>> {0:M}
+1 LANG <LANGUAGE_PREFERENCE> {0:3} +1 LANG <LANGUAGE_PREFERENCE> {0:3}
+1 RFN <SUBMITTER_REGISTERED_RFN> {0:1} +1 RFN <SUBMITTER_REGISTERED_RFN> {0:1}
+1 RIN <AUTOMATED_RECORD_ID> {0:1} +1 RIN <AUTOMATED_RECORD_ID> {0:1}
+1 <<CHANGE_DATE>> {0:1} +1 <<CHANGE_DATE>> {0:1}
ADDRESS_STRUCTURE:= ADDRESS_STRUCTURE:=
n ADDR <ADDRESS_LINE> {0:1} n ADDR <ADDRESS_LINE> {0:1}
+1 CONT <ADDRESS_LINE> {0:M} +1 CONT <ADDRESS_LINE> {0:M}
+1 ADR1 <ADDRESS_LINE1> {0:1} +1 ADR1 <ADDRESS_LINE1> {0:1}
+1 ADR2 <ADDRESS_LINE2> {0:1} +1 ADR2 <ADDRESS_LINE2> {0:1}
+1 CITY <ADDRESS_CITY> {0:1} +1 CITY <ADDRESS_CITY> {0:1}
+1 STAE <ADDRESS_STATE> {0:1} +1 STAE <ADDRESS_STATE> {0:1}
+1 POST <ADDRESS_POSTAL_CODE> {0:1} +1 POST <ADDRESS_POSTAL_CODE> {0:1}
+1 CTRY <ADDRESS_COUNTRY> {0:1} +1 CTRY <ADDRESS_COUNTRY> {0:1}
n PHON <PHONE_NUMBER> {0:3} n PHON <PHONE_NUMBER> {0:3}
ASSOCIATION_STRUCTURE:= ASSOCIATION_STRUCTURE:=
n ASSO @<XREF:INDI>@ {0:M} n ASSO @<XREF:INDI>@ {0:M}
+1 TYPE <RECORD_TYPE> {1:1}
+1 RELA <RELATION_IS_DESCRIPTOR> {1:1} +1 RELA <RELATION_IS_DESCRIPTOR> {1:1}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
CHANGE_DATE:= CHANGE_DATE:=
n CHAN {1:1} n CHAN {1:1}
+1 DATE <CHANGE_DATE> {1:1} +1 DATE <CHANGE_DATE> {1:1}
+2 TIME <TIME_VALUE> {0:1} +2 TIME <TIME_VALUE> {0:1}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
CHILD_TO_FAMILY_LINK:= CHILD_TO_FAMILY_LINK:=
n FAMC @<XREF:FAM>@ {1:1} n FAMC @<XREF:FAM>@ {1:1}
+1 PEDI <PEDIGREE_LINKAGE_TYPE> {0:M} +1 PEDI <PEDIGREE_LINKAGE_TYPE> {0:1}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
EVENT_DETAIL:= EVENT_DETAIL:=
n TYPE <EVENT_DESCRIPTOR> {0:1} n TYPE <EVENT_DESCRIPTOR> {0:1}
n DATE <DATE_VALUE> {0:1} n DATE <DATE_VALUE> {0:1}
n <<PLACE_STRUCTURE>> {0:1} n <<PLACE_STRUCTURE>> {0:1}
n <<ADDRESS_STRUCTURE>> {0:1} n <<ADDRESS_STRUCTURE>> {0:1}
n AGE <AGE_AT_EVENT> {0:1} n AGE <AGE_AT_EVENT> {0:1}
n AGNC <RESPONSIBLE_AGENCY> {0:1} n AGNC <RESPONSIBLE_AGENCY> {0:1}
n CAUS <CAUSE_OF_EVENT> {0:1} n CAUS <CAUSE_OF_EVENT> {0:1}
n <<SOURCE_CITATION>> {0:M} n <<SOURCE_CITATION>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M}
+1 <<MULTIMEDIA_LINK>> {0:M}
n <<MULTIMEDIA_LINK>> {0:M} n <<MULTIMEDIA_LINK>> {0:M}
n <<NOTE_STRUCTURE>> {0:M} n <<NOTE_STRUCTURE>> {0:M}
FAMILY_EVENT_STRUCTURE:= FAMILY_EVENT_STRUCTURE:=
[ [
n [ ANUL | CENS | DIV | DIVF ] [Y|<NULL>] {1:1} n [ ANUL | CENS | DIV | DIVF ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n [ ENGA | MARR | MARB | MARC ] [Y|<NULL>] {1:1} n [ ENGA | MARR | MARB | MARC ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n [ MARL | MARS ] [Y|<NULL>] {1:1} n [ MARL | MARS ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n EVEN {1:1} n EVEN {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
] ]
INDIVIDUAL_ATTRIBUTE_STRUCTURE:= INDIVIDUAL_ATTRIBUTE_STRUCTURE:=
[ [
n CAST <CASTE_NAME> {1:1} n CAST <CASTE_NAME> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n DSCR <PHYSICAL_DESCRIPTION> {1:1} n DSCR <PHYSICAL_DESCRIPTION> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n EDUC <SCHOLASTIC_ACHIEVEMENT> {1:1} n EDUC <SCHOLASTIC_ACHIEVEMENT> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n IDNO <NATIONAL_ID_NUMBER> {1:1} n IDNO <NATIONAL_ID_NUMBER> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n NATI <NATIONAL_OR_TRIBAL_ORIGIN> {1:1} n NATI <NATIONAL_OR_TRIBAL_ORIGIN> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n NCHI <COUNT_OF_CHILDREN> {1:1} n NCHI <COUNT_OF_CHILDREN> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n NMR <COUNT_OF_MARRIAGES> {1:1} n NMR <COUNT_OF_MARRIAGES> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n OCCU <OCCUPATION> {1:1} n OCCU <OCCUPATION> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n PROP <POSSESSIONS> {1:1} n PROP <POSSESSIONS> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n RELI <RELIGIOUS_AFFILIATION> {1:1} n RELI <RELIGIOUS_AFFILIATION> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n RESI {1:1} n RESI {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n SSN <SOCIAL_SECURITY_NUMBER> {0:1} n SSN <SOCIAL_SECURITY_NUMBER> {0:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n TITL <NOBILITY_TYPE_TITLE> {1:1} n TITL <NOBILITY_TYPE_TITLE> {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
] ]
INDIVIDUAL_EVENT_STRUCTURE:= INDIVIDUAL_EVENT_STRUCTURE:=
[ [
n [ BIRT | CHR ] [Y|<NULL>] {1:1} n [ BIRT | CHR ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
+1 FAMC @<XREF:FAM>@ {0:1} +1 FAMC @<XREF:FAM>@ {0:1}
| |
n [ DEAT | BURI | CREM ] [Y|<NULL>] {1:1} n [ DEAT | BURI | CREM ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n ADOP [Y|<NULL>] {1:1} n ADOP [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
+1 FAMC @<XREF:FAM>@ {0:1} +1 FAMC @<XREF:FAM>@ {0:1}
+2 ADOP <ADOPTED_BY_WHICH_PARENT> {0:1} +2 ADOP <ADOPTED_BY_WHICH_PARENT> {0:1}
| |
n [ BAPM | BARM | BASM | BLES ] [Y|<NULL>] {1:1} n [ BAPM | BARM | BASM | BLES ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n [ CHRA | CONF | FCOM | ORDN ] [Y|<NULL>] {1:1} n [ CHRA | CONF | FCOM | ORDN ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n [ NATU | EMIG | IMMI ] [Y|<NULL>] {1:1} n [ NATU | EMIG | IMMI ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n [ CENS | PROB | WILL] [Y|<NULL>] {1:1} n [ CENS | PROB | WILL] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n [ GRAD | RETI ] [Y|<NULL>] {1:1} n [ GRAD | RETI ] [Y|<NULL>] {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
| |
n EVEN {1:1} n EVEN {1:1}
+1 <<EVENT_DETAIL>> {0:1} +1 <<EVENT_DETAIL>> {0:1}
] ]
LDS_INDIVIDUAL_ORDINANCE:= LDS_INDIVIDUAL_ORDINANCE:=
[ [
n [ BAPL | CONL ] {1:1} n [ BAPL | CONL ] {1:1}
+1 STAT <LDS_BAPTISM_DATE_STATUS> {0:1} +1 STAT <LDS_BAPTISM_DATE_STATUS> {0:1}
+1 DATE <DATE_LDS_ORD> {0:1} +1 DATE <DATE_LDS_ORD> {0:1}
+1 TEMP <TEMPLE_CODE> {0:1} +1 TEMP <TEMPLE_CODE> {0:1}
+1 PLAC <PLACE_LIVING_ORDINANCE> {0:1} +1 PLAC <PLACE_LIVING_ORDINANCE> {0:1}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
| |
n ENDL {1:1} n ENDL {1:1}
+1 STAT <LDS_ENDOWMENT_DATE_STATUS> {0:1} +1 STAT <LDS_ENDOWMENT_DATE_STATUS> {0:1}
+1 DATE <DATE_LDS_ORD> {0:1} +1 DATE <DATE_LDS_ORD> {0:1}
+1 TEMP <TEMPLE_CODE> {0:1} +1 TEMP <TEMPLE_CODE> {0:1}
+1 PLAC <PLACE_LIVING_ORDINANCE> {0:1} +1 PLAC <PLACE_LIVING_ORDINANCE> {0:1}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
| |
n SLGC {1:1} n SLGC {1:1}
+1 STAT <LDS_CHILD_SEALING_DATE_STATUS> {0:1} +1 STAT <LDS_CHILD_SEALING_DATE_STATUS> {0:1}
+1 DATE <DATE_LDS_ORD> {0:1} +1 DATE <DATE_LDS_ORD> {0:1}
+1 TEMP <TEMPLE_CODE> {0:1} +1 TEMP <TEMPLE_CODE> {0:1}
+1 PLAC <PLACE_LIVING_ORDINANCE> {0:1} +1 PLAC <PLACE_LIVING_ORDINANCE> {0:1}
+1 FAMC @<XREF:FAM>@ {1:1} +1 FAMC @<XREF:FAM>@ {1:1}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
] ]
LDS_SPOUSE_SEALING:= LDS_SPOUSE_SEALING:=
n SLGS {1:1} n SLGS {1:1}
+1 STAT <LDS_SPOUSE_SEALING_DATE_STATUS> {0:1} +1 STAT <LDS_SPOUSE_SEALING_DATE_STATUS> {0:1}
+1 DATE <DATE_LDS_ORD> {0:1} +1 DATE <DATE_LDS_ORD> {0:1}
+1 TEMP <TEMPLE_CODE> {0:1} +1 TEMP <TEMPLE_CODE> {0:1}
+1 PLAC <PLACE_LIVING_ORDINANCE> {0:1} +1 PLAC <PLACE_LIVING_ORDINANCE> {0:1}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
MULTIMEDIA_LINK:= MULTIMEDIA_LINK:=
[ /* embedded form*/ [ /* embedded form*/
n OBJE @<XREF:OBJE>@ {1:1} n OBJE @<XREF:OBJE>@ {1:1}
| /* linked form*/ | /* linked form*/
n OBJE {1:1} n OBJE {1:1}
+1 FORM <MULTIMEDIA_FORMAT> {1:1} +1 FORM <MULTIMEDIA_FORMAT> {1:1}
+1 TITL <DESCRIPTIVE_TITLE> {0:1} +1 TITL <DESCRIPTIVE_TITLE> {0:1}
+1 FILE <MULTIMEDIA_FILE_REFERENCE> {1:1} +1 FILE <MULTIMEDIA_FILE_REFERENCE> {1:1}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
] ]
NOTE_STRUCTURE:= NOTE_STRUCTURE:=
[ [
n NOTE @<XREF:NOTE>@ {1:1} n NOTE @<XREF:NOTE>@ {1:1}
+1 <<SOURCE_CITATION>> {0:M} +1 SOUR @<XREF:SOUR>@ {0:M}
| |
n NOTE [SUBMITTER_TEXT> | <NULL>] {1:1} n NOTE [<SUBMITTER_TEXT> | <NULL>] {1:1}
+1 [ CONC | CONT ] <SUBMITTER_TEXT> {0:M} +1 [ CONC | CONT ] <SUBMITTER_TEXT> {0:M}
+1 <<SOURCE_CITATION>> {0:M} +1 SOUR @<XREF:SOUR>@ {0:M}
] ]
PERSONAL_NAME_STRUCTURE:= PERSONAL_NAME_STRUCTURE:=
n NAME <NAME_PERSONAL> {1:1} n NAME <NAME_PERSONAL> {1:1}
+1 NPFX <NAME_PIECE_PREFIX> {0:1} +1 NPFX <NAME_PIECE_PREFIX> {0:1}
+1 GIVN <NAME_PIECE_GIVEN> {0:1} +1 GIVN <NAME_PIECE_GIVEN> {0:1}
+1 NICK <NAME_PIECE_NICKNAME> {0:1} +1 NICK <NAME_PIECE_NICKNAME> {0:1}
+1 SPFX <NAME_PIECE_SURNAME_PREFIX {0:1} +1 SPFX <NAME_PIECE_SURNAME_PREFIX> {0:1}
+1 SURN <NAME_PIECE_SURNAME> {0:1} +1 SURN <NAME_PIECE_SURNAME> {0:1}
+1 NSFX <NAME_PIECE_SUFFIX> {0:1} +1 NSFX <NAME_PIECE_SUFFIX> {0:1}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+2 <<NOTE_STRUCTURE>> {0:M}
+2 <<MULTIMEDIA_LINK>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
PLACE_STRUCTURE:= PLACE_STRUCTURE:=
n PLAC <PLACE_VALUE> {1:1} n PLAC <PLACE_VALUE> {1:1}
+1 FORM <PLACE_HIERARCHY> {0:1} +1 FORM <PLACE_HIERARCHY> {0:1}
+1 <<SOURCE_CITATION>> {0:M} +1 <<SOURCE_CITATION>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
SOURCE_CITATION:= SOURCE_CITATION:=
[ [
n SOUR @<XREF:SOUR>@ /* pointer to source record */ {1:1} n SOUR @<XREF:SOUR>@ /* pointer to source record */ {1:1}
+1 PAGE <WHERE_WITHIN_SOURCE> {0:1} +1 PAGE <WHERE_WITHIN_SOURCE> {0:1}
+1 EVEN <EVENT_TYPE_CITED_FROM> {0:1} +1 EVEN <EVENT_TYPE_CITED_FROM> {0:1}
+2 ROLE <ROLE_IN_EVENT> {0:1} +2 ROLE <ROLE_IN_EVENT> {0:1}
+1 DATA {0:1} +1 DATA {0:1}
+2 DATE <ENTRY_RECORDING_DATE> {0:1} +2 DATE <ENTRY_RECORDING_DATE> {0:1}
+2 TEXT <TEXT_FROM_SOURCE> {0:M} +2 TEXT <TEXT_FROM_SOURCE> {0:M}
+3 [ CONC | CONT ] <TEXT_FROM_SOURCE> {0:M} +3 [ CONC | CONT ] <TEXT_FROM_SOURCE> {0:M}
+1 QUAY <CERTAINTY_ASSESSMENT> {0:1} +1 QUAY <CERTAINTY_ASSESSMENT> {0:1}
+1 <<MULTIMEDIA_LINK>> {0:M} +1 <<MULTIMEDIA_LINK>> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
| /* Systems not using source records */ | /* Systems not using source records */
n SOUR <SOURCE_DESCRIPTION> {1:1} n SOUR <SOURCE_DESCRIPTION> {1:1}
+1 [ CONC | CONT ] <SOURCE_DESCRIPTION> {0:M} +1 [ CONC | CONT ] <SOURCE_DESCRIPTION> {0:M}
+1 TEXT <TEXT_FROM_SOURCE> {0:M} +1 TEXT <TEXT_FROM_SOURCE> {0:M}
+2 [CONC | CONT ] <TEXT_FROM_SOURCE> {0:M} +2 [CONC | CONT ] <TEXT_FROM_SOURCE> {0:M}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
] ]
SOURCE_REPOSITORY_CITATION:= SOURCE_REPOSITORY_CITATION:=
[
n REPO @XREF:REPO@ {1:1} n REPO @<XREF:REPO>@ {1:1}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}
+1 CALN <SOURCE_CALL_NUMBER> {0:M} +1 CALN <SOURCE_CALL_NUMBER> {0:M}
+2 MEDI <SOURCE_MEDIA_TYPE> {0:1} +2 MEDI <SOURCE_MEDIA_TYPE> {0:1}
SPOUSE_TO_FAMILY_LINK:= SPOUSE_TO_FAMILY_LINK:=
n FAMS @<XREF:FAM>@ {1:1} n FAMS @<XREF:FAM>@ {1:1}
+1 <<NOTE_STRUCTURE>> {0:M} +1 <<NOTE_STRUCTURE>> {0:M}

If you have any information related to this article please e-mail me or add it to the comments!


[Update 2015/01/01] Errata Sheet found

After the discovery of the differences described above and the reference to the Errata Sheet, I also e-mailed several people to help find the document. Louis Kessler got in contact with Brian Madsen, who had the Errata Sheet (on paper) and scanned it. Read all about it in Louis' blog post More GEDCOM Archaeological Discoveries. The Errata Sheet (PDF) itself is shown below. The Errata Sheet contains all of the big changes as highlighted in the table above!

2014/02/03

Links you can make from a record

When genealogical data from a record is displayed on a website of an archive, the names are usually (web)links, so you can easily and quickly search for that name. Open Archives shows that a record can be linked to many different sources of information, so the records become enriched.

Links to search actions

Open Archives, too, turns the names of persons into ‘search links’. But more powerful are the ‘search links’ that are displayed with couples. Searching for records on two names is a widely used and much requested feature that you can offer directly from the record, because a record usually shows various types of relationships between individuals.

Below is an example of a marriage certificate. With just one mouse-click on the ‘relation bracket’ you can search for the parents of the bride or groom (in order to find the brothers and sisters of the bride or groom) or the bride/groom couple (to find their children).

Click on the image to view the record on Open Archives.

Often names in records are easy to identify because they are in a separate field. Sometimes names also appear in the comments of a record, such as the mention of a twin sister/brother. Open Archives recognizes these mentions of twins and makes these names into links as well.

Click on the image to view the record on Open Archives.

However, the search for the twin sister/brother is not purely a search by name. The search can be made smarter by using the information from the record, such as the date and the name of the mother. In most cases such a smart search returns the twin sister/brother immediately.
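
The idea of such a smart search can be sketched as a filter that combines the twin's name with the birth date and the mother's name taken from the record. This is only an illustration on in-memory data: the `BirthRecord` class, its fields and the `find_twin` function are hypothetical, and the actual search on Open Archives may work quite differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BirthRecord:
    child: str
    mother: str
    born: date


def find_twin(record, twin_name, candidates, max_days_apart=1):
    """Return candidates matching the twin's name, the mother and the birth date."""
    wanted = twin_name.casefold()
    return [
        c for c in candidates
        if c != record
        and c.child.casefold() == wanted
        and c.mother.casefold() == record.mother.casefold()
        and abs((c.born - record.born).days) <= max_days_apart
    ]
```

A search by name alone would return every person with that name. Adding the mother and the date narrows the result to, in most cases, the one record wanted.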

Links to other documents

Information in records about the parents can be used to find more information about the person. For example, when parents are mentioned in a death certificate, often the birth and marriage certificate can also be found (because these records also mention the parents). Open Archives performs these searches on-the-fly and displays the results as links to other records.

Click on the image to view the record on Open Archives.

This principle of searching for related documents (and thus persons) can also be repeated several times. That’s what the ‘Links Explorer’ on Open Archives does. Clicking the ‘Links Explorer’ icon (pictured here on the right) opens a window that, starting from the record you were viewing, looks for related records, which are then presented in a relationship network. This diagram shows parent-child relationships with a red (blood) line and marriages with an orange line.

Click on the image to view the record on Open Archives,
then click the Links Explorer icon.
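
Repeating the related-records search several times amounts to a breadth-first walk over records. The sketch below shows that principle only; it is not the Links Explorer implementation. The `explore` function, the `find_related` callback (which stands in for the on-the-fly searches) and the depth limit are assumptions made for this example.

```python
from collections import deque


def explore(start, find_related, max_depth=2):
    """Collect (record, relation, related_record) edges around a start record.

    find_related(record) must return an iterable of (relation, record) pairs,
    for example ("parent-child", ...) or ("marriage", ...).
    """
    seen = {start}
    edges = []
    queue = deque([(start, 0)])
    while queue:
        record, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not search beyond the requested number of steps
        for relation, other in find_related(record):
            edges.append((record, relation, other))
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return edges
```

The resulting edge list is what a diagram needs: each edge can be drawn as a line whose colour depends on the relation type.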


The information in records can also be used to query other data sources.

Links to biographies

With a name and birth date/place and/or death date/place you can search the Biographical Portal of the Netherlands to see if a biography is known for that person. Open Archives performs this search automatically. When one or more biographies are found, links to them are shown. The example below shows the links to biographies about Henry Constantine Cras.

Click on the image to view the record on Open Archives.

Links to gravestones

There are various websites that offer information about graves, like Graftombe.nl and Dutch-Cemeteries.com. These sources are queried when birth and death are shown on Open Archives. When a link is found, it’s presented below the record:

Click on the image to view the record on Open Archives.

Links to online family trees

The previous example also shows that Open Archives looks up the (main) person in online family trees, specifically Genealogy Online. Conversely, Genealogy Online gives hints on scans of genealogical events in archives via the Scans search service.

Link to the weather

With the combination of date and place name the weather can be looked up; for the Netherlands this is done in the historical dataset of the Royal Netherlands Meteorological Institute. If there are measurements, the weather on and around the specified date can be shown.

Click on the image to view the record on Open Archives,
then click on the date September 13, 1746.

Link to a map

Population registers also contain street names, which are interesting. A researcher usually likes to know where the street was. Open Archives now has knowledge about a large part of the (historic) streets of Leiden and Rijnsburg (Netherlands). With this information the street can be displayed (with a thick orange line) on a historical map.

Click on the image to view the record on Open Archives, then click the street name Hogewoerd.

Link to scan

If an archive doesn’t have scans of certain records, this does not mean that there are no scans. When Open Archives gets new open data from archival institutions or individuals, it also checks whether scans are available elsewhere and whether they can be linked.

For example, Open Archives shows FamilySearch scans with records of the Regional History Center Vecht en Venen, scans of GaHetNa (Dutch National Archives) with records of Groene Hart Archieven and scans from Van Papier naar Digitaal with records from the Regional Archives of Alkmaar!

Show enriched information

As this article (and Open Archives) shows: a record doesn’t have to be shown just as-is. Many records can be enriched with (links to) one or more other sources of information. This enrichment makes Open Archives a more useful research tool.


 

About Open Archives

Open Archives is an initiative of Bob Coret to show that open data and services push innovation. The genealogical search engine is available in English, French, German and Dutch. Follow Open Archives on Google+ or Twitter.