GED-inline is an online GEDCOM validator that has been around since 2011. Just upload your GEDCOM file to the service and get an instant report. Now developer Nigel Munro Parker has open-sourced the validation component, which is available on GitHub at https://github.com/nigel-parker/gedinline. This makes validating GEDCOM files even more accessible for developers. An example report:

Generated by                   CFTREE
Submitted by                   Bob Coret
Encoding                       ANSI
GEDCOM version in file         5.5
GEDCOM version assumed         5.5

Analysis time                  3 seconds to analyse the file
Speed                          1714 records per second

Lines                   51944  Number of lines in the GEDCOM file
Records                  5142  Number of records
Warnings                  363  Total number of warning messages
User-defined                0  Number of lines with user-defined tags

Individuals              3549  Number of individuals in the GEDCOM file
Males                    1864  Number of males
Females                  1672  Number of females
Other                      13  

Families                 1419  Number of families
Marriages                 638  Number of marriages
Places                   3333  Number of places mentioned (not necessarily unique)
Source records            153  Number of source records

(detailed warnings not shown here)

Genealogy Online has deployed the GED-inline component in its infrastructure. Every uploaded GEDCOM file is checked, and the results are stored and presented to the user. The message is that a GEDCOM file with warnings and user-defined tags could lead to information loss when it is transferred from one genealogy program or service to another (like Genealogy Online).

GED-inline validation statistics

An analysis of more than 10,000 GEDCOM files yields statistics on the number of warnings per ‘generator’ (the program or service which generated the GEDCOM file), the number of user-defined tags, and the encoding and GEDCOM version used.

In total, 1,215,130,449 lines of GEDCOM were inspected, 8,129,466 warnings were given (that’s 0.7%), and 93,365,260 lines contained user-defined tags (that’s 7.7%).
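The percentages quoted above can be re-derived from the raw totals. The following is a minimal Python sketch of that arithmetic; the variable and function names are made up for illustration and are not part of GED-inline.

```python
# Totals as quoted in the article.
lines_inspected = 1_215_130_449
warnings = 8_129_466
user_defined_lines = 93_365_260


def share(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`, rounded to one decimal."""
    return round(100 * part / whole, 1)


warning_pct = share(warnings, lines_inspected)
user_defined_pct = share(user_defined_lines, lines_inspected)

print(f"warnings per line:        {warning_pct}%")
print(f"lines with user-def tags: {user_defined_pct}%")
```

Note that the first figure relates the number of warnings to the number of lines, so it is a rate of warnings per line rather than the share of lines that have a warning.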

The overview clearly shows that a lot of vendors do not produce GEDCOM files which strictly conform to (a version of) the GEDCOM specification published on gedcom.org. In addition, the user-defined tags contain information which other genealogy programs and services usually don’t understand, because these tags are usually undocumented and rarely implemented by other vendors. These two aspects increase the chance that users’ genealogical information is lost…
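To get a rough feel for how many user-defined tags a file contains, you can count the lines whose tag starts with an underscore, which is the GEDCOM 5.5 convention for user-defined tags. The sketch below is a simplified illustration in Python, not GED-inline's actual logic: the function names and the sample data are hypothetical, and it assumes well-formed lines of the form "level [xref] tag [value]".

```python
from typing import Optional


def tag_of(line: str) -> Optional[str]:
    """Extract the tag from one GEDCOM line, or None if the line is malformed."""
    parts = line.strip().split(maxsplit=2)
    if len(parts) < 2 or not parts[0].isdigit():
        return None
    # An optional cross-reference id such as @I1@ may sit between level and tag.
    if parts[1].startswith("@") and parts[1].endswith("@"):
        if len(parts) < 3:
            return None
        return parts[2].split(maxsplit=1)[0]
    return parts[1]


def count_user_defined_lines(gedcom_text: str) -> int:
    """Count lines whose tag begins with an underscore (user-defined tags)."""
    count = 0
    for line in gedcom_text.splitlines():
        tag = tag_of(line)
        if tag is not None and tag.startswith("_"):
            count += 1
    return count


# Hypothetical sample data, for illustration only.
SAMPLE = """0 HEAD
1 SOUR EXAMPLE
0 @I1@ INDI
1 NAME John /Doe/
1 _UID 12345
1 BIRT
2 _PRIM Y
0 TRLR
"""

print(count_user_defined_lines(SAMPLE))  # two lines: _UID and _PRIM
```

A full validator does far more than this (grammar, value formats, cross-references), so a quick count like this complements a GED-inline report rather than replacing it.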