The European ethnohistory database is a unique resource that describes the movements and locations of 891 ethnic units (each a "gens" or an archaeological assemblage) from 2200 BC to 1970 AD. The data were abstracted from 191 secondary or tertiary ethnohistorical and archaeological literature sources, as well as 91 historical maps. A total of 6161 records were originally captured from these materials, but only 3460 remain as accepted records after numerous checks to correct records, remove duplicates, and improve logical consistency between records.
Over the past five years, the dataset has been subjected to three separate correction cycles. The records were first examined for consistency within a single named group (e.g., the Goths). Many records were removed because they clearly duplicated other active records in the dataset or began before the 2200 BC cutoff date. Note, however, that there are 37 records that start between 3000 BC and 2200 BC. These were included for a variety of reasons to round out the database. The second step was an examination of all gens/archaeological groups within a language family (e.g., all the Slavic or Germanic speakers). This check allowed for correction of synonymous gens names, resulting in greater consistency between the data of ethnically or linguistically related groups. The final pass for consistency required a global examination of all data records that fell into one of eighty-five 5 x 5-degree quadrat land areas. To accomplish this final step, the geographic area described by each record was drawn by hand on a map, digitized on a computer screen, and then plotted to check for accuracy. The verbal text descriptions thus resulted in geographic outlines that can be plotted on a computer screen. (The LINMAP program allows the user to view the outlines assigned to any record of interest. Please see the LINMAP documentation for more details on this point.) This final correction pass resolved errors not visible in the first two checks: different groups occupying the same territory when they should not, "vacant" areas that should have been inhabited, and so on. While it is impossible to guarantee that this (or any other dataset like it) is error-free, we have taken great pains to ensure that the data are accurate within the limits of our expertise.
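For readers who wish to repeat a check of the third kind on their own extract of the data, the following Python fragment is a minimal sketch of one possible approach. It is not the procedure described above, which relied on hand-drawn outlines and visual inspection. The sketch assumes that each record has been reduced to a bounding box in decimal degrees and a start and end year (negative for BC). The `Record` class, its field names, and the functions `quadrats` and `candidate_conflicts` are hypothetical and do not reflect the actual record layout of the dataset.

```python
from dataclasses import dataclass
from itertools import combinations
import math

CELL = 5  # quadrat size in degrees, as in the text


@dataclass(frozen=True)
class Record:
    # Hypothetical schema: these field names are illustrative only.
    group: str
    start_year: int   # negative values stand for years BC
    end_year: int
    west: float       # bounding box in decimal degrees
    south: float
    east: float
    north: float


def quadrats(rec):
    """Return the south-west corners of all 5 x 5-degree cells the box covers."""
    lon_lo = math.floor(rec.west / CELL) * CELL
    lon_hi = math.ceil(rec.east / CELL) * CELL
    lat_lo = math.floor(rec.south / CELL) * CELL
    lat_hi = math.ceil(rec.north / CELL) * CELL
    return {(lon, lat)
            for lon in range(lon_lo, lon_hi, CELL)
            for lat in range(lat_lo, lat_hi, CELL)}


def candidate_conflicts(records):
    """List pairs of different groups that share a quadrat during the same years.

    A shared quadrat is only a hint: the outlines themselves still have to be
    compared before a pair is treated as an error.
    """
    found = []
    for a, b in combinations(records, 2):
        same_time = a.start_year <= b.end_year and b.start_year <= a.end_year
        shared = quadrats(a) & quadrats(b)
        if a.group != b.group and same_time and shared:
            found.append((a.group, b.group, sorted(shared)))
    return found
```

Because a bounding box is much coarser than a digitized outline, a list produced this way can serve only as a set of candidates for visual inspection, for example with LINMAP.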
It is our hope that the information summarized in the records of the European ethnohistory dataset will be used across several divergent disciplines. Experts in European history will no doubt find it a useful resource, as will others whose studies are focused on the anthropology, linguistics, or genetics of Europe.