Welcome Intro Format Data Download

Basic Format of the Data

Individual records are composed of 11 different fields which supply a reference number, a record number, a starting date, an ending date, approximate date indicator, a group name, a language family code, a geographic place description, an action code, a code indicating origin of the record, and additional comments. Each of these fields will be explained in more detail below. The data are presented in two forms: chronologically and alphabetically by the name of the group (chronologically within each group name).
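As a rough illustration of the 11 fields listed above, the following Python sketch models a record and the two presentation orders. The class name, attribute names, and helper functions are hypothetical and are not part of the dataset or its software. The origin code is left unset because its value is not visible in the sample records shown below.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """One dataset record; attribute names are illustrative assumptions."""
    reference: int                     # reference number (source in the master list)
    record: int                        # record number within that reference
    start_date: int                    # negative = BC, positive = AD
    end_date: Optional[int]            # None when the field holds "#"
    approximate: str                   # "A" (approximate) or "N" (firm)
    group: str                         # group name
    language_family: str               # single-letter language-family code
    place: str                         # geographic place description
    action: str                        # single-letter action code
    origin_code: Optional[str] = None  # code indicating origin of the record
    comment: str = ""                  # additional comments


AREA = ("NW of Black Sea between Carpathians, Don R, Vistula R, "
        "Sea of Azov, centered on lower Dnepr R")

goths_expand = Record(11, 55, 200, 230, "N", "Goths", "G", AREA, "E",
                      comment="From the marshes nr Azov Sea. *{900-488}*")
goths_locate = Record(900, 489, 200, 240, "N", "Goths", "G", "{" + AREA + "}", "L",
                      comment="For consistency with (11-55) & (110-49).")


def chronological(records: List[Record]) -> List[Record]:
    """Order records by start date."""
    return sorted(records, key=lambda r: r.start_date)


def by_group(records: List[Record]) -> List[Record]:
    """Order records by group name, then by start date within each group."""
    return sorted(records, key=lambda r: (r.group, r.start_date))
```

Both helpers rely on Python's stable sort, so records with the same start date keep their input order.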

We can start by looking at two sample records taken from the dataset:

 11  55  200 230 N Goths  G NW of Black Sea State...   E   From the marshes
                           ... Carpathians, Don R,...      nr Azov Sea.
                           ... on lower Dnepr R.           *{900-488}*

900 489  200 240 N Goths  G {NW of Black Sea State...  L   For consistency
                           ... Carpathians, Don R,...      with (11-55) &
                           ... on lower Dnepr R.}          (110-49).

Reference and Record Fields

The reference and record numbers serve as guides to refer the user to the original source material for that particular record. In the first example given above, 11 refers to reference number 11 (for the complete list, please see the master list of abstracted references), a book entitled The Germanic Invasions by L. Musset (1975). The record number indicates this is the 55th piece of information abstracted from this particular source. (The original references in our files are highlighted and numbered to show the exact location of the indicated information.) The second example, reference 900, demonstrates that some records are not taken directly from a single, identifiable source. (If one checks the master list for abstracted references, there is no listing for reference 900.) Records flagged with either a 900 or 901 label were added to the dataset during correcting and editing. These records can have hybrid origins, sometimes citing other reference-record numbers which were not retained (due to dating inconsistencies, conflicts, etc.). In other cases (such as our example above, (900-489)) records were created to smooth the logical progression of a group or to fill in a gap of missing information. For records with reference numbers 900 and 901, consult the comment text field for sources of the information and for reasons for constructing the record (e.g., For consistency with (11-55) & (110-49), Acc/to (38-2), etc.).

Start and End Date Fields

The next two consecutive fields provide a time interval for the action or location described in a given record. In all cases, a start date is given (200 AD in our examples above). If the action took place within one year, no end date is given; instead, the end date field holds the "#" symbol. The "#" symbol is also used when an end date is not known or cannot be constructed, in which case it indicates the absence of a firm end date. The examples cited earlier occur in the time intervals 200 to 230 AD and 200 to 240 AD, respectively. Negative dates indicate BC, positive ones AD.
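The date conventions described here can be sketched in a few lines of Python. The function names are hypothetical, and the sketch assumes the date fields have already been split out of a record as strings. The source does not say how a year of 0 would be treated, so the sketch simply formats any non-negative year as AD.

```python
from typing import Optional


def parse_end_date(field: str) -> Optional[int]:
    """Return the end year, or None when the field holds the "#" symbol."""
    field = field.strip()
    return None if field == "#" else int(field)


def format_year(year: int) -> str:
    """Negative years are BC; all other years are shown as AD (assumption for 0)."""
    return f"{-year} BC" if year < 0 else f"{year} AD"


def format_interval(start: int, end: Optional[int]) -> str:
    """Render a start/end pair; a missing end date yields the start year alone."""
    if end is None:
        return format_year(start)
    return f"{format_year(start)} to {format_year(end)}"
```

For instance, `format_interval(200, parse_end_date("230"))` gives `200 AD to 230 AD`, matching the first sample record.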

Approximate Date Field

This field is usually specified as a single letter immediately preceding the group name field in columns 23-37. The field can contain either an A (indicating that the specified dates for the record are only approximate) or an N (indicating dates are firm). The distinction between the two codes is not hard and fast--a good deal of historical dating is inherently approximate. Note that a value of N is assigned to the earlier Goth examples cited above.

Group Name

Each record describes the name of a single group--be it an archaeological culture or an ethnic group (gens). The majority of early records (before 2000 BC) name archaeological assemblages or horizons, rather than specific ethnic groups. However, there are some named peoples before this date (e.g., Phoenicians, Anatolians, Hurrians, etc.). With the passage of time, the identity of groups becomes more firmly established and more named ethnic groups appear. If you are in doubt as to whether a name indicates an archaeological assemblage or an ethnic group, consult the date and language-family fields. Early records with a known language identification generally refer to ethnic groups. Archaeological cultures are mostly labeled with a "U" for unknown language affiliation. In many cases, the Comment text field has information that clarifies this issue further.

In cases where a group name exceeds 16 characters in length, the name has been abbreviated. In such cases the original name is always specified in the Comment text field. Where there are synonymous names for a given group of people, one variant has been chosen over the others, but the comments will also list the other forms of the group name.

Language-Family Affiliation

Each record contains a single letter field to show the language-family affiliation for the studied group (see Abbreviations--Language-Family Codes for a complete list of choices). The affiliation can be established in two ways--directly from the cited source or from another established source describing the same people. In cases where language information is not available, a "U" indicates unknown language-family affiliation. When the known language family is not in the list, an "O" for "other" language-family affiliation is used and the language-family affiliation is given in the Comment text field.

Geographic Place Description

Immediately following the language affiliation is a large, hanging-paragraph text field describing the geographic location(s) where some activity occurs. In both our examples for Goths, the text reads as follows: "NW of Black Sea between Carpathians, Don R, Vistula R, Sea of Azov, centered on lower Dnepr R". Record (11-55) shows Goths expanding into the area described as "NW of Black Sea State....centered on lower Dnepr R". The next record (900-489) is a location to keep the Goths in the new place for the length of time that it took to complete the expansion. In many instances, the geographic text contains standard abbreviations for names of countries and geographical features (see Abbreviations--Geographical for the list of abbreviations). Note also that the text description for (900-489) is enclosed with curly brackets {NW..., Dnepr R} to show that the area has been described before and is equal to *{11-55}*. The reference-record value(s) specified between enclosed asterisks are cross-references to component areas described by other records in the dataset. This notation is used frequently in the data to combine areas and build large geographic conglomerations for individual groups. Conversely, areas can also be subtracted rather than added together (see Scordisci, (132-9), at -100 BC, as an example of this). The {curly brackets} will enclose the text for each component section. The user should look for formulas containing reference-record numbers (listed within the "*{ }*" notation) at the end of the text, which describe how the areas were joined together to form the overall geographic area.

In a few cases, the geographical text descriptions can become very lengthy. This is partly a by-product of the growth and decline of empires (e.g., the Roman Empire), composed of several component regions, where some movement or action occurred over the course of history. A total of 82 captured records are simply too long to be handled in the relational database program which creates the final output data listings. For ease of data handling, these records are abbreviated in the main dataset and text descriptions flagged with a warning statement such as "(shortened - see long records)". Where it is pertinent, the *{ }* formula for component areas is featured after the warning text. To view the original geographical text for an abbreviated record, follow the appropriate link to a secondary listing which contains the complete text of all long records, sorted by reference-record number order.

Where the text descriptions are lengthy and somewhat confusing, the user is strongly encouraged to use the accompanying LINMAP program to directly view the overall assigned outline or perhaps just the component sections (e.g., *{901-38} + {27-174}* in (901-39), Romans, at 43 AD) for the sake of clarity.
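A component-area formula such as *{901-38} + {27-174}* can be pulled apart with a short Python sketch like the one below. The function and pattern names are hypothetical. The "+" operator is taken from the example above; the symbol used for subtracted areas is not shown in this description, so treating a "-" placed between bracketed terms as subtraction is an assumption.

```python
import re
from typing import List, Tuple

# The whole formula: one or more {ref-rec} terms enclosed between asterisks.
FORMULA = re.compile(r"\*(\{[^*]*\})\*")
# One term: an optional sign, then {reference-record}.
TERM = re.compile(r"([+-]?)\s*\{(\d+)\s*-\s*(\d+)\}")


def parse_area_formula(text: str) -> List[Tuple[str, int, int]]:
    """Return (sign, reference, record) for each component area in the text.

    A term without an explicit sign is treated as added ("+").
    A "-" between terms is ASSUMED to mean subtraction.
    """
    match = FORMULA.search(text)
    if match is None:
        return []
    return [(sign or "+", int(ref), int(rec))
            for sign, ref, rec in TERM.findall(match.group(1))]
```

Calling `parse_area_formula("*{901-38} + {27-174}*")` yields two added components, (901-38) and (27-174), which can then be looked up as separate records.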

Action Code

Immediately following the geographic text field is a single character field, containing a letter code for some action taken by the named group in a record. There are a total of 12 different "actions" represented in the entire dataset (see Abbreviations--Action Codes for the complete list of actions). Each record describes only a single type of action. For example, in our sample records for Goths, (11-55) codes E for an expansion into the area "NW of Black Sea... Dnepr R". In the second record, (900-489), L locates this group of Goths in their new area.

The action codes include nine movements as follows:

Again, the definitions for these codes are not hard and fast. Whether a given event should be called a conquest, an expansion or a partial migration is a judgement call.

Comment Text

The final data field is another hanging-paragraph of text used to supply additional comments and information about a particular record. As in the case of the Geographic place descriptions, some standard conventions have been adopted to denote particular types of information. Reference and record numbers used as literature citations are enclosed within parentheses (as in "Acc/to (38-17)." in the comment of (900-180), Prussians (Old), at 200 AD). Reference and record numbers enclosed within curly brackets and asterisks *{}* denote that a labeled record is the point of origin (or destination) for the current movement. (Note that there is no need for this type of information in the comments of "L" or "I" records.) Users will also find accompanying text statements flagged with "From .... *{ref-rec}*" to completely describe the areas serving as either an origin or a destination point.
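The two comment conventions can be told apart mechanically. The Python sketch below is a minimal illustration under the assumption that every reference-record pair is written as digits, a hyphen, and digits; the function name and the dictionary keys are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Literature citations: (reference-record) in parentheses.
CITATION = re.compile(r"\((\d+)-\s*(\d+)\)")
# Origin or destination records: *{reference-record}* in brackets and asterisks.
MOVEMENT_LINK = re.compile(r"\*\{(\d+)-\s*(\d+)\}\*")


def comment_references(comment: str) -> Dict[str, List[Tuple[int, int]]]:
    """Split the reference-record pairs in a comment into the two kinds."""
    def pairs(pattern: "re.Pattern[str]") -> List[Tuple[int, int]]:
        return [(int(ref), int(rec)) for ref, rec in pattern.findall(comment)]

    return {
        "citations": pairs(CITATION),
        "movement_links": pairs(MOVEMENT_LINK),
    }
```

Applied to the comment of (11-55), "From the marshes nr Azov Sea. *{900-488}*", this reports one movement link, (900-488), and no citations. Parenthesized text without digits, such as "(Old)", is ignored.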

Comments will provide additional important information about the specified group. Population sizes and synonymous group names are given when these are known. In some cases, statements about other groups are included to better define the relationship they share with the named group. When there are uncertainties about the information given in other fields, the comments include text concerning this. The user is strongly encouraged to read and make use of the comment information for each record.


Last Altered December 5, 1996
E-mail: Questions/Comments Technical Problems
The Ethnohistory Project / msr@life.bio.sunysb.edu