Basic Format of the Data
Individual records are composed of 11 different fields which
supply a reference number, a record number, a starting date, an
ending date, an approximate date indicator, a group name, a language
family code, a geographic place description, an action code, a
code indicating origin of the record, and additional comments.
Each of these fields will be explained in more detail below. The data
are presented in two forms: chronologically and alphabetically by the
name of the group (chronologically within each group name).
We can start by looking at two sample records taken from
the dataset:
 11  55  200  230 N Goths G NW of Black Sea State...   E From the marshes
                            ... Carpathians, Don R,...   nr Azov Sea.
                            ... on lower Dnepr R.        *{900-488}*
900 489  200  240 N Goths G {NW of Black Sea State...  L For consistency
                            ... Carpathians, Don R,...   with (11-55) &
                            ... on lower Dnepr R.}       (110-49).
                            *{11-55}*
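As a rough illustration of how the fields fit together, the first sample record could be held in a small Python structure like the one below. This is only a sketch: the class name and attribute names are hypothetical (they are not part of the dataset), the origin-code field is left out because the sample listing does not show it, and the place text is the full wording quoted in the Geographic Place Description section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """One dataset record; attribute names are illustrative, not official."""
    reference: int          # reference number (source in the master list)
    record: int             # record number within that reference
    start: int              # start date (negative = BC, positive = AD)
    end: Optional[int]      # end date, or None when the field holds "#"
    approximate: bool       # True for code A, False for code N
    group: str              # group name
    language_family: str    # single-letter language-family code
    place: str              # geographic place description
    action: str             # single-letter action code
    comment: str            # comment text

    @property
    def key(self) -> str:
        """Reference-record label in the (ref-rec) style used in the text."""
        return f"({self.reference}-{self.record})"


goths_expand = Record(
    reference=11,
    record=55,
    start=200,
    end=230,
    approximate=False,
    group="Goths",
    language_family="G",
    place=("NW of Black Sea between Carpathians, Don R, Vistula R, "
           "Sea of Azov, centered on lower Dnepr R"),
    action="E",
    comment="From the marshes nr Azov Sea. *{900-488}*",
)

print(goths_expand.key)
```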
Reference and Record Fields
The reference and record numbers
serve as guides to refer the user to the original source material
for that particular record. In the first example given above, 11
refers to reference number 11 (for the complete list, please see
the master list of abstracted references), a book
entitled The Germanic Invasions by L. Musset (1975). The record
number indicates this is the 55th piece of information abstracted
from this particular source. (The original references in our
files are highlighted and numbered to show the exact location of
the indicated information.) The second example, reference 900,
demonstrates that some records are not taken directly from a
single, identifiable source. (If one checks the master list for
abstracted references, there is no listing for reference 900.)
Records flagged with either a 900 or 901 label were added to the
dataset during correcting and editing. These records can have
hybrid origins, sometimes citing other reference-record numbers
which were not retained (due to dating inconsistencies,
conflicts, etc.). In other cases, such as our example (900-489)
above, records were created to smooth the logical progression
of a group or to fill in a gap of missing information. For
records with reference numbers 900 and 901, consult the comment
text field for sources of the information and for reasons for
constructing the record (e.g., For consistency with (11-55) &
(110-49), Acc/to (38-2), etc.).
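The convention above can be sketched in a few lines of Python: one helper flags records that were added during editing (reference 900 or 901), and another pulls the parenthesized reference-record citations out of a comment. The function names and the regular expression are assumptions made for illustration; they are not part of the dataset or its tools.

```python
import re

# Reference numbers used for records added during correcting and editing.
CONSTRUCTED_REFERENCES = {900, 901}

# A citation is a reference-record pair in parentheses, e.g. (11-55).
CITATION = re.compile(r"\((\d+)-(\d+)\)")


def is_constructed(reference: int) -> bool:
    """True when the record was added during editing (900 or 901)."""
    return reference in CONSTRUCTED_REFERENCES


def cited_sources(comment: str) -> list[tuple[int, int]]:
    """Return every (reference, record) pair cited in a comment."""
    return [(int(ref), int(rec)) for ref, rec in CITATION.findall(comment)]


comment = "For consistency with (11-55) & (110-49)."
if is_constructed(900):
    print(cited_sources(comment))
```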
Start and End Date Fields
The next two consecutive fields
provide a time interval for the action or location described in a
given record. In all cases, a start date is given (200 AD in our
examples above). If the action took place within one year, no
end date is given; instead, the end date field holds the "#" symbol.
The "#" is also used when an end date is not known or cannot be
constructed, where it indicates the absence of a firm end date.
The examples cited earlier occur in the time intervals 200 to 230 AD
and 200 to 240 AD respectively. Negative dates indicate BC; positive
dates indicate AD.
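A minimal Python sketch of these date conventions follows. The function names are hypothetical, and the handling of a year value of 0 is an assumption, since the text does not say whether such a value occurs.

```python
from typing import Optional


def parse_end_date(field: str) -> Optional[int]:
    """Return the end year, or None when the field holds the '#' symbol."""
    field = field.strip()
    return None if field == "#" else int(field)


def format_year(year: int) -> str:
    """Negative years are BC, positive years are AD.

    Assumption: a value of 0, if it ever occurred, is printed as '0 AD'.
    """
    return f"{-year} BC" if year < 0 else f"{year} AD"


def describe_interval(start: int, end_field: str) -> str:
    """Describe a record's time span from its start date and end-date field."""
    end = parse_end_date(end_field)
    if end is None:
        # Either a single-year action or no firm end date.
        return format_year(start)
    return f"{format_year(start)} to {format_year(end)}"


print(describe_interval(200, "230"))
print(describe_interval(-100, "#"))
```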
Approximate Date Field
This field is usually specified as a
single letter immediately preceding the group name field in
columns 23-37. The field can contain either an A (indicating
that the specified dates for the record are only approximate) or
an N (indicating dates are firm). The distinction between the two codes
is not hard and fast--a good deal of historical dating is inherently
approximate. Note that a value of N is assigned to the earlier
Goth examples cited above.
Group Name
Each record names a single group--be
it an archaeological culture or an ethnic group (gens). The
majority of early records (before 2000 BC) name archaeological
assemblages or horizons, rather than specific ethnic groups.
However, there are some named peoples before this date (e.g.,
Phoenicians, Anatolians, Hurrians, etc.). With the passage of
time, the identity of groups becomes more firmly established and
more named ethnic groups appear. If you are in doubt as to
whether a name indicates an archaeological assemblage or an
ethnic group, consult the date and language-family fields. Early
records with a known language identification generally refer to
ethnic groups. Archaeological cultures are mostly labeled with a
"U" for unknown language affiliation. In many cases, the Comment
text field has information that clarifies this issue further.
In cases where a group name exceeds 16 characters in length,
the name has been abbreviated. In such cases the original name
is always specified in the Comment text field. Where there are
synonymous names for a given group of people, one variant has
been chosen over the others, but the comments will also list the
other forms of the group name.
Language-Family Affiliation
Each record contains a single
letter field to show the language-family affiliation for the
studied group (see Abbreviations--Language-Family Codes for a
complete list of choices). The affiliation can be
established in two ways--directly from the cited source or from
another established source describing the same people. In cases
where language information is not available, a "U" indicates
unknown language-family affiliation. When the known language
family is not in the list, an "O" for "other"
language-family affiliation is used and the language-family
affiliation is given in the Comment text field.
Geographic Place Description
Immediately following the language
affiliation is a large, hanging-paragraph text field describing
the geographic location(s) where some activity occurs. In both
our examples for Goths, the text reads as follows: "NW of Black
Sea between Carpathians, Don R, Vistula R, Sea of Azov, centered
on lower Dnepr R". Record (11-55) shows Goths expanding into the
area described as "NW of Black Sea State....centered on lower
Dnepr R". The next record (900-489) is a location to keep the
Goths in the new place for the length of time that it took to
complete the expansion. In many instances, the geographic text
contains standard abbreviations for names of countries and
geographical features (see Abbreviations--Geographical
for the list of abbreviations). Note also that the
text description for (900-489) is enclosed with curly brackets
{NW..., Dnepr R} to show that the area has been described before
and is equal to *{11-55}*. The reference-record value(s)
specified between enclosed asterisks are cross-references to
component areas described by other records in the dataset. This
notation is used frequently in the data to combine areas and
build large geographic conglomerations for individual groups.
Conversely, areas can also be subtracted rather than added
together (see Scordisci, (132-9), at 100 BC, as an example of
this). The {curly brackets} will enclose the text for each
component section. The user should look for the formula containing
reference-record numbers (listed within the "*{ }*" notation) at
the end of the text, which describes how the areas were joined
together to form the overall geographic area.
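The cross-reference notation described above lends itself to a small parser. The Python sketch below extracts the reference-record pairs listed inside a "*{ }*" formula and ignores curly brackets that merely enclose descriptive text. The function name and regular expressions are assumptions for illustration only, and the sketch does not try to interpret how the components are combined (added or subtracted), because the operator symbols are not spelled out here.

```python
import re

# A formula is the text between asterisks that starts and ends with braces,
# e.g. *{11-55}* or *{901-38} + {27-174}*.
FORMULA = re.compile(r"\*(\{.*?\})\*", re.DOTALL)

# A component is a brace pair holding only a reference-record number.
COMPONENT = re.compile(r"\{(\d+)-(\d+)\}")


def component_records(text: str) -> list[tuple[int, int]]:
    """Return the (reference, record) pairs named in the first formula."""
    match = FORMULA.search(text)
    if match is None:
        return []
    return [(int(ref), int(rec))
            for ref, rec in COMPONENT.findall(match.group(1))]


print(component_records("{NW of Black Sea ... on lower Dnepr R.} *{11-55}*"))
print(component_records("*{901-38} + {27-174}*"))
```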
In a few cases, the geographical text descriptions can
become very lengthy. This is partly a by-product of the growth
and decline of empires (e.g., the Roman Empire), composed of
several component regions, where some movement or action occurred
over the course of history. A total of 82 captured records are
simply too long to be handled in the relational database program
which creates the final output data listings. For ease of data
handling, these records are abbreviated in the main dataset and
text descriptions flagged with a warning statement such as
"(shortened - see long records)". Where it is pertinent, the *{
}* formula for component areas is featured after the warning
text. To view the original geographical text for an abbreviated
record, follow the appropriate link to a secondary listing
which contains the complete text of all long records, sorted by
reference-record number order.
Where the text descriptions are lengthy and somewhat
confusing, the user is strongly encouraged to use the
accompanying LINMAP program to directly view the overall assigned
outline or perhaps just the component sections (e.g., *{901-38} +
{27-174}* in (901-39), Romans, at 43 AD) for the sake of clarity.
Action Code
Immediately following the geographic text field is
a single character field, containing a letter code for some
action taken by the named group in a record. There are a total
of 12 different "actions" represented in the entire dataset (see
Abbreviations--Action Codes for the complete list of
actions). Each record describes only a single type of action.
For example, in our sample records for Goths, (11-55) codes E for
an expansion into the area "NW of Black Sea... Dnepr R". In the
second record, (900-489), L locates this group of Goths in their
new area.
The action codes include nine movements as follows:
- A--a group attacks the described geographic area militarily;
- C--a group conquers the described area and leaves an occupying army;
- E--a group expands into the area from a contiguous basal area;
- M--entire group migrates to the new area;
- N--the group attacks the area by sea;
- Q--a portion of a larger group migrates into a new area with some remaining behind;
- R--the group is resettled in a new place by another group;
- S--a small portion of the group settles in the described (small) area;
- T--a group contracts (or leaves) from the described geographic area into a smaller area.
In addition, there are 3 more activities which are not related to
movements into or out of a described geographic area:
- L--the group is located in the described area;
- I--group is assimilated by another one;
- W--group wanders about within the confines of the described area.
Again, the definitions for these codes are not hard and fast.
Whether a given event should be called a conquest, an expansion
or a partial migration is a judgement call.
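For readers processing the data programmatically, the twelve codes listed above can be kept in a simple lookup table. The Python sketch below is one possible arrangement; the dictionary names and the helper function are hypothetical, and the short descriptions are paraphrased from the list above.

```python
# The nine codes that describe movement into or out of an area.
MOVEMENT_ACTIONS = {
    "A": "attacks the described area militarily",
    "C": "conquers the area and leaves an occupying army",
    "E": "expands into the area from a contiguous basal area",
    "M": "entire group migrates to the new area",
    "N": "attacks the area by sea",
    "Q": "a portion of a larger group migrates, some remain behind",
    "R": "is resettled in a new place by another group",
    "S": "a small portion of the group settles in the (small) area",
    "T": "contracts (or leaves) from the area into a smaller area",
}

# The three codes not related to movement into or out of an area.
OTHER_ACTIONS = {
    "L": "is located in the described area",
    "I": "is assimilated by another group",
    "W": "wanders about within the confines of the area",
}

ACTIONS = {**MOVEMENT_ACTIONS, **OTHER_ACTIONS}


def is_movement(code: str) -> bool:
    """True for the nine movement codes; raises on an unknown code."""
    if code not in ACTIONS:
        raise ValueError(f"unknown action code: {code!r}")
    return code in MOVEMENT_ACTIONS


print(is_movement("E"), is_movement("L"))
```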
Comment Text
The final data field is another hanging-paragraph
of text used to supply additional comments and information about
a particular record. As in the case of the Geographic place
descriptions, some standard conventions have been adopted to
denote particular types of information. Reference and record
numbers used as literature citations are enclosed within
parentheses (as in "Acc/to (38-17)." in the comment of (900-180),
Prussians (Old), at 200 AD). Reference and record numbers
enclosed within curly brackets and asterisks *{}* denote that a
labeled record is the point of origin (or destination) for the
current movement. (Note that there is no need for this type of
information in the comments of "L" or "I" records.) Users will
also find accompanying text statements flagged with "From ....
*{ref-rec}*" to completely describe the areas serving as either
an origin or a destination point.
Comments also provide additional important information about
the specified group. Population sizes and synonymous group names
are given when they are known. In some cases, statements about
other groups are included to better define the relationship they
share with the named group. When there are uncertainties about
the information given in other fields, the comments include text
concerning this. The user is strongly encouraged to read and
make use of the comment information for each record.
Last Altered December 5, 1996
The Ethnohistory Project /
msr@life.bio.sunysb.edu