Note: this program is out of date! See the new Windows version.
TPSREGR - Thin-plate splines regression analysis
F. James Rohlf
6 July 1993
Department of Ecology and Evolution
State University of New York at Stony Brook
Stony Brook, NY 11794-5245
Phone: 631-632-8580
ROHLF at life.bio.sunysb.edu
The purpose of this program is to regress the shape of a
collection of specimens (captured as coordinates of landmarks)
onto an independent variable. The independent variable might be
size in a study of allometry or it could be longitude,
temperature, or any other variable of interest.
The program regresses the partial warp scores (the weight
matrix of partial warp scores) onto the independent variable and
then plots a thin-plate spline as a function of the independent
variable so that one can see the shape change associated with
larger or smaller values of the independent variable. An
alternative is to read in a vector giving an explicit linear
combination to be used.
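
The least-squares case can be sketched in a few lines of Python with
NumPy. This is only an illustration, not the code used by TPSREGR. The
function names are made up for this example, and W is assumed to hold
one row of partial warp scores per specimen.

```python
import numpy as np

def regress_scores_on_x(W, x):
    """Least-squares regression of every column of W on x.

    W : (n, k) array of partial warp scores, one row per specimen
        (assumed layout).
    x : (n,) array holding the independent variable.
    Returns (intercepts, slopes), each an array of length k.
    """
    W = np.asarray(W, dtype=float)
    x = np.asarray(x, dtype=float)
    # Design matrix: a column of ones for the intercept, then x itself.
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, W, rcond=None)
    return coef[0], coef[1]

def predict_scores(intercepts, slopes, x_value):
    """Partial warp scores expected at a chosen value of x."""
    return intercepts + slopes * x_value

# Toy data (hypothetical): two score columns that are exact lines in x.
x = np.array([1.0, 2.0, 3.0, 4.0])
W = np.column_stack([2.0 * x + 1.0, -0.5 * x])
a, b = regress_scores_on_x(W, x)
scores_at_5 = predict_scores(a, b, 5.0)
```

The predicted scores for a large or a small value of x are what a
spline plot would then be drawn from.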
The reference configuration must be supplied as a file (it can
be computed by the GRF, GRF_ND, or TPSRW programs). Either the
raw coordinate data (usually the most convenient way to use the
program) or files giving the weight matrix, principal warps and
their eigenvalues must be provided. The regression makes most
sense if the reference configuration is such that it corresponds
to the average of the independent variable.
There are two versions of the program: one for DOS real mode
and another for DOS protected mode (DPMI). Their use is identical
except that the protected mode version requires that the RTM.EXE
program be present (it will be loaded automatically in order to
switch TPSREGR into protected mode). Unless you are using software
that provides DPMI services (e.g., Windows, 386Max, OS/2) the file
DPMI16BI.OVL must also be present. The protected mode version
requires an 80386 or 80486 computer and is able to use both
ordinary RAM and extended memory so that larger datasets can be
processed. It does not make use of overlay files.
At present the DOS version can handle a maximum of 500 specimens
and 100 landmark points. The DPMI version can handle 2000
specimens and 200 landmarks. No attempt was made to push these
limits to their maxima. Please contact me if these limitations are
a problem.
Note: this version requires a new set of BGI graphics driver
files.
The program is still under development -- please be patient!
To use the program:
1. Type its name at the DOS prompt:> TPSREGR
2. A menu will be displayed. The legal options at a given time will
be shown highlighted.
3. First choose option 1 to specify the name of either the raw
data file or the file containing the weight matrix (computed,
for example, by the TPSRW program). If you provided a name for
the data file then the program will not ask for the name of the
weight matrix file. To provide the weight matrix, leave the
name of the data file blank.
If you supply the name of a data file then you will be asked
whether you would like the x,y-projections of the uniform
component to be added in the weight matrix computed by the
program. This is of interest if you would like to see the
extent to which the uniform component can be predicted by the
independent variable. If you supply a weight matrix (such as
output by the TPSRW program), you will be asked whether the
"retain affine" option was used. This is so the program can
ignore the additional 6 columns added to the matrix. While
regression of all the affine parameters might sometimes be of
interest, it is not obvious how their effect should be plotted
so they are ignored for now.
All files must be in NTSYS-pc compatible formats.
"NTSYSpc" format:
The format is the same as used by the Fourier program in
NTSYSpc. There can be comment lines, followed by a matrix
header line, possibly followed by label lines, and finally
followed by x,y-coordinates as in the "matrix" format described
above.
" fake data for 4 specimens (identical) with p=5 landmarks
1 4 10 0
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 0.0
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 0.0
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 0.0
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 0.0
The program will ask for a name to be given to an output
listing file. Various numerical results will be stored in this
file. If a file already exists with the name you specify you
will be asked whether to overwrite (and hence destroy) the old
file or to append the new information to the end of the old
file.
Finally, the program will ask for the name of a file giving
a list of pairs of landmarks to connect in output plots.
This can sometimes make the plots easier to visualize.
This input is optional. The input format is that for a
graph matrix in NTSYS-pc. An example is as follows for
5 landmarks and 3 links (or edges of a graph) to be drawn.
The value for the length of each edge does not matter but
it must be provided.
" Example of link matrix (type=7) for 5 landmark & 3 edges
" to be shown.
7 5 3 0
1 3 0.0
3 4 0.0
1 4 0.0
4. Next, select option 2 to specify the reference configuration.
It must be provided as an NTSYS-pc compatible file. An average
consensus configuration can be computed by the TPSRW and GRF_ND
programs.
The reference file can be dimensions x landmarks, landmarks x
dimensions, or strung out as a single array in the order:
1x, 1y, 2x, 2y, etc. for a total of 2p elements.
If a weight matrix was provided (rather than a data matrix)
then the program will ask what value was used for the
exponential weight, alpha, in computing the weight matrix. The
published accounts of relative warp analysis (Bookstein, 1991,
or PMMW pp. 246-248) are equivalent to alpha = 1 (which
corresponds to an inverse bending energy metric). It does not
matter (for this program) what value of alpha was used.
The reference configuration should correspond to a specimen
that is average for the population.
5. Next choose option 3 or 4 to either read in the file containing
the independent variable (the program will estimate the
relationship with the partial warps by using regression) or to
read a vector giving the weights for each partial warp (perhaps
a discriminant function vector). In both cases the vector must
be an NTSYS-pc file with only 1 row (or column) containing
values. For option 3, its length must be the number of
objects (specimens) in the data file. For option 4 the
length must be equal to 2(p-3) if the uniform component is
not added or 2(p-3)+2 if the uniform component is added to
the weight matrix, where p=number of landmarks.
6. Next choose option 5 to perform the computations. If an
independent variable was read and the sample size is larger
than the number of parameters, then you will be given a choice
of whether to: (L) regress W on X using least-squares (the
usual case), (M) use major axis regression (PCA, see Biometry
section 15.7), or (I) to use a multiple regression of X on W
(but then the estimated relation has to be inverted, see the
next paragraph). Enter the letter (L,M, or I) for your choice.
Note: multiple regression cannot be used unless there are more
observations than parameters being estimated. The menu choice
will not be shown if that method cannot be used. The "inverse"
is computed as follows. First, the usual regression is made of
the variables in the weight matrix (W) onto the independent
variable. Interpret the coefficients (ignoring the intercept)
as defining a gradient vector through the centroid of the space
(it points in the direction in which the fitted hyperplane is steepest).
Project points onto this vector to determine the relationship
between it and the independent variable. Use that relationship
to predict the values in the weight matrix as a function of the
independent variable. This often does not work very well.
You may wish to try it just to obtain a significance test
for the relationship between shape and the independent
variable.
If an independent vector was read, then you will be given two
choices: (P) project the objects onto the vector and then use
least-squares to regress the partial warps onto the projection
or (W) use the coefficients as is to weight the principal warps.
Enter the letter (P or W) for your choice. Use (W) to
visualize a particular warp or combination of warps and uniform
component. The plot produced when the (P) option is used often
does not correspond to the particular warp you tried to select
since other partial warp scores may be highly correlated with
it in your sample of specimens. With the (W) option the scale
is arbitrary. Be prepared to press the "+" or "-" keys many
times in order to view the plot.
The numerical results will be written to the listing file.
Messages will be displayed on the screen showing progress
through the computations. If the program runs out of memory
you may only get a message that says "Out of memory!".
7. The last choice, "C", should be specified before you try to
get hardcopy of the plots shown in the other menu choices.
See the section "Graphics hardcopy" below for more information.
8. Next you can choose any of the plotting options (options 6 -
7). See below for information about each type of plot.
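
The "inverse" procedure described in step 6 may be easier to follow as
code. The Python/NumPy sketch below is one reading of that description,
not the program's actual implementation. The function name, the
centering of the weight matrix, and the straight-line fit of the
projections on X are assumptions made for illustration.

```python
import numpy as np

def inverse_regression(W, x):
    """Sketch of the (I) option: regress x on W, then invert.

    W : (n, k) weight matrix, one row per specimen (assumed layout).
    x : (n,) independent variable.
    Returns (g, predict): the unit gradient vector and a function that
    gives the predicted row of W for a chosen value of x.
    """
    W = np.asarray(W, dtype=float)
    x = np.asarray(x, dtype=float)
    n, k = W.shape
    if n <= k + 1:
        raise ValueError("need more specimens than parameters")
    W_mean = W.mean(axis=0)
    Wc = W - W_mean                      # work relative to the centroid
    # 1. Multiple regression of x on the columns of W.
    X = np.column_stack([np.ones(n), Wc])
    coef, *_ = np.linalg.lstsq(X, x, rcond=None)
    slopes = coef[1:]                    # ignore the intercept
    norm = np.linalg.norm(slopes)
    if norm == 0.0:
        raise ValueError("x is unrelated to W; no gradient direction")
    # 2. The slopes define a gradient vector through the centroid.
    g = slopes / norm
    # 3. Project the specimens onto that vector.
    s = Wc @ g
    # 4. Relate the projections to x (a straight line is assumed here).
    c1, c0 = np.polyfit(x, s, 1)

    def predict(x_value):
        # 5. Predicted weight-matrix row for a chosen value of x.
        return W_mean + (c0 + c1 * x_value) * g

    return g, predict

# Toy data (hypothetical): column 1 tracks x, column 2 is unrelated to it.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
W = np.column_stack([x, [1.0, -2.0, 0.0, 2.0, -1.0]])
g, predict = inverse_regression(W, x)
row_at_4 = predict(4.0)
```

Predictions from this method lie along a single direction in the space
of partial warp scores, which may be one reason it often does not work
very well.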
6 - Plot partial warp scores against the independent variable
This menu item plots the partial warp scores for each specimen
against the independent variable. If the uniform components
are included then they can also be plotted (they are placed at
the end). Note: this option cannot be selected if an
independent vector is used.
1. Press the "+" and "-" keys to cycle through the partial warps.
2. Press "L" to toggle the labelling of the specimens.
3. Press "P" for graphics hardcopy.
4. Press the "ESC" key to exit.
7 - Plot regression as a spline
This plot shows the thin-plate spline. It will cycle through
displays to give an animated display of the spline being deformed
for larger and then smaller values of the independent variable.
1. You can select the magnitude of the range of the independent
variable by pressing the 'M' key followed by pressing the + and
- keys. The value of X is displayed. When the correlations
with the independent variable are very small you will have to
greatly enlarge the range in order to see any effect.
This is the default mode. The initial displayed range is half
of the observed range. Pressing + will double this range.
2. If the uniform components are included then you can press the
"U" key to toggle their contributions off and on. Likewise,
you can press the "N" key to toggle the display of the
nonuniform (local deformation) components off and on.
A message will be displayed at the upper left of the screen to
indicate their current status.
3. If you press "C" the landmarks for each specimen will be
connected by a series of lines in the order in which the
landmarks were entered. Press "C" again to turn this display off.
4. Press the "L" key to display labels for the points. Press it
again to turn off their display.
5. Press the "V" key to display displacement vectors. Press it
again to turn them off. These vectors are plotted only on the
untransformed grid. The end points of the vectors are the
locations of each point after the transformation that is about
to be applied. The vectors are usually similar to the relative
warp loading vectors. Sometimes they are quite different.
6. To print a copy of the graphics screen press "P". The program
will then prompt for you to press either the "+" or the "-"
keys. This allows you to specify whether to output the spline
based on the positive or negative warping of the space.
The screen will clear until the plot is complete.
7. Press the "E" key to toggle the plot of the line segments between
pairs of landmarks on and off. Uses the link file if present.
8. Press the "R" key to reset the display back to the default.
9. For fast computers you can press the "D" key followed by the
"+" key to increment (by 0.5 sec) the delay between successive
displays. If you increase it too much you can decrease the
delay by pressing the "-" key. The computer will beep if you
attempt to reduce the delay below 0 (you cannot speed up the
computations by having a negative delay!).
9. Press the ESC key to exit.
----------------------------------------------------------------
Configuration for graphics hardcopy
A window will be displayed that lists the various devices and
their modes. Another window will then be displayed that asks for
a device or file name. If you would like the output written to a
file for later use then enter a valid file name. The name should
be short to allow for the fact that the program will append a
number so that each picture can be stored in a separate file.
To have the output sent directly to a printer attached to a
printer port enter LPT1 or LPT2. For output directly to a printer
or a plotter attached to a serial port enter COM1 or COM2. In the
latter case you must also specify the baud rate, parity, number of
data bits, and whether or not to use XON/XOFF protocol.
The available baud rates are: 300, 1200, 2400, 4800, and 9600.
Parity can be N (none), E (even), or O (odd). The number of data
bits can be 7 or 8. Use the symbol "X" to indicate XON/XOFF.
These codes are entered after the port name. For example, for
a plotter attached to COM1 and working with 2400 baud, no parity,
8 data bits, and using XON/XOFF enter the following:
COM1,2400,N,8,X
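
As an illustration of the syntax only, a string of this form could be
checked as in the following Python sketch. The names are hypothetical
and this is not how the program itself parses the entry. The sketch
also assumes that leaving off the final "X" means XON/XOFF is not used,
which the description above does not state.

```python
from dataclasses import dataclass

BAUD_RATES = {300, 1200, 2400, 4800, 9600}
PARITIES = {"N", "E", "O"}
DATA_BITS = {7, 8}

@dataclass
class SerialSpec:
    port: str
    baud: int
    parity: str
    data_bits: int
    xon_xoff: bool

def parse_serial_spec(text):
    """Split 'PORT,BAUD,PARITY,BITS[,X]' and validate each field."""
    fields = [f.strip().upper() for f in text.split(",")]
    if len(fields) not in (4, 5):
        raise ValueError("expected 4 or 5 comma-separated fields")
    port, baud_text, parity, bits_text = fields[:4]
    if port not in ("COM1", "COM2"):
        raise ValueError("serial port must be COM1 or COM2")
    baud = int(baud_text)
    data_bits = int(bits_text)
    if baud not in BAUD_RATES:
        raise ValueError("unsupported baud rate: %d" % baud)
    if parity not in PARITIES:
        raise ValueError("parity must be N, E, or O")
    if data_bits not in DATA_BITS:
        raise ValueError("data bits must be 7 or 8")
    # Assumption: a missing fifth field means XON/XOFF is off.
    xon_xoff = len(fields) == 5
    if xon_xoff and fields[4] != "X":
        raise ValueError("fifth field must be X")
    return SerialSpec(port, baud, parity, data_bits, xon_xoff)

spec = parse_serial_spec("COM1,2400,N,8,X")
```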
The following printers are supported: Epson 9-pin printers
(including Epson FX and MX, IBM Graphics Printer and Proprinter,
and Panasonic and OkiData ["native" or with Epson or IBM
emulation]), Epson 24-pin printers (includes Epson LQ, NEC
Pinwriter, and Panasonic printers with Epson emulation), and
Toshiba P321 24-pin printer. The Epson 9-pin and 24-pin color dot
matrix printers are supported. The HP LaserJet (all models), HP
DeskJet (all models), and Canon LBP-8 laser and inkjet printers
are supported.
The following plotters are supported: HP7470, HP7475, and
HP7585. Many other plotters are compatible with these plotters.
If the plotting information is written to a file it can be read by
many word processors, desktop publishing programs, and by graphics
programs.
In addition, you can select output formats of CGM, GEM IMG,
PCX, WordPerfect WPG, and TIFF (both compressed and uncompressed).
MS Windows bitmap files (BMP) are also supported. These are
useful in order to import the graphics into various desktop
publishing and "paint" programs where you can add annotations,
delete unwanted details, etc.
BGI files
These are the files that provide the graphics support to the
program. You only need to have the BGI files on your disk for the
devices you expect to use. If you do not have the proper graphics
BGI file you will not be able to see a plot on the screen. If the
proper BGI file for graphics hardcopy is not present the program
will exit back to the main menu without any error message. The
correspondence between BGI files and devices is given below.
Graphics adapters:
_CANON.BGI Canon LBP-8 printer
_CFX.BGI 9-pin color dot matrix
_CLQ.BGI 24-pin color dot matrix
_DIC.BGI Kodak Diconix printer
_DJ.BGI HP DeskJet printer
_DJC.BGI HP Color DeskJet printer
_DMPL.BGI DM/PL plotters
_FX.BGI Epson 9-pin printers
_HP7470.BGI HP7470 plotter
_HP7475.BGI HP7475 plotter
_HP7550.BGI HP7550 plotter
_HP7585.BGI HP7585 plotter
_LQ.BGI Epson 24-pin printers
_LJ.BGI HP LaserJet printer
_LJ3R.BGI HP LaserJet III printer
_OKI92.BGI Okidata 92 native mode
_PJET.BGI HP PaintJet
_PP24.BGI 24 pin dot matrix
_TJ.BGI HP ThinkJet printer
For the above devices you will need to know how it is attached
to your computer (printer or serial port). In the case of a
serial port you will also need to know the baud rate, parity,
number of data bits, and whether the XON/XOFF protocol is used.
Graphics file formats:
_AI.BGI Adobe Illustrator Postscript
_BMP.BGI MS Windows bitmap files
_CGM.BGI CGM files
_DXF.BGI AutoCad
_IMG.BGI GEM IMG files
_PCX.BGI PCX paint file format
_TIFF.BGI Compressed TIFF format
_UTIFF.BGI Uncompressed TIFF format
_WPG.BGI WordPerfect WPG files
The BGI files whose names begin with "_" are part of the
GRAF/DRIVE package from Flemming Software as is the GCOPY.EXE
program that can be used to copy graphic files to a printer or
plotter. Type GCOPY and instructions will be displayed.
----------------------------------------------------------------
Sample data files
A set of data files is provided as an example. They are the
rat calvarial growth dataset described on pages 408-414 in the
"orange book" (Bookstein, 1991). The file RATS.NTS contains the
x,y-coordinates in NTSYS-pc compatible format. Each specimen is a
row of the matrix and the 16 columns correspond to the x and y
coordinates for the 8 landmarks. The RATS.REF file contains
coordinates that can be used as a reference and the RATS.SIZ file
contains the centroid size of each specimen. The file RATS.LNK
is an example of a matrix of edge links (not actually needed
for this dataset since the landmarks were digitized in a logical
order so that the "Connect" option will link them automatically).
The file RATS.V1 provides an example of an independent vector
that can be used to display the first principal warp. It is of
length 10. Thus it assumes that the uniform components are not
added. Use option (W) when computing the regression.
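
The length follows from the rule given for option 4: with p = 8
landmarks, 2(p-3) = 10, or 12 if the two uniform components are added.
A small Python check of that arithmetic (the function name is made up
for this example):

```python
def weight_vector_length(p, uniform_added=False):
    """Expected length of an option 4 vector for p landmarks."""
    if p < 3:
        raise ValueError("p must be at least 3")
    length = 2 * (p - 3)          # two scores (x and y) per partial warp
    if uniform_added:
        length += 2               # x,y-projections of the uniform component
    return length

rats_length = weight_vector_length(8)
rats_length_with_uniform = weight_vector_length(8, uniform_added=True)
```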
Note: the specimens are ordered as in Bookstein (1991),
specimen 1 ages 7 to 150 days, then specimen 2 ages 7 to 150 days,
etc.
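
For readers who want to inspect such a file outside the program, the
simplest form of the layout (comment lines, a four-number header, then
one row of x,y pairs per specimen) could be read as in the Python/NumPy
sketch below. The function names are hypothetical. The sketch assumes
that type code 1 means a rectangular matrix, and it does not handle
label lines or missing-value codes.

```python
import numpy as np

def read_nts_matrix(text):
    """Parse a plain rectangular NTSYS-pc style matrix from a string."""
    body = [ln.strip() for ln in text.splitlines()]
    # Comment lines start with a double quote; blank lines are skipped.
    body = [ln for ln in body if ln and not ln.startswith('"')]
    if not body:
        raise ValueError("no header line found")
    header = body[0].split()
    if len(header) != 4 or not all(tok.isdigit() for tok in header):
        raise ValueError("unsupported header: " + body[0])
    mtype, nrows, ncols, missing = map(int, header)
    if mtype != 1 or missing != 0:
        raise ValueError("only type 1 without missing values is handled")
    # Values are free-format, so a row may wrap over several lines.
    values = [float(tok) for ln in body[1:] for tok in ln.split()]
    if len(values) != nrows * ncols:
        raise ValueError("header does not match the number of values")
    return np.array(values).reshape(nrows, ncols)

def to_landmarks(data):
    """Reshape rows of x1 y1 x2 y2 ... into (specimens, landmarks, 2)."""
    n, ncols = data.shape
    if ncols % 2:
        raise ValueError("expected an even number of columns")
    return data.reshape(n, ncols // 2, 2)

# The small example file shown earlier in this document.
SAMPLE = '''" fake data for 4 specimens (identical) with p=5 landmarks
1 4 10 0
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 0.0
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 0.0
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 0.0
1.1 2.2 3.3 4.4 5.5 6.6 7.7 8.8 9.9 0.0
'''
data = read_nts_matrix(SAMPLE)
landmarks = to_landmarks(data)
```

For a file like RATS.NTS the same reshape would give 8 landmarks per
specimen from the 16 columns.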
----------------------------------------------------------------
Output listing
The program produces a rather modest listing file containing
printouts of some of the matrices involved. Of most interest will
be the listing giving the mean partial warp scores and the
regression coefficients on the independent variable. It may be of
interest to discover which principal warps are most correlated
with the independent variable. If the first few warps are highly
correlated then the effect of the independent variable is
localized. If only the last few partial warps are highly
correlated then there are large-scale effects.
The uniform components have very large correlations with the
independent variable. As one might expect, you may find that the
three different regression techniques yield quite different
coefficients. When the correlations are all very low you may have
to greatly magnify the plot of the splines in order to see
anything happening (if the correlations were all equal to zero
then no amount of magnification would show any "action" in the
plot).
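
If the partial warp scores are available as a matrix, the correlations
discussed here can also be checked independently. The following
Python/NumPy sketch is only an illustration. The function name and the
one-row-per-specimen layout of W are assumptions, and the column order
of W need not match the order used in the listing file.

```python
import numpy as np

def score_correlations(W, x):
    """Pearson correlation of each column of W with x."""
    W = np.asarray(W, dtype=float)
    x = np.asarray(x, dtype=float)
    Wc = W - W.mean(axis=0)
    xc = x - x.mean()
    # A column with no variation gives a zero denominator (result: nan).
    denom = np.sqrt((Wc ** 2).sum(axis=0) * (xc ** 2).sum())
    return (Wc.T @ xc) / denom

# Toy data (hypothetical): perfect positive, perfect negative, and no
# linear relation with x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
W = np.column_stack([2.0 * x + 1.0, -x, [1.0, -2.0, 0.0, 2.0, -1.0]])
r = score_correlations(W, x)
# Columns ordered from strongest to weakest absolute correlation.
order = np.argsort(-np.abs(r))
```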
----------------------------------------------------------------
Changes from previous version
7/6/93 Recompiled for BP7 and added support for DOS DPMI mode.
Increased the size of datasets that could be processed.