TO: Readers of Morphmet
FROM: Fred Bookstein, fred@brainmap.med.umich.edu
RE: the new book by Dryden and Mardia
"Statistical Shape Analysis" (Ian Dryden and Kanti Mardia, John Wiley and Sons, 1998) is the latest and best entry in the growing list of volumes systematizing and summarizing this important new specialty. It will be especially useful for practitioners and students with a background in the mathematical sciences.
The new methodology we've come to call "geometric morphometrics" pulls together techniques and styles of inquiry from many different fields--geometry, mathematical biology, graphics, computer vision, paleontology, anatomy, biometrics, and others. I think that mathematical statistics was the most recent of those sources to make its influence felt. The relevance of classic mathematical-statistical concerns for the kinds of data analyses some of us were eagerly improvising came to our attention only in the second half of the 1980's, when David Kendall pointed out that everything I was doing with shape coordinates as an applied statistician actually constituted an approximate method, managed via a tangent construction, to his Riemannian spaces of similarity equivalence classes. Kendall argued that in this Riemannian space, which was the only reasonable mathematical setting for the problem, none of my approximations were necessary. Quite a challenge, then: could rigorous analyses actually be carried out in his original spaces? or at least wielded there to infer the relationships among alternative approximations or to construct optimal versions?
Kanti Mardia and his then-student Ian Dryden seized upon this somewhat oracular observation of Kendall's and immediately began work to unfold its implications for statistical practice. Those turned out to be closely aligned with Mardia's own work of the preceding two decades regarding distributions for directional data. In a great series of original papers beginning in the late 1980's, Dryden and Mardia produced the exact distributions of shape to which Kendall averred, both for his original problem of identically distributed isotropic diffusions (which apply in astronomy, for instance) and for the far more practical variants pertinent to problems in biomedical settings--diffusions around meaningful mean shapes, distributions failing of isotropy or symmetry in realistic ways. Others they recruited, notably their colleague John Kent at Leeds, contributed other insights into the symmetries of the distributions and the associated tangent constructions.
What has resulted is as much a matter of careful definitions as theorems or special cases. The book at hand assembles this literature, along with a variety of historical comments, computational demonstrations, and speculative extensions, in one tight system of terminology, notation, and derivation. Indeed, one important contribution of the book is just this much-needed control of terminology. Irksome apparent inconsistencies in our literature, notably regarding claims and counterclaims about the properties of different methods, would often turn out to arise from uses of the same words with subtly different mathematical meanings. These core distinctions are collected in Chapter 3, Planar Procrustes Analysis, and Chapter 4, Shape Space and Distances. In particular, the difference between Procrustes fits that do, or do not, alter Centroid Size from unity is clearly drawn as the distinction between "full" and "partial" (fits, distances, tangent coordinates), and likewise the relation of either of these two distances, which are root sums of squares, to the original Riemannian metric of Kendall's shape manifolds, which they call simply "Procrustes distance."
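For readers who want to see the three distances side by side, here is a minimal sketch for planar landmark data, with each configuration written as a list of complex numbers. It is not taken from the book. It assumes the standard relations for two centered, unit-size configurations with Riemannian distance rho: the full distance is sin(rho) and the partial distance is 2 sin(rho/2), where cos(rho) is the modulus of their complex inner product. The function names preshape and shape_distances are hypothetical, chosen only for this illustration.

```python
import math


def preshape(landmarks):
    """Center a planar configuration (complex numbers) and scale it to unit Centroid Size."""
    k = len(landmarks)
    centroid = sum(landmarks) / k
    centered = [z - centroid for z in landmarks]
    # Centroid Size: root sum of squared distances of the landmarks from their centroid.
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    if size == 0:
        raise ValueError("all landmarks coincide; shape is undefined")
    return [z / size for z in centered]


def shape_distances(a, b):
    """Return the Riemannian, full, and partial distances between two planar shapes.

    Assumption (not quoted from the review): for preshapes za and zb,
    cos(rho) = |sum(conj(za_j) * zb_j)|.
    """
    if len(a) != len(b):
        raise ValueError("configurations must have the same number of landmarks")
    za, zb = preshape(a), preshape(b)
    # The modulus of the Hermitian inner product is unchanged by rotating either form.
    c = abs(sum(x.conjugate() * y for x, y in zip(za, zb)))
    c = min(1.0, c)  # guard against rounding slightly above 1
    rho = math.acos(c)
    return {
        "riemannian": rho,                     # rho, the geodesic distance
        "full": math.sqrt(1.0 - c * c),        # sin(rho): size of the fitted form may change
        "partial": math.sqrt(2.0 * (1.0 - c)), # 2 sin(rho/2): size held at unity
    }


if __name__ == "__main__":
    triangle = [0 + 0j, 1 + 0j, 0.5 + 1j]
    other = [0 + 0j, 1 + 0j, 0.3 + 0.6j]
    print(shape_distances(triangle, other))
```

Under these assumptions the three values always satisfy full <= partial <= riemannian and agree to first order for nearby shapes, which is one reason claims made about one distance were so easily misread as claims about another.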
The core of any applied data analysis will be the construction of the pole of the Kendall space in the vicinity of which the data are to be inspected: this classic topic is the subject of Chapter 5, Generalized Procrustes Methods. The core of the mathematical-statistical analysis is to be found in Chapter 6, Shape Models for Two-Dimensional Data. Systematically, Dryden and Mardia write down all of the shape distributions that have been found useful to date--the uniform distribution, the complex Bingham, the complex Watson, and the offset normal (Mardia-Dryden), both isotropic and anisotropic. Normalizing constants, linear approximations, and limiting distributions are presented whenever they are known. Immediately following (Chapter 7, Tangent Space Inference) is the core of the corresponding applied multivariate method as it is most frequently encountered in the literatures frequented by the readers of this newsgroup (paleobiology, evolutionary biology, medical imaging). The standard approximate methods of Bookstein, Goodall, and others now appear as straightforward transcriptions of the basic distributional results and approximations that have already been arrived at.
The final thrust of an analysis on these principles is ordinarily the construction of diagrams back in the two- or three- dimensional space of the original data. Dryden and Mardia have adopted my favorite diagram style, the thin-plate spline, introduced in Chapter 10 to represent differences of average shape along with the systems of orthogonal functions, partial warps and relative warps, that extend its power in practice.
Beyond this central core of formalisms, the topics to be included in any compendium on geometric morphometrics must necessarily represent the taste or experience of the authors. "Statistical Shape Analysis" has four chapters of these additional themes, covering higher-dimensional distributions (which are less thoroughly understood than those for two-dimensional data), joint distributions of size and shape (for which this reviewer is content to use covariance-driven approximations), shape from images, and other "additional topics" not yet integrated into the underlying mathematicostatistical framework. The book concludes with a helpful index of notations, three exemplary data sets, and a fine bibliography.
No reviewer ever agrees with the authors about coverage. But everything I think important in my own algebraic-statistical work has been included here, even my most recent suggestions dealing with the uniform (affine) component of variation, permutation tests (rather than prophylactics to verify assumptions), and extension of Procrustes analysis to outlines via thin-plate-spline. Had I been third author, I would have added a topic on verification of assumptions toward the end of Chapter 6, and I would have tried to ensure that at least one of the worked examples included the production of a "wrong" result, or perhaps no result at all, as the result of recourse to a "wrong" method, and also the production of some startlingly brilliant analysis by recourse to a perfectly targeted version of one of the sturdy techniques taught here. Authors who eschew the exposure of false findings must be considered pacifist; those who do not call attention to new discoveries made with their aid are obviously modest. This book, magisterial in spite of such modesty and pacifism, certainly belongs close at hand to the active computer screen of anyone who works with landmark data, whether inventor or applier of these newly standardized, theorem-driven approaches.