4 PROCEEDINGS OF THE NATIONAL MUSEUM vol. 124 



made for missing or inapplicable attributes (the latter arise if, as is 

 commonly the case, the applicability of later questions depends on 

 the answer to earlier ones). (3) The dichotomizing of a single multi- 

 state attribute always generates a set (at least two) of qualitative 

 attributes, and these are linked logically in the sense that certain 

 combinations of states will be redundant (experience suggests that 

 this will not disturb the analysis, provided the number of originally 

 multistate attributes is small; Watson, Williams, and Lance, 1966). 

 (4) In a completely qualitative system no provision can be made 

 for "doubtful" entries (in the present case these comprised less than 

 2 percent of the total). (5) A character may be capable of subdivision; 

 for example, carapace ornamentation can be reduced to the single 

 character "mostly ridges present rather than raised granular areas," 

 or (as in the present case) the ridges can be listed separately; this 

 decision necessarily involves the concept of "weighting" and must 

 be resolved on taxonomic grounds, not numerical grounds. 
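
Point (3) above can be illustrated with a minimal sketch. It assumes one common way of dichotomizing, namely one presence/absence attribute per state. The attribute and its three state names are hypothetical and are not taken from table 1.

```python
from itertools import product


def dichotomize(value, states):
    """Encode one multistate value as a tuple of 0/1 attributes, one per state."""
    if value not in states:
        raise ValueError(f"unknown state: {value!r}")
    return tuple(1 if value == s else 0 for s in states)


# Hypothetical states of a single multistate attribute (illustration only).
STATES = ("smooth", "ridged", "granular")

# The 0/1 codes a specimen can actually receive.
encodings = {s: dichotomize(s, STATES) for s in STATES}

# Every 0/1 combination the three derived attributes could take in principle.
all_combinations = set(product((0, 1), repeat=len(STATES)))

# Combinations no specimen can exhibit: the logical linkage between attributes.
impossible = all_combinations - set(encodings.values())
```

Under this assumed coding, three derived attributes admit eight combinations of states, of which only three can occur; the remaining five are excluded by the logical linkage described in (3).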



As the investigation proceeded, 44 species eventually were com- 

 pared by reference to 57 features (selected features are listed in 

 table 1, species in table 2, and data in table 3).² Selected features 

 were those believed likely to give good overall discrimination. Had 

 particular comparisons been at issue, other characters might well 

 have been more appropriate. The wording of the features was designed 

 to give positive answers to our specific questions for most of the 

 western American Portunus species. 



During tabulation of data, the inadequacy of many past descrip- 

 tions of the species became apparent. Such descriptions have con- 

 centrated upon specific recognition and distinctions from nearly 

 related species but have omitted similarities to more distant species. 



Numerical model. — Any study of inter-relationships requires the 

 definition of a measure of likeness to serve as the basic numerical 

 model of the system. Such measures — the so-called "similarity 

 coefficients" — have been proposed in great variety; the best known 

 are summarized and defined in Goodman and Kruskal (1954, 1959), 

 Dagnelie (1960), and Sokal and Sneath (1963). The simplest measure 

 of difference between two qualitatively specified individuals is the 

 "number of features of difference" (the NFD value) wherein one 

 individual scores + and the other −. In the conventional "a, b, c, d" 

 symbolism of a 2×2 contingency table, this is the quantity "b+c." 
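
The NFD value just defined can be sketched as follows. The sketch assumes the usual reading of the symbols (a: feature present in both individuals, b: present in the first only, c: present in the second only, d: absent from both). The function names and the two example individuals are hypothetical.

```python
def contingency_counts(x, y):
    """Return (a, b, c, d) for two individuals scored 1 (+) or 0 (-) per feature."""
    if len(x) != len(y):
        raise ValueError("both individuals must be scored on the same features")
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        if xi and yi:
            a += 1      # present in both
        elif xi:
            b += 1      # present in the first only
        elif yi:
            c += 1      # present in the second only
        else:
            d += 1      # absent from both
    return a, b, c, d


def nfd(x, y):
    """Number of features of difference: the quantity b + c."""
    _, b, c, _ = contingency_counts(x, y)
    return b + c


# Two hypothetical individuals scored on five features.
first = (1, 1, 0, 0, 1)
second = (1, 0, 1, 0, 0)
counts = contingency_counts(first, second)
difference = nfd(first, second)
```

For these two individuals the counts are a = 1, b = 2, c = 1, d = 1, so the NFD value is 3.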

 Moreover, if we regard the attributes as defining a set of orthogonal 

 axes in Euclidean space and regard the coordinate along a given axis 

 as "1" (if the feature is possessed) and "0" (if it is lacking), "b+c" 

 then represents the square of the Euclidean distance between the two 



² Tables at end of paper. 



