taxonomic problems. This consists of a value 



$$ D = \sum \lambda_i d_i \qquad (12) $$

which is a maximum obtained by varying the coefficients

 $\lambda_1, \ldots, \lambda_i$ independently while

 $d_1, \ldots, d_i$ are the differences between the

 means of $i$ characters in two samples. This

 paper was followed by two others (Fisher 1938, 

 1940) in which he compared the discriminant 

 function with Hotelling's (1931) generalized test 

 of significance and with Mahalanobis' (1936)

 generalized distance function and then developed 

 tests of significance for the discriminant function. 



The discriminant function has been used 

 occasionally in taxonomic work. Mather and

 Dobzhansky (1939) were able to use it with mor- 

 phological characters to distinguish between 

 two races of Drosophila which had been thought 

 to be morphologically identical although physio- 

 logically and ecologically distinct. Stone (1947) 

 used it with counted characters in a study of sub- 

 speciation in Boleosoma nigrum, a small fish.

 By means of it he was able to show that there 

 was no overlap between two forms called sub- 

 species and considered that they should be 

 designated as species. These two instances are, 

 however, rather unusual and there has been no 

 widespread use of the method, at least in fish 

 taxonomy. The reasons are probably that the 

 mathematics are complicated and, more import- 

 ant, that the method is essentially a method of 

 assigning individuals to known groups. 



A statistic which gets at the heart of the 

 taxonomic problem of determining overlap is 

 the generalized distance function as stated by 

 Mahalanobis (1936). He started with the case

 of p independent variates in two statistical 

 populations where

$$ {}_pD^2 = \frac{1}{p} \sum_{i=1}^{p} \frac{(\bar{x}_{i1} - \bar{x}_{i2})^2}{\sigma_i^2} \qquad (13) $$



If this is reduced to the case of a single 

 variate, it is important to note that 



$$ D^2 = \frac{(\bar{x}_1 - \bar{x}_2)^2}{\sigma^2} \qquad (14) $$



is merely the square of our equation (5) and the 

 D is equivalent to the one which we have used in 

 measuring overlap. 



Mahalanobis (1936) then generalized to

 the case of p correlated variates in two popula- 

 tions 



$$ {}_pD^2 = \frac{1}{p} \sum_{i=1}^{p} \sum_{j=1}^{p} w^{ij} d_i d_j \qquad (15) $$



in which $w^{ij}$ is the reciprocal of the variance-covariance

 matrix $w_{ij}$. Fisher (1938) pointed

 out that the p was unnecessary and the formula 

 has been reduced to 



$$ D^2 = \sum_{i=1}^{p} \sum_{j=1}^{p} w^{ij} d_i d_j \qquad (16) $$



in which $d_i$ and $d_j$ are the differences between

 the means. 
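
As a minimal sketch of the arithmetic behind formula (16), the following Python computes D^2 as the double sum of w^{ij} d_i d_j, given a vector of mean differences and a variance-covariance matrix. The function names (`invert`, `generalized_distance_sq`) and the numbers are hypothetical and serve only as illustration. The reciprocal matrix is obtained here by ordinary Gauss-Jordan elimination, which is an assumption of this sketch and not a procedure taken from the papers cited above.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Bring the largest remaining entry of this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("variance-covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right-hand half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


def generalized_distance_sq(d, w):
    """Return D^2 = sum_i sum_j w^{ij} d_i d_j, where w^{ij} is the inverse of w."""
    w_inv = invert(w)
    p = len(d)
    return sum(w_inv[i][j] * d[i] * d[j] for i in range(p) for j in range(p))


# Hypothetical example: two characters with correlated variation.
d = [2.0, 1.0]                      # differences between the means
w = [[4.0, 1.0], [1.0, 2.0]]        # variance-covariance matrix
print(generalized_distance_sq(d, w))          # about 1.1429 (= 8/7)

# Single variate: the double sum collapses to (difference)^2 / variance.
print(generalized_distance_sq([3.0], [[4.0]]))  # 2.25
```

The single-variate call illustrates the reduction noted for equation (14): with one character the statistic is the squared difference between the means divided by the variance.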



Mahalanobis' approach to the problem of

 determining the distance between populations was 

 essentially intuitive, but Rao (1947) supplied a 

 logical solution in which he defined the distance 

 between multivariate populations in terms of the 

 overlapping and pointed out that this distance is 

 an explicit function of $D^2$. Rao (1947, 1952)

 also points out that $D^2$ satisfies two fundamental

 postulates of distance: 



1. The distance between two groups is not

 less than zero. 



2. The sum of distances of a group from two

 other groups is not less than the distance 

 between the two other groups (triangle 

 law of distance).



A further empirical requirement is also satis- 

 fied: 



3. The distance must not decrease when 

 additional characters are considered. 
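
Requirements 1 and 3 can be illustrated numerically. The sketch below, with hypothetical function names and made-up numbers, computes D^2 from the first one, two, and three characters of the same data, and shows that the values are never negative and do not decrease as characters are added. It assumes the variance-covariance matrix for a subset of characters is the corresponding submatrix of the full one, and it solves a linear system by Gaussian elimination instead of forming the reciprocal matrix. The triangle law (postulate 2) is not checked here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("variance-covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def d_squared(d, w):
    """D^2 = sum_i sum_j w^{ij} d_i d_j, computed as d . x where w x = d."""
    x = solve(w, d)
    return sum(di * xi for di, xi in zip(d, x))


def d_squared_by_character_count(d, w):
    """Return D^2 using the first 1, 2, ..., p characters."""
    return [
        d_squared(d[:k], [row[:k] for row in w[:k]])
        for k in range(1, len(d) + 1)
    ]


# Hypothetical data: three characters measured in two groups.
d = [2.0, 1.0, 0.5]                 # differences between the means
w = [
    [4.0, 1.0, 0.5],
    [1.0, 2.0, 0.3],
    [0.5, 0.3, 1.0],
]                                   # variance-covariance matrix
values = d_squared_by_character_count(d, w)
print(values)                       # each value is at least as large as the one before
```

A character that adds no discriminating information beyond the earlier ones leaves D^2 unchanged, so the sequence can stay level but should not fall.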



This generalized distance function has 

 been applied to taxonomic problems largely by 

 members of the Indian school of statistics. The 

 most extensive use was probably that of Mahalanobis

 et al. (1949), who made a monumental

 anthropometric study of over 2,000 individuals 

 in 22 groups using 12 measurements. Several 

 other examples as well as a thorough mathematic- 

 al treatment are given in the text by Rao (1952). 



The method of computation using formula 

 (15) involves a matrix inversion and Rao (1952) 

 suggests that this is suitable for up to about 4 






