


FISHERY BULLETIN OF THE FISH AND WILDLIFE SERVICE 



Table 8. Coefficients of the discriminant function $Y_i = \sum l_i X_i$ and successive values of $D_i^2$



The solution of these six equations requires the inversion of a 6 × 6 matrix. Rao (1952) presents a method of solving these equations so that successive discriminant functions are obtained. At the first stage of solution, the discriminant function using anterior scutes only is computed, while at the second stage the function using anterior scutes and posterior scutes is obtained. The discriminant function using all six characters is obtained at the sixth stage. The solution of these equations is given in table 8. Any particular discriminant function can be obtained by substituting the $l_i$ from this table into the equation:

$$Y = \sum_i l_i X_i$$
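The idea of successive discriminant functions can be sketched in a few lines of Python. This is only an illustration: it assumes the equations have the usual form $\sum_j w_{ij} l_j = d_i$, it re-solves each leading subsystem from scratch rather than following Rao's stepwise arithmetic, and the matrix `w` and vector `d` are hypothetical numbers, not the shad data. The function names are likewise invented for the example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def successive_discriminants(w, d):
    """Return [(coefficients, D^2)] using the first 1, 2, ..., p characters."""
    results = []
    for k in range(1, len(d) + 1):
        w_k = [row[:k] for row in w[:k]]   # leading k x k block of w
        l_k = solve(w_k, d[:k])            # coefficients at stage k
        d2_k = sum(li * di for li, di in zip(l_k, d[:k]))  # D^2 = sum l_i d_i
        results.append((l_k, d2_k))
    return results


# Hypothetical within-population covariances and mean differences
# for three characters (illustrative values only).
w = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.8],
     [0.5, 0.8, 2.0]]
d = [2.0, 1.0, 0.5]

for stage, (l_k, d2_k) in enumerate(successive_discriminants(w, d), start=1):
    print(stage, [round(v, 4) for v in l_k], round(d2_k, 4))
```

Each stage adds one character, so the printed $D^2$ values cannot decrease from one stage to the next; how much they rise is what indicates the worth of the added character.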



The variance of $Y_i$ is $D_i^2$ and can be obtained at the same time as the coefficients $l_i$. It can be proved (Rao 1952) that $D_i/2$ is a normal deviate with mean zero and a standard deviation of one. The probability of obtaining a normal deviate no greater than $D_i/2$ is identical to the probability of correctly classifying an individual from any one population. Values of $D_i^2$, $D_i/2$, and the probability of correct classification are also given in table 8. From this table it can be seen that the increase in $D^2$ with the addition of vertebrae is quite small; therefore, the number of vertebrae is not very useful for purposes of discrimination when used with the other five characters. From the estimates of $w_{ij}$ it is apparent that the covariance between vertebrae and other characters is generally large. This correlation may reduce the usefulness of vertebrae for discrimination. To take an extreme example, if the correlation between two characters were one, it would be useless to include more than one of them in a discriminant function. The question immediately arises as to how the correlation of the characters affects the relative efficiency of the function. This can be answered by a test of significance of the hypothesis of no increase in $D^2$ in going from a discriminant function using the first $p$ characters to one using $p + q$. In this case $p = 5$ and $p + q = 6$. Rao presents this test on page 253.



$$R = \frac{1 + \dfrac{N_1 N_2}{(N_1+N_2)(N_1+N_2-2)}\,D_{p+q}^2}{1 + \dfrac{N_1 N_2}{(N_1+N_2)(N_1+N_2-2)}\,D_p^2} = \frac{1 + \dfrac{(91)(104)}{(195)(193)}\,(3.22)}{1 + \dfrac{(91)(104)}{(195)(193)}\,(3.16)} = 1.0078$$

$$F = \frac{N_1+N_2-p-q-1}{q}\,(R-1) = 1.46$$



This $F$ [with $q$ and $(N_1+N_2-p-q-1)$ d.f.] is not significant; therefore, the hypothesis of no added information being supplied by vertebral counts can be accepted. It must be remembered that this is true only when use is made of the data from the remaining five characters. Since vertebrae add nothing to the power of discrimination, they will be omitted from further calculations. The fact that vertebral counts can be eliminated from the discriminant function has considerable practical value, because these counts have to be made from x-rays or after careful dissection of the fish. This one count would probably be as costly in terms of time and money as the other five.
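The test for added information can be retraced with a short Python sketch. It uses the sample sizes 91 and 104 and the rounded $D^2$ values 3.16 and 3.22 quoted in the text; the function name `added_information_f` is hypothetical and exists only for this illustration.

```python
def added_information_f(n1, n2, p, q, d2_p, d2_pq):
    """Ratio R and F for the increase in D^2 when q characters are added to p."""
    c = (n1 * n2) / ((n1 + n2) * (n1 + n2 - 2))
    r = (1 + c * d2_pq) / (1 + c * d2_p)
    df2 = n1 + n2 - p - q - 1          # denominator degrees of freedom
    f = (df2 / q) * (r - 1)
    return r, f, (q, df2)


# Rounded values as quoted in the text: p = 5 characters, q = 1 (vertebrae).
r, f, df = added_information_f(91, 104, 5, 1, 3.16, 3.22)
print(f"R = {r:.4f}, F = {f:.2f}, d.f. = {df}")
```

With these rounded inputs R comes out near 1.008 and F near 1.6, slightly above the printed figures, presumably because the published values were computed from $D^2$ carried to more decimals (an assumption; the text does not say). Either way F is far below the tabulated 5-percent point for 1 and 188 d.f., which is roughly 3.9, in agreement with the conclusion that vertebrae add nothing significant.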



The next step is to find the means of the discriminant function for the two populations. This is done by substituting the mean values of the characters for each population into the discriminant function. The discriminant function as taken from table 8 (excluding vertebrae) is:



$$Y = 0.785X_1 + 0.577X_2 + 0.871X_3 + 0.234X_4 + 1.731X_5$$



The mean value of this function for the Hudson River, 1939, is 74.103 and for the Connecticut River, 1945, is 70.940. If this function were to be used to discriminate between the two populations, those fish with a value of $Y$ less than 72.52 would be called Connecticut River fish and those above 72.52 would be classified as Hudson River fish. The error in this classification would be the proportion of Connecticut fish with a $Y$ greater than 72.52 and the proportion of Hudson fish with a $Y$ less than 72.52. The variance of $Y$ is:



$$D^2 = l_1 d_1 + l_2 d_2 + l_3 d_3 + l_4 d_4 + l_5 d_5$$

$$D^2 = 3.163$$
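The classification rule and its expected accuracy can be illustrated with a small Python sketch. The coefficients, population means, and $D^2$ are those given in the text; the function names and the sample counts for one fish are hypothetical, and the probability of correct classification is taken, as stated earlier, to be the normal probability of a deviate no greater than $D/2$.

```python
import math

# Values quoted in the text (five characters, vertebrae excluded).
COEFFS = (0.785, 0.577, 0.871, 0.234, 1.731)
MEAN_HUDSON = 74.103
MEAN_CONNECTICUT = 70.940
D2 = 3.163


def discriminant_score(x):
    """Y = sum of l_i * X_i for one fish's five counts."""
    return sum(l * xi for l, xi in zip(COEFFS, x))


def classify(x):
    """Assign a fish to the population whose mean Y lies on its side of the midpoint."""
    cutoff = (MEAN_HUDSON + MEAN_CONNECTICUT) / 2  # 72.52 when rounded
    return "Hudson" if discriminant_score(x) > cutoff else "Connecticut"


def prob_correct(d2):
    """Standard normal probability of a deviate no greater than D/2."""
    z = math.sqrt(d2) / 2
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical counts for a single fish (not taken from the data).
fish = (20, 15, 16, 15, 19)
print(round(discriminant_score(fish), 3), classify(fish))
print(round(prob_correct(D2), 3))
```

The difference between the two population means, 74.103 − 70.940 = 3.163, equals $D^2$, which is a convenient check on the arithmetic. With $D^2$ = 3.163 the sketch gives a probability of correct classification of a little over 0.81, so somewhat under one fish in five would be expected to be misclassified.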



