APPLICATIONS OF STATISTICAL ANALYSIS 19 



three sets of measurements we might make. Therefore the deviations 

 from the straight line are not significantly different from what would be 

 expected by chance alone. Thus the straight line is an adequate
 representation of the data.



In genetics, chi squared is usually used as in the following example. 

 The problem is whether there is a correlation between two properties: 

 hair color and eye color. It was found that, of 80 individuals, there 

 were 36 with blue eyes and light hair, 11 with brown eyes and light hair, 

 9 with blue eyes and dark hair, and 24 with brown eyes and dark hair. 

 Are the deviations from chance great enough to warrant the conclusion 

 that hair color and eye color tend to vary together? 



Note that there are 47 light-haired individuals (and therefore 33 with 

 dark hair). There are 45 blue-eyed people (and therefore 35 with brown 

 eyes). If the light-haired people had the same proportion of blue and 

 brown eyes as the entire population studied, then 45/80 of them would 

 have blue eyes. This is 45/80 × 47, or 26.43 people.



It is not necessary to compute any of the other classes of individuals
 independently. Because there must be a total of 47 light-haired people,
 theoretically there would be 47 - 26.43 = 20.57 light-haired people with
 brown eyes if eye color is distributed as in the entire group of people.
 Further, since there are 45 blue-eyed people in all, and 26.43 of them
 have light hair, the remaining 45 - 26.43 = 18.57 people should have dark
 hair. Finally, since there are 33 dark-haired people in all, the number
 of dark-haired, brown-eyed people should be 33 - 18.57 = 14.43.
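
The bookkeeping above can be checked with a short Python sketch. It is only an illustration, not part of the original text: the dictionary layout and the function name `expected_counts` are assumptions made for this example. Each expected count is taken as (row total × column total) / grand total, which is the same rule as the 45/80 × 47 calculation. Note that the unrounded value for light-haired, blue-eyed people is 26.4375, which the text quotes as 26.43.

```python
# Observed counts from the text, keyed by (hair color, eye color).
observed = {
    ("light", "blue"): 36,
    ("light", "brown"): 11,
    ("dark", "blue"): 9,
    ("dark", "brown"): 24,
}


def expected_counts(observed):
    """Expected count per cell = hair total * eye total / grand total."""
    total = sum(observed.values())
    hair_totals, eye_totals = {}, {}
    for (hair, eye), n in observed.items():
        hair_totals[hair] = hair_totals.get(hair, 0) + n
        eye_totals[eye] = eye_totals.get(eye, 0) + n
    return {
        (hair, eye): hair_totals[hair] * eye_totals[eye] / total
        for (hair, eye) in observed
    }


expected = expected_counts(observed)
for cell, value in expected.items():
    # Print unrounded values so they can be compared with the text's figures.
    print(cell, f"{value:.4f}")
```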



Chi squared is then computed as the sum of the four terms: 



\[
\chi^2 = \frac{(36 - 26.43)^2}{26.43} + \frac{(11 - 20.57)^2}{20.57}
       + \frac{(9 - 18.57)^2}{18.57} + \frac{(24 - 14.43)^2}{14.43}
       = 19.19.
\]
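
As a rough check on this arithmetic, the sum of (observed - expected)² / expected can be evaluated directly. The sketch below is illustrative only, and the function name `chi_squared` is an assumption of this example. It evaluates the sum twice: once with the rounded expected values quoted in the text, which gives about 19.2, and once with unrounded expected values, which gives about 19.17. The small difference comes from rounding the intermediate figures and does not affect the conclusion.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Order: light/blue, light/brown, dark/blue, dark/brown.
observed = [36, 11, 9, 24]

# Expected values as rounded in the text.
expected_rounded = [26.43, 20.57, 18.57, 14.43]

# Unrounded expected values: hair total * eye total / 80.
expected_exact = [47 * 45 / 80, 47 * 35 / 80, 33 * 45 / 80, 33 * 35 / 80]

chi2_rounded = chi_squared(observed, expected_rounded)
chi2_exact = chi_squared(observed, expected_exact)
print(f"with rounded expected values:   {chi2_rounded:.3f}")
print(f"with unrounded expected values: {chi2_exact:.3f}")
```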



The number of independent measurements, n (the number of degrees of
 freedom), is unity. We saw this directly, since as soon as we had
 computed one theoretical number (26.43 blue-eyed, light-haired people),
 the other three numbers were fixed. We can also readily see this by
 counting the quantities we have fixed: the total number of people (80),
 the total number of blue-eyed people (45), and the total number of
 light-haired people (47). These three constraints on the four
 measurements leave only one that can be chosen arbitrarily.
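
The counting argument can be made concrete with a small sketch. The function names here are hypothetical and chosen only for this example. The first function uses the standard generalization (rows - 1) × (columns - 1), which the text does not state but which reduces to 1 for a two-by-two table. The second shows that, once the three totals are fixed, a single count determines the other three.

```python
def degrees_of_freedom(n_rows, n_cols):
    """Independent cells in a table whose row and column totals are fixed."""
    return (n_rows - 1) * (n_cols - 1)


def complete_2x2(light_blue, light_total, blue_total, grand_total):
    """Fill in a 2x2 table from one cell and the three fixed totals."""
    light_brown = light_total - light_blue
    dark_blue = blue_total - light_blue
    dark_brown = grand_total - light_blue - light_brown - dark_blue
    return light_blue, light_brown, dark_blue, dark_brown


print(degrees_of_freedom(2, 2))
# One observed count plus the totals 47, 45 and 80 recovers the whole table.
print(complete_2x2(36, 47, 45, 80))
```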



Looking at our table, we see that for n = 1, the probability of getting
 a value of chi squared as large as 6.6 is already only 0.01. A value of
 19.19 corresponds to a probability much smaller than this. Therefore
 the deviations from the theoretical values can by no means be accounted
 for by chance fluctuations. Thus we must conclude that eye color and
 hair color do not vary independently, but are associated.
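
In place of a printed table, the probability for n = 1 can be computed directly. The sketch below is illustrative only, and the function name `chi2_tail_n1` is an assumption of this example. It relies on the fact that, for one independent measurement, chi squared is the square of a standard normal variable, so the probability of exceeding a value x equals erfc(sqrt(x / 2)). This shortcut holds only for n = 1.

```python
import math


def chi2_tail_n1(x):
    """Probability that chi squared with n = 1 exceeds x (valid for n = 1 only)."""
    return math.erfc(math.sqrt(x / 2.0))


# The tabulated value 6.6 should give a probability close to 0.01,
# and 19.19 a probability far smaller than that.
for value in (6.6, 19.19):
    print(value, chi2_tail_n1(value))
```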



