Correlation and Application of Statistics to Problems of Heredity 7 



all. It is not possible to say whether the observed "reversion" was due to

 the weight of a single seed not representing the true maternal character, to 

 the hypothesis of self-fertilisation not being correct or to other causes. 

 Theoretically the important point is that Galton reached linear regression 

 as a first feature of his correlation table. The next point Galton reached was 

 the homoscedasticity or equal variability of the arrays of daughter seeds 

 corresponding to a given mother seed*. "I was certainly astonished to find 

 the family variability of the produce of the little seeds to be equal to that of 

 the big ones; but so it was, and I thankfully accept the fact; for if it had

 been otherwise, I cannot imagine, from theoretical considerations, how the 

 typical problem could be solved" (p. 10). 
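
The two features named above, linear regression and equal variability of the arrays, can be stated compactly in modern notation. What follows is only a sketch under assumed symbols that do not appear in the original: x is the size of the mother seed, y that of a daughter seed, m_x and m_y are the population means, \sigma_x and \sigma_y the standard deviations, and r the correlation. For a normal bivariate distribution:

```latex
\begin{align}
  \mathbb{E}[\,y \mid x\,] &= m_y + r\,\frac{\sigma_y}{\sigma_x}\,(x - m_x), \\
  \operatorname{Var}(y \mid x) &= \sigma_y^{2}\,(1 - r^{2}).
\end{align}
```

The first line is a straight line in x, and the second does not depend on x at all, which is the homoscedasticity that Galton remarks on. The slope r\,\sigma_y/\sigma_x can be read as the counterpart of the "reversion" coefficient, and it reduces to r when \sigma_x = \sigma_y.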



The second logical stage in Galton's analysis is mathematical; he endeavours,

 assuming that the population is stable and is distributed normally,

 to find what relation must exist between the "reversion" coefficient and






* Thus far I have not been able to find Galton's data for the weights of sweet-peas in the 

 Galtoniana here. It is not easy, however, to find a special topic in the mass of note-books and 

 undated and unindexed papers. Quite possibly, however, he lent his measurements to somebody, 

 as he lent many series of observations to myself. It would be interesting to see exactly the 

 data from which he deduced the two fundamental principles of a normal bivariate distribution, 

 i.e. the straight-line regression and the equivariability of the arrays. Galton gives the correlation 

 table of filial and parental seeds in the Appendix, p. 226, of his Natural Inheritance for lengths 

 not weights. This shows that the mean length and variability of the parent seeds were arbitrarily

 chosen, there being 70 of each. Further, in the table the offspring seeds are modified to

 show 100 in each array. We do not know therefore the true means or standard deviations

 of either parental or offspring populations. This does not, however, affect the determination of 

 either means or standard deviations of arrays. I find in hundredths of an inch: 





My means do not agree with Galton's; possibly he found his before reducing his whole

 numbers to percentages. (It could not be by the distribution of the filial diameters "Under 15," 

 as this would tend, I think, to reduce all his means below mine.) He does not give his array 

 standard deviations nor the quartiles. However, on some such numbers as these Galton reached 

 his results. The array means are not incompatible with a straight-line relation; the standard 

 deviations suggest that the smaller parental seeds had offspring seeds of less variability than 

 those of the larger seeds, rather than equivariability being the rule. This view might be modified 

 if we knew the actual distribution of the filial seeds "Under 15." Many of these dwarf seeds I 

 suspect were abortions, as their lumping up at the tail of the arrays really prevents the latter 

 from being considered as "normal curves." Galton states (loc. cit. supra) that he had obtained 

 confirmatory results for the foliage and length of pod; this indicates that his experiments must 

 have been carried on for a second year, as he started only with the parental seed. 
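
The remark in this note that reducing each array to 100 does not affect the array means or standard deviations can be checked with a short sketch. The counts below are hypothetical and are not Galton's data (his table is not reproduced here); the names `array_stats`, `rescale_to_100`, `filial_sizes` and `table` are likewise illustrative assumptions.

```python
import math

def array_stats(values, counts):
    """Frequency-weighted mean and standard deviation of one array."""
    n = sum(counts)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    var = sum(c * (v - mean) ** 2 for v, c in zip(values, counts)) / n
    return mean, math.sqrt(var)

def rescale_to_100(counts):
    """Reduce an array of whole numbers to percentages summing to 100."""
    total = sum(counts)
    return [100 * c / total for c in counts]

# HYPOTHETICAL filial sizes (hundredths of an inch) and counts per parental size.
filial_sizes = [15, 16, 17, 18, 19]
table = {
    15: [8, 14, 12, 5, 1],
    18: [4, 10, 16, 12, 6],
    21: [2, 7, 13, 15, 9],
}

# Array statistics from the raw counts and from the percentage form.
raw = {p: array_stats(filial_sizes, c) for p, c in table.items()}
scaled = {p: array_stats(filial_sizes, rescale_to_100(c)) for p, c in table.items()}

for parent in table:
    mean, sd = raw[parent]
    print(f"parent {parent}: array mean {mean:.3f}, array s.d. {sd:.3f}")
```

Each array is multiplied by its own constant, so the weights within an array keep their proportions and the array mean and standard deviation are unchanged. The marginal totals, and with them the means and standard deviations of the whole parental and offspring populations, are what the rescaling destroys.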



