PROF. KARL PEARSON ON SKEW VARIATION. 
Now from this population we will take a large number m of samples of n 
individuals. If in each one of these samples we calculate the correlation, r, between 
two variates, then r will not be equal to the value of ρ in the sampled population, but 
the m samples will give a frequency curve for r, which is limited in range between 
+1 and −1 and is determined by n, the number of individuals in the sample, and by 
ρ, the correlation of the characters in the indefinitely large population sampled. We 
thus obtain a doubly infinite series of frequency distributions. The general theory of 
such distributions has been worked out by “Student” (‘Biometrika,’ vol. VI., p. 302, 
et seq.), Mr. H. E. Soper (Ibid., vol. IX., p. 91, et seq.), and Mr. R. A. Fisher (Ibid., 
vol. X., p. 507, et seq.). The actual forms of the frequency curves are not usually 
expressible by simple single functions, but the ordinates and the β₁, β₂ admit of 
numerical determination. The calculations are extremely laborious, but up to the 
present the members of my laboratory staff have calculated some 270 frequency 
curves with nearly 40 ordinates each for values of ρ ranging from 0 to 1, and of 
n from 2 to 400. The great bulk of these curves show no approach to normality. 
The values of β₁, β₂ range from points on the B-line down to infinity, the distributions 
contain concentrated blocks, U-shaped curves, J-shaped curves, rectangles, 
trapezoid-like forms and every variety of skewness in doubly limited range curves. 
Only in cases where n is very considerable and ρ is neither a positive nor a negative 
high correlation is there an approximation to the Gaussian. For a series of curves 
in which β₁ can be 5 and β₂ = 9, or both, if we will, ten times these amounts, 
it is idle to talk about the value of the Gaussian curve (β₁ = 0, β₂ = 3) in describing 
variation. These frequency curves can be actually obtained by experimental sampling, 
although the process is laborious, and indeed were so obtained in the first place.* 
They arise from observation and experiment. The remarkable point about them is 
that they illustrate all the types we have been discussing and justify sharp transitions 
in algebraic forms by showing that such transitions correspond to actual physical 
facts arising from experimental statistical data. The whole illustration, details of 
which will shortly be published, indicates the evil of implicit reliance on a classical 
theory. 
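The experimental sampling described above may be roughly imitated by computation. What follows is only a sketch under stated assumptions and not the procedure of the laboratory: the population is assumed to be a normal correlation surface with correlation ρ, pseudo-random numbers stand in for the drawing of individuals, and the function names (sample_r, r_distribution, betas) are hypothetical. The constants are taken in the usual sense, β₁ = μ₃²/μ₂³ and β₂ = μ₄/μ₂², with μ₂, μ₃, μ₄ the moments of the m values of r about their mean.

```python
import math
import random


def sample_r(n, rho, rng):
    """Return r for one sample of n pairs.

    Assumption: the population is bivariate normal with correlation rho,
    built as y = rho*u + sqrt(1 - rho^2)*v from independent unit normals u, v.
    """
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        xs.append(u)
        ys.append(rho * u + math.sqrt(1.0 - rho * rho) * v)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    # Guard against rounding carrying r a hair outside its limited range.
    return max(-1.0, min(1.0, r))


def r_distribution(m, n, rho, seed=0):
    """Return the m values of r from m samples of n individuals each."""
    rng = random.Random(seed)
    return [sample_r(n, rho, rng) for _ in range(m)]


def betas(values):
    """Return (beta_1, beta_2) from the moments about the mean.

    Note: mu_2 is zero when every value is the same (for instance rho = 1),
    in which case the betas are undefined and this function will fail.
    """
    m = len(values)
    mean = sum(values) / m
    mu2 = sum((v - mean) ** 2 for v in values) / m
    mu3 = sum((v - mean) ** 3 for v in values) / m
    mu4 = sum((v - mean) ** 4 for v in values) / m
    return mu3 ** 2 / mu2 ** 3, mu4 / mu2 ** 2


if __name__ == "__main__":
    # Illustrative choices of n and rho only; they are not the tabled cases.
    for n, rho in [(4, 0.0), (10, 0.8), (100, 0.0)]:
        b1, b2 = betas(r_distribution(5000, n, rho, seed=1))
        print(f"n={n:3d} rho={rho:.1f} beta1={b1:.3f} beta2={b2:.3f}")
```

With n small, or ρ high, the computed β₁, β₂ should fall far from the Gaussian point (0, 3); with n considerable and ρ near zero they should approach it. The values so found carry the random error of a finite m and are no substitute for the calculated ordinates.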
The Gaussian theory of error has, with great weight of authority, been applied to 
determine significant differences in statistical constants. The theory of the 
“probable error” must be justified in the case of each statistical constant to which it 
is applied. Psychologists have been busy discussing the differences found in mental 
correlations deduced from small samples on the basis of significance judged by the 
Gaussian theory of probable error. That theory has practically no application, as the 
“probable error” has really no meaning in the case of the bulk of the samples dealt 
with. Applications of the theory of probable error in other sciences than psychology 
to experimental results based on small samples will readily occur to the reader. The 
conclusions may be correct or incorrect, but they are unquestionably based on an 
* ‘Biometrika,’ vol. VI., pp. 305-7. 
