Jan. 12, 1924
Adjusting Yields for Soil Heterogeneity 
81 
difference in the conditions of soil productivity under which the two 
varieties were grown. When the number of replications is large in 
proportion to the number of strains compared, the correctness of the 
latter conclusion obviously may be warranted. As the number of replications
decreases relatively to the number of strains involved, however,
the justification for such a conclusion becomes constantly less. Thus, 
when 100 strains are systematically replicated 4 times, any 2 strains 
may not have been grown within 10 plats of each other in the whole 
experiment. 
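The arithmetic behind this remark can be sketched as follows. The layout is an assumption made only for illustration, since the text does not specify one: plats are numbered consecutively in a single line, and each strain occupies the same position within every replicate. The function name `min_plat_distance` is hypothetical.

```python
def min_plat_distance(strain_a, strain_b, n_strains=100, n_reps=4):
    """Smallest distance, in plats, between any plat of strain_a and any plat of strain_b.

    Assumed (hypothetical) layout: plats are numbered 0 .. n_strains * n_reps - 1
    in one line, and strain s occupies plats s, s + n_strains, s + 2 * n_strains, ...
    """
    plats_a = [strain_a + rep * n_strains for rep in range(n_reps)]
    plats_b = [strain_b + rep * n_strains for rep in range(n_reps)]
    return min(abs(a - b) for a in plats_a for b in plats_b)


# Strains far apart in the planting order never come close to each other,
# no matter how many replicates are grown under this layout.
print(min_plat_distance(0, 1))
print(min_plat_distance(0, 50))
```

Under this assumed layout, adding replicates does not bring two distant strains any nearer together, which is the difficulty the paragraph describes.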
THE COEFFICIENT OF CORRELATION 
When the relation between two variables is rectilinear, the coefficient 
of correlation measures their tendency to concomitant variation. A 
significant coefficient of correlation shows a relation between the variables, 
but tells nothing of the cause of that relation. One may be the cause or 
effect of the other, the relation may be purely mathematical, as between 
a product and its factors, or both variables may be affected by one or 
more common causes. Frequently, however, other information is available
on which to base interpretations. Thus, a positive correlation
between the yields of the adjacent odd and even rows in Experiment 2 
logically leads to the conclusion that the yields of both varieties were 
being influenced from row to row by variation in the soil conditions so 
gradual that when it affected one row, it also affected the adjacent 
row more or less. Similarly, systematic competition would tend to 
lower a positive correlation due to soil variation, and might even result 
in a negative correlation. 
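As a minimal sketch of how such a coefficient might be computed for adjacent rows, the following uses the usual product-moment formula. The yields are hypothetical values chosen only for illustration and are not data from Experiment 2; the function name `pearson_r` is likewise an assumption.

```python
import math


def pearson_r(x, y):
    """Product-moment coefficient of correlation between paired values x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical yields (arbitrary units) of adjacent odd and even rows.
odd_rows = [10.0, 12.0, 14.0, 13.0, 11.0]
even_rows = [9.0, 11.5, 13.0, 12.5, 10.0]

r = pearson_r(odd_rows, even_rows)
print(round(r, 3))
```

Here the two series rise and fall together, as they would under a gradual soil gradient, so the coefficient is strongly positive. The coefficient alone still says nothing about why the rows vary together.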
For a good discussion of the interpretation of data on the basis of 
correlation values plus such other information as to the relations as may 
be available, the reader is referred to the works of Sewall Wright.⁵,⁶
Letting X and Y represent the odd and even rows of Experiment 2, the 
significance of the coefficient of correlation is made more evident by its 
relation to the standard deviation of the differences. Thus, the standard 
deviation of the differences, X − Y, is given by the equation
$\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y - 2 r_{XY} \sigma_X \sigma_Y$. When $r = 0$, $\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$. Substituting
probable errors for standard deviations, the latter is the equation 
largely used for the probable error of a difference between means. This 
is correct in field experiments only when there is no correlation between 
the yields of the two items in the different replicates. It results in too 
small an error if there is a negative correlation and too large an error if r 
is positive. Competition thus tends to lower, and a gradual change in 
soil productivity tends to raise, the probable error computed in this way. 
It is the portion of the error due to the term $2 r_{XY} \sigma_X \sigma_Y$ in the formula
for the squared standard deviation that is eliminated by obtaining the 
successive differences between X and Y and determining their probable 
error directly. 
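The relation can be checked numerically with a short sketch. The yields are again hypothetical, population variances (divisor n) are used throughout so that the identity holds exactly, and the factor 0.6745 is the usual multiplier converting a standard deviation to a probable error. All variable names are assumptions made for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance (divisor n)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def covariance(x, y):
    """Population covariance (divisor n)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


# Hypothetical yields (arbitrary units) of adjacent odd (X) and even (Y) rows.
x = [10.0, 12.0, 14.0, 13.0, 11.0]
y = [9.0, 11.5, 13.0, 12.5, 10.0]

sd_x = math.sqrt(variance(x))
sd_y = math.sqrt(variance(y))
r = covariance(x, y) / (sd_x * sd_y)

# Variance of the successive differences X - Y, determined directly.
var_direct = variance([a - b for a, b in zip(x, y)])

# The same quantity from the full formula, and from the formula with r taken as 0.
var_formula = sd_x ** 2 + sd_y ** 2 - 2 * r * sd_x * sd_y
var_uncorrelated = sd_x ** 2 + sd_y ** 2

PE_FACTOR = 0.6745  # probable error = 0.6745 x standard deviation
pe_direct = PE_FACTOR * math.sqrt(var_direct)
pe_uncorrelated = PE_FACTOR * math.sqrt(var_uncorrelated)

print(var_direct, var_formula, var_uncorrelated)
print(pe_direct, pe_uncorrelated)
```

Because r is positive in this illustration, the formula that ignores correlation gives a larger probable error than the one determined directly from the differences.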
REGRESSION 
It is in connection with the theory of regression that the coefficient of 
correlation attains its importance in the present paper. Using X and Y 
as in the previous paragraph, the regression of X on Y is given by the 
expression $r_{XY} \dfrac{\sigma_X}{\sigma_Y}$. In other words, for each unit of deviation of Y from
5 Wright, Sewall. Correlation and causation. Jour. Agr. Research, v. 20, p. 557-585. 1921. Literature cited, p. 585.
6 Wright, Sewall. The theory of path coefficients. In Genetics, v. 8, p. 239-255, 8 fig. 1923. Literature cited, p. 255.
