130 MCEWEN AND MICHAEL. 



thus segregated, the nature and magnitude of the inherent error can
be found, and approximately eliminated.



Finally, the method of group averages involves no assumption as to
the nature of the correlation between the independent variables.
Consequently, the ascertained relation, say of w to x, is not influenced
by the way x may be correlated with y, z, etc., in the data used, and
values of w computed on the basis of these ascertained relations are
quite as reliable for any one particular combination of values of the
independent variables as for any other combination within the ranges
covered by the data. On the other hand, any method based upon
assumed forms of functional relations is likely to lead to erroneous
conclusions as to the relative importance of the various independent
variables unless the assumed forms fit the data well. Values of w
computed from results based upon unsuitable functional forms may
agree well with those observed when the relation between the
independent variables accords well with that occurring on the average
in the data from which the results were obtained, but computed and
observed values disagree widely when this relation differs materially
from that prevailing in the original data. In other words, the error
inherent in the initial assumption is so distributed among the various
independent variables as to give the best possible fit to the data as a
whole for the functions used, and this fit may seem accurate and still
be highly artificial.
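The point of the preceding paragraph may be illustrated by a small numerical sketch. The data below are wholly synthetic and are not taken from the paper: w is assumed to depend on x alone, as w = x^2, while y is given no influence whatever. In the "original" data y happens to rise with x; in the "new" data y falls with x over the same ranges. A linear form w = a + b*x + c*y is then fitted by ordinary least squares. All function and variable names are hypothetical, and the fit is a generic least-squares calculation, not the multiple linear correlation computation as carried out by the authors.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Bring the largest remaining entry of this column to the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear(xs, ys, ws):
    """Least-squares fit of w = a + b*x + c*y via the normal equations."""
    rows = [(1.0, x, y) for x, y in zip(xs, ys)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atw = [sum(r[i] * w for r, w in zip(rows, ws)) for i in range(3)]
    return solve_linear_system(ata, atw)


def rms_error(coef, xs, ys, ws):
    """Root-mean-square difference between observed and computed w."""
    a, b, c = coef
    sq = [(w - (a + b * x + c * y)) ** 2 for x, y, w in zip(xs, ys, ws)]
    return math.sqrt(sum(sq) / len(sq))


# Synthetic assumption: w depends on x only, and not linearly.
xs = [float(i) for i in range(11)]
ws = [x * x for x in xs]

ys_orig = [x * x / 10 for x in xs]      # original data: y rises with x
ys_new = [10 - x * x / 10 for x in xs]  # new data: y falls with x, same ranges

coef = fit_linear(xs, ys_orig, ws)
rms_orig = rms_error(coef, xs, ys_orig, ws)
rms_new = rms_error(coef, xs, ys_new, ws)

print("fitted a, b, c:", [round(v, 6) for v in coef])
print("rms error, original x-y relation:", round(rms_orig, 6))
print("rms error, altered x-y relation: ", round(rms_new, 2))
```

Under these assumptions the linear form reproduces the original data almost exactly, yet it does so by assigning the whole effect to y, which has no influence on w at all. When the relation between x and y is altered, the computed values depart from the true ones by amounts comparable with the whole range of w. The fit is accurate in appearance and artificial in substance.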



For example, application of the group method to the wheat problem
(see tables 7 and 8) shows clearly that the relation of yield to
temperature and precipitation was not linear, and indicates that
temperature was a much more important factor than precipitation.
Consequently, if those values of the yield corresponding to a narrow
temperature range and a wide range of precipitation be computed from
temperature data alone, the error should be nearly the same as when both
temperature and precipitation data are used. In eleven cases the
temperature lies between 63.7° F. and 65.0° F., while the precipitation
varies from 3.7 to 11.6 inches. The standard deviation of the
differences between observed values of the yield and those computed from
temperature data alone is 2.03 for the multiple linear correlation
method, and 1.96 for the slope method. Introducing the correction
for precipitation decreases the standard deviation in the first case
by only 0.01 and increases it in the second case by only 0.01, thus
agreeing with the expectation that no significant change would result.
But the standard deviation of the differences between observed and
computed yields for those data having a large temperature range and



