374 MR. S. M. JACOB ON 



Then the distribution is also normal and the standard deviation is given by $\sqrt{\sigma_v^2 + \sigma_e^2}$, the well-known result for the errors of the sum or difference of two uncorrelated variables.



Thus the probable deviation in the grouping of individuals about the determined regression point is $0.67449\sqrt{\sigma_v^2 + \sigma_e^2}$.
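The combination rule used here can be checked numerically: the standard deviations of two uncorrelated errors add in quadrature, and the probable deviation of a normal distribution is about 0.67449 times its standard deviation. The following is a minimal sketch only; the function names and the input values 3.0 and 4.0 are illustrative assumptions and do not come from the paper.

```python
import math
from statistics import NormalDist

# Probable-error factor: the 75th percentile of the standard normal,
# i.e. the deviation exceeded (in either direction) half the time; about 0.67449.
PROBABLE_ERROR_FACTOR = NormalDist().inv_cdf(0.75)


def combined_sd(sd_array: float, sd_regression: float) -> float:
    """Standard deviation of the sum (or difference) of two uncorrelated errors."""
    return math.hypot(sd_array, sd_regression)


def probable_deviation(sd_array: float, sd_regression: float) -> float:
    """Probable deviation about the determined regression point."""
    return PROBABLE_ERROR_FACTOR * combined_sd(sd_array, sd_regression)


# Arbitrary illustrative inputs (hypothetical, not data from the paper).
print(combined_sd(3.0, 4.0))
print(round(probable_deviation(3.0, 4.0), 4))
```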



Now assuming as before that the whole distribution, and not only the 'array' distributions, is normal, we have $\sigma_v = \sigma\sqrt{1 - r^2}$ and $\sigma_e = \dfrac{\sigma}{\sqrt{n}}\sqrt{1 - r^2}$.



So that the whole probable error is $0.67449\,\sigma\sqrt{(1 - r^2)\,\dfrac{n+1}{n}}$.



As already noted, when n is large this tends simply to the value $0.67449\,\sigma\sqrt{1 - r^2}$, but in the cases we are dealing with n is often as low as 20, and $\sqrt{1 - r^2}$ has to be increased by about 0.0247 of its value, or roughly $\tfrac{1}{40}$.
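The size of this small-sample correction can be verified directly. The sketch below assumes that the correction is the factor $\sqrt{(n+1)/n}$ obtained by adding the two variances; the function name `small_sample_factor` is illustrative and not part of the original paper.

```python
import math


def small_sample_factor(n: int) -> float:
    """Ratio of the whole probable error to its large-n limit: sqrt((n + 1) / n)."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return math.sqrt((n + 1) / n)


# The excess over 1 is the fraction by which sqrt(1 - r^2) must be increased.
for n in (20, 50, 100, 1000):
    excess = small_sample_factor(n) - 1.0
    print(f"n = {n:4d}: increase of about {excess:.4f} of the large-n value")
```

For n = 20 the excess is about 0.0247, in agreement with the figure quoted in the text, and it falls off quickly as n grows.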



Thus, with the previously adopted notation, the probable error in the prediction of a crop of standard deviation $\sigma_c$ is $E_c\sqrt{n+1}$.



As an example, the rainfall from April to August 1908 in Sialkot was 31.3 inches, so that, using the regression equation for the Kharif harvest, we find the probable unirrigated crop to be 56,200 acres approximately, with a probable error in excess or defect of about 7,000 acres.



Or, again, the April to August rainfall in 1908 at Zafarwal was 31.0 inches, which gives, on using the regression equation, a probable Kharif harvest of 52,000 acres for the whole of the Tahsil for unirrigated land, with a probable error of 6,200 acres.



Exactly the same process will apply to prediction based on any of the regression equations given in this paper; in every instance the given value of $E_c$ is multiplied by 4.58 approximately to obtain the probable error of the prediction.
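The multiplier 4.58 is consistent with $\sqrt{n+1}$ for n = 20, since $\sqrt{21} \approx 4.583$. The following is a minimal sketch of the scaling, assuming n = 20 observations as the text suggests for the cases considered; the function name and the sample value of $E_c$ are hypothetical and are not figures taken from the paper.

```python
import math


def prediction_probable_error(e_c: float, n: int = 20) -> float:
    """Probable error of a single prediction: E_c * sqrt(n + 1).

    e_c is the tabulated probable error E_c; n is the number of
    observations behind the regression (20 is assumed by default).
    """
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    return e_c * math.sqrt(n + 1)


# The multiplier itself, for n = 20.
print(round(math.sqrt(21), 2))

# Hypothetical E_c of 1,000 acres, chosen purely for illustration.
print(round(prediction_probable_error(1000.0)))
```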



The probable errors may seem large, but it must be remembered that in approaching the subject for the first time many refinements have to be neglected as beyond the scope of pioneer work. Some of these have already been referred to.



Even so some advance has been made. 1 



§ 7. The effect of errors of measurement on the correlation coefficient. 



It has already been pointed out that in treating the problems of the dependence 

 of the matured areas of crop on the rainfall, we are using statistics which are subject 

 to considerable errors which may be in part random and in part systematic. From 

 the standpoint of the present investigation the inaccuracies in the rainfall data are 

 small enough to be negligible, but this is not the case for the measurements of the 

 cropped areas, and it becomes important to determine the effect which such errors 

 would produce on the correlations. 



1 Since this paper was written I have seen in Dr. Shaw's British Association Address to Section A in 1908, that some 'interesting relations between the yield of barley and cool summers, and the yield of wheat and dry autumns' have been recently obtained, and this is being made the starting point by the Board of Agriculture for a 'general investigation of the relation between the weather and the crops which cannot fail to have important practical bearings.'



Note added 21-9-09: I have now received Mr. Hooker's paper. Mr. Hooker is the Head of the Statistical Branch of the Board of Agriculture in England, and it is clearly his investigation that Dr. Shaw refers to.



