354 MR. S. M. JACOB ON 



be to the 'centroid' of the area mentioned. Let $\bar{C}$, $\bar{R}$ denote the mean values of C and R, and $\sigma_C$, $\sigma_R$ the standard deviations of C and R respectively. Let r denote their correlation.



Then the equation determining the probable value of the matured crop in the given area, based on the total amount of rainfall assumed known, is

$$C - \bar{C} = \frac{\sigma_C}{\sigma_R}\, r\, (R - \bar{R}) \qquad (1)$$



This is known as the regression equation of C on R, or, as we shall call it, of crop on rain, and the quantity $\frac{\sigma_C}{\sigma_R}\, r$ is known as the regression coefficient.



This is the fundamental equation of linear regression for predicting the value 

 of one variable from a knowledge of the value of a second variable, when the means, 

 standard deviations, and coefficients of correlation of both variables are given. 
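
As a minimal sketch of how the regression equation of crop on rain is applied, the following Python function takes the five summary quantities (the two means, the two standard deviations, and the correlation r) and returns the probable crop for a known rainfall. The function name, argument names, and numerical figures are illustrative assumptions and are not taken from the paper.

```python
def regress_crop_on_rain(rain, mean_c, mean_r, sd_c, sd_r, r):
    """Probable crop C for a known rainfall R, via C - mean_c = (sd_c / sd_r) * r * (R - mean_r)."""
    regression_coefficient = r * sd_c / sd_r
    return mean_c + regression_coefficient * (rain - mean_r)


# Made-up figures for illustration only (hypothetical units).
predicted = regress_crop_on_rain(
    rain=30.0, mean_c=100.0, mean_r=25.0, sd_c=20.0, sd_r=5.0, r=0.5
)
print(predicted)  # 110.0
```

With these assumed figures the regression coefficient is 0.5 × 20 / 5 = 2, so a rainfall 5 units above its mean gives a probable crop 10 units above its mean.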



Similarly, the regression equation of R on C, or of rain on crop, is

$$R - \bar{R} = \frac{\sigma_R}{\sigma_C}\, r\, (C - \bar{C}) \qquad (2)$$



which is clearly a different equation from (1) above.
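
One way to see that the two regression equations differ is to compare their slopes on the same axes. The short Python sketch below does this with hypothetical figures; the function name and values are assumptions for illustration. The product of the two regression coefficients equals r squared, so the two lines coincide only when r is +1 or -1.

```python
def regression_coefficients(sd_c, sd_r, r):
    """Return the regression coefficients of C on R and of R on C."""
    c_on_r = r * sd_c / sd_r
    r_on_c = r * sd_r / sd_c
    return c_on_r, r_on_c


# Hypothetical standard deviations and correlation, for illustration only.
b_cr, b_rc = regression_coefficients(sd_c=20.0, sd_r=5.0, r=0.5)

# With R on the horizontal axis and C on the vertical axis, the line of
# C on R has slope b_cr, while the line of R on C has slope 1 / b_rc.
print(b_cr)         # 2.0
print(1.0 / b_rc)   # 8.0
print(b_cr * b_rc)  # 0.25, which is r squared
```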



In the present application, the latter equation (2) will not in general be of 

 much use as it will give us a means of finding out how much rain has probably fallen 

 when the extent of the crop dependent on it is known. But for the most part 

 this information will be valueless, as the amount of rainfall is already known. 

 An occasion for its use would occur where the record of rainfall for some past 

 period had been lost, and we wished to reconstruct its probable value from the known 

 amount of crop which succeeded that period. 



The values of C and R determined by equations (1) and (2) respectively are the most probable subject to the limitations of our statistics. If these statistics extend over a period of 'n' years, and if we are justified in assuming that the amounts and distributions of the rainfall and other circumstances, social and economic, which have an influence on agriculture, form a typical or 'random' sample of their values in the neighbouring years, then, with the already noted limitation as to the nature of the correlation, the probable error in the prediction of the 'regression value' of C, say, is

$$\frac{0.67449\, \sigma_C \sqrt{1 - r^2}}{\sqrt{n}} \qquad (3)$$
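
A brief numerical sketch of the probable error may help. It assumes the expression is read as 0.67449 times the standard deviation of C, times the square root of (1 - r squared), divided by the square root of n. The function name and the figures used are hypothetical.

```python
import math


def probable_error_of_regression_value(sd_c, r, n):
    """Probable error of the regression value of C, assuming the form
    0.67449 * sd_c * sqrt(1 - r**2) / sqrt(n)."""
    return 0.67449 * sd_c * math.sqrt(1.0 - r * r) / math.sqrt(n)


# Illustrative figures only: 25 years of records, r = 0.6, sd of crop = 20.
pe = probable_error_of_regression_value(sd_c=20.0, r=0.6, n=25)
print(round(pe, 4))
```

Under this reading the probable error shrinks as the number of years n grows and as the correlation r approaches +1 or -1.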



About the 'regression value' of C so determined the actual values of C will be scattered, and if the whole distribution be normal, the standard deviation of the distribution or the measure of the scattering about the true regression point is $\sigma_C \sqrt{1 - r^2}$. Hence as



