


LINEAR REGRESSION AND CORRELATION 



Ch. 7 



means of the line drawn into Figure 7.22. Some points lie above this line, some lie below, at distances whose magnitudes can be measured by the lengths of the vertical lines which could be drawn connecting the points of the scatter diagram to the regression line. In a useful sense, the goodness of fit achieved by any line drawn among the points to depict their trend should be measured somehow in terms of the amounts by which the proposed line misses the points of the scatter diagram.
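The idea of measuring how far a proposed line misses each point can be sketched in a few lines of Python. The five data pairs and the eye-fitted line below are hypothetical values chosen for illustration; they are not the 30 pairs plotted in Figure 7.22, and the function name `vertical_misses` is likewise an assumption of this sketch.

```python
# Hypothetical (X, Y) pairs; not the sample shown in Figure 7.22.
xs = [5.0, 6.0, 7.0, 8.0, 9.0]
ys = [11.0, 12.5, 13.0, 15.5, 16.0]


def vertical_misses(xs, ys, intercept, slope):
    """Signed vertical distance from each point to the line
    y = intercept + slope * x (positive means the point lies above the line)."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# A proposed trend line, with assumed constants as if drawn by eye.
misses = vertical_misses(xs, ys, intercept=4.5, slope=1.25)

above = sum(1 for d in misses if d > 0)   # points lying above the line
below = sum(1 for d in misses if d < 0)   # points lying below the line
magnitudes = [abs(d) for d in misses]     # lengths of the vertical connecting lines

print(misses)        # [0.25, 0.5, -0.25, 1.0, 0.25]
print(above, below)  # 4 1
```

Any measure of goodness of fit built from these amounts would start from a list like `misses`; how the amounts are to be combined is left to the method of determination itself.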



[Scatter diagram: vertical axis y; horizontal axis 16-week weight.]



Figure 7.22. A random sample of 30 pairs of observations from the population of Figure 7.21 and Table 7.21. Free-hand line to indicate the trend as it might appear to the eye ( ). Line determined by the method of least squares ( ).



It will facilitate the discussion to introduce some symbolism before presenting the specific methods to be used in the determination of the equation of the regression line. For a given value $X_i$ of the measurement $X$, let the corresponding value of $Y$ be called $Y_i$ if it was observed with $X_i$ when the sample was taken. It will be designated as $\hat{Y}_i$ if it is calculated from the equation of the regression line. Also, let the general linear equation relating $\hat{Y}_i$ and $X_i$ be written in the form

(7.21)    $\hat{Y}_i = a + b(X_i - \bar{x}),$*

where $a$ and $b$ are the two constants which must be determined in order to have a specific trend line for a particular scatter diagram.



* This form and the notation do not agree entirely with some other textbooks, but they are used here for convenience. The $(X_i - \bar{x})$ is $x_i$, so that the subsequent formulas and discussions come quite simply from this form of the equation for $\hat{Y}$. Some authors use letters other than $a$ and $b$; and several others use $b$ as herein, but their $a$ = (above $a$) $- b\bar{x}$.
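Equation (7.21) and the remark in its footnote can be checked numerically with a short Python sketch. The sample of X values and the constants `a` and `b` below are assumed for illustration only; nothing here determines them from data, and the names `y_hat` and `a_other` are hypothetical.

```python
# Hypothetical sample of X values and assumed constants, for illustration only.
xs = [5.0, 6.0, 7.0, 8.0, 9.0]
a = 13.5   # assumed value of the constant a in (7.21)
b = 1.25   # assumed value of the constant b in (7.21)

x_bar = sum(xs) / len(xs)  # sample mean of X


def y_hat(x_i):
    """Height of the trend line at x_i, using the centered form (7.21)."""
    return a + b * (x_i - x_bar)


fitted = [y_hat(x) for x in xs]

# Intercept in the uncentered form Y = a_other + b * X used by some other authors:
# their a equals (the a above) minus b times the mean of X.
a_other = a - b * x_bar
uncentered = [a_other + b * x for x in xs]

print(fitted)        # [11.0, 12.25, 13.5, 14.75, 16.0]
print(y_hat(x_bar))  # 13.5
print(a_other)       # 4.75
```

In the centered form the constant `a` is the height of the line at the mean of X, which is one reason later formulas simplify; the uncentered intercept `a_other` describes the same line and yields the same calculated values.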



