208 LINEAR REGRESSION AND CORRELATION Ch. 7 



PROBLEMS 



1. Obtain the linear equation whose graph fits the points of Figure 7.11B best
 in the sense of the method of least squares. Graph the line on the scatter diagram
 and indicate graphically those deviations whose sum of squares is the least possible
 for any straight line.



2. Do as in problem 1, for Figure 7.11A. Also compute $\sum (Y - \hat{Y})^2$.

Ans. $\hat{Y} = 2.16X + 4.54$; $\sum (Y - \hat{Y})^2 = 9.77$.



3. By what average amount would you expect Y to increase for a unit increase
 in X if the data corresponding to Figure 7.11A constitute a representative sample
 of some two-variable population?



4. Compare the $\sum (Y - \hat{Y})^2$ and $\sum (Y - \bar{y})^2$ for the data of Figures 7.11B and C.
 What conclusions can you draw?



5. Use the method of least squares to estimate for the data of Figure 7.11B the
 average value of Y for X = 1.5, 2.5, 3.5, and 4.5, respectively.



6. Make up two sets of 10 pairs of observations each and such that b is about 2
 in one set and about -3 in the other.



7. Write down the equation of a trend line with slope = 5 and for which Y =
 10 when X = 4. Graph this line, and then construct a scatter diagram which fits
 the trend and has $\sum (Y - \hat{Y})^2 = 50$.



8. Do as in problem 7, with slope = -3 and everything else the same.



9. Assign row and column numbers to the data in Table 7.21. Then draw
 two random samples of 30 pairs each, as in Table 7.22, and obtain the
 least-squares regression line from each sample. Plot these lines on their
 corresponding scatter diagrams and discuss their differences. (Round off each X
 to the nearest pound before doing your computations.)



10. "Cull" the flock of Table 7.21 at 16 weeks of age by eliminating all turkeys
 which weighed under 6 pounds at that time, then do as in problem 9. Would
 $\beta$ still be 0.5 for this population?
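
The computations these problems call for can be sketched in a few lines of Python. This is only an illustrative sketch: the five (X, Y) pairs are made up for the example and are not taken from Figure 7.11 or Table 7.21, and the function names are hypothetical. It obtains the least-squares intercept a and slope b, then compares the sum of squared deviations about the fitted line with the sum of squared deviations about the mean of Y.

```python
# Made-up data for illustration only (not from Figure 7.11 or Table 7.21).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [6.0, 9.0, 11.0, 14.0, 15.0]


def least_squares(xs, ys):
    """Return (a, b) for the least-squares line Y-hat = a + b*X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross products and sum of squares about the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: the line passes through (x_bar, y_bar)
    return a, b


def sum_sq_about_line(xs, ys, a, b):
    """Sum of squared deviations of each Y from the fitted value a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def sum_sq_about_mean(ys):
    """Sum of squared deviations of each Y from the mean of Y."""
    y_bar = sum(ys) / len(ys)
    return sum((y - y_bar) ** 2 for y in ys)


a, b = least_squares(xs, ys)
ss_line = sum_sq_about_line(xs, ys, a, b)
ss_mean = sum_sq_about_mean(ys)

print("intercept a =", round(a, 4), " slope b =", round(b, 4))
print("sum of squares about the line =", round(ss_line, 4))
print("sum of squares about the mean =", round(ss_mean, 4))
```

The sum of squares about the least-squares line can never exceed the sum of squares about the mean, since the horizontal line through the mean of Y is itself one of the straight lines over which the least-squares line is chosen.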



7.3 MEASUREMENT OF THE VARIATION ABOUT A LINEAR TREND LINE DETERMINED BY THE METHOD OF LEAST SQUARES



If measurements are taken on but one normally distributed variable,
 Y, the variability, or dispersion, of the $Y_i$ should be measured by
 means of the standard deviation about the mean, and estimated from
 $s_Y = \sqrt{\sum (Y_i - \bar{y})^2 / (n - 1)}$, because $s_Y^2$ is an unbiased and highly
 efficient sampling estimate of $\sigma_Y^2$. The variation measured by $s_Y$ is
 then considered to be sampling variation. However, if for each $Y_i$
 there is an associated measurement, $X_i$, and if the X's and Y's tend to
 be linearly related, not all the apparent variability among the $Y_i$
 should be assigned to mere sampling errors. Part of it can be accounted
 for in terms of the varying $X_i$ associated with the $Y_i$. For
 example, if Y tends to increase about 5 units for each unit increase in
 the magnitude of X, the Y associated with X = 10 is expected to be



