160 SAMPLING NORMAL POPULATIONS Ch. 6 



the sense that it has a relatively small variance from sample to sample. For example, suppose two methods of estimating μ each will produce an unbiased estimate but, over many samples, one has a variance of 100 whereas the other has a variance of only 25. The latter estimate obviously is more consistently near μ in size, and hence less allowance need be made for sampling error in this estimate. This second estimate would be considered a more efficient estimate than the one whose variance was 100.
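The idea can be illustrated with a small simulation, not drawn from the text: the sample mean and the sample median are both unbiased estimates of μ for a normal population, but over repeated samples the mean has the smaller variance and so is the more efficient of the two. The population values (μ = 50, σ = 10) and sample size are chosen only for illustration.

```python
import random
import statistics

# Illustrative sketch: compare two unbiased estimates of mu -- the
# sample mean and the sample median -- by their variance over many
# repeated samples from the same normal population.
random.seed(1)
mu, sigma, n, reps = 50.0, 10.0, 25, 4000

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

# Both estimates center on mu, but the sample mean varies less from
# sample to sample, so it is the more efficient estimate.
print(statistics.variance(means))    # near sigma**2 / n = 4
print(statistics.variance(medians))  # noticeably larger
```

Both averages come out close to 50, yet the medians scatter more widely about it, which is exactly the sense in which the lower-variance estimate is preferred.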



The estimate x̄ of μ, which already has been mentioned, and whose symbolic definition is

(6.31) x̄ = Σ(X)/n,



gives an unbiased and highly efficient estimate of μ. It has been pointed out earlier in a theorem that the variance of x̄ under repeated sampling is only one nth of the population variance, when a normal population is being sampled. (As a matter of fact, the variance of x̄ is σ²/n for any population if σ² is finite.) Hence x̄ is widely used as an estimate of μ.
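The theorem just cited can be checked numerically; this sketch (with arbitrary illustrative values μ = 0, σ = 6, n = 9) draws many samples of size n and compares the variance of the resulting sample means with σ²/n.

```python
import random
import statistics

# Illustrative check: under repeated sampling, the variance of the
# sample mean x-bar is sigma**2 / n.
random.seed(2)
mu, sigma, n, reps = 0.0, 6.0, 9, 5000

xbars = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
         for _ in range(reps)]

print(statistics.variance(xbars))  # close to sigma**2 / n = 4.0
print(sigma ** 2 / n)
```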



The variance σ² will be estimated by means of the formula



(6.32) s² = Σ(X − x̄)²/(n − 1).



This estimate is unbiased and is considered to be about as efficient as any estimate of σ² as long as the sample is not extremely small. The usefulness of this estimate in practice will be illustrated repeatedly in subsequent discussions.
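The unbiasedness of formula 6.32 can also be seen by simulation. The sketch below (population values μ = 10, σ = 4 and a small sample size n = 5 are chosen only for illustration) averages s² over many samples: with the (n − 1) divisor the average lands on σ², while dividing by n falls systematically short.

```python
import random
import statistics

# Sketch of why formula 6.32 divides by n - 1: averaged over many
# samples, the (n - 1) divisor matches sigma**2, while the divisor n
# underestimates it by the factor (n - 1)/n.
random.seed(3)
mu, sigma, n, reps = 10.0, 4.0, 5, 20000

s2_unbiased, s2_biased = [], []
for _ in range(reps):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    s2_unbiased.append(ss / (n - 1))  # formula 6.32
    s2_biased.append(ss / n)          # divisor n: biased low

print(statistics.mean(s2_unbiased))  # near sigma**2 = 16
print(statistics.mean(s2_biased))    # near 16 * (n - 1)/n = 12.8
```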



By comparison with the methods used in Chapter 2 to compute σ or σ², it is seen in formula 6.32 that two changes have been made. The μ is replaced by x̄ and the denominator is now (n − 1) instead of n. Logically, the x̄ must be used because μ is unknown; but it also must be recognized that the differences, (Xᵢ − x̄), are more dependent upon chance events which occur in the process of sampling than were the quantities, (Xᵢ − μ). The x̄ itself is subject to sampling error whereas the μ is a fixed number for a given population. This matter is taken into account in sampling theory. One step in this direction is to associate with each estimated variance a number of degrees of freedom. The estimate s² of formula 6.32 is said to be based on n − 1 degrees of freedom because only (n − 1) of the n differences (Xᵢ − x̄) are actually chance differences. This follows from the fact that Σ(X − x̄) = ΣX − Σx̄ = nx̄ − nx̄ = 0. Hence, given any n − 1 of



