122 SAMPLING FROM BINOMIAL POPULATIONS Ch. 5 



the sample will possess an attribute with probability of occurrence
= p for any specified future trial from the population is given by
the formula: $C_{n,r}\, p^{r} (1 - p)^{n-r}$.
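As a rough illustration, the formula can be evaluated directly. The sketch below assumes that $C_{n,r}$ denotes the number of combinations of n things taken r at a time and that r counts the sample members possessing the attribute (the opening of the sentence is cut off on this page). The function name `binomial_probability` is a hypothetical label, not one used in the text.

```python
from math import comb


def binomial_probability(n: int, r: int, p: float) -> float:
    """Evaluate C(n, r) * p^r * (1 - p)^(n - r).

    Assumed meaning: n = sample size, r = number of sample members with
    the attribute, p = probability of the attribute on any single trial.
    """
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Example: 3 members with the attribute in a sample of 10 when p = 0.5
print(binomial_probability(10, 3, 0.5))  # 0.1171875
```

Summing the function over r = 0, 1, ..., n should give 1, which is a convenient check on any hand computation.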



The point of view in the preceding paragraph is that of Chapter 4,
in which the size of p was assumed to be known. More commonly, p
is not known and we have only a sample estimate of its size. This
estimate is r/n, which varies under repeated sampling from 0 to 1.
Even though r/n is a variable quantity, useful and reliable conclusions
can be drawn from samples taken from a binomial population,
as will be shown shortly. Three types of such conclusions will be
considered in this chapter: (a) Given a sample, what can we say
about the size of p? (b) Given a sample from a binomial distribution,
how well does it agree with a predetermined hypothesis concerning
the magnitude of the p for that population? (c) Given two
random samples, did they probably come from the same binomial
population? The present section is concerned with question a.



When the true proportions of the two types of members of a
binomial population are not known, they can be estimated by means
of a sample, as suggested above. This estimation can take either
of two forms: (a) a point, or specific, estimate of p, which would be
used in lieu of the true p, or (b) an interval estimate, which would have
a preassigned probability of bracketing the size of p. This latter
process is called placing a confidence interval on p. The confidence
we can have that the bracket, or interval, does actually include the
unknown parameter is described by the confidence coefficient.



Statistical research indicates that the best point estimate of p is obtained
from $\hat{p} = r/n$, the observed fraction of the sample which possesses
the particular attribute that is being studied. Some of the reasons
for this decision are:



(5.21) The estimate $\hat{p} = r/n$ has an expected value
$E(\hat{p}) = E(r/n) = E(r)/n = np/n = p$ for any particular sample size, n.
That is, the long-run average size of $\hat{p}$ is exactly equal to the true
population parameter p. It is customary to call point estimates unbiased
estimates if their mathematical expectation is the parameter which is
being estimated. We generally prefer to employ unbiased estimates,
like $\hat{p}$, unless some more important property is missing.
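The unbiasedness stated in (5.21) can be checked numerically. The sketch below is only an illustration: it computes the expected value of r/n exactly by weighting each possible value of r/n with its binomial probability. The function name `expected_sample_fraction` and the trial value p = 0.3 are arbitrary choices, not taken from the text.

```python
from math import comb


def expected_sample_fraction(n: int, p: float) -> float:
    """Exact E(r/n): sum of (r/n) * P(r) over r = 0, 1, ..., n."""
    return sum(
        (r / n) * comb(n, r) * p**r * (1 - p) ** (n - r)
        for r in range(n + 1)
    )


# The result should agree with p (up to rounding) for every sample size n.
for n in (5, 20, 100):
    print(n, expected_sample_fraction(n, 0.3))
```

Because the sum runs over every possible value of r, the agreement with p does not depend on random sampling; only floating-point rounding separates the printed values from 0.3.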



(5.22) The estimate $\hat{p} = r/n$ has a variance of pq/n, where q = 1 - p,
because the variance of r is npq, as shown in Chapter 4, and the effect of
dividing r by n is to divide the variance by $n^2$, as was shown in the section
of Chapter 2 which dealt with the coefficient of variation. This vari-



