SELECTING THE BEST ONE OF SEVERAL BINOMIAL POPULATIONS 539 



choose one of the k processes and assert that it is best. If he allows himself the possibility of hedging and asserting that he needs further experimentation, then the problem changes and the tables of this paper are not appropriate.



The following additional assumptions will be made:



1. Observations from the same or different processes are independent. 



2. Observations from the same process have a common fixed probability of "success".



3. There is no chance of error in determining whether a success or a 

 failure has occurred. 



The assumption of a common probability fixed once and for all for each process is one that should be checked carefully in any practical application of the results in this paper. Roughly speaking, this assumption states that each of the processes is in a state of statistical control as far as the probability of success is concerned.



We shall consider only the case in which the same number n of observations are taken from each process. This is certainly reasonable for a single sample procedure if no a priori information is assumed.



STATISTICAL FORMULATION OF THE PROBLEM 



Each of k given binomial populations $\Pi_i$ is associated with a fixed probability of success $p_i$, where $0 \le p_i \le 1$ $(i = 1, 2, \ldots, k)$. For example, in the yield problem $p_i$ is the long-time yield for process $\Pi_i$, or the probability that any one unit from $\Pi_i$ is a good one. Let the ordered values of the $p_i$ be denoted by



$$p_{[1]} \ge p_{[2]} \ge \cdots \ge p_{[k]} \tag{1}$$



No a priori information is assumed about the values of the $p_i$ or about the correspondence between the ordered $p_{[i]}$ and the k identifiable populations $\Pi_i$. In particular, we have no idea before experimentation starts whether $p_{[1]}$ is associated with $\Pi_1$, $\Pi_2$, $\ldots$, or $\Pi_k$.



The problem is to select the population associated with $p_{[1]}$ on the basis of n observations from each population. If there are t ties for first place, say



$$p_{[1]} = p_{[2]} = \cdots = p_{[t]} > p_{[t+1]} \qquad (t < k) \tag{2}$$



then we shall certainly be content with the selection of any one of the associated t populations as the best one.
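The selection problem can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the decision rule used here (choose the population with the most observed successes, breaking ties at random) is assumed for illustration and is not stated in the text above, and the function names `select_best`, `is_correct_selection`, and `estimate_pcs` are hypothetical. A selection is counted as correct if the chosen population has the largest success probability, so any one of t tied best populations qualifies.

```python
import random


def is_correct_selection(p, chosen):
    """Return True if the chosen population has the largest success
    probability; any population tied for first place counts as correct."""
    return p[chosen] == max(p)


def select_best(successes, rng):
    """Assumed rule: pick the index with the most observed successes,
    breaking ties among the leaders at random."""
    top = max(successes)
    leaders = [i for i, s in enumerate(successes) if s == top]
    return rng.choice(leaders)


def simulate_once(p, n, rng):
    """Draw n independent success/failure observations from each
    population and apply the assumed selection rule."""
    successes = [sum(rng.random() < p_i for _ in range(n)) for p_i in p]
    return select_best(successes, rng)


def estimate_pcs(p, n, trials=2000, seed=0):
    """Monte Carlo estimate of the probability of a correct selection."""
    rng = random.Random(seed)
    hits = sum(
        is_correct_selection(p, simulate_once(p, n, rng))
        for _ in range(trials)
    )
    return hits / trials
```

For instance, `estimate_pcs([0.9, 0.8, 0.7], n=50)` gives a simulated estimate of how often the first population would be selected under these assumptions; the value depends on the seed and the number of trials.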



As an index of the true difference (or distance) between the best and 

 second best populations we introduce the symbol 



$$d = p_{[1]} - p_{[2]} \tag{3}$$
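As a minimal sketch, the index d can be computed by sorting the success probabilities in decreasing order and subtracting the second largest from the largest. The function names `ordered_probabilities` and `difference_index` are hypothetical, and the input values in the usage note are made up for illustration.

```python
def ordered_probabilities(p):
    """Return the success probabilities sorted from largest to smallest,
    so that element 0 is the best and element 1 the second best."""
    return sorted(p, reverse=True)


def difference_index(p):
    """Return d, the largest success probability minus the second largest."""
    if len(p) < 2:
        raise ValueError("at least two populations are required")
    ordered = ordered_probabilities(p)
    return ordered[0] - ordered[1]
```

With `p = [0.5, 0.75, 0.25]` the ordered values are 0.75, 0.5, 0.25 and `difference_index(p)` is 0.25. When two or more populations tie for first place, d is zero.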



