8 THE MATHEMATICAL TREATMENT OF DATA 



One problem with our analysis thus far is that we have asked the 

 mathematicians to tell us the distributions when we have many, many 

 sets of measurements. Suppose we don't have so many measurements. 

 After all, who feels like making more than, say, a half dozen, or maybe 

 ten, measurements, let alone an essentially infinite number? The answer 

 to our suppositional question is a surprisingly simple one. The

 mathematicians tell us that the difference comes in two places: in computing

 the standard deviation, and in computing the percentages given in our 

 reference table above. The computation of s changes only by substituting

 n - 1 for n in the denominator. This increases s, and the increase matters

 more the smaller n is. For instance, if we make only two measurements, the

 variance s² would double as a result of this correction (and s itself

 would grow by a factor of √2). Next, if

 we ask what the table percentages become, we have to realize that the 

 answer will depend on n — there will be a different answer for each value 

 of n. Thus we would need not just the simple table above, but a set of 

 tables. We will not present such a set, but tell you that you can find 

 them in any standard statistics book. But, in the spirit of the present 

 exposition of the subject, we can tell you that these tables aren't really 

 so necessary. In practice, if n is as large as 10, one makes little error in 

 using our own little reference table; if n is 20 or more, there is really no 

 error in our table. So, just by modifying the expression for s to 



$$ s = \sqrt{\frac{\sum (x_i - a)^2}{n - 1}} $$



we obtain most of the correction needed because of the small samples. 
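
The effect of the n - 1 correction can be checked numerically. The following is a minimal sketch in Python, not part of the original exposition: the function name std_dev, its small_sample flag, and the measurement values are all hypothetical, chosen only for illustration. It computes s both ways and prints the ratio of the two variances, which should be n / (n - 1).

```python
import math


def std_dev(values, small_sample=True):
    """Standard deviation of values about their average.

    small_sample=True divides the sum of squared deviations by n - 1;
    small_sample=False divides by n (the many-measurements formula).
    """
    n = len(values)
    average = sum(values) / n
    sum_sq = sum((x - average) ** 2 for x in values)
    divisor = n - 1 if small_sample else n
    return math.sqrt(sum_sq / divisor)


# Hypothetical measurements, invented for this illustration only.
two = [9.8, 10.2]
ten = [9.8, 10.2, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1, 9.9, 10.0]

for data in (two, ten):
    s_n = std_dev(data, small_sample=False)
    s_n1 = std_dev(data, small_sample=True)
    # Ratio of variances is n / (n - 1): 2 for n = 2, 10/9 for n = 10.
    print(len(data), round(s_n, 4), round(s_n1, 4), round((s_n1 / s_n) ** 2, 4))
```

With two measurements the variance ratio is 2, as stated in the text; with ten it has already fallen to about 1.11, which is why the correction matters less and less as n grows.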



Up to this point, we have been dealing with the symmetrical curve for 

 the distribution of deviations from the average. We have still to discuss 

 the asymmetrical distribution before giving some applications of the 

 analytical procedures which have been developed. 



THE ASYMMETRICAL DISTRIBUTION 



The Gaussian distribution of deviations dealt with the case of a large

 number of small sources of error. The asymmetrical distribution deals 

 with the case of having basically two alternative situations, one of which 

 is very unlikely. Indeed, the unlikely one is so improbable that it

 would be irrelevant were it not that there are so many

 cases in question that the total probability is not negligible. Take, for

 example, the case of distributing samples, each 1/100 ml, of a bacterial 

 suspension containing 10,000,000 cells per milliliter. The probability of 

 finding any individual bacterium in a particular sample is very small: 

 1/100 of 10,000,000, or 1/100,000. Yet, because of the large number of 

 bacteria in the sample, the probability of finding some bacteria in the 



