250 
STATISTICS: W. A. SHEWHART 
Proc. N. A. S.
icantly different from 0 and 3. If k and $\beta_2$ are different from 0 and 3, the
values of p, q, and n may be calculated since $pn = \bar{X}$ and $pqn = \mu_2$, where
$\bar{X}$ represents the average of the distribution, and a study of these values
will indicate the probable reason for such variations in k and $\beta_2$, provided
the distribution follows the random laws. If the observed distribution is
consistent with the calculated values of p, q, and n, the "probability of
fit" between the observed distribution and the theoretical distribution
representing the expansion of $(p + q)^n$ should be high.
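
The calculation of p, q, and n from the first two moments can be illustrated with a minimal sketch. It assumes p + q = 1, as is usual for the expansion of $(p + q)^n$; the function name `binomial_from_moments` and the numerical inputs are hypothetical and chosen only for illustration.

```python
def binomial_from_moments(mean, variance):
    """Solve pn = mean and pqn = variance for p, q, n.

    Assumes p + q = 1 (an assumption of this sketch), so that
    q = variance / mean, p = 1 - q, and n = mean / p.
    """
    if mean <= 0 or variance <= 0 or variance >= mean:
        # A binomial with 0 < p < 1 requires 0 < pqn < pn.
        raise ValueError("need 0 < variance < mean for a binomial fit")
    q = variance / mean
    p = 1.0 - q
    n = mean / p
    return p, q, n


# Illustrative (made-up) observed moments: average 4.0, second moment 2.4.
p, q, n = binomial_from_moments(mean=4.0, variance=2.4)
print(p, q, n)
```

In practice n obtained this way is rarely a whole number, and the size of the departure is itself part of the "study of these values" described above.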
In the case that the U's, V's and W's are variables, we start with unknown
values of p, q and n, and in addition have the unknown value of
$\Delta X$, which represents the effect of a single cause. It is obviously impossible
to determine the values of p, q and n as in the case of attributes, for among
other reasons we cannot determine the origin of the distribution. In
general, however, we may make use of the factors k and $\beta_2$, since these
are independent of the unit in which the measurements are made. Furthermore,
we may make use of the criterion, establishing the "goodness of
fit" between the observed distribution and a theoretical one corresponding
to the various degrees of approximation to the normal law.
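
The statement that k and $\beta_2$ do not depend on the unit of measurement can be checked numerically. The sketch below assumes the usual moment definitions, $k = \mu_3 / \mu_2^{3/2}$ and $\beta_2 = \mu_4 / \mu_2^2$, which are not spelled out in this passage; the data and the function name `moment_ratios` are hypothetical.

```python
def moment_ratios(values):
    """Return (k, beta2) from moments about the mean.

    Assumed definitions (not taken from the text):
    k = mu3 / mu2**1.5 and beta2 = mu4 / mu2**2.
    """
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((v - mean) ** 2 for v in values) / n
    mu3 = sum((v - mean) ** 3 for v in values) / n
    mu4 = sum((v - mean) ** 4 for v in values) / n
    return mu3 / mu2 ** 1.5, mu4 / mu2 ** 2


# Made-up measurements in one unit.
data = [2.0, 3.0, 3.0, 4.0, 4.0, 4.0, 5.0, 5.0, 9.0]
k, beta2 = moment_ratios(data)

# The same measurements after a change of unit and of origin.
rescaled = [100.0 + 2.54 * v for v in data]
k_scaled, beta2_scaled = moment_ratios(rescaled)

print(k, beta2)
print(k_scaled, beta2_scaled)
```

Both ratios agree before and after the change of unit up to rounding error, which is why they remain usable when the origin of the distribution is unknown.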
To return to equations (1) and (2), it is evident that, if none of the causes
are common, the correlation between simultaneous measurements of
$X_1$ and $X_2$ will be practically zero. Furthermore, as the number of causes
that are common increases, we should expect an increase in the correlation
coefficient under certain limiting conditions. Though it may not be possible
to determine the number of causes that influence a variable quantity,
such as either $X_1$ or $X_2$, it has been found possible to determine
the approximate value for the ratio of the number of common causes to
the total number of causes influencing either variable. To show this for
the general case assumed in equations (1) and (2), let us start with the definition
of the correlation coefficient r,

$$ r = \frac{\sum x_1 x_2}{N \sigma_{x_1} \sigma_{x_2}} \qquad (3) $$

where $x_1$ and $x_2$ represent the deviations from the mean values of $X_1$ and
$X_2$, $\sigma_{x_1}$ and $\sigma_{x_2}$ represent the standard deviations in these quantities,
and N represents the total number of observations made. Let us assume
independence between the primary causes. If we expand equations (1)
and (2) by Taylor's theorem and neglect the second and higher powers of
the variations of the primary causes, we can obtain values of the deviations
$x_1$ and $x_2$ and of the standard deviations $\sigma_{x_1}$ and $\sigma_{x_2}$ which, when substituted
in equation (3), reduce to the simple expression

$$ r = \frac{a}{a + b} \qquad (4) $$
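
The relation between the correlation coefficient and the share of common causes can be illustrated by a small simulation. This is only a sketch under assumptions not stated in the passage: a is taken to be the number of causes common to both variables, b the number of causes peculiar to each variable, and every cause is modelled as an independent, additive, unit-variance normal effect. The function names are hypothetical.

```python
import math
import random


def correlation(xs, ys):
    """Correlation coefficient: sum(x1 * x2) / (N * sigma_x1 * sigma_x2),
    where x1 and x2 are deviations from the respective means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sigma_x = math.sqrt(sum(d * d for d in dx) / n)
    sigma_y = math.sqrt(sum(d * d for d in dy) / n)
    return sum(u * v for u, v in zip(dx, dy)) / (n * sigma_x * sigma_y)


def simulate(a, b, n_obs=20000, seed=0):
    """Simulate two variables sharing `a` causes, each with `b` causes of its own.

    Assumption of this sketch: every cause is an independent N(0, 1) effect
    and the effects add linearly.
    """
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n_obs):
        common = sum(rng.gauss(0.0, 1.0) for _ in range(a))
        own1 = sum(rng.gauss(0.0, 1.0) for _ in range(b))
        own2 = sum(rng.gauss(0.0, 1.0) for _ in range(b))
        x1.append(common + own1)
        x2.append(common + own2)
    return correlation(x1, x2)


# Three common causes and one cause peculiar to each variable.
r = simulate(a=3, b=1)
print(r, 3 / (3 + 1))
```

Under these assumptions the covariance equals a and each variance equals a + b, so the simulated coefficient should lie close to a / (a + b), with the usual sampling scatter.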
