Miscellanea 
unknown values of the sampled population. Thus (α) and (β) provide ultimately the same $\chi^2$,
but the probable error of $\chi^2$ and the mean value of $\chi^2$ will be different in the two cases. In the
first case we vary our marginal totals with the sample as they obviously would vary in practice.
In the second case we define our $\chi^2$ to be a deviation from the independent probability of an
artificial population; we do not keep the marginal totals of the sample fixed any more than in (α).
But if we think in terms of [...] (and not [...]) we appear to do so because ultimately we have to take
our marginal probabilities as those of the sample in default of a knowledge of any better values. 
This point seems to me well illustrated in what my critic in the Journal of the Royal 
Statistical Society has to say on p. 90 of his paper about Messrs Greenwood and Yule's use of $\chi^2$
for a fourfold table. He asserts that they ought to have entered the table of goodness of fit with 
n' = 2. The problem before them was whether their fourfold tables could possibly be samples of 
bi-variate independent probability distributions. Each sample from such a distribution would
have perfectly free cell frequencies $m_{11}$, $m_{12}$, $m_{21}$, $m_{22}$ subject to the sole binding condition that
$$m_{11} + m_{12} + m_{21} + m_{22} = M.$$
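The proper $\chi^2$ is given by
$$\chi^2 = \frac{\left(m_{11} - \dfrac{m_{1\cdot}\,m_{\cdot 1}}{M}\right)^2}{\dfrac{m_{1\cdot}\,m_{\cdot 1}}{M}}
+ \frac{\left(m_{12} - \dfrac{m_{1\cdot}\,m_{\cdot 2}}{M}\right)^2}{\dfrac{m_{1\cdot}\,m_{\cdot 2}}{M}}
+ \frac{\left(m_{21} - \dfrac{m_{2\cdot}\,m_{\cdot 1}}{M}\right)^2}{\dfrac{m_{2\cdot}\,m_{\cdot 1}}{M}}
+ \frac{\left(m_{22} - \dfrac{m_{2\cdot}\,m_{\cdot 2}}{M}\right)^2}{\dfrac{m_{2\cdot}\,m_{\cdot 2}}{M}} \qquad (\gamma),$$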
and this has three degrees of freedom and is what Messrs Yule and Greenwood desired to find, 
and they properly used the value of $P$ for $n' = 4$.
Then like the astronomer, who finding the probable error of his mean to be $0.67449\,\sigma/\sqrt{n}$, and
not knowing the $\sigma$ of his sampled population, puts it equal to the $\sigma$ of his observations, so
Messrs Yule and Greenwood very properly replaced the marginal totals of their unknown
population by those of their sample, but very properly did not replace $n' = 4$ by $n' = 2$!
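The procedure just described can be sketched numerically. The following is only an illustration under stated assumptions: the cell counts are hypothetical (they are not Greenwood and Yule's data), the function names are invented here, and the tail probabilities use the standard closed forms for one and three degrees of freedom, corresponding to $n' = 2$ and $n' = 4$ respectively. Expected frequencies are built from the sample marginal totals, standing in for the unknown population ones.

```python
import math


def chi_square_fourfold(m11, m12, m21, m22):
    """Chi-square of a fourfold table, with expected cell frequencies
    formed from the sample marginal totals (an assumption standing in
    for the unknown marginal totals of the sampled population)."""
    M = m11 + m12 + m21 + m22
    rows = (m11 + m12, m21 + m22)
    cols = (m11 + m21, m12 + m22)
    observed = ((m11, m12), (m21, m22))
    chi2 = 0.0
    for p in range(2):
        for q in range(2):
            expected = rows[p] * cols[q] / M
            chi2 += (observed[p][q] - expected) ** 2 / expected
    return chi2


def p_one_df(chi2):
    """Upper-tail probability of chi-square with 1 degree of freedom (n' = 2)."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def p_three_df(chi2):
    """Upper-tail probability of chi-square with 3 degrees of freedom (n' = 4)."""
    return p_one_df(chi2) + math.sqrt(2.0 * chi2 / math.pi) * math.exp(-chi2 / 2.0)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    chi2 = chi_square_fourfold(30, 20, 10, 40)
    print("chi-square:", chi2)
    print("P for n' = 4:", p_three_df(chi2))
    print("P for n' = 2:", p_one_df(chi2))
```

For any positive value of $\chi^2$ the probability read off for $n' = 4$ exceeds that for $n' = 2$, which is why the choice between the two entries of the table matters.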
But says my critic*, if they had, they would have got the same measure of improbability as if 
they had compared the difference of percentages! Quite so, and obviously so; for in taking
percentages they have actually fixed their marginal totals taking 100 of each class and thus for
the first time confined their attention to a limited class of samples, not the random sample of
size $M$, which has not its marginal totals fixed. We have, indeed, reduced our degrees of freedom
by two in taking ratios. 
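The agreement between the $n' = 2$ entry and the comparison of percentages can be checked on a small example. This is a sketch under assumptions not stated in the text: the difference of the two row proportions is divided by a standard error based on the pooled proportion, the counts are hypothetical, and the function names are invented for the illustration. Under those assumptions the squared standardised difference coincides with $\chi^2$ computed from the cells.

```python
import math


def chi_square_from_cells(a, b, c, d):
    """Chi-square of the table [[a, b], [c, d]], summed cell by cell,
    with expected frequencies taken from the sample marginal totals."""
    M = a + b + c + d
    cells = ((a, a + b, a + c), (b, a + b, b + d),
             (c, c + d, a + c), (d, c + d, b + d))
    total = 0.0
    for observed, row_total, col_total in cells:
        expected = row_total * col_total / M
        total += (observed - expected) ** 2 / expected
    return total


def z_difference_of_proportions(a, b, c, d):
    """Standardised difference between the proportions a/(a+b) and c/(c+d).
    The pooled-proportion standard error is an assumption of this sketch.
    Degenerate tables (pooled proportion 0 or 1) are not handled."""
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    pooled = (a + c) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se


if __name__ == "__main__":
    table = (30, 20, 10, 40)  # hypothetical counts
    z = z_difference_of_proportions(*table)
    chi2 = chi_square_from_cells(*table)
    print("z squared:", z * z)
    print("chi-square:", chi2)
    # Two-sided normal probability and the one-degree-of-freedom tail agree.
    print("P from z:", math.erfc(abs(z) / math.sqrt(2.0)))
    print("P for n' = 2:", math.erfc(math.sqrt(chi2 / 2.0)))
```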
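When we consider generally the $\chi^2$ of a fourfold table to measure the improbability of a
sample we are really comparing the special sample
$$\begin{array}{c|c|c}
a & b & a + b \\ \hline
c & d & c + d \\ \hline
a + c & b + d & M
\end{array}
\qquad \text{with} \qquad
\begin{array}{c|c|c}
a' & b' & a' + b' \\ \hline
c' & d' & c' + d' \\ \hline
a' + c' & b' + d' & M
\end{array}$$
the general population, where in the latter case $a'd' = c'b'$.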
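Now the mean square contingency of the first of these tables is
$$\begin{aligned}
&\frac{1}{M}\left\{
\frac{\left(a - \dfrac{(a+b)(a+c)}{M}\right)^2}{\dfrac{(a+b)(a+c)}{M}}
+ \frac{\left(b - \dfrac{(a+b)(b+d)}{M}\right)^2}{\dfrac{(a+b)(b+d)}{M}}
+ \frac{\left(c - \dfrac{(a+c)(c+d)}{M}\right)^2}{\dfrac{(a+c)(c+d)}{M}}
+ \frac{\left(d - \dfrac{(c+d)(b+d)}{M}\right)^2}{\dfrac{(c+d)(b+d)}{M}}
\right\} \\
&\quad = \frac{(ad - bc)^2}{M^2}\left\{
\frac{1}{(a+b)(a+c)} + \frac{1}{(a+b)(b+d)} + \frac{1}{(a+c)(c+d)} + \frac{1}{(c+d)(b+d)}
\right\} \\
&\quad = \frac{(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)}
\end{aligned}$$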
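The reduction of the mean square contingency to a single fraction can be verified with exact rational arithmetic. The sketch below is illustrative only: the function names and the test tables are hypothetical, and the marginal totals are assumed to be non-zero. It computes the cell-by-cell sum divided by $M$ and compares it with $(ad - bc)^2 / \{(a+b)(a+c)(b+d)(c+d)\}$.

```python
from fractions import Fraction


def mean_square_contingency_from_cells(a, b, c, d):
    """Sum over the four cells of (observed - expected)^2 / expected,
    divided by M, with expected = (row total)(column total) / M."""
    M = a + b + c + d
    cells = ((a, a + b, a + c), (b, a + b, b + d),
             (c, c + d, a + c), (d, c + d, b + d))
    total = Fraction(0)
    for observed, row_total, col_total in cells:
        expected = Fraction(row_total * col_total, M)
        total += (observed - expected) ** 2 / expected
    return total / M


def mean_square_contingency_closed_form(a, b, c, d):
    """Closed form (ad - bc)^2 / ((a+b)(a+c)(b+d)(c+d))."""
    return Fraction((a * d - b * c) ** 2,
                    (a + b) * (a + c) * (b + d) * (c + d))


if __name__ == "__main__":
    # Hypothetical tables; the second satisfies ad = bc (independence).
    for table in ((30, 20, 10, 40), (2, 4, 3, 6), (7, 1, 5, 9)):
        lhs = mean_square_contingency_from_cells(*table)
        rhs = mean_square_contingency_closed_form(*table)
        print(table, lhs, rhs, lhs == rhs)
```

Using `Fraction` avoids rounding error, so equality of the two expressions is tested exactly rather than to a tolerance.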
* Loc. cit. p. 90.
