G. D. Maynard 
As, therefore, using corrected deaths introduces "spurious correlation" into the
value of p, and as using uncorrected deaths will introduce a correlation due to 
the age factor, the values of p calculated, both with and without the use of a 
correction factor, are given*. The symbol p' will be used to denote values found 
from corrected deaths. The necessity for correcting for age is not of such
importance when comparing cancer with suicides or insanity.
The partial correlation coefficients found for cancer and diabetes are as
follows:
40 Cities, 1900-1904, p = .6896 ± .0559, p' = .7325 ± .0494.
15 States, 1906, p = .9088 ± .0303, p' = .8258 ± .0554.
These values were sufficiently striking to lead one to further consider the 
matter. 
From figures obtained from Vital Statistics (Vol. I.) U.S.A. Census, 1900, 
I calculated correction factors for the cancer and diabetes death-rates of the various 
races residing in the Registration States, and classified according to birth-place 
of mother. The correlation thus obtained for these two diseases is certainly 
significant:
p = .8609 ± .0552, p' = .5442 ± .1501.
That these values are not due to errors of random sampling is shown by 
their probable errors, for the odds are some millions to one against such an 
occurrence. Nor does it seem likely that they are entirely due to "spurious
correlation" in the values of p' and to an entirely different source in the case
of the values of p.
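The probable errors quoted for the 40 cities and the 15 States can be checked with a short calculation. The sketch below assumes the usual probable error formula for a correlation coefficient, 0.6745 (1 - r²) / √n; the text does not state which formula was used, but this one reproduces the printed figures to four decimal places. The function name and the constant name are illustrative only.

```python
import math

# 0.6745 is (approximately) the upper quartile of the standard normal
# distribution; it converts a standard error into a "probable error".
PROBABLE_ERROR_FACTOR = 0.6745


def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n observations.

    Assumed formula (not stated in the text): 0.6745 * (1 - r^2) / sqrt(n).
    """
    return PROBABLE_ERROR_FACTOR * (1.0 - r * r) / math.sqrt(n)


# (label, number of observations, coefficient) as quoted in the text.
reported = [
    ("40 cities, p ", 40, 0.6896),
    ("40 cities, p'", 40, 0.7325),
    ("15 States, p ", 15, 0.9088),
    ("15 States, p'", 15, 0.8258),
]

for label, n, r in reported:
    pe = probable_error(r, n)
    # The ratio r / PE indicates how many probable errors the
    # coefficient lies away from zero.
    print(f"{label}: r = {r:.4f}, PE = {pe:.4f}, r/PE = {r / pe:.1f}")
```

For these four values the coefficient is many times its probable error, which is the sense in which random sampling is ruled out as an explanation.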
To what then are these correlations due? It seemed possible that the agreement
of rate might be due to certain cities having more efficient registration,
although the remarks of the Registrar did not lend much support to this theory. 
He writes: "The 'registration area' — that is to say, the States having laws, the 
results of whose operation have been accepted as giving practically complete 
mortality returns, together with the cities in non-registration States where deaths 
are satisfactorily registered under local authority — remains substantially the same 
from 1900 to 1905. The geographic distribution is shown in the accompanying 
map" (Mortality Statistics 1905, U.S.A.: see our Fig. 3).
I have already referred to the special returns of deaths made by the Census 
Department in 1900. The small value of the Coefficient of Variation (2.225 %)
in the "percentage error of registration" is, I think, sufficient evidence that the
correlation values are not due to errors in registration. If these high correlations
were due to varying efficiency in registration, then other diseases might be 
expected to show a similar value when correlated with cancer. It will be seen 
further on that there is no significant correlation between cancer or diabetes 
* [I believe I have found a satisfactory method of making both age and population corrections free 
from spurious correlation. This method will shortly be published. Ed.]
Biometrika VII 37
