438 MEMOIRS NATIONAL ACADEMY OF SCIENCES. [vol. xv, 



between deviations of sampling of these same two coefficients, a quantity which is nearly as 

 large as the correlation of the two variables, whose correlations with the third are being com- 

 pared. In this case the correlation is between measurements by tests 4 and 7. The value 

 used in the calculation of the probable error just given is taken as 0.50, which is probably too 

 low. 



When we consider the difference between correlations of tests 2 and 8 with ratings in Com- 

 pany F group it is evident, without actually calculating the probable error, that the difference 

 is clearly significant of a certain kind of individuality of the ratings. Not only is the absolute 

 value of the difference between coefficients, 0.209, much greater than in the case first

 discussed, but the correlation between tests 2 and 8 is greater, approximately 0.80. This latter 

 condition would make the probable error smaller than in the previous case, other things being 

 equal. 
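
The dependence of the probable error on the correlation between the two coefficients may be illustrated by a minimal sketch. It assumes the classical normal-theory expressions: the probable error of a coefficient taken as 0.6745(1 - r^2)/sqrt(n), and the probable error of a difference taken as the square root of PE_a^2 + PE_b^2 - 2*rho*PE_a*PE_b, where rho is the correlation between the sampling deviations of the two coefficients. The function names, the coefficients 0.60 and 0.40, and the sample size of 50 are hypothetical and are not taken from the tables of this study; only the values 0.50 and 0.80 for rho echo the text.

```python
import math

PE_FACTOR = 0.6745  # probable error = 0.6745 x standard error (normal theory)


def probable_error_r(r, n):
    """Classical probable error of a correlation coefficient r from n cases."""
    return PE_FACTOR * (1.0 - r * r) / math.sqrt(n)


def probable_error_difference(r_a, r_b, n, rho):
    """Probable error of r_a - r_b when their sampling deviations correlate rho."""
    pe_a = probable_error_r(r_a, n)
    pe_b = probable_error_r(r_b, n)
    variance = pe_a ** 2 + pe_b ** 2 - 2.0 * rho * pe_a * pe_b
    return math.sqrt(max(variance, 0.0))


if __name__ == "__main__":
    # Hypothetical coefficients and sample size, for illustration only.
    r_a, r_b, n = 0.60, 0.40, 50
    for rho in (0.0, 0.50, 0.80):
        pe = probable_error_difference(r_a, r_b, n, rho)
        print(f"rho={rho:.2f}  PE of difference={pe:.4f}")
```

Under these assumptions the probable error of the difference shrinks as rho rises, which is the sense in which a value of 0.50, if too low, overstates the probable error.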



Without going deeper into the detailed (and idle) calculation of probable errors, since the 

 samples are too small to justify greater refinement of this sort, it seems safe to conclude that 

 the evidence indicates qualitative variability of ratings. Or, to state the fact differently, the 

 set of correlation coefficients for the eight alpha tests for a group of individuals rated by a par- 

 ticular officer, is an indirect qualitative as well as quantitative analysis of that officer's judg- 

 ments of the intellectual ability of the individuals he rated. Several sets of such correlation 

 coefficients for several different officers (supposing no qualitative variation in the groups rated 

 by them) constitute a qualitative and quantitative basis of comparison of these several officers' 

 methods of estimating intelligence. The variability of the system of coefficients for the eight 

 alpha tests would undoubtedly be greater if the intercorrelations were lower. 



If the foregoing interpretation of the results obtained up to this point be the correct one, it 

 follows that the use of subjective intelligence ratings in estimating the relative diagnostic values 

 of a set of tests does not insure the best possible results. Within the limits set by the inter- 

 correlations a set of tests can be made to measure a variety of aspects of ability by appropri- 

 ate adjustment of weights. If the intercorrelations are high, the number of possible types of 

 measurement is small, and vice versa. But even with the relatively high intercorrelations 

 shown by the alpha tests, it is evident that multiple regression equations obtained from the 

 different groups that have been discussed above would be quite different. 
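
The relation between the height of the intercorrelations and the variety of measurements obtainable by weighting can be sketched as follows. The sketch assumes standardized test scores and an idealized matrix in which every pair of tests has the same intercorrelation; the correlation between two weighted composites with weights w and v is then w'Rv / sqrt(w'Rw * v'Rv). The two weightings and the intercorrelation values are hypothetical, chosen only to show the tendency, and are not the alpha intercorrelations.

```python
import math


def composite_correlation(w, v, R):
    """Correlation between two weighted sums of standardized tests.

    w, v: lists of weights; R: intercorrelation matrix with unit diagonal.
    """
    def quad(a, b):
        return sum(a[i] * R[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))
    return quad(w, v) / math.sqrt(quad(w, w) * quad(v, v))


def uniform_matrix(k, r):
    """k x k matrix with 1 on the diagonal and r elsewhere (an idealization)."""
    return [[1.0 if i == j else r for j in range(k)] for i in range(k)]


# Two hypothetical weightings of eight tests that share no test in common.
w = [1, 1, 1, 1, 0, 0, 0, 0]
v = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    for r in (0.2, 0.5, 0.8):
        c = composite_correlation(w, v, uniform_matrix(8, r))
        print(f"intercorrelation={r:.1f}  correlation of composites={c:.3f}")
```

Even two composites built from entirely different tests agree closely when the intercorrelations are high, so that differently weighted equations then measure nearly the same thing; with low intercorrelations the composites diverge.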



There might be, however, a resultant of the ratings by a large number of individuals that 

 would constitute the ideal estimate of intelligence. This ideal measurement would, of course, 

 necessarily be ideal by statistical definition, since no other definition is available. In order to 

 obtain a reliable comparison of several tests as to their diagnostic efficiency a moderate number 

 of estimates of intelligence of tested individuals would be needed from a large number of differ- 

 ent persons, not a large number of estimates of intelligence by a relatively small number of 

 persons, for the probable errors of the results would arise mainly not from the number of cases 

 tested and rated, but from the number of cases furnishing the ratings. In practice such a 

 scheme would meet another difficulty that is shown by the material of the present study. In- 

 spection of Table 81 indicates that, as far as the ratings of Camps Beauregard and MacArthur 

 were concerned, they were class ranks, rather than absolute measures. In the Camp Meade 

 material there are only two companies which were examined exclusively by alpha as a first 

 examination, or by alpha following beta. They were Companies B and K, and the following 

 comparison of median scores and mean ratings is of interest:

                  Median          Mean
                  total score.    rating

 Company B        55.9            5.066
 Company K        55.             4.175



Although the differences are both in the same direction, they are out of all proportion to 

 each other, and support rather than contradict the opinion that the apparent correlation is 

 purely accidental. Furthermore, it seems a wholly gratuitous assumption that even the defi- 

 nitions and instructions furnished with the rating scale would enable the rating officers to do 

 more than give class ranks when dealing with groups as nearly alike as the score distributions 

 indicate them to be. It follows, therefore, that if we place all individuals rated "6," for ex- 



