632 



SCIENCE 



[N. S. Vol. XXXVIII. No. 983 



pected in view of the fact stated at the beginning that mathematical
grades are no more accurate than any other grades. The marks of the
second mathematics instructor are so close, not because it was
mathematics that he was grading, but because this instructor had a
purely mechanical method of grading, of deducting so many points for
each kind of error. But this does not mean that his grades were more
accurate or just. Another instructor might with perfect justice deduct
either more or less for the same kind of error. All that it means is
that this instructor was able by means of his mechanical method to
match his own marks fairly closely. Furthermore, we must not infer that
the other instructors had graded their papers carelessly either the
first or the second time, or both times. As a matter of fact, each
question had been graded in both markings of all papers except the
second and third group of psychology papers and the English papers. And
these are not essentially different from the rest. The results, while
obtained from only seven instructors (more were not available for the
purpose), are quite representative and reliable, as any one familiar
with statistical methods can determine from the above data. Results
from twice or three times as many persons would not be materially
different.

[Table III.] Average of all the differences 4.4 points.

We may eliminate one further factor from Table III., namely, the
difference due to a change in an instructor's standard after an
interval of time. This may be eliminated by weighting the second set of
marks by the difference between the averages of the two markings.
Without giving these weighted values in a separate table it will be
sufficient to say that the average difference thus computed is 3.5 as
compared with the average difference of 4.4 in Table III., or in terms
of mean variation, 1.75 and 2.2, respectively.
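The adjustment described above, weighting the second set of marks by the difference between the averages of the two markings, can be sketched roughly as follows. This is only an illustration under stated assumptions: the marks are invented, the "weighting" is read as an additive shift that brings the second marking to the same average as the first, and the mean variation is taken as half the average difference, which is the relation the quoted figures suggest (4.4 and 2.2, 3.5 and 1.75). The function names are hypothetical.

```python
from statistics import mean

def average_difference(first, second):
    """Mean absolute difference between two markings of the same papers."""
    return mean(abs(a - b) for a, b in zip(first, second))

def shift_second_marking(first, second):
    """Assumed reading of the 'weighting': add the difference between the
    two averages to every mark of the second set."""
    offset = mean(first) - mean(second)
    return [b + offset for b in second]

def mean_variation(avg_diff):
    """Assumed relation: mean variation is half the average difference."""
    return avg_diff / 2

# Hypothetical marks (NOT the article's data): the same five papers
# graded twice by one instructor.
first = [78, 85, 62, 90, 71]
second = [74, 80, 63, 84, 69]

raw = average_difference(first, second)
adjusted = average_difference(first, shift_second_marking(first, second))

print("average difference, raw:     ", round(raw, 2))
print("average difference, adjusted:", round(adjusted, 2))
print("mean variation for the article's 4.4 and 3.5:",
      mean_variation(4.4), mean_variation(3.5))
```

With these invented marks the adjusted difference comes out smaller than the raw one, as it does in the article's figures; the sketch does not show that this must hold for every set of marks.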



Of the four factors stated at the outset, each contributes the
following amount to the total variation: The general mean variation or
probable error of grades assigned by teachers in different schools is
5.4 points. The mean variation of grades assigned by teachers in the
same department and institution is 5.3. The mean variation of the
latter, after eliminating the effect of high or low personal standards,
is 4.3. The mean variation of grades assigned at different times by the
same teachers to their own papers is 2.2. Hence the largest factors are
the second, third and fourth. The fourth contributes 2.2 points, the
third 2.1 points, the second 1.0 point and the first practically
nothing toward the total of 5.4 points of mean variation.
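The contributions quoted above appear to come from successive subtraction of the four mean variations. A minimal sketch of that arithmetic, assuming each factor's share is the drop in mean variation when that factor is removed (the function name and labels are hypothetical; the four figures are those given in the text):

```python
from decimal import Decimal

def factor_contributions(mean_variations):
    """Differences between successive mean variations; whatever remains
    at the last stage is credited to the final factor."""
    values = [Decimal(v) for v in mean_variations]
    drops = [a - b for a, b in zip(values, values[1:])]
    return drops + [values[-1]]

# Mean variations from the text, from the broadest comparison to the
# narrowest: different schools; same department and institution; the
# same after removing personal standards; same teacher at two times.
stages = ["5.4", "5.3", "4.3", "2.2"]

contributions = factor_contributions(stages)
for label, points in zip(["first", "second", "third", "fourth"], contributions):
    print(label, "factor:", points, "points")
print("total:", sum(contributions))
```

Decimal arithmetic is used only so that the one-decimal figures subtract exactly. On this reading the first factor's "practically nothing" corresponds to 5.4 less 5.3, and the four shares add back up to the total of 5.4 points.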



Now what do all these results mean? How small divisions on our scale
are practically usable? As a question of psychological methodology the
units of any scale of measurements, if a single measurement with the
scale is to have objective validity, should be of such a size that
three fourths of all the measurements of the same quantity shall fall
within the limits of one division of the scale. For



