A condition of nearly complete overlap was found in vertebral counts from
samples of herring off Vancouver Island by Tester (1949). He compared five
areas using large samples and found highly significant statistical
differences among them. The greatest difference, that between the
northernmost and southernmost areas, showed overlap by summation of the
smaller percentages of



$$p^{*} = \frac{.57 + 21.22 + 60.35 + 11.43 + .24 + .03}{2} = 46.9\%.$$
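The summation can be sketched in Python. This is a minimal illustration and not Tester's own computation: the function names `smaller_percentages` and `overlap_by_summation` are hypothetical, and the only data used are the six smaller percentages quoted above.

```python
def smaller_percentages(pct_a, pct_b):
    """Return, class by class, the smaller of two percentage frequencies.

    pct_a and pct_b map a class (for example a vertebral count) to the
    percentage of that sample falling in the class. A class missing from
    one sample is treated as 0 percent there.
    """
    classes = sorted(set(pct_a) | set(pct_b))
    return [min(pct_a.get(c, 0.0), pct_b.get(c, 0.0)) for c in classes]


def overlap_by_summation(smaller):
    """Overlap in percent: half the sum of the smaller percentages."""
    return sum(smaller) / 2.0


# The six smaller percentages quoted in the text for the northernmost
# and southernmost areas.
tester_smaller = [0.57, 21.22, 60.35, 11.43, 0.24, 0.03]
p_star = overlap_by_summation(tester_smaller)
print(f"{p_star:.1f}%")  # 46.9%
```

On this scale two identical distributions give 50 percent, and two distributions with no class in common give 0 percent.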

A very similar result comes from the difference between the means divided
by the pooled standard deviation,



$$D = \frac{51.943 - 51.830}{.639} = .177$$

$$\frac{D}{2} = .088$$

$$p = .465$$
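The same arithmetic can be checked with a short Python sketch. It assumes that p is the area of the standard normal curve lying beyond D/2; that reading is an inference from the figures given here, not a formula quoted from this section, and the function name `overlap_from_means` is hypothetical.

```python
import math


def overlap_from_means(mean_a, mean_b, pooled_sd):
    """Return (D, D/2, p) for two sample means and a pooled standard deviation.

    Assumption: p is the upper-tail area of the standard normal
    distribution beyond D/2.
    """
    d = abs(mean_a - mean_b) / pooled_sd
    half_d = d / 2.0
    # Upper-tail area of the standard normal distribution beyond half_d.
    p = 0.5 * math.erfc(half_d / math.sqrt(2.0))
    return d, half_d, p


d, half_d, p = overlap_from_means(51.943, 51.830, 0.639)
print(f"D = {d:.3f}, D/2 = {half_d:.3f}, p = {p:.3f}")
# D = 0.177, D/2 = 0.088, p = 0.465
```

Under the same assumption, D = 2 gives p of about .16.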



An excellent method of graphical presentation of morphological
relationships has been proposed by Hubbs and Hubbs (1953). This is followed
in figure 3 for the two sets of data discussed above. The bars show all of
the pertinent statistics from the two samples: the mean ($\bar{x}$), twice
the standard error of the mean ($2s_{\bar{x}}$), the standard deviation
($s$), and the range. These values provide a ready comparison, for when the
hollow bars just meet ($D = 2$), a .16 level of overlap $p$ (the summed
smaller percentages equal 32 percent) is indicated, and when the solid bars
just meet, the means are just about significantly different. The solid bar
also indicates approximately the 95-percent fiducial limits of the mean.
Thus, as the authors point out, the measures of reliability and the
measures of dispersion are both indicated.
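The quantities drawn for each bar can be computed as in the sketch below. It assumes that the hollow bar extends one standard deviation on each side of the mean and the solid bar two standard errors on each side, which is how the paragraph above is read here. The function names and the two small samples are hypothetical and serve only as illustration.

```python
import math
import statistics


def hubbs_bar(sample):
    """Summary statistics for one bar of a Hubbs and Hubbs style diagram."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample standard deviation
    se = s / math.sqrt(n)          # standard error of the mean
    return {
        "mean": mean,
        "range": (min(sample), max(sample)),
        "hollow": (mean - s, mean + s),            # assumed: one s each side
        "solid": (mean - 2 * se, mean + 2 * se),   # assumed: two s.e. each side
    }


def bars_meet_or_overlap(bar_a, bar_b, kind):
    """True if the given bars ('hollow' or 'solid') touch or overlap."""
    (lo_a, hi_a), (lo_b, hi_b) = bar_a[kind], bar_b[kind]
    return lo_a <= hi_b and lo_b <= hi_a


# Hypothetical vertebral counts, for illustration only.
north = hubbs_bar([51, 52, 52, 52, 53, 52, 51, 52])
south = hubbs_bar([52, 52, 51, 52, 52, 53, 52, 51])
print(bars_meet_or_overlap(north, south, "solid"))
```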



Since both the standard deviation ($s$) and the standard error
($s_{\bar{x}}$) are based on the normal probability function, it is
necessary to assume that the data are normally distributed, and if precise
comparisons are needed, the variances of the two samples to be compared
must not differ from each other more than would be expected by chance. In
other words, if the null hypothesis is used, it is the hypothesis that the
two samples are randomly drawn from the same normal population.



Such an assumption, even though not proved true, will not invalidate our
use of the method. This matter of non-normality is one which has bothered
all statisticians; Cochran (1953:22) gives a good discussion and in general
says that no completely safe rules have been found, but that in many highly
non-normal distributions the distribution of the means tends toward
normality as the sample size increases. In small samples from moderately
skewed distributions, empirical studies have shown that the statistics
depart only a negligible amount from normality. However, this is a problem
which each taxonomist will want to explore with regard to the data with
which he is working.
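One way to explore this for a given kind of data is a small simulation. The sketch below is an illustration only and is not taken from Cochran: it draws repeated samples from an exponential population (a strongly skewed distribution chosen here as an assumption) and reports the skewness of the sample means for several sample sizes. The function names, the seed, and the number of replicates are arbitrary choices.

```python
import random
import statistics


def skewness(values):
    """Moment coefficient of skewness (0 for a symmetric distribution)."""
    m = sum(values) / len(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)


def skewness_of_means(sample_size, replicates=2000, seed=1):
    """Skewness of the means of repeated samples from an exponential population."""
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        draws = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(sum(draws) / sample_size)
    return skewness(means)


# The skewness of the means should shrink toward 0 as sample size grows.
for n in (2, 10, 50):
    print(n, round(skewness_of_means(n), 2))
```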



Another problem, more frequently encountered, is the one of different
variance. Here again, if only approximate results are needed it may be
ignored; if precise results are necessary then tests and corrective
formulas may be found in many statistical texts. However, as will be
discussed later, if heterogeneous variance is present it may be evidence
that population differences exist or that sampling methods have been
inadequate or faulty.
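As one example of the kind of test found in statistical texts, the sketch below forms the variance ratio (F) of two samples with the larger variance in the numerator. The choice of this particular test is an assumption made here for illustration, and the function name `variance_ratio` is hypothetical. The resulting F and its degrees of freedom would then be compared with a tabled critical value.

```python
import statistics


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator), larger variance on top."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical measurements, for illustration only.
f_value, df_num, df_den = variance_ratio(
    [1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0]
)
print(f_value, df_num, df_den)  # 4.0 4 4
```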



OVERLAP OF MEASURED CHARACTERS 



Only a simple extension of the method of determining overlap of counted
characters by the difference between the means is required to determine the
overlap of measured characters if regression analysis has been used.
Instead of a mean we use a mean value for the character estimated from the
regression line at a given body length at or near the grand mean length.
Instead of the pooled standard deviation we may use the pooled standard
deviation from the regression 4/ lines. Formula (5) becomes



$$D = \frac{\hat{y}_1 - \hat{y}_2}{s_{y \cdot x}}, \qquad (8)$$
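A minimal Python sketch of this calculation follows. It assumes ordinary least-squares lines fitted separately to each group, and a pooled standard deviation from regression obtained by adding the two residual sums of squares and dividing by the combined degrees of freedom (n1 + n2 - 4). Both choices, and the function names, are assumptions made for illustration and are not quoted from the text.

```python
import math


def fit_line(x, y):
    """Least-squares line y = a + b*x; returns (a, b, residual sum of squares)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def regression_overlap_d(x1, y1, x2, y2, length):
    """D from two regression lines evaluated at a common body length."""
    a1, b1, rss1 = fit_line(x1, y1)
    a2, b2, rss2 = fit_line(x2, y2)
    # Assumed pooling: residual sums of squares added, each line
    # contributing n - 2 degrees of freedom.
    pooled_sd = math.sqrt((rss1 + rss2) / (len(x1) + len(x2) - 4))
    estimate_1 = a1 + b1 * length   # character estimated from line 1
    estimate_2 = a2 + b2 * length   # character estimated from line 2
    return abs(estimate_1 - estimate_2) / pooled_sd
```

The `length` argument would be set at or near the grand mean body length of the two samples.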



4/ The standard deviation from regression is also frequently called the
standard error of estimate. However, we refer here to the distribution of
individuals around a line, and standard error in another usage refers to a
distribution of means around a grand mean. Hence we avoid the latter term.






