Pennington Estimating the mean and variance from highly skewed marine data 






tervals for $p$ and for the mean (or median) of the lognormally distributed nonzero values. For example, if $(p_L, p_U)$ is a 95% confidence interval for $p$ and $(L, U)$ is a 95% interval for the mean (or median) of the lognormal component (see, e.g., McConnaughey and Conquest, 1992), then $(p_L L,\ p_U U)$ will have a confidence level of at least 90% ($0.95 \times 0.95 \approx 0.90$).
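As a rough illustration of the bound just described, the following Python sketch multiplies the endpoints of the two intervals and reports the product of their confidence levels. The function name and the numeric endpoints are hypothetical placeholders chosen for this example; they are not values from any survey.

```python
def combined_interval(p_int, mean_int, level_p=0.95, level_mean=0.95):
    """Combine an interval for p with one for the lognormal mean.

    Returns the product interval (p_L * L, p_U * U) and the lower bound
    on its confidence level stated in the text (product of the levels).
    """
    p_lo, p_hi = p_int
    m_lo, m_hi = mean_int
    lower_bound_level = level_p * level_mean
    return (p_lo * m_lo, p_hi * m_hi), lower_bound_level


# Hypothetical endpoints, for illustration only.
interval, level = combined_interval((0.40, 0.60), (5.0, 12.0))
print(interval, round(level, 4))
```

The returned level is only a lower bound: the true coverage of the product interval may be higher.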



Examples 



There are two types of data sets that are typical for marine abundance surveys. The first type has a single large catch that can be many times larger than the next biggest catch. This huge catch may account for more than 50% of the total catch taken during the survey. In the second and more common type, the distribution of catches is also highly skewed, but no isolated large catch dominates the total catch. These are the basic types of data sets that would be expected if samples are taken from a highly skewed lognormal distribution.
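To see how both kinds of data set can arise from one distribution, the sketch below draws repeated samples from a lognormal distribution and records how much of each sample's total is due to its largest value. The parameters (sample size 50, log-scale mean 0, log-scale standard deviation 2), the seed, and the number of replicates are assumptions made for illustration only.

```python
import random


def max_catch_share(n, mu, sigma, rng):
    """Share of the sample total contributed by the single largest value."""
    catches = [rng.lognormvariate(mu, sigma) for _ in range(n)]
    return max(catches) / sum(catches)


rng = random.Random(1)  # fixed seed so the run is repeatable
shares = [max_catch_share(50, 0.0, 2.0, rng) for _ in range(2000)]

# Fraction of simulated surveys in which one catch exceeds half the total.
dominated = sum(s > 0.5 for s in shares) / len(shares)
print(f"samples dominated by one catch: {dominated:.2f}")
```

Under these assumed parameters, some simulated samples are dominated by a single value while most are not, which mirrors the two types described above.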



Isolated large catches 



Occasionally, a very large value can occur when samples are drawn from a lognormal distribution. The first example (Table 1) is an artificial data set generated from a lognormal distribution with $\mu = 0$ and $\sigma^2 = 4$. The mean of the distribution is 7.4 and its variance is 2,926. Because of one large point in the sample, the estimates, $\bar{x} = 38.8$ and $s^2 = 63{,}320$, are much larger than the true values. The estimated standard error of the sample mean based on the sample variance is 35.6 [$= (63{,}320/50)^{1/2}$].
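The true values quoted above can be checked with the standard lognormal moment formulas, mean $= \exp(\mu + \sigma^2/2)$ and variance $= \exp(2\mu + \sigma^2)(\exp(\sigma^2) - 1)$. The sketch below assumes $\mu = 0$, which is the value that reproduces the stated mean of 7.4 and variance of 2,926 when $\sigma^2 = 4$; the function name is hypothetical.

```python
import math


def lognormal_mean_var(mu, sigma2):
    """Mean and variance of a lognormal with log-scale mean mu and variance sigma2."""
    mean = math.exp(mu + sigma2 / 2.0)
    var = math.exp(2.0 * mu + sigma2) * (math.exp(sigma2) - 1.0)
    return mean, var


n = 50
mean, var = lognormal_mean_var(0.0, 4.0)   # mu = 0 is an assumption, see lead-in
se_expected = math.sqrt(var / n)           # standard error under the true variance
se_sample = math.sqrt(63320 / n)           # standard error from the sample variance

print(round(mean, 1), round(var), round(se_expected, 2), round(se_sample, 1))
```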



The sample estimates of the logged values are $\bar{y} = 0.175$ and $s^2 = 3.921$. Hence the estimates of the mean and variance from the minimum variance unbiased estimators are [Equations 1 and 2, $m = n = 50$]



$$c = \exp(0.175)\, g_{50}(1.961) = 7.6$$



and 



$$d = \exp(0.350)\left[\, g_{50}(7.842) - g_{50}\!\left(\tfrac{48}{49} \times 3.921\right) \right] = 1.42 \times (922.83 - 34.07) = 1261,$$



which are much closer to the true values than are the ordinary sample estimates. The estimate of the standard error of the sample mean using $d$ is 5.0 [$= (1261/50)^{1/2}$] as compared with an estimate of 35.6 based on the sample variance. The expected standard error of the sample mean (when $n = 50$) is 7.6 [$= (2926/50)^{1/2}$].
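The arithmetic for $c$ and $d$ can be reproduced approximately with a short script. The sketch below assumes the usual series form of the function, $g_m(x) = 1 + \frac{(m-1)x}{m} + \sum_{j \ge 2} \frac{(m-1)^{2j-1} x^j}{m^j (m+1)(m+3)\cdots(m+2j-3)\, j!}$, which this section does not restate and which should be checked against the definition accompanying Equations 1 and 2. The function name `g` and the stopping rule are choices made for this example. Because the inputs are the rounded values 0.175 and 3.921, the results may differ slightly in the last digits from those reported.

```python
import math


def g(m, x, tol=1e-12, max_terms=500):
    """Series g_m(x), summed term by term (assumed form, see lead-in).

    Successive terms satisfy
        term_j = term_{j-1} * (m - 1)^2 * x / (m * (m + 2j - 3) * j),
    starting from term_0 = 1.
    """
    if m < 2:
        raise ValueError("m must be at least 2")
    term = 1.0
    total = 1.0
    for j in range(1, max_terms):
        term *= (m - 1) ** 2 * x / (m * (m + 2 * j - 3) * j)
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


n = 50
ybar, s2 = 0.175, 3.921  # mean and variance of the logged values

c = math.exp(ybar) * g(n, s2 / 2.0)
d = math.exp(2.0 * ybar) * (g(n, 2.0 * s2) - g(n, (n - 2) / (n - 1) * s2))
se_c = math.sqrt(d / n)  # standard error of the sample mean based on d

print(round(c, 1), round(d), round(se_c, 1))
```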



The estimated variance of $c$ is given by (Equation 4)



