Gerritsen and McGrath: Precision estimates and suggested sample sizes for length-frequency data 



precision of a length-frequency sample as the mean precision over the entire size range. However, it appears that this approach has not been used to establish an optimum sample size. Such mean precision estimates over the entire size range might be used to obtain a rule-of-thumb for the sample size required to achieve a certain level of precision for the catch at each location. In the present study we aim 1) to determine a rule-of-thumb for obtaining an appropriate sample size when the number of fish available in a particular sample exceeds the number that can be measured at a reasonable cost, and 2) to examine the sample sizes that have been taken in the past, in the absence of such guidance.



Materials and methods 



Data were used from the Irish Groundfish survey, which was carried out on RV Celtic Explorer in the waters around Ireland during October and November 2005. The catch was sorted into species and, if appropriate, into size grades, each of which was treated as a separate length sample. Length measurements were taken from all fish and squid species that were caught. If the number of individuals in a sample was large, a subsample was taken by repeatedly transferring the sample from each fish box into two other boxes and discarding one of these. This method ensures that the entire catch is represented uniformly in the subsample. At the time of the survey, the samplers did not have any particular guidance on the appropriate size for a subsample; they used their own judgment to decide on the sample size.
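The halving procedure described above can be illustrated with a short sketch. It assumes, for illustration only, that every split is exactly even, so that each round retains half of the fish; the function names and the example numbers are hypothetical and do not come from the survey.

```python
def halvings_needed(total_fish, target_fish):
    """Smallest number of halving rounds after which the expected
    subsample size is no larger than target_fish.

    Assumes each round splits the catch into two equal boxes and
    discards one of them (an idealisation of the procedure).
    """
    if total_fish < 0 or target_fish <= 0:
        raise ValueError("total_fish must be >= 0 and target_fish > 0")
    rounds = 0
    remaining = float(total_fish)
    while remaining > target_fish:
        remaining /= 2.0  # one box of the two is discarded
        rounds += 1
    return rounds


def retained_fraction(rounds):
    """Fraction of the original catch left after the given number of rounds."""
    return 0.5 ** rounds


# Hypothetical example: 1200 fish in the catch, at most 200 can be measured.
rounds = halvings_needed(1200, 200)
print(rounds, retained_fraction(rounds), 1200 * retained_fraction(rounds))
```

Because the retained fraction can only take the values 1/2, 1/4, 1/8 and so on, the achieved subsample size is generally somewhat smaller than the target rather than equal to it.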



The precision of the number of observations in each length class of a random sample can be estimated by assuming a multinomial distribution (Smith and Maguire, 1983). If the precision in each length class is expressed in the form of a coefficient of variation (CV), an overall measure of precision can be obtained by weighting each CV by the number of fish in each length class. This mean weighted CV (MWCV) provides a description of the precision over the entire range of size classes in a length-frequency distribution.



Under the assumption of a multinomial distribution, the standard deviation ($\sigma_i$) of the number of fish in a sample that are in length category $i$ can be estimated by

$$\sigma_i = \sqrt{n p_i (1 - p_i)}, \tag{1}$$

where $n$ = the total number of fish in the sample; and
$p_i$ = the proportion of the sample that is in length category $i$.

The coefficient of variation (CV) of the number of fish at length $i$ is given by

$$\mathrm{CV}_i = \frac{\sigma_i}{n p_i}, \tag{2}$$

and the mean weighted coefficient of variation (MWCV) is given by

$$\mathrm{MWCV} = \sum_i p_i \, \mathrm{CV}_i = \frac{\sum_i \sigma_i}{n}. \tag{3}$$

The highest possible value of the MWCV results from a length-frequency distribution that is evenly distributed over a large number of size classes. The number of fish in each length class is then Poisson distributed with a standard deviation that equals the square root of the number at length (Zar, 1999). The theoretical maximum MWCV is therefore given by

$$\mathrm{MWCV} = (n/c)^{-1/2}, \tag{4}$$

where $c$ = the number of size classes in the sample.



The minimum MWCV is zero and would result from a 

 distribution where all observations fall within a single 

 length category. Therefore, the MWCV estimates will 

 always lie between zero and the curve described by 

 Equation 4. 
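As an illustration of how Equations 1 to 4 fit together, the following minimal sketch computes the MWCV of a length-frequency sample from its counts per length class and compares it with the theoretical maximum. The function names and the example counts are hypothetical, and the number of size classes is taken here to be the number of length classes that contain at least one fish.

```python
import math


def mwcv(counts):
    """Mean weighted CV of a length-frequency sample (Equations 1 to 3).

    counts: number of fish observed in each length class.
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("the sample contains no fish")
    total_sd = 0.0
    for count in counts:
        p = count / n                             # proportion at length i
        total_sd += math.sqrt(n * p * (1.0 - p))  # sigma_i, Equation 1
    # sum(p_i * CV_i) simplifies to sum(sigma_i) / n, Equation 3
    return total_sd / n


def max_mwcv(n, c):
    """Theoretical maximum MWCV for n fish in c size classes (Equation 4)."""
    return (n / c) ** -0.5


# Hypothetical sample: 50 fish spread over 8 length classes.
counts = [2, 5, 9, 14, 10, 6, 3, 1]
n = sum(counts)
c = sum(1 for count in counts if count > 0)
print(round(mwcv(counts), 3), round(max_mwcv(n, c), 3))
```

A sample in which all fish fall in one length class gives an MWCV of zero, and an evenly distributed sample approaches the value returned by `max_mwcv` as the number of classes grows.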



Results 



During the 2005 survey, a total of 2332 length samples were taken for 80 different species of fish and squid. In most cases, the sample size was limited by the number of individuals in the catch. However, 596 samples were deemed too large for all individuals to be measured, and subsamples were taken. The median subsample size was just under a quarter of the total catch (by weight), whereas 90% of the subsamples were smaller than half of the total catch. The four most common species that were subsampled were poor cod (Trisopterus minutus), blue whiting (Micromesistius poutassou), haddock (Melanogrammus aeglefinus), and Norway pout (Trisopterus esmarkii).



The estimated MWCV of the subsamples was closely 

 associated with the ratio of the number of individuals 

 measured to the number of length classes in the sample 

 (Fig. 1). The MWCV appeared to follow an exponential 

 curve that was close to the maximum MWCV given by 

 Equation 4. The MWCV decreased very rapidly with 

 increasing sample size up to sample sizes of around 

 10 times the number of length classes in the sample, 

 after which the sample size would need to be increased 

 considerably for a moderate further improvement in 

 precision. If the sample size is taken as 10 times the 

 number of length classes in the distribution, an MWCV 

 of around 0.25 can be expected; a sample size of 48 

 times the number of length classes would result in 

 an MWCV of 0.10 and a sample size of 155 times the 

 number of length classes would be necessary to reduce 

 the MWCV to 0.05. 
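The sample-size multiples quoted above can be set against the theoretical maximum of Equation 4. The sketch below is an illustration only: the three (multiple, MWCV) pairs are those reported in the text, while the function names and the example of 30 length classes are hypothetical. Inverting Equation 4 gives the multiple at which even an evenly distributed sample would reach a target MWCV, which is a more conservative figure than the observed one.

```python
# (sample size / number of length classes) -> MWCV, as reported in the text
reported = {10: 0.25, 48: 0.10, 155: 0.05}


def max_mwcv(ratio):
    """Theoretical maximum MWCV at a given ratio n/c (Equation 4)."""
    return ratio ** -0.5


def worst_case_ratio(target_mwcv):
    """Ratio n/c at which the Equation 4 maximum equals target_mwcv."""
    return target_mwcv ** -2


def rule_of_thumb_sample_size(length_classes, multiple=10):
    """Sample size as a fixed multiple of the number of length classes."""
    return multiple * length_classes


for ratio, observed in reported.items():
    print(ratio, observed, round(max_mwcv(ratio), 3),
          round(worst_case_ratio(observed)))

# Hypothetical example: a species spanning 30 length classes.
print(rule_of_thumb_sample_size(30))
```

In each case the reported MWCV lies below the Equation 4 maximum at the same ratio, as expected for distributions that are not perfectly even.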



The mean sample size in the subsamples taken on 

 the survey was just under nine times the number of 

 length classes per sample, resulting in a mean MWCV 

 of 0.33. However, there was quite a large spread in 

 the sample sizes (Fig. 1); therefore some samples were 

 measured with very low precision, whereas others had 



