Hjellvik et al.: Measurement error in marine survey catches 






whereas $\delta = \alpha_1 - \alpha_2$ is estimated by $\hat{\delta} = \bar{z}$.



It should be noted that in the actual computation of $\hat{\sigma}_\varepsilon^2$
and $\hat{\delta}$, the data $y_{i,j}$ are log transformed, i.e.

$$y_{i,j} = \log(n_{i,j}/d_{i,j}), \quad i = 1, \ldots, n, \quad j = 1, 2, \tag{3}$$



where $n_{i,j}$ and $d_{i,j}$ denote the catch in numbers and the
towed distance, respectively, for vessel $j$ at the $i$th haul.
Log-transformed data are used to reduce the heterogeneity
of the variance.
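The log transformation and the paired differences can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The function names and the example numbers are hypothetical. The estimator `var(z) / 2` is an assumption that follows from treating the two vessels' errors as independent with equal variance; the paper's Equation 2 is not reproduced in this section and may differ in detail.

```python
import math
from statistics import mean, variance


def log_catch_rate(n_caught, distance):
    """Eq. 3: log of the catch in numbers per unit towed distance."""
    return math.log(n_caught / distance)


def paired_estimates(hauls):
    """Estimate the vessel effect and the measurement-error variance.

    hauls: list of ((n1, d1), (n2, d2)) tuples, one per parallel haul,
    giving catch in numbers and towed distance for vessels 1 and 2.
    Returns (delta_hat, sigma2_hat).
    """
    z = [log_catch_rate(n1, d1) - log_catch_rate(n2, d2)
         for (n1, d1), (n2, d2) in hauls]
    delta_hat = mean(z)
    # Assumption: var(z) = 2 * sigma_eps^2, so halve the sample variance.
    sigma2_hat = variance(z) / 2.0
    return delta_hat, sigma2_hat


# Made-up catches and distances, for illustration only.
example_hauls = [((120, 1.5), (100, 1.5)),
                 ((40, 1.4), (55, 1.6)),
                 ((900, 1.5), (700, 1.5))]
delta_hat, sigma2_hat = paired_estimates(example_hauls)
```

Zero catches would have to be excluded or handled separately before calling `log_catch_rate`, because the logarithm of zero is undefined.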



Tests for differences between experimental groups 



If our hypothesis is correct, namely that parallel trawling experiments
can be used to quantify a measurement error inherent in the
cod catching process itself, we expect that this error, as estimated
by $\hat{\sigma}_\varepsilon$, should be the same for all 10
experimental groups. If the $z_i$'s originate from a Gaussian
distribution, the null hypothesis of equal variance can be
tested by Bartlett's test (all groups tested simultaneously;
cf. Bickel and Doksum, 1977, p. 304), followed, if needed,
by a series of F-tests in which the groups are tested
against each other in pairs.
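For readers who want to see the mechanics, the following sketch computes the textbook form of Bartlett's statistic and the variance ratio used in a pairwise F-test, using only the Python standard library. The function names are hypothetical. P-values are not computed here: under the null hypothesis the Bartlett statistic is referred to a chi-square distribution with k - 1 degrees of freedom, and the variance ratio to an F distribution.

```python
import math
from statistics import variance


def bartlett_statistic(groups):
    """Return (T, df) for Bartlett's test of equal variances.

    groups: list of lists of observations (at least two values each).
    """
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [variance(g) for g in groups]
    df_total = sum(dfs)  # N - k
    pooled = sum(df * v for df, v in zip(dfs, variances)) / df_total
    numerator = (df_total * math.log(pooled)
                 - sum(df * math.log(v) for df, v in zip(dfs, variances)))
    correction = 1.0 + (sum(1.0 / df for df in dfs)
                        - 1.0 / df_total) / (3.0 * (k - 1))
    return numerator / correction, k - 1


def variance_ratio(group_a, group_b):
    """F statistic for a pairwise test: larger sample variance over smaller."""
    va, vb = variance(group_a), variance(group_b)
    return max(va, vb) / min(va, vb)
```

With ten groups, `bartlett_statistic` would be called once on all groups, and `variance_ratio` once per pair of groups if the overall test suggested differences.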



The possible differences in efficiency caused by different
vessels or fishing gears (or both) can be tested by an ANOVA
test, followed by a series of t-tests if the ANOVA test leads
to rejection. Again, normally distributed observations are a
prerequisite for such tests.
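A one-way ANOVA on the differences, with the experimental group as the factor, reduces to a ratio of between-group to within-group mean squares. The sketch below shows that computation with the standard library only. The function name is hypothetical, and the P-value (from an F distribution with the returned degrees of freedom) is left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of lists of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

A large F relative to the reference distribution indicates that the group means differ, which is the situation in which follow-up pairwise t-tests become relevant.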



Our first task was therefore to check whether the $z_i$ data
followed a Gaussian distribution. It seemed plausible to
assume that observations from different groups followed
the same distribution, but possibly with differences
in mean and variance. Therefore, when checking for
normality, we considered the standardized variables

$$x_i = (z_i - \bar{z}_k)/s_k,$$

where $k = k(i)$, $k = 1, \ldots, 10$, denotes the group that haul $i$,
$i = 1, \ldots, n$, belongs to, and $\bar{z}_k$ and $s_k$ are the average and
the estimated standard deviation of the $z$-values in group
$k$, respectively. Deviations from normality of $\{x_i\}$ can be
checked visually by inspecting a normal plot, and formally
by, e.g., the Kolmogorov-Smirnov test (Bickel and Doksum,
1977, ch. 9.6).
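The within-group standardization and the Kolmogorov-Smirnov distance to a standard normal distribution can be sketched as follows. This is an illustrative sketch using only the Python standard library. The function names are hypothetical, and only the test statistic is computed, not its P-value.

```python
from statistics import NormalDist, mean, stdev


def standardize_by_group(z_values, group_ids):
    """Subtract each group's mean and divide by its standard deviation."""
    by_group = {}
    for z, k in zip(z_values, group_ids):
        by_group.setdefault(k, []).append(z)
    # stdev needs at least two observations per group.
    stats = {k: (mean(v), stdev(v)) for k, v in by_group.items()}
    return [(z - stats[k][0]) / stats[k][1]
            for z, k in zip(z_values, group_ids)]


def ks_statistic_normal(x_values):
    """Largest gap between the empirical CDF and the N(0, 1) CDF."""
    xs = sorted(x_values)
    n = len(xs)
    cdf = NormalDist().cdf
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare the model CDF with the empirical CDF just after and
        # just before the jump at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

A typical use would be `ks_statistic_normal(standardize_by_group(z, groups))`, with the result compared against tabulated critical values for the chosen significance level.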



Results 



The log-catches $\{y_{i,j}\}$ and the corresponding differences $\{z_i\}$
are presented in Figure 1. The catches range from approximately
$e^3 \approx 20$ to $e^8 \approx 3000$, but on the log scale the difference
in catch between the vessels does not seem to depend
on the size of the catches (see formal test at the end of this
section). A normal plot of the standardized observations
$\{x_i = (z_i - \bar{z}_k)/s_k\}$ appears linear (Fig. 2), and the
Kolmogorov-Smirnov test does not reject the null hypothesis of
normality at a 10% level. Testing each group separately (except
group 7, where the sample size is too small) yields the
same result, i.e. normality is not rejected at a 10% level
for any group, thus justifying the use of Bartlett's, F-, and
t-tests. Bartlett's test for equality of variances yields
a P-value of 0.81. In view of this, it is not really necessary
to test the groups in pairs for equality of variance using an
F-test, but as a source of additional information we
carried out these tests; the lowest P-value, 0.078, was obtained
for groups 4 and 5. Thus, based on these data, the hypothesis
of a uniform measurement error independent of geographical
location, time, depth, etc., could not be rejected.



To investigate possible differences in efficiency for the 

 participating vessels we did an ANOVA test. We found a 

 P-value less than 10"^, indicating significant differences. 

 This finding was consistent with earlier findings in 

 calibration experiments (Pelletier, 1998). Thus, because 

$E(z_i)$ cannot be considered equal for all groups (see also the
confidence intervals in Fig. 1B), a pooled variance could be
used for estimating $\sigma_\varepsilon^2$. Alternatively, $z_i$ in Equation 2 could
be replaced by the variables $z_i' = z_i - \bar{z}_k$, which are adjusted
for group means and are identical to the residuals from
the ANOVA fit. The resulting estimates with the last
approach are $\hat{\sigma}_\varepsilon^2 = 0.069$ and $\hat{\sigma}_\varepsilon = 0.263$. The bootstrapped
standard errors of $\hat{\sigma}_\varepsilon^2$ and $\hat{\sigma}_\varepsilon$ are 0.0077 and 0.0147,
respectively (1000 bootstrap replicates were used). Some
caution should be exercised in interpreting these numbers
(see e.g. Srivastava and Chan, 1989).
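The group-mean adjustment and a bootstrap of the variance estimate might look like the sketch below. The resampling scheme is an assumption: the text states only that 1000 bootstrap replicates were used, so plain resampling of the residuals with replacement is shown as one plausible choice. The halving of the variance is likewise an assumed form of the estimator, and the function names are hypothetical.

```python
import math
import random
from statistics import mean, stdev, variance


def group_residuals(z_values, group_ids):
    """Differences adjusted for group means (the one-way ANOVA residuals)."""
    by_group = {}
    for z, k in zip(z_values, group_ids):
        by_group.setdefault(k, []).append(z)
    means = {k: mean(v) for k, v in by_group.items()}
    return [z - means[k] for z, k in zip(z_values, group_ids)]


def bootstrap_se(residuals, n_boot=1000, seed=0):
    """Bootstrap standard errors of the variance and standard deviation.

    Assumption: residuals are resampled with replacement, and the
    measurement-error variance is taken as half the sample variance.
    Returns (se_of_sigma2_hat, se_of_sigma_hat).
    """
    rng = random.Random(seed)
    n = len(residuals)
    var_reps, sd_reps = [], []
    for _ in range(n_boot):
        sample = [rng.choice(residuals) for _ in range(n)]
        s2 = variance(sample) / 2.0
        var_reps.append(s2)
        sd_reps.append(math.sqrt(s2))
    return stdev(var_reps), stdev(sd_reps)
```

As a rough consistency check on the reported numbers, the delta method gives a standard error for the standard deviation of about 0.0077 / (2 × 0.263) ≈ 0.0146, which is close to the bootstrapped value of 0.0147.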



Compared with the total variability of the survey, the
measurement error of a single haul is relatively small. For
the last 5 years (1996-2000), $\operatorname{var}(y_i)$ for the nonzero catches
varied between 1.38 and 2.06 for the winter survey and
between 2.53 and 3.92 for the autumn survey. Thus, $\hat{\sigma}_\varepsilon^2$ is
about 2-5% of the total variation. This is the percentage of
the variation that we cannot expect to be able to explain
by explanatory variables. One should carefully note that
these numbers are on the log scale. If antilogs of the catch
rate were to be used, the additive model (Eq. 1) would have
to be replaced by a multiplicative model, and the relative
magnitude of variances would be changed.
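The 2-5% figure can be checked directly from the numbers quoted above by dividing the estimated measurement-error variance by the endpoints of each survey's variance range. The variable and function names in this short sketch are hypothetical; the numbers are those given in the text.

```python
SIGMA2_HAT = 0.069  # estimated measurement-error variance, log scale
SURVEY_VAR_RANGE = {"winter": (1.38, 2.06), "autumn": (2.53, 3.92)}


def error_share_percent(sigma2, total_var):
    """Measurement-error variance as a percentage of total variance."""
    return 100.0 * sigma2 / total_var


shares = {
    season: tuple(round(error_share_percent(SIGMA2_HAT, v), 1)
                  for v in var_range)
    for season, var_range in SURVEY_VAR_RANGE.items()
}
print(shares)
```

The shares come out at roughly 5.0% and 3.3% for the winter survey and 2.7% and 1.8% for the autumn survey, consistent with the stated range of about 2-5%.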



The results for length-stratified data are shown in
Table 2, as well as the results obtained by measuring
the catches by weight instead of by numbers, i.e. by
replacing $n_{i,j}$ in Eq. 3 by $w_{i,j}$, where $w_{i,j}$ is the weight
of the catch in kilograms. Only hauls where both vessels



