Fishery Bulletin 108(3) 
Appendix I: Errors in species identification 
Sightings matched between the side observers and belly 
observers provided an opportunity to estimate the probabilities
that species were misidentified by examining
discrepancies in species identification between the
matched data. Two types of discrepancies were found: 1) 
hierarchical discrepancy, where one observer identified 
the animal (or group) to species while the other observer 
identified the animal only to genus, family, etc., and 
2) mismatched species identification, where two different
species were identified. In the case of hierarchical
discrepancies, identifications that were not to species 
could be treated as missed sightings and discarded for 
the purpose of estimation of species abundance. The 
mismatched species were of greater concern. There were 
four possible outcomes when both observers identified 
the group to the species level: 1) both observers correctly 
identified the species, 2) one observer was correct and 
the other was incorrect, 3) both observers were incorrect 
and disagreed on the species, and 4) both observers were 
incorrect but agreed on an incorrect identification. In
the matched data, outcomes 1 and 4 showed no discrepancy
and were thus indistinguishable; outcomes 2 and 3 showed
a discrepancy but were also not distinguishable from
each other. These discrepancies are assumed to be the
result of one of the following: an incorrect identification,
an error in reporting by an observer, or a typing
error by the data recorder. The errors are assumed to
follow a binomial model and to have a generally low
probability, so that the likelihood of two errors occurring
for the same sighting (outcomes 3 and 4) would be
negligible. The following analysis estimates the rates of
single errors. Data collected under circumstances with 
less than 95% reliability for species identification were 
dropped from the abundance analysis. 
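As a rough illustration of why double errors can be neglected, suppose (purely as an assumption for this example) that each observer independently misidentifies a species with probability ε = 0.05, the value implied by the 95% reliability threshold. The probabilities of one error and of two errors for the same sighting would then be:

```latex
\begin{align*}
P(\text{exactly one error}) &= 2\varepsilon(1-\varepsilon) = 2(0.05)(0.95) = 0.095,\\
P(\text{two errors})        &= \varepsilon^{2} = (0.05)^{2} = 0.0025.
\end{align*}
```

Under this assumed error rate, a double error would be 38 times less likely than a single error, and the ratio grows as ε decreases.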
The tendency toward errors for species identification 
can vary by 1) environmental conditions, 2) observer, or 
3) recorder. Logistic regression was used to test each 
of these possible covariates and identify circumstances 
that were correlated with greater likelihood of discrepancies.
A maximum likelihood scheme was then
developed to estimate the error rates. Letting $p_{ijox}$ be
the probability that an observer o in circumstance x 
identifies species i as species j, the likelihood (L) that 
a sighting will be identified as a particular species m 
is calculated as follows: 
$$L(m,x) = \sum_i r_{ai}\, p_{imx},$$
where $R_{ai}$ = the actual encounter rate of species i; and
$r_{ai}$ = the fraction of encounters that are species i.
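The sum over actual species can be sketched in a few lines of Python. This is only an illustration for one fixed circumstance x: the two-species setup, the encounter fractions in `r`, and the identification probabilities in `p` are hypothetical values, not survey estimates, and the function name is introduced here for the example.

```python
def single_id_likelihood(r, p, m):
    """Likelihood that a sighting is recorded as species m.

    r[i]    : fraction of encounters that are actually species i
    p[i][j] : probability that species i is recorded as species j
              (for one fixed circumstance x)
    """
    return sum(r[i] * p[i][m] for i in range(len(r)))


# Hypothetical inputs, not taken from the survey.
r = [0.7, 0.3]
p = [
    [0.95, 0.05],  # species 0 recorded as species 0 or 1
    [0.10, 0.90],  # species 1 recorded as species 0 or 1
]

likelihoods = [single_id_likelihood(r, p, m) for m in range(len(r))]
```

Because each row of `p` sums to 1 and the fractions in `r` sum to 1, the likelihoods over all recorded species also sum to 1.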
The likelihood of a particular pair of species identifications
m and n occurring for a given sighting by one
observer in circumstance x and a second observer in
circumstance y is
$$L(m,n,x,y) = \sum_i r_{ai}\, p_{imx}\, p_{iny}.$$
In anticipation of a limited data set with a reliability
rate greater than 95%, we made two simplifying assumptions.
First, we assumed that outcomes 3 and
4 are rare events compared to outcomes 1 and 2, and
therefore we ignored outcomes 3 and 4 in the likelihood
model. Second, we assumed that the likelihood of an
error is independent of the species involved, and therefore
the likelihood is simplified to
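$$L(m,n,x,y) = \begin{cases} p_x\, p_y & \text{if } m = n, \\ (1-p_x)\,p_y + (1-p_y)\,p_x & \text{if } m \neq n, \end{cases}$$
where $p_x$ and $p_y$ = the probabilities of a correct species
identification under circumstances x and y.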
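The simplified two-case likelihood can be written as a small function. The sketch below assumes that `p_x` and `p_y` are the probabilities of a correct identification under circumstances x and y; the numerical values are hypothetical and chosen only to show the arithmetic.

```python
def pair_likelihood(p_x, p_y, same_species):
    """Simplified likelihood of a matched pair of identifications.

    Agreement is treated as both observers being correct; a
    discrepancy is treated as exactly one observer being wrong.
    Double errors (outcomes 3 and 4) are ignored.
    """
    if same_species:
        return p_x * p_y
    return (1 - p_x) * p_y + (1 - p_y) * p_x


# Hypothetical reliabilities for two circumstances.
agree = pair_likelihood(0.98, 0.90, same_species=True)
differ = pair_likelihood(0.98, 0.90, same_species=False)
ignored = 1 - agree - differ  # probability mass of the dropped double errors
```

With these values the two cases do not sum to 1; the remainder is (1 - 0.98)(1 - 0.90) = 0.002, the double-error probability that the model neglects.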
Letting $s_{xy}$ be the number of sighting pairs that occurred
under circumstances x and y, and $d_{xy}$ the number of
discrepancies in species identification that occurred,
the likelihood of a particular set of species identifications
was
$$L(S,D) = \prod_{xy \in XY} \binom{s_{xy}}{d_{xy}} \left[(1-p_x)\,p_y + (1-p_y)\,p_x\right]^{d_{xy}} \left(p_x\, p_y\right)^{s_{xy}-d_{xy}},$$
where S = the set of matched sightings; 
D = the set of species discrepancies within that 
set; and 
XY = the set of circumstance pairs under which 
matched sightings were made. 
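A minimal sketch of how such a likelihood could be maximized is given below. It assumes the product form in which each circumstance pair contributes a binomial coefficient, the discrepancy probability raised to the number of discrepancies, and the agreement probability raised to the number of agreements. The counts are hypothetical, the two strata are named only for illustration, and a coarse grid search stands in for the iterative method, which the text does not specify.

```python
import math
from itertools import product


def log_likelihood(p, counts):
    """Log of prod_xy C(s, d) * differ**d * agree**(s - d).

    p maps a circumstance label to the probability of a correct
    identification; counts maps (x, y) to (s_xy, d_xy).
    """
    total = 0.0
    for (x, y), (s, d) in counts.items():
        agree = p[x] * p[y]
        differ = (1 - p[x]) * p[y] + (1 - p[y]) * p[x]
        total += math.log(math.comb(s, d))
        total += d * math.log(differ) + (s - d) * math.log(agree)
    return total


# Hypothetical counts (not survey data): (sighting pairs, discrepancies).
counts = {
    ("experienced", "experienced"): (120, 4),
    ("experienced", "inexperienced"): (80, 9),
    ("inexperienced", "inexperienced"): (40, 7),
}

# Candidate probabilities 0.500, 0.505, ..., 0.995. The value 1.0 is
# excluded because log(differ) is undefined when neither observer errs.
grid = [i / 200 for i in range(100, 200)]


def objective(q):
    return log_likelihood({"experienced": q[0], "inexperienced": q[1]}, counts)


# Exhaustive search over the grid for the most likely pair of values.
p_experienced, p_inexperienced = max(product(grid, grid), key=objective)
```

The resulting estimates could then be compared with the 0.95 threshold to decide which strata to retain. A finer grid or a numerical optimizer would give more precise values.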
Maximum likelihood solutions were found iteratively 
for each of the covariate sets identified by the logistic 
regression as correlated with discrepancies. Observers 
were stratified into inexperienced (no aerial survey 
experience before this survey and 10 or fewer days on 
this survey) and experienced (at least one survey season 
of experience or more than 10 days on this survey), and 
environmental factors were considered individually. 
Likelihoods were compared to identify the most likely
model, and survey effort under circumstances with less
than 95% reliability of correct species identification
was discarded.
Appendix II: Estimation of g(0) 
(which accounts for perception bias only) 
Perception bias for a single observer, P(Y), was estimated
for effort condition vector Y as the probability that a
group of harbor porpoise available to the observer would
be perceived and identified to species by that observer.
It was calculated from the logistic regression model as
$$P(Y) = \frac{e^{\beta Y}}{1 + e^{\beta Y}},$$
where $\beta$ = the vector of coefficients estimated in the
logistic regression.
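The logistic transformation can be sketched as follows. The coefficient and covariate values are hypothetical, and the example assumes that any intercept is handled by including a constant 1 in the condition vector.

```python
import math


def perception_probability(beta, y):
    """Logistic probability e^(beta.y) / (1 + e^(beta.y)).

    beta : coefficients from the logistic regression
    y    : effort condition vector (same length as beta)
    """
    eta = sum(b * v for b, v in zip(beta, y))
    # Use the algebraically equivalent form that avoids overflow.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    return math.exp(eta) / (1.0 + math.exp(eta))


# Hypothetical values: intercept 2.0 and one covariate with slope -0.5.
prob = perception_probability([2.0, -0.5], [1.0, 2.0])
```

Here the linear predictor is 2.0 - 0.5 × 2.0 = 1.0, so `prob` equals e/(1 + e), roughly 0.73.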
