Norcross et al.: Habitat models for juvenile pleuronectids
of a simplified sieve and pipette procedure to obtain 
the percents of gravel, sand, and mud (Norcross et 
al., 1995). 
Distance from the mouth of the bay was used as a 
relative index of fish distribution with respect to sta- 
tion position within or outside the bay. Distance from 
each station to the nearest position at the mouth of 
a bay was calculated by drawing a line on a chart 
across the bay mouth between the two outermost 
capes. The shortest distance from the station to any 
position on this line was measured. Stations inside 
the mouth were designated as positive distances, and 
stations outside of bays were assigned negative dis- 
tances. The narrowest point of Sitkinak Strait was 
considered the “mouth” of the bay; stations to the 
west of that point were considered within the bay and 
the exposed stations in the open ocean on the east side 
of Sitkinak Strait were considered outside the mouth. 
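The chart measurement described above can be sketched numerically. This is a minimal illustration, not the procedure used in the study (which was measured by hand on a chart). It assumes that station and cape positions have already been projected to planar coordinates (e.g., km east and north), and that a hypothetical reference point known to lie inside the bay (`inside_ref`) is supplied to decide the sign. All function and parameter names are illustrative.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar units)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate case: both capes at the same position.
        return math.hypot(px - ax, py - ay)
    # Projection of p onto the line, clamped so it stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def side_of_line(p, a, b):
    """Cross product whose sign tells which side of line a-b the point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def signed_mouth_distance(station, cape_a, cape_b, inside_ref):
    """Positive distance for stations inside the bay mouth, negative outside.

    inside_ref is an assumed point known to be inside the bay; it is not
    part of the original method and is used only to orient the sign.
    """
    d = distance_to_segment(station, cape_a, cape_b)
    same_side = side_of_line(station, cape_a, cape_b) * side_of_line(inside_ref, cape_a, cape_b) >= 0
    return d if same_side else -d
```

For example, with capes at (0, 0) and (10, 0) and an interior reference point at (5, 5), a station at (5, 3) receives a distance of 3 and a station at (5, -2) receives -2.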
Flatfishes were identified, and total length (mm) 
was measured in the field with a Limnoterra elec- 
tronic, digital fish-measuring board. Ages of flatfishes 
captured in August 1992 were estimated with 1) 
length-frequency plots of fishes collected August 1992 
(Norcross et al.⁴), 2) length-frequency plots (Norcross
et al.³) and analysis of regional differences in total
lengths (Norcross et al., 1995) of fish caught during
August 1991, and 3) available literature (Southward,
1967; Best, 1974, 1977; Walters et al., 1985; Harris
and Hartt¹; Blackburn and Jackson⁵). Fish lengths
were used to separate age classes of juvenile flat- 
fishes. Catch per unit of effort (CPUE) based on a 
10-min tow time was calculated for age-0 and age-1 
individuals of each species. Habitat models were de- 
veloped for the most abundant species and age-class 
combinations. 
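As an illustration of the CPUE standardization, the sketch below scales the catch of each age class to a 10-min tow. It assumes simple linear scaling by the actual tow duration, which the text does not spell out, and the length cutoffs separating age classes are hypothetical arguments, not values from this study.

```python
from collections import Counter

def age_class(total_length_mm, age0_max_mm, age1_max_mm):
    """Assign an age class from total length using caller-supplied cutoffs.

    The cutoffs are placeholders; in practice they would come from
    length-frequency plots for each species.
    """
    if total_length_mm <= age0_max_mm:
        return "age-0"
    if total_length_mm <= age1_max_mm:
        return "age-1"
    return "older"

def cpue_by_age(lengths_mm, tow_minutes, age0_max_mm, age1_max_mm,
                standard_minutes=10.0):
    """Catch per standard tow for age-0 and age-1 fish of one species.

    Assumes catch scales linearly with tow duration.
    """
    counts = Counter(age_class(length, age0_max_mm, age1_max_mm)
                     for length in lengths_mm)
    scale = standard_minutes / tow_minutes
    return {age: counts.get(age, 0) * scale for age in ("age-0", "age-1")}
```

With these assumptions, three age-0 fish caught in a 12-min tow correspond to a CPUE of 2.5 fish per 10-min tow.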
Statistical analyses 
Linear discriminant function analysis of combined 
1991-92 data included the broad range of conditions 
sampled around Kodiak Island. Canonical loadings 
of each variable and misclassification rates based on 
cross-validation were evaluated as outlined in 
Norcross et al. (1995) to test whether the same pa- 
rameters had been selected as the best discrimina- 
tors as those that had been selected solely on 1991 
data. The magnitude of the canonical loading of each 
variable in the discriminant analysis is a measure of 
the importance of that variable in separating the stations
with (presence) and without (absence) the fish
species under consideration. The success of each com-
bination of variables in assigning a new station to
the presence or absence group can be evaluated by
using misclassification rates from cross-validation.
⁵ Blackburn, J. E., and P. B. Jackson. 1982. Seasonal composition
and abundance of juvenile and adult marine finfish and
crab species in the nearshore zone of Kodiak Island’s east side
during April 1978 through March 1979. In Outer continental
shelf environmental assessment program, p. 377-570. U.S.
Dep. Commer., Final Reports of Principal Investigators 54.
The combined 1991 and 1992 data were further 
used to calculate Spearman’s rank correlation (rho) 
between the abundance of each fish species and each 
physical parameter. The significance of rank corre- 
lations was evaluated at the 95% level. The nonpara- 
metric test with Spearman’s rho was chosen because 
of non-normality of the CPUE data (even after trans- 
formation) and because of the high sensitivity of the 
parametric correlation coefficient (Pearson’s r) to 
outliers. To maintain an overall confidence level of
95%, a Bonferroni-adjusted critical level of α = 0.025/
28 ≈ 0.001 was used for the two-tailed test and for
28 comparisons (4 species × 7 variables).
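The rank correlation and the adjusted critical level can be sketched as follows. This is a minimal pure-Python illustration, not the software used in the study. It computes Spearman's rho as the Pearson correlation of average ranks (so that ties, such as repeated zero catches, are handled) and reproduces the Bonferroni arithmetic given above. It does not compute p-values, which would normally come from a statistics package.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Bonferroni adjustment as stated in the text: 0.025 per tail, 28 comparisons.
n_comparisons = 4 * 7
adjusted_alpha = 0.025 / n_comparisons
```

Because only ranks enter the calculation, a single extreme CPUE value changes rho no more than any other value that keeps the same ordering, which is the robustness to outliers noted above.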
To refine our previous habitat models, which were 
based primarily on presence or absence data 
(Norcross et al., 1995), we used regression trees to
model CPUE as a function of habitat parameters. 
We used the same parameters as in the discriminant 
analysis, except instead of percentages of gravel, 
sand, and mud in the substrate, we used a categori- 
cal description of sediment type based on Folk (1980), 
i.e. sand (S), mud (M), gravel (G), and the modifiers 
of these substrates, such as sandy mud (sM), sandy 
gravelly mud (sgM), etc., 12 categories in all. This 
categorical classification avoided problems with high 
correlations among the three sediment variables. 
Both continuous and categorical predictor variables 
can easily be accommodated in regression trees. 
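To illustrate how one tree can handle both kinds of predictor, the sketch below evaluates a single split for a continuous variable (a threshold) and for a categorical variable (a partition of the categories into two groups). It assumes the usual regression-tree criterion, the within-group sum of squared deviations from the group mean, and searches category subsets exhaustively. It is not the software used in the study, and the function names are illustrative.

```python
from itertools import combinations

def sum_sq(values):
    """Sum of squared deviations from the mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_numeric_split(x, y):
    """Best threshold on a continuous predictor: (threshold, total sum of squares)."""
    levels = sorted(set(x))
    best = None
    for lo, hi in zip(levels, levels[1:]):
        threshold = (lo + hi) / 2.0  # midpoint between adjacent observed values
        left = [yi for xi, yi in zip(x, y) if xi < threshold]
        right = [yi for xi, yi in zip(x, y) if xi >= threshold]
        total = sum_sq(left) + sum_sq(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best

def best_categorical_split(categories, y):
    """Best two-group partition of a categorical predictor: (group, total sum of squares)."""
    levels = sorted(set(categories))
    best = None
    # Subsets up to half the number of levels cover every two-way partition.
    for size in range(1, len(levels) // 2 + 1):
        for subset in combinations(levels, size):
            group = set(subset)
            left = [yi for c, yi in zip(categories, y) if c in group]
            right = [yi for c, yi in zip(categories, y) if c not in group]
            total = sum_sq(left) + sum_sq(right)
            if best is None or total < best[1]:
                best = (group, total)
    return best
```

With 12 sediment categories the exhaustive search covers at most 2,047 partitions, so it remains cheap. Both functions return the same kind of criterion value, which is what allows a tree to compare a depth split directly against a sediment split at each node.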
The regression tree used the logarithm of CPUE
(log(CPUE+1)) as the response variable and depth,
distance from mouth of bay, bottom temperature, 
bottom salinity, and sediment type as predictor vari- 
ables. A regression tree progressively splits stations 
on the basis of their values for one of the predictor 
variables until a leaf or terminal node is reached. 
Each leaf gives a predicted value of the response 
variable for the stations assigned to the leaf. The fit 
of the model is measured by the deviance, which is 
defined as 
$$D = \sum_i \left(y_i - \mu_{[i]}\right)^2,$$
or the sum of the squared differences between $y_i$ =
log(CPUE+1) at each station $i$ and $\mu_{[i]}$ = the mean
for all stations at the leaf containing station $i$. The deviance is defined for
the entire tree, as well as for each leaf, and is the 
analogue of the sum of squares in regression mod- 
els. Each successive partitioning of the data reduces 
the deviance. For noisy data, the regression tree may 
overfit the data, resulting in an overly complex tree 
