Fishery Bulletin 120(2) 
given that few red drum were captured (n=35) and effort 
was relatively low (70 stations). Boosted regression trees
use machine learning to fit complex, nonlinear relationships
and offer predictive advantages over generalized
linear or additive models. For a complete description of 
BRTs and the methods used in this study, see Drymon 
et al. (2020). 
Results of preliminary analyses indicate a high propor- 
tion of zero values (i.e., zero-inflated data). To account for 
the preponderance of zeros, a 2-step (i.e., delta or hurdle) 
process was chosen to model catch data. Presence or absence
was modeled by using a BRT with a binary distribution, and
non-zero catch (i.e., abundance) was modeled by using a BRT
with a Gaussian distribution. Because the catch data also contain
some instances of anomalously high catch (i.e., long-tailed
data), non-zero data were natural log-transformed before fitting.
Predictions from the Gaussian BRT were back-transformed so that
the final model prediction is the product of the binary and
Gaussian BRT predictions (Lo et al., 1992).
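The way the two stages combine can be illustrated with a minimal Python sketch. It is not the gbm.auto implementation used in this study; the function name and the input values are hypothetical. It assumes the Gaussian stage returns a prediction on the natural-log scale and applies a plain exponential back-transformation, with no bias correction, because the text does not state whether one was used.

```python
import math


def hurdle_prediction(p_presence, log_abundance):
    """Combine the binary and Gaussian stages into one predicted catch value.

    p_presence: predicted probability of presence from the binary stage.
    log_abundance: predicted natural-log abundance from the Gaussian stage.
    """
    if not 0.0 <= p_presence <= 1.0:
        raise ValueError("p_presence must be a probability in [0, 1]")
    # Back-transform the Gaussian-stage prediction from the log scale
    # (assumption: simple exponentiation, no bias correction).
    abundance_if_present = math.exp(log_abundance)
    # The final prediction is the product of the two stages.
    return p_presence * abundance_if_present


# Hypothetical station: 25% chance of presence, 4 fish expected if present.
prediction = hurdle_prediction(0.25, math.log(4.0))
```

A station with a low probability of presence therefore receives a low predicted catch even when the abundance stage predicts a large value.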
Sixteen variables with data from multiple sources were 
considered for the BRTs (see table 1 in Drymon et al., 
2020). Although data for some variables (e.g., tempera- 
ture, salinity, and dissolved oxygen) were collected on-site 
during bottom longline sampling, all predictor data were 
obtained following methods outlined in Drymon et al. 
(2020) to facilitate comparisons with previous habitat 
modeling in the same region. Surface and bottom temperatures
(in degrees Celsius), salinity, and 3-dimensional
surface and bottom current velocities (eastward, northward,
and upward, in meters per second), as well as sea-surface
height (in meters), were obtained from the Hybrid Coordinate
Ocean Model data server (4-km resolution; HYCOM
consortium, available from website, accessed January 
2020). Bottom dissolved oxygen (in milligrams per liter) 
was obtained from Gulf of Mexico Hypoxia Watch maps 
(NOAA National Centers for Environmental Information, 
available from website, accessed January 2020) and interpolated
across ~100–250 survey stations (the number of
stations varied by year). Depth (in meters) and substrate 
grain size (in millimeters) were obtained from the Coastal 
Relief Model bathymetry for the Gulf of Mexico (resolu- 
tion of 0.33 arc seconds or ~10 m; Buczkowski et al., 2006; 
U.S. Geological Survey, gmx_grd.zip, available from web- 
site). Day length (in minutes) was calculated in R by using 
code by S. Dedman (available from GitHub, accessed Jan- 
uary 2020). 
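For readers unfamiliar with how day length can be derived from latitude and date, the following Python sketch uses a common textbook approximation. It is not the code by S. Dedman used in this study. It assumes a simple sinusoidal approximation of solar declination, ignores atmospheric refraction and the width of the solar disc, and its function name is hypothetical.

```python
import math


def day_length_minutes(latitude_deg, day_of_year):
    """Approximate day length (minutes) at a latitude for a day of year (1-365)."""
    # Approximate solar declination in degrees (sinusoidal approximation).
    declination_deg = 23.44 * math.sin(2.0 * math.pi * (284 + day_of_year) / 365.0)
    lat = math.radians(latitude_deg)
    dec = math.radians(declination_deg)
    # Cosine of the sunrise hour angle, clamped to handle polar day and night.
    cos_hour_angle = max(-1.0, min(1.0, -math.tan(lat) * math.tan(dec)))
    hour_angle_deg = math.degrees(math.acos(cos_hour_angle))
    # The sun moves 15 degrees of hour angle per hour; double the sunrise
    # hour angle to span sunrise to sunset, then convert hours to minutes.
    return 2.0 * hour_angle_deg / 15.0 * 60.0


# Hypothetical example: a site at 30 degrees N near the June solstice.
summer_minutes = day_length_minutes(30.0, 172)
```

Under these assumptions, day length is close to 720 minutes at the equator year-round and at any latitude near the equinoxes.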
Given the quantity of potential predictor data consid- 
ered within the BRTs, some degree of spatial autocorrela- 
tion was anticipated (e.g., between distance from shore 
and depth, between surface and bottom temperatures); 
however, BRTs are robust despite autocorrelation among 
independent variables (Abeare, 2009). All BRTs were fit 
by using the package gbm.auto (vers. 1.4.1; Dedman et al., 
2017) in R. Learning rate, bag fraction, and tree complexity
are parameters that are used in concert to determine
minimum predictive error (Elith et al., 2008). These
parameters were optimized by using gbm.auto for the 
model run for each season. 
Model performance and interpretation 
The BRT modeling approach allowed automatic partition- 
ing of the data into training and testing sets, at a ratio 
dictated by the bag fraction. Ten-fold cross validation was 
then performed, with the members of the training and 
testing sets randomized each time. Performance metrics 
included training and testing correlation, cross-validation
deviance (and standard error [SE]), and cross-validation correlation
(and SE), as well as the area under the receiver operating
characteristic curve (AUC) (Hanley and McNeil, 1982), with its
cross-validated value and SE, for the binary models (Parisien and
Moritz, 2009). The final Gaussian fitted functions from 
the BRT were visualized by using marginal effect plots to 
indicate the effect of a particular variable on the response 
after accounting for the average effects of other model 
variables (Elith et al., 2008). 
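As an illustration of the AUC metric used for the binary models, the Python sketch below computes it as the proportion of presence-absence pairs in which the presence receives the higher predicted probability, with ties counted as half. This pairwise form is equivalent to the area under the curve, but it is only a sketch and not the routine called by gbm.auto; the function name and data are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve from 0/1 labels and predicted probabilities."""
    presences = [s for y, s in zip(labels, scores) if y == 1]
    absences = [s for y, s in zip(labels, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("AUC needs at least one presence and one absence")
    # Count presence-absence pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))


# Hypothetical observations (1 = red drum present) and model probabilities.
observed = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
score = auc(observed, predicted)
```

A value of 0.5 indicates discrimination no better than chance, and a value of 1.0 indicates that every presence was ranked above every absence.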
Habitat suitability 
The distribution of suitable habitat was predicted by using 
the BRTs described previously. Environmental data for 
model predictions were obtained as detailed previously, 
except that Hybrid Coordinate Ocean Model data were 
extracted for one representative date per season (the 
monthly groupings for the seasons were March–May, June–August,
and September–November). Representative dates
for environmental data were selected by ranking the absolute
differences between the values at all sites for
all variables and the means for those variables, then by
identifying the date within each season that most closely
matched the mean values. The BRTs then were used to predict
CPUE values for each 2-by-2-km cell. These values were 
then mapped in QGIS by using the heatmap setting to
produce color points weighted by the predicted abundances
generated from the BRT. By using gbm.auto, the coefficient
of variation was calculated for the predicted abundance values
for each 2-by-2-km cell to represent model variance.
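One possible reading of the representative-date selection described above is sketched below in Python. It is an interpretation, not the authors' code. It assumes that, for each variable, dates are ranked by their mean absolute difference from the seasonal mean across sites and that the date with the lowest summed rank is chosen. The function name, data layout, and values are hypothetical.

```python
def representative_date(values):
    """Pick the date whose conditions are closest to the seasonal means.

    values: {date: {variable: [value at each site]}} for a single season.
    """
    dates = sorted(values)
    variables = sorted(values[dates[0]])
    rank_sum = {d: 0 for d in dates}
    for var in variables:
        pooled = [v for d in dates for v in values[d][var]]
        seasonal_mean = sum(pooled) / len(pooled)
        # Mean absolute difference from the seasonal mean, per date.
        deviation = {
            d: sum(abs(v - seasonal_mean) for v in values[d][var]) / len(values[d][var])
            for d in dates
        }
        # Rank dates for this variable (0 = closest to the seasonal mean);
        # ranking keeps variables with different units comparable.
        for rank, d in enumerate(sorted(dates, key=lambda d: deviation[d])):
            rank_sum[d] += rank
    # The date with the lowest summed rank across variables is selected.
    return min(dates, key=lambda d: rank_sum[d])


# Hypothetical spring data for two sites and two variables.
spring = {
    "2018-03-15": {"temperature": [18.0, 19.0], "salinity": [30.0, 31.0]},
    "2018-04-15": {"temperature": [21.0, 22.0], "salinity": [32.0, 33.0]},
    "2018-05-15": {"temperature": [25.0, 26.0], "salinity": [35.0, 36.0]},
}
chosen = representative_date(spring)
```

Ranking within each variable, rather than summing raw differences, prevents a variable with large numeric values from dominating the choice.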
Results 
Catch data 
Between May 2006 and November 2018, 1296 bottom long- 
line sets were conducted and 815 red drum were caught 
(Fig. 2), with 741 of those red drum measured and 472 
fish kept for otolith collection. Approximately 100 stations 
were sampled each year (mean: 100 stations [standard 
deviation 22]; range: 80–143 stations), and survey effort
(number of sets) was relatively well distributed across the 
3 seasons examined in the BRTs: spring (460 sets), sum- 
mer (405 sets), and autumn (361 sets). Red drum caught 
on bottom longlines were primarily encountered in state 
waters across all seasons (Fig. 2) and were exclusively 
larger than the size at 50% maturity reported by Bennetts 
et al. (2019) (Fig. 3A). 
To supplement the otoliths taken from the 472 red drum 
retained from the bottom longline surveys, otoliths from 
