Fishery Bulletin 119(2-3) 
Figure 4
Box plots of true and estimated unfished recruitment (R0) from 4 age-structured estimation models under (A) case 0 and (B) case 12. The models, evaluated in this study for use in stock assessments, include the Assessment Model for Alaska (AMAK), the Age Structured Assessment Program (ASAP), the Beaufort Assessment Model (BAM), and Stock Synthesis (SS). The horizontal gray dashed line represents the true R0 under the 2 cases. For case 0, which is the null case, initial equilibrium recruitment was lowered from the unfished recruitment level as determined by an initial equilibrium fishing mortality rate. For case 12, the initial condition was equal to the unfished equilibrium population. The upper and lower parts of each box represent the first and third quartiles (the 25th and 75th percentiles), and the thick horizontal line is the median. The whiskers extending above and below the box correspond to 1.5 times the interquartile range.
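The box plot conventions described in the caption (quartile box, median line, whiskers at 1.5 times the IQR) can be reproduced directly; below is a minimal sketch using only the Python standard library (the function name `boxplot_stats` is ours, not from the study):

```python
import statistics

def boxplot_stats(values):
    """Summary statistics drawn in a box plot: first and third
    quartiles (box edges), median (thick line), and whisker limits
    at 1.5 times the interquartile range beyond each quartile."""
    q1, med, q3 = statistics.quantiles(values, n=4)  # 25th, 50th, 75th percentiles
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "lower_whisker": q1 - 1.5 * iqr,
        "upper_whisker": q3 + 1.5 * iqr,
    }
```

Note that `statistics.quantiles` defaults to the exclusive method; plotting libraries may use a different quantile definition, so box edges can differ slightly between tools.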
was included in case 9 (Table 6, Fig. 6, Suppl. Figs. 4-8 [online only]). Results were as expected: there was no reduction in bias but an increase in precision. The accuracy of determining overfishing and overfished status showed trends similar to those for case 0 (Fig. 7).
Bias adjustment of recruitment

The accuracy of parameter estimates was high when a bias adjustment of recruitment was incorporated in the OM and the conversion function was used in the EMs before estimation. Median relative
errors were close to zero for MSY-based reference points 
when the conversion function was used in the BAM and SS 
(Fig. 5). With ad hoc adjustment (i.e., after estimation) in 
the AMAK and ASAP, the median relative errors in MSY- 
based reference points were reduced (Fig. 5, Suppl. Figs. 1 
and 2 [online only]). Estimated SSB, recruitment, and F 
remained highly accurate over time (Fig. 6). The trends in 
accuracy of stock status determination were similar to the 
trends from case 0 (Fig. 7). 
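The conversion at issue follows from the standard lognormal bias correction: if recruitment deviations are distributed as ε ~ N(0, σ_R²), then E[exp(ε)] = exp(σ_R²/2), so mean-unbiased and median-unbiased R0 differ by that factor. A minimal sketch of such a conversion (our illustration; the exact conversion functions used by the EMs may differ):

```python
import math

def median_to_mean_r0(r0_median, sigma_r):
    """Convert a median-unbiased unfished recruitment (R0) to its
    mean-unbiased counterpart under lognormal recruitment deviations:
    E[exp(eps)] = exp(sigma^2 / 2) for eps ~ Normal(0, sigma^2)."""
    return r0_median * math.exp(0.5 * sigma_r ** 2)

def mean_to_median_r0(r0_mean, sigma_r):
    """Inverse conversion: mean-unbiased R0 back to median-unbiased R0."""
    return r0_mean * math.exp(-0.5 * sigma_r ** 2)
```

Applying the conversion before estimation (rather than ad hoc afterward) keeps the EM's internal reference-point calculations on the same recruitment scale as the OM.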
Discussion 
Similarity of estimates from the 4 models 
In this study, the 4 stock assessment models, or EMs, pro- 
duced similar estimates, an outcome that can be attributed 
to the fact that the models share similar mathematical 
and statistical attributes. Prior to our study, this suppo- 
sition was expected to be true but was unverified. Under 
cases that are associated with different recruitment vari- 
ability levels, process error in F, diverse patterns of F, var- 
ious selectivity shapes, and multiple surveys, the median 
relative errors in key parameters remained low, and the 
variability of the REs was similar among the EMs. Of the
5 cases described here, the level of recruitment variability 
caused the most change in RE patterns. The range of REs in all 4 EMs became wider when recruitment variability increased but remained stable across the other cases. Furthermore, the temporal trend in the accuracy of overfishing
and overfished status determination was the same among 
the 4 EMs. 
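The accuracy metric used throughout this comparison, relative error, is straightforward to compute from simulation output; a sketch (function names are ours):

```python
import statistics

def relative_errors(estimates, true_value):
    """Relative error (RE) of each replicate: (estimate - true) / true."""
    return [(est - true_value) / true_value for est in estimates]

def median_re(estimates, true_value):
    """Median RE across simulation replicates; values near zero
    indicate low bias, and a narrow spread indicates high precision."""
    return statistics.median(relative_errors(estimates, true_value))
```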
These findings indicate that the 4 EMs produce simi- 
lar estimates when the same data are analyzed and the 
EMs are configured similarly. Nevertheless, the results would differ if different feature options were used for different EMs, depending on the stock-specific data and issues that assessment analysts must face. In practice,
stock assessment analysts may make different configu- 
ration choices given the same data and the same model. 
We encourage analysts to clearly document the assump- 
tions made in an assessment, especially when the anal- 
ysis involves comparisons among multiple models. In 
addition, model misspecification may result from differ- 
ent assumptions about parameters, governing processes, 
and statistical properties that can have a substantial 
effect on stock assessment results and subsequent man- 
agement advice (Piner et al., 2011; Maunder and Piner, 
2015). More simulation-estimation studies could be done 
to quantify the effect of model misspecification on model 
estimates. 
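A simulation-estimation study of the kind suggested here pairs an operating model that generates data with an estimation model that may be misspecified, then summarizes the resulting relative errors. A toy skeleton (the fixed multiplicative bias stands in for a misspecified fitted model; all names and values are illustrative):

```python
import random
import statistics

def misspecification_study(n_reps=200, true_value=100.0,
                           misspec_bias=0.10, obs_cv=0.05, seed=1):
    """Toy simulation-estimation loop: each replicate draws noisy
    'data' from the operating model and produces an 'estimate' that
    carries a fixed multiplicative bias standing in for model
    misspecification. Returns the median relative error."""
    rng = random.Random(seed)
    rel_errors = []
    for _ in range(n_reps):
        estimate = true_value * (1.0 + misspec_bias) * (1.0 + rng.gauss(0.0, obs_cv))
        rel_errors.append((estimate - true_value) / true_value)
    return statistics.median(rel_errors)
```

In a real study, the estimate would come from fitting one of the EMs to data generated under a deliberately mismatched OM assumption (e.g., a different selectivity shape).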
The fundamental differences in mathematical and sta- 
tistical attributes found in this study could also serve as a 
starting point for diagnostics that can be used to identify 
the source of variation. In addition to confirming similar 
estimates, we identified that different approaches to computing initial numbers at age induced differences in estimates, especially those associated with R0 and MSY-based
reference points. Estimates among the 4 EMs also differed 
if bias adjustment of recruitment was not addressed care- 
fully. The effects of the initial numbers at age setup and 
recruitment bias adjustment on EM performance are dis- 
cussed in detail in subsequent sections. We also noticed 
that determining the overfishing status was not 100% 
accurate across time because, in the binary classification 
applied in our study, the overfishing determination was 
based on the maximum likelihood estimate. Use of the 
estimated model uncertainty interval may better capture 
the true overfishing determination. Aggregating this 
binary determination over years from one EM and using 
