Li et al.: A comparison of 4 age-structured stock assessment models 
simulation tests to compare estimates to true values (Fig. 1).
The code comparison process helps to verify whether the
code from the 4 EMs executes the intended algorithms in the
same way and to identify common features of the EMs from
which to develop an OM (Table 1, Suppl. Material [online only],
Suppl. Table 1 [online only]) with only those commonalities
(NRC, 1998) (Table 2). The simulation-estimation process,
which helps validate the accuracy of the EMs, consisted of 4 main steps:
1) developing an OM to simulate annual fish population and
fishery dynamics, 2) fitting the 4 EMs to the simulated data,
3) repeating the simulation-estimation 100 times with different
recruitment deviations and observation errors for each
iteration, and 4) comparing estimates from the EMs with the
true values from the OM (Fig. 1). This process was repeated
for 13 cases (Table 3). Comparisons were made among the
4 EMs within each case and across cases.

Figure 1
Flow diagram of the processes used to compare 4 age-structured stock
assessment models used in the United States. Steps 1 and 2, which compose the
code comparison process, involve identification of common features and
source code comparison of the estimation models (EMs). Steps 3 and 4, which
compose the simulation-estimation process, involve operating model (OM)
development, estimation with EMs, and comparison of performance between
the EMs. Common features, input standardization, quantities of interest, and
performance measures are described in the “Materials and methods” section
and in Table 1. The “true” values include actual values used as inputs for
development of the OM and simulated true values for quantities of interest.
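The 4-step simulation-estimation loop described above can be sketched in Python. Here `run_om` and `fit_em` are hypothetical stand-ins for the actual OM and the 4 EMs (which in the study are separate assessment packages), and the error magnitudes and quantities are illustrative assumptions, not the study's values:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def run_om(recruit_devs, obs_errors):
    """Hypothetical stand-in for the OM: returns simulated data for the
    EMs and the true values of a quantity of interest (here, 'ssb')."""
    true_values = {"ssb": 1000.0 * float(np.exp(recruit_devs.mean()))}
    sim_data = {"index": true_values["ssb"] * np.exp(obs_errors)}
    return sim_data, true_values

def fit_em(em_name, sim_data):
    """Hypothetical stand-in for fitting one estimation model."""
    return {"ssb": float(sim_data["index"].mean())}

em_names = ["EM1", "EM2", "EM3", "EM4"]
n_iter = 100
relative_errors = {em: [] for em in em_names}

for _ in range(n_iter):
    # Step 3: new recruitment deviations and observation errors per iteration
    recruit_devs = rng.normal(0.0, 0.4, size=30)
    obs_errors = rng.normal(0.0, 0.2, size=30)
    # Step 1: simulate annual population and fishery dynamics with the OM
    sim_data, truth = run_om(recruit_devs, obs_errors)
    # Step 2: fit each EM to the same simulated data
    for em in em_names:
        est = fit_em(em, sim_data)
        # Step 4: compare EM estimates with the OM's true values
        relative_errors[em].append((est["ssb"] - truth["ssb"]) / truth["ssb"])
```

In the study this loop is repeated for each of the 13 cases, and performance is compared both among EMs within a case and across cases.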
Operating model and comparison cases 
The OM developed in this study was an age-structured model, with parameter
values describing life history obtained from Siegfried et al. (2016). That study
simulated a population on the basis of an amalgam of life history traits common
to species found in waters of the Atlantic Ocean off the southeastern United
States. In our study, the OM population was simulated with an annual time step
over 30 years and a maximum age of 12 years (Table 1, Suppl. Material [online
only], Suppl. Table 1 [online only]). The OM null case (case 0) had one fishing
fleet and one survey, with fully selected F linearly increasing with time. A
time-invariant logistic selectivity function was used for both the fishing fleet
and the survey in the null case. Fishery landings and survey abundance index
data were simulated yearly with observation error from year 1 to year 30. The
annual sample size was 200 samples for age composition data from both the
fishery and the survey. In all cases except case 12, the initial equilibrium
recruitment was lowered from the unfished recruitment level as determined by
an initial equilibrium F (spawning biomass per recruit based on F [φ_F] is less
than unfished spawning biomass per recruit [φ_0], as in equation 3.4 provided
in Supplementary Table 2 [online only]). The addition of case 12, for which the
initial condition was equal to the unfished equilibrium population (φ_F = φ_0),
as in equation 3.4 provided in Supplementary Table 2 (online only), allowed a
comparison of methods for simulating the initial population. Details of
parameter definitions and equations used to describe the OM under case 0 are
presented in Table 1 and in the Supplementary Material (online only) (the code
for creating the OM and comparing the EM results is available online).
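As a rough illustration of the null-case OM structure (annual time step over 30 years, maximum age of 12 years, one fleet and one survey, time-invariant logistic selectivity, linearly increasing fully selected F, and lognormal observation error), the following Python sketch uses made-up parameter values; the study's actual values come from Siegfried et al. (2016) and the Supplementary Tables, and the natural mortality, selectivity parameters, error magnitudes, and plus-group treatment here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_years, max_age = 30, 12
ages = np.arange(1, max_age + 1)
M = 0.2                                             # natural mortality (assumed)
R0 = 1000.0                                         # unfished recruitment (assumed)
sel = 1.0 / (1.0 + np.exp(-1.5 * (ages - 3.0)))     # time-invariant logistic selectivity
full_F = np.linspace(0.05, 0.4, n_years)            # fully selected F, linear in time

# Initial numbers at age from an assumed initial equilibrium F
F_init = full_F[0]
N = np.zeros((n_years, max_age))
N[0, 0] = R0
for a in range(1, max_age):
    N[0, a] = N[0, a - 1] * np.exp(-(M + F_init * sel[a - 1]))

landings = np.zeros(n_years)
for y in range(n_years):
    Z = M + full_F[y] * sel                         # total mortality at age
    # Baranov catch equation for catch at age
    catch_at_age = full_F[y] * sel / Z * N[y] * (1.0 - np.exp(-Z))
    landings[y] = catch_at_age.sum()
    if y < n_years - 1:
        # Recruitment with lognormal deviations (illustrative sigma)
        N[y + 1, 0] = R0 * np.exp(rng.normal(0.0, 0.4))
        N[y + 1, 1:] = N[y, :-1] * np.exp(-Z[:-1])
        N[y + 1, -1] += N[y, -1] * np.exp(-Z[-1])   # plus group at max age (assumed)

# Observed landings and survey abundance index with lognormal observation error
obs_landings = landings * np.exp(rng.normal(0.0, 0.1, n_years))
survey_index = (N * sel).sum(axis=1) * np.exp(rng.normal(0.0, 0.2, n_years))
```

Age composition observations (annual sample size of 200 in the study) could then be drawn, for example, from a multinomial distribution over the catch-at-age and survey-at-age proportions.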
Eleven additional cases were explored 
to investigate the effects of recruitment 
variability, process error in F, patterns in F, 
