Oldemeyer et al.: A multiyear Bayesian model for incorporating sparse or missing salmonid data 
median parameter estimates produced for the first year 
of each scenario to the known parameters used to simulate
the data. Estimated median values and credible
interval characteristics of posterior parameter distributions
were examined to assess strata-specific and total
yearly abundance estimates. Total yearly abundance 
estimates and corresponding credible intervals were 
calculated by randomly sampling one value from the
posterior abundance distribution of each unique stratum
for the first year of the scenario. The number of values
sampled depends on how many strata are in the year
(35 for this simulation). Summing these values
and repeating the random sampling procedure 50,000
times creates a total yearly abundance distribution,
Û_Tot. Yearly model bias was measured as the difference
of Û_Tot from the known U_Tot. Strata-specific accuracy
was judged by the number of strata whose predicted 95%
credible intervals included the known abundance
parameter. For abundance estimates produced
for the 2 real data sets, point estimates and credible
interval widths were used to evaluate relative performance
among the models. To imitate a naive Lincoln-Petersen
estimator, the portions of the M_PS posterior parameter
distributions that corresponded to strata missing data
were removed.
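
The resampling procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the posterior draws are simulated stand-ins, the number of retained draws per stratum (n_draws) and the "known" abundances are made up, and bias is taken here as the median of the total distribution minus the known total, which is one reasonable reading of the text.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # seed fixed only for reproducibility

n_strata = 35      # strata in the simulated year (from the text)
n_iter = 50_000    # resampling iterations (from the text)
n_draws = 4_000    # hypothetical number of posterior draws per stratum

# Hypothetical stand-ins: made-up known abundances and posterior draws.
# In practice, `posterior` would hold the model's MCMC output,
# with shape (n_draws, n_strata).
true_u = rng.integers(200, 1500, size=n_strata)
posterior = rng.normal(loc=true_u, scale=0.1 * true_u,
                       size=(n_draws, n_strata))

# Sample one posterior value per stratum, independently, and sum
# across strata; repeat n_iter times to build the total distribution.
rows = rng.integers(0, n_draws, size=(n_iter, n_strata))
cols = np.arange(n_strata)
u_tot_hat = posterior[rows, cols].sum(axis=1)   # shape (n_iter,)

# Yearly summaries: median, 95% credible interval, bias, and
# interval width as a percentage of the known total.
u_tot_known = true_u.sum()
median = np.median(u_tot_hat)
lo, hi = np.percentile(u_tot_hat, [2.5, 97.5])
bias = median - u_tot_known
ci_width_pct = 100 * (hi - lo) / u_tot_known

# Strata-specific coverage: count the strata whose 95% credible
# interval contains the known abundance.
s_lo, s_hi = np.percentile(posterior, [2.5, 97.5], axis=0)
n_covered = int(np.sum((s_lo <= true_u) & (true_u <= s_hi)))
```

Sampling each stratum independently, as done here, treats the strata-level posteriors as uncorrelated; if joint MCMC draws are available, summing across strata within each draw would preserve any posterior correlation.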
Results 
Simulation and scenarios 
Markov chains converged for all models and produced
representative posterior distributions for parameters,
with the exception of the M_PS model. The M_PS model
had Gelman-Rubin test statistics >1.1 and density plots
with multiple peaks for posterior distributions when
strata were missing data. The M_PS model relied primarily
on the vague prior U parameter distributions to
construct posterior distributions when data were missing,
and MCMC required additional iterations (100,000)
to converge around the highest density sample space
and achieve Gelman-Rubin test statistics <1.1. The
posterior U distributions obtained from missing strata
by using the M_PS model were largely the product of the
prior U distribution and added little relevant biological
information to the study; therefore, these strata
were removed from the analysis. This exclusion of strata
is also illustrative of typical Lincoln-Petersen model
performance in that strata without data are excluded
from total abundance estimates even if fish are known
to be migrating.
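
For readers unfamiliar with the convergence check referred to above, the following is a minimal sketch of the classic Gelman-Rubin statistic (potential scale reduction factor). The text does not say which variant or software was used, so the formula below and the function name gelman_rubin are assumptions made for illustration; the two test arrays are simulated, not study output.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic potential scale reduction factor.

    `chains` has shape (m, n): m chains, each with n post-burn-in draws
    of a single parameter.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    within = chains.var(axis=1, ddof=1).mean()      # W: mean within-chain variance
    between = n * chain_means.var(ddof=1)           # B: between-chain variance
    var_plus = (n - 1) / n * within + between / n   # pooled variance estimate
    return float(np.sqrt(var_plus / within))

rng = np.random.default_rng(seed=0)

# Three chains sampling the same distribution: statistic close to 1.
mixed = rng.normal(0.0, 1.0, size=(3, 5000))

# Three chains stuck around different modes, as can happen when a
# posterior is multimodal: statistic well above the 1.1 threshold.
stuck = mixed + np.array([[0.0], [3.0], [6.0]])

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Values near 1 indicate that between-chain and within-chain variability agree; the 1.1 threshold quoted in the text is the conventional cutoff.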
The pooled probability model, M_PS, produced the
most precise yearly abundance estimates from the
simulated scenarios, with credible interval widths
between 5% and 9% of U_Tot (Table 1). The M_PS precision
is misleading in that the uncertainty associated with
the strata missing data was excluded from the total
yearly abundance estimate. In addition, the precision
of the M_PS model is dependent on the assumption that
capture probabilities are constant across all strata,
which was not true. The inflated precision of the M_PS
model also caused known parameters to be excluded
from strata-specific 95% credible intervals, giving
M_PS the worst strata-specific coverage.
The M_PS model overestimated U_Tot by 2467 individuals
(10.4%) for the full scenario. In subsequent scenarios,
total yearly abundance estimates became less biased
as strata were removed and data were reduced.
By removing strata missing data, the M_PS model estimate
of U_Tot should theoretically become negatively biased by
2093 individuals (8.9%) when missing 4 strata in the spring
and by 1200 individuals (5.1%) when missing 8 strata
in the summer. In these scenarios, the tendency of the
pooled capture probabilities to overestimate U_Tot offset
the negative bias incurred from removing strata with
missing data.
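
The offsetting effect described above can be illustrated with a toy calculation. Everything in this sketch is hypothetical: the three strata, their abundances, capture probabilities, and release numbers are made up, expected counts are used in place of random data, and a simple catch-divided-by-capture-probability estimator stands in for the Bayesian models. The direction of the pooling bias depends on how releases are distributed across strata; the values below were chosen so that pooling overestimates the total.

```python
import numpy as np

# Made-up values for three strata; none come from the study.
true_u = np.array([1000.0, 1000.0, 1000.0])   # unmarked fish per stratum
p = np.array([0.10, 0.20, 0.40])              # true capture probability
released = np.array([400.0, 200.0, 100.0])    # marked fish released

# Expected counts with no sampling noise, to isolate the pooling effect.
recaptured = released * p   # marked fish recaptured
caught = true_u * p         # unmarked fish caught

# Stratum-specific capture probabilities recover the true total.
p_strat = recaptured / released
u_strat = float((caught / p_strat).sum())

# One pooled capture probability applied to the whole year.
p_pooled = recaptured.sum() / released.sum()
u_pooled = float(caught.sum() / p_pooled)

# Same pooled estimator with the last stratum dropped, mimicking the
# exclusion of a stratum that is missing data.
keep = slice(0, 2)
p_pooled_missing = recaptured[keep].sum() / released[keep].sum()
u_pooled_missing = float(caught[keep].sum() / p_pooled_missing)

true_total = float(true_u.sum())
```

With these numbers the pooled estimate is about 4083 against a true total of 3000, while dropping the third stratum gives 2250: the negative bias from excluding a stratum partly cancels the positive bias from pooling.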
The M_HW and M_SPLINE models performed better than
M_PS when addressing sparse and missing data (Table
1). The M_HW model had a credible interval width of 13%
for the full data scenario, with a bias of 543 individuals
(2.3%). As data were reduced and removed, bias increased
up to 7064 individuals (30.0%) and the percent
credible interval width increased up to 63%. The hierarchical
structure of the M_HW model integrated information
from the entirety of the year, causing additional
variability to be incorporated into the posterior distributions,
particularly for strata missing data. Similar
to the M_HW model, the M_SPLINE model used information
from throughout the year to inform strata with sparse
and missing data but implemented a P-spline function
to localize interpolation of abundance estimates to
adjacent strata. This process reduced the variability of
posterior parameter distributions for strata with sparse
and missing data and produced abundance estimates
with biases ranging from -818 individuals (-3.5%) to 516
individuals (2.2%) and credible interval widths that
were 18-23% of U_Tot. The predetermined spline
characteristics prevented the M_SPLINE model from
producing estimates for periods missing >4 consecutive
strata. The M_HW and M_SPLINE models had comparable
numbers of strata-specific 95% credible intervals
that included known abundance parameters, but the
credible intervals from the M_SPLINE model were
narrower.
The M_HB model had the most accurate U_Tot estimates
in 3 scenarios and the second smallest credible
interval widths (Table 1). Strata-specific credible intervals
produced by the M_HB model were the only credible
intervals to encompass the known parameters for
each stratum in every scenario. As the quality of simulated
data sets decreased, the hierarchical multiyear
structure was able to draw inferences from previous
years to supplement the missing and sparse data. This
procedure allowed model M_HB to produce the most
accurate estimates with missing data.
Application of models to Marsh Creek and Big Creek data 
Total population estimates for Marsh Creek during 
2014 were similar among the models, although confi- 
