Fishery Bulletin 112(2-3) 
the range of Chinook Salmon in the United States 
from Washington to California, while also allowing for 
the identification of fish from elsewhere in the geo- 
graphic range of this species. Adult fish were sampled 
on spawning grounds, in terminal fisheries, or at hatch- 
eries during the period of 2003-13 and were provided 
by numerous contributors (see the Acknowledgments 
section and Warheit et al.³). We included populations 
expected to be encountered in ocean fisheries off Cali- 
fornia and Oregon, as well as populations with special 
management status (e.g., ESA-listed populations). Ac- 
cordingly, the major lineages of Chinook Salmon from 
California and Oregon were emphasized in this base- 
line, as were populations distinguished by life histo- 
ry strategy (e.g., spring-run, fall-run, and winter-run 
strategies), but representatives of the major lineages 
from farther north also were included. 
DNA was extracted from samples for California 
populations with DNeasy Blood & Tissue Kits on a 
BioRobot 3000⁴ platform (QIAGEN, Inc., Valencia, CA) 
according to the manufacturer’s protocols, and DNA 
from populations in Oregon, Washington, Canada, and 
Alaska was extracted by contributors (see Acknowledgments 
section) who used various methods. Sample sizes 
ranged from 44 to 1409 individuals per population and 
averaged 116 individuals per population. The 1409 fish 
from the population in the Trinity River Hatchery ini- 
tially were genotyped with our SNP panel for another 
purpose, but they were included here in total to provide 
a comprehensive reference sample for identification of 
this important group. Excluding this disproportionately 
large sample, the average number of individuals per 
population was 97. In total, the new baseline includ- 
ed 7984 Chinook Salmon from 68 distinct populations 
(Table 1). 
Each population in this baseline belongs to a single 
reporting unit, a designation established in previous 
GSI research that reflects a combination of “genetic 
similarity, geographic features, and management appli- 
cations” (Seeb et al., 2007). Reporting units generally 
are composed of multiple populations that share ge- 
netic similarity or are subject to similar management 
regimes. The 68 populations of Chinook Salmon in our 
baseline fall into 38 distinct reporting units (Table 1), 
and some reporting units in Alaska and Canada are 
represented by only a single population. 
Coho Salmon occasionally are misidentified as Chinook 
Salmon in ocean fisheries and in ecological sampling. 
We included a collection of 47 Coho Salmon from 
California as the 69th population in our baseline to 
help us to identify Coho Salmon that have been 
identified incorrectly as Chinook Salmon. 
³ Warheit, K. I., L. W. Seeb, W. D. Templin, and J. E. Seeb. 
2013. Moving GSI into the next decade: SNP coordination 
for Pacific Salmon Treaty fisheries. FPT 13-09, 47 p. [Available 
from Washington Department of Fish and Wildlife, 600 
Capitol Way N., Olympia, WA 98501-1091.] 
⁴ Mention of trade names or commercial companies is for 
identification purposes only and does not imply endorsement by 
the National Marine Fisheries Service, NOAA. 
Markers and genotyping 
We compiled a list of 192 TaqMan (Life Technologies 
Corp., Carlsbad, CA), or 5’-nuclease, SNP genotyp- 
ing assays from previously published discovery stud- 
ies (Smith et al., 2005a, 2005b; Campbell and Narum, 
2008; Narum et al., 2008; Clemento et al., 2011) to test 
their scorability and power for GSI. TaqMan technol- 
ogy combines standard PCR primers that target the 
genomic region around a SNP with 2 different fluores- 
cent probes that identify the 2 nucleotide bases present 
at the SNP. As recommended by the manufacturer, we 
used a multiplex preamplification reaction to increase 
the copy number of targeted genomic regions. Multiplex 
PCR products were diluted with 15 µL of 2 mM 
Tris buffer and were frozen. 
Samples then were genotyped on 96.96 Dynamic Ar- 
rays with an EP1 System (Fluidigm Corp., South San 
Francisco, CA) according to the manufacturer’s proto- 
cols. Fluidigm Dynamic Arrays use integrated nanoflu- 
idic circuitry to simultaneously determine the genotype 
at 96 SNP loci for 96 samples (2 of which are no-DNA 
template controls). Genotypes were determined with 
the Fluidigm Genotyping Analysis software (vers. 
2.1.1). The use of quantitative PCR methods for geno- 
type determination involves discerning, on a 2-D graph, 
clusters of fluorescence intensity of the probes for the 
2 alleles; the 2 homozygote clusters have fluorescence 
primarily from only 1 probe, but a heterozygote cluster 
has similar intensities from both probes. 
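The cluster-based scoring described above can be illustrated with a minimal sketch. It is not the algorithm used by the Fluidigm software; it simply assumes that each sample is summarized by two non-negative, normalized probe intensities and classifies it by the angle of that point on the 2-D graph. The function name and all thresholds (min_signal, homo_angle, het_halfwidth) are hypothetical values chosen for illustration.

```python
import math


def call_genotype(x, y, min_signal=0.2, homo_angle=25.0, het_halfwidth=15.0):
    """Classify one sample from its two probe intensities.

    x is the intensity of the allele-1 probe and y that of the allele-2
    probe. All thresholds are illustrative assumptions, not vendor values.
    """
    # Points near the origin carry too little signal to score
    # (this is where a no-DNA template control is expected to fall).
    if math.hypot(x, y) < min_signal:
        return "no_call"

    # 0 degrees = signal from probe 1 only, 90 degrees = probe 2 only.
    angle = math.degrees(math.atan2(y, x))

    if angle <= homo_angle:
        return "hom_1"  # fluorescence primarily from probe 1
    if angle >= 90.0 - homo_angle:
        return "hom_2"  # fluorescence primarily from probe 2
    if abs(angle - 45.0) <= het_halfwidth:
        return "het"    # similar intensities from both probes
    return "no_call"    # falls between clusters; left unscored
```

In this sketch a sample lying between the homozygote and heterozygote bands is deliberately left unscored rather than forced into the nearest cluster, which mirrors the usual practice of reviewing ambiguous points by eye.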
Marker selection 
We selected a panel of 95 SNP markers from among 
the 192 candidates, reserving 1 marker for a species 
identification assay (see final paragraph of this sec- 
tion). The risk of “high-grading bias” (i.e., wrongly in- 
flating the apparent resolving power of a group of loci 
for GSI) is particularly great when selecting a panel 
of markers to distinguish between populations that 
are closely related, as many of the populations in our 
baseline are. To avoid high-grading bias, we employed 
the “training-holdout-leave-one-out” (THL) procedure of 
Anderson (2010); this procedure requires that data be 
split into training and holdout sets. Training-set geno- 
types are used to select the loci included in a baseline 
and can be included in the eventual baseline, but they 
are not used to evaluate its performance. Rather, per- 
formance of a baseline is determined with simulation 
and self-assignment with only the holdout set, which 
was not used in any way to select baseline loci. We 
chose a training set of 372 individuals drawn from 22 
populations (14 from California, 3 from Oregon, 3 from 
Washington, 1 from British Columbia, and 1 from Alas- 
ka) for initial genotyping with all 192 loci. 
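The logic of separating locus selection from performance evaluation can be sketched as follows. This is a simplified illustration and not the implementation of Anderson (2010): it assumes biallelic genotypes coded as 0, 1, or 2 copies of one allele, ranks loci by the spread of training-set allele frequencies (a stand-in for whatever selection criterion is actually used), and scores holdout individuals by leave-one-out self-assignment under Hardy-Weinberg proportions. All function names and the pseudocount are hypothetical.

```python
import math
import random


def allele_freq(genos, locus, skip=None):
    """Frequency of allele 1 at one locus, with a 0.5 pseudocount per
    allele so no frequency is exactly 0 or 1. `skip` omits one individual."""
    count = sum(g[locus] for i, g in enumerate(genos) if i != skip)
    n = len(genos) - (1 if skip is not None else 0)
    return (count + 0.5) / (2 * n + 1.0)


def log_lik(geno, freqs, loci):
    """Log-probability of a genotype under Hardy-Weinberg proportions."""
    total = 0.0
    for locus in loci:
        p = freqs[locus]
        probs = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
        total += math.log(probs[geno[locus]])
    return total


def split(data, train_fraction, rng):
    """Divide each population's individuals into training and holdout sets."""
    train, holdout = {}, {}
    for pop, genos in data.items():
        shuffled = genos[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train[pop], holdout[pop] = shuffled[:cut], shuffled[cut:]
    return train, holdout


def select_loci(train, n_loci, k):
    """Pick the k loci with the widest allele-frequency spread among
    populations, using training-set genotypes only."""
    def spread(locus):
        f = [allele_freq(g, locus) for g in train.values()]
        return max(f) - min(f)
    return sorted(range(n_loci), key=spread, reverse=True)[:k]


def holdout_loo_accuracy(holdout, loci):
    """Leave-one-out self-assignment using holdout individuals only."""
    correct = total = 0
    for true_pop, genos in holdout.items():
        for i, geno in enumerate(genos):
            scores = {}
            for pop, ref in holdout.items():
                # Remove the focal fish from its own population's reference.
                skip = i if pop == true_pop else None
                freqs = {l: allele_freq(ref, l, skip) for l in loci}
                scores[pop] = log_lik(geno, freqs, loci)
            correct += max(scores, key=scores.get) == true_pop
            total += 1
    return correct / total
```

The point of the design is that select_loci never sees a holdout genotype, so the accuracy returned by holdout_loo_accuracy is not inflated by the selection step.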
