Fishery Bulletin 99(1), p. 156
Q. Denote the count of the $j$th allele of the $h$th locus for
the $m$th mixture individual by $x_{mhj}$. Let the collection of
such counts, $X_m$, denote the multilocus genotype of the
$m$th individual, and let the array X denote the collection
of such arrays for the M individuals composing the
stock-mixture sample. Further, let the RF of individuals with
the genotype $X_m$ in the $i$th stock, which depends on that
stock’s allele RFs, be denoted as $f(X_m \mid Q_i)$. The RF of
the genotype in the stock mixture is the weighted sum,
$\sum_i p_i f(X_m \mid Q_i)$, and so the likelihood function for
the stock-mixture sample is given by Equation 6.
Dirichlet priors with multinomial counts from the stock 
mixture. 
The stock identities of the mixture individuals are determined
by chance in the data augmentation algorithm. Let
$z_m = (z_{m1}, z_{m2}, \ldots, z_{mc})$ indicate the stock origin of the $m$th
mixture individual by a single “1” at the coordinate of the
contributing stock, and c-1 “0”s at the remaining coordinates.
For later reference, let $Z = (z_1, z_2, \ldots, z_M)$ denote the
stock origins of all the mixture individuals. If p and Q were
known, the proportion of mixture individuals with genotype
$X_m$ that came from the $i$th stock could be calculated as
$w_{mi}$ of Equation 8.
$$g(X \mid p, Q) = \prod_{m=1}^{M} \sum_{i=1}^{c} p_i f(X_m \mid Q_i). \qquad (6)$$
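Equation 6 is easier to evaluate on the log scale. The following is a minimal sketch, not the authors' code: it assumes the genotype RFs have already been computed into a hypothetical array `F` with `F[m, i]` equal to $f(X_m \mid Q_i)$, and the toy numbers are invented for illustration.

```python
import numpy as np

def mixture_log_likelihood(p, F):
    """Log of Equation 6 for a stock-mixture sample.

    p : (c,) stock proportions, assumed to sum to 1
    F : (M, c) array with F[m, i] = f(X_m | Q_i), assumed precomputed
    """
    p = np.asarray(p, dtype=float)
    F = np.asarray(F, dtype=float)
    # Weighted sum over stocks: RF of each genotype in the mixture.
    per_individual = F @ p
    # Product over individuals becomes a sum of logs.
    return float(np.sum(np.log(per_individual)))

# Hypothetical toy values: 3 mixture individuals, 2 stocks.
F = np.array([[0.50, 0.10],
              [0.20, 0.20],
              [0.05, 0.40]])
p = np.array([0.6, 0.4])
loglik = mixture_log_likelihood(p, F)
```

Summing logs rather than multiplying the M factors avoids numerical underflow when the mixture sample is large.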
In the Bayesian view, X is fixed and $g(X \mid p, Q)$ is a random
function of the unknowns, p and Q. Again, although the
likelihood function for the stock-mixture genotypes has
been described with alleles and loci, it applies equally to
stock-mixture genotypes of any combination of independent
components: alleles at loci, haplotypes at mtDNA,
and genotypes at loci in Hardy-Weinberg disequilibrium.
Posterior distribution of the unknowns, $\pi(p, Q \mid X, Y)$
The Bayesian assessment of the unknown stock proportions
in the stock mixture and of the baseline RFs of haplotypes,
alleles, or genotypes is provided by their joint
posterior distribution. This posterior distribution is
proportional to the product of the prior density for the unknowns
and the likelihood function of the stock-mixture sample,
given the unknowns. The prior density for the stock-mixture
proportions is the uninformative Dirichlet of Equation 2.
The baseline posterior at Equation 5 becomes the
stock-mixture prior for the HAG RFs. Prior information
on stock-mixture composition and the HAG RFs is reasonably
considered independent, so the joint prior for the
unknowns is the product of Equations 2 and 5 (Equation 7).
$$w_{mi} = p_i f(X_m \mid Q_i) \Big/ \sum_{k=1}^{c} p_k f(X_m \mid Q_k), \quad i = 1, 2, \ldots, c. \qquad (8)$$
Equivalently, the probability that a randomly drawn mixture
individual with genotype $X_m$ came from the $i$th stock
is $w_{mi}$ of Equation 8. The data augmentation algorithm
draws the missing stock identity, $z_m$, for each mixture
individual from the multinomial distribution, $z_m \sim \mathrm{Mult}(1, w_m)$,
where the probabilities for the stocks listed by
$w_m = (w_{m1}, w_{m2}, \ldots, w_{mc})$ are computed from the current samples of
p and Q. Colloquially, the stock identity of each stock-mixture
individual is randomly assigned with the probability
for any stock equal to the stock-mixture fraction of the
genotype contributed by the stock.
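The assignment step can be illustrated with a short sketch. It assumes, as a simplification not stated in the text, that the genotype RFs are held in a hypothetical precomputed array `F` with `F[m, i]` equal to $f(X_m \mid Q_i)$; the function names and toy values are likewise invented.

```python
import numpy as np

def assignment_probabilities(p, F):
    """Equation 8: w[m, i] = p_i f(X_m|Q_i) / sum_k p_k f(X_m|Q_k)."""
    weighted = np.asarray(F, dtype=float) * np.asarray(p, dtype=float)
    return weighted / weighted.sum(axis=1, keepdims=True)

def draw_stock_identities(w, rng):
    """Draw one indicator vector z_m ~ Mult(1, w_m) per mixture individual."""
    return np.vstack([rng.multinomial(1, w_m) for w_m in w])

# Hypothetical toy values: 3 mixture individuals, 2 stocks.
F = np.array([[0.50, 0.10],
              [0.20, 0.20],
              [0.05, 0.40]])
p = np.array([0.6, 0.4])
rng = np.random.default_rng(0)

w = assignment_probabilities(p, F)
z = draw_stock_identities(w, rng)
```

Each row of `z` holds a single 1 at the coordinate of the drawn stock, matching the indicator vectors $z_m$ defined in the text.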
In broad outline, the data augmentation algorithm used
to draw posterior samples is straightforward. After the
initial sample is obtained (as described later), a sequence
of samples is drawn with each sample dependent only on
the preceding sample; that is, the algorithm is a Markov
chain Monte Carlo (MCMC) method. At the $k$th sample,
two steps are performed:
1 Draw stock identities of the mixture individuals,
$z_m^{(k)} \sim \mathrm{Mult}(1, w_m^{(k)})$, using Equation 8 for genotype $X_m$ and the
current values $p = p^{(k)}$ and $Q = Q^{(k)}$, $m = 1, 2, \ldots, M$.
2 Draw $p^{(k+1)}$ and $Q^{(k+1)}$ from their respective posterior
densities, $\pi(p \mid X, Y, Z^{(k)})$ and $\pi(Q \mid X, Y, Z^{(k)})$.
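The two steps can be sketched end to end as follows. This is an illustrative sketch under several assumptions that are not taken from the text: every locus has the same number of alleles J; the prior parameters `alpha` (for p) and `beta` (for allele RFs) are hypothetical; the initial sample is simply the prior mean of p and the baseline posterior mean of Q; and the genotype RF is taken as the product of allele RFs raised to the allele counts, with the multinomial coefficient dropped because it cancels in Equation 8.

```python
import numpy as np

def data_augmentation(X, Y, alpha, beta, n_iter, rng):
    """Sketch of the two-step sampler; returns the sampled p vectors.

    X : (M, H, J) allele counts of the mixture individuals
    Y : (c, H, J) baseline allele counts for each stock
    alpha : (c,) assumed Dirichlet parameters for p
    beta : assumed positive Dirichlet pseudo-count for allele RFs
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    c, H, J = Y.shape
    # Assumed initial sample (the source describes its own choice later).
    p = alpha / alpha.sum()
    Q = (Y + beta) / (Y + beta).sum(axis=2, keepdims=True)
    p_samples = np.empty((n_iter, c))
    for k in range(n_iter):
        # Step 1: draw stock identities z_m ~ Mult(1, w_m) via Equation 8.
        log_f = np.einsum("mhj,ihj->mi", X, np.log(Q))
        log_w = np.log(p) + log_f
        log_w -= log_w.max(axis=1, keepdims=True)   # numerical stability
        w = np.exp(log_w)
        w /= w.sum(axis=1, keepdims=True)
        Z = np.vstack([rng.multinomial(1, w_m) for w_m in w])
        # Step 2a: update p from the counts of stock identities.
        p = rng.dirichlet(alpha + Z.sum(axis=0))
        # Step 2b: update allele RFs with baseline counts plus the
        # allele counts of mixture individuals assigned to each stock.
        mix_counts = np.einsum("mi,mhj->ihj", Z.astype(float), X)
        Q = np.empty((c, H, J))
        for i in range(c):
            for h in range(H):
                Q[i, h] = rng.dirichlet(beta + Y[i, h] + mix_counts[i, h])
        p_samples[k] = p
    return p_samples

# Hypothetical toy data: 2 stocks, 2 loci, 2 alleles per locus.
Y = np.array([[[90, 10], [90, 10]],
              [[10, 90], [10, 90]]], dtype=float)
X = np.array([[[2, 0], [2, 0]]] * 32 + [[[0, 2], [0, 2]]] * 8, dtype=float)
samples = data_augmentation(X, Y, alpha=np.ones(2), beta=0.5,
                            n_iter=200, rng=np.random.default_rng(1))
```

In practice the early samples would be discarded as burn-in, and the remaining rows of `samples` summarized to describe the posterior distribution of p.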
$$\pi(p, Q) = \pi(p)\,\pi(Q \mid Y). \qquad (7)$$
The posterior distribution for p and Q with the stock-mixture
sample observed, $\pi(p, Q \mid X, Y)$, is proportional to
the product of their likelihood at Equation 6 and their
prior at Equation 7. Analytic evaluation of the posterior
distribution is impractical because of the prodigious
computation required, caused by the combinatorial explosion
of terms in the likelihood function with increase in
stock-mixture sample size (Bernardo and Giron, 1988).
Instead, a sufficient number of samples are drawn
sequentially from the posterior distribution to accurately
describe it. The data augmentation algorithm (Tanner
and Wong, 1987; Diebolt and Robert, 1994) can be used
to draw the sequence of samples. The idea underlying the
algorithm is that the estimation problem would be much
simplified if the stock identities of the mixture individuals
were known. Given the stock identities, the posterior
distribution for the stock proportions and HAG RFs
in the baseline stocks simply requires updating of the
The stock identities, $Z^{(k)}$, of the stock-mixture sample are
sufficient statistics for p (Pella et al., 1996). With them
available, the genetic data of the stock-mixture sample is
of no value to estimation of p. Therefore, the posterior for
p is obtained by updating the Dirichlet prior for p with the
counts of stock identities for the mixture individuals,
$$\pi(p \mid X, Y, Z^{(k)}) = \pi(p \mid Z^{(k)}) =$$
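The sufficiency of the stock identities for p can be checked with a short derivation. This is a sketch only; the count symbol $n_i^{(k)}$ and the prior parameters $\alpha_i$ used below are notation assumed here, because the Dirichlet prior of Equation 2 is not reproduced in this section. With the identities known, each individual contributes only the term for its own stock, so the genotype RFs factor out as constants with respect to p:

```latex
\begin{align*}
\pi(p \mid X, Y, Z^{(k)}, Q)
  &\propto \pi(p) \prod_{m=1}^{M} \prod_{i=1}^{c}
     \bigl[ p_i \, f(X_m \mid Q_i) \bigr]^{z_{mi}^{(k)}} \\
  &\propto \pi(p) \prod_{i=1}^{c} p_i^{\,n_i^{(k)}},
  \qquad n_i^{(k)} = \sum_{m=1}^{M} z_{mi}^{(k)} .
\end{align*}
```

Under the assumed notation, a Dirichlet prior with parameters $\alpha_1, \ldots, \alpha_c$ would therefore give a Dirichlet posterior with parameters $\alpha_i + n_i^{(k)}$, by the standard Dirichlet-multinomial conjugacy.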
The posterior density for HAG RFs of the genetic components,
$\pi(Q \mid X, Y, Z^{(k)})$, updates the stock-mixture prior,
or baseline posterior, $\pi(Q \mid Y)$, at Equation 5 for the HAG
counts from the identified mixture individuals as
$$\pi(Q_i \mid X, Y, Z^{(k)}) = \prod_{h=1}^{H} \pi(q_{ih} \mid X, y_{ih}, Z^{(k)}) = \qquad (10)$$
