


Parameterizing probabilities for estimating 

 age-composition distributions for mixture models 



Daniel K. Kimura 



Martin W. Dorn 



Alaska Fisheries Science Center 

 National Marine Fisheries Service 

 7600 Sand Point Way NE 

 Seattle, Washington 98115-6349 

 E-mail (for D. Kimura): Dan.Kimura@noaa.gov 



Heifetz and Fujioka parameterization 



Heifetz and Fujioka (1991), for a tagged fish migration problem, used a somewhat similar parameterization that guaranteed estimated parameters would be probability distributions. Suppose a distribution consisted of $a$ categories with probabilities $\{p_j\}$. The Heifetz-Fujioka (H-F) parameterization would be



$$p_j = \frac{r_j}{\sum_k r_k}\left(1 - \exp\left(-\sum_k r_k\right)\right),$$

$j, k = 1, \ldots, a-1$ and



When estimating parameters that constitute a discrete probability distribution $\{p_j\}$, it is difficult to determine how constraints should be imposed to guarantee that the estimated parameters $\{p_j\}$ constitute a probability distribution (i.e., $p_j \geq 0$, $\sum_j p_j = 1$). For age distributions estimated from mixtures of length-at-age distributions, the EM (expectation-maximization) algorithm (Hasselblad, 1966; Hoenig and Heisey, 1987; Kimura and Chikuni, 1987), restricted least squares (Clark, 1981), and weak quasisolutions (Troynikov, 2004) have all been used. Each of these methods appears to guarantee that the estimated distribution will be a true probability distribution with all categories greater than or equal to zero and with individual probabilities that sum to one. In addition, all these methods appear to provide a theoretical basis for solutions that will be either maximum-likelihood estimates or at least convergent to a probability distribution.



However, all these methods are presented in a theoretical context that is useful in understanding the theory, but may not be suitable for actual application. Currently, most modelers will have an optimization program available. This note describes how, in a brief amount of time, the modeler can add a mixture model program to his or her collection of readily available programs: one that will compute maximum-likelihood estimates for the mixture problem and will incorporate the experimenter's familiar optimization program.



To do this it is necessary to parameterize the estimated probabilities so that they are in fact guaranteed to constitute a probability distribution (i.e., $p_j \geq 0$, $\sum_j p_j = 1$). Two such parameterizations are the multinomial logit and a parameterization method described by Heifetz and Fujioka (1991). The trick with both parameterizations is that although the parameters that are actually estimated are unconstrained, these unconstrained estimates can be easily transformed to constrained probability estimates.
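To make the idea concrete, the following is a minimal Python sketch of an objective function that a general-purpose optimizer could minimize over unconstrained parameters. It is an illustration only, not the authors' program. It assumes the multinomial logit form of the transformation and, purely for the example, normal length-at-age densities with known means and standard deviations; the function names (`logit_probs`, `normal_pdf`, `neg_log_likelihood`) and the data layout are hypothetical.

```python
import math

def logit_probs(x):
    """Map unconstrained x_1..x_{a-1} to probabilities p_1..p_a (multinomial logit)."""
    r = [math.exp(v) for v in x]           # r_j = exp(x_j) is always positive
    denom = 1.0 + sum(r)
    return [rj / denom for rj in r] + [1.0 / denom]

def normal_pdf(y, mean, sd):
    """Normal density; an assumed form for each length-at-age distribution."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def neg_log_likelihood(x, lengths, means, sds):
    """Negative log-likelihood of observed lengths under the age mixture.

    x       : unconstrained parameters, one fewer than the number of ages
    lengths : observed fish lengths
    means, sds : assumed known length-at-age means and standard deviations
    """
    p = logit_probs(x)
    nll = 0.0
    for y in lengths:
        density = sum(pj * normal_pdf(y, m, s) for pj, m, s in zip(p, means, sds))
        nll -= math.log(density)
    return nll
```

Because `x` is unconstrained, a function like this could be handed directly to whatever minimizer the modeler already uses (for example, `scipy.optimize.minimize`), and the estimated age composition recovered afterwards with `logit_probs`.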



Materials and methods 



Multinomial logit parameterization 



Consider parameterizing the probabilities using the classic logit model:

$$p_j = \frac{r_j}{1 + \sum_k r_k}, \quad j, k = 1, \ldots, a-1, \quad \text{and} \quad p_a = \frac{1}{1 + \sum_k r_k}.$$



The $r_j$ can be guaranteed to be positive by defining $r_j = \exp(x_j)$. The parameters that are estimated are $x_j = \ln(r_j)$. In turn the $\{x_j\}$ are used to estimate the $\{r_j\}$, and then the $\{p_j\}$. Thus, if $x_1, \ldots, x_{a-1}$ are estimated on $(-\infty, \infty)$, the resulting $\{p_j\}$ is guaranteed to be a probability distribution. The relationship between the $\{p_j\}$ and $\{r_j\}$ is a unique one-to-one mapping because given any $\{p_j\}$, $r_j = p_j/p_a$, for $j = 1, \ldots, a-1$.
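The forward and inverse logit mappings can be sketched in a few lines of Python. This is a minimal illustration under the definitions in the text ($r_j = \exp(x_j)$ and $r_j = p_j/p_a$); the function names are hypothetical, and the inverse assumes every probability is strictly positive.

```python
import math

def logit_to_probs(x):
    """Forward map: unconstrained x_1..x_{a-1} -> probabilities p_1..p_a."""
    r = [math.exp(v) for v in x]           # r_j = exp(x_j)
    denom = 1.0 + sum(r)
    return [rj / denom for rj in r] + [1.0 / denom]

def probs_to_logit(p):
    """Inverse map: x_j = ln(r_j) with r_j = p_j / p_a (requires all p_j > 0)."""
    pa = p[-1]
    return [math.log(pj / pa) for pj in p[:-1]]

# Example: x = (0, ln 2) gives r = (1, 2), so p = (1/4, 1/2, 1/4).
p = logit_to_probs([0.0, math.log(2.0)])
x_back = probs_to_logit(p)
```

The round trip returns the original `x` up to floating-point error, which is the one-to-one property described above.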



$$p_a = \exp\left(-\sum_k r_k\right),$$

where $r_k > 0$ for $k = 1, \ldots, a-1$.



As above, $r_j$ can be guaranteed to be positive by defining $r_j = \exp(x_j)$, and the parameters that are estimated are $x_j = \ln(r_j)$. Also as above, the $\{x_j\}$ are used to estimate the $\{r_j\}$, and then the $\{p_j\}$. Thus, if $x_1, \ldots, x_{a-1}$ are estimated on $(-\infty, \infty)$, the resulting $\{p_j\}$ is guaranteed to be a probability distribution. The relationship between the $\{p_j\}$ and $\{r_j\}$ is a unique one-to-one mapping because given any $\{p_j\}$, $r_j = -p_j \ln(p_a)/(1 - p_a)$, for $j = 1, \ldots, a-1$.
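A corresponding Python sketch of the H-F mapping and its inverse is given here as an illustration. It assumes the H-F form in which $p_j = (r_j/\sum_k r_k)(1 - \exp(-\sum_k r_k))$ for $j < a$ and $p_a = \exp(-\sum_k r_k)$, together with the inverse relation stated above; the function names are hypothetical, and the inverse assumes every probability lies strictly between 0 and 1.

```python
import math

def hf_to_probs(x):
    """Forward map: unconstrained x_1..x_{a-1} -> probabilities p_1..p_a (H-F form)."""
    r = [math.exp(v) for v in x]           # r_j = exp(x_j) > 0
    total = sum(r)
    # -expm1(-total) equals 1 - exp(-total), computed accurately for small totals
    scale = -math.expm1(-total) / total
    return [rj * scale for rj in r] + [math.exp(-total)]

def probs_to_hf(p):
    """Inverse map: x_j = ln(r_j) with r_j = -p_j ln(p_a) / (1 - p_a)."""
    pa = p[-1]
    factor = -math.log(pa) / (1.0 - pa)
    return [math.log(pj * factor) for pj in p[:-1]]

# Example with two categories: x = (0) gives r = 1, so p = (1 - e^-1, e^-1).
p = hf_to_probs([0.0])
x_back = probs_to_hf(hf_to_probs([0.3, -1.2, 2.0]))
```

The first $a-1$ probabilities sum to $1 - \exp(-\sum_k r_k)$, so adding $p_a$ gives exactly one, whatever real values the optimizer proposes for `x`.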



H-F probabilities as exploitation rates 



Stock assessment modelers would recognize the formula

$$p_j = \frac{r_j}{\sum_k r_k}\left(1 - \exp\left(-\sum_k r_k\right)\right)$$



as the exploitation rate formula where $p_j$ takes the role of the exploitation rate ($U_j$) and $r_j$ takes the role of the instantaneous mortality rate $F_j$. The formula $r_j = -p_j \ln(p_a)/(1 - p_a)$ indicates that the instantaneous mortality rate can be solved in closed form, but this is generally not true in stock assessment because the natural mortality rate $M$ is generally known as an instantaneous mortality rate rather than an exploitation rate.
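For comparison, the usual single-fishery form of the exploitation rate can be written out. The notation here ($U$, $F$, $M$, $Z$) is the conventional one and is assumed rather than taken from the text. With total instantaneous mortality $Z = F + M$,

$$U = \frac{F}{F + M}\left(1 - \exp\left(-(F + M)\right)\right).$$

If the proportion surviving, $S = \exp(-(F + M))$, were known along with $U$, the same algebra as in the H-F inverse would give $F = -U \ln(S)/(1 - S)$ directly. When $M$ is instead supplied as an instantaneous rate, $F$ appears both inside and outside the exponential, and the equation generally has to be solved for $F$ numerically.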



Manuscript submitted 20 January 2005 

 to the Scientific Editor's Office. 



Manuscript approved for publication 

 2 August 2005 by the Scientific Editor. 



Fish. Bull. 104:303-305 (2006). 



