MENDELSSOHN: USING MARKOV DECISION MODELS 



good policies, it is only necessary to use a grid size 

 of 26 to 51 points for the problems under consider- 

 ation. However, to analyze the long-run (prob- 

 abilistic) behavior of a given policy, it is necessary 

 to use a grid containing no fewer than 100 points. 

 It should be reemphasized that the reason for 

 considering a coarser grid is that a smaller prob- 

 lem size allows for many problems to be solved at a 

 small cost. This is desirable to obtain insight into 

 the sensitivity of the problem. However, it is pos- 

 sible to solve quite large problems, making use of a 

 variety of methods to accelerate computations (see 

 for example Porteus 1971; Hastings and van 

 Nunen 1977). For example, the 501-point grid for 

 the Branch River used 1.80 s of CPU (central pro- 

 cessing unit) time to perform the optimization. 

 The problems that include smoothing costs

 (see Policy Analysis section) have 2,601 states.

 These took about 5 to 6 min of CPU time to

 solve, at a cost of about $20.

 Our experience is that it is possible to obtain 

 reasonable estimates using coarse grids and that 

 this suffices for initial policy investigation. How- 

 ever, it is worthwhile to reanalyze the final two or 

 three problems of greatest interest on a finer grid. 



POLICY ANALYSIS 



For the Wood River, the optimal policy for Equa- 

 tion (1.2) is given by 



$y_t = \min(0.770,\ x_t)$



and it produces a mean per period harvest of 

 1.14758, and a standard deviation in the harvest of

 0.8963. The median harvest is 0.91, and no harvest 

 occurs roughly 4.3% of the time. A harvest of 25% 

 or less of the mean harvest occurs roughly 15% of 

 the time, while a harvest greater than the mean 

 harvest occurs approximately 38% of the time. 



Similarly, for the Branch River, an optimal pol- 

 icy for Equation (1.2) is given by 



$y_t = \min(0.300,\ x_t)$



and it produces a mean per period harvest of 

 0.6622, and a standard deviation in the harvest of 

 0.6120. The median harvest is roughly 0.500; 

 there is a 3.9% chance of no harvest. A harvest of 

 25% of the mean harvest or less occurs roughly 

 14.5% of the time, and a harvest greater than the 

 mean harvest occurs approximately 61% of the 

 time. 



While these policies are similar in form to 

 policies that are optimal for a deterministic ver- 

 sion of Equation (1.2), they differ greatly in the 

 year-to-year dynamics. There are two ways of 

 finding the optimal deterministic policy. The first 

 way is to assume a general model of the form: 



$x_{t+1} = R_1 y_t \exp(-R_2 y_t)$



The second method is to assume a general model of 

 the form: 



$x_{t+1} = E[\exp(d)]\, R_1 y_t \exp(-R_2 y_t)$



where, as before, $R_1$ and $R_2$ are the parameters of

 the Ricker equation. The second method is prefer-

 able since it uses all the information available. As

 $d$ is a normal random variable with mean zero and

 variance $\sigma^2$, it is easy to show that $\exp(d)$ is a

 lognormal random variable with expectation

 $\exp(\tfrac{1}{2}\sigma^2)$. Solving for the optimum sustained yield

 (OSY) population size for each river gives: 



Both OSY values are lower than the mean per 

 period harvests in the stochastic models, but the 

 variation is too high to allow this amount to be 

 harvested each year. However, the $x_{\mathrm{OSY}}$ level is a

 good estimate of the base stock size, and it is 

 known a priori from Mendelssohn and Sobel (in 

 press) that a base stock policy is optimal. 
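
The deterministic OSY calculation described above can be sketched numerically. The following is a minimal illustration, not the author's procedure: it assumes the OSY level is the escapement $y$ that maximizes the sustained yield $a R_1 y \exp(-R_2 y) - y$, where $a = \exp(\tfrac{1}{2}\sigma^2)$ is the lognormal mean correction ($a = 1$ for the first method). The function names and the parameter values used in the usage note are hypothetical and are not the fitted values for either river.

```python
import math


def ricker(y, r1, r2, sigma=0.0):
    """Expected next-period stock from escapement y.

    sigma > 0 applies the lognormal mean correction exp(sigma^2 / 2);
    sigma = 0 gives the plain deterministic Ricker curve.
    """
    return math.exp(0.5 * sigma ** 2) * r1 * y * math.exp(-r2 * y)


def osy_escapement(r1, r2, sigma=0.0, iterations=200):
    """Escapement that maximizes sustained yield ricker(y) - y.

    Solves the first-order condition a*r1*(1 - r2*y)*exp(-r2*y) = 1
    by bisection on [0, 1/r2], where the left side is decreasing.
    """
    a = math.exp(0.5 * sigma ** 2) * r1
    if a <= 1.0:
        return 0.0  # the stock cannot replace itself, so no sustained yield

    def slope(y):
        # Derivative of the sustained yield with respect to y.
        return a * (1.0 - r2 * y) * math.exp(-r2 * y) - 1.0

    lo, hi = 0.0, 1.0 / r2  # slope(lo) > 0 and slope(hi) = -1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, with the hypothetical values `r1=3.0`, `r2=1.0`, calling `osy_escapement(3.0, 1.0)` gives the first-method level and `osy_escapement(3.0, 1.0, sigma=0.5)` the second-method level; the corresponding sustained yield is `ricker(y, ...) - y`. Under these assumptions the corrected level is the larger of the two, since the correction raises the effective productivity.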



In the deterministic model, once $x_{\mathrm{OSY}}$ is reached,

 both the population size and the harvest size are 

 maintained at steady, equilibrium levels. An op- 

 timal policy for the stochastic model, however, 

 produces large fluctuations in both and may allow 

 no harvesting 1 yr out of 25 in the long run. For 

 many fisheries, these "boom and bust" conditions 

 may not be acceptable. Many people, especially

 those with interest or mortgage payments, as

 many fishermen have, are concerned about the smoothness

 of income received as well as the total amount 

 received. The final decision on the acceptable 

 amount of fluctuation is, of course, up to the deci- 

 sion maker with appropriate input. 
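
The long-run harvest statistics quoted above (mean, standard deviation, median, and the fraction of years with no harvest) can be estimated by simulating a base stock policy. The sketch below is an illustration under stated assumptions rather than the computation used in this paper, which works on a discretized grid: it assumes the stochastic Ricker dynamics $x_{t+1} = \exp(d_t) R_1 y_t \exp(-R_2 y_t)$ with independent normal $d_t$, and a harvest of $x_t - y_t$ each period. The function names are hypothetical, and any parameter values passed in are the user's own, not the fitted river values.

```python
import math
import random
import statistics


def simulate_base_stock(base, r1, r2, sigma, years=20_000, x0=1.0, seed=0):
    """Simulate harvests under the base stock policy y_t = min(base, x_t)."""
    rng = random.Random(seed)
    x = x0
    harvests = []
    for _ in range(years):
        y = min(base, x)        # escapement left after harvesting
        harvests.append(x - y)  # harvest is the stock above the base level
        d = rng.gauss(0.0, sigma)
        x = math.exp(d) * r1 * y * math.exp(-r2 * y)
    return harvests


def summarize(harvests):
    """Long-run summary statistics of a simulated harvest series."""
    n = len(harvests)
    mean = statistics.fmean(harvests)
    variance = statistics.fmean((h - mean) ** 2 for h in harvests)
    return {
        "mean": mean,
        "sd": math.sqrt(variance),
        "median": statistics.median(harvests),
        "frac_no_harvest": sum(h == 0.0 for h in harvests) / n,
        "frac_below_quarter_mean": sum(h <= 0.25 * mean for h in harvests) / n,
        "frac_above_mean": sum(h > mean for h in harvests) / n,
    }
```

Calling `summarize(simulate_base_stock(base, r1, r2, sigma))` with a river's base stock level and fitted parameters gives Monte Carlo estimates of the same quantities reported in this section. Setting `sigma=0.0` recovers the deterministic case, in which the harvest settles to a constant after the first period.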



There are several methods available to try to 

 find a balance between the smoothness of the ran- 

 dom income stream and its total discounted ex- 

 pected value. Walters (1975) and Walters and Hil- 

 born (1978) suggested fixing a given mean harvest 






