Sheridan: Forecasting the fishery for Penaeus duorarum 






I compiled monthly pink shrimp catch, effort, and CPUE for all sizes combined and for the smallest size category (≥68 tails to the pound, or "68-count") in NMFS statistical subareas 1-3 off southwestern Florida.



In addition to monthly values for these 29 variables, two quarterly means or totals for each variable (May-July and August-October) were created. These data resulted in a suite of 232 variables (29 variables × 6 months plus 29 variables × 2 quarters) as potential predictors. All data were received and analyzed in American system units (e.g. shrimp in pounds, rainfall in inches). Actual and predicted landings are presented in metric equivalents. Analyses began with the year 1966 because it was the approximate completion date for the system of major water control structures that influence ENP and Florida Bay (Light and Dineen, 1994).
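The bookkeeping behind the 232 candidate predictors can be sketched as follows. This is only an illustration: the names `var01` ... `var29` are hypothetical placeholders, because the 29 base variables are not listed in this section.

```python
from itertools import product

# Hypothetical placeholder names standing in for the 29 base variables.
base_variables = [f"var{i:02d}" for i in range(1, 30)]

# Monthly values cover May-October; quarterly means or totals cover two periods.
months = ["May", "Jun", "Jul", "Aug", "Sep", "Oct"]
quarters = ["May-Jul", "Aug-Oct"]

monthly = [f"{v}_{m}" for v, m in product(base_variables, months)]
quarterly = [f"{v}_{q}" for v, q in product(base_variables, quarters)]
predictors = monthly + quarterly

# 29 x 6 monthly forms plus 29 x 2 quarterly forms.
print(len(monthly), len(quarterly), len(predictors))
```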



Statistical analyses 



The statistical relationships of annual pink shrimp catches in NMFS statistical subareas 1-3 with environmental and biological variables were examined by multiple linear regression. The tentatively entertained models were of the form



$$C = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k,$$

where $C$ = total November$_N$-October$_{N+1}$ pink shrimp catch;
 $x_k$ = variables measured during May-October;
 $b_k$ = regression coefficients; and
 $k$ = number of variables in the model.
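A model of this form can be fitted by ordinary least squares. The sketch below solves the normal equations in plain Python; it is not the SAS procedure used in the study, the function names are illustrative, and the data are invented purely to show the mechanics.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bk] for C = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept b0
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Evaluate the fitted equation for one set of predictor values."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


# Invented data: each row is (x1, x2) for one year; catch follows 2 + 3*x1 - x2.
rows = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8], [6, 4]]
catch = [3, 7, 6, 11, 9, 16]

coefs = fit_ols(rows, catch)
forecast = predict(coefs, [7, 6])
print(coefs, forecast)
```

In the forecasting setting, `rows` would hold the May-October predictor values for the hindcast years and `predict` would be applied to the values reserved for the forecast year.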



For the first forecast (released in November 1987), environmental and biological data for May-October 1966-86 were used to develop descriptive ("hindcast") models, whereas data for May-October 1987 were reserved for the forecast. Initial regression analyses employed the "R-square" option of the SAS regression procedure (SAS Institute Inc., 1985) to capitalize on the power of Mallows' test statistic $C_p$. This option produces regression equations and multiple $R^2$ values for all possible subsets of p variables, allowing the investigator to choose the "best" linear model(s) based on $R^2$. Mallows' $C_p$ statistic detects a "best" set of explanatory variables that minimizes both error due to too few variables and variance of predictions due to too many variables (Daniel et al., 1971). Regression equations with $C_p$ > p, where p = number of variables in the equation, have increased bias whereas equations with $C_p$ < p have increased error. Regression equations including more than one form of a variable, such as those with a quarterly variable plus one or more of its component months, were not allowed. For models with $C_p$ ≈ p, stepwise regression (F-to-enter = 0.25 and F-to-stay = 0.25) was used to determine all partial and full statistics. The Durbin-Watson statistic was used to assure that autocorrelation in the selected models was minimal (i.e. that errors in regression were independent; Draper and Smith, 1981). The relationship between residuals and fitted values was examined to assure constant variance. Residuals were checked against Cook's statistic to assure that outliers did not unduly influence model coefficients (Draper and Smith, 1981). Models passing all these tests were employed for the annual forecast. Model performance was assessed by examining the direction of the forecast (whether landings increased or decreased over the prior year) and the accuracy of the forecast (expressed as percent above or below actual landings). Forecasts that matched the direction of actual landings and fell within ±20% of actual landings were termed successful.
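The subset screening and the autocorrelation check described above can be illustrated with a small sketch. It assumes the common textbook definition $C_p = SSE_p / s^2 - (n - 2p)$, where $s^2$ is the residual mean square of the model containing all candidates and p counts the fitted parameters including the intercept; that convention is an assumption and may differ from the one in the SAS output used in the study. The Durbin-Watson statistic is computed as the sum of squared successive residual differences divided by the residual sum of squares. All data and names are invented.

```python
from itertools import combinations


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small system a x = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def residuals(columns, y):
    """Least-squares residuals of y on an intercept plus the given columns."""
    design = [[1.0] + [col[t] for col in columns] for t in range(len(y))]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yt for d, yt in zip(design, y)) for i in range(p)]
    coefs = solve(xtx, xty)
    return [yt - sum(b * d for b, d in zip(coefs, row)) for row, yt in zip(design, y)]


def sse(resid):
    return sum(e * e for e in resid)


def mallows_cp_table(candidates, y):
    """Return (subset, p, Cp) for every non-empty subset of the candidates."""
    n, names = len(y), list(candidates)
    full_sse = sse(residuals([candidates[k] for k in names], y))
    s2 = full_sse / (n - len(names) - 1)  # residual mean square, full model
    table = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            p = size + 1  # parameters, counting the intercept (assumed convention)
            subset_sse = sse(residuals([candidates[k] for k in subset], y))
            table.append((subset, p, subset_sse / s2 - (n - 2 * p)))
    return table


def durbin_watson(resid):
    """Values near 2 suggest little first-order autocorrelation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sse(resid)


# Invented example: ten "years", three candidate predictors.
candidates = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "x2": [2, 1, 4, 3, 6, 5, 8, 7, 10, 9],
    "x3": [5, 3, 8, 1, 9, 2, 7, 4, 6, 10],
}
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.5, -0.3, 0.1, -0.1, -0.2]
y = [1 + 2 * a + 0.5 * c + e
     for a, c, e in zip(candidates["x1"], candidates["x3"], noise)]

table = mallows_cp_table(candidates, y)
for subset, p, cp in table:
    print(subset, p, round(cp, 2))

dw = durbin_watson(residuals([candidates["x1"], candidates["x3"]], y))
print("Durbin-Watson for (x1, x3):", round(dw, 2))
```

Under this definition the model containing every candidate has $C_p$ = p by construction, so the comparison of $C_p$ with p is informative only for the smaller subsets.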



After 1987, data sets were updated regularly and the regression procedures were repeated each year for the annual forecast, beginning with development of new descriptive models.
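The success criterion applied to each annual forecast (same direction as actual landings, and within ±20% of actual landings) can be written as a short check. This sketch assumes that direction is judged against the prior year's actual landings; the function name and the landing figures are hypothetical.

```python
def evaluate_forecast(predicted, actual, prior_actual, tolerance_pct=20.0):
    """Score one annual forecast against the landings actually reported."""
    # Accuracy: percent above (+) or below (-) actual landings.
    percent_error = 100.0 * (predicted - actual) / actual
    # Direction: did forecast and actual landings both rise (or both fall)
    # relative to the prior year? (Assumed reference point.)
    same_direction = (predicted > prior_actual) == (actual > prior_actual)
    successful = same_direction and abs(percent_error) <= tolerance_pct
    return {
        "percent_error": percent_error,
        "same_direction": same_direction,
        "successful": successful,
    }


# Hypothetical landings in arbitrary units.
good = evaluate_forecast(predicted=4.4, actual=4.0, prior_actual=3.5)
poor = evaluate_forecast(predicted=3.0, actual=4.0, prior_actual=3.5)
print(good)
print(poor)
```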



Results 



Of the 232 possible predictors, only 30 monthly variables and two quarterly variables have ever appeared in the 26 forecast models generated since 1987 (Table 1). Only a few of these variables have occurred on a regular basis, including, in decreasing frequency: 1) days fished during July; 2) ENP L-67 discharge during September and June; 3) ENP groundwater level in wells P38 during August and P37 during September; 4) CPUE of 68-count pink shrimp during May; and 5) Key West wind speed in September. Relationships of these variables to subsequent fishing year landings are illustrated in Figures 4-7. Five- and six-variable models incorporating four or more of these variables provided the most accurate forecasts. Single forecast models were used in 1987 and 1988, whereas all other forecasts employed 2-4 models (Table 2). In multiple-model years, models usually differed by a single variable, and tests for selecting those models (Mallows' $C_p$ and $R^2$) gave little reason for favoring one model over another. One exception occurred in 1989 when two sets of models with different independent variables were developed (models 3 and 4 versus models 5 and 6 in Table 2). Forecasts released to the industry stated whether landings were expected to improve or decline and gave high and low landing estimates from the models. A set of revised models for 1993 (forecasts were not



