Stockhausen and Fogarty: Removing observational noise from time series data using ARIMA models 






moving observation noise from abundance estimates. ARIMA models are frequently used in economic forecasting (Enders, 2004) and are becoming more common in fisheries research. Recent applications of ARIMA models to other fisheries problems include forecasting monthly landings in the Mediterranean (Lloret et al., 2000), testing theories of population dynamics (Becerra-Munoz et al., 1999), and modeling nutrient dynamics in an upwelling system (Nogueira et al., 1998).



In the context of reducing the influence of observational noise in time series data, Cleveland and Tiao (1976) first developed a noise-reduction and smoothing algorithm for processes that could be described by an ARIMA time series model. Their approach requires that the ARIMA model for the unobserved, underlying process be known. This known model, in turn, uniquely determines the ARIMA model for the observed time series contaminated by observation noise and allows one to estimate the variance of the observation noise. Unfortunately, although the ARIMA model for an unobserved, underlying process may be known in some instances (from theory, perhaps), in many cases the model for the unobserved process will be unknown.



Box et al. (1978) extended Cleveland and Tiao's (1976) ideas and developed a noise-reduction algorithm based on the ARIMA model for the observed time series. However, the ARIMA model for the observed time series merely constrains, but does not determine, the model for the unobserved, underlying process; it provides only an upper bound for the observation error variance. Consequently, this approach generally requires an external estimate of the observation error variance to determine the appropriate level of noise reduction.
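
As a worked illustration of such an upper bound, under assumptions that are not taken from the source, suppose the observed series $z_t$ follows an IMA(1,1) model with moving-average parameter $\theta$ ($|\theta| < 1$) and innovation variance $\sigma_c^2$, and that it is the sum of an underlying process $y_t$ and independent white observation noise $e_t$ with variance $\sigma_e^2$. All symbols are introduced for this sketch only. Comparing the spectra of the differenced series (with the common factor $1/2\pi$ dropped) suggests why the observed model caps, but does not fix, $\sigma_e^2$:

```latex
\begin{align*}
(1 - B)\,z_t &= (1 - \theta B)\,c_t, \qquad z_t = y_t + e_t, \\[4pt]
\sigma_c^2\,\bigl|1 - \theta e^{-i\omega}\bigr|^2
  &= g_y(\omega) + \sigma_e^2\,\bigl|1 - e^{-i\omega}\bigr|^2, \\[4pt]
g_y(\omega) \ge 0 \ \text{for all } \omega \in (0, \pi]
  \quad &\Longrightarrow \quad
  \sigma_e^2 \;\le\; \min_{\omega}\;
  \sigma_c^2\,\frac{1 + \theta^2 - 2\theta\cos\omega}{2 - 2\cos\omega}
  \;=\; \frac{(1 + \theta)^2}{4}\,\sigma_c^2 .
\end{align*}
```

Here $g_y(\omega)$ denotes the spectrum of $(1 - B)\,y_t$, and the minimum is attained at $\omega = \pi$ because the ratio increases with $\cos\omega$. Under these assumptions, any value of $\sigma_e^2$ between zero and the bound is compatible with the same observed model, which is consistent with the need for an external estimate.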



Pennington (1985) first applied these ARIMA-based time series modeling techniques to smoothing abundance indices derived from trawl survey data. He assumed that an observed abundance time series reflected a combination of the underlying population abundance and independent, uncorrelated, and multiplicative observation noise (the latter arising perhaps from environmentally driven changes in catchability). He further assumed that both the (log-transformed) observed time series and the unobserved population process could be represented by ARIMA models. Pennington (1985) then developed an alternative algorithm to that of Box et al. (1978); his derivation simplifies considerably when the underlying population process can be modeled as a random walk. In this simple case, the resulting noise reduction filter for the endpoint of the time series is an exponentially weighted average of the observed time series (Pennington, 1985). More importantly, the observation error variance can be easily estimated from the ARIMA model parameters, so an external estimate is unnecessary. Thus, where a random walk model for the underlying process is valid, the appropriate level of smoothing is objectively determined.
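
The random walk case can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Pennington's (1985) implementation: it assumes the log-transformed observed series has already been fitted with an IMA(1,1) model, (1 - B) z_t = (1 - theta B) c_t, with 0 <= theta < 1 and innovation variance sigma2_c. Matching the lag-0 and lag-1 autocovariances of the differenced series then gives an observation error variance of theta * sigma2_c and a random walk step variance of (1 - theta)^2 * sigma2_c. The function names, argument names, and the renormalization of the truncated weights are hypothetical choices made for this sketch.

```python
def random_walk_noise_components(theta, sigma2_c):
    """Split an IMA(1,1) fit into process and observation-noise variances.

    Assumes (hypothetically) that the underlying process is a random walk
    observed with independent white noise, so that the differenced series
    is MA(1) with parameter theta and innovation variance sigma2_c.
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1) for a random walk plus noise")
    sigma2_obs = theta * sigma2_c                   # observation error variance
    sigma2_proc = (1.0 - theta) ** 2 * sigma2_c     # random walk step variance
    return sigma2_proc, sigma2_obs


def endpoint_estimate(log_obs, theta):
    """Exponentially weighted average of the series for its final time point.

    Weight (1 - theta) * theta**j is applied to the observation j steps
    before the end. Because the series is finite, the weights are
    renormalized to sum to one (a choice made for this sketch).
    """
    if len(log_obs) == 0:
        raise ValueError("log_obs must contain at least one observation")
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    weights = [(1.0 - theta) * theta ** j for j in range(len(log_obs))]
    total = sum(weights)
    # Pair the largest weight with the most recent observation.
    weighted = sum(w * z for w, z in zip(weights, reversed(log_obs)))
    return weighted / total


if __name__ == "__main__":
    proc_var, obs_var = random_walk_noise_components(theta=0.5, sigma2_c=0.2)
    print(proc_var, obs_var)
    print(endpoint_estimate([0.0, 2.0], theta=0.5))
```

The estimate is on the log scale, in keeping with the multiplicative noise assumption; a theta near zero leaves the last observation almost unchanged, whereas a theta near one averages over much of the series.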



As a demonstration, Pennington (1985) applied his noise reduction algorithm to groundfish trawl survey data for haddock (Melanogrammus aeglefinus) from the northeastern Atlantic coast of the United States. He found that the variances of the smoothed indices were "considerably lower" than those of the originals. However, this demonstration used an ARIMA model derived from a much longer time series that had been generated from a stock assessment based on commercial catch data. Pennington (1985) assumed that this model represented the underlying population and therefore did not develop models based on the observed time series. Although this assumption was perfectly reasonable, given that such alternative data (the stock assessment) were available, it cannot be applied to situations in which only survey data are available to fishery analysts.



The ARIMA model Pennington (1985) derived from stock assessment results was a random walk model; therefore the appropriate level of noise reduction for the corresponding survey data could be objectively determined from the model parameters. Pennington's (1985) method was later used to apply random walk models to survey data (Fogarty et al.¹; Pennington, 1986; Anonymous, 1988, 1993). Pennington (1986) found that random walk models were appropriate for the survey time series considered in his study. However, random walk models were assumed a priori in the remaining three references (Fogarty et al.¹; Anonymous, 1988, 1993) to generate smoothed abundance trajectories; because fewer than 25 observations were considered for each time series in these references, reliable identification of the model structure for each time series was considered problematic and random walk models were used as "null" models.



When it is an appropriate description of the underlying process, a random walk model yields an objective determination of the degree of noise reduction appropriate to an observed time series. However, an a priori adoption of this model should be viewed with some skepticism. Additionally, if a random walk model is not an appropriate description of the underlying process, the resulting smoothed time series may seem reasonable, but the result no longer has support as the unobserved, underlying process. In this circumstance, we regard the effect of the ARIMA algorithm as merely smoothing, and not necessarily as noise reducing.
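
One rough way to examine whether a random walk plus white observation noise is plausible for a given series is to look at the sample autocorrelations of its first differences. This is a hedged sketch and not a procedure described in the source. Under that model the differenced series is MA(1), so its lag-1 autocorrelation should lie between -0.5 and 0 and its higher-lag autocorrelations should be near zero. The function names below are hypothetical, and with short series the sample values are noisy, so the check is only indicative.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom


def difference_autocorrelations(log_obs, max_lag=3):
    """Autocorrelations (lags 1..max_lag) of the first-differenced series."""
    diffs = [b - a for a, b in zip(log_obs, log_obs[1:])]
    return [autocorrelation(diffs, k) for k in range(1, max_lag + 1)]


def implied_theta(rho1):
    """Invert rho1 = -theta / (1 + theta**2) for the root with 0 <= theta < 1.

    Returns None when rho1 lies outside (-0.5, 0], that is, when the lag-1
    autocorrelation is not consistent with a random walk plus white noise.
    """
    if rho1 > 0.0 or rho1 <= -0.5:
        return None
    if rho1 == 0.0:
        return 0.0
    return (-1.0 + math.sqrt(1.0 - 4.0 * rho1 ** 2)) / (2.0 * rho1)


if __name__ == "__main__":
    example = [0.0, 1.0, 0.0, 1.0, 0.0]   # made-up values, for illustration only
    print(difference_autocorrelations(example, max_lag=2))
    print(implied_theta(-0.4))
```

A lag-1 value outside the admissible range, or clearly nonzero autocorrelations at higher lags, would suggest that smoothing with a random walk model should be interpreted with caution.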



As such, we feel that the utility of ARIMA-based approaches to noise reduction for abundance indices derived from survey data has not been adequately explored to date. In addition, substantially longer time series (e.g., 40 observations) are now available with which to test this concept. In our study, we test the utility of the ARIMA time series noise reduction approach propounded by Pennington (1985), using time series of abundance indices from fishery-independent trawl survey data for nine finfish species (Table 1) during two seasons on Georges Bank. We first review the original methods developed by Cleveland and Tiao



¹ Fogarty, M. J., J. S. Idoine, F. P. Almeida, and M. Pennington. 1986. Modeling trends in abundance based on research vessel surveys. ICES CM (council meeting) 1986/G, p. 92. ICES, Copenhagen, Denmark.



