Stockhausen and Fogarty: Removing observational noise from time series data using ARIMA models 






average operator have zeros outside the unit circle. Additionally, the white noise variance, $\sigma_c^2$, must be finite. These constraints ensure that a model is invertible and stationary (or, if not stationary, that at least it is of a suitable form).
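As an illustration of the root conditions just described, the following minimal sketch (not part of the original analysis) tests whether an operator polynomial in B has all of its zeros outside the unit circle. The function name `zeros_outside_unit_circle`, the tolerance `tol`, and the convention of listing coefficients in ascending powers of B are assumptions made for this example.

```python
import numpy as np

def zeros_outside_unit_circle(coeffs, tol=1e-9):
    """Return True if every zero of c0 + c1*B + c2*B**2 + ... has modulus > 1.

    `coeffs` lists the operator coefficients in ascending powers of B
    (a convention assumed for this sketch).
    """
    # np.roots expects the highest power first, so reverse the coefficients.
    zeros = np.roots(np.asarray(coeffs, dtype=float)[::-1])
    return bool(np.all(np.abs(zeros) > 1.0 + tol))

# Hypothetical first-order operators used only as examples.
ar_stationary = zeros_outside_unit_circle([1.0, -0.5])  # 1 - 0.5B, zero at B = 2
ar_explosive = zeros_outside_unit_circle([1.0, -1.5])   # 1 - 1.5B, zero at B = 2/3
differencing = zeros_outside_unit_circle([1.0, -1.0])   # 1 - B, zero at B = 1
```

Under these assumptions the first operator passes the test and the other two fail it; the differencing operator 1 - B fails because its zero lies on, rather than outside, the unit circle.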



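One final property of ARIMA models is of interest here. The power spectrum for an ARMA(p,q) process represented by equation 1 is given by

$$p(f) = 2\sigma_c^2\,\frac{\left|\alpha\!\left(e^{-i2\pi f}\right)\right|^2}{\left|\varphi\!\left(e^{-i2\pi f}\right)\right|^2} = 2\sigma_c^2\,\frac{\alpha(B)\,\alpha(F)}{\varphi(B)\,\varphi(F)}, \tag{3}$$

where $B = e^{-i2\pi f}$, $0 \leq f \leq 1/2$, and the forward shift operator, $F = B^{-1}$, has the opposite effect to $B$ (i.e., $F^m z_t = z_{t+m}$).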
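The spectrum in Equation 3 can be evaluated numerically. The sketch below is an illustration under stated assumptions, not the authors' code: the function name `arma_power_spectrum`, the ascending-power coefficient convention, and the AR(1) example values are all hypothetical.

```python
import numpy as np

def arma_power_spectrum(phi, alpha, sigma2, f):
    """Evaluate p(f) = 2*sigma2*|alpha(B)|**2 / |phi(B)|**2 at B = exp(-i*2*pi*f).

    `phi` and `alpha` hold operator coefficients in ascending powers of B,
    e.g. [1.0, -0.5] for 1 - 0.5B (a convention assumed for this sketch).
    """
    B = np.exp(-2j * np.pi * np.asarray(f, dtype=float))
    # np.polyval expects the highest power first, so reverse the coefficients.
    alpha_B = np.polyval(np.asarray(alpha, dtype=float)[::-1], B)
    phi_B = np.polyval(np.asarray(phi, dtype=float)[::-1], B)
    # On the unit circle F = 1/B is the complex conjugate of B, so for real
    # coefficients alpha(B)*alpha(F) = |alpha(B)|**2, and likewise for phi.
    return 2.0 * sigma2 * np.abs(alpha_B) ** 2 / np.abs(phi_B) ** 2

# Hypothetical AR(1) example: (1 - 0.5B) z_t = c_t with unit white noise variance.
f = np.linspace(0.0, 0.5, 2001)
p = arma_power_spectrum(phi=[1.0, -0.5], alpha=[1.0], sigma2=1.0, f=f)

# Trapezoidal integral of p(f) over 0 <= f <= 1/2.
area = float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(f)))
```

For this AR(1) example the integral of p(f) over 0 ≤ f ≤ 1/2 is close to 4/3, which is the process variance 1/(1 - 0.25); this agrees with the factor of 2 in the one-sided spectrum.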



ARIMA models of time series with observation noise

Suppose, then, that $y_1, y_2, \ldots, y_T$ represent a time series of observations at times $t = 1, 2, \ldots, T$ such that

$$y_t = z_t + e_t, \tag{4}$$

where the $z_t$'s represent the unobserved, underlying process and the $e_t$'s are IID normal variables with variance $\sigma_e^2$ (i.e., $e_t \sim N(0, \sigma_e^2)$) that are also independent of the $z_t$'s (i.e., $\langle e_t z_{t'} \rangle = 0$ for all $t, t'$). The goal is to estimate the unobserved time series $\{z_t\}$ by using the observed time series $\{y_t\}$. For the analysis of fishery-independent time series, it seems reasonable to assume that only the ARIMA model for the observed time series $\{y_t\}$ is known (it can be estimated using standard techniques). In particular, this assumption means that the model for $\{z_t\}$ is unknown within constraints implied by the observation equation. However, to develop the approach used here it is helpful to start as though the model for the unobserved process $\{z_t\}$ were known.

Thus, we assume that the time series $\{z_t\}$ can be represented by an ARIMA(p,d,q) process:

$$\varphi(B)\,z_t = \alpha(B)\,c_t, \tag{5}$$

where the $c_t$'s are IID and $c_t \sim N(0, \sigma_c^2)$. Substituting $y_t - e_t$ for $z_t$ and rearranging, one obtains

$$\varphi(B)\,y_t = \alpha(B)\,c_t + \varphi(B)\,e_t, \tag{6}$$

which can be expressed as an ARIMA model for $\{y_t\}$ of the form

$$\varphi(B)\,y_t = \eta(B)\,d_t, \tag{7}$$

where the $d_t$'s are IID, $d_t \sim N(0, \sigma_d^2)$, and the MA operator is $\eta(B)$. Thus, the generalized AR operator for the observed $\{y_t\}$ is identical to that for the unobserved $\{z_t\}$. Furthermore, because $\alpha(B)$ has order $q$ and $\varphi(B)$ has order $p+d$, $\eta(B)$ must be of order $\max(p+d, q)$. This requirement for $\eta(B)$ constrains the order of potential ARIMA models that could describe the observed process: if $p+d \leq q$ then the observed process is also a (p,d,q) process, otherwise it is a (p,d,p+d) process.

In the more realistic situation where $\{y_t\}$ is observed, we can determine the generalized AR operator $\varphi(B)$ and MA operator $\eta(B)$ for the observed process. Its ARIMA model will be order (P,D,Q), say, where the minimum possible value for Q is P+D. The model for the corresponding unobserved, underlying process $\{z_t\}$ will have order $(p = P,\ d = D,\ q \leq \max(P+D, Q))$ and its associated generalized AR operator will also be $\varphi(B)$. Furthermore, recognizing that $c_t$ and $e_t$ are independently distributed, it can be shown that the moving average operator for the unobserved process, $\alpha(B)$, is additionally constrained by the ARIMA model for the observed process such that the following relationship must be satisfied:

$$\sigma_c^2\,\alpha(B)\,\alpha(F) = \sigma_d^2\,\eta(B)\,\eta(F) - \sigma_e^2\,\varphi(B)\,\varphi(F). \tag{8}$$

In this equation, $\sigma_d^2$, $\eta(B)$, and $\varphi(B)$ are known from the ARIMA model for the observed process, whereas $\sigma_c^2$, $\sigma_e^2$, and $\alpha(B)$ are unknown.

In general, many combinations of $\sigma_c^2$, $\sigma_e^2$, and $\alpha(B)$ will satisfy the equality. Defining an "acceptable model" for the unobserved process as one for which, given the model for the observed process, $\alpha(B)$ satisfies the previous equation and has its zeros on or outside the unit circle, Box et al. (1978) show that 1) for every given model of an observed process, at least one acceptable model for the unobserved process exists; 2) for a given model of an observed process, the possible values of $\sigma_e^2$ are bounded; and 3) for a given model, every $\sigma_e^2$ between 0 and the upper bound ($K$, say) determines a unique acceptable model. The upper bound on the observation error variance, $K$, is determined from the constraint that, for a model of the unobserved process to be acceptable, $\sigma_c^2\,\alpha(B)\,\alpha(F) \geq 0$ everywhere on the unit circle (i.e., the power spectrum of the corresponding MA process is non-negative definite). Then, from equation 8, $K$ is given by



$$K = \min_{|B|=1} \left\{ \frac{\sigma_d^2\,\eta(B)\,\eta(F)}{\varphi(B)\,\varphi(F)} \right\} \tag{9}$$



and is completely determined by the ARIMA model for the observed process. When $\sigma_e^2 = K$, the variance of the added white noise is maximal, as will be the smoothing of the observed time series.
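The bound in Equation 9 can be approximated by scanning the unit circle on a frequency grid. The sketch below is a minimal illustration, not the authors' implementation: the function names, the grid size `n_freq`, the ascending-power coefficient convention, and the ARMA(1,1) coefficients chosen for the observed process are all assumptions made for the example.

```python
import numpy as np

def operator_gain(coeffs, f):
    """Return c(B)c(F) = |c(B)|**2 at B = exp(-i*2*pi*f).

    `coeffs` are in ascending powers of B (assumed convention).
    """
    B = np.exp(-2j * np.pi * np.asarray(f, dtype=float))
    # np.polyval expects the highest power first, so reverse the coefficients.
    return np.abs(np.polyval(np.asarray(coeffs, dtype=float)[::-1], B)) ** 2

def noise_variance_upper_bound(phi, eta, sigma_d2, n_freq=2001):
    """Grid approximation to K = min over |B|=1 of sigma_d2*eta*eta / (phi*phi)."""
    f = np.linspace(0.0, 0.5, n_freq)
    # Where phi(B) has a unit root the ratio is infinite, which cannot be the minimum.
    with np.errstate(divide="ignore"):
        ratio = sigma_d2 * operator_gain(eta, f) / operator_gain(phi, f)
    i = int(np.argmin(ratio))
    return float(ratio[i]), float(f[i])

def signal_ma_spectrum(phi, eta, sigma_d2, sigma_e2, f):
    """Right-hand side of Equation 8 evaluated on the unit circle."""
    return sigma_d2 * operator_gain(eta, f) - sigma_e2 * operator_gain(phi, f)

# Hypothetical observed model: (1 - 0.8B) y_t = (1 - 0.4B) d_t with sigma_d2 = 1.
phi = [1.0, -0.8]
eta = [1.0, -0.4]
sigma_d2 = 1.0

K, f_at_min = noise_variance_upper_bound(phi, eta, sigma_d2)

f = np.linspace(0.0, 0.5, 2001)
rhs_at_K = signal_ma_spectrum(phi, eta, sigma_d2, K, f)
rhs_above_K = signal_ma_spectrum(phi, eta, sigma_d2, 1.1 * K, f)
```

In this example the minimum falls at f = 1/2, giving K = (1.4/1.8)^2 = 49/81, or about 0.605. The right-hand side of Equation 8 stays non-negative (up to rounding) when the observation error variance equals K and turns negative near f = 1/2 once that variance exceeds K. A grid search only approximates the minimum when it falls between grid points.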



It is instructive to interpret Equations 8 and 9 in terms of constraints on the power spectra of $\{z_t\}$, $\{y_t\}$, and $\{e_t\}$, although this interpretation is strictly correct only when $\{z_t\}$ and $\{y_t\}$ are stationary. Let $p_z(f)$, $p_y(f)$, and $p_e(f)$ denote the power spectra for $\{z_t\}$, $\{y_t\}$, and $\{e_t\}$, respectively. Recalling the definition of the power spectrum (Eq. 3), Equation 8 on the unit circle can be easily recast (multiply both sides by $2/[\varphi(B)\,\varphi(F)]$) as



