Time Series Analysis - Spectral Analysis


In this post, we continue with Shumway and Stoffer (2017), Time Series Analysis and Its Applications, focusing on spectral analysis: the frequency domain approach to time series analysis. We argue that the concept of regularity of a series can best be expressed in terms of periodic variations of the underlying phenomenon that produced the series.

Cyclical Behavior and Periodicity

In order to define the rate at which a series oscillates, we first define a cycle as one complete period of a sine or cosine function defined over a unit time interval.

Consider the periodic process:

$$\begin{aligned} x_{t} &= A\cos(2\pi \omega t + \phi) \end{aligned}$$

Where $\omega$ is a frequency index, defined in cycles per unit time. Using a trigonometric identity, we can rewrite $x_{t}$ as:

$$\begin{aligned} x_{t} &= U_{1}\cos(2\pi \omega t) + U_{2}\sin(2\pi \omega t) \end{aligned}$$

$$\begin{aligned} U_{1} &= A\cos(\phi)\\ U_{2} &= -A\sin(\phi) \end{aligned}$$

$U_{1}, U_{2}$ are often taken to be normally distributed random variables. In this case:

$$\begin{aligned} A &= \sqrt{U_{1}^{2} + U_{2}^{2}}\\ \phi &= \tan^{-1}\Big(\frac{-U_{2}}{U_{1}}\Big) \end{aligned}$$

$A, \phi$ are independent random variables iff:

$$\begin{aligned} A^{2} &\sim \chi^{2}(2)\\ \phi &\sim U(-\pi, \pi) \end{aligned}$$

Assuming $U_{1}, U_{2}$ are uncorrelated random variables with mean 0 and variance $\sigma^{2}$, then:

$$\begin{aligned} c_{t} &= \cos(2\pi\omega t)\\ s_{t} &= \sin(2\pi\omega t) \end{aligned}$$

$$\begin{aligned} \gamma_{x}(h) &= \text{cov}(x_{t+h}, x_{t})\\ &= \text{cov}(U_{1}c_{t+h} + U_{2}s_{t+h},\ U_{1}c_{t} + U_{2}s_{t})\\ &= E[(U_{1}c_{t+h} + U_{2}s_{t+h})(U_{1}c_{t} + U_{2}s_{t})]\\ &= E[U_{1}^{2}c_{t+h}c_{t}] + E[U_{1}U_{2}c_{t+h}s_{t}] + E[U_{1}U_{2}s_{t+h}c_{t}] + E[U_{2}^{2}s_{t+h}s_{t}]\\ &= \sigma^{2}c_{t+h}c_{t} + 0 + 0 + \sigma^{2}s_{t+h}s_{t} \end{aligned}$$

Using the identity:

$$\begin{aligned} \cos(\alpha \pm \beta) &= \cos(\alpha)\cos(\beta) \mp \sin(\alpha)\sin(\beta) \end{aligned}$$

$$\begin{aligned} \gamma_{x}(h) &= \sigma^{2}c_{t+h}c_{t} + \sigma^{2}s_{t+h}s_{t}\\ &= \sigma^{2}\cos(2\pi\omega h) \end{aligned}$$

And:

$$\begin{aligned} \text{var}(x_{t}) &= \gamma_{x}(0)\\ &= \sigma^{2}\cos(2\pi\omega \times 0)\\ &= \sigma^{2} \end{aligned}$$

To estimate $\sigma^{2}$, we take the sample variance of the two observations $\hat{U}_{1}, \hat{U}_{2}$:

$$\begin{aligned} S^{2} &= \frac{\hat{U}_{1}^{2} + \hat{U}_{2}^{2}}{2 - 1}\\ &= \hat{U}_{1}^{2} + \hat{U}_{2}^{2} \end{aligned}$$

In general, for data that occur at discrete time points, we need at least 2 points to determine a cycle, so the highest observable frequency is 0.5 cycles per point. For example, if $\omega = 0.5$, a cycle takes two time units, and that is the highest frequency we can detect when sampling at each unit time point. This frequency is known as the folding frequency and defines the highest frequency that can be seen in discrete sampling.

Higher frequencies sampled this way will appear at lower frequencies, called aliases.
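For instance (a small sketch of our own), a cosine at 0.8 cycles per point observed only at integer time points is indistinguishable from its alias at 1 - 0.8 = 0.2 cycles per point:

t <- 1:100
high  <- cos(2 * pi * 0.8 * t)  # 0.8 cycles per point
alias <- cos(2 * pi * 0.2 * t)  # its alias below the folding frequency
all.equal(high, alias)          # TRUE: identical at the sampled points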

Now consider a mixture of periodic series with multiple frequencies and amplitudes:

$$\begin{aligned} x_{t} &= \sum_{k = 1}^{q}\big[U_{k1}\cos(2\pi\omega_{k}t) + U_{k2}\sin(2\pi\omega_{k}t)\big] \end{aligned}$$

Where $U_{k1}, U_{k2}$ are uncorrelated zero-mean random variables with variance $\sigma_{k}^{2}$, and the $\omega_{k}$ are distinct frequencies.

We can show that the autocovariance function is:

$$\begin{aligned} \gamma_{x}(h) &= \sum_{k=1}^{q}\sigma_{k}^{2}\cos(2\pi\omega_{k}h)\\ \gamma_{x}(0) &= \text{var}(x_{t})\\ &= \sum_{k=1}^{q}\sigma_{k}^{2} \end{aligned}$$

And the sum of the sample variances:

$$\begin{aligned} \hat{\gamma}_{x}(0) &= \sum_{k=1}^{q}\frac{\hat{U}_{k1}^{2} + \hat{U}_{k2}^{2}}{2 - 1} \end{aligned}$$

Example: A Periodic Series

We will generate the following series in R:

$$\begin{aligned} x_{t1} &= 2\cos\Big(2\pi t\times\frac{6}{100}\Big) + 3\sin\Big(2\pi t\times\frac{6}{100}\Big)\\ x_{t2} &= 4\cos\Big(2\pi t\times\frac{10}{100}\Big) + 5\sin\Big(2\pi t\times\frac{10}{100}\Big)\\ x_{t3} &= 6\cos\Big(2\pi t\times\frac{40}{100}\Big) + 7\sin\Big(2\pi t\times\frac{40}{100}\Big) \end{aligned}$$
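A minimal sketch of generating them, with t = 1, ..., 100 (the tibble dat is our assumed container, reused by the periodogram code further below):

library(tibble)

t <- 1:100
x1 <- 2 * cos(2 * pi * t * 6 / 100) + 3 * sin(2 * pi * t * 6 / 100)
x2 <- 4 * cos(2 * pi * t * 10 / 100) + 5 * sin(2 * pi * t * 10 / 100)
x3 <- 6 * cos(2 * pi * t * 40 / 100) + 7 * sin(2 * pi * t * 40 / 100)
dat <- tibble(x = x1 + x2 + x3)  # the mixture analyzed below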

Example: Estimation and the Periodogram

Consider a time series sample $x_{1}, \cdots, x_{n}$, where n is odd:

$$\begin{aligned} x_{t} &= a_{0} + \sum_{j=1}^{\frac{n-1}{2}}\Big[a_{j}\cos\Big(\frac{2\pi tj}{n}\Big) + b_{j}\sin\Big(\frac{2\pi tj}{n}\Big)\Big], \qquad t = 1, \cdots, n \end{aligned}$$

Or when n is even:

$$\begin{aligned} x_{t} &= a_{0} + a_{\frac{n}{2}}\cos(\pi t) + \sum_{j=1}^{\frac{n}{2} - 1}\Big[a_{j}\cos\Big(\frac{2\pi tj}{n}\Big) + b_{j}\sin\Big(\frac{2\pi tj}{n}\Big)\Big] \end{aligned}$$

Using OLS estimation, the coefficients are:

$$\begin{aligned} a_{0} &= \bar{x}\\ a_{j} &= \frac{2}{n}\sum_{t=1}^{n}x_{t}\cos\Big(\frac{2\pi t j}{n}\Big)\\ b_{j} &= \frac{2}{n}\sum_{t=1}^{n}x_{t}\sin\Big(\frac{2\pi t j}{n}\Big) \end{aligned}$$

We then define the scaled periodogram as:

$$\begin{aligned} P\Big(\frac{j}{n}\Big) &= a_{j}^{2} + b_{j}^{2} \end{aligned}$$

This is simply the sample variance at each frequency component and is an estimate of $\sigma_{j}^{2}$ corresponding to the frequency $\omega_{j} = \frac{j}{n}$. These frequencies are known as the Fourier or fundamental frequencies.

It is not necessary to run a large regression to obtain $a_{j}, b_{j}$; they can be computed quickly if n is a highly composite integer (if it is not, we can pad the series with zeros).

The Discrete Fourier Transform (DFT) is a complex-valued weighted average of the data given by:

$$\begin{aligned} d\Big(\frac{j}{n}\Big) &= \frac{1}{\sqrt{n}}\sum_{t=1}^{n}x_{t}e^{-\frac{2\pi i t j}{n}}\\ &= \frac{1}{\sqrt{n}}\Bigg(\sum_{t=1}^{n}x_{t}\cos\Big(\frac{2\pi t j}{n}\Big) - i\sum_{t=1}^{n}x_{t}\sin\Big(\frac{2\pi t j}{n}\Big)\Bigg), \qquad j = 0, 1, \cdots, n - 1 \end{aligned}$$

Where the $\frac{j}{n}$ are the Fourier frequencies.

Because of the large number of redundancies in the calculation, the above can be computed much more quickly using the fast Fourier transform (FFT).

For example, in R:

dft <- fft(x) / sqrt(length(x))
idft <- fft(dft, inverse = TRUE) / sqrt(length(dft))    

Note that the periodogram is:

$$\begin{aligned} \Big|d\Big(\frac{j}{n}\Big)\Big|^{2} &= \frac{1}{n}\Bigg(\sum_{t=1}^{n}x_{t}\cos\Big(\frac{2\pi t j}{n}\Big)\Bigg)^{2} + \frac{1}{n}\Bigg(\sum_{t=1}^{n}x_{t}\sin\Big(\frac{2\pi t j}{n}\Big)\Bigg)^{2} \end{aligned}$$

And the scaled periodogram can be calculated as:

$$\begin{aligned} P\Big(\frac{j}{n}\Big) &= \frac{4}{n}\Big|d\Big(\frac{j}{n}\Big)\Big|^{2} \end{aligned}$$

And note that there is a mirroring effect at the folding frequency of $\frac{1}{2}$:

$$\begin{aligned} P\Big(\frac{j}{n}\Big) &= P\Big(1 - \frac{j}{n}\Big), \qquad j = 0, 1, \cdots, n-1 \end{aligned}$$

Calculating the scaled periodogram in R (note that R does not include the factor $\frac{1}{\sqrt{n}}$ in fft()):

n <- 100
dat$P <- 4 / n *
  Mod(fft(dat$x) / sqrt(n))^2    
dat$freq <- 0:99 / 100

dat |>
  ggplot(aes(x = freq, y = P)) +   
  geom_segment(
    aes(
      x = freq,
      xend = freq,
      y = 0,
      yend = P
    )
  ) +
  theme_tq()

From R we can see the 3 frequencies in the periodogram:

$$\begin{aligned} P\Big(\frac{6}{100}\Big) &= P\Big(\frac{94}{100}\Big) = 13\\ P\Big(\frac{10}{100}\Big) &= P\Big(\frac{90}{100}\Big) = 41\\ P\Big(\frac{40}{100}\Big) &= P\Big(\frac{60}{100}\Big) = 85 \end{aligned}$$

Otherwise:

$$\begin{aligned} P\Big(\frac{j}{100}\Big) &= 0 \end{aligned}$$

> dat$P[6 + 1]
[1] 13
> dat$P[(100 - 6) + 1]
[1] 13

> dat$P[10 + 1]
[1] 41
> dat$P[(100 - 10) + 1]   
[1] 41

> dat$P[40 + 1]
[1] 85
> dat$P[(100 - 40) + 1]
[1] 85

Example: Star Magnitude

The star dataset contains the magnitude of a star taken at midnight for 600 consecutive days. The two most prominent periodic components in the periodogram are the 29-day and 24-day cycles:

star <- as_tsibble(star)

n <- nrow(star)
dat <- tibble(
  P = 4 / n *
    Mod(fft(star$value - mean(star$value)) / sqrt(n))^2,
  freq = (0:(n - 1)) / n
)

dat |>
  slice(1:50) |>
  ggplot(aes(x = freq, y = P)) +
  geom_segment(
    aes(
      x = freq,
      xend = freq,
      y = 0,
      yend = P
    )
  ) +
  annotate("text", label = "24 day cycle", x = 0.047, y = 62) +    
  annotate("text", label = "29 day cycle", x = 0.03, y = 74) +
  theme_tq()

freqs <- dat |>
  slice(1:50) |>
  arrange(desc(P)) |>
  slice(1:2) |>
  pull(freq)

> freqs
[1] 0.03500000 0.04166667  

> 1 / freqs
[1] 28.57143 24.00000

Spectral Density

In this section, we define the fundamental frequency domain tool, the spectral density. In addition, we discuss the spectral representations for stationary processes. Just as the Wold decomposition theoretically justified the use of regression for analyzing time series, the spectral representation theorems supply the theoretical justifications for decomposing stationary time series into periodic components appearing in proportion to their underlying variances.

Consider the following periodic stationary random process:

$$\begin{aligned} x_{t} = U_{1}\cos(2\pi \omega_{0}t) + U_{2}\sin(2\pi \omega_{0}t) \end{aligned}$$

Recall that:

$$\begin{aligned} \gamma(h) &= \sigma^{2}\cos(2\pi\omega_{0}h) \end{aligned}$$

Using Euler’s formula for cosine:

$$\begin{aligned} \cos(\alpha) &= \frac{e^{i\alpha} + e^{-i\alpha}}{2} \end{aligned}$$

$$\begin{aligned} \gamma(h) &= \sigma^{2}\cos(2\pi\omega_{0}h)\\ &= \frac{\sigma^{2}}{2}e^{-2\pi i \omega_{0}h} + \frac{\sigma^{2}}{2}e^{2\pi i \omega_{0}h} \end{aligned}$$

And using Riemann-Stieltjes integration, we can show:

$$\begin{aligned} \gamma(h) &= \sigma^{2}\cos(2\pi\omega_{0}h)\\ &= \frac{\sigma^{2}}{2}e^{-2\pi i \omega_{0}h} + \frac{\sigma^{2}}{2}e^{2\pi i \omega_{0}h}\\ &= \int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i \omega h}\ dF(\omega) \end{aligned}$$

Where $F(\omega)$ is defined as:

$$\begin{aligned} F(\omega) &= \begin{cases} 0 & \omega < -\omega_{0}\\ \frac{\sigma^{2}}{2} & -\omega_{0} \leq \omega < \omega_{0}\\ \sigma^{2} & \omega \geq \omega_{0} \end{cases} \end{aligned}$$

$F(\omega)$ behaves like a cumulative distribution function of variances. Hence, we term $F(\omega)$ the spectral distribution function.

Spectral Representation of an Autocovariance Function

If $\{x_{t}\}$ is stationary with $\gamma(h) = \text{cov}(x_{t+h}, x_{t})$, then there exists a unique monotonically increasing spectral distribution function $F(\omega)$ such that:

$$\begin{aligned} F(-\infty) &= F\Big(-\frac{1}{2}\Big) = 0\\ F(\infty) &= F\Big(\frac{1}{2}\Big) = \gamma(0)\\ \gamma(h) &= \int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i \omega h}\ dF(\omega) \end{aligned}$$

If the autocovariance function further satisfies:

$$\begin{aligned} \sum_{h=-\infty}^{\infty}|\gamma(h)| < \infty \end{aligned}$$

Then it has the spectral representation:

$$\begin{aligned} \gamma(h) &= \int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i \omega h}f(\omega)\ d\omega, \qquad h = 0, \pm 1, \pm 2, \cdots \end{aligned}$$

And the inverse transform of the spectral density:

$$\begin{aligned} f(\omega) &= \sum_{h=-\infty}^{\infty}\gamma(h)e^{-2\pi i \omega h}, \qquad -\frac{1}{2} \leq \omega \leq \frac{1}{2} \end{aligned}$$

The spectral density is the analogue of the probability density function, satisfying:

$$\begin{aligned} f(\omega) &\geq 0\\ f(\omega) &= f(-\omega) \end{aligned}$$

The variance is the integrated spectral density over all of the frequencies:

$$\begin{aligned} \gamma(0) &= \text{var}(x_{t})\\ &= \int_{-\frac{1}{2}}^{\frac{1}{2}}f(\omega)\ d\omega \end{aligned}$$

The autocovariance and the spectral distribution functions contain the same information. That information, however, is expressed in different ways. The autocovariance function expresses information in terms of lags, whereas the spectral distribution expresses the same information in terms of cycles.

Some problems are easier to work with when considering lagged information, and we would tend to handle those problems in the time domain. Other problems are easier to work with when considering periodic information, and we would tend to handle those in the spectral domain.

We note that the autocovariance function, $\gamma(h)$, and the spectral density, $f(\omega)$, are Fourier transform pairs. In particular, this means that if $f(\omega)$ and $g(\omega)$ are two spectral densities for which:

$$\begin{aligned} \gamma_{f}(h) &= \int_{-\frac{1}{2}}^{\frac{1}{2}}f(\omega)e^{2\pi i \omega h}\ d\omega\\ &= \int_{-\frac{1}{2}}^{\frac{1}{2}}g(\omega)e^{2\pi i \omega h}\ d\omega\\ &= \gamma_{g}(h) \end{aligned}$$

Then:

$$\begin{aligned} f(\omega) &= g(\omega) \end{aligned}$$

Example: White Noise Series

Recall that the variance of white noise is:

$$\begin{aligned} \gamma(0) &= \int_{-\frac{1}{2}}^{\frac{1}{2}}f(\omega)\ d\omega\\ &= \sigma_{w}^{2} \end{aligned}$$

And:

$$\begin{aligned} f(\omega) &= \sum_{h=-\infty}^{\infty}\gamma(h)e^{-2\pi i \omega h}\\ &= \gamma(0) + 0 + \cdots\\ &= \sigma_{w}^{2} \end{aligned}$$

Hence the process contains equal power at all frequencies. The name white noise comes from the analogy to white light.

Output Spectrum of a Filtered Stationary Series

A linear filter uses a set of specified coefficients $a_{j}$ to transform an input series $x_{t}$ into an output series $y_{t}$:

$$\begin{aligned} y_{t} &= \sum_{j=-\infty}^{\infty}a_{j}x_{t-j}, \qquad \sum_{j=-\infty}^{\infty}|a_{j}| < \infty \end{aligned}$$

This form is also called a convolution, and the coefficients are called the impulse response function.

The Fourier transform of the coefficients is called the frequency response function:

$$\begin{aligned} A(\omega) &= \sum_{j=-\infty}^{\infty}a_{j}e^{-2\pi i \omega j} \end{aligned}$$

If $x_{t}$ has spectrum $f_{x}(\omega)$, then the spectrum of the output $y_{t}$ is:

$$\begin{aligned} f_{y}(\omega) &= |A(\omega)|^{2}f_{x}(\omega) \end{aligned}$$

The Spectral Density of ARMA

If $x_{t}$ is ARMA, then:

$$\begin{aligned} x_{t} &= \sum_{j=0}^{\infty}\psi_{j}w_{t-j}, \qquad \sum_{j=0}^{\infty}|\psi_{j}| < \infty \end{aligned}$$

We know the spectral density of white noise, and the linear process operator:

$$\begin{aligned} f_{w}(\omega) &= \sigma_{w}^{2}\\ \psi(z) &= \frac{\theta(z)}{\phi(z)} \end{aligned}$$

We can show that the spectral density of an ARMA(p, q) process $x_{t}$ is:

$$\begin{aligned} f_{x}(\omega) &= \sigma_{w}^{2}\frac{|\theta(e^{-2\pi i \omega})|^{2}}{|\phi(e^{-2\pi i \omega})|^{2}} \end{aligned}$$

$$\begin{aligned} \phi(z) &= 1 - \sum_{k=1}^{p}\phi_{k}z^{k}\\ \theta(z) &= 1 + \sum_{k=1}^{q}\theta_{k}z^{k}\\ A(\omega) &= \frac{\theta(e^{-2\pi i \omega})}{\phi(e^{-2\pi i \omega})} \end{aligned}$$

Example: Moving Average

Consider the MA(1) model:

$$\begin{aligned} x_{t} = w_{t} + 0.5w_{t-1} \end{aligned}$$

The autocovariance function:

$$\begin{aligned} \gamma(0) &= (1 + 0.5^{2})\sigma_{w}^{2} = 1.25\sigma_{w}^{2}\\ \gamma(\pm 1) &= 0.5\sigma_{w}^{2}\\ \gamma(\pm h) &= 0, \qquad |h| > 1 \end{aligned}$$

The spectral density:

$$\begin{aligned} f_{x}(\omega) &= \sum_{h=-\infty}^{\infty}\gamma(h)e^{-2\pi i \omega h}\\ &= \sigma_{w}^{2}[1.25 + 0.5(e^{-2\pi i \omega} + e^{2\pi i \omega})]\\ &= \sigma_{w}^{2}[1.25 + \cos(2\pi \omega)] \end{aligned}$$
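As a quick numeric check (a sketch, taking $\sigma_{w}^{2} = 1$), the spectral density integrates to $\gamma(0) = 1.25$ as required:

f <- function(omega) 1.25 + cos(2 * pi * omega)  # f_x with sigma_w^2 = 1
integrate(f, lower = -0.5, upper = 0.5)$value    # 1.25, i.e. gamma(0)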

Computing the spectral density in a different way:

$$\begin{aligned} f_{x}(\omega) &= \sigma_{w}^{2}\frac{|\theta(e^{-2\pi i \omega})|^{2}}{|\phi(e^{-2\pi i \omega})|^{2}}\\ &= \sigma_{w}^{2}|1 + 0.5e^{-2\pi i \omega}|^{2}\\ &= \sigma_{w}^{2}(1 + 0.5e^{-2\pi i \omega})(1 + 0.5e^{2\pi i \omega})\\ &= \sigma_{w}^{2}[1.25 + 0.5(e^{-2\pi i \omega} + e^{2\pi i \omega})]\\ &= \sigma_{w}^{2}[1.25 + \cos(2\pi \omega)] \end{aligned}$$

$$\begin{aligned} \theta(z) &= 1 + 0.5z \end{aligned}$$

We can plot the spectral density in R:

library(astsa)
arma.spec(
  ma = .5,
  main = "Moving Average"  
)

Example: An AR(2) Model

Consider an AR(2) model:

$$\begin{aligned} x_{t} - x_{t-1} + 0.9x_{t-2} = w_{t} \end{aligned}$$

Note that:

$$\begin{aligned} \theta(z) &= 1\\ \phi(z) &= 1 - z + 0.9z^{2} \end{aligned}$$

Hence:

$$\begin{aligned} |\phi(e^{-2\pi i \omega})|^{2} &= (1 - e^{-2\pi i \omega} + 0.9e^{-4\pi i \omega})(1 - e^{2\pi i \omega} + 0.9e^{4\pi i \omega})\\ &= 2.81 - 1.9(e^{2\pi i \omega} + e^{-2\pi i \omega}) + 0.9(e^{4\pi i \omega} + e^{-4\pi i \omega})\\ &= 2.81 - 3.8\cos(2\pi\omega) + 1.8\cos(4\pi\omega) \end{aligned}$$

$$\begin{aligned} f_{x}(\omega) &= \frac{\sigma_{w}^{2}}{2.81 - 3.8\cos(2\pi\omega) + 1.8\cos(4\pi\omega)} \end{aligned}$$

Graphing this in R:

arma.spec(
  ar = c(1, -0.9),
  main = "Autoregression"  
)

Example: Every Explosion has a Cause

Suppose that:

$$\begin{aligned} x_{t} &= 2x_{t-1} + w_{t}, \qquad w_{t} \sim N(0, \sigma_{w}^{2}) \end{aligned}$$

The spectral density of $x_{t}$ is:

$$\begin{aligned} w_{t} &= x_{t} - 2x_{t-1}\\ A(\omega) &= 1 - 2e^{-2\pi i \omega}\\ f_{w}(\omega) &= |A(\omega)|^{2}f_{x}(\omega)\\ f_{x}(\omega) &= \frac{f_{w}(\omega)}{|A(\omega)|^{2}} = \sigma_{w}^{2}|1 - 2e^{-2\pi i \omega}|^{-2} \end{aligned}$$

But:

$$\begin{aligned} |1 - 2e^{-2\pi i \omega}| &= |1 - 2e^{2\pi i \omega}|\\ &= \Big|(2e^{2\pi i \omega})\Big(\frac{1}{2}e^{-2\pi i \omega} - 1\Big)\Big|\\ &= 2\Big|1 - \frac{1}{2}e^{-2\pi i \omega}\Big| \end{aligned}$$

Thus, we can rewrite:

$$\begin{aligned} f_{x}(\omega) &= \frac{1}{4}\sigma_{w}^{2}\Big|1 - \frac{1}{2}e^{-2\pi i \omega}\Big|^{-2} \end{aligned}$$

which is the spectral density of the causal process:

$$\begin{aligned} x_{t} &= \frac{1}{2}x_{t-1} + v_{t}, \qquad v_{t} \sim N\Big(0, \frac{1}{4}\sigma_{w}^{2}\Big) \end{aligned}$$
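A small numeric check (our own sketch, with $\sigma_{w}^{2} = 1$) that the explosive and causal forms have the same spectral density:

omega <- seq(-0.5, 0.5, by = 0.01)
f1 <- Mod(1 - 2 * exp(-2i * pi * omega))^(-2)           # explosive form
f2 <- 0.25 * Mod(1 - 0.5 * exp(-2i * pi * omega))^(-2)  # causal form
all.equal(f1, f2)  # TRUE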

Discrete Fourier Transform

Given data $x_{1}, \cdots, x_{n}$, we define the DFT to be:

$$\begin{aligned} d(\omega_{j}) &= \frac{1}{\sqrt{n}}\sum_{t=1}^{n}x_{t}e^{-2\pi i \omega_{j}t}, \qquad \omega_{j} = \frac{j}{n}, \qquad j = 0, 1, \cdots, n-1 \end{aligned}$$

The $\omega_{j}$ are called the Fourier or fundamental frequencies.

If n is a highly composite integer (it has many factors), the DFT can be computed by FFT.

The inverse DFT:

$$\begin{aligned} x_{t} &= \frac{1}{\sqrt{n}}\sum_{j=0}^{n-1}d(\omega_{j})e^{2\pi i \omega_{j}t} \end{aligned}$$

We define the periodogram to be:

$$\begin{aligned} I(\omega_{j}) &= |d(\omega_{j})|^{2}, \qquad j = 0, 1, \cdots, n-1 \end{aligned}$$

Note that:

$$\begin{aligned} I(0) &= n\bar{x}^{2} \end{aligned}$$

The periodogram can be viewed as the sample spectral density of $x_{t}$:

$$\begin{aligned} I(\omega_{j}) &= |d(\omega_{j})|^{2}\\ &= \frac{1}{\sqrt{n}}\sum_{t=1}^{n}x_{t}e^{-2\pi i \omega_{j}t}\ \frac{1}{\sqrt{n}}\sum_{s=1}^{n}x_{s}e^{2\pi i \omega_{j}s}\\ &= \frac{1}{n}\sum_{t=1}^{n}\sum_{s=1}^{n}x_{t}x_{s}e^{-2\pi i \omega_{j}(t-s)}\\ &= \frac{1}{n}\sum_{h=-(n-1)}^{n-1}\sum_{t=1}^{n-|h|}x_{t+|h|}x_{t}e^{-2\pi i \omega_{j}h}\\ &= \sum_{h=-(n-1)}^{n-1}\hat{\gamma}(h)e^{-2\pi i \omega_{j}h} \end{aligned}$$
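A sketch verifying this identity on simulated data (our own example; acf() is called with demean = FALSE because the derivation assumes a mean-zero series):

set.seed(1)
n <- 64
x <- rnorm(n)
I <- Mod(fft(x))^2 / n  # periodogram at the Fourier frequencies j/n
g <- acf(x, lag.max = n - 1, type = "covariance",
         demean = FALSE, plot = FALSE)$acf
j <- 5
h <- -(n - 1):(n - 1)
Re(sum(g[abs(h) + 1] * exp(-2i * pi * (j / n) * h)))  # equals I[j + 1]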

It is sometimes useful to work with the real and imaginary parts of the DFT individually.

We define the cosine transform as:

$$\begin{aligned} d_{c}(\omega_{j}) &= \frac{1}{\sqrt{n}}\sum_{t=1}^{n}x_{t}\cos(2\pi\omega_{j}t) \end{aligned}$$

And the sine transform:

$$\begin{aligned} d_{s}(\omega_{j}) &= \frac{1}{\sqrt{n}}\sum_{t=1}^{n}x_{t}\sin(2\pi\omega_{j}t), \qquad \omega_{j} = \frac{j}{n}, \qquad j = 0, 1, \cdots, n-1 \end{aligned}$$

Note that:

$$\begin{aligned} d(\omega_{j}) &= d_{c}(\omega_{j}) - i\,d_{s}(\omega_{j})\\ I(\omega_{j}) &= d_{c}^{2}(\omega_{j}) + d_{s}^{2}(\omega_{j}) \end{aligned}$$

Example: Spectral ANOVA

We have also discussed the fact that spectral analysis can be thought of as an analysis of variance.

Recall that we assumed previously that:

$$\begin{aligned} x_{t} &= a_{0} + \sum_{j=1}^{\frac{n-1}{2}}\big[a_{j}\cos(2\pi \omega_{j}t) + b_{j}\sin(2\pi \omega_{j}t)\big], \qquad t = 1, \cdots, n \end{aligned}$$

Using multiple regression, we could show that the coefficients are:

$$\begin{aligned} a_{0} &= \bar{x}\\ a_{j} &= \frac{2}{n}\sum_{t=1}^{n}x_{t}\cos(2\pi \omega_{j}t) = \frac{2}{\sqrt{n}}d_{c}(\omega_{j})\\ b_{j} &= \frac{2}{n}\sum_{t=1}^{n}x_{t}\sin(2\pi \omega_{j}t) = \frac{2}{\sqrt{n}}d_{s}(\omega_{j}) \end{aligned}$$

Hence:

$$\begin{aligned} (x_{t} - \bar{x}) &= \frac{2}{\sqrt{n}}\sum_{j=1}^{\frac{n-1}{2}}\big[d_{c}(\omega_{j})\cos(2\pi \omega_{j}t) + d_{s}(\omega_{j})\sin(2\pi \omega_{j}t)\big], \qquad t = 1, \cdots, n \end{aligned}$$

Squaring both sides and summing:

$$\begin{aligned} \sum_{t=1}^{n}(x_{t} - \bar{x})^{2} &= 2\sum_{j=1}^{\frac{n-1}{2}}\big[d_{c}^{2}(\omega_{j}) + d_{s}^{2}(\omega_{j})\big]\\ &= 2\sum_{j=1}^{\frac{n-1}{2}}I(\omega_{j}) \end{aligned}$$

Thus, we have partitioned the sum of squares into harmonic components represented by frequency ωj\omega_{j} with the periodogram, I(ωj)I(\omega_{j}), being the mean square regression.

We expect the sums of squares for the components at frequencies $\omega_{1}, \cdots, \omega_{\frac{n-1}{2}}$ to be:

$$\begin{aligned} 2I(\omega_{1}),\ 2I(\omega_{2}),\ \cdots,\ 2I(\omega_{\frac{n-1}{2}}) \end{aligned}$$

And the mean squares (each component has 2 degrees of freedom):

$$\begin{aligned} I(\omega_{1}),\ I(\omega_{2}),\ \cdots,\ I(\omega_{\frac{n-1}{2}}) \end{aligned}$$
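A tiny numeric sketch of this partition (a toy series of our own with n = 5, so there are (n - 1)/2 = 2 fundamental frequencies):

x <- c(1, 2, 3, 2, 1)
n <- length(x)
I <- Mod(fft(x))^2 / n  # periodogram I(j/n), j = 0, ..., 4
sum((x - mean(x))^2)    # total sum of squares: 2.8
2 * sum(I[2:3])         # 2 * (I(1/5) + I(2/5)): also 2.8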

Example: Spectral Analysis as Principal Component Analysis

If $X = (x_{1}, \cdots, x_{n})$ are n values of a mean-zero time series with spectral density $f_{x}(\omega)$, then:

$$\begin{aligned} \text{cov}(X) &= \boldsymbol{\Gamma}_{n} = \begin{bmatrix} \gamma(0) & \gamma(1) & \cdots & \gamma(n-1)\\ \gamma(1) & \gamma(0) & \cdots & \gamma(n-2)\\ \vdots & \vdots & \ddots & \vdots\\ \gamma(n-1) & \gamma(n-2) & \cdots & \gamma(0) \end{bmatrix} \end{aligned}$$

Then the eigenvalues of $\boldsymbol{\Gamma}_{n}$ are approximately:

$$\begin{aligned} \lambda_{j} &\approx f(\omega_{j}) = \sum_{h=-\infty}^{\infty}\gamma(h)e^{-2\pi i h \frac{j}{n}} \end{aligned}$$

With approximate eigenvectors:

$$\begin{aligned} \textbf{g}_{j} &= \frac{1}{\sqrt{n}} \begin{bmatrix} e^{-2\pi i \cdot 0 \cdot \frac{j}{n}}\\ e^{-2\pi i \cdot 1 \cdot \frac{j}{n}}\\ \vdots\\ e^{-2\pi i (n-1)\frac{j}{n}} \end{bmatrix}, \qquad j = 0, 1, \cdots, n-1 \end{aligned}$$

If we let $\boldsymbol{G}$ be the complex matrix with columns $\boldsymbol{g}_{j}$, then the complex vector $\boldsymbol{Y} = \boldsymbol{G}^{T}\boldsymbol{X}$ has elements that are the DFTs:

$$\begin{aligned} \boldsymbol{Y} &= \boldsymbol{G}^{T}\boldsymbol{X} = \frac{1}{\sqrt{n}}\begin{bmatrix} \sum_{t=1}^{n}x_{t}e^{-2\pi i t \frac{1}{n}}\\ \sum_{t=1}^{n}x_{t}e^{-2\pi i t \frac{2}{n}}\\ \vdots\\ \sum_{t=1}^{n}x_{t}e^{-2\pi i t \frac{n-1}{n}} \end{bmatrix} \end{aligned}$$

The elements of $\textbf{Y}$ are asymptotically uncorrelated complex random variables with mean zero and variance $f(\omega_{j})$.

We can also recover $\textbf{X}$ via the inverse transform:

$$\begin{aligned} \textbf{X} &= \bar{\boldsymbol{G}}\textbf{Y}, \qquad x_{t} = \frac{1}{\sqrt{n}}\sum_{j=0}^{n-1}y_{j}e^{2\pi i t \frac{j}{n}} \end{aligned}$$
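A sketch of the eigenvalue approximation, using a toy autocovariance of our own choosing, $\gamma(h) = 0.9^{|h|}$: the largest eigenvalue of $\boldsymbol{\Gamma}_{n}$ should be close to the (truncated) spectral density at $\omega = 0$:

n <- 256
Gamma <- toeplitz(0.9^(0:(n - 1)))          # Gamma_n for gamma(h) = 0.9^|h|
max(eigen(Gamma, symmetric = TRUE)$values)  # largest eigenvalue
sum(0.9^abs(-(n - 1):(n - 1)))              # f(0): truncated sum of gamma(h)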

Example: Periodogram of SOI and Recruitment Series

The FFT utilizes a number of redundancies in the calculation of the DFT when n is highly composite; that is, an integer with many factors of 2, 3, or 5, the best case being when $n = 2^{p}$ is a power of 2. To accommodate this property, we can pad the centered (or detrended) data of length n to the next highly composite integer by adding zeros. Note that this alters the fundamental frequency ordinates.

We can see a peak at the 12 month cycle for SOI and 4 year cycle for Rec:

par(mfrow = c(2, 1))
soi.per <- mvspec(soi, log = "no")   
abline(v = 1 / 4, lty = 2)
rec.per <- mvspec(rec, log = "no")
abline(v = 1 / 4, lty = 2)
par(mfrow = c(1, 1))

Nonparametric Spectral Estimation

We introduce a frequency band $\mathcal{B}$ of $L \ll n$ contiguous fundamental frequencies, centered around frequency $\omega_{j} = \frac{j}{n}$, which is chosen close to a frequency of interest, $\omega$.

For frequencies of the form:

$$\begin{aligned} \mathcal{B} &= \Big\{\omega^{*}: \omega_{j} - \frac{m}{n} \leq \omega^{*} \leq \omega_{j} + \frac{m}{n}\Big\} \end{aligned}$$

$$\begin{aligned} \omega^{*} &= \omega_{j} + \frac{k}{n}, \qquad k = -m, \cdots, 0, \cdots, m, \qquad L = 2m + 1 \end{aligned}$$

Such that the spectral values in the interval $\mathcal{B}$:

$$\begin{aligned} f\Big(\omega_{j} + \frac{k}{n}\Big), \qquad k = -m, \cdots, 0, \cdots, m \end{aligned}$$

Are approximately equal to $f(\omega)$.

This structure can be realized for large sample sizes.

We now define an averaged (or smoothed) periodogram as the average of the periodogram values over the band B\mathcal{B}:

$$\begin{aligned} \bar{f}(\omega) &= \frac{1}{L}\sum_{k=-m}^{m}I\Big(\omega_{j} + \frac{k}{n}\Big) \end{aligned}$$

The bandwidth is defined as:

$$\begin{aligned} B = \frac{L}{n} \end{aligned}$$

Example: Averaged Periodogram for SOI and Recruitment

Let's show a smoothed periodogram for the SOI and Recruitment series. Using $L = 9$ (found via trial and error), which results in:

$$\begin{aligned} L &= 2m + 1\\ m &= \frac{L - 1}{2} = 4 \end{aligned}$$

In R, using the Daniell kernel:

soi.ave <- mvspec(
  soi,
  kernel("daniell", 4),   
  log = "no"
)
abline(
  v = c(.25, 1, 2, 3),
  lty = 2
)

Contrast this with the periodogram:

n <- nrow(soi)
soi$P <- 4 / n *
  Mod(fft(soi$value) / sqrt(n))^2    
soi$freq <- 0:(n-1) / n

soi |>
  slice(2:(n() / 2)) |>
  ggplot(aes(x = freq, y = P)) +
  geom_segment(
    aes(
      x = freq,
      xend = freq,
      y = 0,
      yend = P
    )
  ) +
  ylab("") +
  theme_tq()

Note that mvspec shows the frequency in years rather than months.

The smoothed spectra shown provide a sensible compromise between the noisy version, and a more heavily smoothed spectrum, which might lose some of the peaks.

An undesirable effect of averaging can be noticed at the yearly cycle, $\omega = 1\Delta$, where $\Delta = \frac{1}{12}$: the narrow band peaks that appeared in the periodograms have been flattened and spread out to nearby frequencies.

We also notice, and have marked, the appearance of harmonics of the yearly cycle, that is, frequencies of the form $\omega = k\Delta$ for $k = 1, 2, \cdots$.

The bandwidth:

> soi.ave$n.used
[1] 480

> soi.ave$bandwidth  
[1] 0.225

Is calculated as $\frac{L}{n}$ in cycles per month and then converted to cycles per year (the series is padded to 480 months):

$$\begin{aligned} B &= \frac{L}{n} = \frac{9}{480} = 0.01875 \text{ cycles per month}\\ &= \frac{9}{480} \times 12 = 0.225 \text{ cycles per year} \end{aligned}$$

Rather than using simple averaging, we can employ weighted averaging, such that the weights decrease as the distance from the center increases. An example is the modified Daniell kernel, which we use in the example below:

Example: Smoothed Periodogram for SOI and Recruitment

We will estimate the spectrum of the SOI series using a modified Daniell kernel applied twice, with $m = 3$ both times (this is done by c(3, 3) in R).

k <- kernel(
  "modified.daniell",
  c(3, 3)
)
soi.smo <- mvspec(
  soi,
  kernel = k,
  taper = .1,
  log = "no"
)
abline(v = c(.25, 1), lty = 2)   

Tapering is done on the upper and lower 10% of the data.

Tapering

Suppose $x_{t}$ is a zero-mean, stationary process with spectral density $f_{x}(\omega)$, and we replace the original series by the tapered series $y_{t}$:

$$\begin{aligned} y_{t} &= h_{t}x_{t}, \qquad t = 1, 2, \cdots, n \end{aligned}$$

Taking the DFT of yty_{t}:

$$\begin{aligned} d_{y}(\omega_{j}) &= \frac{1}{\sqrt{n}}\sum_{t=1}^{n}h_{t}x_{t}e^{-2\pi i \omega_{j}t} \end{aligned}$$

Taking the expectation:

$$\begin{aligned} E[|d_{y}(\omega_{j})|^{2}] &= \int_{-\frac{1}{2}}^{\frac{1}{2}}|H_{n}(\omega_{j} - \omega)|^{2}f_{x}(\omega)\ d\omega\\ &= \int_{-\frac{1}{2}}^{\frac{1}{2}}W_{n}(\omega_{j} - \omega)f_{x}(\omega)\ d\omega \end{aligned}$$

Where:

$$\begin{aligned} W_{n}(\omega) &= |H_{n}(\omega)|^{2}\\ H_{n}(\omega) &= \frac{1}{\sqrt{n}}\sum_{t=1}^{n}h_{t}e^{-2\pi i \omega t} \end{aligned}$$

$W_{n}(\omega)$ is called a spectral window. In the case that $h_{t} = 1$ for all t, $|d_{y}(\omega_{j})|^{2}$ is just the periodogram of the data and the window is:

$$\begin{aligned} W_{n}(\omega) &= \frac{\sin^{2}(n\pi\omega)}{n\sin^{2}(\pi\omega)} \end{aligned}$$

Tapers generally have a shape that enhances the center of the data relative to the extremities, such as a cosine bell of the form:

$$\begin{aligned} h_{t} &= 0.5\Big[1 + \cos\Big(\frac{2\pi(t - \frac{n+1}{2})}{n}\Big)\Big] \end{aligned}$$
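A short sketch of the cosine bell shape (the choice n = 480 is arbitrary):

n <- 480
t <- 1:n
h <- 0.5 * (1 + cos(2 * pi * (t - (n + 1) / 2) / n))
plot(t, h, type = "l", ylab = "h(t)", main = "Cosine bell taper")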

The untapered window contains considerable ripples over the band and outside the band. The ripples outside the band are called sidelobes; they tend to introduce frequencies from outside the interval that may contaminate the desired spectral estimate within the band.

Parametric Spectral Estimation

The methods of the previous section lead to what is generally referred to as non-parametric spectral estimators because no assumption is made about the parametric form of the spectral density.

However, we know the spectral density of an ARMA process:

$$\begin{aligned} f_{x}(\omega) &= \sigma_{w}^{2}\frac{|\theta(e^{-2\pi i \omega})|^{2}}{|\phi(e^{-2\pi i \omega})|^{2}} \end{aligned}$$

For convenience, a parametric spectral estimator is obtained by fitting an AR(p) to the data, where the order p is determined by one of the model selection criteria, such as AIC, AICc, and BIC. Parametric autoregressive spectral estimators will often have superior resolution in problems when several closely spaced narrow spectral peaks are present and are preferred by engineers for a broad variety of problems.

If $\hat{\phi}_{1}, \hat{\phi}_{2}, \cdots, \hat{\phi}_{p}, \hat{\sigma}_{w}^{2}$ are the estimates from an AR(p) fit to $x_{t}$, a parametric spectral estimate of $f_{x}(\omega)$ is attained by substituting these estimates into:

$$\begin{aligned} \hat{f}_{x}(\omega) &= \frac{\hat{\sigma}_{w}^{2}}{|\hat{\phi}(e^{-2\pi i \omega})|^{2}}, \qquad \hat{\phi}(z) = 1 - \hat{\phi}_{1}z - \hat{\phi}_{2}z^{2} - \cdots - \hat{\phi}_{p}z^{p} \end{aligned}$$

To do this in R:

spaic <- spec.ar(as.ts(soi), log = "no")
abline(v = frequency(soi) * 1 / 52, lty = 3)   

Multiple Series and Cross-Spectra

The notion of analyzing frequency fluctuations using classical statistical ideas extends to the case in which there are several jointly stationary series, for example, $x_{t}$ and $y_{t}$. In this case, we can introduce the idea of a correlation indexed by frequency, called the coherence.

It can be shown that the covariance function:

$$\begin{aligned} \gamma_{xy}(h) &= E[(x_{t+h} - \mu_{x})(y_{t} - \mu_{y})] \end{aligned}$$

Has the representation:

$$\begin{aligned} \gamma_{xy}(h) &= \int_{-\frac{1}{2}}^{\frac{1}{2}}f_{xy}(\omega)e^{2\pi i \omega h}\ d\omega \end{aligned}$$

Where the cross-spectrum is defined as the Fourier transform:

$$\begin{aligned} f_{xy}(\omega) &= \sum_{h=-\infty}^{\infty}\gamma_{xy}(h)e^{-2\pi i \omega h}, \qquad -\frac{1}{2} \leq \omega \leq \frac{1}{2} \end{aligned}$$

Assuming that the cross-covariance function is absolutely summable, as was the case for the autocovariance.

An important example of the application of the cross-spectrum is to the problem of predicting an output series yty_{t} from some input series xtx_{t} through a linear filter relation such as the three-point moving average considered below.

The cross-spectrum is generally a complex-valued function:

$$\begin{aligned} f_{xy}(\omega) &= c_{xy}(\omega) - i\,q_{xy}(\omega)\\ c_{xy}(\omega) &= \sum_{h=-\infty}^{\infty}\gamma_{xy}(h)\cos(2\pi\omega h)\\ q_{xy}(\omega) &= \sum_{h=-\infty}^{\infty}\gamma_{xy}(h)\sin(2\pi\omega h) \end{aligned}$$

Where $c_{xy}(\omega)$ and $q_{xy}(\omega)$ are defined as the cospectrum and quadspectrum, respectively.

Because $\gamma_{yx}(h) = \gamma_{xy}(-h)$:

$$\begin{aligned} f_{yx}(\omega) &= f_{xy}^{*}(\omega)\\ c_{yx}(\omega) &= c_{xy}(\omega)\\ q_{yx}(\omega) &= -q_{xy}(\omega) \end{aligned}$$

A measure of the strength of such a relation is the squared coherence function, defined as:

$$\begin{aligned} \rho_{yx}^{2}(\omega) &= \frac{|f_{yx}(\omega)|^{2}}{f_{xx}(\omega)f_{yy}(\omega)} \end{aligned}$$

Notice the similarity with the conventional squared correlation:

$$\begin{aligned} \rho_{yx}^{2} &= \frac{\sigma_{yx}^{2}}{\sigma_{x}^{2}\sigma_{y}^{2}} \end{aligned}$$

Coherency is a tool for relating the common periodic behavior of two series. Coherency is a frequency based measure of the correlation between two series at a given frequency, and it measures the performance of the best linear filter relating the two series.
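As a sketch of how the squared coherence is estimated in practice for SOI and Recruitment (the smoothing choice L = 19, via kernel("daniell", 9), and the 0.999 significance level are assumptions following the book's example):

library(astsa)
sr <- mvspec(cbind(soi, rec), kernel("daniell", 9), plot = FALSE)
f <- qf(.999, 2, sr$df - 2)  # F-quantile for testing zero coherence
C <- f / (18 + f)            # approximate threshold; 18 = L - 1
plot(sr, plot.type = "coh", ci.lty = 2)
abline(h = C)                # coherences above the line are significant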

Spectral Representation of a Vector Stationary Process

If $\boldsymbol{x}_{t}$ is a $p \times 1$ stationary process with autocovariance:

$$\begin{aligned} \boldsymbol{\Gamma}(h) &= E[(\boldsymbol{x}_{t+h} - \boldsymbol{\mu})(\boldsymbol{x}_{t} - \boldsymbol{\mu})^{T}] = \{\gamma_{jk}(h)\}, \qquad \sum_{h=-\infty}^{\infty}|\gamma_{jk}(h)| < \infty \end{aligned}$$

$$\begin{aligned} \boldsymbol{x}_{t} &= \begin{bmatrix} x_{t1}\\ x_{t2}\\ \vdots\\ x_{tp} \end{bmatrix} \end{aligned}$$

Then:

$$\begin{aligned} \boldsymbol{\Gamma}(h) &= \int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i \omega h}\boldsymbol{f}(\omega)\ d\omega, \qquad h = 0, \pm 1, \pm 2, \cdots \end{aligned}$$

$$\begin{aligned} \boldsymbol{f}(\omega) &= \{f_{jk}(\omega)\}, \qquad j, k = 1, \cdots, p \end{aligned}$$

And the inverse transform of the spectral density matrix:

$$\begin{aligned} \textbf{f}(\omega) &= \sum_{h=-\infty}^{\infty}\boldsymbol{\Gamma}(h)e^{-2\pi i \omega h}, \qquad -\frac{1}{2} \leq \omega \leq \frac{1}{2} \end{aligned}$$

The spectral matrix $\textbf{f}(\omega)$ is Hermitian (equal to its conjugate transpose):

$$\begin{aligned} \textbf{f}(\omega) &= \textbf{f}^{*}(\omega) \end{aligned}$$

Example: Three-Point Moving Average

Consider a three-point moving average, where $x_{t}$ is a stationary input process with spectral density $f_{xx}(\omega)$:

$$\begin{aligned} y_{t} = \frac{x_{t-1} + x_{t} + x_{t+1}}{3} \end{aligned}$$

First, the cross-covariance function:

$$\begin{aligned} \gamma_{xy}(h) &= \text{cov}(x_{t+h}, y_{t})\\ &= \frac{1}{3}E[x_{t+h}(x_{t-1} + x_{t} + x_{t+1})]\\ &= \frac{1}{3}\big[\gamma_{xx}(h + 1) + \gamma_{xx}(h) + \gamma_{xx}(h - 1)\big] \end{aligned}$$

Recall that the autocovariance function has the representation:

$$\begin{aligned} \gamma(h) &= \int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i \omega h}f(\omega)\ d\omega \end{aligned}$$

Hence:

$$\begin{aligned} \gamma_{xy}(h) &= \frac{1}{3}\big[\gamma_{xx}(h + 1) + \gamma_{xx}(h) + \gamma_{xx}(h - 1)\big]\\ &= \frac{1}{3}\int_{-\frac{1}{2}}^{\frac{1}{2}}\big(e^{2\pi i \omega(h + 1)} + e^{2\pi i \omega h} + e^{2\pi i \omega(h - 1)}\big)f_{xx}(\omega)\ d\omega\\ &= \frac{1}{3}\int_{-\frac{1}{2}}^{\frac{1}{2}}\big(e^{2\pi i \omega} + 1 + e^{-2\pi i \omega}\big)e^{2\pi i \omega h}f_{xx}(\omega)\ d\omega\\ &= \frac{1}{3}\int_{-\frac{1}{2}}^{\frac{1}{2}}\big[1 + 2\cos(2\pi\omega)\big]f_{xx}(\omega)e^{2\pi i \omega h}\ d\omega \end{aligned}$$

Using the uniqueness of the Fourier transform, we conclude that the cross-spectrum is:

$$\begin{aligned} f_{xy}(\omega) &= \frac{1}{3}\big[1 + 2\cos(2\pi\omega)\big]f_{xx}(\omega) \end{aligned}$$

We will show later in the Linear Filter section that:

$$\begin{aligned} f_{y}(\omega) &= |A(\omega)|^{2}f_{x}(\omega)\\ A(\omega) &= \sum_{j=-\infty}^{\infty}a_{j}e^{-2\pi i \omega j} \end{aligned}$$

Thus:

$$\begin{aligned} f_{yy}(\omega) &= \Big(\frac{1}{3}\Big)^{2}\big|e^{2\pi i \omega} + 1 + e^{-2\pi i \omega}\big|^{2}f_{xx}(\omega)\\ &= \frac{1}{9}\big(1 + 2\cos(2\pi\omega)\big)^{2}f_{xx}(\omega) \end{aligned}$$

Finally getting the squared coherence:

$$\begin{aligned} \rho_{yx}^{2}(\omega) &= \frac{\Big|\frac{1}{3}\big[1 + 2\cos(2\pi\omega)\big]f_{xx}(\omega)\Big|^{2}}{f_{xx}(\omega)\times\frac{1}{9}\big(1 + 2\cos(2\pi\omega)\big)^{2}f_{xx}(\omega)} = 1 \end{aligned}$$

That is, the squared coherence between xtx_{t} and yty_{t} is unity over all frequencies. This is a characteristic inherited by more general linear filters. However, if some noise is added to the three-point moving average, the coherence is not unity.

Linear Filters

In this section, we explore that notion further by showing how linear filters can be used to extract signals from a time series. These filters modify the spectral characteristics of a time series in a predictable way.

Recall that:

$$\begin{aligned} y_{t} &= \sum_{j=-\infty}^{\infty}a_{j}x_{t-j}, \qquad \sum_{j=-\infty}^{\infty}|a_{j}| < \infty \end{aligned}$$

$$\begin{aligned} f_{yy}(\omega) &= |A_{yx}(\omega)|^{2}f_{xx}(\omega)\\ A_{yx}(\omega) &= \sum_{j=-\infty}^{\infty}a_{j}e^{-2\pi i \omega j} \end{aligned}$$

$A_{yx}(\omega)$ is known as the frequency response function. This result shows that the filtering effect can be characterized as a frequency-by-frequency multiplication by the squared magnitude of the frequency response function.

First Difference and Moving Average Filters

We consider two filters in this example.

The first difference filter:

$$\begin{aligned} y_{t} &= \nabla x_{t} = x_{t} - x_{t-1} \end{aligned}$$

And the annual symmetric moving average filter (which is a modified Daniell kernel with m = 6):

$$\begin{aligned} y_{t} &= \frac{1}{24}(x_{t-6} + x_{t+6}) + \frac{1}{12}\sum_{r=-5}^{5}x_{t-r} \end{aligned}$$

In general, differencing is an example of a high-pass filter because it retains or passes the higher frequencies, whereas the moving average is a low-pass filter because it passes the lower or slower frequencies.

Recall that:

$$\begin{aligned} A(\omega) &= \sum_{j=-\infty}^{\infty}a_{j}e^{-2\pi i \omega j} \end{aligned}$$

The first difference filter can be written as:

$$\begin{aligned} a_{0} &= 1\\ a_{1} &= -1\\ A_{yx}(\omega) &= 1 - e^{-2\pi i \omega} \end{aligned}$$

$$\begin{aligned} |A_{yx}(\omega)|^{2} &= (1 - e^{-2\pi i \omega})(1 - e^{2\pi i \omega})\\ &= 2\big[1 - \cos(2\pi \omega)\big] \end{aligned}$$

Plotting the squared frequency response function:
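A sketch of this plot in R:

omega <- seq(0, 0.5, by = 0.001)
A2 <- 2 * (1 - cos(2 * pi * omega))  # squared frequency response
plot(omega, A2, type = "l",
     xlab = "frequency", ylab = "First Difference")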

The first difference filter will attenuate the lower frequencies and enhance the higher frequencies because the multiplier of the spectrum, $|A_{yx}(\omega)|^{2}$, is large for the higher frequencies and small for the lower frequencies.

For the centered 12-month moving average:

$$\begin{aligned} a_{-6} &= a_{6} = \frac{1}{24}\\ a_{k} &= \frac{1}{12}, \qquad -5 \leq k \leq 5 \end{aligned}$$

This results in:

$$\begin{aligned} A_{yx}(\omega) &= \frac{1}{24}e^{2\pi i \omega 6} + \frac{1}{24}e^{-2\pi i \omega 6} + \frac{1}{12}\sum_{j=-5}^{5}e^{-2\pi i \omega j}\\ &= \frac{1}{12}\Big[1 + \cos(12\pi\omega) + 2\sum_{k=1}^{5}\cos(2\pi\omega k)\Big] \end{aligned}$$

Plotting the squared frequency response in R:
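A sketch of this plot in R:

omega <- seq(0, 0.5, by = 0.001)
A <- sapply(omega, function(w) {
  (1 + cos(12 * pi * w) + 2 * sum(cos(2 * pi * w * 1:5))) / 12
})
plot(omega, A^2, type = "l",
     xlab = "frequency", ylab = "Seasonal Moving Average")
abline(v = 1 / 12, lty = 2)  # the yearly frequency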

We can expect this filter to cut most of the frequency content above 0.05 cycles per point, and nearly all of the frequency content above $\frac{1}{12} \approx 0.083$. In particular, this drives down the yearly components with periods of 12 months and enhances the El Niño frequency, which is somewhat lower.

Cross Spectrum

We can show that the cross-spectrum satisfies:

$$\begin{aligned} f_{yx}(\omega) &= A_{yx}(\omega)f_{xx}(\omega) \end{aligned}$$

$$\begin{aligned} A_{yx}(\omega) &= \frac{f_{yx}(\omega)}{f_{xx}(\omega)} = \frac{c_{yx}(\omega)}{f_{xx}(\omega)} - i\,\frac{q_{yx}(\omega)}{f_{xx}(\omega)} \end{aligned}$$

Note the difference between that and:

$$\begin{aligned} f_{yy}(\omega) &= |A_{yx}(\omega)|^{2}f_{xx}(\omega) \end{aligned}$$

We may rewrite in polar form:

$$\begin{aligned} A_{yx}(\omega) &= |A_{yx}(\omega)|e^{i\phi_{yx}(\omega)}\\ |A_{yx}(\omega)| &= \frac{\sqrt{c_{yx}^{2}(\omega) + q_{yx}^{2}(\omega)}}{f_{xx}(\omega)}\\ \phi_{yx}(\omega) &= \tan^{-1}\Big(-\frac{q_{yx}(\omega)}{c_{yx}(\omega)}\Big) \end{aligned}$$

A simple interpretation of the phase of a linear filter is that it exhibits time delays as a function of frequency in the same way as the spectrum represents the variance as a function of frequency.

Example: Simple Delaying Filter

Consider the simple delaying filter, where the series gets replaced by a version amplified by a factor A and delayed by D points:

$$\begin{aligned} y_{t} &= Ax_{t-D} \end{aligned}$$

$$\begin{aligned} A_{yx}(\omega) &= \sum_{j=-\infty}^{\infty}a_{j}e^{-2\pi i \omega j} = Ae^{-2\pi i \omega D} \end{aligned}$$

$$\begin{aligned} f_{yx}(\omega) &= A_{yx}(\omega)f_{xx}(\omega) = Ae^{-2\pi i \omega D}f_{xx}(\omega) \end{aligned}$$

Recall that:

$$\begin{aligned} A_{yx}(\omega) &= |A_{yx}(\omega)|e^{i\phi_{yx}(\omega)} \end{aligned}$$

Hence:

$$\begin{aligned} |A_{yx}(\omega)| &= A\\ \phi_{yx}(\omega) &= -2\pi \omega D \end{aligned}$$

Applying a simple time delay causes phase delays that depend on the frequency of the periodic component being delayed.

For example:

$$\begin{aligned} x_{t} &= \cos(2\pi\omega t)\\ y_{t} &= Ax_{t-D}\\ &= A\cos(2\pi\omega(t - D))\\ &= A\cos(2\pi\omega t - 2\pi\omega D) \end{aligned}$$

Thus, the output series, $y_{t}$, has the same period as the input series, $x_{t}$, but the amplitude of the output has increased by a factor of $|A|$ and the phase has been changed by a factor of $-2\pi\omega D$.

Example: Difference and Moving Average Filters

The case for the moving average is easy because $A_{yx}(\omega)$ is purely real. So, the amplitude is just $|A_{yx}(\omega)|$ and the phase is $\phi_{yx}(\omega) = 0$. In general, symmetric ($a_{j} = a_{-j}$) filters have zero phase.

The first difference, however, changes this. In this case, the squared amplitude is:

$$\begin{aligned} |A_{yx}(\omega)|^{2} &= (1 - e^{-2\pi i \omega})(1 - e^{2\pi i \omega})\\ &= 2\big[1 - \cos(2\pi \omega)\big] \end{aligned}$$

To compute the phase:

$$\begin{aligned} A_{yx}(\omega) &= 1 - e^{-2\pi i \omega}\\ &= e^{-i\pi\omega}(e^{i\pi\omega} - e^{-i\pi\omega})\\ &= 2i\,e^{-i\pi\omega}\sin(\pi\omega)\\ &= 2\sin^{2}(\pi\omega) + 2i\cos(\pi\omega)\sin(\pi\omega)\\ &= \frac{c_{yx}(\omega)}{f_{xx}(\omega)} - i\,\frac{q_{yx}(\omega)}{f_{xx}(\omega)} \end{aligned}$$

So:

$$\begin{aligned} \phi_{yx}(\omega) &= \tan^{-1}\Big(-\frac{q_{yx}(\omega)}{c_{yx}(\omega)}\Big)\\ &= \tan^{-1}\Big(\frac{2\cos(\pi\omega)\sin(\pi\omega)}{2\sin^{2}(\pi\omega)}\Big)\\ &= \tan^{-1}\Big(\frac{\cos(\pi\omega)}{\sin(\pi\omega)}\Big) \end{aligned}$$

Note that:

$$\begin{aligned} \cos(\pi\omega) &= \sin\Big(-\pi\omega + \frac{\pi}{2}\Big)\\ \sin(\pi\omega) &= \cos\Big(-\pi\omega + \frac{\pi}{2}\Big) \end{aligned}$$

Thus:

$$\begin{aligned} \phi_{yx}(\omega) &= \tan^{-1}\Bigg(\frac{\sin(-\pi\omega + \frac{\pi}{2})}{\cos(-\pi\omega + \frac{\pi}{2})}\Bigg)\\ &= \tan^{-1}\tan\Big(-\pi\omega + \frac{\pi}{2}\Big)\\ &= -\pi\omega + \frac{\pi}{2} \end{aligned}$$

Lagged Regression Models

One of the intriguing possibilities offered by the coherence analysis of the relation between the SOI and Recruitment series would be extending classical regression to the analysis of lagged regression models of the form:

$$\begin{aligned} y_{t} &= \sum_{r=-\infty}^{\infty}\beta_{r}x_{t-r} + v_{t} \end{aligned}$$

We assume that the inputs and outputs have zero means and are jointly stationary, with the $2 \times 1$ vector process $(x_{t}, y_{t})$ having a spectral matrix of the form:

$$\begin{aligned} f(\omega) &= \begin{bmatrix} f_{xx}(\omega) & f_{xy}(\omega)\\ f_{yx}(\omega) & f_{yy}(\omega) \end{bmatrix} \end{aligned}$$

We assume all autocovariance functions satisfy the absolute summability conditions.

Minimizing the MSE:

$$\begin{aligned} MSE &= E\Big[y_{t} - \sum_{r=-\infty}^{\infty}\beta_{r}x_{t-r}\Big]^{2} \end{aligned}$$

Leads to the orthogonality conditions:

$$\begin{aligned} E\Big[\Big(y_{t} - \sum_{r=-\infty}^{\infty}\beta_{r}x_{t-r}\Big)x_{t-s}\Big] &= 0, \qquad s = 0, \pm 1, \pm 2, \cdots \end{aligned}$$

Taking expectations inside leads to the normal equations:

$$\begin{aligned} E\Big[y_{t}x_{t-s} - \sum_{r=-\infty}^{\infty}\beta_{r}x_{t-r}x_{t-s}\Big] &= 0\\ E[y_{t}x_{t-s}] - \sum_{r=-\infty}^{\infty}\beta_{r}E[x_{t-r}x_{t-s}] &= 0 \end{aligned}$$

$$\begin{aligned} \sum_{r=-\infty}^{\infty}\beta_{r}\gamma_{xx}(s-r) &= \gamma_{yx}(s) \end{aligned}$$

Using spectral representation:

$$\begin{aligned} \int_{-\frac{1}{2}}^{\frac{1}{2}}\sum_{r=-\infty}^{\infty}\beta_{r}e^{2\pi i \omega(s-r)}f_{xx}(\omega)\ d\omega &= \int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i \omega s}f_{yx}(\omega)\ d\omega\\ \int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i \omega s}B(\omega)f_{xx}(\omega)\ d\omega &= \int_{-\frac{1}{2}}^{\frac{1}{2}}e^{2\pi i \omega s}f_{yx}(\omega)\ d\omega \end{aligned}$$

Where B(ω)B(\omega) is the Fourier transform of the regression coefficients:

$$\begin{aligned} B(\omega) &= \sum_{r=-\infty}^{\infty}\beta_{r}e^{-2\pi i \omega r} \end{aligned}$$

We might write the system of equations in the frequency domain, using the uniqueness of the Fourier transform:

$$\begin{aligned} B(\omega)f_{xx}(\omega) &= f_{yx}(\omega) \end{aligned}$$

Taking the estimator for the Fourier transform of the regression coefficients, evaluated at some fundamental frequencies:

$$\begin{aligned} \hat{B}(\omega_{k}) &= \frac{\hat{f}_{yx}(\omega_{k})}{\hat{f}_{xx}(\omega_{k})}, \qquad \omega_{k} = \frac{k}{M}, \qquad M \ll n \end{aligned}$$

The inverse transform of the function $\hat{B}(\omega)$:

$$\begin{aligned} \hat{\beta}_{t} &= M^{-1}\sum_{k=0}^{M-1}\hat{B}(\omega_{k})e^{2\pi i \omega_{k}t}, \qquad t = 0, \pm 1, \cdots, \pm\Big(\frac{M}{2} - 1\Big) \end{aligned}$$

Example: Lagged Regression for SOI and Recruitment

The high coherence between the SOI and Recruitment series noted earlier suggests a lagged regression relation between the two series. We consider the two possible models:

$$\begin{aligned} y_{t} &= \sum_{r=-\infty}^{\infty}a_{r}x_{t-r} + w_{t}\\ x_{t} &= \sum_{r=-\infty}^{\infty}b_{r}y_{t-r} + v_{t} \end{aligned}$$

Even though there is no plausible environmental explanation for the second of these two models, displaying both possibilities helps to settle on a parsimonious transfer function model.
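A sketch of fitting both directions in R with astsa's LagReg (the choices L = 15, M = 32, and the thresholds are tuning values taken as assumptions from the book's example):

library(astsa)
LagReg(soi, rec, L = 15, M = 32, threshold = 6)
LagReg(rec, soi, L = 15, M = 32, inverse = TRUE, threshold = .01)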

Signal Extraction and Optimum Filtering

Assuming that:

$$\begin{aligned} y_{t} &= \sum_{r=-\infty}^{\infty}\beta_{r}x_{t-r} + v_{t} \end{aligned}$$

Assume that the $\beta_{r}$ are known and the signal $x_{t}$ is unknown. In this case, we are interested in an estimator of the signal $x_{t}$ of the form:

$$\begin{aligned} \hat{x}_{t} &= \sum_{r=-\infty}^{\infty}a_{r}y_{t-r} \end{aligned}$$

An important special case is where the relation reduces to the simple signal-plus-noise model:

$$\begin{aligned} y_{t} = x_{t} + v_{t} \end{aligned}$$

In general, we seek the set of filter coefficients $a_{t}$ that minimize the MSE:

$$\begin{aligned} MSE &= E\Big[x_{t} - \sum_{r=-\infty}^{\infty}a_{r}y_{t-r}\Big]^{2} \end{aligned}$$

Applying the orthogonality principle:

$$\begin{aligned} E\Big[\Big(x_{t} - \sum_{r=-\infty}^{\infty}a_{r}y_{t-r}\Big)y_{t-s}\Big] &= 0, \qquad s = 0, \pm 1, \pm 2, \cdots \end{aligned}$$

$$\begin{aligned} \sum_{r=-\infty}^{\infty}a_{r}\gamma_{yy}(s - r) &= \gamma_{xy}(s) \end{aligned}$$

Substituting the spectral representations for the autocovariance functions into the above and identifying the spectral densities through the uniqueness of the Fourier transform produces:

$$\begin{aligned} A(\omega)f_{yy}(\omega) = f_{xy}(\omega) \end{aligned}$$

Where $A(\omega)$ and $B(\omega)$ are the Fourier transforms of the filter and regression coefficients:

$$\begin{aligned} A(\omega) &= \sum_{r=-\infty}^{\infty}a_{r}e^{-2\pi i \omega r}\\ B(\omega) &= \sum_{r=-\infty}^{\infty}\beta_{r}e^{-2\pi i \omega r} \end{aligned}$$

A special consequence of the model:

$$\begin{aligned} f_{xy}(\omega) &= B^{*}(\omega)f_{xx}(\omega)\\ f_{yy}(\omega) &= |B(\omega)|^{2}f_{xx}(\omega) + f_{vv}(\omega) \end{aligned}$$

Implying that the optimal filter is the inverse Fourier transform of:

$$\begin{aligned} A(\omega) &= \frac{B^{*}(\omega)}{|B(\omega)|^{2} + \frac{f_{vv}(\omega)}{f_{xx}(\omega)}} \end{aligned}$$

The second term in the denominator is just the inverse of the signal-to-noise ratio:

$$\begin{aligned} \text{SNR}(\omega) &= \frac{f_{xx}(\omega)}{f_{vv}(\omega)} \end{aligned}$$

It is possible to specify the signal-to-noise ratio a priori. If the signal-to-noise ratio is known, the optimal filter can be computed by the inverse transform of the function A(ω)A(\omega). It is more likely that the inverse transform will be intractable and a finite filter approximation like that used in the previous section can be applied to the data. In this case, we will have the below as the estimated filter function:

$$\begin{aligned} a_{t}^{M} &= M^{-1}\sum_{k=0}^{M-1}A(\omega_{k})e^{2\pi i \omega_{k}t} \end{aligned}$$

Using the tapered filter, where $h_{t}$ is the cosine taper:

$$\begin{aligned} \tilde{a}_{t} &= h_{t}a_{t} \end{aligned}$$

The frequency response of the resulting tapered filter is:

$$\begin{aligned} \tilde{A}(\omega) &= \sum_{t=-\infty}^{\infty}a_{t}h_{t}e^{-2\pi i \omega t} \end{aligned}$$

Example: Estimating the El Niño Signal via Optimal Filters

Looking at the spectrum of the SOI series:

We note that essentially two components have power, the El Niño frequency of about .02 cycles per month (the four-year cycle) and a yearly frequency of about .08 cycles per month (the annual cycle). We assume, for this example, that we wish to preserve the lower frequency as signal and to eliminate the higher order frequencies, and in particular, the annual cycle. In this case, we assume the simple signal plus noise model:

$$\begin{aligned} y_{t} &= x_{t} + v_{t} \end{aligned}$$

Where there is no convolving function $\beta_{t}$. Furthermore, the signal-to-noise ratio is assumed to be high up to about 0.06 cycles per month and zero thereafter.

The optimal frequency response was assumed to be unity to .05 cycles per point and then to decay linearly to zero in several steps.

This can be done in R using SigExtract:

SigExtract(as.ts(soi), L = 9, M = 64, max.freq = 0.05)    

The filters we have discussed here are all symmetric two-sided filters, because the designed frequency response functions were purely real. Alternatively, we may design recursive filters to produce a desired response. An example of a recursive filter is one that replaces the input $x_{t}$ by the filtered output:

$$\begin{aligned} y_{t} &= \sum_{k=1}^{p}\phi_{k}y_{t-k} + x_{t} - \sum_{k=1}^{q}\theta_{k}x_{t-k} \end{aligned}$$

Note the similarity between the above equation and the ARMA(p, q) model, in which the white noise component is replaced by the input.

Using the property:

$$\begin{aligned} f_{y}(\omega) &= |A(\omega)|^{2}f_{x}(\omega) = \frac{|\theta(e^{-2\pi i \omega})|^{2}}{|\phi(e^{-2\pi i \omega})|^{2}}f_{x}(\omega) \end{aligned}$$

Where:

$$\begin{aligned} \phi(e^{-2\pi i \omega}) &= 1 - \sum_{k=1}^{p}\phi_{k}e^{-2\pi i k\omega}\\ \theta(e^{-2\pi i \omega}) &= 1 - \sum_{k=1}^{q}\theta_{k}e^{-2\pi i k\omega} \end{aligned}$$

Recursive filters such as those given above distort the phases of arriving frequencies.


References

Shumway, R. H. and Stoffer, D. S. (2017). Time Series Analysis and Its Applications: With R Examples (Springer Texts in Statistics), 4th ed.
