
Stat 153 Time Series
Problem Set 3∗

∗ If errors are discovered, please kindly report them to [email protected]
Problem 1
Note that if a process solves the AR(1) recursion and we add to it any solution of the associated homogeneous difference equation yt = φyt−1 (for example, the non-random sequence cφ^t), the sum is again a solution of the recursion. Adding such a non-random sequence to the stationary solution therefore produces another solution of the difference equation, but one that is clearly not stationary: its mean depends on t.

To make this rigorous, show that the given yt satisfies the AR(1) recursion by simplifying yt − φyt−1, and show that E[yt] is not free of t, which establishes the non-stationarity.
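To see this numerically, here is a minimal R sketch; the values φ = 0.9 and c = 5 are hypothetical, since the problem's constants are not reproduced here.

# Sketch: a stationary AR(1) solution x_t plus the deterministic homogeneous
# solution c * phi^t is another solution of y_t = phi * y_{t-1} + w_t,
# but E[y_t] = c * phi^t depends on t, so y_t is not stationary.
set.seed(1)
phi <- 0.9; cc <- 5; n <- 200  # hypothetical values for illustration
w <- rnorm(n)
x <- as.numeric(arima.sim(list(ar = phi), n = n, innov = w))  # stationary solution
y <- x + cc * phi^(1:n)        # perturbed solution with mean cc * phi^t
# The recursion y_t - phi * y_{t-1} = w_t still holds exactly:
max(abs((y[-1] - phi * y[-n]) - w[-1]))  # ~ 1e-15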
Problem 2
To show that there is a causal stationary solution, we need to invert the operator φ(B), where φ(z) = 1 − z + ½z². The roots of φ are 1 ± i. Since |1 ± i| = √2 > 1, the ARMA(2,1) process is causal.
To solve for the coefficients of xt = Σj≥0 ψj wt−j, solve the associated linear homogeneous difference equation φ(B)ψj = 0. By the general solution to linear homogeneous difference equations, we know that ψj = c1 z1^{−j} + c2 z2^{−j}, where z1, z2 are the roots of φ(z). Plugging in the initial conditions ψ0 = 1 and ψ1 = 1/2, i.e., solving

c1 + c2 = 1   and   c1 z1^{−1} + c2 z2^{−1} = 1/2,

we obtain

c1 = c2 = 1/2,

so that

ψj = (z1^{−j} + z2^{−j})/2 = Re((1 + i)^{−j}).
From here, you should be able to calculate γ(0), γ(1), and the general γ(h), using γ(h) = σw² Σj≥0 ψj ψj+h. Finally, the ACF is ρ(h) = γ(h)/γ(0).
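As a numerical check, R's ARMAtoMA and ARMAacf reproduce the ψ-weights and the ACF. This is only a sketch: it assumes the MA coefficient is θ1 = −1/2, which is what the initial condition ψ1 = φ1 + θ1 = 1/2 implies; adjust the ma argument if your problem's MA coefficient differs.

# psi-weights and ACF of the assumed ARMA(2,1)
# (1 - B + 0.5 B^2) x_t = (1 - 0.5 B) w_t
psi <- ARMAtoMA(ar = c(1, -0.5), ma = -0.5, lag.max = 8)  # psi_1, ..., psi_8
j <- 1:8
psi.closed <- Re((1 + 1i)^(-j))  # psi_j = (z1^-j + z2^-j)/2 with z = 1 +/- i
max(abs(psi - psi.closed))       # ~ 0, confirming the closed form
ARMAacf(ar = c(1, -0.5), ma = -0.5, lag.max = 5)  # rho(0), ..., rho(5)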
Problem 3
If you have taken an econometrics/statistics course such as Econ 141 or Stat 135, this question should look very familiar. Review the Law of Iterated Expectations (LIE) and the variance decomposition formula if you have trouble solving it.
(a) First, notice that

MSE = E[(Y − g(X))²] = E[(Y − E[Y|X] + E[Y|X] − g(X))²].    (1)

Then, expanding the square,

E[(Y − g(X))²] = E[(Y − E[Y|X])²] + E[(E[Y|X] − g(X))²] + 0.    (2)

To obtain the zero term in the final step above, you need to show

2E[(Y − E[Y|X])(E[Y|X] − g(X))] = 0.

This follows from the LIE: conditioning on X,

E[(Y − E[Y|X])(E[Y|X] − g(X))] = E[(E[Y|X] − g(X)) E[Y − E[Y|X] | X]] = 0,

since E[Y − E[Y|X] | X] = 0. Thus,

MSE ≥ E[(Y − E[Y|X])²],

and the minimum is achieved if and only if

E[(E[Y|X] − g(X))²] = 0,

i.e.,

g(X) = E[Y|X].
(b) Using the result we obtained in (a), we have

MSE = E[z²] = Var(z) = 1.
(c) Take the first derivatives of E[(y − (a + bx))²] with respect to a and b to find the two first-order conditions. One gives a = 1 and the other gives b = E[xy]/E[x²]. You can then show E[xy] = 0, so b = 0. For the MSE, you should get MSE = E[x⁴]. Since the fourth moment of a standard normal x is 3, we get the desired result. In this case, the best linear predictor of y is simply its mean.
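These answers can be sanity-checked by simulation. The sketch below assumes the setup is y = x² + z with x and z independent standard normals, which is consistent with the answers above; the actual problem statement may differ.

# Monte Carlo check, assuming y = x^2 + z with x, z iid N(0,1).
set.seed(153)
n <- 1e5
x <- rnorm(n); z <- rnorm(n)
y <- x^2 + z
# (b) the best predictor g(x) = E[y|x] = x^2 attains MSE = Var(z) = 1
mean((y - x^2)^2)   # ~ 1
# (c) the best linear predictor: a = 1, b = 0, MSE = E[x^4] = 3
coef(lm(y ~ x))     # intercept ~ 1, slope ~ 0
mean((y - 1)^2)     # ~ 3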
Problem 4
Please refer to the bottom of page 120 of tsa3 (Shumway and Stoffer, Time Series Analysis and Its Applications, 3rd ed.), continued on page 121.
Problem 5
(a) The sample ACF and PACF of the square-root-transformed sunspot series are shown in Figure 1.

[Figure 1: Sample ACF and PACF. Left panel: sample ACF (ACF vs. lag, lags 0–20); right panel: sample PACF (partial ACF vs. lag, lags 1–20).]
(b) We will use the Box-Jenkins method to solve this question (please refer to Table 3.1 on page 104 of tsa3). An MA(1) model would have an ACF that is zero for lags 2 and beyond, and an MA(2) model an ACF that is zero for lags 3 and beyond; neither matches the sample ACF in Figure 1, which decays slowly. An AR(1) model, on the other hand, would have a PACF that is zero for lags 2 and beyond, and an AR(2) model a PACF that is zero for lags 3 and beyond. The last model, AR(2), looks the most likely: the PACF is fairly large for the first two lags and then cuts off sharply. A numerical comparison is sketched below.
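To back this visual reasoning with numbers, one can fit all four candidates and compare AIC. This is a sketch using the forecast package and the df.sqrt series constructed in the Appendix; AR(2) should come out ahead.

# Hedged check: compare the Box-Jenkins candidates by AIC.
library(forecast)
orders <- list(MA1 = c(0, 0, 1), MA2 = c(0, 0, 2),
               AR1 = c(1, 0, 0), AR2 = c(2, 0, 0))
sapply(orders, function(o) Arima(df.sqrt, order = o)$aic)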
(c)

Series: df.sqrt
ARIMA(2,0,0) with non-zero mean

Coefficients:
         ar1      ar2  intercept
      1.4050  -0.6919     6.3289
s.e.  0.0425   0.0425     0.2403

sigma^2 estimated as 1.356: log likelihood=-449
AIC=906   AICc=906.14   BIC=920.61
Based on the AR(2) estimation results above¹, and noting that the reported intercept 6.3289 is the estimated mean of the process rather than the constant in the recursion, we have

Xt − 6.3289 = 1.4050(Xt−1 − 6.3289) − 0.6919(Xt−2 − 6.3289) + wt,

where wt ∼ WN(0, 1.356).
¹ Note that I mainly rely on the package forecast, which may yield slightly different results than the textbook package astsa.
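Because forecast (like stats::arima) reports the estimated process mean under the label "intercept", the constant term of the recursion form of the model has to be derived from it. A minimal sketch, reusing the ar2 fit from the Appendix:

# The reported "intercept" is the estimated mean mu, not the constant term.
# In X_t = c + phi1 X_{t-1} + phi2 X_{t-2} + w_t, c = mu * (1 - phi1 - phi2).
mu  <- ar2$coef["intercept"]
phi <- ar2$coef[c("ar1", "ar2")]
unname(mu * (1 - sum(phi)))  # 6.3289 * (1 - 1.4050 + 0.6919) ~ 1.816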
(d) After the square-root transformation:

     Point Forecast      Lo 95      Hi 95
1985       5.688013  3.4059261   7.970100
1986       5.119816  1.1842810   9.055350
1987       5.073576  0.1695927   9.977559
1988       5.401756  0.1453471  10.658165
Reversing the transformation (squaring the point forecasts and the interval endpoints; see the Appendix) gives:

     Year  Forecast        Lo 95      Hi 95   Real
[1,] 1985  32.35349  11.60033277   63.52249   17.9
[2,] 1986  26.21251   1.40252145   81.99937   13.4
[3,] 1987  25.74117   0.02876169   99.55168   29.2
[4,] 1988  29.17897   0.02112577  113.59648  100.2
(e) The forecasts are plotted in Figure 2.

[Figure 2: Forecasts from ARIMA(2,0,0) with non-zero mean. Top panel: forecasts on the square-root scale, 1700–1988; bottom panel: forecasts on the original scale, 1935–1990. Our predictions for 1985–1988 are in red and the prediction intervals are in dashed blue lines. The actual values for 1985–1988 are in green.]
Appendix
Below is the R code that I used to generate the analysis results and plots.
# Load packages
library("forecast")

# (a) Read the sunspot series, square-root transform it, and plot
# the sample ACF and PACF.
df <- read.csv(file="http://www.stat.berkeley.edu/~yuekai/153/sunspot.txt", header=F)
df <- ts(df$V1, start=1700)
df.sqrt <- sqrt(df)  # sqrt() preserves the ts attributes (start = 1700)

par(mfrow=c(1,2))
acf(df.sqrt, main="Sample ACF")
pacf(df.sqrt, main="Sample PACF")

# (c) Fit an AR(2) (= ARIMA(2,0,0)) model with non-zero mean.
ar2 <- Arima(df.sqrt, order=c(2,0,0))
ar2

# (d) Forecast 4 steps ahead on the square-root scale, then square the
# point forecasts and interval endpoints to return to the original scale.
sun.sqrt.pred <- forecast(ar2, h=4, level=95)
sun.sqrt.pred
sun.pred <- matrix(nrow=4, ncol=5)
sun.pred[,1] <- 1985:1988
sun.pred[,2] <- sun.sqrt.pred$mean^2
sun.pred[,3] <- sun.sqrt.pred$lower^2
sun.pred[,4] <- sun.sqrt.pred$upper^2
sun.pred[,5] <- c(17.90, 13.40, 29.20, 100.20)  # actual values
sun.pred

# (e) Plot the forecasts: top panel on the square-root scale, bottom panel
# on the original scale with the actual values overlaid.
par(mfrow=c(2,1))
plot(sun.sqrt.pred, type="o", ylab="Square-Root Sunspots Forecasting")
df.zoom <- window(df, start=1935)
plot(df.zoom, type="o", xlim=c(1935,1990), ylab="Sunspots Forecasting")
points(sun.pred[,c(1,2)], col="red")
lines(sun.pred[,c(1,3)], col="blue", lty=2)
lines(sun.pred[,c(1,4)], col="blue", lty=2)
points(sun.pred[,c(1,5)], col="green", type="o")