# ST 430/514 Introduction to Regression Analysis / Statistics for Management and the Social Sciences II

## Time Series: Autocorrelation

Page 1
When a regression model is fitted to time series data, the residuals often violate the standard assumptions by being correlated. Because one residual is correlated with another in the same series, we refer to auto-correlation.

Notation, emphasizing the autocorrelated nature of the residuals:

y_t = E(y_t) + R_t,

where R_t is the residual at time t.
Page 2
The correlation of R_t with R_{t+1} is called a lagged correlation, specifically the lag 1 correlation.

More generally, the correlation of R_t with R_{t+h} is the lag h correlation, h ≥ 0.

Autocorrelation is usually strongest for small lags and decays to zero for large lags. Autocorrelation depends on the lag h, but often does not depend on t.
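To make the lag-h sample correlation concrete, here is a small plain-Python sketch (not from the slides; the course uses R, and the series below is made up for illustration):

```python
def acf_at_lag(x, h):
    """Sample autocorrelation of the series x at lag h: the correlation
    of x[t] with x[t+h], computed with the overall mean and variance."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t + h] - m) for t in range(n - h))
    return num / denom

# A short artificial series (made up for illustration):
series = [1.0, 2.0, 1.5, 2.5, 2.0, 3.0, 2.5, 3.5, 3.0, 4.0]
r0 = acf_at_lag(series, 0)  # lag 0: always exactly 1
r1 = acf_at_lag(series, 1)  # the lag 1 correlation
```

These are the same quantities that R's acf() estimates and plots for h = 0, 1, 2, ...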
Page 3
When the correlation of R_t with R_{t+h} does not depend on t, and in addition:

- the expected value E(R_t) is constant;
- the variance var(R_t) is constant;

the residuals are said to be stationary.

Regression residuals have zero expected value, hence constant expected value, if the model is correctly specified. They also have constant variance, perhaps after the data have been transformed appropriately, or if weighted least squares has been used.
Page 4
## Correlogram

The graph of corr(R_t, R_{t+h}) against the lag h is the correlogram, or auto-correlation function (ACF).

For the SALES35 example:

```r
sales <- read.table("Text/Exercises&Examples/SALES35.txt", header = TRUE)
l <- lm(SALES ~ T, sales)
acf(residuals(l))
```

The ACF is usually plotted including the lag 0 correlation, which is of course exactly 1. The blue lines indicate significance at the 0.05 level.
Page 5
## Modeling Autocorrelation

Sometimes the ACF appears to decay exponentially:

ACF(h) = φ^h, 0 < φ < 1.

The 1st order autoregressive model, AR(1), has this property: if

R_t = φ R_{t-1} + ε_t,

where ε_t has constant variance and is uncorrelated with R_{t-s} for all s > 0, then

corr(R_t, R_{t+h}) = φ^h.
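This decay is easy to verify by simulation. A plain-Python sketch (stdlib only; φ = 0.6 is an arbitrary choice, not a value from the slides):

```python
import random

def simulate_ar1(phi, n, seed=0, burn=200):
    """Simulate R_t = phi * R_{t-1} + e_t with standard normal e_t,
    discarding a burn-in so the series is near-stationary."""
    rng = random.Random(seed)
    r, out = 0.0, []
    for t in range(n + burn):
        r = phi * r + rng.gauss(0.0, 1.0)
        if t >= burn:
            out.append(r)
    return out

def acf_at_lag(x, h):
    """Sample autocorrelation at lag h."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    return sum((x[t] - m) * (x[t + h] - m) for t in range(n - h)) / denom

phi = 0.6
x = simulate_ar1(phi, 20000)
# The sample ACF at lags 1..3 should sit near phi, phi**2, phi**3:
sample_acf = [acf_at_lag(x, h) for h in (1, 2, 3)]
```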
Page 6
General autoregressive model, AR(p):

R_t = φ_1 R_{t-1} + φ_2 R_{t-2} + ... + φ_p R_{t-p} + ε_t.

The ACF decays exponentially, but not exactly as φ^h for any φ.

Moving average models, MA(q):

R_t = ε_t + θ_1 ε_{t-1} + ... + θ_q ε_{t-q}.

The ACF is zero for lags h > q.
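The MA(q) cutoff is also easy to see by simulation. A plain-Python sketch of an MA(1) series with θ = 0.5 (an arbitrary choice), for which the theoretical lag 1 correlation is θ/(1 + θ²) = 0.4 and every higher-lag correlation is exactly zero:

```python
import random

def simulate_ma1(theta, n, seed=1):
    """Simulate R_t = e_t + theta * e_{t-1} with standard normal e_t."""
    rng = random.Random(seed)
    prev = rng.gauss(0.0, 1.0)
    out = []
    for _ in range(n):
        e = rng.gauss(0.0, 1.0)
        out.append(e + theta * prev)
        prev = e
    return out

def acf_at_lag(x, h):
    """Sample autocorrelation at lag h."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    return sum((x[t] - m) * (x[t + h] - m) for t in range(n - h)) / denom

x = simulate_ma1(0.5, 20000)
lag1 = acf_at_lag(x, 1)  # theory: 0.5 / 1.25 = 0.4
lag3 = acf_at_lag(x, 3)  # theory: 0 at any lag beyond q = 1
```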
Page 7
Combined ARMA(p, q) model:

R_t = φ_1 R_{t-1} + ... + φ_p R_{t-p} + ε_t + θ_1 ε_{t-1} + ... + θ_q ε_{t-q}.

The ACF again decays exponentially, but not exactly as φ^h for any φ.
Page 8
## Regression with Autocorrelated Errors

To fit the regression model

y_t = E(y_t) + R_t,

we must specify both the regression part

E(y_t) = β_0 + β_1 x_{t,1} + ... + β_k x_{t,k}

and a model for the autocorrelation of R_t, say ARMA(p, q) for specified p and q.

Time series software fits both submodels simultaneously.
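One classical way to fit the two submodels together, for a single regressor and AR(1) errors, is a Cochrane-Orcutt style iteration. The plain-Python sketch below is illustrative only (proc arima and R's arima() instead maximize a likelihood): it alternates between estimating the regression line and estimating φ from its residuals, refitting on quasi-differenced data.

```python
import random

def ols_line(x, y):
    """Simple-regression OLS: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def cochrane_orcutt(x, y, iters=10):
    """Fit y_t = b0 + b1*x_t + R_t with AR(1) residuals by alternating:
    1. estimate phi as the lag 1 correlation of the current residuals;
    2. refit OLS on the quasi-differences y_t - phi*y_{t-1}, x_t - phi*x_{t-1}."""
    b0, b1 = ols_line(x, y)
    phi = 0.0
    for _ in range(iters):
        resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
        phi = (sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
               / sum(r * r for r in resid))
        yd = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
        xd = [x[t] - phi * x[t - 1] for t in range(1, len(x))]
        a_star, b1 = ols_line(xd, yd)
        b0 = a_star / (1.0 - phi)  # undo the quasi-differencing of the intercept
    return b0, b1, phi

# Synthetic data: y_t = 2 + 3*t + R_t with AR(1) residuals, phi = 0.6.
rng = random.Random(2)
n = 2000
xs = [float(t) for t in range(n)]
r, ys = 0.0, []
for t in range(n):
    r = 0.6 * r + rng.gauss(0.0, 1.0)
    ys.append(2.0 + 3.0 * xs[t] + r)
b0_hat, b1_hat, phi_hat = cochrane_orcutt(xs, ys)
```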
Page 9
In SAS:

- proc autoreg can handle AR(p) errors, but not ARMA(p, q) for q > 0.
- proc arima can handle ARMA(p, q).

In R, try AR(1) errors:

```r
ar1 <- arima(sales$SALES, order = c(1, 0, 0), xreg = sales$T)
print(ar1)
tsdiag(ar1)
```

The ARMA(p, q) model is specified by order = c(p, 0, q). The middle part of the order is d, for differencing.
Page 10
Note that φ, the ar1 coefficient, is significantly different from zero.

Note that the trend (the coefficient of T) is similar to the original regression: 4.2959 versus 4.2956. Its reported standard error is higher: 0.1760 versus 0.1069.

The original, smaller, standard error is not credible, because it was calculated on the assumption that the residuals are not correlated. The new standard error recognizes autocorrelation, and is more credible, but is still calculated on an assumption: that the residuals are AR(1).
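That the naive OLS standard error understates the true sampling variability under positive autocorrelation can be checked by Monte Carlo. A plain-Python sketch with synthetic trend data and φ = 0.6 (illustrative values, not the SALES35 data):

```python
import random, math

def ols_fit(x, y):
    """OLS for y = a + b*x; returns the slope and its naive standard
    error, i.e. the one that assumes uncorrelated residuals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return b, math.sqrt(rss / (n - 2) / sxx)

rng = random.Random(3)
n, phi, reps = 50, 0.6, 400
slopes, naive_ses = [], []
for _ in range(reps):
    r, y = 0.0, []
    for t in range(n):
        r = phi * r + rng.gauss(0.0, 1.0)
        y.append(1.0 + 2.0 * t + r)
    b, se = ols_fit(list(range(n)), y)
    slopes.append(b)
    naive_ses.append(se)

mean_b = sum(slopes) / reps
# Empirical spread of the slope across replications versus the
# average standard error that OLS reports:
true_sd = math.sqrt(sum((b - mean_b) ** 2 for b in slopes) / (reps - 1))
avg_naive_se = sum(naive_ses) / reps
```

The empirical spread comes out well above the reported standard error, mirroring the 0.1760-versus-0.1069 comparison above.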
Page 11
## Which model to use?

The ACF of the sales residuals decays something like an exponential, but is also not significantly different from zero for lags h > 1, so it could be MA(1).

```r
ma1 <- arima(sales$SALES, order = c(0, 0, 1), xreg = sales$T)
print(ma1)
tsdiag(ma1)
```

θ, the ma1 coefficient, is also significantly different from zero. But the AIC is higher than for AR(1), so AR(1) is preferred.
Page 12
Be systematic; use BIC to find a good model:

```r
P <- Q <- 0:3
BICtable <- matrix(NA, length(P), length(Q))
dimnames(BICtable) <- list(paste("p:", P), paste("q:", Q))
for (p in P) for (q in Q) {
  apq <- arima(sales$SALES, order = c(p, 0, q), xreg = sales$T)
  BICtable[p+1, q+1] <- AIC(apq, k = log(nrow(sales)))
}
BICtable
```

As usual, use k = log(nrow(...)) to get BIC instead of AIC. As in stepwise regression, minimizing AIC tends to overfit. AR(1) gives the minimum BIC out of these choices.
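The k argument works because R's AIC() computes -2*logLik + k*(number of parameters): k = 2 gives AIC, and k = log(n) gives BIC. A tiny numeric sketch (hypothetical log-likelihoods, chosen so the two criteria disagree):

```python
import math

def information_criterion(loglik, npar, k):
    """Penalized fit criterion: -2*loglik + k*npar.
    k = 2 gives AIC; k = log(n) gives BIC."""
    return -2.0 * loglik + k * npar

n = 35  # sample size, e.g. 35 observations
# Hypothetical fits: the bigger model buys 4 units of log-likelihood
# with 3 extra parameters.
bic_small = information_criterion(loglik=-50.0, npar=3, k=math.log(n))
bic_big = information_criterion(loglik=-46.0, npar=6, k=math.log(n))
aic_small = information_criterion(loglik=-50.0, npar=3, k=2.0)
aic_big = information_criterion(loglik=-46.0, npar=6, k=2.0)
# AIC prefers the big model, BIC the small one: BIC's penalty of
# log(35), about 3.56 per parameter, is harsher than AIC's 2.
```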
Page 13
## Forecasting

```r
with(sales, plot(T, SALES, xlim = c(0, 40), ylim = c(0, 200)))
ar1Fit <- coefficients(ar1)["intercept"] +
  coefficients(ar1)["sales$T"] * sales$T
lines(sales$T, ar1Fit, lty = 2)
newTimes <- 36:40
p <- predict(ar1, n.ahead = length(newTimes), newxreg = newTimes)
pCL <- p$se * qnorm(.975)
matlines(newTimes, p$pred + cbind(0, -pCL, pCL),
         col = c("red", "blue", "blue"), lty = c(2, 3, 3))
```
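What predict() computes for an AR(1)-error model can be sketched by hand: the point forecast carries the last residual forward, damped by φ^h, and the forecast standard error grows toward the stationary standard deviation. A plain-Python sketch with hypothetical parameter values (ignoring parameter-estimation uncertainty):

```python
import math

def ar1_forecast(last_resid, phi, sigma2, b0, b1, x_future):
    """h-step forecasts for y_t = b0 + b1*x_t + R_t with AR(1) residuals.
    Returns (point forecast, standard error) pairs; sigma2 is the
    innovation variance, and estimation error in the parameters is ignored."""
    out = []
    for h, x in enumerate(x_future, start=1):
        point = b0 + b1 * x + (phi ** h) * last_resid
        # Forecast-error variance of an AR(1) h steps ahead:
        var = sigma2 * (1.0 - phi ** (2 * h)) / (1.0 - phi ** 2)
        out.append((point, math.sqrt(var)))
    return out

# Hypothetical values standing in for a fitted model:
forecasts = ar1_forecast(last_resid=2.0, phi=0.5, sigma2=1.0,
                         b0=0.4, b1=4.3, x_future=[36, 37, 38])
ses = [se for _, se in forecasts]
```

A 95% prediction interval is then point ± 1.96 × se, which is the role qnorm(.975) and pCL play in the R code.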