Forecasting, Part 2
JY Le Boudec
March 2015

Contents
- Differencing Filters
- Filters for dummies
- Prediction with filters
- ARMA Models
- Other methods
6. Differencing the Data
We have seen that changing the scale of the data may be important for obtaining a good model. Another kind of pre-processing is the application of filters. The idea is that it may be simpler to forecast the filtered data.

A filter (in full: discrete-time causal filter) is a mapping from the set of finite-length time series to the same set. By convention, we consider that a filter keeps the length of the time series unchanged. Further, a filter has to be linear, time-invariant and causal. The latter means that the output of the filter up to time $t$ depends only on the input up to time $t$.
Differencing filter

Differencing filter $\Delta_1$ = discrete-time derivative:
$(\Delta_1 X)_t = X_t - X_{t-1}$, with the convention that $X_t = 0$ whenever $t \le 0$.
$\Delta_1$ is a filter, thus is linear, time-invariant and causal.
If $X_t = a + bt + Z_t$ then $(\Delta_1 X)_t = b + Z_t - Z_{t-1}$ for $t \ge 2$: $\Delta_1$ removes linear trends.
Repeated application of $\Delta_1$ removes polynomial trends.
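To illustrate, here is a minimal Matlab sketch (the synthetic series and coefficients are illustrative, not from the slides): $\Delta_1$, implemented with Matlab's filter, reduces a linear trend to a constant.

% Sketch: Delta_1 removes a linear trend (illustrative data)
n = 100; t = (1:n)';
X = 2 + 0.5*t + randn(n,1);   % linear trend a + b*t plus noise Z_t
Y = filter([1 -1], 1, X);     % (Delta_1 X)_t = X_t - X_{t-1}
mean(Y(2:end))                % for t >= 2, Y_t = b + Z_t - Z_{t-1}: close to b = 0.5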
De-seasonalizing filters

De-seasonalizing filter $R_s$ (sum of the last $s$ values):
$(R_s X)_t = \sum_{i=0}^{s-1} X_{t-i}$, with the convention that $X_t = 0$ whenever $t \le 0$.
If $X$ is periodic with period $s$ then $R_s X$ is constant: $R_s$ removes periodic components.
Differencing at lag $s$:
$(\Delta_s X)_t = X_t - X_{t-s}$, with the convention that $X_t = 0$ whenever $t \le 0$.
De-seasonalizing filters

$\Delta_s = \Delta_1 \circ R_s$: this means that if $Y = R_s X$ and $Z = \Delta_1 Y$ then $Z = \Delta_s X$.
Proof: $Z_t = (R_s X)_t - (R_s X)_{t-1} = \sum_{i=0}^{s-1} X_{t-i} - \sum_{i=0}^{s-1} X_{t-1-i} = X_t - X_{t-s}$.
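A quick numerical check of this identity, assuming an arbitrary period s = 4 and random data (a sketch, not part of the original slides):

% Sketch: verify Delta_1 o R_s = Delta_s on random data
n = 50; s = 4;
X = randn(n,1);
RsX = filter(ones(1,s), 1, X);            % (R_s X)_t = sum of the last s values
Z1 = filter([1 -1], 1, RsX);              % Delta_1 applied to R_s X
Z2 = filter([1 zeros(1,s-1) -1], 1, X);   % (Delta_s X)_t = X_t - X_{t-s}
max(abs(Z1 - Z2))                         % 0 up to rounding errors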
Say what is true
A. $\Delta_2 = \Delta_1 \circ R_2$
B. $\Delta_2 = R_2 \circ \Delta_1$
C. $\Delta_2 = \Delta_1 \circ \Delta_1$
Possible answers: A, B, C, A and B, A and C, B and C, All, None, I don't know
Solution

A is true, as we saw earlier.
B is true. We can prove it by a direct computation as we did earlier, or we can use the property that any two filters commute, i.e. the order in which a succession of filters is applied does not matter: $F_1 \circ F_2 = F_2 \circ F_1$ for any two filters $F_1, F_2$.
C is false. Let us compute $\Delta_1 \circ \Delta_1$. Let $Y = \Delta_1 X$ and $Z = \Delta_1 Y$, so that $Z_t = X_t - 2X_{t-1} + X_{t-2}$, which is not equal to $(\Delta_2 X)_t = X_t - X_{t-2}$. $\Delta_1 \circ \Delta_1$, also noted $\Delta_1^2$, is the discrete-time second derivative.
Point Predictions from Differenced Data

How are these predictions made? To answer this question, we need to see how to use filters.
7. Filters for Dummies
Notation for a filter: $F = \frac{C(B)}{A(B)}$, where $B$ is the backshift operator and $A, C$ are polynomials with $A(0) \ne 0$.
Formula for $Y = F(X)$: $A(B)\,Y = C(B)\,X$.
Impulse response: the output $h$ of the filter when the input is the impulse $e = (1, 0, 0, \dots)$; then $Y_t = \sum_{j \ge 0} h_j X_{t-j}$.
Inverse filter: $Y = F^{-1}(X)$ means $X = F(Y)$; its impulse response is the output of $F^{-1}$ for the input $e = (1, 0, 0, \dots)$.
Let $F$ be the filter defined by … with … Say what is true:
A. The impulse response of $F$ is …
B. …
Possible answers: A, B, A and B, None, I don't know
Solution

Both A and B are true, by definition of filters.
Let $F$ be the filter defined by … with … Say what is true:
A. The impulse response of $F$ is …
B. …
Possible answers: A, B, A and B, None, I don't know
Solution

A is true. Indeed we can write the definition of $F$ as … Now the filter … is invertible (its first coefficient is non-zero), therefore …
B is false: what is given is the impulse response of the inverse filter.
Matlab's filter

Filter notation: $F = \dfrac{0.1 + 0.2B + 0.3B^2}{1 - 0.2B}$
Equation: $Y_t = 0.2\,Y_{t-1} + 0.1\,X_t + 0.2\,X_{t-1} + 0.3\,X_{t-2}$
Matlab: Y = filter([0.1 0.2 0.3], [1 -0.2], X)
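As a sanity check, the recursion can be recomputed by hand and compared to filter's output (a sketch; the data is arbitrary):

% Sketch: verify that Matlab's filter implements the recursion above
X = randn(20,1);
Y = filter([0.1 0.2 0.3], [1 -0.2], X);
Y2 = zeros(size(X));   % recompute with the convention X_t = Y_t = 0 for t <= 0
for t = 1:numel(X)
    xm1 = 0; if t >= 2, xm1 = X(t-1); end
    xm2 = 0; if t >= 3, xm2 = X(t-2); end
    ym1 = 0; if t >= 2, ym1 = Y2(t-1); end
    Y2(t) = 0.2*ym1 + 0.1*X(t) + 0.2*xm1 + 0.3*xm2;
end
max(abs(Y - Y2))       % 0 up to rounding errors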
A sample of $X$ and of $Y = F(X)$.
Q: how can we compute $X$ back from $Y$?
A: invert the filter. The inverse of $F = \frac{C(B)}{A(B)}$ is $F^{-1} = \frac{A(B)}{C(B)}$, defined if the first element of $C$ is non-zero.
The result is shown with green dots; after some time the results are incorrect. Why?
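In Matlab, inverting the filter amounts to swapping the two coefficient vectors (a sketch, reusing the illustrative coefficients from the previous slide):

% Sketch: invert a filter by swapping numerator and denominator
C = [0.1 0.2 0.3]; A = [1 -0.2];
X = randn(100,1);
Y = filter(C, A, X);      % Y = F(X)
Xrec = filter(A, C, Y);   % X = F^{-1}(Y); defined since C(1) ~= 0
plot(abs(X - Xrec))       % small at first, then grows: F^{-1} is unstable here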
To understand what happens, let us compute the coefficients of these filters (i.e. their impulse responses). The impulse response is obtained by applying the filter to $e = (1, 0, 0, \dots)$, where $e$ is called an impulse.
The impulse response of $F^{-1}$ grows exponentially and becomes huge: numerical computation becomes impossible.
[Figures: impulse response of $F$; impulse response of $F^{-1}$]
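The impulse responses can be computed directly by feeding an impulse into each filter (a sketch with the same illustrative coefficients):

% Sketch: impulse responses of F and F^{-1}
C = [0.1 0.2 0.3]; A = [1 -0.2];
e = [1; zeros(49,1)];      % the impulse e = (1, 0, 0, ...)
h = filter(C, A, e);       % impulse response of F: decays geometrically
g = filter(A, C, e);       % impulse response of F^{-1}: grows exponentially
semilogy(abs(g))           % exponential growth -> rounding errors blow up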
Filter Stability

A filter $F = \frac{C(B)}{A(B)}$ is stable if its impulse response does not blow up; this holds when all its poles are outside the unit disk. A filter that is unstable usually causes numerical problems (accumulation of rounding errors).
Poles of $F$: solutions of $A(z) = 0$.
Zeroes of $F$: solutions of $C(z) = 0$; $F^{-1}$ is stable when the zeroes of $F$ are outside the unit disk.
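The stability conditions can be checked with roots; note that our coefficient vectors are in ascending powers of B, while Matlab's roots expects descending powers, hence the fliplr (a sketch with the same illustrative coefficients):

% Sketch: stability test via the roots of the polynomials
C = [0.1 0.2 0.3]; A = [1 -0.2];
all(abs(roots(fliplr(A))) > 1)   % true: all poles outside the unit disk, F is stable
all(abs(roots(fliplr(C))) > 1)   % false: a zero inside the unit disk, F^{-1} is unstable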
A filter with stable inverse
P = [0.5 0.3 0.2], Q = [1]
What is true about this filter (where …)?
A. …
B. …
Possible answers: A, B, A and B, None, I don't know
P = [1], Q = [0.5 0.3 0.2]
Solution

A is true. B is false; it is true that …
Answer A
MA and AR representation of a filter

Let $Y = F(X)$ with $F = \frac{C(B)}{A(B)}$.

Definition (Moving Average representation): $Y_t = \sum_{j \ge 0} h_j X_{t-j}$. This is the standard representation, and $h$ is the impulse response. We say that $F$ is an MA($q$) filter if $h_j = 0$ for $j > q$.

Definition (Auto-Regressive representation): $X_t = \sum_{j \ge 0} g_j Y_{t-j}$. It follows from the impulse response $g$ of $F^{-1}$. We say that $F$ is an AR($p$) filter if $g_j = 0$ for $j > p$.
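For instance, an MA(1) filter generally has an infinite AR representation; its coefficients $g_j$ are just the impulse response of the inverse filter (a sketch with an illustrative coefficient 0.5):

% Sketch: AR representation of the MA(1) filter F = 1 - 0.5B
e = [1; zeros(9,1)];
g = filter(1, [1 -0.5], e)   % g_j = 0.5^j, never exactly 0: F is not AR(p) for finite p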
Example: the filter …, i.e. …, is an MA(17) filter.
Let us compute the AR representation of this filter; we obtain it from the impulse response of the inverse filter: let us solve for … in …. After some math we find …
8. How is this prediction done ?

Recall that $Y = F(X)$, with $F$ the differencing filter, and we assume $Y$ is iid; thus $X_t = \sum_{j \ge 0} g_j Y_{t-j}$ (MA representation of $F^{-1}$), i.e. $Y_t = \sum_{j \ge 0} h_j X_{t-j}$ (AR representation of $F^{-1}$), where $g$ and $h$ are the impulse responses of $F^{-1}$ and $F$.
Prediction at lag $\ell$: assume we know $X_1, \dots, X_t$ (known). Given the past up to time $t$, $X_{t+\ell}$ is random, with a distribution that we now compute.
Point Prediction at lag 1

Prediction at lag 1: assume we know $X_1, \dots, X_t$ (known). Assume $Y$ is iid with zero mean; the mean of $X_{t+1}$ given the past up to time $t$ is the point prediction $\hat X_t(1)$. Given the past up to time $t$, $X_{t+1}$ is random, with a distribution derived from that of $Y_{t+1}$.
Point Predictions

Prediction at lag 2: assume we know $X_1, \dots, X_t$ (known). Given the past up to time $t$, the conditional expectation of $Y_{t+2}$ is 0 (it has zero mean), and the conditional expectation of $X_{t+1}$ is $\hat X_t(1)$. Therefore we obtain the point prediction at lag 2.
At lag $\ell$: use the formula $Y_s = \sum_{j \ge 0} h_j X_{s-j}$, in which you replace $X_s$ by $\hat X_t(s-t)$ for $s > t$ and $Y_s$ by 0 (= the mean of $Y$) for $s > t$.
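A compact way to implement this in Matlab, for the special case $F = \Delta_1$ (a sketch with synthetic random-walk data): extend the differenced series with zeros (the mean of $Y$) and run it through the inverse filter.

% Sketch: point predictions at lags 1..L for the model Delta_1 X = Y, Y iid zero mean
n = 200; L = 10;
X = cumsum(randn(n,1));           % a sample random walk
Y = filter([1 -1], 1, X);         % differenced data
Yext = [Y; zeros(L,1)];           % replace unknown future Y by its mean, 0
Xext = filter(1, [1 -1], Yext);   % apply Delta_1^{-1} (here: cumulative sum)
Xhat = Xext(n+1:n+L)              % point predictions; for Delta_1 they all equal X(n)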
Use of the alternative representation (MA representation of $F^{-1}$)

Prediction at lag $\ell$: assume we know $X_1, \dots, X_t$ (known); therefore we also know $Y_1, \dots, Y_t$, and $X_{t+\ell} = \sum_{j \ge 0} g_j Y_{t+\ell-j}$. Given the past up to time $t$, this is random, with a distribution determined by the terms $Y_s$ with $s > t$.
Note it would not be a good idea to use this formula to compute point predictions, because we accumulate a large number of errors, but it can be used to compute prediction intervals.
Computation of Prediction Intervals (example with $\ell = 3$)

Prediction at lag 3: assume we know $X_1, \dots, X_t$ (known at time $t$). Since the filter is causal and invertible, knowing $X_1, \dots, X_t$ is equivalent to knowing $Y_1, \dots, Y_t$. Therefore (innovation formula): $X_{t+3} = (\text{known at time } t) + g_0 Y_{t+3} + g_1 Y_{t+2} + g_2 Y_{t+1}$.
Given the past up to time $t$, the distribution of $X_{t+3}$ is given by:
- a constant
- plus the sum of 3 independent random variables, each with the assumed distribution of $Y$.
Example: assume the distribution of $Y$ is $N(0, \sigma^2)$; then the distribution of $X_{t+3}$ given the past up to time $t$ is normal, with mean $\hat X_t(3)$ and variance $3\sigma^2$.
A 95%-prediction interval at lag 3 is…
A. …
B. $\hat X_t(3) \pm 1.96\,\sigma\sqrt{3}$
C. None of the above
D. I don't know
Solution

The distribution of $X_{t+3}$ given the past up to time $t$ is normal with mean $\hat X_t(3)$ and variance $3\sigma^2$; therefore, with probability 95%, $X_{t+3}$ is in the interval $\hat X_t(3) \pm 1.96\,\sigma\sqrt{3}$.
Answer B
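The same computation in Matlab, for the $\Delta_1$ model (a sketch; the data and the estimate of $\sigma$ are illustrative):

% Sketch: 95% prediction interval at lag 3 for the model Delta_1 X = Y, Y iid N(0, sigma^2)
n = 200;
X = cumsum(randn(n,1));
Y = filter([1 -1], 1, X);
sigma = std(Y(2:end));           % estimate sigma from the differenced data
m = X(n);                        % point prediction (for Delta_1, the same at every lag)
half = 1.96 * sigma * sqrt(3);   % X_{n+3} = X_n + sum of 3 iid N(0, sigma^2) terms
interval = [m - half, m + half]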
Prediction assuming differenced data is iid
[Figures]
Compare the Two
[Figures: linear regression with 3 parameters + variance vs. assuming differenced data is iid]
9. Using ARMA Models for the Noise

This technique is used when the differenced data appears stationary but not iid; the correlation structure can be used to gain some information about the future. The differenced data can be modelled as an ARMA process instead of iid.
Deciding whether a stationary sequence is iid
ARMA Process

$X$ is an ARMA process if $X = F(\epsilon)$, where $\epsilon$ is an iid sequence of Gaussian random variables and $F = \frac{C(B)}{A(B)}$ is a stable filter with stable inverse.
Which of these Matlab scripts produce a sample of an ARMA process?
A. X = filter([1; -0.4], [1; 0.4], randn(1,n))
B. X = filter([1; 0.4], [1; -0.4], randn(1,n))
Possible answers: A, B, A and B, None, I don't know
Solution

A and B each produce a sample of $X = F(\epsilon)$, where $\epsilon$ is an iid sequence of standard normal random variables and $F$ is an ARMA filter. We need to verify whether the filter and its inverse are stable. For A, the root of the numerator polynomial $1 - 0.4z$ is $z = 2.5$, thus outside the unit disk; for the denominator polynomial $1 + 0.4z$ we have $z = -2.5$, thus also outside the unit disk; thus both $F$ and $F^{-1}$ are stable. Idem for B. Both A and B are ARMA processes.
Answer C
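For completeness, here is script A together with the stability checks used in the solution (a sketch; n is arbitrary):

% Sketch: generate the ARMA sample of option A and check stability
n = 1000;
X = filter([1 -0.4], [1 0.4], randn(1,n));   % X = F(eps), eps iid N(0,1)
all(abs(roots(fliplr([1 -0.4]))) > 1)        % zero at z = 2.5: F^{-1} is stable
all(abs(roots(fliplr([1 0.4]))) > 1)         % pole at z = -2.5: F is stable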
ARMA Processes are Gaussian (non iid)
[Figures]
ARIMA Process

$X$ is called an ARIMA process if $F(X)$ is an ARMA process, where $F$ is a combination of differencing and de-seasonalizing filters.

How to fit an ARIMA process? Apply differencing filters until the data appears stationary. Fit the differenced process using the ARMA fitting procedure (Thm 5.2, Matlab's armax); check the ACF of the residuals; the residuals are obtained from the innovation formula. Be careful with the overfitting problem: use AIC or BIC; the ACF of the differenced data may give an idea of the order.
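A sketch of this procedure using Matlab's armax (requires the System Identification Toolbox; the toy data and the ARMA orders [2 2] are illustrative guesses, to be compared via AIC or BIC):

% Sketch: fit an ARIMA model by differencing, then ARMA fitting
n = 500;
X = cumsum(filter(1, [1 -0.5], randn(n,1)));   % toy integrated AR(1) sample
D = filter([1 -1], 1, X);                      % difference until it looks stationary
m = armax(iddata(D), [2 2]);                   % fit ARMA(2,2) to the differenced data
resid(iddata(D), m);                           % plot residual ACF: should look white
aic(m)                                         % compare candidate orders via AIC (or BIC)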
Fitting an ARMA process is a non-linear optimization problem. It is usually solved by iterative, heuristic algorithms, which may converge to a local maximum or may not converge at all. Some simple, non-MLE heuristics exist for AR or MA models; e.g., fit the AR model that has the same theoretical ACF as the sample ACF. Common practice is to bootstrap the optimization procedure by starting with a "best guess" AR or MA fit, using the heuristic above.
Example
[Figures]
Forecasting with an ARIMA Process

By composition of filters, the model is again of the form $X = G(\epsilon)$, where $G$ combines the filter of the ARMA process and the differencing filter. Using the impulse response of $G$ and its inverse, we obtain formulas similar to those we saw previously. See Prop 5.4 and the forecast exercise.
Improve Confidence Intervals If Residuals Are Not Gaussian (but appear to be iid)

Assume the residuals are not Gaussian but are iid. How can we get prediction intervals? Bootstrap by sampling from the residuals.
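A minimal sketch of this bootstrap for the $\Delta_1$ model (the heavy-tailed toy data and the number of replicates are illustrative; trnd and prctile require the Statistics Toolbox):

% Sketch: bootstrap a lag-1 prediction interval by resampling residuals
n = 200; B = 1000;
X = cumsum(trnd(3, n, 1));                      % toy data with non-Gaussian increments
res = filter([1 -1], 1, X); res = res(2:end);   % residuals of the Delta_1 model
boot = X(n) + res(randi(numel(res), B, 1));     % X_{n+1} = X_n + resampled residual
prctile(boot, [2.5 97.5])                       % 95% bootstrap prediction interval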
[Figures: prediction intervals with bootstrap from residuals vs. with Gaussian assumption]
10. Other

We have seen a few forecasting recipes:
- regression models
- use of differencing filters to make the noise stationary
- use of ARMA models to make the noise iid
- use of bootstrap
These can be combined or extended. For example: linear regression with ARMA noise.
Linear Regression with ARMA Noise

Assume a linear regression model $Y = M\beta + E$ where we find that the residual $E$ does not look iid at all. We can model $E$ as an ARMA process and obtain $E = H(\epsilon)$, where $H$ is an ARMA filter and $\epsilon$ is iid. Apply the inverse filter and obtain a linear regression model $H^{-1}(Y) = H^{-1}(M)\,\beta + \epsilon$ with iid noise.
If we know $H$ we can estimate $\beta$; if we know $\beta$ we can estimate $H$: iterate and hope it converges. Prediction formulas can be obtained using the calculus of filters, exactly as we did above.
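A sketch of one refinement iteration (all data, coefficients and the assumed ARMA fit are illustrative):

% Sketch: linear regression with ARMA noise, one iterate-and-hope step
n = 300; M = [ones(n,1) (1:n)'];              % design matrix: constant + trend
E = filter(1, [1 -0.7], randn(n,1));          % AR(1) noise, a special case of ARMA
Yr = M*[2; 0.05] + E;                         % regression data
beta0 = M \ Yr;                               % step 1: ordinary least squares
A = [1 -0.7];                                 % step 2: ARMA model fitted to residuals (assumed here)
Yf = filter(A, 1, Yr); Mf = filter(A, 1, M);  % step 3: apply the inverse noise filter to both sides
beta1 = Mf \ Yf                               % step 4: re-estimate beta; iterate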
Sparse ARMA Models

Problem: avoid many parameters when the degrees of the A and C polynomials are high. Based on heuristics:
- Multiplicative ARIMA
- Constrained ARIMA
- Holt-Winters
See Section 5.6.
Sparse models give less accurate predictions but have far fewer parameters and are simple to fit.
[Figure: constrained ARIMA]