Slide1
A Regression Model for Ensemble Forecasts
David Unger
Climate Prediction CenterSlide2
Summary
A linear regression model can be designed specifically for ensemble prediction systems.
It is best applied to direct model forecasts of the element in question.
Ensemble regression is easy to implement and calibrate.
This talk will summarize how it works.
Slide3
Ensemble Forecasting
The ensemble forecasting approach is based on the following beliefs:
1) Individual solutions represent possible outcomes.
2) Each ensemble member is equally likely to best represent the observation.
3) The ensemble set behaves as a randomly selected sample from the expected distribution of observations.Slide4
6-10 day mean 500-hPa heights.
Slide5
TheorySlide6
Conventions
Slide7
The Ensemble Regression Model
Assumptions
Slide8
Forecasts
Observations
A Schematic Drawing of an Ensemble Regression Line. Slide9
[Figure labels: Forecasts; Potential Observations; Actual obs; 20% chance (shown four times)]
An individual case: 5 Potential solutions identified
One actual observation (ovals).
Four others that “could” happen.
Red indicates best (closest) member.Slide10
Ensemble Regression Principal Assumptions
Statistics are gathered from the one actual observation.
The math is applied under the assumption that each ensemble member could also have been a solution.
Slide11
How is it possible to derive?
Slide12
“Ensemble” Regression
Best Member
Regression Eq. same as for the Ensemble mean
Residual errors much smaller (usually)Slide13
What does it mean in English?
Derive a regression equation relating the ensemble mean and the observation.
Apply this equation to each individual member.
Apply an error estimate to each individual regression-corrected forecast.
This looks a lot like the “Gaussian kernel” approach (kernel dressing).
Slide14
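The three steps listed under "What does it mean in English?" can be sketched in a few lines of Python. This is a minimal illustration, not the CPC implementation: the function names are hypothetical, and the kernel width uses an assumed variance-matching rule (the residual variance of the ensemble-mean regression minus the regression-scaled mean spread), since the slide's own formulas did not survive extraction.

```python
import math
from statistics import fmean


def fit_ensemble_regression(ensembles, observations):
    """Fit obs = a0 + a1 * ensemble_mean and estimate a kernel width.

    ensembles: one list of member forecasts per past case
    observations: the verifying observation for each case
    """
    means = [fmean(members) for members in ensembles]
    f_bar, y_bar = fmean(means), fmean(observations)
    var_f = fmean([(f - f_bar) ** 2 for f in means])
    var_y = fmean([(y - y_bar) ** 2 for y in observations])
    cov_fy = fmean([(f - f_bar) * (y - y_bar)
                    for f, y in zip(means, observations)])
    a1 = cov_fy / var_f
    a0 = y_bar - a1 * f_bar
    r2 = cov_fy ** 2 / (var_f * var_y)
    # Mean squared spread of the members about their own ensemble mean.
    spread2 = fmean([fmean([(x - m) ** 2 for x in members])
                     for members, m in zip(ensembles, means)])
    # ASSUMPTION: variance matching. The kernel variance is whatever is
    # left of the ensemble-mean residual variance after the
    # regression-scaled spread is accounted for (floored at zero).
    kernel_var = max(var_y * (1.0 - r2) - a1 ** 2 * spread2, 0.0)
    return a0, a1, math.sqrt(kernel_var)


def dressed_pdf(y, members, a0, a1, sigma):
    """Density at y: equal-weight Gaussian kernels centred on the
    regression-corrected members."""
    if sigma <= 0.0:
        raise ValueError("kernel width must be positive")
    centres = [a0 + a1 * x for x in members]
    norm = sigma * math.sqrt(2.0 * math.pi)
    return fmean([math.exp(-0.5 * ((y - c) / sigma) ** 2) / norm
                  for c in centres])
```

Calling `fit_ensemble_regression` on a training set and then `dressed_pdf` on a new ensemble gives the forecast density; with a single kernel per member this reduces to ordinary kernel dressing applied after the regression correction.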
Regression with error estimates applied Slide15
Derivation
The regression is computed from statistics similar to those needed for standard linear regression, with only two additional array elements related to the ensemble size and spread.
Slide16
Multiple linear regression
The theory (applying the ensemble-mean equation to individual members) also applies to multiple linear regression, provided all predictors are linear. Including binary predictors, interaction predictors, etc. is not theoretically correct.
Ensemble regression may be easier to apply to the MOS forecasts in a second step.
(Derive equations, apply them to get a series of forecasts, and do a second step processing of those forecasts) Slide17
CPC products based on ensemble regression
Slide18
NAEFS
Combines GEFS and Canadian ensembles
Bias corrected by EMC (6-hourly)
2-meter temperatures processed by CPC into probabilities of above-, near-, and below-normal categories (5-day means).
Slide19
NAEFS Kernel Density Example
Standardized Temperature (Z)
Probability DensitySlide20
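To make the step from a kernel density in standardized temperature (Z) to above-, near-, and below-normal probabilities concrete, here is a small sketch. It assumes Gaussian kernels and a standard-normal climatology, so the tercile boundaries sit at about Z = ±0.43; the function names are hypothetical and the operational CPC procedure may differ.

```python
import math

# Tercile boundaries of a standard normal climatology (ASSUMPTION:
# the standardized variable Z is N(0, 1) climatologically).
LOWER_TERCILE, UPPER_TERCILE = -0.4307, 0.4307


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probabilities(kernel_centres, kernel_sigma, weights=None):
    """Return (below, near, above) probabilities for a mixture of
    Gaussian kernels centred on regression-corrected members."""
    n = len(kernel_centres)
    if weights is None:
        weights = [1.0 / n] * n  # equally likely members
    below = sum(w * normal_cdf((LOWER_TERCILE - c) / kernel_sigma)
                for w, c in zip(weights, kernel_centres))
    above = sum(w * (1.0 - normal_cdf((UPPER_TERCILE - c) / kernel_sigma))
                for w, c in zip(weights, kernel_centres))
    near = 1.0 - below - above
    return below, near, above
```

A single kernel centred at zero with unit width reproduces climatology (one third in each category); shifting the kernel centres toward positive Z moves probability into the above-normal category.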
Long Lead Consolidation
Nino 3.4 SST forecasts
Seasonal Forecast ConsolidationSlide21
NAEFS PERFORMANCE
6-10 Day Forecast Reliability
8-14 Day Forecast ReliabilitySlide22
NAEFS Performance
Official Forecast NAEFS GuidanceSlide23
Calibration
Slide24
Climate Forecast System Version 2
(CFSv2)
4 runs per day, one every 6 hours.
Lagged ensemble – an ensemble formed from model forecasts with different initial times, all valid for the same target period.
Hindcast data available only every 5th day from 1982-present.
Example forecast from Jan 26, 2010. Slide25
Forecast Situation
El Nino conditions were observed in early 2010.
CFS was the first to warn of a La Nina Slide26
Calibration
Most models have too little spread (overconfident). This is compensated for by wide kernels.
If the mean ensemble spread is too large, adjustments must be made.Slide27
Spread Calibration
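The slide's equations did not survive extraction, so the following is only a sketch of how a spread factor K could work, under stated assumptions: K rescales each member's departure from the ensemble mean, and the kernel variance is reduced so that the total forecast variance still matches the residual variance of the ensemble-mean regression. The function names and the variance-matching rule are illustrative assumptions, not the documented CPC formulas.

```python
import math


def rescale_members(members, k):
    """Scale each member's departure from the ensemble mean by k.
    k < 1 narrows the ensemble, k > 1 inflates it, k = 1 leaves it as is."""
    mean = sum(members) / len(members)
    return [mean + k * (x - mean) for x in members]


def kernel_sigma(resid_var, a1, mean_spread2, k):
    """Kernel width after rescaling the spread by k.

    ASSUMPTION (variance matching): kernel variance equals the
    ensemble-mean residual variance minus the regression-scaled,
    k-adjusted mean squared spread.
    """
    kernel_var = resid_var - (k * a1) ** 2 * mean_spread2
    if kernel_var < 0.0:
        raise ValueError("k exceeds the maximum allowed by the residual variance")
    return math.sqrt(kernel_var)


def max_k(resid_var, a1, mean_spread2):
    """Largest k for which the kernel variance stays non-negative."""
    return math.sqrt(resid_var / (a1 ** 2 * mean_spread2))
```

Under this rule, small K puts most of the uncertainty into wide kernels, so the forecast approaches the standard regression on the ensemble mean, while K near `max_k` shrinks the kernels toward zero width and leaves the rescaled members to carry nearly all of the spread.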
Slide28
CFSv2 Nino 3.4, K=0.2
Axes: SST (°C) vs. probability density
Red – Regression on the ensemble mean (standard regression)
Green – Individual members
Blue – Combined envelope
Slide29
K=.4Slide30
K=.6Slide31
K=.8Slide32
Unaltered Ensemble Regression K=1.0
Axes: SST (°C) vs. probability density
Red – Ensemble mean
Blue – Kernel envelope
Green – Individual members
Slide33
K=1.2Slide34
K=1.4Slide35
K=1.6 (near maximum)
Labels: original forecast, regression, modified forecast
Slide36
Spread vs. SkillSlide37Slide38
Adjustments
Slide39
An information tidbit
Generate N values taken randomly from a Gaussian-distributed variable and label them as the ensemble forecasts (N < 20).
Take another value randomly from that same distribution and label it the observation.
Do an ensemble regression over many such cases (but not so many that R=0).
Question: What happens?
Slide40
Answer
Maintains a fixed ratio (on the average)
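The experiment is easy to try numerically. The sketch below follows the recipe as stated (N random Gaussian "members" plus one random "observation" per case, regression on the ensemble mean) and reports a few statistics. The extracted text does not say which ratio the answer refers to; as an assumption, the code reports one ratio that is fixed on average in this setup, the mean squared spread divided by the variance of the ensemble mean, which tends to N - 1 for independent members. All names are hypothetical.

```python
import math
import random
from statistics import fmean


def simulate(n_members=10, n_cases=200, seed=1):
    """Members and observation are independent draws from N(0, 1)."""
    rng = random.Random(seed)
    ensembles = [[rng.gauss(0.0, 1.0) for _ in range(n_members)]
                 for _ in range(n_cases)]
    observations = [rng.gauss(0.0, 1.0) for _ in range(n_cases)]

    means = [fmean(e) for e in ensembles]
    f_bar, y_bar = fmean(means), fmean(observations)
    var_f = fmean([(f - f_bar) ** 2 for f in means])
    var_y = fmean([(y - y_bar) ** 2 for y in observations])
    cov_fy = fmean([(f - f_bar) * (y - y_bar)
                    for f, y in zip(means, observations)])
    a1 = cov_fy / var_f                     # regression slope
    r = cov_fy / math.sqrt(var_f * var_y)   # sample correlation
    spread2 = fmean([fmean([(x - m) ** 2 for x in e])
                     for e, m in zip(ensembles, means)])
    return {
        "a1": a1,
        "r": r,
        "explained_var": a1 ** 2 * var_f,
        "scaled_spread_var": a1 ** 2 * spread2,
        # Candidate "fixed ratio" (ASSUMPTION): about n_members - 1.
        "spread_to_mean_var_ratio": spread2 / var_f,
    }
```

Because the slope a1 multiplies both the explained variance and the scaled spread, their ratio does not depend on how large the chance correlation R happens to be in a given sample.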
Slide41
Inflation
Slide42
Unaltered Ensemble Regression K=1.0
Very close to the maximum K for a 4-member ensemble.
Axes: SST (°C) vs. probability density
Red – Ensemble mean
Blue – Kernel envelope
Green – Individual members
Slide43
Weighting of ensemblesSlide44
Weighting
Slide45
Weighting (illustration)
Two forecasts: Red = GFS hi-res (ensemble mean standard regression error distribution); Blue = GFS ensembles.
The “best” forecast in this case is the one with the highest PDF.
GEFS is more likely to have the best member if Obs < 26.8 °C.
GFS hi-res is better.
Slide46
Weighting (Continued)
Group ensembles into sets of equal skill.
(GEFS, Canadian ensembles, ECMWF ensembles, hi-res GFS, hi-res ECMWF, etc.)
Pass 1) Calculate PDFs separately
Pass 2) Choose highest PDF as best. Keep track of percentages.
Pass 3) Enter WEIGHTED ensembles into an ensemble regression. Weights=P(Best)/N
An adaptive regression can do this in real time.Slide47
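The three passes on the weighting slide can be sketched as follows. This is an illustration under assumptions, not the operational code: each group's PDF is taken to be an equal-weight mixture of Gaussian kernels on already regression-corrected members, "best" means the group with the highest density at the verifying observation, each group has a fixed size across cases, and the function and argument names are hypothetical.

```python
import math
from statistics import fmean


def group_pdf(y, centres, sigma):
    """Density at y of an equal-weight Gaussian kernel mixture."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return fmean([math.exp(-0.5 * ((y - c) / sigma) ** 2) / norm
                  for c in centres])


def member_weights(cases, sigmas):
    """cases: list of (observation, {group_name: [corrected members]})
    sigmas: {group_name: kernel width for that group}
    Returns the per-member weight P(best) / N for each group."""
    wins = {name: 0 for name in sigmas}
    sizes = {}
    for obs, groups in cases:
        # Pass 1: evaluate each group's PDF separately.
        densities = {name: group_pdf(obs, members, sigmas[name])
                     for name, members in groups.items()}
        # Pass 2: the group with the highest density is "best"; tally it.
        best = max(densities, key=densities.get)
        wins[best] += 1
        for name, members in groups.items():
            sizes[name] = len(members)
    # Pass 3: per-member weight = P(group is best) / group size.
    return {name: wins[name] / len(cases) / sizes[name] for name in sigmas}
```

The weights returned here would then be attached to each member when the groups are pooled into a single weighted ensemble regression; summed over all members they add up to one.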
Weighted Ensemble CFSv2
Nino 3.4 SSTs – Lead 6-mo.
Ensemble Group 1 – Jan 26 2010 For August 2010 Wgt: .36
Ensemble Group 2 – Jan 21 2010 For August 2010 Wgt: .36
Ensemble Group 4 – Jan 16 2010 For August 2010 Wgt: .28Slide48
Conclusion
It is theoretically sound to derive an equation from the ensemble mean and apply it to individual members.
An ensemble regression forecast together with its error estimates resembles Gaussian kernel smoothing except members are first processed by the ensemble mean-based regression equation.
Additional control can be achieved by adjusting the spread (K-factor). This capability is required for the case where the ensemble spread is too high.
Ensemble regression need not require equally weighted members, only that the probability that each member will be closest be estimated.
Weighting coefficients can be derived from the PDFs from component models in relation to the observations.
The system delivers reliable probabilistic forecasts that are competitive in skill with manual forecasts (better in reliability).