Slide 1
MODE CAPE Verification
February 4, 2016
Tracey Dorian
Fanglin Yang
IMSG at NOAA/NCEP/EMC
* Thank you to John Halley Gotway and Tara Jensen from DTC
1
Slide 2
Background
Compared operational GFS to the Parallel GFS (GFSX)
GFSX is the Summer 2015 retrospective run (pr4devbs15) (DA and land surface changes)
Period of examination is 5/1/15 - 8/1/15 (event-equalized)
Forecasts verified against the 13-km RAP model 00-h forecasts
CAPE forecasts initialized from 12Z cycles, valid at 00Z
Focus is on 12, 36, 60, 84, 108, 132, and 156-hour forecasts
0.25° forecast and 13-km RAP files put onto a regional (CONUS) Lambert conformal grid - Grid 130 (13-km resolution)
Surface CAPE evaluated over CONUS (oceans not included)
Threshold and smoothing combination chosen to be >= 2000 J/kg and 2 grid squares (chosen based on subjective evaluation)
2
Slide 3
3
Slide 4
Why use the RAP for CAPE verification?
The RAP model assimilates surface and radar reflectivity data, cycles short-range convective forecasts, and allows surface observations to influence the T and q fields through the boundary layer
Because CAPE is computed from the vertical profiles in the analysis, an accurate CAPE analysis depends on accurate depictions of the T and q profiles
If any level is significantly off (especially the surface), CAPE values could be significantly off
GFS does not assimilate surface observations (and dew points can at times be way off)
* Thank you to Geoff Manikin for this information
4
Slide 5
MODE Review
MODE is an object-based verification tool that is part of the Model Evaluation Tools (MET) software package developed by the Developmental Testbed Center (DTC)
MODE stands for Method for Object-Based Diagnostic Evaluation
Purpose is to extract diagnostic information about model performance
Compares gridded forecast files to gridded observation files
MODE statistics are broken into two categories: single-object and pair statistics (e.g., single: object area, centroid latitude, percentile intensity; pair: area ratio, centroid distance, percentile intensity ratio, angle difference)
5
Slide 6
MODE Object Identification
Smoothing radius
(in grid squares)
Intensity threshold
(in variable units)
Source: Davis 2006
* User-defined parameters in configuration file
(Intensity presented as vertical dimension)
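MODE's identification procedure (smooth the raw field with a circular filter of the given radius, threshold the smoothed field, then treat each connected region as an object) can be sketched as below. This is an illustrative Python re-implementation, not the MET code; the function name and test field are invented for the example.

```python
# Sketch of MODE-style object identification: convolve with a circular
# filter of radius R (grid squares), apply an intensity threshold, then
# label connected regions as objects. Illustrative only, not MET's code.
import numpy as np

def identify_objects(field, radius=2, threshold=2000.0):
    ny, nx = field.shape
    # Offsets forming a circular averaging filter of the given radius
    offs = [(dy, dx) for dy in range(-radius, radius + 1)
                     for dx in range(-radius, radius + 1)
                     if dy * dy + dx * dx <= radius * radius]
    smoothed = np.zeros_like(field, dtype=float)
    for j in range(ny):
        for i in range(nx):
            vals = [field[j + dy, i + dx]
                    for dy, dx in offs
                    if 0 <= j + dy < ny and 0 <= i + dx < nx]
            smoothed[j, i] = sum(vals) / len(vals)
    mask = smoothed >= threshold
    # Connected-component labeling (4-connectivity) via flood fill
    labels = np.zeros((ny, nx), dtype=int)
    n_objects = 0
    for j in range(ny):
        for i in range(nx):
            if mask[j, i] and labels[j, i] == 0:
                n_objects += 1
                stack = [(j, i)]
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < ny and 0 <= x < nx and mask[y, x] and labels[y, x] == 0:
                        labels[y, x] = n_objects
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, n_objects
```

With a single block of 4000 J/kg in an otherwise-zero field, the ≥2000 J/kg threshold after smoothing yields one object.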
6
Steps applied to both forecast and observation fields
Slide 7
7
>= 2000 J/kg
GFSX 12-h forecast valid 00Z 7/17/15
Misses
False Alarms
Slide 8
8
>= 2500 J/kg
GFSX 12-h forecast valid 00Z 7/17/15
Slide 9
Interest value quantifies the overall similarity between two objects across fields
A summary statistic based on fuzzy logic and user-defined weights for attributes such as centroid distance, boundary distance, angle difference, area ratio, intersection area, intensity ratios, etc.
Interest values are computed for each possible pair of forecast/observation objects
Interest values are unitless and range between 0 and 1
The higher the interest value, the more similar the objects are
A pair of objects between the forecast and observation fields is considered a match if the interest value exceeds a user-defined interest threshold (default = 0.7)
Interest Values
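As a toy illustration of the matching rule (the object names and interest values below are invented), applying the default 0.7 interest threshold looks like:

```python
# Hypothetical total interest values for each forecast/observation
# object pair (values are made up for illustration).
interest = {
    ("F1", "O1"): 0.91, ("F1", "O2"): 0.42, ("F1", "O3"): 0.10,
    ("F2", "O1"): 0.35, ("F2", "O2"): 0.78, ("F2", "O3"): 0.55,
}

INTEREST_THRESHOLD = 0.7  # MODE's default match threshold

# A pair is declared a match when its interest value meets the threshold.
matches = sorted(pair for pair, value in interest.items()
                 if value >= INTEREST_THRESHOLD)
```

Here F1 matches O1 and F2 matches O2, while O3 goes unmatched (a miss).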
9
Slide 10
Median of Maximum Interest (MMI) - a useful summary measure, but provides little to no diagnostic information
Should not be used in isolation
Summary verification measure calculated by MODE that provides an overall assessment of model performance (a measure that condenses MODE object information into a single number)
Computed using total interest values for all possible pairs of forecast and observation objects
More specifically, finds the "maximum" total interest value associated with each individual object
From that set, the median value is computed
MMIO is the median of maximum interest for only the observation objects
MMIF is the median of maximum interest for only the forecast objects (dependent on number of forecast objects)
MMI is the median of maximum interest for both forecast and observation objects lumped together (dependent on number of forecast objects)
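Under the definitions above, the three medians can be sketched with a small invented interest matrix (rows = forecast objects, columns = observation objects):

```python
# Sketch of MMIF/MMIO/MMI from a forecast-by-observation interest matrix.
# The interest values are made up for illustration.
import numpy as np

interest = np.array([
    [0.9, 0.2, 0.1],   # forecast object 1 vs observation objects 1..3
    [0.3, 0.8, 0.4],   # forecast object 2 vs observation objects 1..3
])

max_per_fcst = interest.max(axis=1)   # best match for each forecast object
max_per_obs = interest.max(axis=0)    # best match for each observation object

mmif = float(np.median(max_per_fcst))  # forecast objects only
mmio = float(np.median(max_per_obs))   # observation objects only
mmi = float(np.median(np.concatenate([max_per_fcst, max_per_obs])))  # lumped
```

This also shows why false alarms lower MMIF (they add low maxima to `max_per_fcst`) and misses lower MMIO (they add low maxima to `max_per_obs`).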
10
Not sensitive to number of forecast objects, assuming all models are compared to the same observation field
Slide 11
Median of Maximum Interest for Forecast Objects Only
GFS
GFSX
* Confidence intervals (standard deviation) are included
(false alarms will lower the MMIF value)
Simple, unmatched & matched objects
Single Object Statistics
No difference between the GFS and the GFSX
Average about 0.85
11
Slide 12
GFS
GFSX
Median of Maximum Interest for Observation Objects Only
(misses will lower the MMIO value)
Single Object Statistics
Simple, unmatched & matched objects
GFSX has higher MMIO values, statistically significant for short-term forecasts (12-36 hour forecasts)
Average about 0.60
12
Slide 13
GFS
GFSX
Median of Maximum Interest for Forecast & Observation Objects
Single Object Statistics
Simple, unmatched & matched objects
GFSX has higher MMI values, statistically significant for the 12-hour forecast
Average about 0.70
13
Slide 14
Total Object Count
GFS
GFSX
RAP
GFS and GFSX forecast far fewer objects (by hundreds) than the RAP model
GFS closer to RAP, but differences between GFS and GFSX are small
Single Object Statistics
Simple, unmatched & matched objects
* Note: GFSX was closer to RAP for threshold >=2500 J/kg
14
Slide 15
Centroid Latitude
GFS
GFSX
RAP
Single Object Statistics
Simple, unmatched & matched objects
GFSX overall forecasts objects slightly farther north than the GFS, but differences are small
Both the GFS and GFSX forecast objects farther south than the RAP model in June and July
Average about 33°N
* Median plotted
RAP forecasts more objects farther north
15
Slide 16
Centroid Latitude Difference
GFS
GFSX
Single Object Statistics
Simple, unmatched & matched objects
Both GFS and GFSX forecast CAPE on average farther south than the RAP
GFSX slightly closer to RAP analysis
The southern bias is more noticeable in June and July
Average about 3-4° too far south
* Median plotted
16
Slide 17
17
GFS
Slide 18
18
GFSX
Slide 19
Centroid Longitude
GFS
GFSX
RAP
Single Object Statistics
Simple, unmatched & matched objects
Average about 87°W
Small and insignificant differences between the GFS and GFSX
Both GFS and GFSX forecast CAPE farther east than the RAP
RAP forecasts many more objects farther west
* Median plotted
19
GFS and GFSX farther east
Slide 20
Centroid Longitude Difference
GFS
GFSX
Single Object Statistics
Simple, unmatched & matched objects
Average about 4-6° too far east
Both GFS and GFSX forecast CAPE on average farther east than the RAP
The eastern bias is more pronounced in June and July
* Median plotted
20
Slide 21
21
GFS
Slide 22
22
GFSX
Slide 23
Area Ratio = (sum of the forecast object areas) / (sum of the observed object areas)
GFS
GFSX
Single Object Statistics
Simple, unmatched & matched objects
GFS and GFSX have area ratios less than 1, likely due to the difference in the number of objects
GFSX is closer to the RAP model
23
Slide 24
24
GFS
Slide 25
25
GFSX
Slide 26
Differences in 90th Percentile of Intensities
GFS
GFSX
Single Object Statistics
Simple, unmatched & matched objects
GFS and GFSX underestimate 90th percentile intensities by about 125-150 J/kg
* Median plotted
26
Average difference about -125 to -150 J/kg
Slide 27
Threshold >= 3000 J/kg
27
Slide 28
28
GFS
Slide 29
29
GFSX
Slide 30
Centroid Distance
GFS
GFSX
Cluster matched objects
Pair statistics
GFS has slightly larger centroid distances, but differences are not statistically significant
Average about 10-15 grid units
* Median plotted
30
ONLY FOR MATCHED PAIRS
Slide 31
Angle Difference
GFS
GFSX
Pair statistics
Cluster matched objects
No statistically significant differences between the GFS and GFSX
Larger angle differences in May
Average about 15-20 degrees
* Median plotted
31
ONLY FOR MATCHED PAIRS
Slide 32
Interest Values
GFS
GFSX
Pair statistics
Cluster matched objects
Virtually no difference between the GFS and GFSX
Lower interest values in May than in June and July
* Median plotted
32
ONLY FOR MATCHED PAIRS
Slide 33
Object Hits
GFS
GFSX
GFSX has more object hits than GFS for 12-36 hour forecasts
Note: Bigger difference between GFS and GFSX for threshold >= 2500 J/kg
Depends on total number of forecast objects
Average about 550-600 hits
33
Slide 34
Object Misses
GFS
GFSX
Depends on total number of forecast objects
Clear separation between GFS and GFSX with GFS having more misses
Average about 1000-1100 misses
34
Slide 35
Object False Alarms
GFS
GFSX
Depends on total number of forecast objects
Small differences between GFS and GFSX, but GFSX has slightly more false alarms for 12-36h forecasts, while GFS has more for 84-156h forecasts
Average about 200-300 false alarms
35
Slide 36
Findings 1/7 – MMIF, MMIO, MMI
MMIF values about the same between GFS and GFSX
MMIF dropouts in May
MMIO: statistically significant differences between GFSX and GFS for 12-36h forecasts, with GFSX having higher MMIO values
More of a separation between the GFS and GFSX for all forecast hours
MMIO dropouts in late June/July
MMI: statistically significant difference at the 12-h lead time, with GFSX having higher MMI values
MMIF values generally highest, MMIO lowest, MMI in between
Implication may be that misses are the biggest problem (for future runs I plan to specify an area threshold in the object identification step)
36
Slide 37
Findings 2/7 – Object Count
Huge differences in the total number of objects between the GFS/GFSX and the RAP model, with the RAP model producing many more objects (hundreds more overall)
Suggests GFS and GFSX underestimate the number of objects
The RAP model having more objects implies that the RAP model has more >=2000 J/kg areas (higher intensities) or that those areas may be of larger areal extent than in the GFS and GFSX (and were therefore not smoothed out)
Large difference in object count could also explain the large number of misses compared to hits and false alarms
GFS has slightly more objects than the GFSX for most forecast hours, which is closer to the RAP model (however, results differ depending on threshold)
Total number of objects increases from May to July
37
Slide 38
Findings 3/7 – Location
GFSX on average forecasts objects slightly farther north than the GFS
GFSX is slightly closer to the RAP in centroid latitude location, but differences between the GFS and GFSX are small
Both GFS and GFSX forecast objects farther south than the RAP overall during the period 5/1-8/1 (by an average of about 3-4°)
Southern bias worsens from May to July (difference is as much as 6° farther south than the RAP)
No statistically significant differences between the GFS and the GFSX for centroid longitude location
Eastward bias overall for the entire period 5/1-8/1, by an average of about 4-6°
Too far west in May, then a larger bias of being too far east in June and July compared to the RAP model
38
Slide 39
Findings 4/7 – Intensity
Both GFS and GFSX underestimate intensity for all percentile intensities
Underestimation seems worse in May than in June and July
Underestimation is strongest for 90th percentile intensities
Differences of about 200-300 J/kg in May
Differences of about 100-200 J/kg in June and July
No statistically significant differences between the GFS and the GFSX for any percentile intensities, but the GFSX does overall look slightly closer to RAP intensities
39
Slide 40
Findings 5/7 – Size and Area
Both GFS and GFSX overall have area ratios less than 1 for all forecast hours
GFSX has area ratios closer to 1
Interestingly, the area ratio approaches 1 as forecast lead time increases for both the GFS and GFSX
Area ratio > 1 in May (GFSX worse), then area ratio < 1 in June and July (GFSX better)
Overestimation of object size seems to come mostly from the 156-h forecasts
Overall for the period, the GFS and GFSX both underestimate object area compared to the RAP model (apparent for most thresholds)
The GFSX is slightly closer to the RAP model (especially for the highest thresholds)
40
Slide 41
Findings 6/7 – Cluster, matched pairs
Centroid Distance: GFS had larger centroid distances than the GFSX, though differences are insignificant
Angle Difference: Largest angle differences in May
No statistically significant differences between the GFS and the GFSX
GFSX had slightly larger angle differences in early forecast hours, then the GFS had slightly larger angle differences for later forecast hours
Interest Values: Virtually no statistically significant differences between the GFS and GFSX
Much lower interest values in May than in June and July (possibly because intensity forecasts were worse and angle differences were largest)
Transition period in mid-June from lower to higher interest values
41
Slide 42
Findings 7/7 – Hits, Misses, and False Alarms
Hits: Fewest hits in May (perhaps due to fewer objects)
GFSX has more hits than the GFS for the 12-36h forecasts
Misses: GFS clearly has more misses than the GFSX, perhaps related to the fact that MODE identified more objects from the GFS than the GFSX
Most of the misses were in June and July (possibly due to more objects being identified in June and July than in May)
False Alarms: GFS has more false alarms than GFSX in the later forecast hours (108-156h)
42
Slide 43
Thanks!
Comments/Questions?
43
Slide 44
Case Studies
44
Slide 45
Extra Slides
45
Slide 46
Grid 130
46
Slide 47
Fuzzy Logic Interest Value Computation
47
Total interest: T = [sum over i of (w_i * I_i)] / [sum over i of w_i], a weighted average of the individual attribute interest values I_i
When determining if two objects are related, weights w_i are assigned to each attribute to represent an empirical judgment regarding the relative importance of the various attributes
// Fuzzy engine weights
// Attributes considered in determining matches
weight
= {
centroid_dist = 2.0;
boundary_dist = 4.0;
convex_hull_dist = 0.0;
angle_diff = 1.0;
area_ratio = 2.0; // default is 1
int_area_ratio = 2.0;
complexity_ratio = 0.0;
inten_perc_ratio = 0.0;
inten_perc_value = 50;
}
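A minimal sketch of how these weights combine per-attribute interest values into a total interest (the per-attribute interest values below are invented; zero-weight attributes simply drop out of the weighted average):

```python
# Weighted-average combination of attribute interest values, using the
# fuzzy-engine weights from the configuration snippet above. The
# per-attribute interest values are hypothetical, each in [0, 1].
weights = {
    "centroid_dist": 2.0, "boundary_dist": 4.0, "convex_hull_dist": 0.0,
    "angle_diff": 1.0, "area_ratio": 2.0, "int_area_ratio": 2.0,
    "complexity_ratio": 0.0, "inten_perc_ratio": 0.0,
}
interest = {
    "centroid_dist": 0.9, "boundary_dist": 0.8, "convex_hull_dist": 0.5,
    "angle_diff": 0.7, "area_ratio": 0.6, "int_area_ratio": 0.95,
    "complexity_ratio": 0.2, "inten_perc_ratio": 0.4,
}

# Total interest: weighted average of the attribute interest values.
total = sum(weights[a] * interest[a] for a in weights) / sum(weights.values())
```

Note the actual MET fuzzy engine also applies confidence functions to each attribute; this sketch shows only the weighted-average step.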
In configuration file
inten_perc_value selects which percentile intensity is compared for pairs of objects (median is the default)
Slide 48
GFSX: May 6, 2015 Severe Wx Outbreak
12-h forecast valid 00Z 5/7/15
Smoothing radius = 2 grid squares
Intensity threshold >= 2000 J/kg
48
Slide 49
GFS: May 6, 2015 Severe Wx Outbreak
12-h forecast valid 00Z 5/7/15
Smoothing radius = 2 grid squares
Intensity threshold >= 2000 J/kg
49
Slide 50
Smoothing radius = 2 grid squares
Intensity threshold >= 2500 J/kg
GFSX: May 6, 2015 Severe Wx Outbreak
12-h forecast valid 00Z 5/7/15
50
Slide 51
GFS: May 6, 2015 Severe Wx Outbreak
12-h forecast valid 00Z 5/7/15
Smoothing radius = 2 grid squares
Intensity threshold >= 2500 J/kg
51
Slide 52
Example #3
52
Slide 53
GFSX: 156-h forecast valid 00Z 6/7/15
53
Slide 54
GFS: 156-h forecast valid 00Z 6/7/15
54
Slide 55
Example #4
55
Slide 56
GFSX: 84-h forecast valid 00Z 6/4/15
Threshold >= 2500 J/kg
56
Slide 57
GFSX: 84-h forecast valid 00Z 6/4/15
Threshold >= 2000 J/kg
* CHANGE
57
Slide 58
Differences in 10th Percentile of Intensities
GFS
GFSX
Single Object Statistics
Simple, unmatched & matched objects
GFS and GFSX underestimate 10th percentile intensities by about 25 J/kg
* Median plotted
58
Slide 59
Differences in 25th Percentile of Intensities
GFS
GFSX
Single Object Statistics
Simple, unmatched & matched objects
GFS and GFSX underestimate 25th percentile intensities by about 50 J/kg
* Median plotted
59
Slide 60
Differences in 50th Percentile of Intensities
GFS
GFSX
Single Object Statistics
Simple, unmatched & matched objects
GFS and GFSX underestimate 50th percentile intensities by about 75 J/kg
* Median plotted
60
Slide 61
Differences in 75th Percentile of Intensities
GFS
GFSX
Single Object Statistics
Simple, unmatched & matched objects
GFS and GFSX underestimate 75th percentile intensities by about 100 J/kg
* Median plotted
61
Slide 62
GFSX 12-h forecast valid 00Z 6/30/15
>= 2000 J/kg
62
Slide 63
>= 2500 J/kg
GFSX 12-h forecast valid 00Z 6/30/15
63