Slide1
HISTOGRAMS
Representing Data
Slide2
Why use a Histogram?
When there is a lot of data
When the data is continuous, e.g. mass, height, volume, time
When the data is presented in a Grouped Frequency Distribution
Often in groups or classes that are UNEQUAL in width
Slide3
Continuous data
NO GAPS between Bars
Histograms look like this...
Slide4
Bars may be different in width
Determined by the Grouped Frequency Distribution
Slide5
AREA is proportional to FREQUENCY
NOT height, because of UNEQUAL classes!
So we use
FREQUENCY DENSITY = Frequency / Class width
Slide6
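As a short check of why the bar height must be frequency density, the area of each bar can be worked out as below. The symbols f (frequency of a class) and w (its class width) are introduced here for illustration and do not appear on the slides.

```latex
\[
\text{area} = \text{height} \times \text{width}
            = \frac{f}{w} \times w
            = f
\]
```

So with frequency density on the vertical axis, each bar's area equals its frequency, whatever the class width.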
Grouped Frequency Distribution
Speed, v (km/h)    Frequency
0 < v ≤ 40         80
40 < v ≤ 50        15
50 < v ≤ 60        25
60 < v ≤ 90        90
90 < v ≤ 110       30
Classes: these classes are well defined, there are no gaps!
Slide7
Drawing
Sensible Scales
Bases of rectangles correctly aligned
Plot the Class Boundaries carefully
Heights of rectangles need to be correct: use Frequency Density
Slide8
Frequency Densities
Speed, v (km/h)    Frequency    Class width    Frequency Density
0 < v ≤ 40         80           40             2.0
40 < v ≤ 50        15           10             1.5
50 < v ≤ 60        25           10             2.5
60 < v ≤ 90        90           30             3.0
90 < v ≤ 110       30           20             1.5
Slide9
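The frequency densities for the speed table above can be reproduced with a few lines of Python. This is only a sketch: the names `speed_classes` and `frequency_density` are made up for this example, and the class boundaries and frequencies are copied from the table.

```python
# Speed classes from the table, as (lower boundary, upper boundary, frequency).
speed_classes = [
    (0, 40, 80),
    (40, 50, 15),
    (50, 60, 25),
    (60, 90, 90),
    (90, 110, 30),
]


def frequency_density(lower, upper, frequency):
    """Return frequency divided by class width."""
    return frequency / (upper - lower)


densities = [frequency_density(lo, hi, f) for lo, hi, f in speed_classes]
print(densities)  # [2.0, 1.5, 2.5, 3.0, 1.5]
```

Multiplying each density back by its class width returns the original frequency, which is the "Frequency = Width x Height" check used on the histogram.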
[Histogram: Freq Dens (vertical axis, 0 to 3.0) against Speed (km/h) (horizontal axis, 0 to 120)]
Frequency = Width x Height
Frequency = 40 x 2.0 = 80
Slide10
Grouped Frequency Distribution
Time taken (nearest minute)    Freq
5-9                            14
10-19                          9
20-29                          18
30-39                          3
40-59                          5
Classes: GAPS! Need to adjust to Continuous
Speed, v (km/h)    Frequency
0 < v ≤ 40         80
40 < v ≤ 50        15
50 < v ≤ 60        25
60 < v ≤ 90        90
90 < v ≤ 110       30
Classes: No gaps. Ready to graph
Slide11
Adjusting Classes
Time taken (nearest minute)    Freq    Class boundaries    Class width
5-9                            14      4½ to 9½            5
10-19                          9       9½ to 19½           10
20-29                          18      19½ to 29½          10
30-39                          3       29½ to 39½          10
40-59                          5       39½ to 59½          20
Slide12
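The adjustment above amounts to extending each class by half the rounding unit at both ends. A minimal sketch, assuming times are recorded to the nearest whole minute (so the half unit is 0.5); `class_boundaries` and its `precision` parameter are hypothetical names used only here:

```python
# Classes as recorded to the nearest minute: (lowest value, highest value).
rounded_classes = [(5, 9), (10, 19), (20, 29), (30, 39), (40, 59)]


def class_boundaries(low, high, precision=1):
    """Extend a rounded class by half the rounding unit on each side."""
    half = precision / 2
    return (low - half, high + half)


boundaries = [class_boundaries(low, high) for low, high in rounded_classes]
widths = [upper - lower for lower, upper in boundaries]

print(boundaries)  # [(4.5, 9.5), (9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 59.5)]
print(widths)      # [5.0, 10.0, 10.0, 10.0, 20.0]
```

After the adjustment each upper boundary equals the next lower boundary, so the gaps between classes disappear.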
Frequency Density
Time taken (nearest minute)    Freq    Class width    Frequency Density
5-9                            14      5              2.8
10-19                          9       10             0.9
20-29                          18      10             1.8
30-39                          3       10             0.3
40-59                          5       20             0.25
Slide13
Drawing
Sensible Scales
Bases correctly aligned
Plot the Class Boundaries
Heights correct: use Frequency Density
Slide14
[Histogram: Freq Dens (vertical axis, 0 to 3.0) against Time (Mins) (horizontal axis, 5 to 60, with 4.5, 9.5, 19.5, 29.5, 39.5, 49.5 and 59.5 marked)]
Slide15
Estimating a Frequency
Imagine we want to estimate the number of people with a time between 12 and 25 mins.
Because our classes are rounded to the nearest minute, we consider the interval from 11.5 to 25.5.
Slide16
[Histogram: Freq Dens against Time (Mins), with the interval from 11.5 to 25.5 marked]
Frequency = FD x Width
11.5 to 19.5: Frequency = 0.9 x 8 = 7.2
19.5 to 25.5: Frequency = 1.8 x 6 = 10.8
Total Frequency = 18
Slide17
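The estimate above can be sketched in Python by adding up frequency density times overlap width for each class. This assumes the times are spread evenly within each class; `time_classes` and `estimate_frequency` are illustrative names, with boundaries and frequencies taken from the adjusted time table.

```python
# Time classes as (lower boundary, upper boundary, frequency).
time_classes = [
    (4.5, 9.5, 14),
    (9.5, 19.5, 9),
    (19.5, 29.5, 18),
    (29.5, 39.5, 3),
    (39.5, 59.5, 5),
]


def estimate_frequency(classes, start, end):
    """Sum frequency density x overlap width over every class."""
    total = 0.0
    for lower, upper, frequency in classes:
        density = frequency / (upper - lower)
        # Width of the part of this class that lies inside [start, end].
        overlap = max(0.0, min(upper, end) - max(lower, start))
        total += density * overlap
    return total


# "Between 12 and 25 mins" to the nearest minute means 11.5 to 25.5.
estimate = estimate_frequency(time_classes, 11.5, 25.5)
print(round(estimate, 1))  # 18.0
```

Only the 10-19 and 20-29 classes overlap the interval, contributing 0.9 x 8 and 1.8 x 6 respectively.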
We can estimate the Mode
Time taken (nearest minute)    Freq    CF
5-9                            14      14
10-19                          9       23
20-29                          18      41
30-39                          3       44
40-59                          5       49
Mode is therefore in this Class
Slide18
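As a rough sketch, the CF column can be reproduced with running totals, and the tallest bar of the histogram can be picked out by frequency density. This assumes the common convention that, with unequal class widths, the modal class is the class with the greatest frequency density; `tallest_bar` is a hypothetical helper, not something from the slides.

```python
from itertools import accumulate

# Time classes as (label, lower boundary, upper boundary, frequency).
time_classes = [
    ("5-9", 4.5, 9.5, 14),
    ("10-19", 9.5, 19.5, 9),
    ("20-29", 19.5, 29.5, 18),
    ("30-39", 29.5, 39.5, 3),
    ("40-59", 39.5, 59.5, 5),
]

# Running totals of the frequencies reproduce the CF column.
cumulative = list(accumulate(f for _, _, _, f in time_classes))
print(cumulative)  # [14, 23, 41, 44, 49]


def tallest_bar(classes):
    """Return the label of the class with the greatest frequency density."""
    # Assumption: modal class = greatest frequency / class width.
    return max(classes, key=lambda c: c[3] / (c[2] - c[1]))[0]


print(tallest_bar(time_classes))  # 5-9
```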
[Histogram: Freq Dens against Time (Mins), with the Modal class labelled]
Slide19
…and the other one?
Simpler to plot
No adjustments required – class widths friendly
No ½ values
Estimation from the EXACT values given
No adjustment required
Estimate 15 to 56 would use 15 and 56!
Appear LESS OFTEN in the exam
Speed, v (km/h)    Frequency
0 < v ≤ 40         80
40 < v ≤ 50        15
50 < v ≤ 60        25
60 < v ≤ 90        90
90 < v ≤ 110       30
Slide20
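As a worked illustration of estimating straight from the exact values, the 15 to 56 km/h estimate for the speed table can be sketched as below. It uses the frequency densities 2.0, 1.5 and 2.5 of the first three speed classes and assumes speeds are spread evenly within each class.

```latex
\[
\underbrace{2.0 \times (40 - 15)}_{0 < v \le 40}
+ \underbrace{1.5 \times (50 - 40)}_{40 < v \le 50}
+ \underbrace{2.5 \times (56 - 50)}_{50 < v \le 60}
= 50 + 15 + 15 = 80
\]
```

No half-unit adjustment is needed because the class boundaries are already the exact values 0, 40, 50 and so on.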
Why use frequency density for the vertical axis of a Histogram?
If bar height shows frequency, unequal class widths can give a misleading picture of the data distribution
The vertical axis is therefore Frequency Density
Slide21
Example: Misprediction of Grade Point Average (GPA)
The following table displays the differences between predicted GPA and actual GPA.
Positive differences result when predicted GPA > actual GPA.
Class Interval     Frequency    Class width
-2.0 to < -0.4     23           1.6
-0.4 to < -0.2     55           0.2
-0.2 to < -0.1     97           0.1
-0.1 to < 0        210          0.1
0 to < 0.1         189          0.1
0.1 to < 0.2       139          0.1
0.2 to < 0.4       116          0.2
0.4 to < 2.0       171          1.6
The frequency histogram considerably exaggerates the incidence of overpredicted and underpredicted values.
The areas of the two most extreme rectangles are much too large!
[Frequency histogram: the lowest class is labelled 2.3% of data and the highest class 17.1% of data]
Slide22
Example: Density Histogram of Misreporting GPA
Class Interval     Frequency    Class width    Frequency Density
-2.0 to < -0.4     23           1.6            14
-0.4 to < -0.2     55           0.2            275
-0.2 to < -0.1     97           0.1            970
-0.1 to < 0        210          0.1            2100
0 to < 0.1         189          0.1            1890
0.1 to < 0.2       139          0.1            1390
0.2 to < 0.4       116          0.2            580
0.4 to < 2.0       171          1.6            107
Frequency = (rectangle height) x (class width) = area of rectangle
To avoid a misleading histogram like the one on the last slide, display the data with frequency density.
Slide23
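To put a number on the distortion, the sketch below compares how much of the drawn area the two extreme bars take up when bar height is frequency and when it is frequency density. The helper `area_shares` and its `use_density` flag are made up for this example; the class intervals and frequencies are copied from the GPA table.

```python
# GPA difference classes as (lower, upper, frequency), taken from the table.
gpa_classes = [
    (-2.0, -0.4, 23), (-0.4, -0.2, 55), (-0.2, -0.1, 97), (-0.1, 0.0, 210),
    (0.0, 0.1, 189), (0.1, 0.2, 139), (0.2, 0.4, 116), (0.4, 2.0, 171),
]


def area_shares(classes, use_density):
    """Return each bar's share of the total drawn area."""
    areas = []
    for lower, upper, frequency in classes:
        width = upper - lower
        height = frequency / width if use_density else frequency
        areas.append(height * width)
    total = sum(areas)
    return [area / total for area in areas]


by_frequency = area_shares(gpa_classes, use_density=False)
by_density = area_shares(gpa_classes, use_density=True)

# Share of the drawn area taken by the two extreme bars, as percentages.
print(round(100 * by_frequency[0], 1), round(100 * by_frequency[-1], 1))  # 9.0 67.0
print(round(100 * by_density[0], 1), round(100 * by_density[-1], 1))      # 2.3 17.1
```

With frequency as the height, the two extreme bars fill about 9% and 67% of the picture although they hold only 2.3% and 17.1% of the data. With frequency density, the area shares match the data shares.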
[Density histogram: vertical axis Frequency density x 10^-3]
Slide24
Chap 2-24
Principles of Excellent Graphs
The graph should not distort the data.
The graph should not contain unnecessary things (sometimes referred to as chart junk).
The scale on the vertical axis should begin at zero.
All axes should be properly labelled.
The graph should contain a title.
The simplest possible graph should be used for a given set of data.
Slide25
Graphical Errors: Chart Junk
Bad Presentation: Minimum Wage
1960: $1.00
1970: $1.60
1980: $3.10
1990: $3.80
Good Presentation: Minimum Wage
[Chart: $ (vertical axis, 0 to 4) against year (1960, 1970, 1980, 1990)]
Slide26
Graphical Errors: No Relative Basis
Bad Presentation: A’s received by students.
[Chart: Freq. (vertical axis, 0 to 300) by student group FD, UG, GR, SR]
Good Presentation: A’s received by students.
[Chart: % (vertical axis, 0% to 30%) by student group FD, UG, GR, SR]
FD = Foundation, UG = UG Dip, GR = Grad Dip, SR = Senior
Slide27
Graphical Errors: Compressing the Vertical Axis
Bad Presentation: Quarterly Sales
[Chart: $ (vertical axis, 0 to 200) by quarter Q1, Q2, Q3, Q4]
Good Presentation: Quarterly Sales
[Chart: $ (vertical axis, 0 to 50) by quarter Q1, Q2, Q3, Q4]
Slide28
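Graphical Errors: No Zero Point on the Vertical Axis
Graphing the first six months of sales
Bad Presentation: Monthly Sales
[Chart: $ (vertical axis marked 36, 39, 42, 45, with no zero point) by month J, F, M, A, M, J]
Good Presentations: Monthly Sales
[Chart: $ (vertical axis marked 0, 36, 39, 42, 45) by month J, F, M, A, M, J]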