Week 2 Lecture 1 Chapter 3. Displaying and Summarizing Quantitative Data


tatiana-dople
Uploaded On 2018-03-23



Presentation Transcript

Slide1

Week 2, Lecture 1

Chapter 3. Displaying and Summarizing Quantitative Data

Slide2

Graphical Displays of Quantitative Data

Histogram

Stem-and-leaf plot

Boxplot

Slide3

Tim Horton’s Example

Below is a snapshot of the nutritional information for all donuts at Tim Horton’s.

These are real data and can be reproduced from:

http://www.timhortons.com/ca/en/menu/nutrition-calculator.php#

Slide4

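Tim Horton’s Donuts Data

ID | Donut | Type of Donut | Calories | Fat | Protein | Carbs | Fiber | Sugar
---|-------|---------------|----------|-----|---------|-------|-------|------
1 | Sugar Loop Donut | Yeast | 180 | 6 | 4 | 28 | 1 | 8
2 | Maple Dip Donut | Yeast | 190 | 6 | 4 | 31 | 1 | 11
3 | Honey Dip Donut | Yeast | 190 | 6 | 4 | 31 | 1 | 11
4 | Chocolate Dip Donut | Yeast | 190 | 6 | 4 | 31 | 1 | 10
5 | Maple Glazed Donut | Yeast | 210 | 8 | 4 | 32 | 1 | 13
6 | Vanilla Dip with Coloured Sprinkles | Yeast | 250 | 6 | 4 | 46 | 1 | 24
7 | Apple Fritter Donut | Yeast | 290 | 8 | 7 | 48 | 2 | 15
8 | Caramel Apple Fritter Donut | Yeast | 300 | 8 | 7 | 52 | 2 | 17
9 | Old Fashion Plain Donut | Cake | 210 | 10 | 3 | 25 | 1 | 8
10 | Cinnamon Sugar Donut | Cake | 220 | 10 | 3 | 28 | 1 | 10
11 | Old Fashion Dip Donut | Cake | 250 | 10 | 3 | 36 | 1 | 17
12 | Sour Cream Cinnamon Donut | Cake | 270 | 16 | 3 | 29 | 1 | 12
13 | Old Fashion Glazed Donut | Cake | 270 | 10 | 3 | 41 | 1 | 23
14 | Double Chocolate Donut | Cake | 270 | 14 | 4 | 35 | 1 | 16
15 | Birthday Cake Donut | Cake | 280 | 11 | 3 | 42 | 1 | 24
16 | Chocolate Glazed Donut | Cake | 280 | 14 | 4 | 37 | 1 | 19
17 | Peanut Crunch Donut | Cake | 300 | 14 | 5 | 39 | 1 | 20
18 | Sour Cream Glazed Donut | Cake | 340 | 16 | 3 | 46 | 1 | 29
19 | Pumpkin Spice Donut | Cake | 250 | 9 | 4 | 41 | 1 | 23
20 | Strawberry Donut | Filled | 200 | 5 | 5 | 34 | 1 | 14
21 | Blueberry Donut | Filled | 200 | 5 | 4 | 34 | 1 | 12
22 | Canadian Maple Donut | Filled | 210 | 6 | 5 | 37 | 1 | 16
23 | Boston Cream Donut | Filled | 220 | 6 | 5 | 37 | 1 | 15
24 | Strawberry Bloom Donut | Filled | 230 | 7 | 4 | 39 | 1 | 18
25 | Banana Split Donut | Filled | 230 | 5 | 4 | 40 | 1 | 18
26 | Strawberry Shortcake Donut | Filled | 250 | 8 | 5 | 40 | 1 | 15
27 | Stanley Cup Donut | Filled | 270 | 6 | 5 | 48 | 1 | 24
28 | Strawberry Vanilla Donut | Filled | 270 | 5 | 5 | 52 | 1 | 31
29 | Oreo Donut | Filled | 400 | 15 | 5 | 61 | 1 | 35
30 | Honey Cruller Donut | Other | 310 | 18 | 2 | 37 | 0 | 22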

Variables in this data set:

Categorical: Type of Donut (Yeast, Cake, Filled, Other)

Quantitative: Calories, Fat, Protein, Carbs, Fiber, Sugar

Slide5

Describing a Distribution

The pattern of variation of a variable is called its “distribution”.

In any graph of data, look for overall pattern and for striking deviations from that pattern.

An important kind of deviation is an outlier – an individual value that falls outside the overall pattern.

Slide6

Describing a Distribution

The overall pattern of a distribution can be described by its shape, centre, and spread.

Shape:

bell-shaped; bell-shaped and symmetric; symmetric; skewed

Investigate if the distribution has one major peak/mode (unimodal), or several (bimodal, multimodal)

Investigate if the distribution is symmetric or skewed in one direction (right or left)

Centre:

Mean (average of the data values)

Median (midpoint of the data)

Mode (most frequent value in the data)

Spread:

Range

Variance

Standard Deviation

Interquartile Range
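As a rough illustration, the centre and spread measures listed above can be computed for the Calories column with Python's standard `statistics` module. This is a sketch and not part of the original slides: the `calories` list is typed in from the donut table, and the quartiles use the module's default method, which may differ slightly from what StatCrunch reports.

```python
import statistics

# Calories for the 30 donuts, typed in from the donut data table
# (yeast, cake, filled, other, in table order).
calories = [180, 190, 190, 190, 210, 250, 290, 300,
            210, 220, 250, 270, 270, 270, 280, 280, 300, 340, 250,
            200, 200, 210, 220, 230, 230, 250, 270, 270, 400,
            310]

# Centre
mean_cal = statistics.mean(calories)
median_cal = statistics.median(calories)
modes_cal = statistics.multimode(calories)  # a list, in case of ties

# Spread
range_cal = max(calories) - min(calories)
variance_cal = statistics.variance(calories)  # sample variance (divides by n - 1)
sd_cal = statistics.stdev(calories)           # sample standard deviation
q1, _, q3 = statistics.quantiles(calories, n=4)  # default "exclusive" method
iqr_cal = q3 - q1

print("mean:", mean_cal, "median:", median_cal, "mode(s):", modes_cal)
print("range:", range_cal, "variance:", round(variance_cal, 2),
      "sd:", round(sd_cal, 2), "IQR:", iqr_cal)
```

Quartile conventions vary between software packages, so the interquartile range printed here should be read as one reasonable value rather than the only correct one.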

Slide7

Histogram of Calories

What is the shape of the distribution of calories?

Are there any unusual values in this data? If yes, what are those values?

StatCrunch:

Graph>Histogram>Select Column(s)>Calories

Slide8

Histogram of Calories

What is the shape of the distribution of calories?

Right Skewed (Skewed to the Right)

Are there any unusual values in this data? If yes, what are those values?

Yes. One value falls away from the overall pattern, in the last bin. Its value is between 400 and 450.

Slide9

Histogram of Calories

A histogram displays the entire distribution.

It slices up all the possible values into equal-width bins, also called classes.

It counts the number of cases that fall into each bin (class).

In this example, Calories range from 180 to 400.

The first bin is from 150 to 200. This means that the donuts with at least 150 calories but fewer than 200 (not including 200) are counted in the first bin. So, 150 ≤ calories < 200.

There are 4 donuts with calories between 150 and 200.

The second bin is from 200 to (up to) 250.

How many donuts have at least 400 calories (400 or more)?
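The binning rule described above can be sketched in a few lines of Python. This is an illustration only: the `calories` list is typed in from the donut table, the helper `bin_counts` is a hypothetical name, and the bin edges beyond the first two are assumed to continue in steps of 50 up to 450, in line with the last bin mentioned on the earlier slide.

```python
# Calories for the 30 donuts, typed in from the donut data table.
calories = [180, 190, 190, 190, 210, 250, 290, 300,
            210, 220, 250, 270, 270, 270, 280, 280, 300, 340, 250,
            200, 200, 210, 220, 230, 230, 250, 270, 270, 400,
            310]

def bin_counts(values, first_edge, last_edge, width):
    """Count values in equal-width bins [lower, lower + width)."""
    counts = {}
    for lower in range(first_edge, last_edge, width):
        upper = lower + width
        # Left edge included, right edge excluded: lower <= value < upper.
        counts[(lower, upper)] = sum(1 for v in values if lower <= v < upper)
    return counts

# Assumed edges: 150, 200, ..., 450 (bin width 50).
counts = bin_counts(calories, first_edge=150, last_edge=450, width=50)

# A text version of the histogram, one row per bin.
for (lower, upper), count in counts.items():
    print(f"{lower} <= calories < {upper}: {'*' * count} ({count})")

# Number of donuts with 400 or more calories.
at_least_400 = sum(1 for c in calories if c >= 400)
print("donuts with 400 or more calories:", at_least_400)
```

Because each bin includes its left edge and excludes its right edge, a donut with exactly 200 calories is counted in the second bin, not the first.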

Slide10

Stem-and-leaf Plot of Calories

Variable: Calories

Leaf unit = 10

1 : 8999

2 : 001112233

2 : 555577777889

3 : 0014

3 :

4 : 0

A stem-and-leaf plot shows the raw data values in order (from the smallest to the largest data value).

Split each number into two parts: the stem and the leaf. The stem is the left part of the number (data value) and the leaf is the right part. The number of stems depends on the size of the data set (just as the number of bins does in a histogram).

Sometimes a value of a stem is repeated (stretched) in order to visualize data better.
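The construction described above, including the repeated (stretched) stems, can be sketched as follows. This is a minimal illustration, not StatCrunch's algorithm: the function name `stem_and_leaf` is hypothetical, the `calories` list is typed in from the donut table, and each stem is assumed to be split into a low half (leaves 0 to 4) and a high half (leaves 5 to 9).

```python
# Calories for the 30 donuts, typed in from the donut data table.
calories = [180, 190, 190, 190, 210, 250, 290, 300,
            210, 220, 250, 270, 270, 270, 280, 280, 300, 340, 250,
            200, 200, 210, 220, 230, 230, 250, 270, 270, 400,
            310]

def stem_and_leaf(values, leaf_unit=10):
    """Return the rows of a split-stem stem-and-leaf plot as strings."""
    rows = {}
    for scaled in sorted(v // leaf_unit for v in values):
        stem, leaf = divmod(scaled, 10)   # e.g. 180 -> 18 -> stem 1, leaf 8
        half = 0 if leaf < 5 else 1       # low half: 0-4, high half: 5-9
        rows.setdefault((stem, half), []).append(str(leaf))

    # Walk from the first non-empty row to the last, keeping empty rows
    # in between so that gaps in the data stay visible.
    first, last = min(rows), max(rows)
    lines = []
    stem, half = first
    while (stem, half) <= last:
        leaves = "".join(rows.get((stem, half), []))
        lines.append(f"{stem} : {leaves}".rstrip())
        stem, half = (stem, 1) if half == 0 else (stem + 1, 0)
    return lines

plot = stem_and_leaf(calories)
print("Leaf unit = 10")
print("\n".join(plot))
```

The empty row between 340 and 400 is kept on purpose: it shows the gap that separates the largest value from the rest of the data.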

StatCrunch:

Graph>stem and leaf>Select Column(s)>Calories

Slide11

Stem-and-leaf Plot of Calories

Variable: Calories

Leaf unit = 10

1 : 8999

2 : 001112233

2 : 555577777889

3 : 0014

3 :

4 : 0

A stem-and-leaf plot shows the raw data values in order.

What is the minimum value?

What is the maximum value?

What is the shape of the distribution?

Slide12

Stem-and-leaf Plot of Calories

Variable: Calories

Leaf unit = 10

1 : 8999

2 : 001112233

2 : 555577777889

3 : 0014

3 :

4 : 0

A stem-and-leaf plot shows the raw data values in order.

What is the minimum value?

18 x 10 (leaf unit) = 180

What is the maximum value?

40 x 10 (leaf unit) = 400

What is the shape of the distribution?

Right Skewed.

Tilt your head to the right to see the shape (from min to max values).

Slide13

Centre of a Distribution

Mean or Average: We sum all of the observations of the variable of interest and divide by the total number of cases:

Sample Mean = $\bar{y} = \dfrac{\sum y}{n}$

Note that the mean is influenced by extremely large or small (unusual) observations. The mean is not resistant to extreme values in the data; that is, it is sensitive to them.
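To make the sensitivity of the mean concrete, the sketch below computes the sample mean of Calories with and without the largest value. It is an illustration only: the `calories` list is typed in from the donut table and `sample_mean` is a hypothetical helper name.

```python
# Calories for the 30 donuts, typed in from the donut data table.
calories = [180, 190, 190, 190, 210, 250, 290, 300,
            210, 220, 250, 270, 270, 270, 280, 280, 300, 340, 250,
            200, 200, 210, 220, 230, 230, 250, 270, 270, 400,
            310]

def sample_mean(values):
    """Sum of the observations divided by the number of cases."""
    return sum(values) / len(values)

mean_all = sample_mean(calories)

# Drop the single largest observation and recompute.
without_largest = list(calories)
without_largest.remove(max(calories))
mean_without = sample_mean(without_largest)

print("mean with all 30 donuts:   ", round(mean_all, 2))
print("mean without largest value:", round(mean_without, 2))
print("shift caused by one donut: ", round(mean_all - mean_without, 2))
```

Removing one unusually large observation out of 30 moves the mean by several calories, which is what "not resistant" means in practice.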

Slide14

Centre of a Distribution

Median: The middle value in the sorted data; the 50th percentile.

For an odd number of observations n: the value at position (n + 1)/2 (the middle number).

For an even number of observations n: the average of the values at positions n/2 and n/2 + 1 (the average of the two middle numbers).

In our example, we have 30 donuts, so n is even. The median is the average of the 15th and the 16th ordered values: (250 + 250)/2 = 250.

The median is resistant (not sensitive) to values that are extremely large or small, because it depends on the order of the data values and not on how extreme the actual values are.

Note: 50th percentile means that about 50% of the data values are below the median and about 50% of the data values are above the median.

Slide15

Mean versus Median

In an approximately symmetric distribution, the mean and the median will be close to each other.

In a skewed distribution:

If Mean < Median, the data is left skewed.

If Mean > Median, the data is right skewed.

StatCrunch: stat>summary stats>Select Column(s)>Calories

The mean of Calories is 251 and the median is 250.

We can say that the data are approximately symmetric.

Or one can say that the data is slightly right skewed (with the support of a graphical display).
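The comparison above can be reproduced with a short sketch that finds the median by position in the sorted data (the middle value for an odd count, the average of the two middle values for an even count) and then compares it with the mean. The `calories` list is typed in from the donut table, and `median_by_position` is a hypothetical helper name.

```python
# Calories for the 30 donuts, typed in from the donut data table.
calories = [180, 190, 190, 190, 210, 250, 290, 300,
            210, 220, 250, 270, 270, 270, 280, 280, 300, 340, 250,
            200, 200, 210, 220, 230, 230, 250, 270, 270, 400,
            310]

def median_by_position(values):
    """Median from the sorted data, using 1-based positions."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the value at position (n + 1) / 2.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the values at positions n/2 and n/2 + 1.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

mean_cal = sum(calories) / len(calories)
median_cal = median_by_position(calories)

if mean_cal > median_cal:
    hint = "mean > median: consistent with right skew"
elif mean_cal < median_cal:
    hint = "mean < median: consistent with left skew"
else:
    hint = "mean = median: consistent with symmetry"

print("mean:", mean_cal, "median:", median_cal)
print(hint, f"(difference of {mean_cal - median_cal:g})")
```

The difference here is small, so the comparison alone only hints at the direction of skew; a graphical display is still needed to judge the shape.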

Slide16

Mean versus Median

We report the mean for unimodal, symmetric distributions.

E.g., Students’ marks (approximately symmetric)

We report the median for skewed distributions.

E.g., Calories example.
