Introduction Population – the entire group of concern

cheryl-pisano . @cheryl-pisano
Uploaded On 2018-02-27



Presentation Transcript

Slide1

Introduction

Population – the entire group of concern
Sample – only a part of the whole
Based on a sample, we’ll make a prediction about the population.

Bad sampling: convenience, bias, voluntary
Good sampling: simple random sample (SRS).
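As a rough illustration of a simple random sample, the sketch below draws 10 individuals from a population of 100 so that every subset of size 10 is equally likely. The population labels, the sample size, and the seed are hypothetical and are not taken from the slides.

```python
import random

# Hypothetical population: 100 individuals labeled 1..100 (not from the slides).
population = list(range(1, 101))

# A fixed seed only makes the illustration repeatable; real sampling would not fix it.
rng = random.Random(2018)

# random.sample draws without replacement, so no individual is chosen twice.
sample = rng.sample(population, k=10)
print(sorted(sample))
```

By contrast, taking the first 10 labels would be a convenience sample.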

Inferential Stats: making predictions or inferences about a population based on a sample

Slide2

Experiments

Observation – no attempt to influence
Experiment – deliberately imposes some treatment

Basic design principles:
Control the effects of lurking variables
Randomize which subject gets which treatment
Use large sample size to reduce chance variation
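The "randomize" principle can be sketched as follows: shuffle the subjects, then deal them out to the treatments in turn. The function name `assign_treatments`, the subject labels, and the two treatment names are assumptions made for illustration only.

```python
import random

def assign_treatments(subjects, treatments, seed=None):
    """Randomly split subjects into groups of (nearly) equal size, one per treatment."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance, not the experimenter, decides the order
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical subjects and treatments (not from the slides).
subjects = [f"subject_{i}" for i in range(1, 21)]
groups = assign_treatments(subjects, ["treatment", "control"], seed=1)
for name, members in groups.items():
    print(name, len(members))
```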

Statistical Significance: An observed effect so big that it would rarely occur just by chance.

Slide3

Picturing Distributions with Graphs

Individuals
objects described by data
can be

Variables
characteristic of individuals of particular interest
different values possible for different people

What makes up any set of data?

Slide4

Two kinds of variables

Categorical (Qualitative)

describes an individual by category or quality.

examples like

Numerical (Quantitative)

describes an individual by number or quantity.

discrete for variables that are

continuous for variables that are

examples like

Slide5

Describing Categorical Variables

Tables summarize the data set by
listing possible categories.
giving the number of objects in each category.
or showing the count as a percentage.
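A small sketch of such a table, using the college counts from Example 1 in this transcript: each percentage is the category count divided by the total, times 100. The variable names are illustrative.

```python
# Counts taken from Example 1 (80 students in total).
counts = {
    "Liberal Arts": 17,
    "Education": 4,
    "Health Professions": 32,
    "Science & Technology": 23,
    "Undeclared": 4,
}

total = sum(counts.values())
percents = {college: 100 * n / total for college, n in counts.items()}

for college, n in counts.items():
    print(f"{college:<22}{n:>4}{percents[college]:>8.2f}%")
print(f"{'Total':<22}{total:>4}{sum(percents.values()):>8.2f}%")
```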

Picture the distribution of a cat. var. with
Pie charts
Bar graphs

Slide6

Pie Charts

Whole is split into appropriate pieces.

Slide7

Bar Graph

Horizontal line keeps track of categorical values.

Vertical bars at each value keep track of # or %.

[Bar graph: categories A to F on the horizontal axis; vertical axis in counts (#) or percent (%)]

Slide8

Example 1

80 AASU students in an Elem. Stats class come from one of four colleges (S & T, Edu, Health, Lib. Arts). The breakdown of these 80 students is given below.

College                 Count   Percent
Liberal Arts            17
Education               4
Health Professions      32
Science & Technology    23
Undeclared              4
Total                   80

Slide9

Ex1 - Pie Chart

College       Count   Percent
Lib Arts      17      21.25%
Edu           4       5%
Health        32      40%
S & T         23      28.75%
Undeclared    4       5%
Total         80      100%

Slide10

Ex1 – Bar Graph

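[Bar graph: colleges LA, E, H, ST, U on the horizontal axis; percent (%) on the vertical axis, ticks at 10, 20, 30]

College       Count   Percent
Lib Arts      17      21.25%
Edu           4       5%
Health        32      40%
S & T         23      28.75%
Undeclared    4       5%
Total         80      100%

Slide11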

Describing Quantitative Variables

Tables summarize the data set by
listing possible intervals (ranges, classes).
giving the number of individuals in each class
or showing the number as a percentage.
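A minimal sketch of counting observations into equal-width classes. Two things here are assumptions rather than statements from the slides: the ages are hypothetical, and a boundary value such as 30 is placed in the class that starts at 30 (so each class includes its left endpoint and excludes its right one).

```python
def class_counts(values, start, width, n_classes):
    """Count values into classes [start, start+width), [start+width, start+2*width), ..."""
    counts = [0] * n_classes
    for v in values:
        idx = int((v - start) // width)
        if 0 <= idx < n_classes:  # values outside all classes are ignored
            counts[idx] += 1
    return counts

# Hypothetical ages (not the faculty data from the slides).
ages = [24, 30, 37, 41, 45, 52, 58, 66]
counts = class_counts(ages, start=20, width=10, n_classes=5)

for i, n in enumerate(counts):
    low = 20 + 10 * i
    print(f"{low}-{low + 10}: {n} ({100 * n / len(ages):.1f}%)")
```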

Picture the distribution of a quant. var. with
Histogram (similar to bar graph but now vertical bars of neighboring classes touch)

Where one class ends, the next begins.

Slide12

Example 2

Consider the ages of the full-time faculty in the math dept. The breakdown of these 19 individuals is given in the table.

Age Class   Count   Percent
20-30       5       26.3%
30-40       3       15.8%
40-50       5       26.3%
50-60       4       21.1%
60-70       2       10.5%
Total       19      100%

[Histogram: age on the horizontal axis, ticks at 10, 30, 50, 70; percent (%) on the vertical axis, ticks at 10, 20, 30]

Slide13

Info from histograms

Helps to describe a distribution with
pattern (shape, center, spread)
deviations (outliers) from the rest of the data

Outliers could result from an unusual observation or a typo
For shape, look at symmetric vs. skewed

Slide14

Examples 3 and 4

[Histograms for Examples 3 and 4: percent (%) on the vertical axes]

Slide15

Example 4 without outliers

[Histograms for Example 4 with the outliers removed: percent (%) on the vertical axes]

Slide16

Describing Distributions with Numbers

Center: mean, median, mode
Spread: quartiles, standard deviation

There are better ways to describe a quantitative data set than by an estimation from a graph.

Slide17

Center: Mean

The mean of a data set is the arithmetic average of all the observations.

Given a data set $x_1, x_2, \ldots, x_n$: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$

Slide18

Mean – Example 1

Your test scores in a Stats Class are: 60, 75, 92, 80

Your mean score is: (60 + 75 + 92 + 80) / 4 = 76.75

Slide19

Mean – Example 2

Compare high temperatures in Savannah for July 2010 and July 2011.

July 2010 high temps: 83, 87, 84, …, 97, 100, 92

July 2011 high temps: 94, 91, 93, …, 97, 99, 99

Slide20

Center: Median

The median of a data set is the middle value of all the (ordered) observations.

Given a data set:

Slide21

Median – Examples 3/4

11 tests: 60, 77, 92, 80, 84, 93, 80, 95, 65, 66, 75

Ordered data set: 60, 65, 66, 75, 77, 80, 80, 84, 92, 93, 95
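The median of both data sets in this example (the 11 test scores and the 10 dice rolls) can be checked with a short function. It uses the usual convention, assumed here, that an even number of observations takes the average of the two middle values.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

tests = [60, 77, 92, 80, 84, 93, 80, 95, 65, 66, 75]
dice = [2, 4, 5, 5, 6, 7, 7, 8, 9, 10]

print(median(tests))  # 80  (6th of 11 ordered values)
print(median(dice))   # 6.5 (average of the 5th and 6th values, 6 and 7)
```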

10 dice rolls: 2, 4, 5, 5, 6, 7, 7, 8, 9, 10

Slide22

Center: Mode

The mode of a data set is the value that appears the most.

Tests data set: 60, 65, 66, 75, 77, 80, 80, 84, 92, 93, 95

Dice rolls: 2, 4, 5, 5, 6, 7, 7, 8, 9, 10
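For the two data sets above, the most frequent values can be found with Python's `statistics.multimode` (available from Python 3.8), which returns every value tied for the highest count. This is only a quick check, not part of the original slides.

```python
import statistics

tests = [60, 65, 66, 75, 77, 80, 80, 84, 92, 93, 95]
dice = [2, 4, 5, 5, 6, 7, 7, 8, 9, 10]

print(statistics.multimode(tests))  # [80]
print(statistics.multimode(dice))   # [5, 7]  (two values tie, each appearing twice)
```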

2010 July High Temps mode:

2011 July High Temps mode:

Slide23

Spread: Quartiles

A measure of center is not useful by itself
Are other observations close or far from center?

Take an ordered data set and find: M, Q1, Q3, IQR = Q3 - Q1

Summary of data in the “Five-Number Summary”: Min, Q1, M, Q3, Max

Slide24

Quartiles – Example 5

11 tests: 60, 65, 66, 75, 77, 80, 80, 84, 92, 93, 95

5-num-sum:
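One way to compute a five-number summary for the 11 test scores is sketched below. The slides do not say which quartile convention they use; this sketch assumes Q1 and Q3 are the medians of the lower and upper halves, leaving the overall median out of both halves when the number of observations is odd. Other conventions can give slightly different quartiles.

```python
def median(ordered):
    """Median of an already ordered list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention (assumed)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # observations below the median position
    upper = ordered[half + (n % 2):]  # observations above the median position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

tests = [60, 65, 66, 75, 77, 80, 80, 84, 92, 93, 95]
low, q1, m, q3, high = five_number_summary(tests)
print(low, q1, m, q3, high)
print("IQR =", q3 - q1)
```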

Visualize 5-num-sum with a boxplot.

Draw rectangle with ends at Q1 and Q3.
Draw line in the box for the median.
Draw lines to the last observations within 1.5 × IQR of the quartiles.
Observations outside 1.5 × IQR of the quartiles are suspected outliers.

Slide25

Boxplot – Example 6

5-Num-Sum: 60, ____, 80, ____, 95

Draw rectangle with ends at Q1 and Q3
Draw line in the box for the median
Draw lines to last observations within 1.5 × IQR of the quartiles
Observations outside 1.5 × IQR of the quartiles are suspected outliers
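The 1.5 × IQR rule in the last two steps can be sketched as a pair of "fences". The example applies it to the July 2010 and July 2011 five-number summaries given in Example 7 of this transcript; the function name `outlier_fences` is an illustrative choice.

```python
def outlier_fences(q1, q3):
    """Return (lower, upper) fences; observations beyond them are suspected outliers."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Five-number summaries (min, Q1, median, Q3, max) from Example 7.
summaries = {
    "July 2010": (83, 92, 94, 97, 102),
    "July 2011": (84, 91, 95, 98, 99),
}

for label, (low, q1, med, q3, high) in summaries.items():
    lower, upper = outlier_fences(q1, q3)
    print(label, "fences:", lower, upper,
          "| min suspected:", low < lower, "| max suspected:", high > upper)
```

With these summaries, only the July 2010 minimum of 83 falls outside its fences (below 84.5).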

[Boxplot axis: 50 to 100, ticks every 10]

Slide26

Boxplot – Example 7

July 2010 5-Num-Sum: 83, 92, 94, 97, 102
2010 IQR = 97 - 92 = 5

July 2011 5-Num-Sum: 84, 91, 95, 98, 99
2011 IQR = 98 - 91 = 7

[Boxplots for 2010 and 2011 on a common axis from 80 to 105, ticks every 5]

Slide27

Spread: Standard Deviation

A more common measure of spread (used in conjunction with the mean) is the standard deviation.

A single deviation from the mean looks like $x_i - \bar{x}$

For every value $x_i$ in a data set, deviations are either positive, negative or zero.

Finding an average of those will be trouble, since when you add the deviations together, you’ll get 0.

Example 1 data: 60, 75, 92, 80

Slide28

To deal with this “adding to zero”, we get rid of any negative terms by squaring each deviation.

A single squared deviation from the mean looks like: $(x_i - \bar{x})^2$

The average of the squared deviations is called the variance: $s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$

n-1 is called the degrees of freedom, since knowledge of the first (n-1) deviations will automatically set the last one.

Slide29

The standard deviation is the square root of the variance.
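The whole calculation for the Example 1 scores (60, 75, 92, 80) can be traced step by step in a few lines. This is a sketch for checking hand work; the variable names are illustrative.

```python
import math

scores = [60, 75, 92, 80]
n = len(scores)

mean = sum(scores) / n                   # 76.75
deviations = [x - mean for x in scores]  # these add up to 0
squared = [d ** 2 for d in deviations]

variance = sum(squared) / (n - 1)        # divide by the degrees of freedom, n - 1
std_dev = math.sqrt(variance)

for x, d, sq in zip(scores, deviations, squared):
    print(x, d, sq)
print("sum of deviations:", sum(deviations))
print("variance:", round(variance, 2), "std dev:", round(std_dev, 2))
```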

Observations   Deviations   Squared Dev
60
75
92
80

mean = 76.75

Slide30

When to use what?

For skewed data:

For (nearly) symmetric data:

Outliers have a big impact on mean and std. dev.

Consider two data sets:

Set 1: 1, 1, 3, 5, 10

Set 2: 1, 1, 3, 5, 70
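A quick comparison of the two sets, sketched with Python's standard `statistics` module, shows how replacing 10 with 70 affects each measure. `statistics.stdev` is the sample standard deviation (dividing by n - 1).

```python
import statistics

set1 = [1, 1, 3, 5, 10]
set2 = [1, 1, 3, 5, 70]

results = {}
for name, data in (("Set 1", set1), ("Set 2", set2)):
    results[name] = (
        statistics.mean(data),
        statistics.median(data),
        statistics.stdev(data),
    )
    mean, med, sd = results[name]
    print(f"{name}: mean={mean}, median={med}, std dev={sd:.2f}")
```

The mean moves from 4 to 16 and the standard deviation grows from about 3.74 to about 30.23, while the median stays at 3 in both sets.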