Chapter 10: Re-expressing Data - PowerPoint Presentation

Uploaded On 2016-04-11

Presentation Transcript

Slide1

Chapter 10: Re-expressing Data

by: Sai Machineni, Hang Ha

AP STATISTICS

Slide2

Re-express Data

We re-express data by taking the logarithm, the square root, the reciprocal, or some other mathematical operation on all values in the data set.

Slide3

Goals of Re-expression

Goal 1:

Make the distribution of a variable more symmetric:

A symmetric distribution is easier to summarize: we can use the mean and SD. If the distribution is also unimodal, we may be able to use the 68-95-99.7 rule.
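As a rough illustration of Goal 1, the sketch below re-expresses a small right-skewed sample with the square root, the logarithm, and the negative reciprocal, then compares a simple skewness measure for each version. The sample values and the helper name `skewness` are made up for illustration and do not come from the slides.

```python
import math
import statistics

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical right-skewed sample (all positive, so log and 1/x are defined).
sample = [1, 2, 2, 3, 3, 4, 5, 7, 10, 15, 25, 60]

re_expressions = {
    "raw": lambda v: v,
    "sqrt": math.sqrt,
    "log10": math.log10,
    "-1/x": lambda v: -1 / v,
}

# Skewness of the sample under each re-expression.
skews = {name: skewness([f(v) for v in sample]) for name, f in re_expressions.items()}
for name, s in skews.items():
    print(f"{name:>6}: skewness = {s:+.2f}")
```

The re-expression whose skewness is closest to 0 gives the most symmetric version of this particular sample; a histogram is still the better check.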

Goal 2:

Make the spread of several groups more alike:

Groups that share a common spread are easier to compare.

Slide4

Goals of Re-expression

Goal 3:

Make the form of a scatterplot more nearly linear:

The great value of re-expression here is that we can fit a linear model once the relationship is straight.
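A minimal sketch of Goal 3, assuming made-up data that grow by a fixed percentage per step: the correlation of x with y falls short of 1, while the correlation of x with log(y) is essentially 1, so a line describes the re-expressed data. The data and the helper `correlation` are illustrative assumptions, and r is only a rough summary of straightness.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y grows by 50% for each unit increase in x.
xs = list(range(10))
ys = [5 * 1.5 ** x for x in xs]

r_raw = correlation(xs, ys)
r_log = correlation(xs, [math.log10(y) for y in ys])
print(f"r for (x, y):     {r_raw:.3f}")
print(f"r for (x, log y): {r_log:.3f}")
```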

Goal 4:

Make the scatter in a scatterplot spread out evenly rather than following a fan shape.

Slide5

Ladder of Powers

The Ladder of Powers places in order the effects that many re-expressions have on the data.

Slide6

Attack of the Logarithms

Use when none of the data values is zero or negative

Try taking the logs of both the x-variable and the y-variable.

Then re-express the data using some combination of x or log(x) with y or log(y).

Model Name    X-axis    Y-axis    Comment
Exponential   x         log(y)    This model is the “0” power in the ladder; useful when values grow by a percent increase.
Logarithmic   log(x)    y         Useful when a scatterplot descends rapidly at the left.
Power         log(x)    log(y)    Useful when one ladder power is too big and the next is too small.

Slide7

Why Not a Curve?

We can find “curves of best fit” using the same approach that led us to linear models.

For many reasons, it is usually better to re-express the data to straighten the plot.

Slide8

What Can Go Wrong?

Don’t expect your model to be perfect

Don’t choose a model based on R^2 alone

Beware of multiple modes

Watch out for scatterplots that turn around

Watch out for negative data values

Watch out for data far from 1

Don’t stray too far from the ladder

Slide9

Example 1 (#27)

Problem:

Researchers studying how a car’s gas mileage varies with its speed drove a compact car 200 miles at various speeds on a test track. Their data are shown in the table.

Speed (mph) 35 40 45 50 55 60 65 70 75

Miles per gal 25.9 27.7 28.5 29.5 29.2 27.4 26.4 24.2 22.8

Create a linear model for this relationship and report any concerns you may have about the model.
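Before fitting a line, it helps to look at how mileage changes from one speed to the next. The Python sketch below uses the values from the table; the variable names are illustrative. It finds where the changes switch from increases to decreases.

```python
speeds = [35, 40, 45, 50, 55, 60, 65, 70, 75]  # mph
mpg = [25.9, 27.7, 28.5, 29.5, 29.2, 27.4, 26.4, 24.2, 22.8]

# Change in mileage between consecutive speeds.
changes = [round(b - a, 1) for a, b in zip(mpg, mpg[1:])]
print(changes)

# Speed at which mileage is highest in the table.
peak_speed = speeds[mpg.index(max(mpg))]
print("peak at", peak_speed, "mph")

# True if mileage both rises and falls over the range, i.e. the plot turns around.
turns_around = any(c > 0 for c in changes) and any(c < 0 for c in changes)
print("turns around:", turns_around)
```

The changes are positive up to 50 mph and negative after it.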

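Answer: The mileage rises until about 50 mph and then falls, so the scatterplot turns around. None of the re-expressions in this chapter can straighten such a relationship, so a linear model is not appropriate.

Slide10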

Example 2 (#31)

Problem: It’s often difficult to find the ideal model for situations in which the data are strongly curved. The table below shows the rapid growth in the number of academic journals published on the Internet during the last decade.

Year (L1)                  1991  1992  1993  1994  1995  1996  1997

Number of Journals (L2)      27    36    45   181   306  1093  2459

Create a good model to describe this growth.

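Answer: log(journals) = -686.76 + 0.346(year), where log is the base-10 logarithm and year is the calendar year. With 1991 coded as year 0, the slope is the same and only the intercept changes.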
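The reported model can be checked outside the calculator. The sketch below fits a least-squares line to log10(journals) against the calendar year in plain Python. `fit_line` is an illustrative helper, not something from the slides, and the base-10 logarithm is assumed because that is what the calculator's log key computes.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

years = [1991, 1992, 1993, 1994, 1995, 1996, 1997]
journals = [27, 36, 45, 181, 306, 1093, 2459]
log_journals = [math.log10(j) for j in journals]

a, b = fit_line(years, log_journals)
print(f"log(journals) = {a:.2f} + {b:.3f}(year)")

# Residuals of the re-expressed fit: the quantity a residual plot would show.
residuals = [y - (a + b * x) for x, y in zip(years, log_journals)]
print([round(r, 3) for r in residuals])
```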

Step 1:

Enter the data: STAT > Edit, with L1 = Year (coded 0-6, so 1991 is year 0) and L2 = Journals.

Step 2:

Fit a line to the raw data and check the residuals: STAT > CALC > LinReg(a+bx) L1, L2.

Step 3:

Start re-expressing: find the log of journals. On the calculator, type log(L2) STO L3 (this stores the logs in L3).

Step 4:

Check the scatterplot of the re-expressed data: change the STATPLOT specifications to Xlist: L1 and Ylist: L3, then use 9:ZoomStat.

Step 5:

Fit the re-expressed data: perform the regression for the log of journals vs. year with STAT > CALC > 8:LinReg(a+bx) L1, L3, Y1.

Step 6:

Check the residuals: in STAT PLOT, change Ylist to RESID.

Slide11

Example 2 Continued

Use your model to estimate the number of electronic journals in the year 2000.

To estimate the number of journals in 2000, remember that when entering the data we designated 1991 as year 0. That means we use 9 for the year 2000 and evaluate Y1(9). Because Y1 gives log(journals), the estimate is 10 raised to that value.

About 21,497 journals (10^Y1(9) ≈ 21497.04).
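The arithmetic behind this estimate can be sketched in Python. The fit below uses years coded 0 to 6 (1991 = 0), matching the calculator entry; `fit_line` is an illustrative helper. Because the model predicts log(journals), the prediction is converted back with a power of 10.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

coded_years = list(range(7))  # 0..6 stand for 1991..1997
journals = [27, 36, 45, 181, 306, 1093, 2459]
a, b = fit_line(coded_years, [math.log10(j) for j in journals])

x_2000 = 2000 - 1991             # coded year for 2000
log_estimate = a + b * x_2000    # predicted log10(journals)
estimate = 10 ** log_estimate    # back-transform to a journal count
print(f"predicted log: {log_estimate:.2f}, journals: {estimate:.0f}")
```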

Comment on your faith in this estimate.

This estimate may be too high. Growth was rapid throughout these years, but 2000 is three years past the last data point, and the model may not hold that far out.