
Chapter 9: Regression - PowerPoint Presentation

cheryl-pisano
Uploaded On 2016-04-11




Presentation Transcript

Slide1

Chapter 9: Regression

Alexander Swan & Rafey Alvi

Slide2

Residuals Grouping

No regression analysis is complete without a display of the residuals to check that the linear model is reasonable.

Residuals often reveal subtleties that were not clear from a plot of the original data.

Slide3

Residuals Grouping

Sometimes the subtleties we see are additional details that help confirm or refine our understanding.

Sometimes they reveal violations of the regression conditions that require our attention.

Slide4

Subsets

An important condition: all the data must come from the same group.

When we discover that there is more than one group in a regression, neither modeling the groups together nor modeling them apart is necessarily correct. We have to decide which approach makes more sense for the data.

Slide5

Subsets

Slide6

Extrapolation

Extrapolations are dubious because they require the additional, and very questionable, assumption that nothing about the relationship between x and y changes even at extreme values of x.
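As a rough illustration of that risk, the sketch below fits a least-squares line to the 1900-1950 marriage ages listed in Question 3 of this chapter and extrapolates it to 1995, a year for which the same exercise gives an observed value. The helper name fit_line and the choice of years are assumptions made for this example only; they are not taken from the slides.

```python
# Marriage ages for 1900-1950 (10-year intervals), as listed in Question 3.
early_years = [1900, 1910, 1920, 1930, 1940, 1950]
early_ages = [21.9, 21.6, 21.2, 21.3, 21.5, 20.3]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(early_years, early_ages)

# Extrapolate 45 years past the last year used in the fit.
predicted_1995 = intercept + slope * 1995
observed_1995 = 24.5  # value listed for 1995 in the same exercise

print(f"slope per year: {slope:.4f}")
print(f"predicted 1995: {predicted_1995:.2f}, observed 1995: {observed_1995}")
```

The early data trend downward, so the extrapolated line keeps falling, while the observed ages later rose. The relationship changed outside the range of x used for the fit, which is exactly what the assumption behind extrapolation rules out.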

Extrapolations can get you into deep trouble. You’re better off not making extrapolations.

Slide7

Outliers

Any point that stands away from the others is called an outlier. Outliers can strongly influence a regression.

Slide8

Leverage, influential

A data point can be unusual if its x-value is far from the mean of the x-values. Such points are said to have high leverage.

A data point is influential if omitting it from the analysis gives a very different model.

Slide9

Lurking Variable

There is no way to conclude from a regression alone that one variable causes the other. With observational data, as opposed to data from a designed experiment, there is no way to be sure that a lurking variable is not the cause of any apparent association.

Slide10

Summary values

Scatterplots of summary statistics show less scatter than scatterplots of the underlying data on individuals. Statistics summarized over groups tend to show less variability than the same variable measured on individuals, so the association can look stronger than it is.

Slide11

Question 3

Suppose you wanted to predict the trend in marriage age for American women into the early part of this century.

How could you use the data graphed in Exercise 1 to get a good prediction? Marriage ages in selected years starting in 1900 are listed below. Use all or part of these data to create an appropriate model for predicting the average age at which women will first marry in 2005.

1900-1950 (10-year intervals): 21.9, 21.6, 21.2, 21.3, 21.5, 20.3

1955-1995 (5-year intervals): 20.2, 20.2, 20.6, 20.8, 21.1, 22.0, 23.3, 23.9, 24.5
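The answer to this exercise fits a line to the five most recent values (1975 to 1995). A minimal sketch of that least-squares calculation follows, using only the ages listed above. The helper name fit_line is an assumption introduced for illustration, and the residuals are printed so they can be inspected rather than taken on trust.

```python
# The five most recent observations from the list above (5-year intervals).
recent_years = [1975, 1980, 1985, 1990, 1995]
recent_ages = [21.1, 22.0, 23.3, 23.9, 24.5]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(recent_years, recent_ages)

# Residual = observed age minus the age predicted by the line.
residuals = [y - (intercept + slope * x)
             for x, y in zip(recent_years, recent_ages)]

# Prediction for 2005, ten years past the last observation.
predicted_2005 = intercept + slope * 2005

print(f"Age = {intercept:.3f} + {slope:.3f}(Year)")
print("residuals:", [round(r, 2) for r in residuals])
print(f"predicted age in 2005: {predicted_2005:.2f}")
```

Up to rounding, this reproduces the equation Age = -322.430 + 0.174(Year) and the 2005 prediction of 26.44 quoted in the answer to this exercise.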

To predict the average age, use the most recent ages, from 1975-1995, which are straight enough for a linear regression model. The resulting linear model is Age = -322.430 + 0.174(Year). The residual plot shows no pattern, and the model predicts that the average age at first marriage for women in 2005 would be 26.44 years.

Slide12

Question 3

How much faith do you place in this prediction? Explain.

I don’t have very much faith in the prediction because it is for a year that is 10 years beyond the latest year in the data, which makes it an extrapolation.

Do you think your model would produce an accurate prediction about your grandchildren, say, 50 years from now? Explain.

No. If a prediction 10 years beyond the data is already doubtful, a prediction 50 years out is even less trustworthy. There is no reason to expect the trend seen from 1955-1995 to continue that long.

Slide13

Question 5

In justifying his choice of a model, a student wrote, “I know this is the correct model because R² = 99.4%.”

Is this reasoning correct? Explain.

No. A high R² does not by itself show that a linear model is appropriate. You would need to look at a scatterplot, and a residual plot, to judge that.
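A small numeric illustration of this point, under assumptions chosen for the example: the data below are hypothetical (a perfectly quadratic response over x = 10 to 20) and the helper fit_line is introduced here only for illustration. A straight line fit to these curved data still yields an R² above 99%, while the residuals show a clear bend.

```python
# Hypothetical, deliberately curved data: y is exactly x squared.
xs = list(range(10, 21))
ys = [x ** 2 for x in xs]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# R^2 = 1 - (residual sum of squares) / (total sum of squares).
y_bar = sum(ys) / len(ys)
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((y - y_bar) ** 2 for y in ys)
r_squared = 1 - ss_res / ss_tot

print(f"R^2 = {r_squared:.4f}")
print("residuals:", [round(r, 1) for r in residuals])
```

The residuals are positive at both ends of the x-range and negative in the middle. That U-shaped pattern is the kind of evidence a residual plot provides and R² alone hides.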

Does this model allow the student to make accurate predictions? Explain.

Not necessarily. The data could still be curved, in which case predictions from a straight line would be systematically off.

Slide14

Vocabulary to know

The Outlier Condition means two things:


Points with large residuals or high leverage (especially both) can influence the regression model significantly.

lurking variable
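The vocabulary entry on large residuals and high leverage can be made concrete with a minimal sketch. The data are hypothetical: five points close to the line y = x, plus one point far out in x that does not follow that line. The helper names fit_line and leverages are assumptions for this example, and the leverage formula used, h_i = 1/n + (x_i - mean of x)^2 / (sum of squared deviations of x), is the standard one for simple regression rather than something stated in the slides.

```python
# Hypothetical data: the last point is far from the other x-values
# and does not follow the pattern of the first five.
xs = [1, 2, 3, 4, 5, 20]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 2.0]


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def leverages(xs):
    """Leverage of each point in a simple (one-predictor) regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1 / n + (x - x_bar) ** 2 / sxx for x in xs]


h = leverages(xs)

# Influence check: refit with the unusual point omitted and compare slopes.
_, slope_all = fit_line(xs, ys)
_, slope_without = fit_line(xs[:-1], ys[:-1])

print("leverages:", [round(v, 3) for v in h])
print(f"slope with all points:      {slope_all:.3f}")
print(f"slope without the far point: {slope_without:.3f}")
```

The far point has by far the highest leverage, and dropping it changes the slope from roughly zero to roughly one. In the slides' terms, it is both a high-leverage point and an influential one.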