4.2 Cautions about Correlation and Regression
Correlation and regression are powerful tools, but have limitations.
Correlation and regression describe only linear relationships.
The correlation r and the least-squares regression line are not resistant: a few outlying or influential observations can change them substantially.
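A minimal sketch of the two cautions above, using made-up numbers (the data and the helper name `pearson_r` are illustrative assumptions, not taken from the text). It computes r for a perfectly curved pattern, then for a linear pattern before and after one outlier is added.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: y is completely determined by x, but the pattern is curved.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# Made-up data: a perfectly linear pattern, then the same data plus one outlier.
line_x = list(range(1, 11))
line_y = [2 * x for x in line_x]
r_clean = pearson_r(line_x, line_y)
r_outlier = pearson_r(line_x + [20], line_y + [0])

print(f"curved pattern:        r = {r_curve:.3f}")
print(f"linear pattern:        r = {r_clean:.3f}")
print(f"linear plus 1 outlier: r = {r_outlier:.3f}")
```

In this sketch r misses the curved relationship entirely, and a single added point is enough to pull r for the linear data from about 1 down close to 0.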
Extrapolation
The use of a regression line (or curve) for prediction far outside the range of values of the explanatory variable x that was used to obtain it.
Such predictions are often inaccurate.
Suppose that you have data on a child’s growth between 3 and 8 years of age. You find a strong linear relationship between age x and height y. If you fit a regression line to these data and use it to predict height at age 25 years, you will predict that the child will be 8 feet tall.
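The growth example can be sketched as follows. The heights are hypothetical values chosen to grow by roughly 2.5 inches per year; they are an assumption for illustration, not data from the text, and `least_squares_line` is an illustrative helper.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical heights (inches) for one child at ages 3 through 8.
ages = [3, 4, 5, 6, 7, 8]
heights_in = [37.5, 40.2, 42.9, 45.3, 47.8, 50.4]

intercept, slope = least_squares_line(ages, heights_in)

# Extrapolate far outside the observed ages 3 to 8.
predicted_in = intercept + slope * 25
predicted_ft = predicted_in / 12

print(f"slope: {slope:.2f} inches per year")
print(f"predicted height at age 25: {predicted_ft:.1f} feet")
```

The line fits the observed ages well, but it knows nothing about growth slowing down after childhood, so the prediction at age 25 comes out at nearly 8 feet.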
Lurking Variable
A variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables.
Remember the link between cancer and dental plaque? It could be that poor oral hygiene is an indicator of other lifestyle factors associated with cancer.
Lurking variables continued
Lurking variables are often unrecognized and unmeasured. Detecting their effect is challenging.
Many lurking variables change systematically over time.
One useful method of detecting lurking variables is to plot both the response variable and the regression residuals against the time order of the observations (see Example 4.12 on pg. 228).
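A rough sketch of the residuals-against-time check, on made-up data in which the response drifts upward over time for a reason the regression on x does not capture. The data, the drift of 0.8 per time step, and the helper names are all assumptions for illustration. Instead of drawing a plot, the sketch lists the residuals in time order and summarizes the trend with a correlation.

```python
from math import sqrt

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up observations listed in the order they were collected.
time_order = list(range(12))
x = [5, 1, 4, 2, 6, 3, 5, 1, 4, 2, 6, 3]
# y depends on x and also drifts with time (the lurking influence).
y = [10 + 2 * xi + 0.8 * t for xi, t in zip(x, time_order)]

# Regress y on x only, then examine the residuals in time order.
intercept, slope = least_squares_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
r_resid_time = pearson_r(time_order, residuals)

for t, e in zip(time_order, residuals):
    print(f"t={t:2d}  residual={e:+.2f}")
print(f"correlation of residuals with time order: {r_resid_time:.3f}")
```

Residuals that climb steadily from negative to positive in time order, as they do here, suggest that something changing over time is affecting the response.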
Explaining Association
Causation
The best evidence for causation comes from experiments that actually change x while holding all other factors fixed. If y changes, then we have a good reason to think that x caused the change in y.
Even well-established causal relations may not generalize to other settings.
A sugar substitute caused bladder tumors in rats. Should humans avoid this particular sugar substitute?
Common Response
The observed association between the variables x and y is explained by a lurking variable z. Both x and y change in response to changes in z. This common response creates an association even though there may be no direct causal link between x and y.
Students who are smart and who have learned a lot tend to have both high SAT scores and high college grades. The positive correlation is explained by this common response to students’ ability and knowledge.
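Common response can be imitated with a small simulation. This is only a sketch under stated assumptions: z, x, and y are simulated standard-normal-based values, not SAT scores or grades, and x and y are built so that neither has any direct effect on the other. The seed, sample size, and noise level are arbitrary choices.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def residuals_after_fit(xs, ys):
    """Residuals of ys after a least-squares line on xs is removed."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# z is the lurking variable; x and y each respond to z plus independent noise.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

# Association between x and y, ignoring z.
r_xy = pearson_r(x, y)

# Association between x and y after the linear effect of z is removed from both.
r_xy_given_z = pearson_r(residuals_after_fit(z, x), residuals_after_fit(z, y))

print(f"r between x and y:              {r_xy:.2f}")
print(f"r after removing the effect of z: {r_xy_given_z:.2f}")
```

In the simulation x and y are strongly correlated, yet the association nearly vanishes once z is accounted for. With real data the lurking variable is usually unmeasured, so this adjustment is often not available.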
Confounding
In short, “mixing of influences.”
Two variables are confounded when their effects on a response variable cannot be distinguished from each other.
Example of Confounding
It is likely that more education is a cause of higher income: many highly paid professions require advanced education. However, confounding is also present. People who have high ability and come from prosperous homes are more likely to get many years of education than people who are less able or poorer. Of course, people who start out able and rich are more likely to have high earnings even without much education. We can’t say how much of the higher income of well-educated people is actually caused by their education.
Establishing Causation without Experiments
The association is strong.
The association is consistent.
Higher doses are associated with stronger responses.
The alleged cause precedes the effect in time.
The alleged cause is plausible.
See Example 4.18, Does Smoking Cause Lung Cancer? (pg. 236)
4.33 FIGHTING FIRES
Someone says, “There is a strong positive correlation between the number of firefighters at a fire and the amount of damage the fire does. So sending lots of firefighters just causes more damage.” Why is this reasoning wrong?
4.36 BETTER READERS
A study of elementary school children, ages 6 to 11, finds a high positive correlation between shoe size x and score y on a test of reading comprehension. What explains this correlation?
Try this at home
Exercises 4.38, 4.41, 4.43, 4.45