Slide1
Social Statistics: Correlation
Slide2
What is correlation?
How to compute?
How to interpret?
This week
Slide3
The relations between two variables
How the value of one variable changes when the value of another variable changes
A correlation coefficient is a numerical index that reflects the relationship between two variables.
Range: -1 ~ +1
Bivariate correlation (for two variables)
Correlation Coefficients
Slide4
Parametric
Pearson product-moment correlation (named for its inventor, Karl Pearson)
Non-parametric
Spearman’s rank correlation
Kendall tau rank correlation coefficient
Correlation Coefficients
Slide5
For two variables that are continuous in nature
Height, age, test score, income
But not for discrete or categorical variables
Race, political affiliation, social class, rank
Pearson correlation coefficient
r_xy is the correlation between variable X and variable Y
Slide6
Direct correlation (positive correlation):
If both variables change in the same direction
Indirect correlation (negative correlation):
If the two variables change in opposite directions
Types of correlation coefficients
Slide7
Below is a correlation report of different currency exchange rates on November 13, 2014 (source: Bloomberg Terminal)
-0.8 and 0.5, which is stronger?
Types of correlation coefficients
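The strength of a correlation is read from its absolute value; the sign only gives the direction. A minimal Python sketch of that comparison (the helper name `stronger` is hypothetical, introduced only for illustration):

```python
def stronger(r1, r2):
    """Return whichever coefficient has the larger absolute value."""
    return r1 if abs(r1) >= abs(r2) else r2

# The sign is ignored when judging strength.
print(stronger(-0.8, 0.5))  # -0.8
```

So -0.8 describes the stronger relationship, even though it is negative.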
Slide8
Pearson product-moment correlation coefficient
r_xy: the correlation coefficient between X and Y
n: the size of the sample
X: the individual’s score on the X variable
Y: the individual’s score on the Y variable
XY: the product of each X score times its corresponding Y score
X^2: the individual X score, squared
Y^2: the individual Y score, squared
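The quantities listed on this slide match the standard raw-score (computational) form of the Pearson coefficient. The following is a reconstruction under that assumption, with each quantity summed over all n individuals:

```latex
r_{xy} = \frac{n\sum XY - \sum X \sum Y}
              {\sqrt{\left[\, n\sum X^{2} - \left(\sum X\right)^{2} \right]
                     \left[\, n\sum Y^{2} - \left(\sum Y\right)^{2} \right]}}
```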
Slide9
Calculate the Pearson correlation coefficient for US school enrollment (unit: thousands) at selected time points over the previous 50 years. (Source: United States Census Bureau)
Exercise
1. Select two columns of data – are they correlated?
2. What does this correlation mean?
Year  G9-12 Public  G9-12 Private  College-Public  College-Private
1965  11610         1400           3970            1951
1970  13336         1311           6428            2153
1975  14304         1300           8836            2350
1980  13231         1339           9457            2640
1985  12388         1362           9479            2768
1990  11341         1136           10845           2974
1995  12502         1163           11092           3169
2000  13517         1264           11753           3560
2005  14909         1349           13022           4466
Slide10
CORREL function
Or PEARSON function
Using Excel to calculate
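Excel's CORREL and PEARSON both return the Pearson coefficient for two ranges. For readers working outside Excel, a rough Python equivalent applied to the enrollment table might look like the sketch below. `pearson_r` is a hypothetical helper written for this example, not a library function, and it uses the mean-deviation form of the formula.

```python
from math import sqrt

# Enrollment in thousands, 1965-2005 in five-year steps (from the exercise table).
enrollment = {
    "G9-12 Public":    [11610, 13336, 14304, 13231, 12388, 11341, 12502, 13517, 14909],
    "G9-12 Private":   [1400, 1311, 1300, 1339, 1362, 1136, 1163, 1264, 1349],
    "College-Public":  [3970, 6428, 8836, 9457, 9479, 10845, 11092, 11753, 13022],
    "College-Private": [1951, 2153, 2350, 2640, 2768, 2974, 3169, 3560, 4466],
}

def pearson_r(xs, ys):
    """Pearson r computed from deviations about the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Pick any two columns, as the exercise asks.
r = pearson_r(enrollment["College-Public"], enrollment["College-Private"])
print(f"College-Public vs College-Private: r = {r:.2f}")
```

Swapping in other column names gives the coefficient for any other pair.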
Slide11
Scatterplot or scattergram
Visualizing a correlation
[Figure: scatterplot of X and Y]
X  Y
2  3
4  2
5  6
6  5
4  3
7  6
8  5
5  4
6  4
7  5
Slide12
Visualizing a correlation
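For the ten (X, Y) pairs plotted here, the coefficient can also be worked out by hand from the sums n, ΣX, ΣY, ΣXY, ΣX² and ΣY² defined on the formula slide. The Python sketch below follows those steps; it assumes the standard raw-score form of the Pearson formula.

```python
from math import sqrt

# The ten (X, Y) pairs from the scatterplot table.
X = [2, 4, 5, 6, 4, 7, 8, 5, 6, 7]
Y = [3, 2, 6, 5, 3, 6, 5, 4, 4, 5]

n = len(X)                                   # 10
sum_x = sum(X)                               # 54
sum_y = sum(Y)                               # 43
sum_xy = sum(x * y for x, y in zip(X, Y))    # 247
sum_x2 = sum(x * x for x in X)               # 320
sum_y2 = sum(y * y for y in Y)               # 201

# Raw-score formula: numerator and denominator computed separately.
numerator = n * sum_xy - sum_x * sum_y       # 148
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 3))  # 0.692
```

A value near 0.69 is consistent with a scatterplot whose points rise from left to right but do not sit on a single line.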
Slide13
r = 1: a perfect direct (or positive) correlation
In real-life data, 0.7 or 0.8 may be the highest you will see
Direct (positive) correlation
Slide14
Strength and direction are important
Indirect (or negative) correlation
Slide15
Excel Scatterplot
Four sets of data with the same correlation of 0.816
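This slide appears to describe Anscombe's quartet: four small datasets with nearly identical correlations but very different scatterplots. That identification is an assumption, and the values below are the commonly published quartet, not data recovered from the slide. `pearson_r` is a hypothetical helper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviations about the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Sets I-III share the same X values; set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  (x4,   [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {name: pearson_r(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in results.items():
    print(f"Set {name}: r = {r:.3f}")
```

The point of the exercise is that the coefficient alone does not describe the shape of the data, so it is worth plotting before interpreting r.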
Slide16
Linear correlation means that the X and Y values fall along a straight line
Curvilinear correlation
Age and memory
Linear correlation
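Pearson's r measures only linear association, so a strong curvilinear pattern can still give a coefficient near zero. A small sketch with made-up numbers (an inverted U, loosely in the spirit of the age and memory example; these are not real data, and `pearson_r` is a hypothetical helper):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviations about the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = list(range(11))               # 0, 1, ..., 10
y = [-(v - 5) ** 2 for v in x]    # rises to a peak at x = 5, then falls

r = pearson_r(x, y)
print(f"r = {abs(r):.3f}")        # the rising and falling halves cancel out
```

Here Y is completely determined by X, yet the linear coefficient does not detect the relationship.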
Slide17
income  education  attitude  vote
74190   13         1         1
80931   12         3         2
81314   11         4         2
73089   11         5         2
62023   11         3         2
61217   10         4         2
84526   11         5         1
87251   11         4         1
62659   12         5         2
76450   10         6         2
70512   12         7         2
78858   9          6         1
78628   13         7         1
86212   14         8         2
74962   9          8         2
58828   11         9         4
61471   10         8         5
78621   12         7         5
60071   9          8         4
More than 2 variables?
How to calculate the correlation coefficient?
CORREL()
Correlation in data analysis toolset
Slide18
Correlation matrix
More than 2 variables?
           Income  Education  Attitude  Vote
Income     1.00    0.35       -0.19     -0.51
Education          1.00       -0.21     -0.20
Attitude                      1.00      0.55
Vote                                    1.00
Slide19
Data Analysis tool - correlation
Excel
Slide20
Correlation value:
- finite number ~ + finite number
Correlation coefficient value:
-1.00 ~ +1.00
Meaning of Correlation coefficient
r_xy value  Interpretation
0.8 ~ 1.0   Very strong relationship (share most things in common)
0.6 ~ 0.8   Strong relationship (share many things in common)
0.4 ~ 0.6   Moderate relationship (share something in common)
0.2 ~ 0.4   Weak relationship (share a little in common)
0.0 ~ 0.2   Weak or no relationship (share very little or nothing in common)
Slide21
Coefficient of determination:
The percentage of variance in one variable that is accounted for by the variance in the other variable.
= the square of the correlation coefficient
Coefficient of determination
For example, a correlation of 0.70 between GPA and studying time (0.70 squared = 0.49) means that 49% of the variance in GPA can be explained by the variance in studying time
Slide22
The amount of unexplained variance is called the coefficient of nondetermination (coefficient of alienation)
Coefficient of nondetermination
correlation  determination  interpretation
0            0
0.5          0.25
0.9          0.81
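The table values follow directly from squaring r, and the unexplained share is whatever is left over. A minimal Python sketch, taking the slide's definition of nondetermination as 1 minus r squared (the function name `variance_shares` is hypothetical):

```python
def variance_shares(r):
    """Return (explained, unexplained) shares of variance for a correlation r."""
    determination = r ** 2                 # coefficient of determination
    nondetermination = 1 - determination   # unexplained share, per the slide
    return determination, nondetermination

for r in (0.0, 0.5, 0.7, 0.9):
    explained, unexplained = variance_shares(r)
    print(f"r = {r:.1f}  explained = {explained:.2f}  unexplained = {unexplained:.2f}")
```

Even a correlation of 0.5 leaves three quarters of the variance unexplained.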
Slide23
In a small town in Greece, the local police found a direct correlation between ice cream and crime
Ice cream and crime
Slide24
A correlation represents the association between two or more variables
It does not, by itself, establish causality (two correlated variables need not be cause and effect)
Ice cream and crime are correlated, but ice cream does not cause crime
Correlation vs. causality
Slide25
Correlation vs. causality
Summer
Summer is when people get together. More specifically, casual drinkers and drug users are more likely to go to bars or parties on weekends and evenings, as opposed to a Tuesday morning. These people in the social mix, flooding the city’s streets and neighborhood bars, feed the peak times for murder, experts say.
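One way to see how a third variable can produce such a correlation is a small simulation. Everything in the sketch below is synthetic and hypothetical: temperature stands in for "summer", and the coefficients and noise levels are arbitrary choices. Ice cream never appears in the crime equation, yet the two series come out strongly correlated because both follow temperature.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviations about the two means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical daily high temperatures (degrees C) for 60 days.
temperature = [rng.uniform(10, 35) for _ in range(60)]

# Both series depend on temperature plus independent noise, not on each other.
ice_cream = [3.0 * t + rng.gauss(0, 5) for t in temperature]
crime = [2.0 * t + rng.gauss(0, 5) for t in temperature]

r = pearson_r(ice_cream, crime)
print(f"ice cream vs crime: r = {r:.2f}")
```

The shared cause is enough to produce a high r, which is why a correlation on its own says nothing about which variable, if either, is driving the other.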