Social Statistics: Correlation

aaron
Uploaded On 2017-07-27


Presentation Transcript

Slide1

Social Statistics: Correlation

Slide2

What is correlation?

How to compute?

How to interpret?

This week

Slide3

The relations between two variables

How the value of one variable changes when the value of another variable changes

A correlation coefficient is a numerical index that reflects the relationship between two variables.

Range: -1 ~ +1

Bivariate correlation (for two variables)

Correlation Coefficients

Slide4

Parametric

Pearson product-moment correlation (named for its inventor, Karl Pearson)

Non-parametric

Spearman’s rank correlation

Kendall tau rank correlation coefficient

Correlation Coefficients
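
To illustrate how a non-parametric coefficient differs from Pearson's, here is a minimal Python sketch of Spearman's rank correlation, computed as the Pearson correlation of the ranks (tied values receive the average of their positions). The function names and the sample data are illustrative assumptions, not part of the slides.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson r in deviation-score form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson r applied to the ranks of each variable."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: Y rises with X, but not along a straight line.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
print(round(pearson(xs, ys), 3))   # below 1: the relation is not linear
print(round(spearman(xs, ys), 3))  # 1.0: the rank order agrees perfectly
```

Because it looks only at rank order, Spearman's coefficient can also be applied to ordinal data such as ranks, where Pearson's is not appropriate.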

Slide5

For two variables which are continuous in nature

Height, age, test score, income

But not for discrete or categorical variables

Race, political affiliation, social class, rank

Pearson correlation coefficient

$r_{xy}$ is the correlation between variable X and variable Y

Slide6

Direct correlation (positive correlation):

If both variables change in the same direction

Indirect correlation (negative correlation):

If both variables change in opposite directions

Types of correlation coefficients

Slide7

Below is a correlation report of different currency exchange rates on November 13, 2014 (source: Bloomberg Terminal)

-0.8 and 0.5, which is stronger?

Types of correlation coefficients

Slide8

Pearson product-moment correlation coefficient

$r_{xy}$: the correlation coefficient between X and Y

n: the size of the sample

X: the individual’s score on the X variable

Y: the individual’s score on the Y variable

XY: the product of each X score times its corresponding Y score

$X^2$: the individual X score, squared

$Y^2$: the individual Y score, squared
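
The symbols defined above match the standard raw-score form of Pearson's r, $r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$. The transcript does not preserve the slide's formula, so this is an assumption about what was displayed. A minimal Python sketch of that calculation follows; the function name is hypothetical, and the data are the ten X, Y pairs listed with the scatterplot slide later in this deck.

```python
from math import sqrt


def pearson_raw_score(xs, ys):
    """Pearson r from the raw-score sums: n, sum X, sum Y, sum XY, sum X^2, sum Y^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# The ten X, Y pairs listed with the scatterplot slide of this deck.
xs = [2, 4, 5, 6, 4, 7, 8, 5, 6, 7]
ys = [3, 2, 6, 5, 3, 6, 5, 4, 4, 5]
r = pearson_raw_score(xs, ys)
print(round(r, 3))  # 0.692
```

Working through the sums by hand gives n = 10, ΣX = 54, ΣY = 43, ΣXY = 247, ΣX² = 320 and ΣY² = 201, so r = 148 / √(284 × 161) ≈ 0.692.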

Slide9

Calculate the Pearson correlation coefficient for US school enrollment (in thousands) at selected points over the previous 50 years. (Source: United States Census Bureau)

Exercise

1. Select two columns of data. Are they correlated?

2. What does this correlation mean?

Year   G9-12 Public   G9-12 Private   College-Public   College-Private
1965   11610          1400            3970             1951
1970   13336          1311            6428             2153
1975   14304          1300            8836             2350
1980   13231          1339            9457             2640
1985   12388          1362            9479             2768
1990   11341          1136            10845            2974
1995   12502          1163            11092            3169
2000   13517          1264            11753            3560
2005   14909          1349            13022            4466

Slide10

CORREL function

Or PEARSON function

Using Excel to calculate
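
Outside Excel, the same calculation takes a few lines of Python. The sketch below applies the deviation-score form of Pearson's r to the two college columns of the enrollment exercise. The function and variable names are illustrative, and the choice of columns is just one of the possible pairs.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson r in deviation-score form (same quantity Excel's CORREL reports)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


# Enrollment in thousands, 1965 to 2005, copied from the exercise table.
college_public = [3970, 6428, 8836, 9457, 9479, 10845, 11092, 11753, 13022]
college_private = [1951, 2153, 2350, 2640, 2768, 2974, 3169, 3560, 4466]

r = pearson(college_public, college_private)
print(f"r = {r:.2f}")  # strongly positive: both columns grow over the period
```

Both series rise over time, so a high r here largely reflects a shared trend. On Python 3.10 or later, `statistics.correlation` from the standard library should give the same value.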

Slide11

Scatterplot or scattergram

Visualizing a correlation

X   Y
2   3
4   2
5   6
6   5
4   3
7   6
8   5
5   4
6   4
7   5

Slide12

Visualizing a correlation
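
When a charting tool is not at hand, a rough text scatterplot can still show the pattern. The sketch below is a standard-library illustration, not a substitute for a real chart. It assumes small positive integer scores (single digits) and uses the ten X, Y pairs from the preceding slide; the function name `ascii_scatter` is hypothetical.

```python
def ascii_scatter(xs, ys):
    """Return a text scatterplot; assumes positive single-digit integer scores."""
    width = max(xs) + 1
    height = max(ys) + 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        grid[y][x] = "*"  # one mark per (X, Y) pair
    lines = []
    for y in range(height - 1, 0, -1):  # largest Y on the top row
        lines.append(f"{y} |" + "".join(grid[y][1:]))
    lines.append("  +" + "-" * (width - 1))
    lines.append("   " + "".join(str(x) for x in range(1, width)))
    return "\n".join(lines)


# The ten X, Y pairs from the scatterplot data.
xs = [2, 4, 5, 6, 4, 7, 8, 5, 6, 7]
ys = [3, 2, 6, 5, 3, 6, 5, 4, 4, 5]
plot = ascii_scatter(xs, ys)
print(plot)
```

The marks drift from lower left to upper right, which is the visual signature of a direct (positive) correlation.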

Slide13

r = 1: a perfect direct (or positive) correlation

In real-life data, 0.7 or 0.8 may be the highest you will see

Direct (positive) correlation

Slide14

Strength and direction are important

Indirect (or negative) correlation

Slide15

Excel Scatterplot

Four sets of data with the same correlation of 0.816
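
The slide's figure is not preserved in the transcript. Its description (four datasets, each with r = 0.816) matches the well-known Anscombe's quartet, so the sketch below assumes that is the data shown. It computes Pearson's r for each set to show that very different scatterplots can share nearly the same coefficient.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson r in deviation-score form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


# Anscombe's quartet (assumed to be the four datasets on the slide).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {name: pearson(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in results.items():
    print(f"Set {name}: r = {r:.2f}")  # each set comes out at about 0.82
```

This is why the deck pairs the coefficient with a scatterplot: the number alone does not reveal curvature or outliers.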

Slide16

Linear correlation means that the X and Y values fall along a straight line

Curvilinear correlation

Age and memory

Linear correlation
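
Pearson's r measures only the linear part of a relationship, so a strong curvilinear pattern can produce a coefficient near zero. The sketch below uses made-up memory scores that rise until age 40 and then fall (an inverted U). The numbers are purely hypothetical and are not taken from the slides.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson r in deviation-score form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


# Hypothetical inverted-U data: the score peaks at age 40.
ages = [10, 20, 30, 40, 50, 60, 70]
memory = [100 - (age - 40) ** 2 / 10 for age in ages]

r = pearson(ages, memory)
print(round(r, 3))  # 0.0 for this symmetric example
```

Memory here is fully determined by age, yet r is zero, because the rise before 40 and the fall after it cancel out in a straight-line summary.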

Slide17

income   education   attitude   vote
74190    13          1          1
80931    12          3          2
81314    11          4          2
73089    11          5          2
62023    11          3          2
61217    10          4          2
84526    11          5          1
87251    11          4          1
62659    12          5          2
76450    10          6          2
70512    12          7          2
78858    9           6          1
78628    13          7          1
86212    14          8          2
74962    9           8          2
58828    11          9          4
61471    10          8          5
78621    12          7          5
60071    9           8          4

More than 2 variables?

How to calculate the correlation coefficient?

CORREL()

Correlation in the Data Analysis toolset

Slide18

Correlation matrix

More than 2 variables?

            Income   Education   Attitude   Vote
Income      1.00     0.35        -0.19      -0.51
Education            1.00        -0.21      -0.20
Attitude                         1.00       0.55
Vote                                        1.00
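
A correlation matrix is simply every pairwise coefficient arranged in a grid. The sketch below builds one in Python from the first 15 rows of the income, education, attitude and vote data, which are the rows that are unambiguous in this transcript, so its values need not match the matrix above. Names such as `pearson` and `matrix` are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson r in deviation-score form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


# First 15 rows of the income / education / attitude / vote table.
data = {
    "income": [74190, 80931, 81314, 73089, 62023, 61217, 84526, 87251,
               62659, 76450, 70512, 78858, 78628, 86212, 74962],
    "education": [13, 12, 11, 11, 11, 10, 11, 11, 12, 10, 12, 9, 13, 14, 9],
    "attitude": [1, 3, 4, 5, 3, 4, 5, 4, 5, 6, 7, 6, 7, 8, 8],
    "vote": [1, 2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 1, 1, 2, 2],
}

names = list(data)
matrix = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

# Print the upper triangle only, as in the slide's layout.
print(" " * 10 + "".join(f"{name:>10}" for name in names))
for i, a in enumerate(names):
    cells = ["" if j < i else f"{matrix[a][b]:.2f}" for j, b in enumerate(names)]
    print(f"{a:<10}" + "".join(f"{c:>10}" for c in cells))
```

Only the upper triangle is shown because the matrix is symmetric: the correlation of income with education equals that of education with income, and every variable correlates 1.00 with itself.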

Slide19

Data Analysis tool - correlation

Excel

Slide20

Correlation value:

- finite number ~ + finite number

Correlation coefficient value:

-1.00 ~ +1.00

Meaning of Correlation coefficient

$r_{xy}$ value   Interpretation
0.8 ~ 1.0        Very strong relationship (share most of the things in common)
0.6 ~ 0.8        Strong relationship (share many things in common)
0.4 ~ 0.6        Moderate relationship (share something in common)
0.2 ~ 0.4        Weak relationship (share a little in common)
0.0 ~ 0.2        Weak or no relationship (share very little or nothing in common)

Slide21

Coefficient of determination:

The percentage of variance in one variable that is accounted for by the variance in the other variable.

= the square of the correlation coefficient ($r_{xy}^2$)

Coefficient of determination

49% of the variance in GPA can be explained by the variance in studying time
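
The 49% figure corresponds to a correlation of about 0.7 between GPA and studying time, since 0.7² = 0.49. The value of r is inferred here; the transcript does not state it. A minimal Python sketch of the arithmetic, with a hypothetical helper name:

```python
def variance_shares(r):
    """Split variance into explained (r squared) and unexplained (1 - r squared)."""
    determination = r ** 2
    return determination, 1 - determination


for r in (0.0, 0.5, 0.7, 0.9):
    explained, unexplained = variance_shares(r)
    print(f"r = {r:.1f}: explained {explained:.0%}, unexplained {unexplained:.0%}")
```

For r = 0.7 this prints 49% explained and 51% unexplained. Note how quickly the explained share falls as r drops: a correlation of 0.5 accounts for only a quarter of the variance.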

Slide22

The amount of unexplained variance is called the coefficient of nondetermination (coefficient of alienation)

Coefficient of nondetermination

correlation   determination   interpretation
0             0
0.5           0.25
0.9           0.81

Slide23

In a small town in Greece, the local police found a direct correlation between ice cream and crime

Ice cream and crime

Slide24

Correlation represents the association between two or more variables

It does not by itself establish causality (a correlation between two variables does not mean that one causes the other)

Ice cream and crime are correlated, but

Ice cream does not cause crime

Correlation vs. causality

Slide25

Correlation vs. causality

Summer

Summer is when people get together. More specifically, casual drinkers and drug users are more likely to go to bars or parties on weekends and evenings, as opposed to a Tuesday morning. These people in the social mix, flooding the city’s streets and neighborhood bars, feed the peak times for murder, experts say.
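
The summer explanation describes what is usually called a confounding variable: a third factor that drives both measured variables. The small simulation below sketches the idea, using daily temperature as a stand-in for "summer". All numbers and variable names are made up for illustration; nothing here comes from real ice cream or crime data.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson r in deviation-score form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical model: temperature drives both series;
# neither ice cream nor crime appears in the other's formula.
temperature = [random.uniform(10, 35) for _ in range(500)]
ice_cream = [5 * t + random.gauss(0, 10) for t in temperature]
crime = [2 * t + random.gauss(0, 5) for t in temperature]

r = pearson(ice_cream, crime)
print(f"r(ice cream, crime) = {r:.2f}")  # strongly positive despite no direct link
```

The two series correlate strongly even though, by construction, changing one would do nothing to the other.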
