Ten Difference Score - PowerPoint Presentation

calandra-battersby
Uploaded On 2017-05-06

Presentation Transcript

Slide1

Ten Difference Score Myths By Jeffery R. Edwards

Presented by Chelsea Hutto Slide2

Difference Scores

Typically used to represent the similarity between two constructs

Highly used in studies of person-job fit, similarity between employee and organizational values, match between employee expectations and experiences, and the agreement between performance ratings.

Suffer from many methodological problems Slide3

Polynomial Regression Analysis

These problems can be avoided by using PRA

Uses components of difference scores along with higher order terms to represent relationships of interest in congruence research.

Treats difference scores as statements of hypotheses to be tested empirically.

Also supported by Cafri et al. (2009). Slide4
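To make "components of difference scores along with higher order terms" concrete, here is a small worked example for a squared difference score. The symbols are assumptions introduced for illustration and do not appear in the slides: Z is the outcome, X and Y are the two component measures, and the b and c terms are regression coefficients.

```latex
% Squared difference score model, expanded:
\begin{align}
Z &= c_0 + c_1 (X - Y)^2 + e \\
  &= c_0 + c_1 X^2 - 2 c_1 XY + c_1 Y^2 + e
\end{align}

% Unconstrained quadratic (polynomial) regression equation:
\begin{equation}
Z = b_0 + b_1 X + b_2 Y + b_3 X^2 + b_4 XY + b_5 Y^2 + e
\end{equation}

% The difference score model is the special case in which
\begin{equation}
b_1 = b_2 = 0, \qquad b_3 = b_5 = c_1, \qquad b_4 = -2 c_1 .
\end{equation}
```

Read this way, using the squared difference score amounts to assuming these constraints hold, whereas fitting the unconstrained equation lets them be tested against the data.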

Misconceptions Regarding Problems with DS

Myth 1: The Problem with Difference Scores Is Low Reliability

Low internal consistency reliability has been viewed as the only serious problem with DS

Reliability of any measure is ultimately an empirical matter

The question is not only whether DS are reliable in an absolute sense but also whether they are more reliable than the alternatives

Even adequate reliability does not solve the other problems with DS Slide5

Myth 2: Difference Scores Provide Conservative Statistical Tests

Statistical tests based on DS are often labeled as conservative

Sometimes seen as appropriate for exploratory research

DS are also likely to invite conclusions that signify false positives, such that stats tests effectively become liberal.

Have not been scrutinized by PRA

Conservatism usually corresponds to effect sizes that are biased downward and Type 1 error rates that are minimized at the cost of Type 2 error.

Need a balance between liberal and conservative Slide6

Alternatives to DS that are themselves problematic

Myth 3: Measures That Elicit Direct Comparisons Avoid Problems with Difference Scores

Merely shift the responsibility for creating a DS from the researcher to the respondent, who must calculate the response mentally, a step that is prone to error

A direct comparison is double-barreled: it combines two distinct concepts into a single score

Construct validity of direct comparisons is questionable Slide7

Myth 4: Categorized Comparisons Avoid Problems with DS

Creation of subgroups based on the congruence between two component measures to avoid problems with DS

Some researchers even say it could solve reliability issues

Creates illusion

Accentuates the loss of information and reduction in explained variance

Just makes things worse Slide8

Myth 5: Product Terms are Viable Substitutes for DS

Some turn to product terms tested hierarchically in multiple regression analysis as a last resort

Captures the interaction between two variables

Does not represent the effects of congruence for continuous measures Slide9

Myth 6: Hierarchical Analysis Provides Conservative Tests of DS

Some studies statistically control for component measures before estimating the effects of DS.

Characterized as conservative

Components are controlled when testing interactions using product terms

Does not yield conservative tests of DS, instead alters the relationships DS are intended to capture Slide10

Misunderstandings or misguided criticisms of PRA

Myth 7: PRA is an Exploratory, Empirically-Driven Procedure

Critics claim that PR capitalizes on sample-specific variance to maximize the amount of variance explained

Primary goal of PRA is to test hypotheses derived from theories of congruence

Also provides an explicit test of this hypothesis whereas using an algebraic difference score incorporates this hypothesis as an untested assumption

DS allow congruence hypotheses to evade empirical scrutiny

Lack of evidence necessary to confirm or reject hypotheses. Slide11

Myth 8: Polynomial Regression Suffers from Multicollinearity

Concerns about multicollinearity between lower-order and higher-order terms are unfounded
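The slide does not say why the concern is unfounded. One commonly cited reason is that the correlation between a variable and its square depends on where the scale's zero point sits. The sketch below is an illustration under that assumption, not the presenter's own argument. The 7-point scale and the choice to center at its midpoint are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

# Hypothetical data: one response at each point of a 7-point scale.
scale = [1, 2, 3, 4, 5, 6, 7]

# Correlation between X and X^2 on the raw 1..7 metric.
raw_r = pearson(scale, [x ** 2 for x in scale])

# Same data after centering at the scale midpoint (4).
centered = [x - 4 for x in scale]
centered_r = pearson(centered, [c ** 2 for c in centered])

print(f"raw scale:      r(X, X^2) = {raw_r:.3f}")
print(f"centered scale: r(X, X^2) = {centered_r:.3f}")
```

In this symmetric example the correlation between the linear and squared terms is very high on the raw metric and vanishes after centering, although nothing about the underlying data has changed.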

Myth 9: Higher-Order Terms Do Not Enhance the Understanding of Congruence

Interpreting higher-order terms can be difficult, but the difficulty arises from attempts to interpret the coefficients on higher-order terms individually.

Can be avoided by using response surfaces as the intermediary between congruence hypotheses and PR coefficients Slide12

Myth 10: PR Eliminates the Concept of Congruence

Comes from the assumption that a DS represents a concept that is distinct from its components. Argued that DS and their component measures are not conceptually interchangeable.

Because a DS is calculated from its components, it cannot represent a construct that is conceptually or operationally distinct from those components. Slide13

Assumptions

All can be tested empirically – so why argue?

PRA has its limitations

More comprehensive and conclusive than information obtained from difference scores Slide14

Things I Have Learned (So Far)

by Jacob Cohen Slide15

Some Things You Learn Aren’t So

Proper sample size of 30 cases per group when comparing groups

Any lower than 30 required specialized handling with “small sample statistics”

Versus critical-ratio approach

Can lead to only a fifty-fifty chance of getting significant results Slide16

Less is More

Should be studying few IVs and even fewer DVs

Which DVs are real and which are due to chance

As the number of IVs increases, the chance of their redundancy with regard to criterion relevance also increases

Reporting numerical results

What does r = .12345 really mean?

Extra decimal places serve as a distraction from the meaningful leading digits Slide17

Simple Is Better

Reporting of Data and Representation

Do not usually make it possible for most of us or consumers of products to actually see and understand the distribution

Need for graphic representation

Computers and Statistical packages

Loss of contact with data

Idea that knowledge of statistics isn’t necessary to use Slide18

Compositing of Values

Beta weights vs. unit weights

Beta weights generate a higher correlation with the criterion than any other set of weights.

CATCH!

Beta weights are only guaranteed to be better than unit weights for the sample on which they were determined.

Very rare circumstances when Beta is better

Unit weights are usually more practical (+1 for positively related predictors, -1 for negatively related predictors, and 0 for poorly related predictors).

Work well outside of multiple regression when we have criterion data

Better on standardized scores for our purposes than those generated by program Slide19
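The "CATCH" on the beta-weights slide can be sketched with a small simulation. Everything here is an assumption made for illustration: the population (two equally useful predictors with unit variance), the sample sizes, and the seed. The regression weights are raw least-squares slopes, which behave like beta weights only because the predictors are generated on a roughly standardized metric.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def ols_slopes(x1, x2, y):
    """Least-squares slopes of y on two predictors (normal equations, centered data)."""
    m1, m2, my = (sum(v) / len(v) for v in (x1, x2, y))
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def sample(n, rng):
    """Hypothetical population: two equally useful predictors plus noise."""
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.3 * a + 0.3 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    return x1, x2, y

def composite(x1, x2, w1, w2):
    """Weighted sum of the two predictors."""
    return [w1 * a + w2 * b for a, b in zip(x1, x2)]

rng = random.Random(1990)           # arbitrary seed
tx1, tx2, ty = sample(20, rng)      # small sample used to estimate the weights
hx1, hx2, hy = sample(2000, rng)    # fresh cases from the same population
b1, b2 = ols_slopes(tx1, tx2, ty)

r_train_beta = pearson(composite(tx1, tx2, b1, b2), ty)
r_train_unit = pearson(composite(tx1, tx2, 1, 1), ty)
r_new_beta = pearson(composite(hx1, hx2, b1, b2), hy)
r_new_unit = pearson(composite(hx1, hx2, 1, 1), hy)

print(f"estimation sample: regression {r_train_beta:.3f}, unit {r_train_unit:.3f}")
print(f"new sample:        regression {r_new_beta:.3f}, unit {r_new_unit:.3f}")
```

Only the first comparison is guaranteed by the mathematics: in the estimation sample, regression weights can never do worse than unit weights. The second line shows what happens when the same weights are applied to new cases. That result depends on the assumed population and sample size, so it is printed rather than asserted.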

The Fisherian Legacy

Based on principle that science proceeds only through inductive inference, which is achieved by rejecting the null hypothesis, usually at .05 level.

Misinterpretation of Yes/No decision feature

Research is frequently designed to produce decisions, although things are not always so clearly decision oriented

Null Hypothesis – any statement about a state of affairs in a population, usually the value of a parameter, frequently zero. It is called a null hypothesis because the strategy is to nullify it or because it means “nothing doing”. Slide20

The Dreaded .05 Level

Basis for decision – cut off level

Can lead to practices ranging from data fudging to massively altering data to dropping cases where there “must have been errors” Slide21

The Null Hypothesis Tests Us

Results do not tell us the probability that the null hypothesis is true; for that we must turn to Bayesian statistics, in which probability is not a relative frequency but a degree of belief.

What it does tell us is the probability of the data given that the null is true

NOT THE SAME THING Slide22

p Value

Because the p value does not tell us the probability that the null is true, it cannot tell us the probability that the research hypothesis is true.

Rejection of null gives us no basis for estimating the probability that a replication of the research will again result in rejecting the null.

True meaning of statistical significance

Effect is not nil, and nothing more

Temptation Slide23

Problems with NH

If the NH is almost always false, what’s the big deal about rejecting it?

Also supported by Trafimow and Rice (2009).

If tests exceeded critical value, you could conclude that null is false, but if you fell short of that value you couldn’t conclude it was true.

Reality: Can’t conclude anything.

If null was false – had to be false to some degree Slide24

Power Analysis

Based on four parameters

Alpha significance criterion

Sample size

Population effect size

Power of the test

Made it possible to “prove” null hypotheses

By showing that the effect is of no more than negligible or trivial size

Must consider the magnitude of effects Slide25
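The four power-analysis parameters are linked so that any one can be computed from the other three. The sketch below assumes a two-sided comparison of two equal-sized groups and uses a normal approximation rather than exact t-based power, so its results differ slightly from published tables. The function names and the effect size of 0.5 standard deviations are hypothetical choices for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_two_groups(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group mean comparison.

    effect_size is the standardized mean difference in the population.
    The tiny chance of rejecting in the wrong direction is ignored.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    return STD_NORMAL.cdf(shift - z_crit)

def n_per_group_for_power(effect_size, power=0.80, alpha=0.05):
    """Approximate cases per group needed to reach the requested power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / effect_size) ** 2)

# Solve for power: the "30 cases per group" rule with an assumed effect of 0.5 SD.
power_at_30 = power_two_groups(0.5, 30)       # about 0.49
# Solve for sample size: cases per group for power .80 at the same effect size.
needed_n = n_per_group_for_power(0.5)

print(f"power with n = 30 per group: {power_at_30:.2f}")
print(f"n per group for power .80:   {needed_n}")
```

Under these assumptions, 30 cases per group gives roughly even odds of a significant result, which is consistent with the "fifty-fifty chance" remark on the earlier slide about sample size.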

How To Use Statistics

Use of graphic and numerical analyses in ways in which we can understand them.

Plan the research

Must have credible set of specifications or discover research is not possible.

Use effect size measures, which include mean differences, correlations, and squared correlations of all kinds, all of which lead to a sample effect size Slide26

How To Use Statistics

After finding the sample effect size, attach a p value or, better, a confidence interval.

Most important rule – judgment of the scientist Slide27
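As one way to attach a confidence interval to a sample effect size, the sketch below builds an approximate interval for a correlation with the Fisher z transformation. The function name and the example values (r = .30, n = 100) are hypothetical, and the method assumes roughly bivariate normal data.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    if n <= 3 or not -1 < r < 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    center = atanh(r)                  # transform r to the z metric
    half_width = z_crit / sqrt(n - 3)  # standard error of z is 1 / sqrt(n - 3)
    return tanh(center - half_width), tanh(center + half_width)

low, high = correlation_ci(0.30, 100)
print(f"r = .30, n = 100: 95% CI from {low:.2f} to {high:.2f}")
```

The width of the interval shows how much uncertainty surrounds the estimate, which a bare p value does not convey.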

Take Home Message

A single piece of research doesn’t settle an issue once and for all. Only successful future replication in the same and different settings provides an approach to settling the issue.

.05 should not be a cliff, but a reference point along the possibility-probability continuum.

Things take time. Slide28

The Earth Is Round (p < .05)

By Jacob Cohen Slide29

Problems with Null Hypothesis

Does not tell us what we want to know

Given these data, what is the probability that NH is true?

Really says, “Given that NH is true, what is the probability of these (or more extreme) data?” Slide30

The Permanent Illusion

Misapplication of deductive syllogistic reasoning

Invalid Bayesian interpretation

The mistaken belief that the level of significance at which the NH is rejected (.05) is the probability that the NH is correct, or at least that the NH is of low probability Slide31

Why P(D|Ho) ≠ P(Ho|D)

P(D|Ho) = when Ho is tested, finding the probability that the data could have arisen if Ho were true

The real issue = P(Ho|D), the inverse probability

The probability that Ho is true given the data

Reason why we conduct statistical tests – to be able to reject Ho because of its unlikelihood Slide32

Posterior Probability

Available only through Bayes’s theorem

Have to know the probability of the NH before the experiment, the “prior” probability P(Ho)

Problem: We do not normally know this

Can be done through Bayesian statistics by positing a prior probability or a distribution of prior probabilities.

Extremely unreliable

Use of different prior probabilities

G.K. Huysamen (2005). Slide33
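The dependence of the posterior probability on the prior can be seen in a short calculation with Bayes's theorem. All numbers below are hypothetical: the data are assumed to have probability .03 if Ho is true and .95 if the alternative is true, and the three priors are arbitrary. The function name is likewise an illustration.

```python
def posterior_null(prior_h0, p_data_given_h0, p_data_given_h1):
    """P(Ho | D) by Bayes's theorem when Ho and H1 are the only two possibilities."""
    joint_h0 = prior_h0 * p_data_given_h0
    joint_h1 = (1 - prior_h0) * p_data_given_h1
    return joint_h0 / (joint_h0 + joint_h1)

P_DATA_GIVEN_H0 = 0.03   # hypothetical: looks "significant" at the .05 level
P_DATA_GIVEN_H1 = 0.95   # hypothetical

for prior in (0.98, 0.50, 0.10):
    post = posterior_null(prior, P_DATA_GIVEN_H0, P_DATA_GIVEN_H1)
    print(f"P(Ho) = {prior:.2f}  ->  P(Ho|D) = {post:.3f}")
```

With the same P(D|Ho) of .03, the probability that Ho is true given the data ranges from well under .05 to above .50 depending on the prior. This is why P(D|Ho) cannot be read as P(Ho|D), and why different priors lead to different conclusions.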

Illusion of Attaining Improbability

Also known as Bayesian Id’s Wishful Thinking Error

Extremely easy to make

Made by 68 out of 70 academic psychologists studied by Oakes (1986, pp. 79-82).

Problem: Belief that after a successful rejection of Ho it is highly probable that replications will also result in rejection of Ho.

Could not be farther from the truth

Just because Ho is rejected does not mean that the theory is established.

Remember: the purpose of a scientific experiment is not to make decisions but to adjust the degree of belief. Slide34

The Nil Hypothesis

The null in Ho is taken to mean nil or zero

Ho is then taken to state that the effect size is 0: that the population mean difference is 0, that the correlation is 0, or that the raters' reliability is 0 (a Ho that can almost always be rejected, even with a small sample)

Criticism: its use may be valid only for true experiments involving randomization (e.g., controlled clinical trials) or when any departure from pure chance is meaningful (e.g., laboratory experiments on clairvoyance) Slide35

What To Do

Do not look for an alternative to NHST

Must understand and improve data before we can generalize from our data

Report effect sizes (ES) with confidence intervals

Improve our measurement by reducing the unreliable and invalid parts of the variance in our measures.

Use of informed judgment when using theories Slide36

Discussion Questions

Why do you think many researchers still support NHST as it stands?

Has psychology as a field become more focused on getting significant results rather than completing the proper process of an experiment? Do you think it is more prominent in other fields?

How can we as psychologists eliminate confusion and misuse of NHST? Slide37

References

Cafri, G., Van den Berg, P., & Brannick, M. T. (2009). What have the difference scores not been telling us? A critique of the use of self-ideal discrepancy in the assessment of body image and evaluation of an alternative data-analytic framework. Assessment, 17(3), 361-376.

Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997-1003.

Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45(12), 1304-1312.

Edwards, J. R. (2001). Ten difference score myths. Organizational Research Methods, 4(3), 265-287.

Huysamen, G. K. (2005). Null hypothesis significance testing: Ramifications, ruminations, and recommendations. South African Journal of Psychology, 35(1), 1-20.

Trafimow, D., & Rice, S. (2009). A test of the null hypothesis significance testing procedure correlation argument. The Journal of General Psychology, 136(3), 261-269.