

Presentation Transcript

Slide1

A P-value Ain't What You Think It Is

Al M Best, PhD
Professor, Periodontics, School of Dentistry
Professor, Biostatistics, School of Medicine

Slide2

Outline

Idea for the editorial
A history of significance testing
A guide to misinterpretation
Using a dental example
My practice as a collaborator

Best AM, Greenberg BL, Glick M. From tea tasting to t test: A P value ain't what you think it is. Journal of the American Dental Association. 2016 Jul;147(7):527-9. PMID: 27350642.

Slide3

7-Mar-2017 retractionwatch.com blog

http://retractionwatch.com/2016/03/07/were-using-a-common-statistical-test-all-wrong-statisticians-want-to-fix-that/

Slide4

TAS (The American Statistician)

http://www.tandfonline.com/doi/full/10.1080/00031305.2016.1154108

Slide5

Metrics

amstat.tandfonline.com/doi/citedby/10.1080/00031305.2016.1154108

Slide6

Supplemental Material

Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, Altman DG: "Statistical Tests, P-values, Confidence Intervals, and Power: A Guide to Misinterpretations"
Altman, Naomi: Ideas from multiple testing of high dimensional data provide insights about reproducibility and false discovery rates of hypothesis supported by p-values
Benjamin, Daniel J, and Berger, James O: A simple alternative to p-values
Benjamini, Yoav: It's not the p-values' fault
Berry, Donald A: P-values are not what they're cracked up to be
Carlin, John B: Comment: Is reform possible without a paradigm shift?
Cobb, George: ASA statement on p-values: Two consequences we can hope for
Gelman, Andrew: The problems with p-values are not just with p-values
Goodman, Steven N: The next questions: Who, what, when, where, and why?
Greenland, Sander: The ASA guidelines and null bias in current teaching and practice
Ioannidis, John PA: Fit-for-purpose inferential methods: abandoning/changing P-values versus abandoning/changing research
Johnson, Valen E: Comments on the "ASA Statement on Statistical Significance and P-values" and marginally significant p-values
Lavine, Michael, and Horowitz, Joseph: Comment
Lew, Michael J: Three inferential questions, two types of P-value
Little, Roderick J: Discussion
Mayo, Deborah G: Don't throw out the error control baby with the bad statistics bathwater
Millar, Michele: ASA statement on p-values: some implications for education
Rothman, Kenneth J: Disengaging from statistical significance
Senn, Stephen: Are P-Values the Problem?
Stangl, Dalene: Comment
Stark, PB: The value of p-values
Ziliak, Stephen T: The significance of the ASA statement on statistical significance and p-values

Slide7

Supplemental Material

Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, Altman DG: "Statistical Tests, P-values, Confidence Intervals, and Power: A Guide to Misinterpretations." Eur J Epidemiol. 2016 Apr;31(4):337-50.

Slide8

The Lady Tasting Tea

Classical example

Salsburg D. The Lady Tasting Tea. New York, NY: WH Freeman and Co; 2001.
Fisher RA. Statistical Methods and Scientific Inference. 3rd ed. New York, NY: Hafner Press; 1973.

Slide9

Coke vs Pepsi

Say I poured, hidden from you, two soft-drink cups: one with Coke and one with Pepsi. Then I ask you: "Which is Coke? And which is Pepsi?" What are the possible outcomes?

From: Maita Levine and Raymond H. Rolwing (1993). Teaching Statistics, 15, 4-5.

Slide10

Likelihood of outcomes

Look at the exact distribution of the number correct and calculate the probability of each result. Would this experiment be convincing?
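A minimal Python sketch of this exact calculation for the two-cup version (illustrative, not part of the original slides): with one Coke and one Pepsi, a guess is either entirely right or entirely wrong, so pure guessing gets both cups right half the time.

```python
from itertools import permutations

# Two cups: one Coke, one Pepsi. Enumerate every possible guess and
# tally how many cups each guess labels correctly.
truth = ("Coke", "Pepsi")
counts = {}
for guess in permutations(truth):
    n_correct = sum(g == t for g, t in zip(guess, truth))
    counts[n_correct] = counts.get(n_correct, 0) + 1

total = sum(counts.values())
for n_correct in sorted(counts):
    print(f"{n_correct} correct: probability {counts[n_correct] / total:.2f}")
# 0 correct: probability 0.50
# 2 correct: probability 0.50  -> even a perfect answer has p = .50
```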

Slide11

Coke vs Pepsi: 4 cups

Assuming an equal number of Cokes and Pepsis, the next larger experiment would be 4 cups. What are the possible outcomes?

Slide12

Likelihood of Outcomes

With each outcome equally likely, we calculate the p-values for all the possibilities. Would this experiment be convincing? So if someone got all 4 right, we would be able to conclude that this person could "… tell the difference between Coke and Pepsi, p-value = .1667." Would this be convincing?

Slide13

Fisher's tea lady used 8 cups

All the possible outcomes

Slide14

Likelihood of Outcomes

We calculate the p-values. If someone got all 8 right, we could conclude that this person could "… tell the difference between Coke and Pepsi, p-value = .0143." Would this be convincing?
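The same counting argument in a short Python sketch (illustrative, not from the slides): under the null hypothesis of pure guessing, every way of naming half the cups "Coke" is equally likely, so the p-value for a perfect performance is one over the number of such arrangements. This reproduces the .1667 and .0143 figures above.

```python
from math import comb

def p_all_correct(n_cups: int) -> float:
    """P-value for correctly identifying every cup when half are Coke.

    Under pure guessing, each of the C(n_cups, n_cups // 2) possible
    arrangements is equally likely, and only one matches the truth.
    """
    return 1 / comb(n_cups, n_cups // 2)

print(f"4 cups: p = {p_all_correct(4):.4f}")  # 0.1667
print(f"8 cups: p = {p_all_correct(8):.4f}")  # 0.0143
```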

Slide15

Inference?

"Statistical analysis of medical studies is based on the key idea that we make observations on a sample of subjects and then draw inferences about the population of all such subjects from which the sample is drawn."

Altman D, Machin D, Bryant T, & Gardner M (Eds.) (2013) Statistics with confidence: confidence intervals and statistical guidelines. John Wiley & Sons. ISBN 0-7279-1375-1. Page 3.
Gardner MJ, Altman DG. (1988) Estimating with confidence. Br Med J. 296(6631):1210-1. PMID: 3133015; PubMed Central PMCID: PMC2545695.

Slide16

Jerzy Neyman & Egon Pearson

Viewed Fisher's work as mathematically fuzzy and heuristic
Instead of focusing on what a scientist thinks about the evidence, an experiment should tell the scientist what to do.
Out of this came Ha, type-I and type-II error rates, and power.

Slide17

Greenland's "Guide to Misinterpretations"

Lapidus et al. "Effect of premedication to provide analgesia as a supplement to inferior alveolar nerve block in patients with irreversible pulpitis." JADA 2016;147(6):427-37.
CONCLUSIONS: There is moderate evidence to support the use of oral NSAIDs (in particular, ibuprofen) 1 hour before the administration of IANB local anesthetic to provide additional analgesia to the patient.

Greenland et al. "Statistical Tests, P-values, Confidence Intervals, and Power: A Guide to Misinterpretations." Eur J Epidemiol. 2016 Apr;31(4):337-50.

Slide18

Severely infected irreversible pulpitis

Slide19

Tom Hanks (2000): A FedEx executive must transform himself physically and emotionally to survive a crash landing on a deserted island.

Slide20

Slide21

Ibuprofen versus placebo: frequency of participants in each group having "little or no pain during endodontic treatment." "The probability of … is .020."

Slide22

Benzodiazepine versus placebo: frequency of participants in each group having "little or no pain during endodontic treatment." "The probability of … is .954."
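As a rough illustration of where a number like .020 or .954 can come from, the sketch below runs Fisher's exact test on a 2x2 table of group by pain outcome. The counts are hypothetical and the choice of test is an assumption; the slides do not report the actual Lapidus et al. data or the test they used.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 table (NOT the Lapidus et al. data):
# rows = treatment group, columns = ("little or no pain", "more pain")
table = [
    [18, 7],   # ibuprofen group (made-up counts)
    [9, 16],   # placebo group (made-up counts)
]

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"Fisher exact p-value for this hypothetical table: {p_value:.3f}")
```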

Slide23

True or False?

The p-value is the probability that the null hypothesis is true. For example, the test of the ibuprofen null hypothesis gave P = 0.02, so the null hypothesis has only a 2% chance of being true.

Greenland et al. "Statistical Tests, P-values, Confidence Intervals, and Power: A Guide to Misinterpretations." Eur J Epidemiol. 2016 Apr;31(4):337-50.

Slide24

The p-value is the probability that the null hypothesis is true.

No! The p-value simply indicates the degree to which the data conform to the pattern predicted by the null hypothesis and all the other assumptions used in the test (the underlying statistical model).

Slide25

Backwards

The absurdity of the common backwards interpretation might be appreciated by pondering how the p-value, which is a probability deduced from a set of assumptions, can possibly refer to the probability of those assumptions.

Slide26

True or False?

The p-value is the probability that chance alone produced the observed association. For example, the p-value for the ibuprofen null hypothesis is 0.02, and so there is a 2% probability that chance alone produced the association.

Slide27

The p-value for the null hypothesis is the probability that chance alone produced the observed association.

No! To say this is to assert that every assumption used to compute the p-value is correct, including the null hypothesis.

Slide28

Greenland et al.'s Guide

14 misinterpretations of a single study's p-value(s)
4 misinterpretations of p-values across studies or in subgroups
5 misinterpretations of confidence intervals
2 misinterpretations of power

Slide29

p < .05 means … ?

Ho is false, should be rejected
Ha is true
Scientifically important effect detected
Substantially important relationship demonstrated
Chance of false positive finding is 5%

p < .05 does NOT mean any of these.

Slide30

p > .05 means … ?

Ho is true, should be accepted
Ha is false
Evidence in favor of Ho
There is no effect
The effect size is small

p > .05 does NOT mean any of these.

Slide31

Greenland et al.'s conclusions included:

The probability, likelihood, certainty, etc. for a hypothesis cannot be derived from statistical methods alone.
Significance tests and confidence intervals do not by themselves provide a logically sound basis for concluding an effect is present or absent with a given probability.

Slide32

Not even scientists can easily explain p-values

You can get it right, or you can make it intuitive, but it's all but impossible to do both.

Slide33

ASA: Conclusion

Good statistical practice, as an essential component of good scientific practice, emphasizes: principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting, and proper logical and quantitative understanding of what data summaries mean.

No single index should substitute for scientific reasoning.

Slide34

ASA: Conclusion

Good statistical practice, as an essential component of good scientific practice, emphasizes: principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting, and proper logical and quantitative understanding of what data summaries mean.

No single index should substitute for scientific reasoning.

Slide35

Study Design and Conduct

PICO-T
Bias, Confounding, Contamination
And, eventually, chance

Slide36

Publication Bias
1. Bias of rhetoric
2. All's well literature bias
3. Reference bias
4. Positive results bias
5. Hot stuff bias
6. Pre-publication bias
7. Post-publication bias
8. Sponsorship bias
9. Meta-analysis bias

Selection Bias (susceptibility bias)
1. Popularity bias
2. Centripetal bias
3. Referral filter bias
4. Diagnostic access bias
5. Diagnostic suspicion bias
6. Unmasking bias
7. Mimicry bias
8. Previous opinion bias
9. Wrong sample size bias
10. Admission rate bias (Berkson)
11. Prevalence-incidence bias (Neyman)
12. Diagnostic vogue bias
13. Diagnostic purity bias
14. Procedure selection bias
15. Missing clinical data bias
16. Non-contemporaneous control bias
17. Starting time bias
18. Unacceptable disease bias
19. Migrator bias
20. Membership bias
21. Nonrespondent bias
22. Volunteer bias
23. Allocation bias
24. Vulnerability bias
25. Authorization bias

Exposure Bias (performance bias)
1. Contamination bias
2. Withdrawal bias
3. Compliance bias
4. Therapeutic personality bias
5. Bogus control bias
6. Misclassification bias
7. Proficiency bias

Detection Bias (measurement bias)
1. Insensitive measure bias
2. Underlying cause bias (rumination bias)
3. End-digit preference bias
4. Apprehension bias
5. Unacceptability bias
6. Obsequiousness bias
7. Expectation bias
8. Substitution game bias
9. Family information bias
10. Exposure suspicion bias
11. Recall bias
12. Attention bias
13. Instrument bias
14. Surveillance bias
15. Comorbidity bias
16. Nonspecification bias
17. Verification bias (work-up bias)

Analysis Bias (Transfer Bias)
1. Post-hoc significance bias
2. Data dredging bias
3. Scale degradation bias
4. Tidying-up bias (deliberate elimination bias)
5. Repeated peeks bias

Interpretation Bias
1. Mistaken identity bias
2. Cognitive dissonance bias
3. Magnitude bias
4. Significance bias
5. Correlation bias
6. Under-exhaustion bias

The Dunning-Kruger effect

Hartman JM, Forsen JW Jr, Wallace MS, Neely JG. "Tutorials in clinical research: part IV: recognizing and controlling bias." Laryngoscope. 2002 Jan;112(1):23-31.
Expanded from: Sackett DL. "Bias in analytic research." J Chronic Dis. 1979;32(1-2):51-63.

Slide37

Cognitive Bias Codex

Slide38

ASA: Conclusion

Good statistical practice, as an essential component of good scientific practice, emphasizes: principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting, and proper logical and quantitative understanding of what data summaries mean.

No single index should substitute for scientific reasoning.

Slide39

Context

David Moore: "Data are numbers, but they are not 'just numbers.' They are numbers with a context."

Moore and Notz 2006, Statistics: Concepts and Controversies, NY: Freeman, p xxi

Slide40

Context

"Tonight we're going to let the statistics speak for themselves."

Ed Koren, © The New Yorker, 9 December 1974

Slide41

ASA: Conclusion

Good statistical practice, as an essential component of good scientific practice, emphasizes: principles of good study design and conduct, a variety of numerical and graphical summaries of data, understanding of the phenomenon under study, interpretation of results in context, complete reporting, and proper logical and quantitative understanding of what data summaries mean.

No single index should substitute for scientific reasoning.

Slide42

Words Matter

CONSORT 2010
How to Report Statistics in Medicine
AMA Manual of Style
Moore and Notz 2006, Statistics: Concepts and Controversies, NY: Freeman, p xxi

Slide43

CONsolidated Standards of Reporting Trials

The CONSORT Statement comprises a 25-item checklist and a flow diagram. The checklist items focus on reporting how the trial was designed, analysed, and interpreted; the flow diagram displays the progress of all participants through the trial. The CONSORT "Explanation and Elaboration" document explains and illustrates the principles underlying the CONSORT Statement.

www.consort-statement.org

Slide44

Specialized CONSORT

Harms (safety)
Non-inferiority
Cluster randomized trials
Herbal, Acupuncture
Non-pharmacologic agents
Pragmatic trials
Parent reported outcomes
N-of-1 trials
Orthodontic trials
Pilot and feasibility trials

Slide45

EQUATOR: Enhancing the QUAlity and Transparency Of health Research

STROBE – Observational studies
PRISMA – Systematic reviews
CARE – Case reports
SRQR – Qualitative research
STARD – Diagnostic/prognostic studies
SQUIRE – Quality improvement studies
… a total of 358 reporting guidelines

http://www.equator-network.org/

Slide46

Dedication

Lang: To anyone who has encountered the frustration of what I call "Statistical Buddhism":
To those who know, no explanation is necessary.
To those who do not know, no explanation is possible.

Slide47

Everitt BS. The Cambridge Dictionary of Statistics in the Medical Sciences. Cambridge, England: Cambridge University Press; 1995.

Glossary
P value: probability of obtaining the observed data (or data that are more extreme) if the null hypothesis were exactly true.
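In symbols, with T a test statistic that measures discrepancy from the null and t_obs its observed value, this definition reads roughly as follows (a sketch of the usual formulation, not taken from the slide):

```latex
P = \Pr\left( T \geq t_{\mathrm{obs}} \,\middle|\, H_0 \text{ and all other modeling assumptions hold} \right)
```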

www.amamanualofstyle.com

Slide48

Al's Conclusion

Good statistical practice is an essential component of good scientific practice.
Data are information in context. Insist on a full and complete description of the context of a study.
A p-value is calculated from a set of numbers encased in certain assumptions. Viewed alone, the p-value may be meaningless.

No single index can substitute for scientific reasoning.

Slide49

Thank you

Slide50

ASA: Six Principles

1. P-values can indicate how incompatible the data are with a specified statistical model.
2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
4. Proper inference requires full reporting and transparency.
5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.

Slide51

George Cobb, Looking Ahead: Five Imperatives

George Cobb (2015). Mere Renovation is Too Little Too Late: We Need to Rethink our Undergraduate Curriculum from the Ground Up. The American Statistician, 69:4, 266-282. DOI: 10.1080/00031305.2015.1093029

Flatten prerequisites
  Calc I → Calc II → Calc III → Probability → Math Stat → Biostatistics
Strip away technical formalism and formulas
Embrace computation
Exploit context
  Interpretation, motivation, direction
Teach through research
