Type I / Type II Error Control in Drug Development


Presentation Transcript

Slide 1

Type I / Type II Error Control in Drug Development

Andy Grieve, Head of Centre of Excellence for Statistical Innovation, UCB Pharma, UK. PSI Annual Conference, London, 14-17 May 2017.

Slide 2

Pharmaceutical Statistics Papers: 2002/17

[Montage of the author's papers, dated 2002-2017]

Linking theme: control of error rates.

Slide 3

Choosing Type I and Type II Errors

Slide 4

Can the pharmaceutical industry reduce attrition rates?

Kola & Landis (2004), Nature Reviews Drug Discovery.

Slide 5

Are Regulators Conservative (with a small c)?

Slide 6

False Positive Rate as a Function of US Prevalence and Severity

To appear in Journal of Econometrics.

Slide 7

Determination of Sample Size

[Figure: two normal densities plotted against the standardised normal deviate (-6 to 8), showing the critical region and the resulting type I and type II error areas]
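The picture corresponds to the usual normal-theory sample-size calculation. A minimal sketch (the α, β, δ and σ values below are illustrative, not taken from the slide):

```python
import math
from scipy.stats import norm

def sample_size(alpha, beta, delta, sigma):
    """n for a one-sided z-test of H0: mu = 0 vs H1: mu = delta
    with type I error alpha and type II error at most beta."""
    z_alpha = norm.ppf(1 - alpha)  # deviate cutting off alpha under H0
    z_beta = norm.ppf(1 - beta)    # deviate corresponding to the power
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Illustrative values: one-sided alpha = 0.025, 90% power,
# effect delta = 2 points on a scale with SD sigma = 10.
print(sample_size(alpha=0.025, beta=0.10, delta=2.0, sigma=10.0))  # 263
```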

Slide 8

Neyman and Pearson (Phil Trans Roy Soc, Series A, 1933):

“If we reject H0, we may reject it when it is true; if we accept H0, we may be accepting it when it is false, that is to say, when really some alternative Ht is true. These two sources of error can rarely be eliminated completely; in some cases it will be more important to avoid the first, in others the second. We are reminded of the old problem considered by Laplace of the number of votes in a court of judges that should be needed to convict a prisoner. Is it more serious to convict an innocent man or to acquit a guilty? That will depend upon the consequences of the error; is the punishment death or fine; what is the danger to the community of released criminals; what are the current ethical views on punishment? From the point of view of mathematical theory all that we can do is to show how the risk of the errors may be controlled and minimised. The use of these statistical tools in any given case, in determining just how the balance should be struck, must be left to the investigator.”

Slide 9

2012 Ecology Papers on Significance Testing

Significance, June 2012, 29-30.

PLoS ONE, 7, e32734, 2012.

Slide 10

Alternative to Maximizing Power for Fixed Type-I Error – Mudge et al

Minimise ωα + β, where ω is the ratio of the costs of making the corresponding errors.

Mudge et al also consider the case where π0 and π1 are the prior probabilities associated with the null and alternative hypotheses.

“Conventionally the probability of type I error is set at 5% or less…; the precise choice may be influenced by the prior plausibility of the hypothesis under test and the desired impact of the results.”

ICH E9 (1998) - Statistical Principles for Clinical Trials

Slide 11

Alternative to Maximizing Power for Fixed Type-I Error

Choose α and n to minimise the total expected cost.
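The transcript does not preserve the cost function itself; the sketch below is one standard formulation, with hypothetical costs, priors and effect size (none of these numbers are from the slide): total expected cost = per-subject sampling cost plus prior-weighted expected costs of the two errors, minimised jointly over sample size and critical value.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical ingredients - none of these numbers are from the slide.
c_subject = 1.0                 # cost per subject recruited
c_type1, c_type2 = 5e4, 2e4     # costs of a type I / type II error
pi0, pi1 = 0.5, 0.5             # prior probabilities of H0 and H1
delta, sigma = 2.0, 10.0        # effect under H1, SD per observation

def total_expected_cost(n, crit):
    """One-sided z-test with n subjects, rejecting when z > crit."""
    alpha = 1 - norm.cdf(crit)                          # type I error
    beta = norm.cdf(crit - delta * np.sqrt(n) / sigma)  # type II error
    return c_subject * n + c_type1 * pi0 * alpha + c_type2 * pi1 * beta

# Brute-force minimisation over (n, crit).
ns = np.arange(10, 1000)
crits = np.linspace(0.0, 5.0, 251)
cost = np.array([[total_expected_cost(n, c) for c in crits] for n in ns])
i, j = np.unravel_index(cost.argmin(), cost.shape)
print(ns[i], crits[j], 1 - norm.cdf(crits[j]))  # optimal n, cutoff, alpha
```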

Slide 12

How to Test Hypotheses if You Must

- For known σ², how to find the α to minimise ωα + β
- Determining weights leading to standard type I and type II error rates
- Sample sizing based on minimising ωα + β
- Minimising the sum of errors and the Neyman-Pearson Lemma
- The likelihood principle and sampling frames
- Bayesian considerations

12Slide13

Weighted Sum of Error Rates as a Function of the Critical Value (σ = 1, δ0 = 2, n = 21, ω = 3)

[Figure: ωα + β plotted against the critical value, with a unique minimum]
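The curve can be reproduced directly. A sketch, assuming the horizontal axis (lost in the transcript) is the critical value c of a one-sided z-statistic with noncentrality δ0√n/σ under the alternative:

```python
import numpy as np
from scipy.stats import norm

sigma, delta0, n, w = 1.0, 2.0, 21, 3.0  # parameters from the slide
theta = delta0 * np.sqrt(n) / sigma      # noncentrality under H1

c = np.linspace(-6.0, 8.0, 1401)
alpha = 1 - norm.cdf(c)                  # type I error at cutoff c
beta = norm.cdf(c - theta)               # type II error at cutoff c
psi = w * alpha + beta                   # weighted sum of error rates

# The grid minimum agrees with the closed form c* = theta/2 + ln(w)/theta,
# obtained by setting d(psi)/dc = -w*phi(c) + phi(c - theta) to zero.
print(c[psi.argmin()], theta / 2 + np.log(w) / theta)
```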

Slide 14

Optimal Weights Giving Standard Type I and Type II Error Rates

[Table of optimal weights, including the values 3.00 and 1.76]
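Inverting the first-order condition from the sketch above ties the weight to the error rates: ω = exp((za² - zb²)/2), with za, zb the standard normal deviates for the target α and β. A quick check: the (0.025, 0.10) pair reproduces the slide's 3.00; the transcript does not preserve which pairing gives 1.76.

```python
import numpy as np
from scipy.stats import norm

def optimal_weight(alpha, beta):
    """Weight w such that minimising w*alpha + beta over the critical
    value of a one-sided z-test yields exactly these error rates."""
    za, zb = norm.ppf(1 - alpha), norm.ppf(1 - beta)
    return np.exp((za ** 2 - zb ** 2) / 2)

print(optimal_weight(0.025, 0.10))  # ~3.00, matching the slide
print(optimal_weight(0.05, 0.10))   # ~1.70
print(optimal_weight(0.025, 0.20))  # ~4.79
```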

Slide 15

Sample Size Factor to Control the Weighted (ω or ω^-1) Sum of Error Rates to be ≤ Ψ0

[Formula for the implied sample-size factor, compared with the standard (z1-α + z1-β)² factor]
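A sketch of the computation this slide appears to summarise, under the same one-sided z-test set-up: solve for the noncentrality θ at which the minimised weighted sum equals Ψ0; the sample-size factor is then θ², to be compared with the familiar (z1-α + z1-β)².

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def min_weighted_sum(theta, w):
    """Minimum over c of w*alpha(c) + beta(c) for noncentrality theta."""
    c = theta / 2 + np.log(w) / theta  # optimal critical value
    return w * (1 - norm.cdf(c)) + norm.cdf(c - theta)

def sample_size_factor(psi0, w):
    """theta^2 such that the minimised weighted error sum is psi0;
    the sample size is then factor * (sigma / delta)^2."""
    theta = brentq(lambda t: min_weighted_sum(t, w) - psi0, 0.1, 20.0)
    return theta ** 2

print(sample_size_factor(psi0=0.10, w=3.0))
print((norm.ppf(0.975) + norm.ppf(0.90)) ** 2)  # standard factor ~10.51
```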

Slide 16

Alternative Form of Neyman-Pearson Approach

The Neyman-Pearson Lemma (1933) sought a critical region R(x) that maximised the power 1 - β for a given α.

Suppose now we seek a critical region to minimise the weighted average of α and β, with weights π0 and π1. The optimal region rejects when the likelihood ratio exceeds π0/π1.
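The formula lost from the transcript is presumably the standard derivation. Writing R for the rejection region and f0, f1 for the densities under H0 and H1:

```latex
\pi_0\alpha + \pi_1\beta
  = \pi_0 \int_R f_0(x)\,dx + \pi_1\Bigl(1 - \int_R f_1(x)\,dx\Bigr)
  = \pi_1 + \int_R \bigl(\pi_0 f_0(x) - \pi_1 f_1(x)\bigr)\,dx ,
```

so the integral is minimised by taking R = {x : π1 f1(x) > π0 f0(x)}, i.e. rejecting exactly when the likelihood ratio f1(x)/f0(x) exceeds π0/π1.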

Slide 17

Discussion

This is not new: Savage & Lindley, Cornfield (1960s), DeGroot (1970s), Bernardo & Smith (1990s), Pericchi & Pereira (2012, 2013). It also resolves Lindley's paradox.

Cornfield (1966) showed that minimising the weighted errors is also appropriate in sequential (adaptive) trials.

Spiegelhalter, Abrams & Myles (2004) quote Cornfield: “the entire basis for sequential analysis depends upon nothing more profound than a preference for minimizing β for given α rather than minimizing their linear combination. Rarely has so mighty a structure and one so surprising to scientific common sense, rested on so frail a distinction and so delicate a preference.”

Slide 18

Calibration of Bayesian Procedures

Slide 19

Academic Guidelines for Reporting Bayesian Analyses

ROBUST
- Prior distribution: specified; justified; sensitivity analysis
- Analysis: statistical model; analytical technique
- Results: central tendency; SD or credible interval

BAYESWATCH
- Introduction: intervention described; objectives of study
- Methods: design of study; statistical model; prior / loss function? when constructed; prior / loss descriptions; use of software (MCMC: starting values, run-in, length of runs, convergence, diagnostics)
- Results
- Interpretation: posterior distribution summarized; sensitivity analysis if alternative priors used

BASIS
- Research question
- Statistical model: likelihood, structure, prior & rationale
- Computation: software; convergence if MCMC; validation; methods for generating posterior summaries; model checks; sensitivity analysis
- Posterior distribution, summaries used: (i) mean, SD, quantiles; (ii) posterior shape; (iii) joint posterior for multiple comparisons; (iv) Bayes factors
- Results of model checks and sensitivity analyses
- Interpretation of results
- Limitations of analysis

SAMPL
- Prior distribution: specified; justified; sensitivity analysis
- Analysis: statistical model; analytical technique; software
- Results: central tendency; SD or credible interval

What's missing?

Slide 20

“Because of the inherent flexibility in the design of a Bayesian clinical trial, a thorough evaluation of the operating characteristics should be part of the trial design. This includes evaluation of:
- probability of erroneously approving an ineffective or unsafe device (type I error)
- probability of erroneously disapproving a safe and effective device (type II error)
- power (the converse of type II error: the probability of appropriately approving a safe and effective device)
- sample size distribution (and expected sample size)
- prior probability of claims for the device
- if applicable, probability of stopping at each interim look.”

Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials – FDA/CDRH 2010

Slide 21

Requires simulations to assess Bayesian approaches.

If the type I error is too large (a sketch of such a simulation follows below):
- change the success criterion (posterior probability)
- reduce the number of interim analyses
- discount the prior information
- increase the sample size
- alter the calculation of the type I error

“the degree to which we might relax the type I error control is a case-by-case decision that depends … primarily on the confidence we have in prior information”

Guidance for the Use of Bayesian Statistics in Medical Device Clinical Trials – FDA/CDRH 2010
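As an illustration of the kind of simulation the guidance asks for, a minimal sketch with a conjugate normal model and hypothetical numbers: the success criterion is a posterior-probability threshold, and the type I error is estimated by simulating trials under the null.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2017)

# Hypothetical design: n subjects, known sigma, informative prior on the
# treatment effect theta, success if P(theta > 0 | data) > 0.975.
n, sigma = 100, 10.0
m0, s0 = 2.0, 4.0  # prior mean and SD (an optimistic prior)

def success(xbar):
    prec = 1 / s0 ** 2 + n / sigma ** 2  # posterior precision
    post_mean = (m0 / s0 ** 2 + n * xbar / sigma ** 2) / prec
    post_sd = prec ** -0.5
    return 1 - norm.cdf(0, post_mean, post_sd) > 0.975

# Simulate under the null (theta = 0) to estimate the type I error.
xbars = rng.normal(0, sigma / np.sqrt(n), size=100_000)
print(np.mean([success(x) for x in xbars]))  # > 0.025: the prior inflates it
```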

Slide 22

Examples

- Bayesian monitoring of clinical trials
- Use of historical information
- Bayesian Adaptive Randomisation (BAR)
- Planning and conducting simulation studies
- Experimental design in planning simulation experiments
- Analysis and reporting of simulation experiments
- Proof by simulation

Idle Thoughts of a “Well-Calibrated” Bayesian in Clinical Drug Development.

Slide 23

Special Case of Bayesian Monitoring: Single Analysis – No Interims

“requiring strict control of the type-I error results in 100% discounting of the prior information.”

If we require absolute control of the type I error – “perfectly-calibrated” – then we throw away any prior information. Remember the FDA's Bayesian guidance: “it may be appropriate to control the type I error at a less stringent level than when no prior information is used”. The FDA's remark is a recognition of this phenomenon and an endorsement of a less strict control of the type I error – “well-calibrated”.
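In the single-look conjugate normal case the phenomenon has a one-line demonstration. A sketch, reusing the hypothetical design from the simulation above: retuning the posterior-probability threshold until the type I error is exactly 0.025 yields a rule whose rejection region is z > 1.96 whatever the prior, so the prior is discounted entirely.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

n, sigma = 100, 10.0
m0, s0 = 2.0, 4.0                    # same hypothetical prior as above

prec = 1 / s0 ** 2 + n / sigma ** 2  # posterior precision
post_sd = prec ** -0.5

def xbar_cutoff(eta):
    """Smallest sample mean declaring success P(theta > 0 | data) > eta."""
    return (norm.ppf(eta) * post_sd * prec - m0 / s0 ** 2) * sigma ** 2 / n

# "Perfect calibration": tune eta until P(success | theta = 0) = 0.025,
# i.e. until the cutoff equals the frequentist z-test cutoff.
z_test_cutoff = norm.ppf(0.975) * sigma / np.sqrt(n)
eta = brentq(lambda e: xbar_cutoff(e) - z_test_cutoff, 0.5, 0.9999)
print(eta, xbar_cutoff(eta), z_test_cutoff)
# The calibrated rule rejects iff xbar > 1.96*sigma/sqrt(n) - identical to
# the z-test whatever m0 and s0: the prior is 100% discounted.
```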

Slide 24

Accuracy of Simulations

Posch, Maurer and Bretz (SIM, 2011):
- studied an adaptive design with treatment selection at an interim and sample size re-estimation;
- controlled the FWER (familywise error rate) in the strong sense, i.e. under all possible configurations of hypotheses;
- concluded that you have to be careful with the assumptions behind the simulations.

Intriguing point: the choice of seed has an impact on the estimated type I error.

Slide 25

Accuracy of Simulations

Posch, Maurer and Bretz (SIM, 2011):
- Monte Carlo estimates of the type I error rate are not exact – they are subject to random error.
- The choice of the seed of the random number generator impacts the type I error rate estimate.
- A strategy of searching for a seed that minimizes the estimated type I error rate can lead to an underestimation of the type I error rate.

Example: if the type I error rate is estimated in a simulation experiment by the percentage of significant results among 10^4 (10^5) simulated RCTs, then on average the evaluation of only 4 (45) different seeds will ensure finding a simulated type I error rate below 0.025 when the actual error rate is 0.026.

Slide 26

Accuracy of Simulations

Posch, Maurer and Bretz (SIM, 2011):
- If it is important to be able to differentiate between 0.025 and 0.026, then we should power our simulation experiment for it.
- A sample of 10^4 simulated trials has only 10% power to detect HA: 0.026 vs H0: 0.025; 10^5 gives 50%.
- 80% power requires n = 194,000 – search 380 seeds.
- 90% power requires n = 260,000 – search 1,600 seeds.
(These power figures are checked in the sketch below.)
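The quoted numbers can be reproduced with a normal-approximation power calculation for a one-sided binomial test of H0: p = 0.025 against HA: p = 0.026 at level 0.025 (a sketch; the exact test used in the paper may differ slightly):

```python
import numpy as np
from scipy.stats import norm

p0, pA, level = 0.025, 0.026, 0.025

def power(n):
    """Normal-approximation power to detect pA when testing p0."""
    se0 = np.sqrt(p0 * (1 - p0) / n)        # SE under the null
    seA = np.sqrt(pA * (1 - pA) / n)        # SE under the alternative
    crit = p0 + norm.ppf(1 - level) * se0   # rejection cutoff for p-hat
    return 1 - norm.cdf((crit - pA) / seA)

for n in (10_000, 100_000, 194_000, 260_000):
    print(n, round(power(n), 2))  # ~0.10, ~0.50, ~0.80, ~0.90
```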

Slide 27

Average Run Length to Find a “Good Seed”

Slide 28

Determining Decision Criteria

Appropriate approach: choose the decision rule based on clinical or commercial criteria.

Slide 29

ASTIN Trial – Acute Stroke: Dose Effect Curve (Grieve and Krams, Clinical Trials, 2005)

[Figure: dose-effect curve of effect over placebo against dose, with the ED95 marked, an efficacy bound (> 2 pts) and a futility bound (< 1 pt)]

Slide 30

POC Study in Neuropathic Pain

Smith et al (Pharmaceutical Statistics, 2006)

Slide 31

Conclusions: Determining Decision Criteria

Appropriate approach:
- Choose the decision rule based on clinical or commercial criteria.
- Investigate the operating characteristics.
- If they are unacceptable, e.g. a type I error > 20%, then look to change them – “well-calibrated”.
- BUT don't strive to get exact control – “perfectly-calibrated”.