Slide1
Comparing Margins of Multivariate Binary Data
Bernhard Klingenberg
Assoc. Prof. of Statistics
Williams College, MA
www.williams.edu/~bklingen
Slide2
Outline
Challenges:
  Associations of various degrees among binary variables
  Simultaneous inference
  Sparse and/or unbalanced data, test statistics with discrete support
  Asymptotic theory questionable
Setup:
  Two independent groups
  Response: vector of k correlated binary variables (multivariate binary)
Goal:
  Inference about k margins:
    Marginal risk differences
    Marginal risk ratios
Slide3
Motivating Examples
From drug safety or animal toxicity/carcinogenicity studies
Source: http://us.gsk.com/products/assets/us_advair.pdf
Slide4
Source: http://www.pfizer.com/files/products/uspi_lipitor.pdf
Slide5
Example: Adverse events (AEs) from a vaccine trial (flu shot):
> head(Y1)  # ACTIVE Treatment, n1 = 1971
ID HEADACHE PAIN MYALGIA ARTHRALGIA MALAISE FATIGUE CHILLS
 2        1    1       1          1       1       1      1
 4        0    1       1          0       0       1      0
 5        1    0       0          0       0       0      0
 6        1    1       1          1       1       1      1
 7        0    0       0          0       0       1      0
 9        1    0       1          1       1       1      1
> head(Y2)  # PLACEBO Treatment, n2 = 1554
ID HEADACHE PAIN MYALGIA ARTHRALGIA MALAISE FATIGUE CHILLS
 1        0    0       0          0       0       0      0
 3        0    0       0          0       0       0      0
 8        0    0       0          0       1       0      0
10        0    0       0          0       0       0      0
11        0    0       0          0       0       0      0
15        0    0       1          0       0       1      0
Slide6
Notation and Setup
k-dimensional response vectors: Group 1, Group 2
Random sample in each group: Group 1, Group 2
Joint distribution in each group depends on 2^k - 1 parameters: Group 1, Group 2
Slide7
Comparing Margins
Usually only interested in the k margins (Group 1, Group 2).
With just two (k = 2) adverse events:
[Tables: 2x2 cross-classification of Headache (No/Yes) by Pain (No/Yes), one for Group 1 and one for Group 2]
Slide8
Comparing Margins
                     Group1  Group2    Diff
HEADACHE             0.2603  0.2407  0.0196
INJECTION SITE PAIN  0.6088  0.1384  0.4705
MYALGIA              0.2588  0.1088  0.1500
ARTHRALGIA           0.0893  0.0579  0.0314
MALAISE              0.2085  0.1332  0.0753
FATIGUE              0.2476  0.2098  0.0378
CHILLS               0.0928  0.0463  0.0465
Differences in marginal incidence rates between Group 1 (Treatment) and Group 2 (Control)
Slide9
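As a rough illustration of how a table of marginal incidence rates like the one above is obtained from two binary response matrices, here is a minimal R sketch. The trial data are not part of this transcript, so Y1 and Y2 below are simulated stand-ins (independent Bernoulli columns with made-up rates, three adverse events only); sim_binary is a hypothetical helper, not a function from the talk.

```r
# Hypothetical stand-in data: the real Y1, Y2 are not available here.
set.seed(1)
ae <- c("HEADACHE", "PAIN", "MYALGIA")

sim_binary <- function(n, probs) {
  # one column of independent Bernoulli draws per adverse event
  # (the real responses are correlated; this is only for illustration)
  Y <- sapply(probs, function(p) rbinom(n, size = 1, prob = p))
  colnames(Y) <- ae
  Y
}

Y1 <- sim_binary(1971, c(0.26, 0.61, 0.26))  # stand-in for treatment
Y2 <- sim_binary(1554, c(0.24, 0.14, 0.11))  # stand-in for control

p1 <- colMeans(Y1)  # marginal incidence rates, group 1
p2 <- colMeans(Y2)  # marginal incidence rates, group 2
margins <- round(cbind(Group1 = p1, Group2 = p2, Diff = p1 - p2), 4)
print(margins)
```

Each margin is just a column mean of the 0/1 matrix, so the table needs none of the 2^k - 1 joint parameters.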
Family of Tests
j-th Null Hypothesis:
Unrestricted and restricted MLEs:
Slide10
Comparing Margins
Estimates of marginal incidence rates and test statistics comparing Group 1 (Treatment) and Group 2 (Control)
            p-hat1  p-hat2  p-check  p-tilde   Wald  Local  Global
HEADACHE     0.260   0.241    0.252    0.260   1.34   1.33    1.32
PAIN         0.609   0.138    0.401    0.405  33.47  28.29   28.26
MYALGIA      0.259   0.109    0.193    0.210  11.87  11.21   10.85
ARTHRALGIA   0.089   0.058    0.076    0.082   3.59   3.50    3.37
MALAISE      0.209   0.133    0.175    0.196   5.99   5.84    5.60
FATIGUE      0.248   0.210    0.231    0.244   2.66   2.64    2.59
CHILLS       0.093   0.046    0.072    0.085   5.51   5.29    4.93
Slide11
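The Wald and Local columns above can be checked by hand from the four-decimal rates on the earlier slide. The sketch below assumes the usual forms: the Wald statistic with an unpooled standard error, and a score-type statistic with the pooled estimate under the j-th null of equal margins. That p-check is this pooled estimate is an inference from the numbers, not something the transcript states. The Global column relies on p-tilde, whose definition did not survive extraction, so it is not attempted here.

```r
n1 <- 1971; n2 <- 1554
p1 <- c(HEADACHE = 0.2603, PAIN = 0.6088)  # group 1 rates from the table
p2 <- c(HEADACHE = 0.2407, PAIN = 0.1384)  # group 2 rates from the table
d  <- p1 - p2

# Wald: standard error from the unrestricted estimates
se_wald <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
wald    <- d / se_wald

# Score-type: standard error from the pooled estimate (assumed to be p-check)
p_pool      <- (n1 * p1 + n2 * p2) / (n1 + n2)
se_pool     <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
score_local <- d / se_pool

print(round(cbind(p_pool, wald, score_local), 3))
```

For these two rows the results agree with the tabulated p-check, Wald and Local values after rounding.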
Asymptotic Test
Note: Asymptotically, multivariate normal with covariance matrix determined by
Slide12
Asymptotic Test
Correlation Matrix:
> round(cov2cor(Sigma),2)
     d1   d2   d3   d4   d5   d6   d7
d1 1.00 0.04 0.29 0.26 0.38 0.41 0.27
d2      1.00 0.18 0.09 0.08 0.10 0.01
d3           1.00 0.46 0.35 0.36 0.30
d4                1.00 0.33 0.33 0.32
d5                     1.00 0.51 0.44
d6                          1.00 0.37
d7                               1.00
> qmvnorm(0.95, tail="both.tails", corr=cov2cor(Sigma))
$quantile
[1] 2.656222
Slide13
Asymptotic Test
Correlation Matrix:
> round(cov2cor(Sigma),2)
     d1   d2   d3   d4   d5   d6   d7
d1 1.00 0.06 0.33 0.28 0.41 0.41 0.29
d2      1.00 0.28 0.11 0.15 0.12 0.09
d3           1.00 0.46 0.41 0.36 0.35
d4                1.00 0.32 0.34 0.28
d5                     1.00 0.50 0.47
d6                          1.00 0.37
d7                               1.00
> qmvnorm(0.95, tail="both.tails", corr=cov2cor(Sigma))
$quantile
[1] 2.653783
Slide14
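The qmvnorm call above (from the mvtnorm package) returns a two-sided equicoordinate quantile, i.e. the 95% point of the largest absolute coordinate of a multivariate normal vector. A base-R Monte Carlo sketch of the same quantity follows. It rebuilds the correlation matrix from the two-decimal printout directly above (Sigma itself is not in the transcript), so it can only approximate the printed quantile; the seed and the number of draws are arbitrary choices.

```r
set.seed(123)
k <- 7
# Upper triangle of the printed correlation matrix, read row by row
r <- c(0.06, 0.33, 0.28, 0.41, 0.41, 0.29,
             0.28, 0.11, 0.15, 0.12, 0.09,
                   0.46, 0.41, 0.36, 0.35,
                         0.32, 0.34, 0.28,
                               0.50, 0.47,
                                     0.37)
Rho <- diag(k)
Rho[lower.tri(Rho)] <- r          # column-wise lower = row-wise upper triangle
Rho <- Rho + t(Rho) - diag(k)     # symmetrize, keep unit diagonal

n_sim <- 1e5
U <- chol(Rho)                                  # Rho = t(U) %*% U
Z <- matrix(rnorm(n_sim * k), n_sim, k) %*% U   # rows ~ N(0, Rho)
max_abs <- apply(abs(Z), 1, max)                # max_j |Z_j| per draw

c_mc <- quantile(max_abs, 0.95, names = FALSE)  # simulated critical value
print(round(c_mc, 2))
```

The simulated value should land close to the printed 2.65, up to Monte Carlo error and the rounding of the correlations.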
Permutation Approach
When testing, a permutation approach can be used
This assumes the joint distributions are exchangeable (i.e. identical), a much stronger assumption than the null hypothesis itself
Need two extra conditions:
  Sequence of all 0's is as or more likely to occur under Group 2 (Control)
  Sequence of all 1's is as or more likely to occur under Group 1 (Treatment)
Slide15
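A minimal sketch of a permutation reference distribution for the largest absolute margin statistic, obtained by shuffling group labels across whole response vectors. Everything here is illustrative: the data are simulated (a shared subject effect induces correlation among the binary responses), the sample sizes and rates are made up, max_stat and sim_group are hypothetical helpers, and the pooled score-type statistic is one plausible choice rather than necessarily the one used in the talk.

```r
set.seed(42)
n1 <- 200; n2 <- 150

sim_group <- function(n, probs) {
  u <- rnorm(n)  # shared subject-level effect, induces positive correlation
  # 0.6*u + 0.8*e is standard normal, so column j has marginal rate probs[j]
  sapply(probs, function(p) as.integer(0.6 * u + 0.8 * rnorm(n) < qnorm(p)))
}
Y <- rbind(sim_group(n1, c(0.30, 0.50, 0.20)),
           sim_group(n2, c(0.25, 0.20, 0.15)))
g <- rep(c(1, 2), c(n1, n2))  # group labels

max_stat <- function(Y, g) {
  m1 <- sum(g == 1); m2 <- sum(g == 2)
  p1 <- colMeans(Y[g == 1, , drop = FALSE])
  p2 <- colMeans(Y[g == 2, , drop = FALSE])
  pp <- colMeans(Y)  # pooled rate, unchanged by relabeling
  max(abs(p1 - p2) / sqrt(pp * (1 - pp) * (1 / m1 + 1 / m2)))
}

obs    <- max_stat(Y, g)
perm   <- replicate(2000, max_stat(Y, sample(g)))  # permute whole rows' labels
c_perm <- quantile(perm, 0.95, names = FALSE)      # permutation critical value
p_perm <- mean(perm >= obs)                        # p-value for the max statistic
print(c(obs = obs, c_perm = c_perm, p_perm = p_perm))
```

Permuting labels rather than individual columns keeps each subject's response vector intact, so the dependence among the k margins carries over into the reference distribution.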
Permutation vs. Asymptotic
Critical value (α = 0.05):
  c_perm = 2.655
  c_asympt = 2.654
  c_Bonf = 2.690
[Figure: permutation distribution vs. asymptotic distribution]
Slide16
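For reference, the Bonferroni value listed above is consistent with a two-sided standard normal quantile at level α/k with k = 7 margins. A short check in base R (the variable names are illustrative):

```r
alpha <- 0.05
k <- 7  # number of margins (adverse events)

c_unadj <- qnorm(1 - alpha / 2)        # no multiplicity adjustment
c_bonf  <- qnorm(1 - alpha / (2 * k))  # Bonferroni: alpha/k per two-sided test

print(round(c(unadjusted = c_unadj, Bonferroni = c_bonf), 3))
```

The permutation and asymptotic critical values listed above fall between these two numbers, and below the Bonferroni value because they account for the correlation among the margins.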
Family of Tests
Results: Raw and Adjusted P-values
                             asymptotic         exact
             Diff  Global   raw.P   adj.P   raw.P   adj.P
HEADACHE    0.020    1.32  0.1876  0.7061  0.1830  0.7013
PAIN        0.471   28.25  0.0000  0.0000  0.0000  0.0000
MYALGIA     0.150   10.85  0.0000  0.0000  0.0000  0.0000
ARTHRALGIA  0.031    3.37  0.0007  0.0051  0.0005  0.0032
MALAISE     0.075    5.60  0.0000  0.0000  0.0000  0.0000
FATIGUE     0.038    2.59  0.0094  0.0589  0.0082  0.0516
CHILLS      0.047    4.93  0.0000  0.0000  0.0000  0.0000
Slide17
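The asymptotic adjusted p-values above behave like single-step adjustments: for each margin, the probability that the largest absolute coordinate of the limiting multivariate normal exceeds the observed statistic. The sketch below approximates this by simulation in base R. It assumes the two-decimal correlation matrix printed on the second Asymptotic Test slide and the rounded Global statistics, so it will only roughly match the table; whether the talk used exactly this matrix is not stated.

```r
set.seed(2024)
k <- 7
# Upper triangle of the printed correlation matrix, read row by row
r <- c(0.06, 0.33, 0.28, 0.41, 0.41, 0.29,
             0.28, 0.11, 0.15, 0.12, 0.09,
                   0.46, 0.41, 0.36, 0.35,
                         0.32, 0.34, 0.28,
                               0.50, 0.47,
                                     0.37)
Rho <- diag(k)
Rho[lower.tri(Rho)] <- r
Rho <- Rho + t(Rho) - diag(k)

# Rounded "Global" statistics for three of the margins
stat <- c(HEADACHE = 1.32, ARTHRALGIA = 3.37, FATIGUE = 2.59)

Z <- matrix(rnorm(1e5 * k), ncol = k) %*% chol(Rho)  # rows ~ N(0, Rho)
max_abs <- apply(abs(Z), 1, max)

raw_p <- 2 * pnorm(-abs(stat))                                  # two-sided raw p-values
adj_p <- sapply(stat, function(t) mean(max_abs >= abs(t)))     # single-step adjustment
print(round(cbind(raw_p, adj_p), 4))
```

Because the same reference distribution is used for every margin, the adjusted p-values are never smaller than the raw ones and never larger than k times the raw ones, apart from simulation error.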
Simultaneous Confidence Intervals
Invert family of tests:
Confidence Region:
Simplifies to simultaneous confidence intervals if
Slide18
Simultaneous Confidence Intervals
Results: Inverting Score test
              diff       LB      UB
HEADACHE    0.0196  -0.0196  0.0583
PAIN        0.4705   0.4323  0.5069
MYALGIA     0.1500   0.1162  0.1835
ARTHRALGIA  0.0314   0.0078  0.0547
MALAISE     0.0753   0.0416  0.1086
FATIGUE     0.0378  -0.0002  0.0752
CHILLS      0.0465   0.0239  0.0692
Slide19
Simultaneous Confidence Intervals
We used (and recommend) the score statistic
Could use the Wald statistic instead
This is equivalent to fitting a marginal model via GEE: asymptotically multivariate normal, with (sandwich) covariance matrix (same as before)
Use the joint distribution of the test statistics for multiplicity adjustment
Slide20
Simultaneous Confidence Intervals
Results: GEE approach (= inverting Wald test)
              diff       LB      UB
HEADACHE    0.0196  -0.0194  0.0586
PAIN        0.4705   0.4331  0.5078
MYALGIA     0.1500   0.1164  0.1836
ARTHRALGIA  0.0314   0.0082  0.0546
MALAISE     0.0753   0.0419  0.1087
FATIGUE     0.0378   0.0001  0.0755
CHILLS      0.0465   0.0241  0.0689
Slide21
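The Wald-type limits above have the familiar form estimate ± critical value × standard error, with the multiplicity-adjusted critical value in place of 1.96. The sketch below recomputes three rows from the four-decimal rates on the earlier slide. It assumes the unpooled binomial standard error and the first quantile printed by qmvnorm (2.656222); that this is the value behind the table is an inference from the numbers, not a statement in the transcript.

```r
n1 <- 1971; n2 <- 1554
p1 <- c(HEADACHE = 0.2603, PAIN = 0.6088, MYALGIA = 0.2588)  # group 1 rates
p2 <- c(HEADACHE = 0.2407, PAIN = 0.1384, MYALGIA = 0.1088)  # group 2 rates

c_crit <- 2.656222  # equicoordinate quantile from the first qmvnorm printout (assumed)

d  <- p1 - p2
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # Wald standard error

ci <- cbind(diff = d, LB = d - c_crit * se, UB = d + c_crit * se)
print(round(ci, 4))
```

For these rows the limits agree with the tabulated ones to about the fourth decimal; the small discrepancies come from working with rounded rates.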