Slide1
Robust and powerful sibpair test for rare variant association
Sebastian Zöllner
University of Michigan
Slide2
Acknowledgements
Matthew Zawistowski
Keng-Han Lin
Mark Reppell
Slide3
Rare Variants – Why Do We Care?
GWAS have been successful.
Only some heritability is explained by common variants.
Uncommon coding variants (MAF 0.5%-5%) explain less.
Rare variants could explain some ‘missing’ heritability.
Better risk prediction.
Rare variants may identify new genes.
Rare exonic variants may be easier to annotate functionally and interpret.
Slide4
Burden/Dispersion Tests
Testing individual variants is unfeasible.
Limited power due to small number of observations.
Multiple testing correction.
Alternative: Joint test.
Burden test (CMAT, Collapsing, WSS)
Dispersion test (SKAT, C-alpha)
Slide5
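The collapsing burden test named on the Burden/Dispersion Tests slide can be sketched as follows. This is a minimal illustration, not the CMAT or WSS implementation: it assumes each individual is given as a list of 0/1 rare-allele indicators for one gene, collapses that list to carrier status, and applies a 2x2 Pearson chi-squared test. The function names and the toy data are hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def collapsing_burden_test(case_genotypes, control_genotypes):
    """Collapse rare variants to carrier / non-carrier per individual and
    compare carrier counts between cases and controls (2x2 Pearson test)."""
    a = sum(1 for g in case_genotypes if any(g))      # carriers among cases
    b = len(case_genotypes) - a                       # non-carriers among cases
    c = sum(1 for g in control_genotypes if any(g))   # carriers among controls
    d = len(control_genotypes) - c                    # non-carriers among controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence either way
    stat = n * (a * d - b * c) ** 2 / denom
    return stat, chi2_sf_1df(stat)


# Toy data (hypothetical): 30 of 100 cases and 10 of 100 controls carry a rare allele.
cases = [[1, 0, 0]] * 30 + [[0, 0, 0]] * 70
controls = [[0, 1, 0]] * 10 + [[0, 0, 0]] * 90
stat, p_value = collapsing_burden_test(cases, controls)
```

Collapsing to carrier status gives up information about which variant is carried, which is why the slide pairs burden tests with dispersion tests such as SKAT.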
Challenges of Rare Variant Analysis
Gene-based tests have low power.
Nelson et al. (2010) estimated that 10,000 cases and 10,000 controls are required for 80% power in half of the genes.
Large sample size required.
More heterogeneous sample => danger of stratification.
Stratification may differ from common variants in magnitude and pattern.
Slide6
Stratification in European Populations
(202 genes, n=900/900, MAF < 1%, nonsense/nonsynonymous variants)
Slide7
Variant Abundance across Populations
[Figure: expected number of variants per kb against sample size. Populations: African-American, Southern Asia, South-Eastern Europe, Finland, South-Western Europe, Northern Europe, Central Europe, Western Europe, Eastern Europe, North-Western Europe]
A gradient in diversity from Southern to Northern Europe.
Slide8
Allele Sharing
Median EU-EU: 0.71
Median EU-EU: 0.86
Median EU-EU: 0.98
Measure of rare variant diversity.
Probability of two carriers of the minor alleles being from different populations (normalized).
Slide9
General Evaluation of Stratification
Select 2 populations.
Select mixing parameter r.
Sample 30 variants from the 202 genes.
Calculate inflation based on observed frequency differences.
Slide10
Inflation by Mixture Proportion
Zawistowski et al. 2014
Slide11
Inflation across Comparisons
Slide12
Family-based Test against Stratification
If multiple affected family members are collected, it may be more powerful to sequence all family members.
Family-based tests can be robust against stratification.
TDT-type tests are potentially inefficient.
How to leverage low frequency?
Low-frequency risk variants should be more common in cases.
And even more common on chromosomes shared among many cases.
Slide13
Family Test
Consider affected sibpairs.
Estimate IBD sharing.
Compare the number of rare variants on shared (solid) and non-shared (blank) chromosomes.
Any aggregate test can be applied.
Figure panels: S=0, S=2, S=1
Slide14
Basic Properties
Twice as many non-shared as shared chromosomes.
Null hypothesis determines test:
Shared alleles : Non-shared alleles = 1:2. Test for linkage or association.
Shared alleles : Non-shared alleles = Shared chromosomes : Non-shared chromosomes. Test for association only.
Slide15
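The 1:2 ratio on the Basic Properties slide follows from counting distinct chromosomes per sibpair. A sketch of that arithmetic, assuming the standard null IBD probabilities of 1/4, 1/2, 1/4 for S = 0, 1, 2; the function names and the excess-sharing example are hypothetical.

```python
from fractions import Fraction


def chromosome_counts(s):
    """Distinct chromosomes in a sibpair with IBD state s:
    s are shared by both sibs, 4 - 2*s are carried by only one sib."""
    return s, 4 - 2 * s


# Null IBD distribution for a sibpair (assumed: no linkage to the trait).
IBD_NULL = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def expected_counts(ibd_probs):
    """Expected number of (shared, non-shared) chromosomes per sibpair."""
    shared = sum(p * chromosome_counts(s)[0] for s, p in ibd_probs.items())
    non_shared = sum(p * chromosome_counts(s)[1] for s, p in ibd_probs.items())
    return shared, non_shared


null_shared, null_non_shared = expected_counts(IBD_NULL)

# Hypothetical excess sharing at a linked locus shifts the ratio away from 1:2.
IBD_LINKED = {0: Fraction(1, 5), 1: Fraction(1, 2), 2: Fraction(3, 10)}
linked_shared, linked_non_shared = expected_counts(IBD_LINKED)
```

Testing against the fixed 1:2 ratio therefore also responds to excess sharing (linkage), whereas testing against the observed chromosome ratio isolates association.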
Haplotypes not required
IBD sharing is known.
Individuals don’t need phase to identify shared variants.
Except one configuration: IBD 1 and both sibs are heterozygous.
Under the null, the probability of configuration 2 is the allele frequency.
Under the alternative, we need to use multiple imputation.
Configuration 1: +1 shared
Configuration 2: +2 non-shared
Slide16
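The statement that configuration 2 has probability equal to the allele frequency under the null can be checked directly. This sketch assumes the three distinct chromosomes of an IBD 1 sibpair carry the variant independently with frequency p; the function names are hypothetical.

```python
from fractions import Fraction


def prob_config2_under_null(p):
    """P(variant is on both non-shared chromosomes | IBD 1, both sibs heterozygous)."""
    config1 = p * (1 - p) ** 2   # shared chromosome carries it, both non-shared do not
    config2 = (1 - p) * p ** 2   # shared chromosome does not, both non-shared do
    return config2 / (config1 + config2)


def expected_double_het_counts(p):
    """Expected (shared, non-shared) variant counts contributed by one
    double-heterozygous IBD 1 sibpair under the null."""
    q = prob_config2_under_null(p)
    return (1 - q) * 1, q * 2


p = Fraction(1, 100)
q = prob_config2_under_null(p)
shared_part, non_shared_part = expected_double_het_counts(p)
```

The ratio simplifies to p because the common factor p(1 - p) cancels, leaving p / ((1 - p) + p).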
Evaluation of Internal Control
Assume chromosome sharing status is known for each sibpair.
Count rare variants; impute sharing status for double-heterozygotes.
Compare number of rare variants between shared and non-shared chromosomes with chi-squared test (burden style).
Figure panels: S=0, S=2, S=1
Slide17
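One way to read the chi-squared comparison on the Evaluation of Internal Control slide is as a goodness-of-fit test of variant counts against the ratio of shared to non-shared chromosomes (the association-only null). The sketch below assumes that reading and treats chromosome counts as fixed; the function name and the counts are hypothetical, not taken from the slides.

```python
import math


def shared_vs_nonshared_test(var_shared, var_non_shared, chrom_shared, chrom_non_shared):
    """1-df chi-squared test that rare variants fall on shared and non-shared
    chromosomes in proportion to the number of such chromosomes."""
    total_var = var_shared + var_non_shared
    total_chrom = chrom_shared + chrom_non_shared
    exp_shared = total_var * chrom_shared / total_chrom
    exp_non_shared = total_var - exp_shared
    stat = ((var_shared - exp_shared) ** 2 / exp_shared
            + (var_non_shared - exp_non_shared) ** 2 / exp_non_shared)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-squared upper tail, 1 df
    return stat, p_value


# Hypothetical counts: 1000 shared and 2000 non-shared chromosomes,
# with 60 rare variants observed on each class.
stat, p_value = shared_vs_nonshared_test(60, 60, 1000, 2000)
```

Any double-heterozygous IBD 1 pairs would first need their sharing status imputed, as the slide notes, before the counts are passed in.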
Classic Case-Control
Selected Cases
Enriching Based on Familial Risk
S=0
S=2
S=1
Internal Control
Slide18
Stratification
Consider 2 populations.
p=0.01 in pop1, p=0.05 in pop2.
1000 sibpairs for internal control design.
1000 cases, 1000 controls for selected cases.
1000 cases and 1000 controls for case-control.
Sample cases from pop1 with proportion .
Test for association with α=0.05.
Slide19
Robust to Population Stratification
Slide20
Evaluating Study Designs
Realistic rare variant models are unknown:
Typical allele frequency
Number of risk variants/gene
Typical effect size
Distribution of effect sizes
Identifiability of risk variants
Goal: Create a model that summarizes these unknowns into
Summed allele frequency
Mean effect size
Variance of effect size
Slide21
Basic Genetic Model
Assume many loci carrying risk variants.
Risk alleles at multiple loci each increase the risk by a factor independently.
Frequency of risk variant:
Independent cases
On shared chromosome
A: Affected
AA: Affected relative pair
R: Risk locus genotype
Slide22
Effect Size Model
Relative risk is sampled from distribution f with mean μ, variance σ².
Simplifications:
Each risk variant occurs only once in the population.
Each risk variant on its own haplotype.
Then the risk in a random case is
A: Affected
r1, r2: Carrier status of chromosome 1, 2
m1, m2: Relative risk of risk variants on 1, 2
μ: Mean effect size
σ²: Variance of effect size
Slide23
Effect in Sib-pairs
To calculate the probability of having an affected sib-pair we condition on sharing S.
For S>0, the probability depends on σ². E.g. (S=2):
AA: Affected relative pair
ri: Carrier status of chromosome i
mi: Relative risk of variant on i
f: Distribution of RR
μ: Mean RR
σ²: Variance of RR
S: Sharing status
Slide24
Analytic Power Analysis
Select μ, σ² and cumulative frequency f.
Calculate allele frequency in cases/controls P(R|A).
Calculate allele frequency in shared/non-shared chromosomes.
=> Non-centrality parameter of χ² distribution.
Slide25
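The last step of the Analytic Power Analysis slide, going from a non-centrality parameter to power, can be sketched for a 1-df test. This assumes a single degree of freedom, where a non-central chi-squared variable is the square of a shifted standard normal; how the non-centrality parameter is derived from μ, σ² and f is not reproduced here, and the function name is hypothetical.

```python
from statistics import NormalDist


def power_chi2_1df(ncp, alpha=0.05):
    """Power of a 1-df chi-squared test with non-centrality parameter ncp.

    Uses the identity that a non-central chi-squared variable with 1 df
    is distributed as (Z + sqrt(ncp))**2 for standard normal Z.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = ncp ** 0.5
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Power over a few illustrative non-centrality values (hypothetical grid).
power_curve = {ncp: power_chi2_1df(ncp) for ncp in (0.0, 2.0, 5.0, 10.0)}
```

With ncp set to 0 the function returns the significance level, which is a quick sanity check on any non-centrality calculation fed into it.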
Minor Allele Frequency
Conventional Case-Control
Internal Control
Selected Cases
Slide26
Power Comparison by Mean Effect Size
Slide27
Power Comparison by Variance
Slide28
Gene-Gene Interaction
Gene-gene interaction affects power in families.
For a broad range of interaction models, consider a two-locus model.
G now has alleles g1, g2. The joint effect is
We compare the effect of while adjusting L and G to maintain marginal risk.
Slide29
Power for Antagonistic Interaction
Slide30
Power for Positive Interaction
Slide31
Conclusions
Stratification is a strong confounder for rare variant tests.
Family-based association methods are robust to stratification.
Comparing rare variants between shared and non-shared chromosomes is substantially more powerful than case-control designs.
All family-based methods/samples depend on the model of gene-gene interaction.
Under antagonistic interaction, power can be lower than in a population sample.
Slide32
Questions?
Thank you for your attention