Correcting for Sample Overlap in Association Meta-analysis Using Summary Statistics


Uploaded On 2023-10-25




Presentation Transcript

1. Correcting for Sample Overlap in Association Meta-analysis Using Summary Statistics
Sebanti Sengupta
11/15/2017

2. Background
- Meta-analysis is an important strategy for genetic association studies
- Increases sample size and power
- Can lead to discovery of novel loci
- Can use previously published study results
- What happens if different studies in the meta-analysis have samples in common?
- We wish to estimate the ‘bottom-line’ p-value that corrects for sample overlap
- Use summary statistics (Z-score, allele frequency, sample size)


5. Standard Meta-analysis Method
- Usual way to combine estimates from different studies: a weighted combination of the per-study estimates [formula not preserved in the transcript]
- The formula involves, for the i-th study, the estimator of interest (e.g. Z-score or log odds ratio), the corresponding estimated variance, and the weight used for that study in the meta-analysis
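As an illustration of the kind of weighted combination described on this slide, here is a minimal Python sketch of a sample-size-weighted Z-score meta-analysis. The weighting `w_i = sqrt(N_i)` and the function names `combine_z` and `two_sided_p` are assumptions chosen for the example, not taken from the slides.

```python
import math

def combine_z(z_scores, sample_sizes):
    """Sample-size-weighted combination of per-study Z-scores.

    Assumes the studies are independent, so the variance of the weighted
    sum is simply the sum of the squared weights.
    """
    # Assumed weighting for this sketch: w_i = sqrt(N_i).
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator

def two_sided_p(z):
    """Two-sided normal p-value for a Z-score."""
    return math.erfc(abs(z) / math.sqrt(2))

# Two independent studies of equal size, each with Z = 2.
z_meta = combine_z([2.0, 2.0], [1000, 1000])
p_meta = two_sided_p(z_meta)
```

With equal sample sizes the combined Z is the sum of the Z-scores divided by the square root of the number of studies, which is only valid when the studies share no samples.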

6. Standard Meta-analysis Method
- The usual way to combine estimates from different studies treats the estimates as independent
- Not true if samples overlap


8. Standard Meta-analysis Method
- The usual way to combine estimates from different studies treats the estimates as independent
- Not true if samples overlap
- Idea: estimate the covariance term using summary statistics
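The covariance term mentioned here can be made explicit. The slide's own formula did not survive in the transcript, so the notation below is an assumption: $\hat\beta_i$ is the estimate from study $i$, $v_i$ its estimated variance, and $w_i$ its weight. For a weighted average of possibly correlated estimates:

```latex
\hat\beta = \frac{\sum_i w_i \hat\beta_i}{\sum_i w_i},
\qquad
\operatorname{Var}(\hat\beta)
  = \frac{\sum_i w_i^2 v_i
          + 2 \sum_{i<j} w_i w_j \operatorname{Cov}(\hat\beta_i, \hat\beta_j)}
         {\left(\sum_i w_i\right)^2}
```

With independent studies every covariance is zero and the second sum vanishes, which is the case the standard method assumes. With overlapping samples the covariances are typically positive, so ignoring them understates the variance.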

9. Method
- Assume that samples are homogeneous, that is, belong to the same ancestry
- Allele frequencies and effect sizes are the same in all samples
- They do not vary with overlap

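10. Stratifying Markers by Sample Size
- The overlap number for a marker M may differ depending on whether it is present in the overlapping samples or not
- Possible combinations for a marker M and the resulting overlap number [table not preserved in the transcript; surviving fragments: "0", "0", and "where sample size of Cohort 2 is", with the value missing]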

11. Estimating Covariance
- Suppose the trait is independent of all markers; then the sample correlation of the Z-scores can be used to estimate the covariance
- Trait-associated loci are expected to show correlation even when samples are independent
- Use Z-scores truncated at some cut-off value to estimate the correlation
- Most results shown use a cut-off of 1
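A rough Python sketch of the truncation idea follows. It keeps only markers whose Z-scores fall below the cut-off in absolute value in both studies and computes a plain Pearson correlation on them. This is a simplifying assumption: the raw correlation of truncated values tends to be pulled toward zero, and the presentation may instead model the truncation explicitly. The helper names `pearson` and `truncated_z_correlation` are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def truncated_z_correlation(z1, z2, cutoff=1.0):
    """Correlation over markers with |Z| < cutoff in both studies.

    Dropping large |Z| removes most trait-associated markers, which would
    otherwise induce correlation even between independent studies.
    """
    kept = [(a, b) for a, b in zip(z1, z2)
            if abs(a) < cutoff and abs(b) < cutoff]
    if len(kept) < 3:
        raise ValueError("too few markers below the cut-off")
    xs, ys = zip(*kept)
    return pearson(xs, ys)

# Simulated null Z-scores for two studies with correlation 0.5.
random.seed(1)
rho = 0.5
z_a = [random.gauss(0, 1) for _ in range(20000)]
z_b = [rho * a + math.sqrt(1 - rho ** 2) * random.gauss(0, 1) for a in z_a]
r_truncated = truncated_z_correlation(z_a, z_b, cutoff=1.0)
r_full = pearson(z_a, z_b)
```

In this simulation `r_truncated` comes out positive but smaller than `r_full`, which is why a likelihood that accounts for the truncation would be preferable in practice.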

12. Correcting for Overlap
- Meta-analysis is done by adjusting the covariance term in the weights
- The updated weights to correct for overlap are as follows: [formula not preserved in the transcript]
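Since the updated weights were not preserved, the following Python sketch shows only one simple way a covariance term can enter a two-study Z-score meta-analysis. It is an assumption, not the presentation's formula: it keeps the weights `sqrt(N)` and rescales the combined statistic by a variance that includes the correlation `r` between the two Z-scores. The function name `combine_z_with_overlap` is hypothetical.

```python
import math

def combine_z_with_overlap(z1, z2, n1, n2, r):
    """Combine two Z-scores whose null correlation is r.

    Var(w1*Z1 + w2*Z2) = w1^2 + w2^2 + 2*w1*w2*r for unit-variance Z-scores,
    so dividing by its square root gives a unit-variance combined statistic.
    """
    w1, w2 = math.sqrt(n1), math.sqrt(n2)  # assumed sample-size weights
    variance = w1 * w1 + w2 * w2 + 2.0 * w1 * w2 * r
    return (w1 * z1 + w2 * z2) / math.sqrt(variance)

# No overlap (r = 0) reduces to the standard combination.
z_independent = combine_z_with_overlap(2.0, 2.0, 1000, 1000, 0.0)
# Complete overlap (r = 1) of the same study returns the original Z.
z_identical = combine_z_with_overlap(2.0, 2.0, 1000, 1000, 1.0)
```

The two limiting cases are a useful sanity check: with `r = 0` nothing changes, and with `r = 1` combining a study with itself adds no evidence.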

13. Effective Sample Size of Overlap
- We estimate the effective sample size of overlap as: [formula not preserved in the transcript]
- Note that this need not be the actual number of samples overlapping
- E.g. for case-control studies, the estimated overlap may correspond to a range of overlap numbers, depending on what proportion of cases and controls overlap respectively
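One commonly used relationship, given here as an assumption because the slide's formula is missing, is that for two studies of sizes N1 and N2 sharing N0 samples, the null correlation of their Z-scores is roughly N0 / sqrt(N1 * N2). Inverting it gives an effective overlap size. The function name `effective_overlap` is hypothetical.

```python
import math

def effective_overlap(r, n1, n2):
    """Effective number of shared samples implied by Z-score correlation r.

    Assumes r is approximately n_overlap / sqrt(n1 * n2), which holds for a
    quantitative trait under the null; for case-control data the result is
    an effective size rather than a head count.
    """
    return r * math.sqrt(n1 * n2)

# A study of 2,209 samples combined with itself (r = 1) overlaps completely.
n_shared = effective_overlap(1.0, 2209, 2209)
```

This agrees with the B+B rows of the overlap tables in this deck, where a cohort meta-analyzed with itself has an estimated overlap equal to its own sample size.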

14. Creating Overlapping Datasets: Case-Control Study (T2D)
- 3 European studies from DIAMANTE: FUSION, METSIM and MGI
- True combined sample size 25,240

15. Estimated Overlap Stratified by Sample Size
[Figure: count of markers by observed sample size]

N        Category   Estimated Overlap   CI
4,418    B+B        2,209               (2,187, 2,209)
10,946   A+B+B      2,225               (2,175, 2,262)
20,921   B+B+C      2,656               (2,540, 2,765)
23,031   A+C        176                 (52, 311)
27,449   A+B+B+C    2,326               (2,238, 2,365)

Z-scores truncated at cutoff = 1

16. Meta-Analysis Results
[Figure: Meta-analysis without Correcting for Overlap. -log10 naive meta-analysis p-value plotted against -log10 target p-value]

17. Meta-Analysis Results
[Figure, left panel: Meta-analysis without Correcting for Overlap. -log10 naive meta-analysis p-value plotted against -log10 target p-value]
[Figure, right panel: Meta-analysis Correcting for Overlap. -log10 new method p-value plotted against -log10 target p-value]

18. Creating Overlapped Datasets: Quantitative Trait
- GLGC European cohorts, trait HDL-cholesterol
- True combined sample size 15,579

19. Estimated Overlap Stratified by Sample Size
[Figure: count of markers by observed sample size]

Sample Size   Category   Estimated Overlap   CI
4,970         B+B        2,485               (2,478, 2,485)
10,223        B+B+C      2,409               (2,324, 2,500)
12,811        A+B+B      2,448               (2,305, 2,583)
13,094        A+C        0                   -
18,064        A+B+B+C    2,546               (2,458, 2,592)

Z-scores truncated at cutoff = 1

20. Meta-analysis Results
[Figure: Meta-analysis without Correcting for Overlap. -log10 naive meta-analysis p-value plotted against -log10 target p-value]

21. Meta-analysis Results
[Figure, left panel: Meta-analysis without Correcting for Overlap. -log10 naive meta-analysis p-value plotted against -log10 target p-value]
[Figure, right panel: Meta-analysis Correcting for Overlap. -log10 weight-stratified adjusted p-value plotted against -log10 target p-value]

22. Meta-Analysis with Multiple Studies
- For multiple studies, meta-analysis is conducted sequentially
- When meta-analyzing a pair of studies, for each marker we calculate a total weight for the combined result [formula not preserved in the transcript]
- This total weight is used when meta-analyzing the combined Z with a new study

23. In Summary
- When different studies in a meta-analysis have overlapping samples, the standard methods lead to an inflation in type I error
- If overlapping sample sizes are unknown, the effective overlap sample size can be estimated
- This assumes samples belong to the same ancestry
- Actual overlap numbers may differ depending on the overlap pattern
- Covariance estimated using truncated Z-scores is used to correct for overlap in meta-analysis
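The type I error inflation can be illustrated with a small null simulation in Python. This is a sketch under stated assumptions, not the presentation's analysis: two equally sized studies whose null Z-scores have correlation `r`, a naive combination that ignores `r`, and a combination rescaled by the variance that includes it. The function name `simulate_type1` and all parameter values are hypothetical.

```python
import math
import random

def simulate_type1(r, n_markers=20000, seed=0):
    """Empirical type I error at the 5% level, naive vs. rescaled.

    Returns (naive_rate, corrected_rate) for null markers whose two
    study-level Z-scores have correlation r.
    """
    rng = random.Random(seed)
    z_crit = 1.959963984540054  # two-sided 5% critical value
    naive_hits = 0
    corrected_hits = 0
    for _ in range(n_markers):
        z1 = rng.gauss(0, 1)
        z2 = r * z1 + math.sqrt(1 - r * r) * rng.gauss(0, 1)
        naive = (z1 + z2) / math.sqrt(2)              # assumes independence
        corrected = (z1 + z2) / math.sqrt(2 + 2 * r)  # includes covariance
        naive_hits += abs(naive) > z_crit
        corrected_hits += abs(corrected) > z_crit
    return naive_hits / n_markers, corrected_hits / n_markers

naive_rate, corrected_rate = simulate_type1(r=0.3)
```

With `r = 0.3` the naive statistic has variance 1.3 rather than 1, so its rejection rate lands well above the nominal 5%, while the rescaled statistic stays close to 5%.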

24. Acknowledgements
- Goncalo Abecasis
- Michael Boehnke
- Daniel Taliun

25. Thanks!

26. Estimated Overlap by Varying Cutoff Values: T2D
[Figure, two panels (N = 23,031 and N = 27,449): estimated effective sample size of overlap across cut-off values]

27. Estimated Overlap for Varying Cut-off Values: GLGC
[Figure, two panels (N = 13,094 and N = 18,064): estimated effective sample size of overlap across cut-off values]