Slide1
Big Data Opportunities and Challenges in Human Disease Genetics & Genomics
Manolis Kellis
MIT Computer Science & Artificial Intelligence Laboratory
Broad Institute of MIT and Harvard
Slide2
Big data Opportunities & Challenges
in human disease genetics & genomics
The goal: Mechanistic basis of human disease
Epigenomics: Enhancers, networks, regulators, motifs
Genetics: GWAS, QTLs, molecular epidemiology
The challenges / opportunities:
Effects are very small, huge number of hypotheses
Much larger cohorts are needed, consent limitations
Technologies for privacy vs. excuse for data hoarding
Overcoming the challenges:
Case study: Schizophrenia, Alzheimer’s
Collaboration & sharing: personal & technological
Slide3
Bridging the knowledge gap from genetics to disease
[Figure: from genetic variant (CATGACTG vs. CATGCCTG) and environment to disease. Panel labels: control regions and chromatin states (promoter, enhancer, insulator, silencer); circuitry (factors, target genes, protein, miRNA, ncRNA, TIMP3); tissue / cell type (retina, heart, cortex, lung, blood, skin, nerve); intermediate effects (lipids, tension, eye drusen, metabolism, drug response).]
Requires: systematic understanding of genome function
Slide4
The most complete map of human gene regulation
2.3M regulatory elements across 127 tissue/cell types
High-resolution map of individual regulatory motifs
Circuitry: regulators, regions, motifs, target genes
Slide5
Non-coding variants lie in tissue-specific regulatory regions
Yield new insights on relevant tissues and pathways
Enable linking non-coding elements to relevant target genes
Provide a mechanistic basis for developing therapeutics
Slide6
Control regions harbor 1000s of weak-effect disease SNPs
GWAS top hits explain only a small fraction of trait heritability
Functional enrichments well past genome-wide significance
Slide7
Bayesian integration of weak effects: disease modules
[Figure labels: poorly ranked SNP nearby; highly ranked SNP nearby]
MAZ: no direct association, but clusters with many T1D hits
MAZ is indeed a known regulator of insulin expression
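One way to picture how weak evidence can be pooled is a simple Bayesian update in odds form. The sketch below is an illustration only, not the model behind this slide: the function name `posterior_probability`, the 1% prior, and the Bayes factors are all hypothetical, and the lines of evidence are assumed to be independent. It shows how a gene with no strong association of its own could still reach a high posterior when several weak signals point the same way.

```python
from math import prod

def posterior_probability(prior, bayes_factors):
    """Combine independent pieces of evidence in odds form.

    prior         -- prior probability that the gene is disease-relevant
    bayes_factors -- one likelihood ratio per line of evidence, assumed
                     independent (values > 1 support relevance)
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(bayes_factors)
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical numbers: a 1% prior and weak signals that are each only
# 3x more likely under "disease-relevant" than under "not relevant".
single = posterior_probability(0.01, [3.0])
combined = posterior_probability(0.01, [3.0] * 5)
print(f"one weak signal:   {single:.3f}")
print(f"five weak signals: {combined:.3f}")
```

The independence assumption is the weak point of this sketch: correlated evidence, such as SNPs in linkage disequilibrium, would be double counted and would need an explicit joint model.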
[Figure legend: disease gene, genetic association, disease SNP]
Slide8
Brain methylation changes in Alzheimer’s patients
Variation in methylation patterns largely genotype driven
Global signature of repression in 1000s of regulatory regions: hypermethylation, enhancer states, brain regulator targets
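To make "genotype driven" concrete, a common way to test one SNP against one methylation probe is to regress the methylation level on genotype dosage and report the variance explained (R²). The sketch below assumes additive 0/1/2 dosage coding and a one-SNP linear model with no covariates. The function name and the toy data are hypothetical, and the analysis behind the slide may differ.

```python
def variance_explained(dosages, methylation):
    """R^2 of a simple linear regression of methylation on genotype dosage.

    With a single predictor, R^2 equals the squared Pearson correlation.
    """
    n = len(dosages)
    if n != len(methylation) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_g = sum(dosages) / n
    mean_m = sum(methylation) / n
    sxy = sum((g - mean_g) * (m - mean_m) for g, m in zip(dosages, methylation))
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((m - mean_m) ** 2 for m in methylation)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP or constant probe: nothing to explain
    return (sxy * sxy) / (sxx * syy)

# Toy data (hypothetical): 6 individuals, dosage = copies of the alternate allele.
dosages = [0, 0, 1, 1, 2, 2]
methylation = [0.20, 0.30, 0.35, 0.45, 0.50, 0.60]
r2 = variance_explained(dosages, methylation)

# Upper bound on the scan size implied by the slide's dimensions
# (every SNP against every probe; a cis-only scan would test far fewer pairs).
n_tests = 1_000_000 * 450_000
print(f"R^2 = {r2:.3f}, pairwise tests = {n_tests:.2e}")
```

The pair count illustrates why the number of hypotheses is a central challenge: any per-test threshold has to account for hundreds of billions of possible SNP-probe pairs.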
[Figure: Genotype (1M SNPs x 700 individuals); Methylation (450k probes x 700 individuals); reference chromatin states; dorsolateral PFC]
MAP (Memory and Aging Project) + ROS (Religious Order Study)
Slide9
Big data Opportunities & Challenges
in human disease genetics & genomics
The goal: Mechanistic basis of human disease
Epigenomics: Enhancers, networks, regulators, motifs
Genetics: GWAS, QTLs, molecular epidemiology
The challenges / opportunities:
Effects are very small, huge number of hypotheses
Much larger cohorts are needed, consent limitations
Technologies for privacy vs. excuse for data hoarding
Overcoming the challenges:
Case study: Schizophrenia, Alzheimer’s
Collaboration & sharing: personal & technological
Slide10
Slide11
Scaling of QTL discovery power with sample size
Number of meQTLs continues to increase linearly
Weak-effect meQTLs: median R² < 0.1 after 400 individuals
Slide12
Inflection point in complex trait GWAS
[Figure labels: WCPG Hamburg 2012 (~65K); Freeze Jan. 2013 (~70K); Incl. SWE + CLOZUK (~60K); Freeze May 2013 (~80K); Incl. replication (~100K)]
Slide13
Schizophrenia GWAS: Number of significant loci
3,500 cases: 0 loci
10,000 cases: 5 loci
35,000 cases: 62 loci!
Slide14
Similar inflection point found in every complex trait!
Significantly associated regions (p < 5e-08)

        Adult height      Crohn’s           Schizophrenia
        (per 5000/5000)   (per 1000/1000)   (per 3000/3000)
1x      0                 2                 1
2x      2                 4                 2
3x      7                 5                 6
9x      68                51                62
18x     180               -                 -
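The inflection point noted above is consistent with how statistical power behaves at the genome-wide threshold. The sketch below is a rough illustration under stated assumptions, not a calculation from the talk. It treats p < 5e-08 as the conventional Bonferroni correction of 0.05 for about one million independent tests, and it approximates a 1-degree-of-freedom association test by a normal statistic whose noncentrality grows as n·R²/(1−R²). The function name `approx_power` and the effect size R² = 0.0005 are hypothetical.

```python
from statistics import NormalDist

ALPHA = 0.05 / 1_000_000  # Bonferroni for ~1M independent tests -> 5e-08

def approx_power(n, r2, alpha=ALPHA):
    """Approximate power of a two-sided 1-df test for a variant explaining
    a fraction r2 of trait variance in n individuals (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = (n * r2 / (1.0 - r2)) ** 0.5  # sqrt of the noncentrality parameter
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical weak-effect variant explaining 0.05% of trait variance.
for n in (5_000, 20_000, 60_000, 150_000):
    print(f"n = {n:>7,}: power ~ {approx_power(n, 0.0005):.3f}")
```

Under these assumptions power stays near zero until the sample is large enough for the shift to approach the critical value, then rises steeply. With many variants of similar weak effect, that would produce few or no loci at small sample sizes and a rapid increase once cohorts pass the threshold.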
Same story in:
Type 1 diabetes
Type 2 diabetes
Serum cholesterol level
Every common chronic disease
Proof that Schizophrenia is a heritable, medical disorder
Genetic architecture similar to non-brain diseases and traits
Many genes: recognition of key pathways and processes
Voltage-gated calcium channels (CACNA1C, CACNA1D, CACNA1I, CACNB2)
Proteins interacting with FMRP, fragile X gene
Neuron organization: Postsynaptic density, dendritic spine heads
Enhancers: brain (angular gyrus, inferior temporal lobe), immune
Larger samples lead to new biological insights
Slide15
Big data Opportunities & Challenges
in human disease genetics & genomics
The goal: Mechanistic basis of human disease
Epigenomics: Enhancers, networks, regulators, motifs
Genetics: GWAS, QTLs, molecular epidemiology
The challenges / opportunities:
Effects are very small, huge number of hypotheses
Much larger cohorts are needed, consent limitations
Technologies for privacy vs. excuse for data hoarding
Overcoming the challenges:
Collaboration, consortia, sharing of datasets
Case study: Schizophrenia, Alzheimer’s