Slide1
Computational personal genomics: selection, regulation, epigenomics, disease
Manolis Kellis
MIT Computer Science & Artificial Intelligence Laboratory
Broad Institute of MIT and Harvard
Slide2
Recombination breakpoints
Family inheritance
Me vs. my brother
My dad
Dad’s mom
Mom’s dad
Human ancestry
Disease risk
Genomics: regions → mechanisms → drugs
Systems: genes → combinations → pathways
Personal genomics today: 23 and We
Slide3
Goal: A systems-level understanding of genomes and gene regulation:
The regulators: transcription factors, microRNAs, sequence specificities
The regions: enhancers, promoters, and their tissue specificity
The targets: TFs → targets, regulators → enhancers, enhancers → genes
The grammars: interplay of multiple TFs → prediction of gene expression
The parts list = building blocks of gene regulatory networks
[Figure: reference sequence CATGACTG vs. variant CATGCCTG]
Disease-associated variant (SNP/CNV/…)
Gene annotation (coding, 5’/3’ UTR, RNAs)
Evolutionary signatures
Non-coding annotation
Chromatin signatures
Roles in gene/chromatin regulation
Activator/repressor signatures
Other evidence of function
Signatures of selection (sp/pop)
Understanding human variation and human disease
Challenge: from loci to mechanism, pathways, drug targets
Slide4
Compare 29 mammals: Reveal constrained positions
Reveal individual transcription factor binding sites
Within motif instances reveal position-specific bias
More species: motif consensus directly revealed
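To make the idea of constrained positions concrete, here is a minimal Python sketch that scores each column of a toy multiple alignment by information content and flags the highly conserved columns. It is an illustrative simplification, not the method used in the 29-mammals analysis; the function names, the toy sequences, and the 1.5-bit threshold are assumptions made for this example.

```python
import math
from collections import Counter

def column_information(column):
    """Information content in bits of one alignment column (A/C/G/T only; gaps ignored)."""
    bases = [b for b in column.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    n = len(bases)
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(bases).values())
    return 2.0 - entropy  # 2 bits = perfectly conserved, 0 bits = uniform

def constrained_positions(alignment, min_bits=1.5):
    """Indices of columns whose information content is at least min_bits (assumed threshold)."""
    length = len(alignment[0])
    columns = ("".join(seq[i] for seq in alignment) for i in range(length))
    return [i for i, col in enumerate(columns) if column_information(col) >= min_bits]

# Toy alignment (hypothetical sequences): the first three columns are conserved.
toy_alignment = ["ACGT", "ACGA", "ACGC", "ACGG"]
conserved = constrained_positions(toy_alignment)
```

With more aligned species, each column has more observations, which is why adding species sharpens the per-position signal.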
NRSF motif
Slide5
Chromatin state dynamics across nine cell types
Single annotation track for each cell type
Summarize cell-type activity at a glance
Can study 9-cell activity pattern across
Correlated activity
Predicted linking
Slide6
Disease-associated SNPs enriched for enhancers in relevant cell types
E.g. lupus SNP in GM enhancer disrupts Ets1 predicted activator
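One common way to quantify this kind of enrichment is a one-sided hypergeometric test. The sketch below is an assumption about how such a test could be set up, not the statistic used in the talk; the function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(total_snps, snps_in_enhancers, disease_snps, disease_in_enhancers):
    """P(X >= disease_in_enhancers) when disease_snps are drawn at random from total_snps,
    of which snps_in_enhancers overlap enhancers of the cell type of interest."""
    upper = min(snps_in_enhancers, disease_snps)
    tail = sum(
        comb(snps_in_enhancers, k) * comb(total_snps - snps_in_enhancers, disease_snps - k)
        for k in range(disease_in_enhancers, upper + 1)
    )
    return tail / comb(total_snps, disease_snps)

# Toy numbers (made up for illustration): all 5 disease SNPs fall in enhancers.
p_value = hypergeom_enrichment_p(10, 5, 5, 5)
```

A small p-value means the disease SNPs overlap enhancers more often than a random draw from the background SNP set would. A real analysis would also need a matched background set and correction for testing many cell types.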
Revisiting disease-associated variants
Slide7
HaploReg: Automate search for any disease study (compbio.mit.edu/HaploReg)
Start with any list of SNPs or select a GWA study
Mine publicly available ENCODE data for significant hits
Hundreds of assays, dozens of cells, conservation, motifs
Report significant overlaps and link to info/browser
Slide8
Experimental dissection of regulatory motifs for 10,000s of human enhancers
54000+ measurements (x2 cells, 2x repl)
Slide9
Example activator: conserved HNF4 motif match
WT expression specific to HepG2
Non-disruptive changes maintain expression
Motif match disruptions reduce expression to background
Effect of random changes depends on how they alter the motif match
Slide10
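The HNF4 example on Slide 9 contrasts changes that preserve a motif match with changes that disrupt it. A minimal sketch of how such a disruption could be scored computationally is a log-odds comparison against a position weight matrix. The matrix below is a made-up 4-position toy, not the HNF4 motif, and the function names are hypothetical.

```python
import math

# Hypothetical toy motif preferring A, C, G, T at positions 0..3 (not a real HNF4 matrix).
TOY_PWM = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]

def log_odds_score(seq, pwm, background=0.25):
    """Sum of log2(motif probability / background probability) over the positions of seq."""
    return sum(math.log2(pwm[i][base] / background) for i, base in enumerate(seq))

def motif_disruption(wild_type, mutant, pwm):
    """Drop in log-odds score caused by the mutation (positive = weaker motif match)."""
    return log_odds_score(wild_type, pwm) - log_odds_score(mutant, pwm)

# Changing the preferred G at position 2 to T weakens the match.
drop = motif_disruption("ACGT", "ACTT", TOY_PWM)
```

Under this scoring, a change at a position the motif barely constrains yields a drop near zero, while a change at a strongly preferred position yields a large drop.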
Allele-specific chromatin marks: cis-vs-trans effects
Maternal and paternal GM12878 genomes sequenced
Map reads to phased genome, handle SNPs, indels
Correlate activity changes with sequence differences
Slide11
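Once reads are assigned to the maternal or paternal haplotype, as outlined on Slide 10, allelic imbalance at a heterozygous site can be checked with an exact binomial test against a 50/50 expectation. This is a sketch under that assumption; the talk does not specify the test used, and the function name is hypothetical.

```python
from math import comb

def allelic_imbalance_p(maternal_reads, paternal_reads):
    """Two-sided exact binomial p-value for the null that both alleles are equally represented."""
    n = maternal_reads + paternal_reads
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    k = min(maternal_reads, paternal_reads)
    # The null distribution is symmetric, so the two-sided p-value is twice the lower tail.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)

# Toy counts (made up): 10 maternal reads and 0 paternal reads at one site.
p_skewed = allelic_imbalance_p(10, 0)
p_balanced = allelic_imbalance_p(5, 5)
```

In practice, reference-mapping bias can mimic imbalance, which is one reason to map reads to the phased personal genome rather than to a single reference.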
Brain methylation in 750 Alzheimer’s patients/controls
500,000 methylation probes
750 individuals
10+ years of cognitive evaluations, post-mortem brains
93% of functional epigenomic variation is genotype driven!
Global repression in 7,000 enhancers, brain-specific targets
Phil de Jager, Roadmap disease epigenomics
Brad Bernstein, REMC mapping
[Diagram labels: Genome, Epigenome, meQTL, Phenotype, Epigenome, Classification, MWAS, 1, 2]
Slide12
Global hyper-methylation in 1000s of AD-associated loci
Alzheimer’s-associated probes are hypermethylated
480,000 probes, ranked by Alzheimer’s association
[Figure labels: P-value, Methylation, Top 7000 probes]
Global effect across 1000s of probes
Rank all probes by Alzheimer’s association
7000 probes increase methylation (repressed)
Enriched in brain-specific enhancers
Near motifs of brain-specific regulators
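Ranking probes by disease association can be sketched as a per-probe two-group comparison followed by a sort. The example below uses a Welch t-statistic on invented toy values; the actual study's model (for example, covariate adjustment) is not described in the slides, so treat every name and number here as an assumption.

```python
import math
from statistics import mean, variance

def welch_t(cases, controls):
    """Welch t-statistic; positive values mean higher methylation in cases."""
    se = math.sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    if se == 0:
        return 0.0  # simplification: treat a constant probe as unassociated
    return (mean(cases) - mean(controls)) / se

def rank_probes(methylation, is_case):
    """methylation maps probe id -> per-individual values; is_case flags each individual.
    Returns (probe, t) pairs sorted by absolute t, strongest association first."""
    stats = {}
    for probe, values in methylation.items():
        cases = [v for v, c in zip(values, is_case) if c]
        controls = [v for v, c in zip(values, is_case) if not c]
        stats[probe] = welch_t(cases, controls)
    return sorted(stats.items(), key=lambda item: abs(item[1]), reverse=True)

# Toy data (made up): probe_a is hypermethylated in cases, probe_b is not.
toy = {
    "probe_a": [0.80, 0.90, 0.85, 0.20, 0.30, 0.25],
    "probe_b": [0.50, 0.60, 0.40, 0.50, 0.40, 0.60],
}
ranking = rank_probes(toy, [True, True, True, False, False, False])
```

The sign of the statistic for the top-ranked probes indicates the direction of the effect, which is how a global shift toward hypermethylation would show up.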
Complex disease: genome-wide effects
Slide13
Human constraint outside conserved regions
Non-conserved regions: ENCODE-active regions show reduced diversity
Lineage-specific constraint in biochemically-active regions
Conserved regions: Non-ENCODE regions show increased diversity
Loss of constraint in human when biochemically-inactive
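The diversity comparison above rests on averaging heterozygosity over many sites in each class of region. A minimal sketch, assuming biallelic sites and Hardy-Weinberg proportions, is shown below; the allele frequencies are invented for illustration and the function names are hypothetical.

```python
def expected_heterozygosity(alt_allele_freq):
    """Expected heterozygosity 2p(1-p) at a biallelic site with alternate-allele frequency p."""
    p = alt_allele_freq
    return 2.0 * p * (1.0 - p)

def mean_heterozygosity(alt_allele_freqs):
    """Average expected heterozygosity across the sites of one region class."""
    freqs = list(alt_allele_freqs)
    if not freqs:
        raise ValueError("need at least one site")
    return sum(expected_heterozygosity(p) for p in freqs) / len(freqs)

# Made-up frequencies: sites in active regions skewed toward rare alleles.
active_sites = [0.01, 0.02, 0.05]
inactive_sites = [0.10, 0.30, 0.50]
reduced = mean_heterozygosity(active_sites) < mean_heterozygosity(inactive_sites)
```

Lower average heterozygosity in a region class is consistent with purifying selection keeping variants rare there, which is the signal the slide interprets as constraint.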
[Figure labels: Average diversity (heterozygosity); Aggregate over the genome; Active regions]
Slide14
Covers computational challenges associated with personal genomics:
- genotype phasing and haplotype reconstruction → resolve mom/dad chromosomes
- exploiting linkage for variant imputation → co-inheritance patterns in human population
- ancestry painting for admixed genomes → result of human migration patterns
- predicting likely causal variants using functional genomics → from regions to mechanism
- comparative genomics annotation of coding/non-coding elements → gene regulation
- relating regulatory variation to gene expression or chromatin → quantitative trait loci
- measuring recent evolution and human selection → selective pressure shaped our genome
- using systems/network information to decipher weak contributions → combinatorics
- challenge of complex multi-genic traits: height, diabetes, Alzheimer's → 1000s of genes
Slide15
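Genotype phasing, the first challenge listed on Slide 14, has a simple special case when both parents are genotyped: Mendelian rules often determine which allele came from which parent. The sketch below handles a single site in a trio; it is an illustrative toy with hypothetical names, not a population-scale phasing algorithm.

```python
def phase_child(child, mother, father):
    """Phase one site in a trio. Genotypes are unordered 2-tuples of alleles.
    Returns (maternal_allele, paternal_allele), or None if the site is
    ambiguous (e.g. all three heterozygous) or violates Mendelian inheritance."""
    first, second = child
    options = set()
    for maternal, paternal in ((first, second), (second, first)):
        if maternal in mother and paternal in father:
            options.add((maternal, paternal))
    return options.pop() if len(options) == 1 else None

# Homozygous parents make the child's heterozygous site unambiguous.
phased = phase_child(("A", "G"), ("A", "A"), ("G", "G"))
```

Sites that remain ambiguous in a trio are where linkage to nearby phased sites and population haplotype patterns have to carry the inference.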
Personal genomics tomorrow: Already 100,000s of complete genomes
Health, disease, quantitative traits:
Genomics regions → disease mechanism, drug targets
Protein-coding → cracking regulatory code, variation
Single genes → systems, gene interactions, pathways
Human ancestry:
Resolve all of human ancestral relationships
Complete history of all migrations, selective events
Resolve common inheritance vs. trait association
What’s missing is the computation
New algorithms, machine learning, dimensionality reduction
Individualized treatment from 1000s of genes, genome
Understand missing heritability
Reveal co-evolution between genes/elements
Correct for modulating effects in GWAS
Slide16
Collaborators and Acknowledgements
Chromatin state dynamics: Brad Bernstein, ENCODE consortium
Methylation in Alzheimer’s disease: Phil de Jager, Brad Bernstein, Epigenome Roadmap
Mammalian comparative genomics: Kerstin Lindblad-Toh, Eric Lander, 29 mammals consortium
Massively parallel enhancer reporter assays: Tarjei Mikkelsen, Broad Institute
Funding: NHGRI, NIH, NSF, Sloan Foundation
Slide17
Daniel Marbach
Mike Lin
Jason Ernst
Jessica Wu
Rachel Sealfon
Pouya Kheradpour
Manolis Kellis
Chris Bristow
Loyal Goff
Irwin Jungreis
MIT Computational Biology group
Compbio.mit.edu
Sushmita Roy
Luke Ward
Stata4
Stata3
Louisa DiStefano
Dave Hendrix
Angela Yen
Ben Holmes
Soheil Feizi
Mukul Bansal
Bob Altshuler
Stefan Washietl
Matt Eaton