Slide1
Eukaryotic genomes are complex 3D structures comprised of modified and unmodified DNA, RNA and many types of interacting proteins
Most DNA is wrapped around a “histone core”, to form nucleosomes
The classical histone protein complexes bind very tightly to DNA and prevent association with other proteins
Modifications of the classical histones, or their replacement with unusual histone types under certain conditions, can “loosen” the interaction with DNA, allowing access to transcription factors, RNA polymerase, and other proteins
Slide2
Regulatory elements can be far away: most enhancers interact with promoters and each other through long-range chromatin loops
Regulatory elements are essentially “docking sites” for specific types of DNA-binding proteins
Transcription factors, TATA-binding factors, and others
These proteins serve to attract co-factors, which then mediate protein:protein interactions across chromatin loops
Very long-range interactions are common in vertebrates, less so in invertebrate species with higher coding:noncoding ratios
ChIP with an antibody against a TF bound to “E” DNA will bring down “P” DNA as well
Proteins are crosslinked very efficiently to each other, as well as to DNA, by formaldehyde treatment
When crosslinking is reversed the complex falls apart, and both DNA fragments are released independently
Only one sequence binds to the TF!
This is a common issue in the analysis of ChIP data
[Figure labels: Shear chromatin (sonication or restriction enzyme); TF]
Slide3
How to find the regulatory needles in the haystack?
Vertebrate genomes are mostly non-coding
~2% coding; ~5% noncoding and evolutionarily conserved (at the DNA sequence alignment level)
Conservation has been used to identify important functional elements, but not all functional elements are conserved at the level that DNA sequence alignments can detect
Furthermore, the important question is: which elements are accessible in a particular cell type, at a particular time, and in a particular state?
Slide4
Focusing on accessible chromatin
Even well-conserved motifs cannot be accessed in closed regions of chromatin
[Figure labels: Accessible (e.g. H3K27Ac); Not accessible (e.g. H3K9Me3, H3K27Me3)]
Slide5
All four core histones in the nucleosome octamer have “tails” that can be modified in various ways, but the most consequential modifications, with respect to transcriptional activity, appear to involve methylation or acetylation of lysines (K) in histone H3
Slide6
How to find open chromatin? Chromatin ImmunoPrecipitation (ChIP)
An antibody to a DNA-binding protein is used to “fish out” DNA bound to the protein in a living cell
DNA and protein are crosslinked in the cell using brief treatment with a low concentration of high-quality formaldehyde
Crosslinked chromatin is sheared, usually by sonication, to yield short fragments of DNA+protein complexes
An antibody to a TF or other binding protein is used to fish out fragments containing that DNA-binding protein
DNA is then “released” and can be analyzed by various methods:
The original method is PCR: query for enrichment of specific (known or suspected) DNA-binding regions in ChIP-enriched DNA
Creates a pool of sequences highly enriched in binding sites for a particular protein
Requires availability of excellent antibodies that can detect the protein in its in vivo context
Slide7
A basic ChIP-like approach can be used to map nucleic acid:protein interactions of virtually any type
Histone modifications
Secondary interactions (no direct linkage to DNA):
Histone-modifying proteins, such as SWI/SNF, histone deacetylases, histone methylases
Cofactors that bind to TFs at particular sites, and that stabilize chromatin loops
Proteins that link chromatin to nuclear matrix
RNA polymerase and elongation factors, to find promoters and active sites of transcription
Proteins involved in DNA recombination, repair, and replication
All of these methods require highly specific and efficient antibodies (which are rare!)
Slide8
ChIP computational issues
First step is to map reads: BOWTIE, Novoalign, BWA or other
ChIP-seq reads surround but may not contain the DNA binding site
Sequence is generated from the ends of randomly sheared fragments, which overlap at the protein binding site
Gives rise to two adjacent sets of read peaks, one per strand, separated by roughly one fragment length
Defines a “shift” distance between read peaks at which you will find the true ChIP peak summit
Programs like MACS and HOMER automatically subtract your control (genomic input) from sample reads to define a final set of peaks
[Figure labels: Binding site; ChIP fragments; Seq reads]
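The strand "shift" idea on this slide can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the model that MACS or HOMER actually implements: the function names, the toy coordinates, and the `max_offset` default are all hypothetical. It finds the offset at which forward-strand 5' ends best line up with reverse-strand 5' ends, then places the putative summit midway between the two strand peaks.

```python
from collections import Counter

def estimate_strand_offset(forward_starts, reverse_starts, max_offset=500):
    """Return the offset d that best aligns forward 5' ends with reverse 5' ends.

    For each candidate d, count how many forward read starts have a reverse
    read start exactly d bases downstream; the best-scoring d approximates
    the distance between the two strand-specific read peaks.
    """
    reverse_counts = Counter(reverse_starts)
    best_offset, best_score = 0, -1
    for d in range(max_offset + 1):
        score = sum(reverse_counts[pos + d] for pos in forward_starts)
        if score > best_score:
            best_offset, best_score = d, score
    return best_offset

def shifted_summit(forward_mode, reverse_mode):
    """Place the putative summit midway between the two strand peak modes."""
    return (forward_mode + reverse_mode) // 2

# Toy data (hypothetical): forward 5' ends pile up near 1000,
# reverse 5' ends pile up near 1200.
forward = [990, 1000, 1000, 1010]
reverse = [1190, 1200, 1200, 1210]
offset = estimate_strand_offset(forward, reverse)
summit = shifted_summit(1000, 1200)
```

Real peak callers work on millions of reads, model background, and compare against the control, so this sketch only conveys why the summit sits between the two read piles.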
Slide9
ChIP analytical challenges
Genomic neighborhoods
Shear efficiency is not really “random”
Some genomic regions are fragile and sensitive
Some regions are protected from shear or degradation
Other artifacts
Centromeres: repeat sequences that are not all represented in the genome sequence build
Polymorphic regions, e.g. regions that are amplified in cell line DNA
Repeats: most programs cannot manage sequence reads that are not mapped uniquely
Peak width
Transcription factors typically give sharp peaks; chromatin marks are more diffuse
The best tools permit the user to modify these parameters
MACS (Xiaole Liu lab; Zhang et al., 2008; Feng et al., Nature Protocols 2012) is a user-friendly and widely used tool
HOMER is a highly versatile tool with many different annotation features and high sensitivity (Chris Benner, http://homer.salk.edu/homer/ngs/)
Slide10
Analyzing ChIP data
User-friendly tools
MACS: “model-based” peak detection; is sensitive to peak enrichment and background
Zhang et al., Genome Biology 2008; Feng et al., Nat Protocols 2012, PMID: 22936215 (Xiaole Liu lab)
MACS1 is best for sharp peaks (TFs); will break diffuse peaks into smaller regions
MACS2 is designed to allow broad- or sharp-peak detection
HOMER (http://homer.salk.edu/homer)
Can be easily tweaked for more sensitive peak detection
Comes packaged with a rich set of peak annotation tools
Tools for DNase-seq, Hi-C, differential ChIP analysis and many more
Both tools permit generation of “wiggle files” or similar that can be viewed in the UCSC browser
Looking at your data is a very important step! Peak finders can miss peaks that you can easily see by eye!
Slide11
Traditional methods fail with broad, flat peaks
Most tools are designed for TF proteins: discrete, sharp peaks
Certain chromatin proteins, and modified histones in certain regions, bind continuously to large regions of chromatin and do not yield “peaks”
MACS in default mode will carve the “mesa” into many peaks, or not detect it at all
New settings in MACS 2 can be set to overcome this problem
HOMER has a wide variety of settings ideal for data of different types
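To see why a broad "mesa" can end up carved into several peaks, consider the toy Python sketch below. It is not the algorithm used by MACS 2 or HOMER; `merge_enriched_bins`, the bin size, and the gap tolerance are hypothetical names and values chosen for illustration. Enriched bins are merged into domains, and allowing small gaps turns several short calls into one broad region.

```python
def merge_enriched_bins(enriched_bins, bin_size, max_gap_bins=1):
    """Merge enriched bin indices into (start, end) domains.

    Bins separated by at most `max_gap_bins` non-enriched bins are joined,
    so a broad "mesa" with small dips is reported as one region.
    """
    domains = []  # each entry: (start, end, index of last merged bin)
    for index in sorted(set(enriched_bins)):
        start, end = index * bin_size, (index + 1) * bin_size
        if domains and index - domains[-1][2] <= max_gap_bins + 1:
            # Close enough to the previous domain: extend it.
            domains[-1] = (domains[-1][0], end, index)
        else:
            domains.append((start, end, index))
    return [(start, end) for start, end, _ in domains]

# Hypothetical enriched 200 bp bins: a broad domain with one dip, plus one isolated bin.
bins = [10, 11, 12, 14, 15, 30]
sharp = merge_enriched_bins(bins, bin_size=200, max_gap_bins=0)
broad = merge_enriched_bins(bins, bin_size=200, max_gap_bins=1)
```

With no gap tolerance the dip at bin 13 splits the domain in two; with a tolerance of one bin it is reported as a single region, which is the behavior you want for diffuse chromatin marks.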
Slide12
ChIP analysis workflow
FASTQC -> BOWTIE -> Peak finder (MACS or HOMER)
This same workflow and tools can be used for a variety of methods, e.g. Methyl-DIP, ATAC-seq, DNase-seq
Downstream analysis:
Mapping peaks to nearby genes (and perhaps, DEGs)
Identifying enriched motifs
For your factor
For co-binding factors
Overlapping with other genome features, e.g. open chromatin, known binding sites, etc.
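The "mapping peaks to nearby genes" step can be sketched as a nearest-TSS lookup. The Python below is a minimal illustration on a single chromosome: the gene names and coordinates are invented, and `nearest_gene` is a hypothetical helper, not a function from HOMER or any other package. Strand is ignored for simplicity.

```python
from bisect import bisect_left

def nearest_gene(peak_summit, tss_positions, gene_names):
    """Return (gene, signed distance) for the TSS closest to a peak summit.

    `tss_positions` must be sorted ascending and non-empty, with
    `gene_names` in the same order. Strand and chromosome are ignored.
    """
    i = bisect_left(tss_positions, peak_summit)
    # Only the TSS just left and just right of the summit can be nearest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(tss_positions)]
    best = min(candidates, key=lambda j: abs(tss_positions[j] - peak_summit))
    return gene_names[best], peak_summit - tss_positions[best]

# Hypothetical annotation and peak summits.
tss = [1_000, 50_000, 120_000]
genes = ["geneA", "geneB", "geneC"]
assignments = [nearest_gene(summit, tss, genes) for summit in (48_500, 90_000, 500)]
```

Treat the nearest gene only as a first guess: as the summary slide notes, position relative to the TSS is not a reliable indicator of regulatory partners in large vertebrate genomes.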
Slide13
An ecumenical approach to open chromatin: ATAC-seq
Uses Tn5 transposase and a transposon modified to contain Illumina primers at each end
Transposon “jumps” preferentially (and randomly) into accessible chromatin
Because of the design, the transposon breaks DNA where it jumps in, tagging the site with the primer
Two insertions close together yield fragments of a size amenable to Illumina sequencing
PCR amplification between primers is all you need to make a library
Since it skips library-making steps (ligation etc.), it can be done with small amounts of input chromatin, e.g. 50,000 vs 1,000,000 cells
Buenrostro et al., 2013, 2015
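The point that only closely spaced insertions yield sequenceable fragments can be illustrated with a short Python sketch. The size limits and coordinates below are assumptions chosen for the example, not values from the ATAC-seq protocol, and `sequenceable_fragments` is a hypothetical name.

```python
def sequenceable_fragments(insertion_sites, min_len=50, max_len=1000):
    """Return (start, end) fragments between adjacent insertions within a size range."""
    sites = sorted(set(insertion_sites))
    fragments = []
    for left, right in zip(sites, sites[1:]):
        # Keep only fragments short enough (and long enough) to sequence.
        if min_len <= right - left <= max_len:
            fragments.append((left, right))
    return fragments

# Hypothetical insertions: dense in an open region, sparse in closed chromatin.
open_region = [10_000, 10_120, 10_300, 10_450]
closed_region = [50_000, 58_000]
fragments = sequenceable_fragments(open_region + closed_region)
```

Dense insertions in the open region produce three usable fragments, while the widely spaced insertions in closed chromatin produce none, which is why read coverage tracks accessibility.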
Slide14
ATAC-seq: transposons preferentially “jump” into open chromatin
[Figure labels: Tn5 transposase insertion (e.g. Illumina library oligos); tagmentation; continued reaction; PCR; ready to sequence]
Slide15
Other transposon-based methods: ChIP-tagmentation (ChIPmentation)
From Schmidl et al., Nature Methods 2015
Also based on transposon-based library construction, so it reduces the requirements for input chromatin!
Analysis is identical to ChIP; only the experimental methods (and input chromatin) are different
Slide16
Issues related to tagmentation protocols
Ratio of DNA:transposase
Has to be adjusted for each cell type and chromatin prep
Need even fragmentation to avoid bias, and small enough fragments, in general, for Illumina
Need to avoid making fragments too small
Bias observed in DNA: controls are complicated
Solution in “ChIPmentation”
Tagmentation while DNA is still protected by the antibody and cross-linked chromatin, still on the bead
Protects from over-tagmentation, thus allowing a full digestion without fear of losing the DNA
Allows the protocol to work over a 25X range of DNA:transposon ratios and lessens worries about time
Slide17
Current summary
ATAC-seq and H3K27Ac ChIP win the day
Simple technology, can be completed with relatively low input and few sequencing reads
Excellent kits are available for beginners, and many sequencing centers will do the work for a fee
Methods work for all species and cell types
Robust computational tools are readily available
Slide19
How to tie back to 3D structure? Probing 3-dimensional chromatin structure with conformation capture
From de Wit and de Laat, 2012
Slide20
TFs do not act alone: probing 3-dimensional chromatin structure with conformation capture
From de Wit and de Laat, 2012
Slide21
Requires analysis methods that are different from ChIP
Provides the essential “big picture” view, since it is otherwise impossible to predict long-range enhancer-enhancer or enhancer-promoter interactions
Sequenced fragments contain a bit of DNA from two distant regions
Data need to be trimmed and mapped to allow non-contiguous sequences
Long-distance contacts are numerous, and each contact point is relatively rare: peaks are small and require deep sequencing
For most of these methods, restriction enzymes, not sonication, are used to shear, and your endpoints may be spread over a restriction fragment
Analytical methods create a restriction map of your viewpoint region in 4C, and bin reads to those fragments
Hi-C kits are now readily available and quite reliable, giving a whole-genome view of interactions
Lots of interactions and lots of noise! Computational issues are tricky
All 3D methods require deep sequencing and paired-end reads
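The restriction-map-and-bin step described for 4C can be sketched as follows. This Python example is a simplification under stated assumptions: the sequence is a toy string, cuts are placed at the first base of each recognition site, and `restriction_fragments` and `bin_reads` are hypothetical helpers rather than functions from any 4C or Hi-C package.

```python
from bisect import bisect_right
from collections import Counter

def restriction_fragments(sequence, site):
    """Return sorted fragment start coordinates, cutting at each site occurrence."""
    starts = [0]
    pos = sequence.find(site)
    while pos != -1:
        if pos > 0:
            starts.append(pos)
        pos = sequence.find(site, pos + 1)
    return starts

def bin_reads(read_positions, fragment_starts):
    """Count reads per fragment index; a read belongs to the last fragment starting at or before it."""
    counts = Counter()
    for read_pos in read_positions:
        counts[bisect_right(fragment_starts, read_pos) - 1] += 1
    return counts

# Toy viewpoint region with two GATC sites (hypothetical sequence and read positions).
region = "ACGT" + "GATC" + "TTTTTT" + "GATC" + "AAAA"
starts = restriction_fragments(region, "GATC")
counts = bin_reads([1, 5, 6, 9, 15, 20], starts)
```

Counting per restriction fragment rather than per base reflects the fact that read endpoints can fall anywhere within a fragment, so the fragment is the natural unit of resolution.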
Slide22
Summary and Overview
Many user-friendly methods and analytical tools are available to identify active elements in large genomes
The issue is finding out “who is talking to whom?”
Enhancers can be shared by multiple genes
Alternative promoters for the same gene can have very different regulatory partners
Position relative to the TSS is not a reliable indicator in large vertebrate genomes
3D methods are necessary to tie enhancers and promoters together
Fortunately, 3D genomic interaction tools are becoming easier and more cost-effective, so they are accessible to virtually any lab!