
DNA:chromatin - PowerPoint Presentation

Uploaded On 2017-12-09





Presentation Transcript

Slide1

DNA:chromatin interactions

Exploring transcription factor binding and the epigenomic landscape

Lisa Stubbs

Slide2

Eukaryotic genomes are complex structures composed of modified and unmodified DNA, RNA, and many types of interacting proteins

Most DNA is wrapped around a “histone core”, to form nucleosomes

The classical histone protein complexes bind very tightly to DNA and prevent association with other proteins

Modifications of the classical histones, or their replacement with unusual histone types under certain conditions, can “loosen” the interaction with DNA, allowing access to transcription factors, RNA polymerase, and other proteins

Slide3

All four core histones in the nucleosome octamer have “tails” that can be modified in various ways, but the most consequential modifications, with respect to transcriptional activity, appear to involve methylation or acetylation of lysines (K) in histone H3

Slide4

Histone H3 modifications, especially methylation and acetylation, mark “open” or “closed” DNA

CLOSED: Histones bound more tightly to DNA

H3K27me3, H3K9me3

OPEN: Histones can be displaced by TFs, RNA polymerase, and other proteins

H3K27Ac, H3K4me1, H3K4me3
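The open/closed assignments listed here can be written as a small lookup. This is only an illustrative sketch: the function name `chromatin_state` and the case-insensitive matching are assumptions added for convenience, and real chromatin-state calls combine several assays rather than reading off a single mark.

```python
# Marks as listed on the slide; the grouping is the slide's, the code is a sketch.
CLOSED_MARKS = {"H3K27me3", "H3K9me3"}
OPEN_MARKS = {"H3K27Ac", "H3K4me1", "H3K4me3"}


def chromatin_state(mark: str) -> str:
    """Return 'open', 'closed', or 'unknown' for a histone H3 mark name.

    Matching ignores case because the same mark is often written with
    different capitalization (H3K27Me3 vs. H3K27me3).
    """
    key = mark.strip().lower()
    if key in {m.lower() for m in OPEN_MARKS}:
        return "open"
    if key in {m.lower() for m in CLOSED_MARKS}:
        return "closed"
    return "unknown"


for name in ("H3K27Ac", "H3K27Me3", "H2A.Z"):
    print(name, "->", chromatin_state(name))
```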

Histone marks, together with other assays of open chromatin, are presently the only reliable indicators of the locations and activities of regulatory elements

Slide5

Many types of regulatory elements

“Docking sites” for site-specific regulatory proteins

Transcription factors, TATA binding factors, and other site-specific binders

Recruit additional proteins: co-factors, RNA polymerase and others

Enhancers

Tissue-specific activators of transcription

Binding sites for proteins that interact with the promoter to enhance transcription

Silencers

Also prevalent, but more difficult to detect and assay

Many transcription factors repress, rather than enhance, gene expression

“Enhancers” and “Silencers” are not mutually exclusive! Most regulatory elements can serve either function, depending on the proteins bound at a particular time

Insulators

“Boundary elements” that shield genes from the enhancers or heterochromatin proteins in neighboring gene “territories”

Involved in establishing loop structures that isolate genes

Slide6

How to find them? Chromatin ImmunoPrecipitation (ChIP)

Antibody to a DNA binding protein is used to “fish out” DNA bound to the protein in a living cell

DNA and protein are crosslinked in the cell using brief treatment with a low concentration of high-quality formaldehyde

Crosslinked chromatin is sheared, usually by sonication, to yield short fragments of DNA+protein complexes

Antibody to a TF or other binding protein is used to fish out fragments containing that DNA binding protein

DNA is then “released” and can be analyzed by various methods:

Original method is PCR: query for enrichment of specific (known or suspected) DNA binding regions in ChIP-enriched DNA

Creates a pool of sequences highly enriched in binding sites for a particular protein

Requires availability of excellent antibodies that can detect the protein in its in vivo context

Slide7

ChIP can be used to map DNA:protein interactions of virtually any type

Histone modifications

Secondary interactions (no direct linkage to DNA)

Histone modifying proteins, such as SWI/SNF, histone deacetylases, histone methylases

Cofactors that bind to TFs at particular sites, and that stabilize chromatin loops

Proteins that link chromatin to nuclear matrix

RNA polymerase and elongation factors, to find promoters and active sites of transcription

Proteins involved in DNA recombination, repair, and replication

All of these methods require highly specific and efficient antibodies (which are rare!)

Slide8

ChIP Analytical challenges

Genomic neighborhoods

Shear efficiency is not really “random”

Some genomic regions are fragile and sensitive

Some regions are protected from shear or degradation

Other artifacts

Centromeres: repeat sequences that are not all represented in the genome sequence build

Polymorphic regions, and e.g. regions that are amplified in cell line DNA

Repeats: most programs cannot manage sequence reads that are not mapped uniquely

Peak width

Transcription factors are typically sharp peaks; chromatin marks are more diffuse

The best tools permit the user to modify these parameters

MACS (Xiaole Liu Lab; Zhang et al., 2008; Feng et al., Nature Protocols 2012) is a user-friendly and widely used tool

HOMER, a highly versatile tool with many different annotation features and high sensitivity (Chris Benner, http://homer.salk.edu/homer/ngs/)

Slide9

ChIP computational issues

First step is to map reads: Bowtie, Novoalign, BWA, or another aligner

ChIP-seq reads surround but may not contain the DNA binding site

Sequence is generated from the ends of randomly sheared fragments, which overlap at the protein binding site

Gives rise to two adjacent sets of read peaks, one per strand, separated by roughly the fragment length

Defines a “shift” distance between read peaks at which you will find the true ChIP peak summit

Programs like MACS and HOMER automatically subtract your control (genomic input) from sample reads to define a final set of peaks
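The “shift” idea can be sketched in a few lines. The code below is a simplified illustration, not the model that MACS or HOMER actually fit: it assumes reads are already reduced to 5' end coordinates per strand, and the function names (`estimate_offset`, `shifted_positions`) and the toy coordinates are invented for the example. It finds the offset `d` at which forward-strand ends best line up with reverse-strand ends, then moves each strand by `d/2` toward the other so the two read peaks collapse onto one summit.

```python
from collections import Counter


def estimate_offset(forward_ends, reverse_ends, max_offset=500):
    """Return the offset d (in bp) that best aligns forward 5' ends with reverse 5' ends.

    For each candidate d, count how many forward/reverse end pairs sit exactly d apart.
    """
    fwd = Counter(forward_ends)
    rev = Counter(reverse_ends)
    best_d, best_score = 0, -1
    for d in range(max_offset + 1):
        score = sum(n * rev[pos + d] for pos, n in fwd.items())
        if score > best_score:
            best_d, best_score = d, score
    return best_d


def shifted_positions(forward_ends, reverse_ends, d):
    """Shift forward ends right and reverse ends left by d/2 so both pile up at the summit."""
    half = d // 2
    return [p + half for p in forward_ends] + [p - half for p in reverse_ends]


# Toy data: forward-strand ends pile up left of the site, reverse-strand ends to the right.
forward = [900, 900, 905]
reverse = [1100, 1100, 1105]

d = estimate_offset(forward, reverse)
summit = Counter(shifted_positions(forward, reverse, d)).most_common(1)[0][0]
print("offset:", d, "summit:", summit)
```

In practice the offset is estimated from many strong peaks genome-wide rather than from a single site, and the control sample is used to decide which shifted pile-ups are significant.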

[Figure labels: Binding site; ChIP fragments; Seq reads]

Slide10

Traditional methods fail with broad, flat peaks

Most tools are designed for TF proteins: discrete, sharp peaks

Certain chromatin proteins, and modified histones in certain regions, bind continuously to large regions of chromatin and do not yield “peaks”

MACS in default mode will carve the “mesa” into many peaks, or not detect it at all

New settings in MACS 2 can be set to overcome this problem

HOMER has a wide variety of settings ideal for data of different types

Slide11
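To illustrate why a broad, flat “mesa” of signal gets carved into many peaks by a sharp-peak caller, here is a minimal sketch using binned enrichment values. It is not how MACS 2 or HOMER implement their broad-region modes; the threshold, the gap-merging rule, and the function names are assumptions chosen only to show the effect.

```python
def enriched_runs(signal, threshold):
    """Return (start, end) bin index pairs (end exclusive) where signal >= threshold."""
    runs, start = [], None
    for i, value in enumerate(signal):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(signal)))
    return runs


def merge_runs(runs, max_gap):
    """Join neighboring runs separated by at most max_gap bins into one broad domain."""
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# A flat domain with small dips: thresholding alone splits it into three "peaks".
signal = [0, 0, 5, 6, 4, 5, 6, 3, 5, 5, 0, 0]
runs = enriched_runs(signal, threshold=5)
domains = merge_runs(runs, max_gap=1)
print("sharp-peak view:", runs)
print("broad-domain view:", domains)
```

Tolerating short dips below threshold is one simple way to recover a single domain; dedicated broad-peak settings should be preferred for real data.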

DNase sensitivity assays are antibody free

The first approach: from Crawford et al., Genome Research 16:123, 2006 (Francis Collins’ laboratory)

Digest with DNase I to “erase” all the hypersensitive regions

Easier to do: less need to optimize and minimize DNase cutting

Polish and ligate the remaining double-strand ends

Ligate 5’-biotinylated linkers to the DS ends

Shear (sonicate) or restriction-digest DNA into smaller fragments

Purify end sequences on a streptavidin column

Release sequences, add new linkers, and sequence

Does not allow footprinting, because TF binding sites inside the HS regions have been digested away

Slide12

Latest (and better) approach: sequences DNase-sensitive regions per se and permits transcription factor “footprinting”

The easiest method uses low concentrations of DNase I to generate short fragments at sensitive (“open”) sites

Released fragments can be blunt-ended, ligated to linkers and sequenced directly

Permits DNase footprinting: very deep sequencing can “see” short protected regions that are absent from the released DNA, and appear as protected “valleys” inside the DNase-sensitive peaks

Protected from DNase I because they are occupied by TF proteins

Slide13
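The DNase footprinting idea, a protected “valley” inside a sensitive peak, can be sketched as a scan over per-base cut counts. This is a toy illustration under stated assumptions: the thresholds `low`, `high`, and `min_width` are arbitrary, the function name `find_footprints` is hypothetical, and published footprinting tools use statistical models rather than fixed cutoffs.

```python
def find_footprints(cuts, low=1, high=5, min_width=6):
    """Return (start, end) pairs (end exclusive) of protected valleys.

    A valley is a run of at least min_width positions with cut counts <= low,
    immediately flanked on both sides by positions with cut counts >= high.
    """
    footprints = []
    i, n = 0, len(cuts)
    while i < n:
        if cuts[i] <= low:
            j = i
            while j < n and cuts[j] <= low:
                j += 1
            flanked = i > 0 and j < n and cuts[i - 1] >= high and cuts[j] >= high
            if flanked and j - i >= min_width:
                footprints.append((i, j))
            i = j
        else:
            i += 1
    return footprints


# Per-base DNase I cut counts across one hypersensitive region (made-up numbers).
cuts = [0, 0, 8, 9, 7, 0, 1, 0, 0, 1, 0, 0, 9, 8, 7, 0, 0]
print(find_footprints(cuts))
```

Low-count runs at the edges of the region are ignored because they are not flanked by cutting on both sides, so they look like closed chromatin rather than a bound factor.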

Related methods and twists on the theme (see Furey et al., 2012 for review)

Exo-ChIP

Follows sonication with an exonuclease step, to “pare back” all but the protein-protected region in ChIP

“Nano-ChIP”

ChIP normally required ~10^7 cells as input; hard to achieve for many cell types

Nano-ChIP can be carried out in several ways:

With carrier DNA: not the best for sequence analysis but can be done

Amplification after ChIP: very tricky because it can cause serious biases and artifacts, but can be done with care; linear amplification is the best strategy

Tagmentation: a new method that creates libraries directly by transposon insertion

The problem is library preparation, which needs a minimum amount of input for success

Slide14

From Schmidl et al., Nature Methods 2015

Slide15

Lessons from ENCODE chromatin assays: human and Drosophila data

Massive deep-sequencing of multiple chromatin features in cell lines (ENCODE), primary cell types and tissues (Roadmap Epigenomics)

Histone H3 modifications: highlight on H3K4me1, H3K4me3, H3K27Ac, H3K27me3

Other chromatin proteins: e.g. P300 (acetyltransferase)

H3K4me3 marks are enriched at active promoters

H3K4me3 marks are largely the same in all cell lines, with a small fraction of marks being cell-specific

P300, and H3K4me1 without H3K4me3, are enriched at enhancers

Most P300 peaks also contain H3K4me1

P300 and H3K4me1 marks are highly cell-type specific

Most P300 marks are enhancers, but not all enhancers have P300

Most enhancers have an H3K4me1 mark, but not all H3K4me1 marks are in enhancers

Other marks: H3K27Ac or H3K27me3

Mutually exclusive marks for open (Ac) versus closed (me3) chromatin regions

H3K27Ac is perhaps the most general mark of open chromatin: promoters and enhancers

Can be found in combination with H3K4me1/me3

Slide16

Combinatorial marks define subclasses of enhancers

H3K4me1+, H3K27Ac+ mark enhancers with highest levels of activity

Represent cell-type specific active enhancers in differentiated cells

Mouse enhancers: gain K27Ac upon differentiation in mouse ES cells, leading to higher expression

H3K4me1+, H3K27Ac- marks

Called “intermediate” enhancers, linked to a variety of non-specific cellular functions

In humans especially, K4me1+, K27me3+ are called “poised” enhancers

K27me3 is a mark of polycomb repression; polycomb proteins are also associated with these sites

K9me3+ marks also found at poised enhancers

These enhancers are associated specifically with development-related functions; K27me3 may be replaced by K27Ac as differentiation progresses

Poised enhancers are more likely to be conserved between species, and therefore most of the enhancers that have been tested so far are probably of this subclass

Explains why H3K4me1 does not always find active enhancers (finds the “poised” ones too)

Slide17
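The enhancer subclasses defined by combinatorial marks (active, intermediate, poised) amount to a small decision rule. The sketch below encodes only the combinations named in the slides; the function name `classify_enhancer`, the label strings, and the "not classified" fallback are assumptions for illustration, and real annotation pipelines work from quantitative signal rather than presence/absence.

```python
def classify_enhancer(marks):
    """Classify a candidate region from the set of histone H3 marks detected on it.

    H3K4me1 + H3K27Ac   -> active
    H3K4me1 + H3K27me3  -> poised
    H3K4me1 alone       -> intermediate
    """
    marks = set(marks)
    if "H3K4me1" not in marks:
        return "not classified"
    if "H3K27Ac" in marks:
        return "active"
    if "H3K27me3" in marks:
        return "poised"
    return "intermediate"


examples = {
    "region_A": {"H3K4me1", "H3K27Ac"},
    "region_B": {"H3K4me1"},
    "region_C": {"H3K4me1", "H3K27me3", "H3K9me3"},
    "region_D": {"H3K4me3"},
}
for name, marks in examples.items():
    print(name, "->", classify_enhancer(marks))
```

Because H3K27Ac and H3K27me3 are described as mutually exclusive, the order of the two checks should rarely matter; a region carrying both would be reported as active here, which is an arbitrary choice.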

Back to the nucleus:

Distant regulatory elements interact with promoters (and each other) through long-range chromatin loops

Regulatory elements are essentially “docking sites” for specific types of DNA-binding proteins

Transcription factors, TATA-binding factors, and others

These proteins serve to attract co-factors, which then mediate protein: protein interactions across chromatin loops

Very long range interactions are common in vertebrates, less so in invertebrate species with lower coding:noncoding ratios

ChIP with an antibody that binds to “E” DNA will bring down “P” DNA as well

Proteins are crosslinked very efficiently to each other, as well as to DNA, by formaldehyde treatment

When crosslinking is reversed, the complex falls apart and both DNA fragments are released independently

Only one sequence binds to the TF!

Common issue in analysis of ChIP
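One common way to approach this issue is to ask which co-precipitated fragment actually carries the factor's recognition motif. The sketch below is a deliberately simple version: the motif string, the peak names "E" and "P", and the sequences are made up for illustration, and exact string matching stands in for the position weight matrices used by real motif scanners. Absence of a motif is only a hint of indirect binding, not proof.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_motif(seq, motif):
    """True if the motif occurs on either strand of seq (exact match only)."""
    seq = seq.upper()
    return motif in seq or reverse_complement(motif) in seq


def label_peaks(peaks, motif):
    """Label each peak sequence by whether it contains the motif."""
    return {
        name: "motif present" if has_motif(seq, motif) else "no motif (possibly indirect)"
        for name, seq in peaks.items()
    }


# Hypothetical enhancer (E) and promoter (P) fragments recovered in the same ChIP.
peaks = {"E": "ccgtTGACTCAgga", "P": "GGGCGGGGCTATAAAAGG"}
labels = label_peaks(peaks, "TGACTCA")
print(labels)
```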

[Figure labels: Shear chromatin (sonication or restriction enzyme); TF]

Slide18

Chromatin conformation capture methods can identify these loop-linked sequences

Ends of the co-captured DNA fragments are ligated while still captured on the antibody-bead with protein complex

DNA is released and can be:

Queried by PCR for enrichment of suspected candidate interactors

Circularized and PCR amplified using a primer from a “bait” region (4C)

Directly sequenced for all X all interactions (5C, Hi-C, ChIA-PET)

Issues include

Random co-ligation between fragments that are not truly connected in the cell

Over-crosslinking, which may incorrectly join sequences that are merely nearby

Provides a view of 3-D chromatin architecture, especially important for mammalian cells

From Wikipedia
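For the directly sequenced, all-by-all methods, the basic data structure is a table of how often two genomic bins were found ligated together. The following is a minimal sketch, assuming each ligation junction has already been mapped to a pair of coordinates on a single chromosome; the function name `contact_counts`, the bin size, and the coordinates are illustrative, and real pipelines also filter artifacts and normalize for coverage and distance.

```python
from collections import Counter


def contact_counts(pairs, bin_size):
    """Count ligation junctions between genomic bins.

    Each pair is (pos_a, pos_b) on the same chromosome. Bins are stored in
    sorted order so that (a, b) and (b, a) are counted as the same contact.
    """
    counts = Counter()
    for a, b in pairs:
        bin_a, bin_b = sorted((a // bin_size, b // bin_size))
        counts[(bin_a, bin_b)] += 1
    return counts


# Made-up junctions: three support a long-range contact, one is a short-range ligation.
pairs = [(1_200, 95_400), (1_900, 96_100), (50_500, 51_200), (97_000, 1_050)]
counts = contact_counts(pairs, bin_size=10_000)
for (bin_a, bin_b), n in sorted(counts.items()):
    print(f"bin {bin_a} <-> bin {bin_b}: {n}")
```

Contacts supported by many independent junctions are less likely to be the random co-ligation events listed among the issues, which is one reason read depth matters for these assays.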