Chromatin basics & ChIP-seq analysis

Uploaded On 2023-05-20

Vladimir Teif. BS312 Genome Bioinformatics, Lecture 5. Next generation sequencing analysis.



Presentation Transcript

1. Chromatin basics & ChIP-seq analysis. Vladimir Teif. BS312 – Genome Bioinformatics. Lecture 5

2. Next generation sequencing analysis

3. Chromatin basics: reminder. Image: https://micro.magnet.fsu.edu/cells/nucleus/images/chromatinstructurefigure1.jpg

4. Transcription factor-centric view. Transcription factor (TF) concentrations; proteins produced (including TFs); protein assembly at regulatory regions; transcription start site. Teif et al. (2013), Methods, 62, 26-38

5. Transcription factor-centric view. Transcription factor (TF) concentrations; proteins produced (including TFs); promoter; enhancer; RNA polymerase: enzyme which makes RNA. Teif et al. (2013), Methods, 62, 26-38

6. Histone modifications-centric view. Turner B.M. (2005) Nature Structural & Molecular Biology, 12, 110-112

7. Histone modifications-centric view. http://dev.biologists.org/content/139/6/1045

8. NGS methods and their applications. Hi-C; chromatin domains. Figure adapted from http://www.scienceinschool.org

9.

10. ChIP-seq (Chromatin Immunoprecipitation followed by sequencing): (1) crosslink protein-DNA complexes in situ; (2) isolate nuclei and fragment DNA (sonication or digestion); (3) immunoprecipitate with antibody against target nuclear protein and reverse crosslinks; (4) release DNA and submit for sequencing. Adapted from www.VisiScience.com

11. MNase-seq (Micrococcal Nuclease digestion followed by sequencing). MNase = Micrococcal Nuclease (enzyme that cuts DNA between nucleosomes). Teif et al. (2013), Methods, 62, 26-38

12. FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements followed by sequencing). Giresi et al. (2007), Genome Res. 17, 877–885

13. DNase-seq (DNase I digestion followed by sequencing). Wang et al. (2012), PLoS ONE 7, e42414

14. ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing). Buenrostro et al. (2013) Nat Methods. 10, 1213-1218. How transposase works: https://www.youtube.com/watch?v=XYZHMGUGq6o

15. Methods for 1D genome mapping. Meyer & Liu, Nature Reviews Genetics 15, 709–721 (2014)

16. Methods for 1D genome mapping. Tsompana and Buck, Epigenetics & Chromatin 2014, 7:33

17. Timeline of NGS methods. Bulk methods that require many cells; single-cell methods. Rivera and Ren (2013), Cell, 155, 39-55; Hu et al., Front. Cell Dev. Biol., 2018

18. Where to get NGS data? Do your own experiment; Gene Expression Omnibus (GEO), https://www.ncbi.nlm.nih.gov/geo; Sequence Read Archive (SRA), https://www.ncbi.nlm.nih.gov/sra; European Nucleotide Archive, https://www.ebi.ac.uk/ena; The Cancer Genome Atlas (TCGA), https://tcga-data.nci.nih.gov/tcga; Exome Aggregation Consortium (ExAC), http://exac.broadinstitute.org/ You also have to upload your data!

19. How to analyze NGS data? (1) Ask a bioinformatician: you need to explain what you want, and for that you need to understand what can be done and how. (2) Do it yourself: command line -> become a bioinformatician; online wrappers -> simpler, but file size limits. Example of a convenient online tool: Galaxy, http://galaxy.essex.ac.uk/

20. ChIP-seq (Chromatin ImmunoPrecipitation followed by sequencing): (1) crosslink protein-DNA complexes in situ; (2) isolate nuclei and fragment DNA (sonication or digestion); (3) immunoprecipitate with antibody against target nuclear protein and reverse crosslinks; (4) release DNA and submit for sequencing. Adapted from www.VisiScience.com

21. Experiment. Data analysis. http://www4.utsouthwestern.edu/mcdermottlab/NGS/index.html

22. ChIP-seq analysis workflow. www.utsouthwestern.edu/labs.bioinformatics-core/analysis/chip-seq.png

23. NGS data after sequencing but before mapping (.fastq file aka “raw” data):
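A FASTQ file stores each read as four lines: an `@` header, the sequence, a `+` separator, and a quality string. The sketch below is a minimal illustrative parser, assuming unwrapped four-line records and Phred+33 quality encoding; the names `FastqRecord`, `read_fastq` and `mean_phred` and the toy reads are made up for this example.

```python
import io
from typing import Iterator, NamedTuple


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def read_fastq(handle) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with exactly four lines per read."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) stops the parser
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line carries no information here
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@"):
            raise ValueError(f"unexpected header line: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Average base quality, assuming Phred+33 encoding (an assumption)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


# Two toy reads, invented for illustration only.
example = io.StringIO(
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nTTGACCAA\n+\n!!!!IIII\n"
)
records = list(read_fastq(example))
for record in records:
    print(record.read_id, record.sequence, mean_phred(record.quality))
```

In practice a dedicated library would handle compressed files and malformed records; this only shows the record layout.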

24. Mapping with Bowtie. http://bowtie-bio.sourceforge.net/manual.shtml Options: -v <N> allow no more than N mismatches, where N may be a number from 0 through 3; -p <N> use N computer processors/cores in parallel; -m <N> disregard reads with more than N possible alignments

25. Guess what this command does. Options: -v <N> allow no more than N mismatches, where N may be a number from 0 through 3; -p <N> use N computer processors/cores in parallel; -m <N> disregard reads with more than N possible alignments. Command: bowtie -v 2 -p 2 -m 1 mm9 filename.fastq filename.map
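One way to read the command is to assemble it piece by piece. The sketch below builds the same argument list in Python, with each option annotated using the descriptions from the slide; the helper name `bowtie_command` is hypothetical, and the command is only printed, not executed.

```python
import shlex
from typing import List


def bowtie_command(
    index: str,
    reads: str,
    output: str,
    mismatches: int = 2,
    threads: int = 2,
    max_alignments: int = 1,
) -> List[str]:
    """Build the Bowtie argument list shown on the slide (illustrative helper)."""
    return [
        "bowtie",
        "-v", str(mismatches),      # allow at most this many mismatches (0 to 3)
        "-p", str(threads),         # number of processors/cores used in parallel
        "-m", str(max_alignments),  # disregard reads with more possible alignments
        index,                      # genome index name, here mm9
        reads,                      # input FASTQ file with raw reads
        output,                     # output file with mapped reads
    ]


cmd = bowtie_command("mm9", "filename.fastq", "filename.map")
print(shlex.join(cmd))
# To actually run it, Bowtie and the index must be installed:
# import subprocess; subprocess.run(cmd, check=True)
```

With `-m 1`, only reads that map to a single location are kept, which is a common choice when ambiguous reads would blur the occupancy signal.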

26. NGS data after mapping: .bed files (BED format). Alignment tools: Bowtie, BWA, ELAND, Novoalign, BLAST, ClustalW; TopHat (for RNA-seq)

27. Reads can align to overlapping locations. We need to count all reads at each base pair. http://biocluster.ucr.edu/~rkaundal/workshops/R_feb2016/ChIPseq/ChIPseq.html
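Counting reads at each base pair can be sketched as follows. This is a toy illustration, not the method of any particular tool: reads are assumed to be given as 0-based half-open (start, end) intervals on a single chromosome, and the function name `per_base_coverage` and the example reads are invented here.

```python
from typing import Iterable, List, Tuple


def per_base_coverage(
    reads: Iterable[Tuple[int, int]], region_length: int
) -> List[int]:
    """Count how many reads cover each position of a region.

    Uses a difference array: +1 where a read starts, -1 where it ends,
    then a running sum gives the coverage at every base pair.
    """
    diff = [0] * (region_length + 1)
    for start, end in reads:
        start = max(start, 0)
        end = min(end, region_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    coverage = []
    running = 0
    for delta in diff[:region_length]:
        running += delta
        coverage.append(running)
    return coverage


# Three overlapping toy reads on a 10 bp region.
toy_reads = [(0, 4), (2, 6), (3, 8)]
coverage = per_base_coverage(toy_reads, 10)
print(coverage)  # [1, 1, 2, 3, 2, 2, 1, 1, 0, 0]
```

The position covered by all three reads gets the highest count, which is the kind of local pile-up that later steps interpret as a peak.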

28. From mapped reads to occupancy landscapes. HOMER, BedTools, BamTools, NucTools. Teif et al., Methods, 2012

29. Calculating occupancy with HOMER: makeTagDirectory <Directory Name> [options] <alignment file> (see http://homer.ucsd.edu/homer/ngs/tagDir.html)

30. Quality control (QC). http://homer.ucsd.edu/homer/ngs/tagDir.html

31. Quality control (QC): good ChIP-seq vs. bad ChIP-seq. http://homer.ucsd.edu/homer/ngs/tagDir.html

32. Data view in genome browsers: UCSC Genome Browser (online); IGV (install on a local computer). Jung et al., NAR 2014

33. UCSC Genome Browser. https://genome.ucsc.edu/

34. Create UCSC files with HOMER: makeUCSCfile <tag directory> -o auto (see http://homer.ucsd.edu/homer/ngs/ucsc.html)

35.

36. Peak shapes can be different. Park P. J., Nature Genetics, 2009

37. Systematic analysis requires identifying all peaks in all datasets and comparing the differences. Bardet et al. (2012) Nature Protocols, 7, 45-61

38. Peak calling is a method to identify areas in a genome enriched with aligned reads. Wilbanks EG (2010) PLoS ONE 5, e11471.

39. Peak calling: finding the peaks. Input: a sample that was prepared in the same way as the ChIP-seq sample, but no antibody was added, so it has no specific enrichment of our protein of interest. Pepke et al. (2009). Nature Methods, 6, S22–S32.

40. Peak calling: defining statistical significance

41. Peak calling: defining statistical significance. Is this peak statistically significant? Tools: MACS (good for TFs); SICER (histones, etc); HOMER (universal); PeakSeq; edgeR; CisGenome. Park P. J., Nature Genetics, 2009

42. Finding peaks with HOMER. http://homer.ucsd.edu/homer/ngs/peaks.html

43. Guess what this command does. We need to map our ChIP-seq and its Input (control), then create their HOMER tag directories ChIPDirectory and InputDirectory, then find peaks using both these directories. Additional optional parameters: -F <#> enrichment ratio ChIP vs. Input (by default 4-fold); -P <#> p-value cutoff (by default 0.0001). Command: findPeaks ChIPDirectory -style factor -i InputDirectory
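The two thresholds (fold enrichment over Input and a p-value cutoff) can be illustrated with a toy test on read counts in a single candidate region. This is a simplified sketch and not HOMER's actual algorithm: it assumes ChIP and Input counts are already normalized to the same sequencing depth, uses a plain Poisson model with the Input count as the expected value, and adds a pseudocount to avoid division by zero. The names `poisson_sf` and `is_peak` are made up for this example.

```python
import math


def poisson_sf(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), by direct summation (small counts only)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def is_peak(
    chip_reads: int,
    input_reads: int,
    min_fold: float = 4.0,   # mirrors the 4-fold default quoted on the slide
    max_p: float = 1e-4,     # mirrors the 0.0001 default quoted on the slide
    pseudocount: float = 1.0,
) -> bool:
    """Toy significance test for one region; assumes depth-normalized counts."""
    expected = input_reads + pseudocount
    fold = chip_reads / expected
    p_value = poisson_sf(chip_reads, expected)
    return fold >= min_fold and p_value <= max_p


print(is_peak(40, 5))  # strong enrichment and tiny p-value
print(is_peak(10, 5))  # fold enrichment too low
print(is_peak(8, 1))   # 4-fold enriched, but too few reads to be significant
```

The last case shows why both criteria are useful: a region with very few reads can pass the fold threshold by chance while failing the p-value cutoff.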

44. ChIP-seq: reads to peaks/regions. MACS, SICER, HOMER, PeakSeq, edgeR, DESeq, CisGenome

45. Peaks/regions in BED format. HOMER peak file to BED: pos2bed.pl peakfile.txt > peakfile.bed; BED to HOMER peak file: bed2pos.pl peakfile.bed > peakfile.txt

46. Intersecting genomic regions. BedTools (command line); Galaxy (online)
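What "intersecting" means for two sets of regions can be sketched in a few lines. This is an illustrative brute-force version, not how BedTools is implemented: regions are assumed to be BED-style (chromosome, 0-based start, exclusive end), and the function name `intersect` and the toy coordinates are invented here.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), half-open like BED


def intersect(a: List[Region], b: List[Region]) -> List[Region]:
    """Return the overlapping parts of every pair of regions from a and b.

    Compares all pairs, which is fine for a toy example but too slow
    for genome-scale data, where sorted or indexed approaches are used.
    """
    result = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:
                result.append((chrom_a, start, end))
    return sorted(result)


# Toy peaks and toy promoter regions, made up for illustration.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
promoters = [("chr1", 150, 250), ("chr2", 10, 60), ("chr3", 0, 100)]
overlaps = intersect(peaks, promoters)
print(overlaps)  # [('chr1', 150, 200), ('chr2', 50, 60)]
```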

47. Genomic features are also regions. Mattout et al., Genome Biology, 2015

48. Let’s look at many similar regions. Tools: deepTools, NucTools. https://github.com/fidelram/deepTools/wiki/Visualizations Each horizontal line is one genomic region
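The data behind such a heatmap is a matrix with one row per genomic region, and averaging its columns gives the aggregate profile. The sketch below is a minimal illustration under stated assumptions: coverage is held as a plain per-base list per chromosome, regions are defined by a center position plus a fixed flank, and the names `signal_matrix` and `aggregate_profile` and all numbers are invented. It is not how deepTools or NucTools compute their matrices.

```python
from typing import Dict, List, Tuple


def signal_matrix(
    coverage: Dict[str, List[float]],
    centers: List[Tuple[str, int]],
    flank: int,
) -> List[List[float]]:
    """Extract one row of signal per region: center +/- flank base pairs."""
    rows = []
    for chrom, center in centers:
        track = coverage[chrom]
        if center - flank < 0 or center + flank + 1 > len(track):
            continue  # skip regions that run off the chromosome end
        rows.append(track[center - flank : center + flank + 1])
    return rows


def aggregate_profile(rows: List[List[float]]) -> List[float]:
    """Column-wise mean of the heatmap matrix."""
    return [sum(column) / len(rows) for column in zip(*rows)]


# Toy coverage track and two toy region centers.
toy_coverage = {"chr1": [0, 1, 2, 3, 2, 1, 0, 0, 2, 4, 2, 0]}
toy_centers = [("chr1", 3), ("chr1", 9)]
rows = signal_matrix(toy_coverage, toy_centers, flank=2)
profile = aggregate_profile(rows)
print(rows)     # [[1, 2, 3, 2, 1], [0, 2, 4, 2, 0]]
print(profile)  # [0.5, 2.0, 3.5, 2.0, 0.5]
```

Plotting `rows` as colored pixels gives the heatmap; plotting `profile` as a line gives the aggregate profile of the same regions.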

49. ChIP-seq heat maps for all genes, scaled with respect to their start (TSS) and end (TES). https://github.com/fidelram/deepTools/wiki/Visualizations

50. Cluster heatmaps. deepTools 2.0, https://github.com/fidelram/deepTools/wiki/Visualizations

51. Comparing cluster heatmaps between two cell conditions. NucTools

52. Histone modifications around TSS. deepTools, http://www.ie-freiburg.mpg.de/bioinformaticsfac

53. Motif enrichment analysis. HOMER, MEME. Pavlaki et al., 2017

54. Finding motifs with HOMER. HOMER takes the coordinates of all ChIP-seq peaks, looks at the corresponding DNA sequences of each peak and finds the common consensus motifs that are encountered in many of these peaks. Then HOMER looks in a database and reports which motifs are similar to already known TF binding motifs, and which motifs are new.
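The idea of finding sequences that recur in many peaks can be illustrated with a simple k-mer count. This is a deliberately simplified sketch and not HOMER's algorithm: it scores exact k-mers by the fraction of peak sequences containing them relative to background sequences, with a pseudocount. The function names and all sequences are invented for illustration.

```python
from collections import Counter
from typing import List, Tuple


def kmer_counts(sequences: List[str], k: int) -> Counter:
    """Count in how many sequences each k-mer occurs (once per sequence)."""
    counts = Counter()
    for seq in sequences:
        seen = {seq[i : i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    return counts


def enriched_kmers(
    peak_seqs: List[str],
    background_seqs: List[str],
    k: int,
    pseudocount: float = 1.0,
) -> List[Tuple[str, float]]:
    """Rank k-mers by (fraction of peaks) / (fraction of background)."""
    peak = kmer_counts(peak_seqs, k)
    background = kmer_counts(background_seqs, k)
    scores = {}
    for kmer, n in peak.items():
        peak_frac = (n + pseudocount) / (len(peak_seqs) + pseudocount)
        bg_frac = (background[kmer] + pseudocount) / (
            len(background_seqs) + pseudocount
        )
        scores[kmer] = peak_frac / bg_frac
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy peak sequences sharing one 6-mer, and toy background sequences.
peak_seqs = ["TTCACGTGAA", "GGCACGTGTT", "ACACGTGCAT"]
background_seqs = ["TTTTAAAACC", "GGGCCCATAT", "ACGTTTGGCA"]
ranked = enriched_kmers(peak_seqs, background_seqs, k=6)
print(ranked[0])  # ('CACGTG', 4.0)
```

Real motif finders allow mismatches and degenerate positions and assess statistical significance; the comparison with known motifs described above is a separate database lookup not shown here.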

55. The MEME Suite is even more sophisticated and contains all tools that are needed for motif analysis. http://meme-suite.org

56. Summary of ChIP-seq analysis: map all reads; occupancy calculation; differential peak calling; intersection of different signals; correlation of different signals; motif enrichment in peaks

57. Take home message. Raw reads -> mapping -> peak calling. There are hundreds of types of NGS experiments; we focus on chromatin. ChIP-seq data structure. Where NGS data is stored (GEO, etc). Must know: raw data; mapped reads; regions; sites; genome browsers; peaks; peak calling; heatmap; aggregate profile; Gene Ontology (GO). Optional video: https://www.youtube.com/watch?v=Ob9xGBPvr_s