Discovery of Structural Variation with Next-Generation Sequencing

Uploaded On 2024-01-29

Alexandre Gillet-Markowska (Alexandre.gillet-markowska@upmc.fr), Gilles Fischer Team, Biology of Genomes, UMR7238, Laboratory of Computational and Quantitative Biology, Université Pierre et Marie-Curie, Paris

Presentation Transcript

1. Discovery of Structural Variation with Next-Generation Sequencing. Alexandre Gillet-Markowska (Alexandre.gillet-markowska@upmc.fr). Gilles Fischer Team, Biology of Genomes, UMR7238, Laboratory of Computational and Quantitative Biology, Université Pierre et Marie-Curie, Paris

2. Outline: (i) Structural variations (SV); (ii) SV detection technologies; (iii) Read pairs: 2 types of Illumina genomic DNA libraries; (iv) SV detection using read pairs; (v) Polymorphic SV

3. Structural Variations (SV). Footnote 1: Yes, the minimal size is arbitrary…

4. Structural Variations (SV)

5. Structural Variations (SV)

6. Structural Variations (SV)

7. Structural Variations (SV)

8. Balanced SV versus Unbalanced SV. Balanced SV: INVERSION (INV), RECIPROCAL TRANSLOCATION (RT). Unbalanced SV (CNV): INSERTION (INS), DELETION (DEL), TANDEM DUPLICATION (DUP). Intrachromosomal SV versus interchromosomal SV. Each panel compares a reference (ref) with the variant (SV). Pictures adapted from Feuk et al., 2006, Nature Reviews; Calvin Blackman Bridges, Science

9. Why discover SV? Involved in > 30 diseases (psoriasis, Crohn disease, ASD…); chromosomal instability detected in the vast majority of cancers; powerful mechanism of adaptation and evolution

10. SV detection technologies

11. Timeline of technologies used to discover SV (SV, Structural Variations since 1936)
Comparative cytogenetics:
1936: Calvin Blackman Bridges, Science
1959: Lejeune, Study of somatic chromosomes from 9 mongoloid children, Hebd Seances Acad Sci
1986: Smith et al, Interstitial deletion of (17)(p11.2p11.2) in nine patients, Am J Med Genet

12. Timeline of technologies used to discover SV (SV, Structural Variations since 1936)
Comparative cytogenetics:
1936: Calvin Blackman Bridges, Science
1959: Lejeune, Study of somatic chromosomes from 9 mongoloid children, Hebd Seances Acad Sci
1986: Smith et al, Interstitial deletion of (17)(p11.2p11.2) in nine patients, Am J Med Genet
Microarrays (slide annotations: 200 and 221 CNV; 360 Mb of CNVR, 12% of the human genome):
2004: Iafrate, Detection of large-scale variation in the human genome, Nature
2004: Sebat, Large-scale copy number polymorphism in the human genome, Science
2006: Redon, Global variation in copy number in the human genome, Nature

13. Timeline of technologies used to discover SV (SV, Structural Variations since 1936)
Comparative cytogenetics:
1936: Calvin Blackman Bridges, Science
1959: Lejeune, Study of somatic chromosomes from 9 mongoloid children, Hebd Seances Acad Sci
1986: Smith et al, Interstitial deletion of (17)(p11.2p11.2) in nine patients, Am J Med Genet
Microarrays (slide annotations: 200 and 221 CNV; 360 Mb of CNVR, 12% of the human genome):
2004: Iafrate, Detection of large-scale variation in the human genome, Nature
2004: Sebat, Large-scale copy number polymorphism in the human genome, Science
2006: Redon, Global variation in copy number in the human genome, Nature
NGS (slide annotations: 20 000 SV; 1 000 SV):
2007: Korbel et al, Paired-end mapping reveals extensive structural variation in the human genome, Science
2010: 1000 HGP, A map of human genome variation from population-scale sequencing, Nature

14. ‘Range of usability’ of technologies: size limit; SV type limit

15.

16. SV detection with NGS data

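17. How to detect SV with NGS data? (Quinlan & Hall 2011, Trends in Genetics; Li 2011, Nature)
Four detection approaches are compared; their row labels are not preserved in this transcript.
Columns: breakpoint resolution; SV size range; CNV; balanced SV; FDR; missing rate.
Approach 1: >100 bp; > insert size; yes; yes; variable; variable.
Approach 2: 1 bp; 1 bp to 50 kbp; yes; yes; >10%; >25%.
Approach 3: 1-10 bp; >10 bp; yes; no; high?; high?
Approach 4: 1 bp; >1 bp; yes; yes; low; high?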

18. Read pairs: 2 types of Illumina genomic DNA libraries: 1) Illumina Paired-End; 2) Illumina Mate-Pair

19. 1) Illumina Paired-End

20. 2) Illumina Mate-Pair

21. Illumina Paired-End vs Mate-Pair: MP allows a better genome assembly than PE; MP makes it possible to detect SV that involve repeated elements

22. Illumina Paired-End vs Mate-Pair. [Figure: insert-size distribution of 100,000 read pairs; x-axis: insert size (bp); annotation: 5,000 (or much less…)]

23. Illumina Paired end vs Mate-Pair

24. SV detection with read pairs: (1) trim the data; (2) align data to reference genome; (3) remove PCR duplicates; (4) SV calling

25. Trim the data. First criterion: Chargaff rule

26. Trim the data. First criterion: %A = %T and %G = %C on both DNA strands
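
One way to apply this first criterion is to look at base composition position by position across all reads and flag the positions where %A and %T, or %G and %C, drift apart. The sketch below is only an illustration of that idea: the function names and the `tolerance` parameter are hypothetical and are not taken from the presentation or from any of the trimming tools it lists.

```python
from collections import Counter


def per_position_composition(reads):
    """Return, for each read position, the fraction of A, C, G and T."""
    length = max((len(r) for r in reads), default=0)
    fractions = []
    for i in range(length):
        # Count only unambiguous bases from reads long enough to reach position i.
        counts = Counter(r[i] for r in reads if i < len(r) and r[i] in "ACGT")
        total = sum(counts.values())
        fractions.append(
            {base: (counts[base] / total if total else 0.0) for base in "ACGT"}
        )
    return fractions


def chargaff_violations(reads, tolerance=0.05):
    """List 0-based positions where |%A - %T| or |%G - %C| exceeds tolerance.

    `tolerance` is an assumed, illustrative cutoff expressed as a fraction.
    """
    bad = []
    for i, f in enumerate(per_position_composition(reads)):
        if abs(f["A"] - f["T"]) > tolerance or abs(f["G"] - f["C"]) > tolerance:
            bad.append(i)
    return bad


if __name__ == "__main__":
    toy_reads = ["ATGC", "TACG", "AAGC", "ATCG"]
    print(chargaff_violations(toy_reads))
```

Positions flagged at the start or end of the reads would then be candidates for trimming; the right tolerance depends on the library and is left open here.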

27. Trim the data. Second criterion: nucleotide quality. Trimming tools: Bcbio-nextgen, Btrim, CANGS, Chipster, Clean reads, ConDeTri, Ea-utils, Fastx, Flexbar, PRINSEQ, Reaper, SeqTrim, Skewer, SolexaQA, TagCleaner, Trimmomatic
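
As a rough illustration of the second criterion, the sketch below removes low-quality bases from the 3' end of a read. It assumes Phred+33 quality encoding and a hypothetical cutoff of Q20; it is a deliberately simple rule and does not reproduce the algorithm of any tool listed on the slide.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20, offset=33):
    """Drop trailing bases whose quality is below min_quality.

    Returns the trimmed (sequence, quality_string) pair. `min_quality` is an
    assumed, illustrative threshold.
    """
    scores = phred_scores(quality_string, offset)
    end = len(scores)
    # Walk back from the 3' end until a base of acceptable quality is found.
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


if __name__ == "__main__":
    print(trim_3prime("ACGTACGT", "IIIII#!#"))
```

Real trimmers usually combine several rules (sliding windows, adapter removal, minimum read length), so this should be read as the simplest possible instance of quality-based trimming.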

28. Align the data to reference genome

29. Remove PCR duplicates. PCR duplicates annotation tools: samtools rmdup (only intra-molecular duplicates), markduplicates.jar (picard tools), FastUniq…
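
The general idea behind alignment-based duplicate annotation can be sketched as follows: read pairs whose two ends map to the same coordinates and strands are treated as copies of one original fragment, and only the best-scoring copy is kept. The `ReadPair` record and its fields are hypothetical names introduced for this illustration; the code is not a reimplementation of the tools listed above, whose exact rules differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical minimal record for an aligned read pair."""
    name: str
    chrom1: str
    pos1: int
    strand1: str
    chrom2: str
    pos2: int
    strand2: str
    quality_sum: int  # sum of base qualities, used to pick the copy to keep


def mark_duplicates(pairs):
    """Return the set of pair names flagged as PCR duplicates.

    Pairs sharing both mapping positions and both strands form one group;
    the pair with the highest quality_sum is kept (first seen wins on ties).
    """
    best = {}
    for pair in pairs:
        key = (pair.chrom1, pair.pos1, pair.strand1,
               pair.chrom2, pair.pos2, pair.strand2)
        if key not in best or pair.quality_sum > best[key].quality_sum:
            best[key] = pair
    kept = {pair.name for pair in best.values()}
    return {pair.name for pair in pairs if pair.name not in kept}


if __name__ == "__main__":
    pairs = [
        ReadPair("a", "chrI", 100, "+", "chrI", 500, "-", 60),
        ReadPair("b", "chrI", 100, "+", "chrI", 500, "-", 70),
        ReadPair("c", "chrI", 100, "+", "chrII", 500, "-", 50),
    ]
    print(mark_duplicates(pairs))
```

Because the key includes both chromosomes, this sketch also groups pairs whose mates map to different chromosomes, which matters for SV detection where such pairs carry the signal.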

30. SV signatures. SV have nearly identical signatures with MP and PE

31. SV signatures (Gillet-Markowska, 2014, Bioinformatics)

32. SV signatures

33. SV signatures
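
The signature figures are not preserved in this transcript. As a hedged illustration, the sketch below uses the commonly described read-pair signatures rather than the exact figures of the slides: mates on different chromosomes suggest a translocation, same-strand mates an inversion, outward-facing mates a tandem duplication, and an abnormally large or small mapped distance a deletion or an insertion. It assumes pairs are expressed in forward-reverse orientation (mate-pair reads would first need to be converted to that convention) and uses the distance between mate start positions as a crude proxy for insert size. The function name, labels, and the `n_sd` cutoff are hypothetical.

```python
def classify_pair(chrom1, pos1, strand1, chrom2, pos2, strand2,
                  mean_insert, sd_insert, n_sd=3):
    """Assign a candidate SV signature to one aligned read pair.

    Assumes forward-reverse library orientation; `n_sd` (number of standard
    deviations tolerated around the mean insert size) is an illustrative choice.
    """
    if chrom1 != chrom2:
        return "translocation"
    # Order the two mates by genomic position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pos1, strand1), (pos2, strand2)]
    )
    if left_strand == right_strand:
        return "inversion"
    if left_strand == "-" and right_strand == "+":
        # Mates point away from each other (everted pair).
        return "duplication"
    span = right_pos - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"
    return "concordant"


if __name__ == "__main__":
    print(classify_pair("chrI", 100, "+", "chrI", 5000, "-", 400, 40))
```

A real caller would cluster many discordant pairs supporting the same event before reporting it; a single pair is only a hint.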

34. Inter-tool variability is immense

35. Inter-tool variability is immense

36. Inter-tool variability is immense (adapted from ICGC-TCGA challenge)

37. Inter-tool variability is immense

38. SV examples

39. SV in the human genome (Korbel et al, Science 2007)

40. Not-so-identical monozygotic twins. Bruder, C. E. G. et al. Phenotypically concordant and discordant monozygotic twins display different DNA copy-number-variation profiles. Am. J. Hum. Genet. 82, 763–771 (2008)

41. Butterfly mimicry

42. Butterfly mimicry

43. Livestock phenotypes caused by CNV

44. Polymorphic SV Structural Variations (SV)

45. Polymorphic SV. Individual (germ line): SV in 100% of cells of each individual. Tissue (somatic): SV in one tissue / in a few cells

46. Can we detect de novo SV occurring in a single cell culture by high throughput sequencing? (S. cerevisiae)
Sequencing a single culture: 30 generations, from 1 cell (then 2, 4, ...) to 10^9 cells, followed by DNA extraction and sequencing.
Bottleneck series: Bottleneck 1 to Bottleneck 80, one every 30 generations (0, 30, 60, 90, 120, 150, ... 2400 generations), followed by DNA extraction and sequencing (n=80).
The physical coverage (theoretically) sets the detection threshold (coverage annotations on the slide: 6,000X and 700X; the accompanying generation and cell-count scale is only partly legible in this transcript).
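
The statement that physical coverage sets the detection threshold can be illustrated with a back-of-the-envelope model. The sketch below assumes synchronous doublings from a single founder cell, no selection or cell death, and uniform coverage of the breakpoint, so that an SV arising in one cell at generation g is carried by 1/2^g of the final culture. These assumptions, the function names, and the `min_pairs` parameter are introduced here for illustration; only the 30 generations and the 6,000X and 700X coverage values come from the slide.

```python
import math


def cells_after(generations):
    """Cells descended from one founder after synchronous doublings."""
    return 2 ** generations


def sv_frequency(generation_of_origin):
    """Fraction of the final culture carrying an SV that arose in a single
    cell at the given generation (generation 0 is the founder cell)."""
    return 1 / cells_after(generation_of_origin)


def expected_supporting_pairs(physical_coverage, generation_of_origin):
    """Expected number of read pairs spanning the SV breakpoint."""
    return physical_coverage * sv_frequency(generation_of_origin)


def latest_detectable_generation(physical_coverage, min_pairs=1):
    """Largest generation g with physical_coverage / 2**g >= min_pairs.

    `min_pairs` (supporting pairs required to call an SV) is an assumed
    threshold; returns None if even a founder SV would not reach it.
    """
    if physical_coverage < min_pairs:
        return None
    return math.floor(math.log2(physical_coverage / min_pairs))


if __name__ == "__main__":
    print(cells_after(30))  # about 10^9 cells, as on the slide
    for coverage in (6000, 700):
        print(coverage, latest_detectable_generation(coverage))
```

Under this simplified model, requiring more than one supporting pair, or accounting for uneven coverage, would push the detectable window toward earlier generations.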

47. Sequencing with high physical coverage. Paired-End sequencing: insert size ~ 400 bp. [Figure: Reference and Cell 1 to Cell 10]

48. Sequencing with high physical coverage. Paired-End sequencing: insert size ~ 400 bp. [Figure: Reference and Cell 1 to Cell 10]

49. Sequencing with high physical coverage. Paired-End sequencing: insert size ~ 400 bp. [Figure: Reference and Cell 1 to Cell 10; sequence coverage track (axis 0 to 2)] covseq = 0.5X

50. Sequencing with high physical coverage. Paired-End sequencing: insert size ~ 400 bp. [Figure: Reference and Cell 1 to Cell 10; sequence coverage track (axis 0 to 2); physical coverage track (axis 0 to 2)] covseq = 0.5X; covphys = 0.85X

51. Sequencing with high physical coverage. Paired-End sequencing: insert size ~ 400 bp. [Figure: Reference and Cell 1 to Cell 10; sequence coverage track (axis 0 to 2); physical coverage track (axis 0 to 2)] covseq = 0.5X; covphys = 0.85X; covSV = 0 (annotated twice)

52. Sequencing with high physical coverage. Mate-Pair sequencing: insert size ~ 1 to 20 kb. [Figure: Reference and Cell 1 to Cell 10, with a discordant paired sequence]

53. Sequencing with high physical coverage. Mate-Pair sequencing: insert size ~ 1 to 20 kb. [Figure: Reference and Cell 1 to Cell 10, with a discordant paired sequence; sequence coverage track (axis 0 to 2); physical coverage track (axis 0 to 10)] covseq = 0.5X; covphys = 5X; covSV = 1. Mate-Pair sequencing increases the sensitivity of SV detection
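
The difference between sequence coverage and physical coverage can be made concrete with a short calculation. The sketch below uses a common convention in which sequence coverage counts only sequenced bases, while physical coverage counts the whole fragment spanned by a pair (both reads plus the unsequenced middle). The function names and the example numbers (a 12 Mb genome, 100 bp reads, 30,000 pairs) are hypothetical and are not meant to reproduce the covseq and covphys values drawn on the slides.

```python
def sequence_coverage(n_pairs, read_length, genome_size):
    """Average number of times a base is read: 2 reads per pair."""
    return 2 * n_pairs * read_length / genome_size


def physical_coverage(n_pairs, insert_size, genome_size):
    """Average number of fragments spanning a position, where insert_size
    is the full fragment length including both reads."""
    return n_pairs * insert_size / genome_size


if __name__ == "__main__":
    genome_size = 12_000_000  # hypothetical genome size in bp
    n_pairs = 30_000          # hypothetical number of read pairs
    read_length = 100         # hypothetical read length in bp

    print("covseq:", sequence_coverage(n_pairs, read_length, genome_size))
    for insert_size in (400, 4_000):
        print("insert", insert_size, "covphys:",
              physical_coverage(n_pairs, insert_size, genome_size))
```

For the same number of reads, sequence coverage is unchanged while physical coverage grows in proportion to the insert size, which is why longer mate-pair inserts give more pairs spanning any given breakpoint.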

54.

55.

56. Illumina Paired-End

57. Illumina Paired-End