Whole Genome Sequencing for Epidemiologists: A Brief Introduction
Joel R Sevinsky, PhD
Objectives
- Microbial genomes
- Common isolate identification techniques using molecular biology
- Whole genome sequencing (WGS)
- Example of WGS for outbreak investigation
- Questions
Microbial Genomes
E. coli genome size varies from 4.56 to 5.70 Mb.
How big is 5 Mb?
Harry Potter Story
Long story, some big books!
- 1,084,440 words in all seven books
- Average word length: ~5 letters
- ~5,422,200 letters total in the box set
- E. coli genomes range from 4.56 to 5.70 Mb
- A single E. coli genome ≈ 1 box set
- A single human genome ≈ 1,000 box sets
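The box-set comparison is simple arithmetic, and a short sketch makes it easy to check. The word count, word length, and E. coli range come from the slide; the human genome size used here (about 3.1 billion bases for one haploid copy) is an assumption added for illustration.

```python
# Figures quoted on the slide.
WORDS_IN_BOX_SET = 1_084_440
AVG_WORD_LENGTH = 5  # letters per word, approximate
ECOLI_GENOME_RANGE_BP = (4_560_000, 5_700_000)

# Assumption (not from the slide): one haploid human genome, ~3.1 billion bases.
HUMAN_GENOME_BP = 3_100_000_000

letters_per_box_set = WORDS_IN_BOX_SET * AVG_WORD_LENGTH


def box_sets(genome_bp: int) -> float:
    """Express a genome size as a number of box sets (one letter per base)."""
    return genome_bp / letters_per_box_set


print(f"Letters per box set: {letters_per_box_set:,}")
for size in ECOLI_GENOME_RANGE_BP:
    print(f"E. coli genome of {size:,} bp = {box_sets(size):.2f} box sets")
print(f"Haploid human genome = {box_sets(HUMAN_GENOME_BP):.0f} box sets")
```

Under that assumption a haploid human genome comes to several hundred box sets, and counting both copies roughly doubles it, so "1,000 box sets" is best read as an order-of-magnitude figure.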
PFGE (Pulsed Field Gel Electrophoresis)
What do these bands really mean?
PFGE (Pulsed Field Gel Electrophoresis)
[Figure: five 5 Mb genomes cut at 1, 2, 3, 2, and 4 restriction enzyme sites, with the resulting bands on a gel scaled in Mb (5, 1, 0.5). Labels: restriction enzyme site; genome size in Mb.]
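A minimal sketch of what the bands represent, assuming a circular 5 Mb chromosome and made-up cut positions (none of these coordinates come from the slide). Each band is the distance between two neighbouring restriction sites, so the gel reports fragment sizes only, not where the sites sit or what sequence lies between them.

```python
def digest_circular(genome_size_bp, cut_sites):
    """Return fragment sizes (largest first) after cutting a circular genome."""
    sites = sorted(set(cut_sites))
    if any(site < 0 or site >= genome_size_bp for site in sites):
        raise ValueError("cut sites must lie within the genome")
    if not sites:
        return [genome_size_bp]  # an uncut circle stays in one piece
    fragments = [b - a for a, b in zip(sites, sites[1:])]
    # The last fragment wraps around the origin of the circle.
    fragments.append(genome_size_bp - sites[-1] + sites[0])
    return sorted(fragments, reverse=True)


GENOME = 5_000_000  # 5 Mb, as on the slide

# Hypothetical cut positions for three isolates.
isolate_a = [500_000, 2_000_000, 4_000_000]
isolate_b = [600_000, 2_100_000, 4_100_000]             # same spacing, shifted
isolate_c = [500_000, 2_000_000, 3_000_000, 4_000_000]  # one extra site

for name, sites in [("A", isolate_a), ("B", isolate_b), ("C", isolate_c)]:
    print(name, digest_circular(GENOME, sites))
```

Isolates A and B give the same banding pattern even though every site is in a different place, while the single extra site in C splits one band into two.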
Harry Potter Story

Table 1. Word frequency by book
Book                  Broomstick (n)  Spell (n)  Wand (n)  Wizard (n)
Sorcerer's Stone      27              14         62        41
Chamber of Secrets    12              6          107       44
Prisoner of Azkaban   20              6          114       39

Table 2. Word frequency by book
Book                  Voldemort (n)
Sorcerer's Stone      31
Chamber of Secrets    20
Prisoner of Azkaban   37

Specific word = restriction enzyme site.
Word frequency determines the banding pattern.
Different words represent different enzymes.
What does PFGE really tell you then?
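The word analogy can be sketched directly: treat one word as the restriction site, cut the text wherever it occurs, and keep only the fragment lengths. The sentences below are invented for illustration, and the text is treated as linear for simplicity.

```python
def cut_fragments(text, site):
    """Cut text at every occurrence of site and return the fragment lengths."""
    return [len(piece) for piece in text.split(site)]


SITE = "wand"  # the "enzyme" recognises this word

# Hypothetical sentences standing in for three isolates.
text_a = "harry raised his wand and the wand sparked"
text_b = "harry raised her wand and the wand sparked"  # change outside the site
text_c = "harry raised his wand and the wend sparked"  # change inside a site

for name, text in [("A", text_a), ("B", text_b), ("C", text_c)]:
    print(name, cut_fragments(text, SITE))
```

Texts A and B differ, yet their fragment lengths are identical, because the change does not touch the word being cut. Text C loses a site and so loses a band. The pattern only reflects how often and how far apart one word appears.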
Isolate Identification Techniques
- Serotyping
- PFGE (Pulsed Field Gel Electrophoresis): total gDNA fragments
- 16S rRNA (ribosomal RNA) sequencing: 1 gene
- MLST (Multi Locus Sequence Typing): 7 genes
- wgMLST (Whole Genome Multi Locus Sequence Typing): thousands of reference genes plus pan genome
- wgSNP or hqSNP (Whole Genome Single Nucleotide Polymorphism Typing): total gDNA
[Diagram labels: Protein, DNA, Sequencing, WGS; axis: Information]
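To see why more loci mean more information, here is a small sketch comparing two isolates by allele profile. The seven locus names are those of one widely used E. coli MLST scheme and are listed for illustration only; the allele numbers and the `wg_locus_*` names are hypothetical.

```python
# Seven housekeeping loci from one widely used E. coli MLST scheme.
MLST_LOCI = ("adk", "fumC", "gyrB", "icd", "mdh", "purA", "recA")


def allele_differences(profile_a, profile_b, loci):
    """Count loci at which two profiles carry different allele numbers."""
    return sum(1 for locus in loci if profile_a[locus] != profile_b[locus])


# Hypothetical allele numbers shared by both isolates at the 7 MLST loci.
shared_core = dict(zip(MLST_LOCI, (1, 2, 3, 4, 5, 6, 7)))

# Hypothetical extra loci, standing in for the thousands used by wgMLST.
isolate_a = {**shared_core, "wg_locus_0001": 12, "wg_locus_0002": 3, "wg_locus_0003": 8}
isolate_b = {**shared_core, "wg_locus_0001": 12, "wg_locus_0002": 4, "wg_locus_0003": 9}

all_loci = tuple(isolate_a)

print("7-gene MLST differences:", allele_differences(isolate_a, isolate_b, MLST_LOCI))
print("All-locus differences:  ", allele_differences(isolate_a, isolate_b, all_loci))
```

Both isolates would receive the same sequence type from the 7-gene comparison, and only the wider comparison separates them.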
Whole Genome Sequencing
40 box sets
Whole Genome Sequencing
[Figure: the 35-base sequence ATGCGTGATCTAGTAGTCTAGGAGCTGACCGATTA repeated many times across the slide]
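If "40 box sets" is read as sequencing depth, meaning about 40 genome-equivalents of sequence per isolate, the number of reads needed follows from a one-line formula. That reading is an interpretation rather than something the slides state, and the 150-base read length below is an assumed value.

```python
import math


def mean_depth(n_reads, read_length_bp, genome_size_bp):
    """Average number of times each base is covered by a read."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_depth, read_length_bp, genome_size_bp):
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


GENOME_SIZE_BP = 5_000_000  # roughly one E. coli genome
READ_LENGTH_BP = 150        # assumed read length, not from the slides
TARGET_DEPTH = 40           # "40 box sets", read as 40x depth

n = reads_needed(TARGET_DEPTH, READ_LENGTH_BP, GENOME_SIZE_BP)
print(f"{n:,} reads give {mean_depth(n, READ_LENGTH_BP, GENOME_SIZE_BP):.1f}x mean depth")
```

Mean depth is an average: individual positions will be covered more or less often than the target.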
WGS for Outbreak Investigations
Salmonella enterica serovar Enteritidis
[Figure labels: JEGX01.002, JEGX01.004]
Legend:
A = suspect isolate, same time/PFGE
B = same patient over 5 weeks
C = suspect isolate for outbreak 5
D = environmental isolate, egg farm swab
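One common way to compare isolates in an investigation like this is to count single-nucleotide differences between aligned sequences. The sketch below uses a short made-up alignment, with invented isolate names and variants; real analyses compare millions of positions and apply quality filters that are not shown.

```python
from itertools import combinations

REFERENCE = "ATGCGTGATCTAGTAGTCTAGGAGCTGACCGATTA"


def mutate(seq, changes):
    """Return seq with the bases at the given 0-based positions replaced."""
    bases = list(seq)
    for pos, base in changes.items():
        if bases[pos] == base:
            raise ValueError(f"position {pos} is already {base}")
        bases[pos] = base
    return "".join(bases)


def snp_distance(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical isolates: two identical, one close, one distant.
isolates = {
    "patient_1": REFERENCE,
    "patient_2": REFERENCE,
    "suspect": mutate(REFERENCE, {10: "A"}),
    "unrelated": mutate(REFERENCE, {2: "A", 10: "C", 20: "T", 30: "A"}),
}

for (name_a, seq_a), (name_b, seq_b) in combinations(isolates.items(), 2):
    print(f"{name_a} vs {name_b}: {snp_distance(seq_a, seq_b)} SNPs")
```

The pairwise counts can then feed a tree or a cluster definition. What distance counts as "related" depends on the organism and the outbreak, and is not something this sketch decides.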
WGS Beyond Outbreak Investigations
“…comparison of these 61 genomes sequences revealed that neither the 16S gene, nor the gene fragments usually used for MLST, provides biologically meaningful information on the relatedness of the sequenced isolates. The best way to analyze this is by taking into account all the genomic content, rather than looking at one or a few individual genes.”
WGS Beyond Outbreak Investigations
Genome size varies from 4.56 to 5.70 Mb.
This size variation means two isolates can differ by about 1 Mb of genomic content (5.70 − 4.56 = 1.14 Mb).
1 Mb ≈ 1,000 genes
Reference Characterization by WGS
“One Shot” Characterization of STEC
GENUS/SPECIES: Escherichia coli
SEROTYPE: O104:H4
PATHOTYPE: Shiga toxin-producing and enteroaggregative E. coli (STEC & EAEC)
VIRULENCE PROFILE: stx2a, aggR, aggA, sigA, sepA, pic, aatA, aaiC, aap
SEQUENCE TYPE: ST678
ANTIMICROBIAL RESISTANCE GENES: blaTEM-1, blaCTX-M-15, strAB, sul2, tet(A), dfrA7
wgMLST CODE: 102:45.26.35.3
Tools shown: ANI, SerotypeFinder, VirulenceFinder, 7-gene MLST, ResFinder, Phylogenetic ID
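A "one shot" report of this kind is essentially a structured record, and simple rules can be read off the gene lists. The sketch below stores the slide's values in a hypothetical `IsolateReport` class and applies a deliberately simplified rule (any `stx` gene suggests STEC, `aggR` suggests EAEC). The class, the rule, and the function name are illustrative assumptions, not the logic of any of the tools named on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IsolateReport:
    """Hypothetical container for a WGS-derived characterization."""
    species: str
    serotype: str
    sequence_type: str
    virulence_genes: frozenset
    resistance_genes: frozenset


def suggest_pathotypes(report):
    """Simplified, illustrative rule: infer pathotypes from marker genes."""
    calls = []
    if any(gene.startswith("stx") for gene in report.virulence_genes):
        calls.append("STEC")  # Shiga toxin gene present
    if "aggR" in report.virulence_genes:
        calls.append("EAEC")  # enteroaggregative marker present
    return calls


# Values taken from the slide; tet(A) is read from a garbled entry.
report = IsolateReport(
    species="Escherichia coli",
    serotype="O104:H4",
    sequence_type="ST678",
    virulence_genes=frozenset(
        {"stx2a", "aggR", "aggA", "sigA", "sepA", "pic", "aatA", "aaiC", "aap"}
    ),
    resistance_genes=frozenset(
        {"blaTEM-1", "blaCTX-M-15", "strAB", "sul2", "tet(A)", "dfrA7"}
    ),
)

print(report.serotype, report.sequence_type, suggest_pathotypes(report))
```

For the slide's isolate the rule returns both STEC and EAEC, matching the pathotype listed. Real pathotype and resistance calls rest on curated databases and validation rather than a two-line rule.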
Summary of Potential WGS Applications
- Outbreak investigation
  - Sporadic vs outbreak
  - Not just cluster but phylogenetic relationships
- Microbial Source Tracking (MST)
- Microbial surveillance
  - Food
  - Environment: animals, soil, food prep areas, hospitals, etc.
- Antibiotic resistance monitoring
  - Genotype predicts phenotype
  - Mobile vs integrated
- Virulence gene monitoring
- What else?
Questions?