Slide1
Outbreak of E. coli O104:H4 heralds a new paradigm in responding to disease threats
Nicola J. Holden
Leighton Pritchard
Slide2
EHEC O104:H4 outbreak, Europe 2011
Unprecedented:
scale of outbreak (3950 affected, 53 deaths; multiple import restrictions)
emerging pathogen (one previous case in S. Korea)
rapid production of sequence data
crowd-sourcing of assembly and annotation via GitHub
https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki
Slide3
EHEC O104:H4 outbreak, Europe 2011
Unprecedented:
scale of outbreak (3950 affected, 53 deaths; multiple import restrictions)
emerging pathogen (one previous case in S. Korea)
rapid production of sequence data
crowd-sourcing of assembly and annotation via a collaborative revision control site: GitHub
https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki
Slide4
EHEC O104:H4 outbreak – timeline
1st May: onset of outbreak
26th May: strain characteristics (Scheutz et al., 2012 Eurosurveill)
30th May: diagnostic laboratory information released (Muenster)
2nd June: first draft assembly available (GitHub)
9th to 21st June: additional sequences announced
22nd June: microbiological characteristics published (Bielaszewska et al., 2011 LID)
26th July: official end of the outbreak (RKI)
refs: https://github.com/ehec-outbreak-crowdsourced/BGI-data-analysis/wiki; RKI; Institute of Hygiene, Muenster
Slide5
EHEC O104:H4 outbreak – timeline
Slide6
EHEC O104:H4 outbreak – timeline
Slide7
EHEC O104:H4 outbreak – timeline
27th July: publication of open-source genomic analysis
Slide8
A changing paradigm?
Kwan et al. (2011) http://precedings.nature.com/documents/6663/version/1
Slide9
Meanwhile: diagnostics
27th June – 6th July
Outbreak isolate-specific, sub-serotype diagnostics
Exploit rapid sequencing: work directly from incomplete and unordered draft genome sequences
Rapidly generated (perhaps ahead of the biology?)
Validated (good estimates of error rates)
Easy to use and distribute
Cheap(er than sequencing everything)
Slide10
Meanwhile: diagnostics
27th June – 6th July
Outbreak isolate-specific, sub-serotype diagnostics
Exploit rapid sequencing: work directly from incomplete and unordered draft genome sequences
Rapidly generated (perhaps ahead of the biology?)
Validated (good estimates of error rates)
Easy to use and distribute
Cheap(er than sequencing everything)
Alignment-free PCR primer design: no need to identify conserved signature sequences prior to primer design
Slide11
Alignment-free primer design: strategy
‘Positive’ genome set: 11 genome assemblies of 9 EHEC O104:H4 outbreak isolates (GitHub crowdsourcing)
‘Negative’ genome set: 31 genomes of E. coli and E. fergusonii (GenBank)
Design many (>1000) primers to positive genome set: target CDS; optimise for qRT; 20-mers; 100 bp amplicons; T_A = 58 °C
Filter primers in silico: exclude sets with predicted productive amplification in negative genomes.
Screen primers to exclude sets with strong sequence similarity to any of a larger set of off-target genomes (GenBank Enterobacteriaceae)
Slide12
Alignment-free primer design: strategy
Slide13
Automation
https://github.com/widdowquinn/find_differential_primers
Slide14
Alignment-free primer design
[Diagram: genomes labelled I to V, grouped into positive and negative sets]
1. Process configuration files: locations and classes of input sequence files.
2. Convert to single (pseudo)chromosomes: concatenate draft genome sequence.
3. Genome feature locations: from GBK file or predicted with Prodigal.
Slide15
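Step 2 above (joining a draft assembly into a single pseudochromosome) can be sketched roughly as follows. This is a minimal illustration and not the code from find_differential_primers: the function names, the FASTA parsing and the run of N characters used as a spacer are all assumptions made for this example.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def concatenate_contigs(fasta_text, spacer="N" * 50):
    """Join all contigs into one pseudochromosome string.

    The spacer is an assumption of this sketch: a run of Ns between
    contigs so that no primer can be placed across a contig boundary.
    Returns the joined sequence and a list of (name, start, end) offsets,
    so that features can be mapped back to their original contig.
    """
    parts, offsets, pos = [], [], 0
    for name, seq in parse_fasta(fasta_text):
        if parts:
            parts.append(spacer)
            pos += len(spacer)
        offsets.append((name, pos, pos + len(seq)))
        parts.append(seq)
        pos += len(seq)
    return "".join(parts), offsets


# Hypothetical two-contig draft assembly
draft = ">contig1\nACGT\nAC\n>contig2\nggtt\n"
pseudo, offsets = concatenate_contigs(draft, spacer="NNN")
print(pseudo)
print(offsets)
```

Keeping the offsets alongside the joined sequence is one way to let later steps report primer positions in contig coordinates.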
Primer prediction (on positive set)
[Diagram: genomes labelled I to V, grouped into positive and negative sets]
4. Predict primer locations: > 1000 thermodynamically plausible primer sets on each (pseudo)chromosome, using Primer3.
Slide16
Test cross-amplification in silico
[Diagram: genomes labelled I to V, grouped into positive and negative sets]
5. Check cross-amplification: all primer sets tested against other organisms, using PrimerSearch.
6. BLAST screen: all primers screened for off-target sequences with BLAST: 7 possible primer sets
Slide17
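The idea behind the in silico cross-amplification check in step 5 can be illustrated with a toy exact-match search. The pipeline itself uses PrimerSearch, which tolerates mismatches; the sketch below is a simplified stand-in and does not reproduce it, and the function names and the `max_len` cutoff are assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(haystack, needle):
    """Return every start index of needle in haystack (overlaps allowed)."""
    hits, i = [], haystack.find(needle)
    while i != -1:
        hits.append(i)
        i = haystack.find(needle, i + 1)
    return hits


def predicted_amplicons(genome, forward, reverse, max_len=1000):
    """Amplicon lengths predicted by exact matching on the given strand.

    The forward primer must occur as written; the reverse primer, written
    5'->3', must occur downstream as its reverse complement.
    """
    genome = genome.upper()
    forward = forward.upper()
    rc_reverse = reverse_complement(reverse.upper())
    lengths = []
    for fstart in find_all(genome, forward):
        for rstart in find_all(genome, rc_reverse):
            end = rstart + len(rc_reverse)
            # Primers must not overlap, and the product must be short enough.
            if fstart + len(forward) <= rstart and end - fstart <= max_len:
                lengths.append(end - fstart)
    return lengths


def amplifies(genome, forward, reverse, max_len=1000):
    """True if the primer pair is predicted to amplify from either strand."""
    return bool(
        predicted_amplicons(genome, forward, reverse, max_len)
        or predicted_amplicons(genome, reverse, forward, max_len)
    )


# Hypothetical 10-mer primers and a toy target sequence
fwd, rev = "ACCTGAGTCA", "TGTCTGAACC"
target = "TT" + fwd + "CCCCC" + reverse_complement(rev) + "TT"
print(predicted_amplicons(target, fwd, rev))
print(amplifies("TTTTTTTTTT", fwd, rev))
```

A primer set would be excluded at this step if `amplifies` returned True for any genome in the negative set.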
Classify primers and validation
[Diagram: genomes labelled I to V, with primer sets scored +ve or -ve]
7. Classify primers: primer sets classified according to their ability to amplify specific classes of input sequence.
8. Validate primers: primer sets validated on positive and negative targets in vitro.
5 target sequences: prophage gp20 (2), hypothetical CDS (2), impB (1)
Slide18
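The classification in step 7 amounts to comparing, for each primer set, the genomes it is predicted to amplify against the membership of each input class. A minimal sketch follows; the data layout, the function name and the strict rule used here (a set is class-specific only if it amplifies every member of the class and nothing else) are assumptions for illustration, not the pipeline's actual logic.

```python
def classify_primer_sets(amplification, genome_classes):
    """Assign primer sets to the class they amplify specifically.

    amplification:  {primer_set_name: set of genome names amplified}
    genome_classes: {class_name: set of genome names in that class}
    Returns {class_name: sorted list of primer sets specific to the class}.
    """
    result = {class_name: [] for class_name in genome_classes}
    for primer_set, amplified in amplification.items():
        for class_name, members in genome_classes.items():
            # Specific: amplifies every member of the class and nothing else.
            if members and amplified == members:
                result[class_name].append(primer_set)
    for hits in result.values():
        hits.sort()
    return result


# Hypothetical genomes and predicted amplification results
classes = {"outbreak": {"g1", "g2"}, "other": {"g3"}}
predicted = {
    "p1": {"g1", "g2"},        # outbreak-specific
    "p2": {"g1", "g2", "g3"},  # amplifies everything: not diagnostic
    "p3": {"g1"},              # misses an outbreak genome
    "p4": {"g3"},              # specific to the other class
}
print(classify_primer_sets(predicted, classes))
```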
Validation
In silico, diagnostic primers are just another classifier
Validation on unseen data is critical (avoid overfitting, estimation of performance)
Direct experimental validation of primer candidates (Münster):
‘Positive’ set = 21 clinical outbreak isolates
‘Negative’ set = 32 HUSEC / EPEC isolates
Positive control = LB 226692
Slide19
Primer design: validated in vitro
[Figure: in vitro validation results, positive and negative panels]
Slide20
Alignment-free primer design: summary
Individual primer sets: 100 % sensitivity; 82–94 % specificity; 9 % < FDR < 22 %
Combining primers: 100 % sensitivity and specificity
A minimal combination of two primer sets discriminated absolutely between outbreak O104:H4 isolates and non-outbreak E. coli isolates, including HUSEC 041
Flexibility in strategy allows for targeted design, e.g. multiplex PCR / different organisms / large gene families etc.
Same approach used for:
Resolving Dickeya plant pathogens
Discriminating between RxLR effectors in Phytophthora infestans
Slide21
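The summary figures above are standard classifier metrics, and combining primer sets can be read as requiring every set in the combination to give a positive result. The sketch below shows that arithmetic on made-up calls; the isolate counts and results are illustrative only and are not the validation data from the talk.

```python
def diagnostic_metrics(calls, truth):
    """Sensitivity, specificity and false discovery rate for boolean calls.

    Assumes at least one true positive isolate, one true negative isolate
    and one positive call, so that no denominator is zero.
    """
    pairs = list(zip(calls, truth))
    tp = sum(1 for c, t in pairs if c and t)
    fp = sum(1 for c, t in pairs if c and not t)
    tn = sum(1 for c, t in pairs if not c and not t)
    fn = sum(1 for c, t in pairs if not c and t)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fdr": fp / (tp + fp),
    }


def combine(calls_a, calls_b):
    """Call an isolate positive only if both primer sets amplify it."""
    return [a and b for a, b in zip(calls_a, calls_b)]


# Illustrative data: 3 outbreak isolates followed by 4 non-outbreak isolates
truth = [True, True, True, False, False, False, False]
set_a = [True, True, True, True, False, False, False]   # one false positive
set_b = [True, True, True, False, True, False, False]   # a different false positive

print(diagnostic_metrics(set_a, truth))
print(diagnostic_metrics(combine(set_a, set_b), truth))
```

In this toy case each primer set alone has one false positive, but because the false positives fall on different isolates, the combined call removes both while keeping every true positive.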
Alignment-free primer design: summary
Bypass the need for:
multiple genomic alignments
biological justification for primer choice (maybe even reveal biology…)
Produce diagnostic primers for any subgroup of organisms (possibly…)
Limitations
Scaling issue: PrimerSearch is slow (modular pipeline allows use of alternative programs)
Low specificity of primers -> use qPCR
Very similar organisms may not be distinguished
Time from genomes to primer sets: 90 hours
Possibility for improvements as collaborative bioinformatics projects (speed up off-target primer mapping, make into user-friendly tool…)
Slide22
Acknowledgements
nicola.holden@hutton.ac.uk
leighton.pritchard@hutton.ac.uk
Thanks to Nadine Brandt, Kath Wright and Sean Chapman
Slide23
Sprouted seeds as a source of infections
Slide24
‘Sproutbreak’ - Jimmy Johns restaurant
Slide25
Colonisation of spinach by VTEC O157:H7 Sakai (vt-)
Slide26