Slide1
Very Short Dispersed Repeats
Also, palindromes
Slide2
What are they?
Short sequences (between 4 and 13-ish nucleotides) – one might even say they’re very short
Occur multiple times in the genome, at dispersed intervals (not repeated right next to each other) – “distributed or spread over a wide interval”
Slide3
It turns out that many of these very short dispersed sequences are palindromic. What does that mean?
The sequence reads the same in the 5’→3’ direction on one strand as it does in the 5’→3’ direction on the complementary strand; in other words, it equals its own reverse complement.
Slide4
Occurrence of Highly Iterated Palindromes (HIP1) in Cyanobacteria
Slide5
Palindromic Sequence in Cyanobacteria
5’-GCGATCGC-3’
3’-CGCTAGCG-5’
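As a minimal sketch of the definition above, the Python below tests whether a sequence equals its own reverse complement. The helper names are illustrative and are not taken from the original slides.

```python
# Illustrative helpers (names are assumptions, not from the slides).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence containing only A/C/G/T."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindrome(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)

print(is_palindrome("GCGATCGC"))  # HIP1 -> True
print(is_palindrome("GCGATCGA"))  # last base changed -> False
```

Only even-length sequences can pass this test, because a middle base can never be its own complement.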
But not all cyanobacteria have this sequence
Slide6
Slide7
Blast Search
What is the purpose of HIP1?
ATGCATGATACGTA
GCGATCGC
CACCCGGGATT
GCGATCGC
Match: GCGATCGC
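The matching step illustrated here can be sketched as plain exact string matching (this is not BLAST itself, which performs approximate alignment). The sketch assumes the four sequence lines on the slide are consecutive pieces of one sequence; `find_all` is a hypothetical helper name.

```python
def find_all(genome: str, motif: str) -> list[int]:
    """Return 0-based start positions of every match, overlapping ones included."""
    positions = []
    start = genome.find(motif)
    while start != -1:
        positions.append(start)
        start = genome.find(motif, start + 1)
    return positions

# Assumption: the slide's four lines are joined into a single toy sequence.
toy = "ATGCATGATACGTA" + "GCGATCGC" + "CACCCGGGATT" + "GCGATCGC"
print(find_all(toy, "GCGATCGC"))  # [14, 33]
```

The returned positions are what one would then compare against gene coordinates to ask which genes are nearby.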
What genes are nearby?
Prior research links HIP1 to Dam methylase recognition and to DNA profiling
Slide8
Thermosynechococcus elongatus BP-1 & Synechococcus elongatus PCC 6301
Slide9
Prochlorococcus marinus MIT 9301 & Trichodesmium erythraeum IMS101
Slide10
Results?
Hypothetical proteins: highest peak in Thermosynechococcus and Trichodesmium
Many metabolic proteins, such as ferredoxin and peptidase
Program to overlap nearby genes in different organisms
Slide11
References
Robinson, P. J., Cranenburgh, R. M., Head, I. M. and Robinson, N. J. (1997). HIP1 propagates in cyanobacterial DNA via nucleotide substitutions but promotes excision at similar frequencies in Escherichia coli and Synechococcus PCC 7942. Molecular Microbiology, 24: 181–189.
Robinson, P. J., Gupta, A., Bleasby, A., Whitton, B., Morby, A. P. Singular over-representation of an octameric palindrome, HIP1, in DNA from many cyanobacteria.
Moya, A., Delaye, L. Abundance and distribution of the highly iterated palindrome 1 (HIP1) among prokaryotes. Mob Genet Elements. 2011 Sep-Oct; 1(3): 159-168.
Slide12
Occurrences of DNA uptake sequence from Haemophilus influenzae in other Pasteurellaceae bacteria
By: Noha Mudhaffar
Slide13
AAGTGCGGT
1516 (479 – 1001)
1913428
0.3815231
Slide14
Highly Repeated Sequences
Slide15
Family: Pasteurellaceae
Slide16
Organism                                            Length   GC-FRACTION Occurrences of DNA USS
Actinobacillus-succinogenes-130Z                    2319663  0.44918594  1690
Haemophilus-influenzae-86-028NP                     1913428  0.3815231   1516
Actinobacillus-actinomycetemcomitans-HK1651         1995520  0.44412282  1507
Mannheimia-succiniciproducens-MBEL55E               2314078  0.42537978  1485
Haemophilus-influenzae-R2846                        1824242  0.37971717  1461
Haemophilus-somnus-2336                             2263857  0.37378067  1355
Haemophilus-somnus-129PT                            2012878  0.37191722  1245
Haemophilus-influenzae-R2866                        1933340  0.38079283  952
Pasteurella-multocida-subsp-multocida-str-Pm70      2257487  0.40404883  927
Haemophilus-influenzae-86028NP                      1738864  0.38510486  888
Haemophilus-influenzae-Rd-KW20                      1830138  0.38147888  737
Actinobacillus-pleuropneumoniae-L20                 2274482  0.41299513  73
Actinobacillus-pleuropneumoniae-serovar-1-str-4074  2292348  0.41376877  63
Mannheimia-haemolytica                              2498406  0.40754706  59
Haemophilus-ducreyi-35000HP                         1698955  0.38220495  41
Slide17
Organism                         Length   GC-FRACTION Occurrences of DNA USS
Escherichia-coli-DH10B           5004529  0.5103499   19
Escherichia-coli-53638           5289471  0.5095685   22
Escherichia-coli-HS              4643538  0.5081961   5
Escherichia-coli-E24377A         4980187  0.50621593  19
Escherichia-coli-E2348-69        5059346  0.5050742   18
Escherichia-coli-F11             5206906  0.5049682   19
Escherichia-coli-042             5379979  0.50532633  24
Escherichia-coli-E110019         5384084  0.5077157   17
Escherichia-coli-K12             4639221  0.50788873  23
Escherichia-coli-O157-H7         5594477  0.5048416   27
Escherichia-coli-B7A             5202558  0.5084804   15
Escherichia-coli-APEC-O1         5497653  0.5033812   14
Escherichia-coli-B171            5299753  0.50713533  19
Escherichia-coli-CFT073          5231428  0.50474805  7
Escherichia-coli-W3110           4646332  0.5079958   22
Escherichia-coli-E22             5516160  0.506397    22
Escherichia-coli-O157-H7-EDL933  5528445  0.5038297   10
Escherichia-coli-ATCC-8739       4746218  0.5086652   23
Slide18
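To put the occurrence counts in the Pasteurellaceae and E. coli tables in perspective, one can compare them with what a random sequence of the same length and GC fraction would contain. The sketch below is an illustration under stated assumptions, not the method used in the presentation: it assumes independent bases with equal G/C and equal A/T frequencies, and it assumes the tabulated counts cover both strands (pass `both_strands=False` otherwise). The function name is hypothetical.

```python
def expected_count(motif, genome_length, gc_fraction, both_strands=True):
    """Expected motif occurrences in a random sequence with independent bases.

    Doubling for both strands is only appropriate for a non-palindromic motif,
    whose reverse complement is a different word with the same probability.
    """
    p_gc = gc_fraction / 2        # probability of G, and of C
    p_at = (1 - gc_fraction) / 2  # probability of A, and of T
    p_motif = 1.0
    for base in motif.upper():
        p_motif *= p_gc if base in "GC" else p_at
    windows = genome_length - len(motif) + 1
    strands = 2 if both_strands else 1
    return p_motif * windows * strands

USS = "AAGTGCGGT"  # uptake sequence shown earlier in this presentation
rows = [
    # (organism, length, GC fraction, observed occurrences), copied from the tables
    ("Haemophilus-influenzae-86-028NP", 1913428, 0.3815231, 1516),
    ("Escherichia-coli-K12", 4639221, 0.50788873, 23),
]
for name, length, gc, observed in rows:
    expected = expected_count(USS, length, gc)
    per_kb = 1000 * observed / length
    print(f"{name}: {per_kb:.3f} per kb, observed/expected = {observed / expected:.1f}")
```

Under this simple model the H. influenzae count is far above the random expectation, while the E. coli count is of the same order as chance.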
Reference
Frequency and Distribution of DNA Uptake Signal Sequences in the Haemophilus influenzae Rd Genome.
Genomic Sequence of an Otitis Media Isolate of Nontypeable Haemophilus influenzae: Comparative Study with H. influenzae Serotype d, Strain KW20.
Xu Z, Yue M, Zhou R, Jin Q, Fan Y, et al. (2011) Genomic Characterization of Haemophilus parasuis SH0165, a Highly Virulent Strain of Serovar 5 Prevalent in China. PLoS ONE 6(5): e19631. doi:10.1371/journal.pone.0019631.
DNA uptake signal sequences in naturally transformable bacteria.
Slide19
The Evolutionary Change of DNA Uptake Sequences in Neisseria meningitidis.
Slide20
What are DNA Uptake Sequences (DUS)?
Neisseria sp.
Constitute ~1% of genome.
Homology
5’-GCCGTCTGAA-3’
Kingdom: Bacteria
Phylum: Proteobacteria
Class: Betaproteobacteria
Order: Neisseriales
Family: Neisseriaceae
Genus: Neisseria
Species: Neisseria meningitidis
Slide21
DNA Uptake Sequence
AT-DUS
AG-DUS
Slide22
A Closer Look
Strain                 # of DUS  DUS Sequence        G+C (%)  Length of Genome (bp)
N. meningitidis MC58   1477      5’-ATGCCGTCTGAA-3’  51.5     2272351
N. meningitidis Z2491  1449      5’-AGGCCGTCTGAA-3’  51.8     2184406
N. gonorrhoeae FA1090  1522      5’-ATGCCGTCTGAA-3’  52.7     2153922
Slide23
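As a rough sanity check on the "~1% of genome" figure, the share of each genome covered by DUS can be estimated from the strain table (number of DUS and genome length). This is a back-of-the-envelope sketch that assumes non-overlapping occurrences and tries both the 10-nt core and the 12-nt extended form; the function name is hypothetical.

```python
# (strain, number of DUS, genome length in bp), as listed in the strain table
strains = [
    ("N. meningitidis MC58", 1477, 2272351),
    ("N. meningitidis Z2491", 1449, 2184406),
    ("N. gonorrhoeae FA1090", 1522, 2153922),
]

def dus_percent(count: int, genome_length: int, dus_length: int) -> float:
    """Percent of the genome covered by DUS, assuming no overlaps."""
    return 100.0 * count * dus_length / genome_length

for name, count, length in strains:
    core = dus_percent(count, length, 10)      # GCCGTCTGAA
    extended = dus_percent(count, length, 12)  # AT-DUS / AG-DUS
    print(f"{name}: {core:.2f}% (10 nt), {extended:.2f}% (12 nt)")
```

Both estimates come out a little under 1%, which is in line with the approximate figure quoted for Neisseria.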
DUS Inversion
Slide24
Phylogeny
16S rRNA
DUS
Slide25
References
http://phil.cdc.gov/phil/details.asp?pid=2678
Frye SA, Nilsen M, Tønjum T, Ambur OH. Dialects of the DNA uptake sequence in Neisseriaceae. PLoS Genet. 2013
Slide26
Six nucleotide palindromic sequences in Mycobacteriophage genomes
What about them? That’s a great question. I’m glad you asked.
Slide27
How did we get here from Very Short Dispersed Repeats?
A very short story.
Point A: “Singular over-representation of an octameric palindrome, HIP1, in DNA from many cyanobacteria.”
From there: what about palindromes in mycobacteriophages?
Slide28
Avoidance of 6 nt palindromes in Mycobacteriophages
Mycobacterial genomes generally do not avoid 6-nt palindromic sequences, so the phages that infect them would not be expected to avoid them either. Yet when two mycobacteriophage genomes were examined, both were found to avoid palindromes of size 6.
“The sole exception is provided by the two M. tuberculosis phages D29 and L5, which strongly avoid palindromes of size 6.” (Rocha et al., 2001)
L5 and D29 – Mycobacteriophage cluster A2
Slide29
Generated 186 random sequences with the same length as the average Mycobacteriophage genome (70627 nucleotides) and the same GC content (64%), then counted the occurrences of all 6-nucleotide palindromes in these randomly generated sequences.
Slide30
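The random-sequence baseline described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the BioBIKE code used for the presentation: bases are drawn independently, G and C are equally likely, A and T are equally likely, overlapping occurrences are counted, and all function names are hypothetical.

```python
import itertools
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def all_palindromes(n: int = 6) -> set[str]:
    """All reverse-complement palindromes of even length n (4 ** (n // 2) words)."""
    result = set()
    for left in itertools.product("ACGT", repeat=n // 2):
        left = "".join(left)
        # The right half is the reverse complement of the left half.
        result.add(left + left.translate(COMPLEMENT)[::-1])
    return result

def random_genome(length: int, gc: float, rng: random.Random) -> str:
    """Random sequence with independent bases and the requested GC fraction."""
    weights = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]  # A, C, G, T
    return "".join(rng.choices("ACGT", weights=weights, k=length))

def count_palindromes(seq: str, palindromes: set[str], n: int = 6) -> int:
    """Count windows of length n (overlaps included) that are palindromes."""
    return sum(seq[i:i + n] in palindromes for i in range(len(seq) - n + 1))

def expected_palindromes(length: int, gc: float, n: int = 6) -> float:
    """Analytic expectation of the same count under the independent-base model."""
    pair = 2 * (gc / 2) ** 2 + 2 * ((1 - gc) / 2) ** 2  # P(base i pairs with its mirror)
    return (length - n + 1) * pair ** (n // 2)

rng = random.Random(0)  # fixed seed so the run is repeatable
palindromes = all_palindromes(6)
one_count = count_palindromes(random_genome(70627, 0.64, rng), palindromes)
print(len(palindromes), one_count, round(expected_palindromes(70627, 0.64)))
```

To mirror the slide, repeat the `random_genome` and `count_palindromes` steps 186 times and plot the counts. Under this model the expected count per sequence is about 1,384, which gives a reference point for judging real genomes.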
Occurrences of all 6-nucleotide palindromes over the actual genomes of Mycobacteriophages (or at least the 186 that BioBIKE knows)
Slide31
Which phages are outliers to the right? (>2000 occurrences) - these 15:
(Mycobacterium-phage-Cali 2485) – C, C1
(Mycobacterium-phage-Catera 2466) – C, C1
(Mycobacterium-phage-Alice 2437) – C, C1
(Mycobacterium-phage-LRRHood 2445) – C, C1
(Mycobacterium-phage-Rizal 2465) – C, C1
(Mycobacterium-phage-Nappy 2528) – C, C1
(Mycobacterium-phage-Ghost 2524) – C, C1
(Mycobacterium-phage-Drazdys 2506) – C, C1
(Mycobacterium-phage-ScottMcG 2480) – C, C1
(Mycobacterium-phage-Spud 2485) – C, C1
(Mycobacterium-phage-Sebata 2519) – C, C1
(Mycobacterium-phage-Pio 2505) – C, C1
(Mycobacterium-phage-Bxz1 2501) – C, C1
(Mycobacterium-phage-LinStu 2478) – C, C1
(Mycobacterium-phage-ET08 2466) – C, C1
All of the cluster C1 phages in BioBIKE!
Slide32
What’re C1 cluster phages?
“Only two of these (Subclusters C1 and C2) correspond to phages with myoviral morphologies (with contractile tails)”
Okay, so they’re of the family Myoviridae. This means they are generally lytic and lack the genes needed to become lysogenic. They have a contractile tail, and contracting the tail requires ATP.
C cluster phage isolated by Michael Kiflezghi!
Slide33
Which sequences are occurring so frequently?
GGCGCC GACGTC CGCGCG ACCGGT
GTCGAC GCGCGC CCCGGG
TGGCCA CAGCTG AGGCCT
CGATCG CTGCAG TGCGCA
Many of these are recognition sites for restriction enzymes. Significant? There’s a chance.
Warrants more investigation? It seems likely.
Slide34
References and credit for pictures
phagesdb.org
Rocha, E., Danchin, A., & Viari, A. (2001). Evolutionary role of Restriction/Modification systems as revealed by comparative genome analysis. Genome Research, 11, 946-958. doi:10.1101/gr.153101
Discussion of avoidance of palindromic sequences of length 4 and 6 and possible reasons for this avoidance in bacteria and bacteriophages. Mentions two mycobacteriophages that exhibit an avoidance of 6-nt palindromes, L5 and D29.
Pope WH, Jacobs-Sera D, Russell DA, Peebles CL, Al-Atrache Z, et al. (2011) Expanding the Diversity of Mycobacteriophages: Insights into Genome Architecture and Evolution. PLoS ONE 6(1): e16329. doi: 10.1371/journal.pone.0016329
cyanobacteria picture: http://www.um.edu.mt/__data/assets/image/0005/166604/oculatella2.jpg
mycobacteriophage picture: http://openi.nlm.nih.gov/imgs/512/165/2884959/2884959_2711fig1.png
coral snake: http://upload.wikimedia.org/wikipedia/commons/8/8a/Coral_snake.jpg