Slide1
The Cobweb of life revealed by Genome-Scale estimates of Horizontal Gene Transfer
Fan Ge, Li-San Wang, Junhyong Kim
Mourya Vardhan
Slide2
Outline
Controversy: the extent to which HGT affects the core genealogical history
Examination of this controversy by assessing the extent of HGT among core orthologous genes
A novel statistical method: to assess the extent of HGT based on comparisons of tree topology
Slide3
Introduction
Horizontal gene transfer (HGT) refers to the transfer of genes between organisms in a manner other than traditional reproduction.
Whole-genome analyses of different prokaryotes have been thought to indicate rampant HGT.
There is an ongoing debate over the estimation of HGT frequency and its impact on phylogeny.
Inference of HGT from tree comparisons should be done under a proper statistical framework.
Slide4
Methodology to assess the extent
New method to explicitly test for phylogenetic incongruence due to horizontal transfer versus statistical tree errors
Used Clusters of Orthologous Groups (COG) from NCBI databases
Extracted the most reliable COGs
Built a gene tree for every COG and integrated the gene trees to construct the whole-genome (W-G) tree
Compared each gene tree with the W-G tree to infer significant HGT
Augmented this method with pairwise comparisons of gene trees to detect conflicts
Slide5
Slide6
High-Quality Gene Groups and the W-G Tree
The COG database was built by redoing sequence comparisons over 43 genomes
This resulted in the retention of 297 high-quality COG entries out of 3852
To approximate the W-G tree, they used a median tree estimator
The estimate used bootstrap values from bootstrap sampling
Slide7
Slide8
Detection of HGT events
HGTs are inferred by comparing estimated trees against other gene trees or against trees that represent the history of genomes
Discrepancies between the trees may be caused by HGT or by other errors
Distance metrics are used to test the discrepancies
As an additional precaution, the paper explicitly asks whether the discrepancies are caused by HGT events
Slide9
Comparison Metrics
Maximum agreement subtree (MAST): even if two trees differ in some branches, they still share a common subtree; the MAST metric bounds the size of that shared subtree
Symmetric difference (SD): counts the branch splits that appear in one tree but not in the other
Slide10
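The symmetric difference (SD) metric from the Comparison Metrics slide can be illustrated with a minimal Python sketch. It is not the PAUP computation used in the paper: the nested-tuple tree encoding and the function names `leaves`, `splits`, and `symmetric_difference` are assumptions made for illustration, and both trees are assumed to contain the same taxa.

```python
def leaves(tree):
    """Return the set of taxon labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Collect the non-trivial splits (bipartitions) of the tree, read as unrooted."""
    all_taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = all_taxa - side
        # A split is informative only if both sides hold at least two taxa.
        if len(side) >= 2 and len(other) >= 2:
            # Storing the unordered pair makes the split independent of rooting.
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


def symmetric_difference(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-taxon trees that disagree on the placement of B and C.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
print(symmetric_difference(tree_a, tree_b))  # 2
```

The two example trees share the split separating D and E from the rest, and each has one split the other lacks, so the distance is 2.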
Interpretation of HGT events…
Case 1: If both MAST and SD are low, the trees are most likely not different
Case 2: If both metrics are large, the cause can be either HGT events or errors
Case 3: If SD is large and MAST is low, it is most likely an HGT event
Case 4: Large MAST and low SD cannot occur, for algorithmic reasons
Slide11
SD and MAST scores for Gene Tree 1 and the W-G tree are 2 and 2, while the scores for Gene Tree 2 and the W-G tree are 8 and 2
Slide12
The Hypothesis Test
Hypothesis test Ɣ: the difference of the two metrics
Computed by generating a null distribution by bootstrapping gene trees
HGT was inferred when the observed Ɣ was significant, with the p-value below the 5% level
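As a rough sketch of how an observed Ɣ could be compared against a bootstrap null distribution, the Python below computes an empirical p-value. The function names, the one-sided direction (larger Ɣ treated as more extreme), and the add-one correction are all assumptions made for illustration; the slides do not specify these details, and the null values here are placeholder numbers rather than real bootstrap output.

```python
def empirical_p_value(observed, null_values):
    """One-sided p-value: share of null values at least as large as `observed`."""
    exceed = sum(1 for value in null_values if value >= observed)
    # The add-one correction (an assumption) keeps the estimate above zero.
    return (exceed + 1) / (len(null_values) + 1)


def infer_hgt(observed, null_values, alpha=0.05):
    """Flag putative HGT when the empirical p-value falls below `alpha`."""
    return empirical_p_value(observed, null_values) < alpha


# Placeholder null distribution standing in for bootstrap replicates.
null_values = list(range(100))
print(infer_hgt(150, null_values))  # True: no null value reaches 150
print(infer_hgt(50, null_values))   # False: half of the null values are >= 50
```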
Simulation studies applied to each COG showed the test detecting HGT events in a COG tree at the following rates, using the 5% significance level:
HGT events    Detection rate (%)
1             53.8
2             70
3             77.3
Slide13
ds is the SD metric
dm is the MAST metric
m, n are the numbers of branch splits
X is the number of taxa
Used PAUP software to calculate
Slide14
HGT Estimation via Comparisons between Each Gene Tree and the W-G Tree
The hypothesis test was applied to each COG
Observations showed that the test results do not vary significantly with the p-value threshold
At the 5% level, 33/297 (11.1%) COGs showed putative HGTs
These COGs are termed hCOGs
Slide15
The Relationship between Detecting COG entries with HGT and the p-Values
Slide16
HGT Estimation via Comparisons among Gene Trees
The problem with comparing each gene tree to the W-G tree is that the results are sensitive to the W-G tree
COG entries do not all share the same taxa
If a COG is an hCOG, its gene tree should test as different in all the comparisons
14,004 pairs of gene trees that contained six or more shared taxa were compared
At the 5% level, 1,764/14,004 (12.6%) pairs were significant
Slide17
Slide18
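The selection of gene-tree pairs with at least six shared taxa, described under HGT Estimation via Comparisons among Gene Trees, can be sketched in Python as follows. The function name `comparable_pairs`, the dictionary layout, and the COG and taxon labels are hypothetical; only the six-taxon threshold comes from the slides.

```python
from itertools import combinations


def comparable_pairs(taxa_by_cog, min_shared=6):
    """List the pairs of gene trees that share at least `min_shared` taxa."""
    pairs = []
    for cog_a, cog_b in combinations(sorted(taxa_by_cog), 2):
        shared = taxa_by_cog[cog_a] & taxa_by_cog[cog_b]
        if len(shared) >= min_shared:
            pairs.append((cog_a, cog_b, shared))
    return pairs


# Hypothetical taxon sets for three gene trees.
taxa_by_cog = {
    "cogA": {"t1", "t2", "t3", "t4", "t5", "t6", "t7"},
    "cogB": {"t2", "t3", "t4", "t5", "t6", "t7", "t8"},
    "cogC": {"t1", "t2", "t8"},
}
for cog_a, cog_b, shared in comparable_pairs(taxa_by_cog):
    print(cog_a, cog_b, len(shared))  # cogA cogB 6
```

Each retained pair would then be restricted to its shared taxa before the hypothesis test is applied. If the pairs are drawn from the 297 retained COGs, there are at most 297 × 296 / 2 = 43,956 candidate pairs, of which the slides report 14,004 meeting the threshold.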
Identification of transferred branches in gene trees
For each COG that tested positive for HGT events, transferred branches were found by exhaustive enumeration of possible subtree matches
Searched for all combinations of branch prunings to find the "troublesome" branches
If there is only one way to prune to make the trees congruent, it is an HGT event
Slide19
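The exhaustive pruning search from the slide on identifying transferred branches can be approximated with the Python sketch below. It is a simplification under stated assumptions, not the paper's procedure: it prunes sets of taxa rather than whole branches, it assumes both trees carry the same taxa, and the nested-tuple encoding and the names `prune`, `splits`, and `minimal_prunings` are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of taxon labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def prune(tree, removed):
    """Drop the taxa in `removed` and collapse nodes left with one child."""
    if isinstance(tree, str):
        return None if tree in removed else tree
    kept = [sub for sub in (prune(child, removed) for child in tree)
            if sub is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)


def splits(tree):
    """Collect the non-trivial splits of the tree, read as unrooted."""
    all_taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = all_taxa - side
        if len(side) >= 2 and len(other) >= 2:
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


def minimal_prunings(gene_tree, reference_tree):
    """Return every smallest taxon set whose removal makes the trees congruent."""
    taxa = sorted(leaves(gene_tree))
    # Trees with three or fewer taxa have no informative splits, so stop there.
    for size in range(max(len(taxa) - 2, 1)):
        hits = [set(removed) for removed in combinations(taxa, size)
                if splits(prune(gene_tree, set(removed)))
                == splits(prune(reference_tree, set(removed)))]
        if hits:
            return hits
    return []


# Hypothetical example: taxon E sits next to A in the gene tree only.
reference_tree = ((("A", "B"), "C"), (("D", "E"), "F"))
gene_tree = (((("A", "E"), "B"), "C"), ("D", "F"))
print(minimal_prunings(gene_tree, reference_tree))  # [{'E'}]
```

In this example only the removal of E reconciles the two trees, which matches the slide's criterion that a unique pruning points to an HGT event. Exhaustive enumeration grows combinatorially with the number of taxa, so the sketch is practical only for small trees.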
Color     HGT rates
Red       >4%
Yellow    3%–4%
Pink      2%–3%
Blue      1%–2%
Green     <1%
Slide20
References
Goddard W, Kubicka E, Kubicki G, McMorris FR (1994) The agreement metric for labeled binary trees. Math Biosci 123: 215–226.
Robinson DF, Foulds LR (1981) Comparison of phylogenetic trees. Math Biosci 53: 131–147.
Conover WJ (1999) Practical nonparametric statistics, 3rd ed. New York: Wiley. 584 p.
Eisen JA (2000) Horizontal gene transfer among microbial genomes: New insights from complete genome analysis. Curr Opin Genet Dev 10: 606–611.
Slide21
Thank You!