
Documentation of DoBo

DoBo is a sequence-based protein domain boundary predictor. It leverages evolutionary information contained in multiple sequence alignments to identify potential domain boundary sites. These candidate sites are then classified using a support vector machine. Predicted domain boundary sites are finally scored and a confidence value is provided.

-----------------------------------------------------------------------------------------

What is DoBo?

DoBo (Domain Boundary) is a tool to identify domain boundaries from sequence. It works by combining the classification power of machine learning with domain boundary signals embedded in multiple sequence alignments. More specifically, a multiple sequence alignment (MSA) is generated for a query sequence, and domain boundary signals are extracted from it. A domain boundary signal is defined as a gap which starts at either end of a protein sequence from the MSA and continues for at least 45 residues. Once signals are identified, they can be classified using a Support Vector Machine. Fig. 1 depicts signal detection.

[Fig. 1. Signal detection]
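The signal definition above (a gap that starts at either end of an aligned sequence and runs for at least 45 residues) can be illustrated with a minimal Python sketch. This is not DoBo's actual code. The function names, the gap characters, and the choice to report the alignment column where the terminal gap ends are all assumptions made for illustration; the sketch also assumes that alignment columns correspond directly to query residue positions.

```python
from typing import List, Tuple

MIN_GAP_LENGTH = 45  # from the text: a signal gap continues for at least 45 residues


def terminal_gap_lengths(aligned_row: str, gap_chars: str = "-.") -> Tuple[int, int]:
    """Return the lengths of the leading and trailing gap runs of one aligned row."""
    without_leading = aligned_row.lstrip(gap_chars)
    leading = len(aligned_row) - len(without_leading)
    if not without_leading:
        # The row consists only of gaps; treat it as one leading run.
        return leading, 0
    trailing = len(without_leading) - len(without_leading.rstrip(gap_chars))
    return leading, trailing


def find_signals(msa_rows: List[str], min_gap: int = MIN_GAP_LENGTH) -> List[int]:
    """Collect 0-based alignment columns where a qualifying terminal gap ends.

    Assumption (hypothetical): the signal is placed at the first aligned
    residue after an N-terminal gap, or the last aligned residue before a
    C-terminal gap. The source text does not state where DoBo places it.
    """
    signals = set()
    for row in msa_rows:
        leading, trailing = terminal_gap_lengths(row)
        if min_gap <= leading < len(row):
            signals.add(leading)
        if trailing >= min_gap:
            signals.add(len(row) - trailing - 1)
    return sorted(signals)


# Toy alignment of width 100 (made-up rows, not real data).
toy_msa = [
    "-" * 50 + "A" * 50,  # leading gap of 50: qualifies
    "A" * 60 + "-" * 40,  # trailing gap of 40: too short
    "A" * 30 + "-" * 70,  # trailing gap of 70: qualifies
    "A" * 100,            # no terminal gap
]
signal_columns = find_signals(toy_msa)
print(signal_columns)
```

In this toy alignment only the first and third rows contribute signals, because the 40-residue trailing gap in the second row falls short of the 45-residue minimum.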

-----------------------------------------------------------------------------------------

Why should I use DoBo?

DoBo is an ab initio protein domain boundary prediction package. As it does not need template or domain information from homologous proteins, it lends itself well to predicting domain boundaries in novel sequences. It also allows boundaries to be predicted at varying confidence values. Setting the confidence level to 80% means that, on average, 80 percent of the domain boundary predictions will be near (+/- 20 residues) a true domain boundary.

-----------------------------------------------------------------------------------------

Why was I told to lower the decision threshold? Why weren't boundaries predicted?

DoBo allows a user to set a confidence threshold on predicted domain boundaries. If a signal is generated at a residue location but is not scored above the confidence threshold, a domain boundary will not be predicted at that site. To view all signals generated, you can view the "signals.lst" file. A sample is shown below. It contains three columns corresponding to the residue location, signal score and confidence value of the site.

712   -0.613048  0.59
705   -0.616138  0.59
459   -0.627498  0.59
493   -0.651736  0.58
1207  -0.677879  0.57
1298  -0.711237  0.56
1059  -0.714784  0.56
1032  -0.710711  0.56

Other reasons why predictions might not be made are sequence length and lack of similar sequences. Due to the implementation, the minimum sequence length is 90 residues. Any sequence shorter than that will not generate signals. In very rare cases, it may be difficult to generate a multiple sequence alignment for a query protein. To see if this is the case, you may inspect the MSA generated for your query by following the link at the bottom of the results page.

-----------------------------------------------------------------------------------------

What should I cite?

Please cite:

J. Eickholt, X. Deng, and J. Cheng. DoBo: Protein Domain Boundary Prediction by Integrating Evolutionary Signals and Machine Learning. BMC Bioinformatics. 12:43, 2011.

For any other questions or concerns, contact Jesse Eickholt at jlec95@mail.mizzou.edu.
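
As a rough illustration of how the confidence threshold decides which entries of "signals.lst" become predictions, the following Python sketch parses the three-column sample and filters it. This is not part of DoBo. The names `Signal`, `parse_signals` and `predicted_boundaries` are hypothetical, the file is assumed to be whitespace-separated with negative signal scores, and treating a confidence equal to the threshold as passing is also an assumption.

```python
from typing import List, NamedTuple


class Signal(NamedTuple):
    position: int      # residue location
    score: float       # signal score
    confidence: float  # confidence value of the site


def parse_signals(text: str) -> List[Signal]:
    """Parse whitespace-separated rows of (location, score, confidence)."""
    signals = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        signals.append(Signal(int(fields[0]), float(fields[1]), float(fields[2])))
    return signals


def predicted_boundaries(signals: List[Signal], threshold: float) -> List[int]:
    """Return the locations whose confidence reaches the threshold.

    Assumption: a confidence equal to the threshold counts as passing.
    """
    return sorted(s.position for s in signals if s.confidence >= threshold)


# Sample rows taken from the signals.lst excerpt in this document.
SAMPLE = """\
712   -0.613048  0.59
705   -0.616138  0.59
459   -0.627498  0.59
493   -0.651736  0.58
1207  -0.677879  0.57
1298  -0.711237  0.56
1059  -0.714784  0.56
1032  -0.710711  0.56
"""

signals = parse_signals(SAMPLE)
print(predicted_boundaries(signals, 0.58))  # four sites pass
print(predicted_boundaries(signals, 0.80))  # nothing passes at 80%
```

With this sample, no site reaches a confidence of 0.80, which is the situation in which lowering the threshold would be suggested.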