
Computational biology Outline - PowerPoint Presentation

garcia
Uploaded On 2023-06-21


Presentation Transcript

1. Computational biology

2. Outline
   - Proteins
   - DNA
   - RNA
   - Genetics and evolution
   - The Sequence Matching Problem
   - RNA Sequence Matching
   - Complexity of the Algorithms

3. Definition
   Computational biology encompasses all computational methods and theories applicable to molecular biology, together with computer-based techniques for solving biological problems.

4. Proteins
   - Building blocks of living organisms
   - Large molecules composed of sequences of amino acids
   - There are 20 amino acids, which are divided into classes:
     - hydrophobic (h-phob)
     - hydrophilic (h-phil)
     - charged polar: positive (pos) or negative (neg)

5. Amino acids and their classes

   Amino acid      Sym   Class     Amino acid      Sym   Class
   Alanine         A     h-phob    Leucine         L     h-phob
   Arginine        R     pos       Lysine          K     pos
   Asparagine      N     h-phil    Methionine      M     h-phob
   Aspartic acid   D     neg       Phenylalanine   F     h-phob
   Cysteine        C     h-phil    Proline         P     h-phob
   Glutamine       Q     h-phil    Serine          S     h-phil
   Glutamic acid   E     neg       Threonine       T     h-phil
   Glycine         G     h-phob    Tryptophan      W     h-phob
   Histidine       H     pos       Tyrosine        Y     h-phil
   Isoleucine      I     h-phob    Valine          V     h-phob
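
The table above maps directly onto a lookup structure. The following is a minimal sketch, assuming the class labels of the table (with spellings normalized to h-phob and h-phil); the names `AMINO_ACID_CLASS` and `classify` are illustrative and do not come from the slides.

```python
# Class of each amino acid symbol, transcribed from the table on the slide.
AMINO_ACID_CLASS = {
    "A": "h-phob", "R": "pos",    "N": "h-phil", "D": "neg",    "C": "h-phil",
    "Q": "h-phil", "E": "neg",    "G": "h-phob", "H": "pos",    "I": "h-phob",
    "L": "h-phob", "K": "pos",    "M": "h-phob", "F": "h-phob", "P": "h-phob",
    "S": "h-phil", "T": "h-phil", "W": "h-phob", "Y": "h-phil", "V": "h-phob",
}


def classify(protein):
    """Return the class label of every residue in a protein sequence."""
    return [AMINO_ACID_CLASS[symbol] for symbol in protein.upper()]


if __name__ == "__main__":
    # A short made-up peptide, used only to exercise the lookup.
    print(classify("ARND"))
```

A plain dictionary is enough here because the alphabet is fixed at 20 symbols; an unknown symbol raises `KeyError`, which is usually what you want when validating input.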

6. DNA
   - Blueprint of living organisms
   - DNA is composed of two strands held together by weak hydrogen bonds
   - Each strand is a sequence of nucleotides
   - DNA has four bases, which are classified into two chemical types:

   Base       Symbol   Type
   Adenine    A        Purine
   Thymine    T        Pyrimidine
   Cytosine   C        Pyrimidine
   Guanine    G        Purine

7. DNA double helix

8. RNA
   - RNA is chemically very similar to DNA
   - There are two important differences:
     - The four bases present in RNA are adenine (A), guanine (G), cytosine (C) and uracil (U); uracil takes the place of thymine
     - RNA nucleotides contain a different sugar molecule (ribose)
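
At the level of sequence symbols, the base difference described above amounts to swapping T for U. The sketch below assumes the input is a DNA sequence written over the alphabet A, C, G, T and ignores strand direction and all chemistry; `dna_to_rna` is a hypothetical helper name.

```python
def dna_to_rna(dna):
    """Rewrite a DNA sequence in the RNA alphabet by replacing T with U."""
    dna = dna.upper()
    unknown = set(dna) - set("ACGT")
    if unknown:
        # Reject anything outside the four DNA bases.
        raise ValueError(f"not a DNA sequence, unexpected symbols: {sorted(unknown)}")
    return dna.replace("T", "U")


if __name__ == "__main__":
    print(dna_to_rna("ACCTGAGAG"))
```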

9. Genetics and evolution
   - Mutation
   - Natural selection
   - Genetic drift

10. Sequence matching problem
   - Matching DNA, RNA, or protein sequences between a diseased organism and a healthy organism
   - Proteins are long, and DNA strands are even longer
   - We match them by breaking them into shorter subsequences
   - Breaking and matching are done using the notion of alignment

11. Sequence matching example
   Consider two amino acid sequences:

       ACCTGAGAG
       ACGTGGCAG

   Sequence alignment:

       A C C T G A G - A G
       A C G T G - G C A G
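
One standard way to compute an alignment like the one above is Needleman-Wunsch global alignment by dynamic programming. The slides do not name an algorithm or a scoring scheme, so this is a sketch under assumptions: match +1, mismatch -1, gap -1, and the function name `global_align` is illustrative.

```python
def global_align(x, y, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns (aligned_x, aligned_y, score)."""
    n, m = len(x), len(y)

    def sub(i, j):
        # Score for aligning x[i-1] against y[j-1].
        return match if x[i - 1] == y[j - 1] else mismatch

    # score[i][j] = best score for aligning x[:i] with y[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two symbols
                score[i - 1][j] + gap,            # gap in y
                score[i][j - 1] + gap,            # gap in x
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(x[i - 1]); bottom.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(x[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(y[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


if __name__ == "__main__":
    a, b, s = global_align("ACCTGAGAG", "ACGTGGCAG")
    print(a)
    print(b)
    print("score:", s)
```

Several alignments can share the optimal score, so the rows printed here may place gaps differently from the slide. The table costs time and space proportional to the product of the two lengths, which is why long sequences are usually broken into shorter pieces first.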

12. Finite state machines in BLAST
   - BLAST is used to find out which of the sequences in a database are related to a newly given sequence
   - The BLAST system is a three-step process:
     1. Examine the query string and select a set of substrings of length w (between 4 and 20) that are good candidates for producing matches
     2. Build a DFSM (deterministic finite state machine) from that set of substrings and use it to find the sequences in the database with the highest local matches
     3. Examine the matches found in step 2 and try to build longer matching sequences from them
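
The three steps above can be illustrated with a heavily simplified seed-and-extend sketch. This is not BLAST's actual implementation: it uses a dictionary lookup where the slide describes a DFSM, keeps every length-w substring instead of selecting good ones, and extends only while characters match exactly. All function names are hypothetical.

```python
def seeds(query, w):
    """Step 1 (simplified): index every length-w substring of the query by offset."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    return index


def find_hits(query, subject, w):
    """Step 2 (simplified): exact seed matches as (query_pos, subject_pos) pairs."""
    index = seeds(query, w)
    hits = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            hits.append((i, j))
    return hits


def extend(query, subject, i, j, w):
    """Step 3 (simplified): grow a seed in both directions while symbols agree."""
    start_q, start_s = i, j
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1
    end_q, end_s = i + w, j + w
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1
    return query[start_q:end_q]


def best_match(query, subject, w=4):
    """Longest exact local match reachable from any seed, or '' if there is no seed hit."""
    hits = find_hits(query, subject, w)
    return max((extend(query, subject, i, j, w) for i, j in hits), key=len, default="")


if __name__ == "__main__":
    # Made-up database sequence, used only for illustration.
    print(best_match("ACCTGAGAG", "TTACCTGATT"))
```

The dictionary plays the role the DFSM plays in the slide: both let the database be scanned once while testing every position against the whole seed set.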

13. Regular expressions specify protein motifs
   - By aligning a collection of related proteins we can define a motif
   - Example:
       E S G H D T Y Y N K N R M D T T T T T S W Q S R G S D T T T P D M T A G P T T W R N T
   - Once a motif is defined, we can search for occurrences of it in other protein sequences by using regular expressions
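
Searching for a motif with a regular expression might look like the sketch below, which uses Python's standard `re` module. The motif pattern and the test sequence are hypothetical, chosen only to show the mechanics; they are not the motif from the slide.

```python
import re

# Hypothetical motif: a D, then two to four T's, then any one residue,
# then either Y or W.
MOTIF = re.compile(r"DT{2,4}.[YW]")


def find_motif(protein):
    """Return (start_offset, matched_text) for each non-overlapping motif occurrence."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(protein)]


if __name__ == "__main__":
    # Made-up protein fragment, used only for illustration.
    print(find_motif("MKDTTAWQ"))
```

Character classes such as `[YW]` express positions where the aligned family members disagree, and bounded repetition such as `{2,4}` expresses variable-length runs.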

14. HMMs for sequence matching
   - HMMs (hidden Markov models) are used when the sequences in a family become fairly diverse
   - An HMM can capture the variations among the members of the family and the probabilities associated with them
   - Using HMMs we can find the best alignment between two sequences and determine which family a given new sequence belongs to

15. An HMM profile is given by M = (K, O, π, A, B)
   - K is a set of n states, one for each position in the sequence
   - O is the output alphabet
   - π contains the initial state probabilities
   - A contains the transition probabilities
   - B contains the output probabilities
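
Given a model M = (K, O, π, A, B), the probability that the model emits a particular sequence can be computed with the forward algorithm; comparing that probability across family models is one way to decide which family a new sequence fits best. The sketch below uses a tiny two-state model whose states, alphabet, and probabilities are all made up for illustration.

```python
def forward(model, sequence):
    """Probability that the HMM emits `sequence`, summed over all state paths."""
    states, alphabet, pi, A, B = model
    if not sequence:
        return 1.0
    # alpha[s] = probability of emitting the prefix so far and ending in state s
    alpha = {s: pi[s] * B[s][sequence[0]] for s in states}
    for symbol in sequence[1:]:
        alpha = {
            t: sum(alpha[s] * A[s][t] for s in states) * B[t][symbol]
            for t in states
        }
    return sum(alpha.values())


# Hypothetical toy model (K, O, pi, A, B); the numbers carry no biological meaning.
TOY_MODEL = (
    ["s1", "s2"],                                  # K: states
    ["A", "G"],                                    # O: output alphabet
    {"s1": 0.5, "s2": 0.5},                        # pi: initial state probabilities
    {"s1": {"s1": 0.5, "s2": 0.5},                 # A: transition probabilities
     "s2": {"s1": 0.5, "s2": 0.5}},
    {"s1": {"A": 0.9, "G": 0.1},                   # B: output probabilities
     "s2": {"A": 0.2, "G": 0.8}},
)

if __name__ == "__main__":
    print(forward(TOY_MODEL, "AG"))
```

Finding the single best state path, which corresponds to the best alignment, would use the Viterbi algorithm instead; it has the same structure with `max` in place of `sum`.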

16. Example of an HMM describing a protein sequence family

17. RNA sequence matching and secondary structure prediction using the tools of context-free languages
   - In RNA, a change to a single nucleotide in a stem region could completely alter the molecule's shape and its function
   - So a change in the stem must be matched by a corresponding change in the paired nucleotide
   - Context-free languages are used to describe these nested dependencies and the secondary structure
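
The nested dependency described above can be written as a small context-free grammar, for example S → A S U | U S A | G S C | C S G | loop, where each rule wraps a complementary pair around a smaller structure. The sketch below follows that grammar from the outside in for a single stem-loop. It is an assumption-laden illustration: it handles only one hairpin, ignores G-U wobble pairs, and requires at least `min_loop` unpaired nucleotides in the loop; the example sequences are made up.

```python
# Watson-Crick pairs allowed by the grammar rules.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def stem_length(rna, min_loop=3):
    """Count nested complementary pairs, working inward from both ends."""
    i, j = 0, len(rna) - 1
    pairs = 0
    # Apply a pairing rule while the ends are complementary and the
    # remaining inner region can still hold a loop of min_loop nucleotides.
    while j - i - 1 >= min_loop and (rna[i], rna[j]) in PAIRS:
        pairs += 1
        i += 1
        j -= 1
    return pairs


if __name__ == "__main__":
    print(stem_length("GCAGAAAUGC"))  # original hairpin
    print(stem_length("GAAGAAAUGC"))  # one stem nucleotide mutated
    print(stem_length("GAAGAAAUUC"))  # partner mutated to compensate
```

The matching of the first symbol with the last, the second with the second to last, and so on is exactly the kind of dependency that regular expressions cannot express and context-free grammars can.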

18. Example

19. Complexity of algorithms used in computational biology
   - Many of the approaches described here are computationally demanding, for example breaking up large protein and DNA molecules into substrings and reassembling them
   - Finding a shortest superstring of a set of substrings is NP-hard
   - Conversion to a decision problem:
       SHORTEST-SUPERSTRING = {<S, K> : S is a set of strings and there exists some superstring T such that every element of S is a substring of T and T has length less than or equal to K}
   - SHORTEST-SUPERSTRING is NP-complete
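
Membership in NP means that a proposed superstring T can be checked quickly even though finding one may be hard. The sketch below shows such a certificate check for an instance <S, K>, together with a common greedy merging heuristic. The heuristic is not from the slides and is not guaranteed to return a shortest superstring; all names are illustrative.

```python
def is_valid_certificate(strings, k, t):
    """Polynomial-time check that t witnesses <strings, k> in SHORTEST-SUPERSTRING."""
    return len(t) <= k and all(s in t for s in strings)


def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_superstring(strings):
    """Heuristic: repeatedly merge the pair with the largest overlap."""
    unique = set(strings)
    # Strings contained in another string need no separate handling.
    pool = sorted(s for s in unique if not any(s != t and s in t for t in unique))
    while len(pool) > 1:
        best = (-1, 0, 1)
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j:
                    n = overlap(a, b)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        merged = pool[i] + pool[j][n:]
        pool = [s for idx, s in enumerate(pool) if idx not in (i, j)] + [merged]
    return pool[0] if pool else ""


if __name__ == "__main__":
    fragments = ["ACCT", "CTGA", "GAGAG"]  # made-up fragments
    t = greedy_superstring(fragments)
    print(t, is_valid_certificate(fragments, 9, t))
```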

20. References
   - Elaine Rich, Automata, Computability and Complexity: Theory and Applications (book)
   - http://en.wikipedia.org/wiki/Computational_biology

21. Thank you