Slide1
PROTEIN MODELLING
Presented by Sadhana S
Slide2
Definition
Protein structure prediction (protein modelling) is the prediction of the three-dimensional structure of a protein from its amino acid sequence, i.e., the prediction of its folding and of its secondary, tertiary, and quaternary structure from its primary structure.
Slide3
Why predict protein structure?
Owing to significant efforts in genome sequencing over nearly three decades, gene sequences from many organisms have been deduced. Over 100 million nucleotide sequences from over 300 thousand different organisms have been deposited in the major DNA databases (DDBJ/EMBL/GenBank), totaling almost 200 billion nucleotide bases.
Over 5 million of these nucleotide sequences have been translated into amino acid sequences and deposited in the UniProtKB database.
Slide4
Slide5
However, protein sequences alone are usually insufficient for determining protein function, because the biological function of a protein is intrinsically linked to its three-dimensional structure.
The most accurate structural characterization of proteins is provided by X-ray crystallography and NMR spectroscopy. Owing to the technical difficulty and labor intensiveness of these methods, the number of protein structures solved experimentally lags far behind the number of known protein sequences.
Slide6
Many proteins are simply too large for NMR analysis and cannot be crystallized for X-ray diffraction.
Protein modelling (computational methods) is the only way to obtain structural information if experimental techniques fail.
The ultimate goal of protein modelling is to predict a structure from its sequence with an accuracy comparable to the best results achieved experimentally.
Slide7
Can we predict structure from sequence?
Slide8
Computational Methods
The three major approaches for three-dimensional (3D) structure prediction are:
Ab initio methods
Threading methods
Comparative modelling / homology modelling
Slide9
What is Homology Modelling?
It is the prediction of the three-dimensional structure of a given protein sequence (the target) based on an alignment to one or more known protein structures (the templates).
If similarity between the target sequence and the template sequence is detected, structural similarity can be assumed.
Slide10
Homology Modelling
Homology modelling, also known as comparative modelling, is a technique for constructing an atomic-resolution model of the "target" protein from:
1. Its amino acid sequence, and
2. An experimental 3D structure of a related homologous protein (the "template").
Slide11
Basis for homology modelling
The structure of a protein is uniquely determined by its amino acid sequence.
Structure is much more conserved than sequence during evolution.
Proteins sharing high sequence similarity should have a similar fold.
The higher the similarity, the higher the confidence in the modelled structure.
Slide12
Homology modelling is a multistep process that can be summarized in seven steps:
1. Template recognition & initial alignment
2. Alignment corrections
3. Backbone generation
4. Loop modelling
5. Side-chain modelling
6. Model optimization
7. Model validation
Slide13
TEMPLATE RECOGNITION
Achieved by searching the PDB of known protein structures using the target sequence as the query.
Templates can be found by using the target sequence as the query in FASTA, BLAST, PSI-BLAST, or PDB-BLAST searches.
Select the best template (minimum 30% sequence identity) from a library of known protein structures derived from the PDB.
Slide14
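The 30% cut-off above can be sketched in a few lines. This is only an illustration, assuming the cut-off refers to percent sequence identity counted over aligned, non-gap positions (other definitions of identity exist). The function names, the template identifiers, and the toy sequences are hypothetical and do not come from the slides or from any library.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent of aligned (non-gap) columns in which both residues are identical."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # columns containing a gap are not counted
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


def pick_template(alignments, min_identity=30.0):
    """Return (template_id, identity) for the best hit at or above the cut-off, else None.

    `alignments` maps a template id to a pair (aligned_target, aligned_template).
    """
    best = None
    for template_id, (tgt, tpl) in alignments.items():
        ident = percent_identity(tgt, tpl)
        if ident >= min_identity and (best is None or ident > best[1]):
            best = (template_id, ident)
    return best


# Hypothetical search hits, already aligned to the target.
hits = {
    "tplA": ("ACDEF", "ACDEF"),
    "tplB": ("ACDEF", "WWWWF"),
}
print(pick_template(hits))
```

In practice the aligned pairs would come from a FASTA or BLAST search against PDB sequences rather than from hand-written strings.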
ALIGNMENT
Purpose: to propose the homologies between the sites in two or more sequences
Insertions & deletions (gaps) are placed
Types
- Pairwise alignment
- Multiple alignment
Slide15
Slide16
Correct alignment is necessary to create the most probable 3D structure of the target.
If the sequences are aligned incorrectly, the result will be false positives or false negatives.
Important steps to consider:
Gap penalties
Scoring alignments
Alignment algorithms
Slide17
Alignment Corrections
Alignments are scored (substitution score) in order to define the similarity between two amino acid residues in the sequences.
A substitution score is calculated for each aligned pair of letters.
Alignment algorithms: DPA, BLAST & FASTA
Slide18
Slide19
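The scoring and gap-penalty ideas above can be illustrated with a small global alignment by dynamic programming (the Needleman-Wunsch scheme), assuming that "DPA" on the slide refers to a dynamic programming algorithm. The match, mismatch, and gap values below are arbitrary illustrative choices, not values from the slides; real protein alignments normally use a substitution matrix such as BLOSUM62 and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACDEF", "ACEF"))
```

With these toy parameters the missing residue in the second sequence is represented by a single gap, which is the kind of insertion/deletion placement the alignment step has to get right before coordinates are copied.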
Example
Structures of alignments 1 and 2 with the template
Slide20
Alignment Outcome
The (true) alignment indicates the evolutionary process giving rise to the different sequences, starting from the same ancestor sequence and then changing through mutations (insertions, deletions, and substitutions).
Slide21
BACKBONE GENERATION
One simply copies the coordinates of those template residues that show up in the alignment with the model sequence.
If two aligned residues differ, only the backbone coordinates (N, C-alpha, C & O) are copied.
If they are the same, the side chain is also included.
Slide22
Backbone Generation
For structurally conserved regions (SCRs): copy coordinates from known structures.
For variable regions (VRs): copy from a known structure if the residue types are similar; otherwise, use databases of loop sequences.
Slide23
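The coordinate-copying rule for backbone generation (backbone atoms only when the aligned residues differ, side chain included when they are the same) can be sketched as follows. The data layout is an assumption made for illustration: each template residue is a hypothetical dictionary from PDB-style atom names to (x, y, z) tuples, and the function name is not from any modelling package.

```python
BACKBONE_ATOMS = ("N", "CA", "C", "O")


def copy_coordinates(aligned_target, aligned_template, template_residues):
    """Build a crude model as a list of (residue_letter, {atom_name: xyz}).

    `template_residues` holds one atom dictionary per template residue,
    in sequence order and without gaps.
    """
    model = []
    tpl_index = 0
    for tgt_res, tpl_res in zip(aligned_target, aligned_template):
        if tpl_res == "-":
            if tgt_res != "-":
                # Insertion in the target: nothing to copy, left for loop modelling.
                model.append((tgt_res, {}))
            continue
        atoms = template_residues[tpl_index]
        tpl_index += 1
        if tgt_res == "-":
            continue  # deletion in the target: this template residue is skipped
        if tgt_res == tpl_res:
            copied = dict(atoms)  # identical residue: backbone and side chain
        else:
            copied = {name: xyz for name, xyz in atoms.items()
                      if name in BACKBONE_ATOMS}  # different residue: backbone only
        model.append((tgt_res, copied))
    return model


# Toy template with two residues (Ala, Ser); the coordinates are made up.
template = [
    {"N": (0.0, 0.0, 0.0), "CA": (1.5, 0.0, 0.0), "C": (2.0, 1.4, 0.0),
     "O": (1.2, 2.3, 0.0), "CB": (2.0, -0.8, 1.2)},
    {"N": (3.3, 1.6, 0.0), "CA": (3.9, 2.9, 0.0), "C": (5.4, 2.8, 0.0),
     "O": (6.0, 1.7, 0.0), "CB": (3.5, 3.7, 1.2), "OG": (3.9, 5.0, 1.1)},
]
model = copy_coordinates("AGT", "A-S", template)
for residue, atoms in model:
    print(residue, sorted(atoms))
```

Residues left with an empty atom dictionary are the ones that the later loop-modelling and side-chain steps would have to build.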
Loop Modelling
Knowledge-based: the PDB is searched for loop conformations
Energy-based: an energy function is used to judge the quality of the loop
Molecular modelling/dynamics programs are used
Slide24
Loop Modelling
Slide25
Side Chain Modelling
1. Use of rotamer libraries (backbone dependent)
2. Molecular mechanics optimization
- Dead-end elimination (exact pruning)
- Monte Carlo (heuristic)
- Branch & Bound (exact)
Slide26
Model refinement/optimization
Idealization of bond geometry
Removal of unfavorable non-bonded contacts
Performed by energy minimization with force fields such as CHARMM, AMBER, or GROMOS
Major errors are removed
Slide27
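To give a feel for what "idealization of bond geometry by energy minimization" means, here is a toy steepest-descent minimizer. It is a sketch under strong assumptions: the only energy term is a harmonic bond-length penalty between consecutive atoms in a chain, and the ideal length, force constant, and step size are hypothetical values. It is not CHARMM, AMBER, or GROMOS, which include many more terms (angles, torsions, non-bonded interactions).

```python
import math


def bond_energy(coords, d0, k):
    """Sum of harmonic penalties 0.5 * k * (d - d0)^2 over consecutive atoms."""
    return sum(
        0.5 * k * (math.dist(coords[i], coords[i + 1]) - d0) ** 2
        for i in range(len(coords) - 1)
    )


def minimize_bonds(coords, d0=1.5, k=1.0, step=0.1, n_steps=500):
    """Steepest descent on the toy bond energy; returns the relaxed coordinates."""
    pos = [list(p) for p in coords]
    for _ in range(n_steps):
        grad = [[0.0, 0.0, 0.0] for _ in pos]
        for i in range(len(pos) - 1):
            delta = [pos[i + 1][a] - pos[i][a] for a in range(3)]
            d = math.sqrt(sum(c * c for c in delta))
            if d == 0.0:
                continue  # direction undefined for overlapping atoms
            coeff = k * (d - d0) / d
            for a in range(3):
                grad[i + 1][a] += coeff * delta[a]
                grad[i][a] -= coeff * delta[a]
        for p, g in zip(pos, grad):
            for a in range(3):
                p[a] -= step * g[a]  # move against the gradient
    return [tuple(p) for p in pos]


# Three atoms with one stretched and one compressed bond (made-up coordinates).
start = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
relaxed = minimize_bonds(start)
print(bond_energy(start, 1.5, 1.0), bond_energy(relaxed, 1.5, 1.0))
```

The same loop structure (compute forces, take a small step, repeat) underlies real minimizers, which differ mainly in the energy function and in using better step-selection schemes.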
Evaluation/validation of the model
Internal evaluation
- Self-consistency checks
- Assessment of the stereochemistry of the model
- PROCHECK & WHATCHECK
External evaluation
- Tests whether a correct template was used
- PROSA & VERIFY3D
Slide28
Slide29
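Stereochemistry assessment of the kind mentioned above relies heavily on backbone torsion angles (phi and psi, as plotted in a Ramachandran plot). The sketch below computes a torsion angle from four atom positions; it only illustrates the underlying geometry and is not the code used by PROCHECK or any other validation tool. The helper names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in the range (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = tuple(a - _dot(b0, b1) * c for a, c in zip(b0, b1))
    w = tuple(a - _dot(b2, b1) * c for a, c in zip(b2, b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# phi of residue i uses atoms C(i-1), N(i), CA(i), C(i);
# psi of residue i uses atoms N(i), CA(i), C(i), N(i+1).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

A validation program would compute phi and psi for every residue of the model and flag residues whose angle pairs fall in rarely observed regions.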
Applications
Designing mutants to test hypotheses about the function of a protein.
Identifying active & binding sites.
Predicting antigenic epitopes.
Simulating protein-protein docking.
Confirming a remote structural relationship.
Slide30
Web servers
SWISS-MODEL server (http://www.expasy.ch/swissmod/)
CPHModels (http://www.cbs.dtu.dk/services/CPHmodels/)
SDSC1 (http://www.cl.sdsc.edu/hm)
FAMS (http://www.physchem.pharm.kitasato-u.ac.jp/FAMS/fams.html)
ModWeb (http://www.guitar.rockefeller.edu/modweb)
Slide31
Slide32