Slide1
How to approximate complex physical and thermodynamic interactions?
Employ rigid or flexible structures for ligand and receptor? (Side chains or backbone flexible?)
How to handle molecular motions?
Treat with full atomic detail or simplified models?
Which docking energy function is best?
Automated Molecular Docking Issues
Slide2
Given two molecules with known 3D conformations:
1) Can we predict whether they bind to each other? This is harder than it sounds!
2) If yes, can we accurately predict:
   The binding affinity?
   The shape of the molecule-molecule complex?
3) Can we at least rank-order the affinities of a range of ligands (virtual screening)?
Relevance to chemistry/biochemistry:
Protein-Small Ligand docking (drug design, usually rigid protein, flexible ligand)
DNA-Small Ligand docking (drug design, usually rigid DNA, flexible ligand)
Protein-Carbohydrate docking (usually rigid protein, flexible ligand)
Protein-DNA docking (usually rigid protein, flexible ligand)
Protein-Protein docking (usually rigid body)
The Molecular Docking Challenge
Slide3
Electrostatic interactions (relatively long range, proportional to 1/R): hydrogen bonds, salt bridges, charge-charge
Dispersive interactions (short range):
   Van der Waals attractions (proportional to 1/R^6)
   Van der Waals repulsions (proportional to 1/R^12)
Hydrophobic contacts (depend on displacing solvent from the binding site, and are therefore short range)
Tight binding requires both the correct shape of the interacting surfaces (shape complementarity) and the correct polarities (charge complementarity).
The binding affinity is the energetic difference between the bound and free states, which requires solvation and entropy to be considered.
Specificity is driven by shape and hydrogen bond complementarity (easy to quantify).
Affinity is driven by hydrophobic and entropic effects (hard to quantify).
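The distance dependences listed on this slide can be made concrete with a small sketch. The functional forms below (a Coulomb term and a 12-6 Lennard-Jones term) are the textbook ones; the charges, well depth and contact distance are hypothetical values chosen only for illustration.

```python
def coulomb_energy(q_i, q_j, r, dielectric=1.0):
    """Electrostatic energy (kcal/mol) of two point charges (units of e) at distance r (Angstrom).

    Falls off as 1/r, which is why it is relatively long range.
    """
    coulomb_constant = 332.06  # kcal*Angstrom/(mol*e^2)
    return coulomb_constant * q_i * q_j / (dielectric * r)


def lennard_jones_energy(r, well_depth=0.2, r_min=3.8):
    """12-6 energy (kcal/mol): repulsion ~ 1/r^12, attraction ~ 1/r^6.

    well_depth and r_min are hypothetical parameters, not taken from any force field.
    """
    ratio6 = (r_min / r) ** 6
    return well_depth * (ratio6 ** 2 - 2.0 * ratio6)


# Compare how quickly each term decays with distance.
for r in (3.0, 3.8, 6.0, 12.0):
    print(f"R = {r:5.1f} A  Coulomb = {coulomb_energy(0.4, -0.4, r):8.3f}  "
          f"LJ = {lennard_jones_energy(r):8.4f}")
```

At three times the contact distance the Lennard-Jones term has essentially vanished, while the Coulomb term has only dropped to a third of its value.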
Factors Affecting Binding
Slide4
Estimating the binding affinity:
   Searching for lead structures (drug candidates) for protein targets
   Comparing a set of inhibitors
   Estimating the influence of modifications in lead structures
   De novo ligand design
   Design of targeted combinatorial libraries
Predicting the molecule complex:
   Understanding the binding mode / principle
   Optimizing lead structures
   Determining ligand positions in crystal structures
Applications of Docking
Slide5
To make docking practical:
   Eliminate explicit waters (what about desolvation?): approximate desolvation
   Eliminate dynamics (what about entropy?): approximate entropy
   Employ a general force field (what about precision?): treat force field energies as adjustable, not absolute
   Ignore the unbound state (what about ΔG?): approximate ΔG
Approximations in Docking
Slide6
Instead of using: ΔG_binding = ΔG_complex - ΔG_ligand - ΔG_receptor
Develop a “scoring function” that takes part of the interaction energy from force field concepts and part from empirical fitting to experimental values.
Use: [equation not preserved in the transcript; the definitions that follow imply a weighted sum of the form ΔG_binding ≈ Σ_i f_i E_i]
Scoring Functions (the Ugly Side of Docking)
The interactions (E_i) might include:
   hydrogen bonds
   electrostatic interactions
   hydrophobic contacts
   solvent exclusion volume, among others...
Each contribution has an adjustable weighting factor (f_i).
Slide7
In determining the weighting factors (f_i) the developer must choose how broadly or how narrowly the scoring function is to be applied. Is the function to be used for all classes of interactions? Or only some? For protein-protein only, or protein-drug only, or only for a particular class of drug?
There are many scoring functions. The AutoDock 3 function is:
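The equation that followed this sentence on the slide did not survive in the transcript. The block below is a hedged reconstruction, not the slide's own rendering: it follows the commonly cited form of the AutoDock 3 free-energy function and uses the symbols described under "Scoring Function Details (AutoDock 3)". The torsional term (f_tor N_tor) and the Gaussian width σ in the desolvation term are not mentioned anywhere in the transcript and are included here as assumptions.

```latex
\Delta G =
    f_{\mathrm{vdW}}   \sum_{i,j} \left( \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}} \right)
  + f_{\mathrm{Hbond}} \sum_{i,j} \xi_t \left( \frac{C_{ij}}{R_{ij}^{12}} - \frac{D_{ij}}{R_{ij}^{10}} \right)
  + f_{\mathrm{elec}}  \sum_{i,j} \frac{q_i q_j}{\varepsilon(R_{ij})\, R_{ij}}
  + f_{\mathrm{tor}}   N_{\mathrm{tor}}
  + f_{\mathrm{sol}}   \sum_{i,j} S_i V_j \, e^{-R_{ij}^{2} / 2\sigma^{2}}
```

Each sum runs over ligand atoms i and protein atoms j, and each f is one of the empirically fitted weighting coefficients.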
Scoring Functions: General or Specific?
The f coefficients are determined empirically from a multi-linear regression (MLR) to a set of protein–ligand complexes with known binding constants.
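A minimal sketch of such a multi-linear regression, assuming NumPy is available. The energy terms, the "true" weights and the resulting ΔG values are made-up, noise-free toy data, used only to show the mechanics of fitting the f coefficients and then scoring a new complex; real training sets contain experimental binding constants and do not fit exactly.

```python
import numpy as np

# Hypothetical training set: one row per protein-ligand complex, one column per
# unweighted interaction term E_i (e.g. vdW, H-bond, electrostatic, desolvation).
energy_terms = np.array([
    [-40.0, -6.0, -3.0, 5.0],
    [-55.0, -9.0, -1.0, 8.0],
    [-30.0, -2.0, -5.0, 3.0],
    [-62.0, -7.0, -2.0, 9.0],
    [-48.0, -4.0, -6.0, 6.0],
    [-35.0, -8.0, -4.0, 4.0],
])

# Made-up weights used only to generate stand-in "experimental" free energies.
true_weights = np.array([0.15, 0.10, 0.11, 0.17])
measured_dg = energy_terms @ true_weights

# Least-squares fit of the weighting factors f_i (the multi-linear regression).
fitted_weights, _, _, _ = np.linalg.lstsq(energy_terms, measured_dg, rcond=None)


def score(terms, weights):
    """Weighted sum of interaction terms: sum_i f_i * E_i."""
    return float(np.dot(terms, weights))


new_complex = np.array([-45.0, -5.0, -2.5, 6.5])  # hypothetical unweighted terms
print("fitted f_i:", np.round(fitted_weights, 3))
print("predicted score:", round(score(new_complex, fitted_weights), 3))
```

The choice of training complexes is what makes a scoring function general or specific: the same fit run on a narrower class of complexes yields weights tuned to that class.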
Because the f coefficients are fitted to experimental data rather than derived from physics, scoring functions are considered empirical.
Slide8
Scoring Function Details (AutoDock 3)
The indices i and j correspond to ligand and protein atoms, respectively.
The Coulombic term includes the partial charges (q) and a distance-dependent dielectric function (ε(R)).
A, B, C and D are the Lennard-Jones parameters in the dispersion/repulsion 12-6 and H-bonding 12-10 formulas, and R denotes the distance between the atomic pairs.
ξ_t is a directional weight depending on the angle t at the H-bonds.
S and V denote the solvation parameter (empirical) and fragmental volume, respectively, in the solvation function of Stouten et al.
The AutoDock4 scoring function has a different parametrization of the desolvation term.
Slide9
Simulated Annealing Search Technique
AutoDock can use one of several optimization methods to search for the best placement of the ligand.
Simulated annealing: At each step of simulated annealing, the position and internal rotational state of the ligand are adjusted and the energy is calculated. If the energy decreases, the move is accepted. If not, it may be accepted with some probability that depends on the current temperature of the annealing.
As the search goes on, the temperature is decreased, and eventually the final state of the ligand is returned as the docked conformation. Because simulated annealing is a Monte Carlo (randomized) method, different runs will generally produce different solutions.
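The procedure just described can be sketched in a few lines. This is a generic simulated annealing loop with a Metropolis-style acceptance rule and a geometric cooling schedule, not AutoDock's implementation. The "pose" is reduced to a single number and the energy function is a hypothetical stand-in for a docking score. A real ligand state would include translation, orientation and torsion angles.

```python
import math
import random


def simulated_annealing(energy, initial_state, perturb, start_temp=10.0,
                        cooling=0.95, steps_per_temp=50, min_temp=1e-3, seed=0):
    """Return the final state and its energy after annealing."""
    rng = random.Random(seed)
    state, state_energy = initial_state, energy(initial_state)
    temperature = start_temp
    while temperature > min_temp:
        for _ in range(steps_per_temp):
            candidate = perturb(state, rng)
            candidate_energy = energy(candidate)
            delta = candidate_energy - state_energy
            # Downhill moves are always accepted; uphill moves are accepted
            # with a probability that shrinks as the temperature drops.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                state, state_energy = candidate, candidate_energy
        temperature *= cooling  # cool down before the next round of moves
    return state, state_energy


def toy_energy(x):
    """Hypothetical one-dimensional 'docking energy' with its minimum at x = 2."""
    return (x - 2.0) ** 2


def toy_perturb(x, rng):
    """Random small change of the one-dimensional 'pose'."""
    return x + rng.uniform(-0.5, 0.5)


final_pose, final_energy = simulated_annealing(toy_energy, -5.0, toy_perturb)
print(f"final pose = {final_pose:.3f}, energy = {final_energy:.5f}")
```

Changing `seed` changes the random moves, which is why independent runs generally end in slightly different final states.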
Finding Optimal Poses
http://cnx.org/content/m11456/latest/
Slide10
A central paradigm used in the development of the first docking programs was the lock-and-key model first described by Fischer. In this model the three-dimensional structures of the ligand and the receptor complement each other in the same way that a lock complements a key. However, a more accurate view of this process was first presented by Koshland in the induced fit model.
In this model the 3D structures of the ligand and the receptor adapt to each other during the binding process. It is important to note that not only the structure of the ligand but also the structure of the receptor changes during the binding process. This occurs because the introduction of a ligand modifies the chemical and structural environment of the receptor.
Rigid or Flexible Protein?
http://cnx.org/content/m11456/latest/
Slide11
Soft receptors can be easily generated by reducing the van der Waals repulsive (1/R^12) contributions to the total energy score. This makes the receptor “softer”, thus allowing, for example, a larger ligand to fit in a binding site determined experimentally for a smaller molecule.
Treating Induced Fit: Soft Receptors
http://cnx.org/content/m11456/latest/
a) van der Waals representation of a target receptor. b) Close-up image of a section of the binding site with normal van der Waals properties. c) Same section of the binding site as shown in b) but with reduced radii for the atoms in the receptor. This type of soft representation allows ligand atoms to enter the grey shaded area without incurring a high energetic penalty.
Slide12
Soft receptors can be easily generated by reducing the van der Waals repulsive (1/R^12) contributions to the total energy score. This makes the receptor “softer”, thus:
   Allowing a slightly larger ligand to fit in a binding site determined experimentally for a smaller molecule.
   Allowing a ligand to fit into a binding site from a structure that was determined in the absence of any ligand.
The rationale behind this approach is that the receptor structure has some inherent flexibility, which allows it to adapt to slightly differently shaped ligands through small variations in the orientation of binding site side chains and in backbone positions.
It will not correct for a case in which ligand binding requires a significant change in the binding site, such as the flipping of a side chain into a different rotamer.
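A minimal sketch of the softening idea, assuming a standard 12-6 Lennard-Jones form in which only the repulsive 1/R^12 term is scaled down. The parameter names and values (`repulsion_scale`, `well_depth`, `r_min`) are hypothetical and are not taken from any docking program.

```python
def lj_energy(r, well_depth=0.2, r_min=3.8, repulsion_scale=1.0):
    """12-6 Lennard-Jones energy with an adjustable weight on the repulsive term.

    repulsion_scale = 1.0 gives the normal (hard) receptor;
    values below 1.0 give a softer receptor.
    """
    ratio6 = (r_min / r) ** 6
    repulsion = well_depth * ratio6 ** 2      # ~ 1/r^12
    attraction = -2.0 * well_depth * ratio6   # ~ 1/r^6
    return repulsion_scale * repulsion + attraction


# A ligand atom sitting slightly too close to a receptor atom (3.0 A instead of 3.8 A).
close_contact = 3.0
hard = lj_energy(close_contact)
soft = lj_energy(close_contact, repulsion_scale=0.5)
print(f"hard receptor penalty: {hard:.3f} kcal/mol")
print(f"soft receptor penalty: {soft:.3f} kcal/mol")
```

The function is evaluated exactly as often as in the rigid case, so softening changes the tolerated overlap but not the cost of scoring a pose.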
The main advantages of using soft receptors are ease of implementation (docking algorithms stay unchanged) and speed (the cost of evaluating the scoring function is the same as for the normal, rigid case).
Treating Induced Fit: Soft Receptors
http://cnx.org/content/m11456/latest/
Slide13
Treating Induced Fit: Side Chain Rotations
Rotation around single bonds, such as those in side chains, is a “natural” way to model induced fit. Selecting which torsion angles are permitted to rotate is usually the most difficult part of this method because it requires considerable a priori knowledge of alternative binding modes for a given receptor.
Alternatively, probable side chain orientations may be selected from rotamer libraries.
The principal problem with this method is that it adds significantly to the time required for the calculation because of the exponential number of permutations of side chain rotamers in a binding site.
http://cnx.org/content/m11456/latest/
Slide14
To approximate the flexibility of the receptor it is possible to carefully select a few degrees of freedom. These are usually the torsional angles of side chains that have been determined to be critical in the induced fit effect for a specific receptor.
Treating Induced Fit: Side Chain Rotations
In this example the selected torsional angles are represented by arrows.
Stick representation of a section of a binding site
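The reason for restricting flexibility to a few selected side chains is the combinatorial growth in rotamer combinations. The sketch below counts and enumerates them. The residue names and rotamer counts are hypothetical, chosen only to show the arithmetic.

```python
import math
from itertools import product

# Hypothetical binding-site residues and the number of library rotamers for each.
rotamers_per_residue = {"ASP25": 5, "ILE50": 4, "ARG8": 27}


def count_combinations(rotamer_counts):
    """Number of distinct receptor states when every residue is sampled independently."""
    return math.prod(rotamer_counts.values())


def enumerate_combinations(rotamer_counts):
    """Yield one tuple of rotamer indices per receptor state."""
    return product(*(range(n) for n in rotamer_counts.values()))


print("three flexible residues:", count_combinations(rotamers_per_residue), "receptor states")

# With 10 rotamers per residue, each additional flexible residue multiplies the cost by 10.
for n_residues in (2, 4, 8):
    print(n_residues, "residues ->", 10 ** n_residues, "receptor states")
```

Each receptor state would need its own docking or scoring pass, which is why only the side chains known to matter are usually made flexible.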
http://cnx.org/content/m11456/latest/
Slide15
One possible way to represent a flexible receptor is to use multiple static receptor structures. This concept reflects the idea that proteins in solution do not exist in a single minimum energy static conformation but are in fact constantly jumping between low energy conformational sub-states. In this way the best description for a protein structure is that of a conformational ensemble of slightly different protein structures coexisting in a low energy region of the potential energy surface.
Thus, the binding process can be thought of not as an induced fit, as described by Koshland in 1958, but more like a selection of the particular sub-state from the conformational ensemble that best complements the shape of a specific ligand.
Treating Induced Fit: Multiple Receptor Conformations
http://cnx.org/content/m11456/latest/
Slide16
Treating Induced Fit: Multiple Receptor Conformations
Superposition of multiple conformers of a section of a binding site. These can either be considered individually as rigid representatives of the conformational ensemble or be combined into a single representation that preserves the most relevant structural information.
Slide17
The use of multiple static conformations for docking gives rise to two critical questions.
1) How can we obtain a representative subset of the conformational ensemble typical of a given receptor? The structures can be determined experimentally, by either X-ray crystallography or NMR, or generated via computational methods such as Monte Carlo or molecular dynamics simulations.
2) What is the best way of combining this large amount of structural information for a docking study? Should the multiple shapes be averaged in some way, or should independent docking be performed on each one? How many shapes should be used? These questions remain open.
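One of the options raised above, independent docking against each receptor conformation followed by picking the best-scoring one, can be sketched as follows. Everything here is a stand-in: `dock_one` is a hypothetical callable representing a full rigid-receptor docking run, and the "conformers" are reduced to a single pocket width so the example stays self-contained.

```python
from typing import Callable, Dict, Tuple


def dock_against_ensemble(receptor_conformers: Dict[str, object],
                          dock_one: Callable[[object], float]) -> Tuple[str, float]:
    """Dock the ligand into every conformer independently; return the best (lowest) score."""
    scores = {name: dock_one(conformer) for name, conformer in receptor_conformers.items()}
    best_name = min(scores, key=scores.get)
    return best_name, scores[best_name]


# Toy ensemble: each conformer is represented only by its pocket width in Angstrom.
conformers = {"xray_apo": 4.0, "md_frame_1": 5.5, "md_frame_2": 6.2}
ligand_width = 6.0


def toy_dock(pocket_width):
    """Hypothetical score: best when the pocket width matches the ligand width."""
    return -10.0 + 2.0 * (pocket_width - ligand_width) ** 2


best_conformer, best_score = dock_against_ensemble(conformers, toy_dock)
print(f"best conformer: {best_conformer}, score: {best_score:.2f}")
```

The cost grows only linearly with the number of conformers, in contrast to the exponential growth seen when side-chain rotamers are combined.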
Treating Induced Fit: Multiple Receptor Conformations
http://cnx.org/content/m11456/latest/
Slide18
One of the main advantages of using multiple structures instead of a selection of degrees of freedom to represent protein flexibility is that the flexible region is not limited to a specific small region of the protein. The multiple structure approach allows the full flexibility of the protein, including the backbone, to be considered without the exponential blow-up in computational cost that would result from including all the degrees of freedom of the protein.
On the other hand, only a small fraction of the conformational space of the receptor is represented by a limited number of shapes.
Multiple Receptor Conformations versus Rotatable Side Chains
Slide19
Ligand Docking (Handle with Care!)
Accuracy
– Ability to discriminate binders from non-binders (Scoring)
– Ability to identify bound conformation (Internal Energies)
– Ability to identify binding site (Search Algorithm)
Efficiency
– The efficiency of conformation searching and pose searching is inversely related to ligand flexibility (Smaller is Better)
Scoring functions have not been tuned for glycans (Aromatic Stacking)
Docking functions do not include appropriate internal energies
Induced fit in the protein is ignored
Slide20
Ligand Docking (Handle with Care!)
Docking is: fast, fun and cheap
But which pose is the winner?
Slide21
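Docking Energies Should Distinguish Good from Bad Poses
[Figure: binding energy (axis running from negative through 0 to positive, with "Better" and "Worse" directions and a "Non-Binders" region marked) versus RMSD relative to the known 3D structure; results from AutoDock 3.0.5]
Slide22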
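Docking Energies Should Distinguish Good from Bad Poses
[Figure: binding energy (axis running from negative through 0 to positive, with "Better" and "Worse" directions and a "Non-Binders" region marked) versus RMSD; results from AutoDock (VINA-CARB) with carbohydrate internal energies]
Slide23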
Inclusion of Glycosidic Energy in AutoDock VINA: AutoDock VINA-Carb

Antibody   Average Internal Energy (kcal/mol)*   RMSD of Lowest Energy Pose (Å)
           VINA       VINA-CARB                  VINA       VINA-CARB
1MFA       3.7        1.1                        2.8        1.2
1MFD       4.8        1.1                        2.5        1.5
1S3K       9.0        1.4                        1.7        1.2
1UZ8       0.5        0.5                        0.4        0.4
1M7D       8.1        1.0                        1.1        1.0
1M7I       15.1       1.9                        10.2       1.1

*Averaged over top 20 poses, flexible glycan docked to positive control antibody
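The RMSD values used to judge docked poses can be computed as sketched below. This is a plain coordinate RMSD without superposition, which assumes that the docked pose and the reference are in the same receptor frame and that atoms are listed in the same order. Symmetry-equivalent atoms are not handled. The coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input) between matched atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical three-atom ligand: crystal pose and a docked pose shifted by 1 A along x.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(x + 1.0, y, z) for x, y, z in crystal_pose]

print(f"RMSD = {rmsd(crystal_pose, docked_pose):.2f} A")
```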