Bioinformatics Not only small molecules and QM, MM techniques rule the world. - PowerPoint Presentation

Uploaded On 2022-08-01


Presentation Transcript

Slide1

Bioinformatics

Slide2

Not only small molecules and QM, MM techniques rule the world.

Slide3

Central dogma of molecular biology

Term is due to Francis Crick

The conversion DNA → protein is not direct, RNA is involved

DNA is the information store, RNA is messenger (mRNA), transporter (tRNA), biomolecular nanomachine (rRNA)

source: wikipedia.org
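The flow DNA → mRNA → protein can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codon table is a deliberately tiny subset of the standard genetic code, the input sequence is made up, and the function names `transcribe` and `translate` are hypothetical helpers, not part of the original slides.

```python
# Minimal sketch of the central dogma: DNA -> mRNA -> protein.
# CODON_TABLE is only a small subset of the standard genetic code,
# chosen to cover the example sequence (an assumption for illustration).
CODON_TABLE = {
    "AUG": "M",  # methionine, start codon
    "GAU": "D",  # aspartic acid
    "GAA": "E",  # glutamic acid
    "UGG": "W",  # tryptophan
    "UAA": "*",  # stop codon
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T -> U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGATGAATGGTAA"  # hypothetical coding strand, written 5' -> 3'
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # AUGGAUGAAUGGUAA
print(protein)  # MDEW
```

A real translation routine would need all 64 codons; the subset here would raise a `KeyError` on any codon it does not list.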

Slide4

Nucleic acids

four letters (DNA, RNA)

sequence - AACTAACG (5’ → 3’)

DNA – double helix

RNA – “single stranded” helix, folding (double helical regions, C2’ -OH → secondary and tertiary motifs)
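Because sequences are written 5’ → 3’ and the two strands of a DNA double helix run antiparallel, the partner strand of a sequence is its reverse complement. The following is a minimal sketch using the example sequence from this slide; the function name `reverse_complement` is an illustrative choice.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the partner strand, also written 5' -> 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


seq = "AACTAACG"  # example sequence from the slide
partner = reverse_complement(seq)
print(partner)  # CGTTAGTT
```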

Slide5

[figure: nucleoside and nucleotide; labels only partly recovered]

Slide6

B-DNA

A-DNA

Z-DNA

Slide7

RNA secondary motifs

Nowakowski and Tinoco, Seminars in Virology 8, 153, 1997.

Slide8

RNA

source: http://complex.upf.es/~josep/RNA.jpg, http://www.biosci.ki.se/groups/ljo/images/phe_trna_large.jpg, http://rna.ucsc.edu/rnacenter/images/70s_atrna.jpg

Slide9

Proteins

20 letters

primary structure - sequence AMNTSSTVG (N-end → C-end)

Alberts, Molecular Biology of the Cell, 5th Ed.

Slide10

secondary structure (random coil, α-helix, β-sheet, loops)

several secondary structure elements form motifs, e.g. Greek key, β-α-β, HTH

Slide11

tertiary structure (the arrangement of motifs into domain/s)

quaternary structure (multimeric complexes)

Slide12

Proteins

source:http://calstate.fullerton.edu/news/arts/2003/photos/protein-art.jpg

Slide13

Proteins

source: Petsko, Ringe – Protein structure and function

Slide14

http://www.cellsignal.com/reference/pathway/NF_kappaB.html

Slide15

Systems biology

focuses on the systematic study of complex interactions in biological systems using a new perspective - holism instead of reductionism

holism – the properties of a system cannot be determined or explained by its component parts alone

one of the goals of systems biology is to discover new emergent properties

new field, boom since 2000, very little covered in CZ

Slide16

Systems biology

source: wikipedia.org

Slide17

Systems biology

based on mathematical modelling of systems, control theory, cybernetics

engineering view of complex biological systems

e.g. answers questions about the robustness of a given system when one of its parts fails

or about the response of a system to a change in environmental conditions

Slide18

quantum chemistry

molecular dynamics

bioinformatics

systems biology

Slide19

Bioinformatics

application of information technology to the field of molecular biology, genomics and related biological disciplines

tremendous amount of data

the creation and advancement of databases, algorithms, computational and statistical techniques, and theory to solve problems arising from the management and analysis of biological data

Slide20

According to the classification used by Russian scientists, we distinguish two fields of paranormal phenomena: bioinformatics and bioenergetics. Bioinformatics (i.e. extrasensory perception, ESP) covers the acquisition and exchange of information by extrasensory means (not through the normal sense organs). In essence, we distinguish the following forms of bioinformation: hypnosis (control of consciousness), telepathy, remote viewing, precognition, retrocognition, out-of-body experience, "seeing" with the hands or other parts of the body, inspiration and revelation.

source: http://www.esoterika.cz/clanek/2992-mimosmyslova_spionaz_dalkove_pozorovani_i_.htm

Slide21

Bioinformatics

sequence analysis (sequence bioinformatics)

structural analysis (structural bioinformatics)

functional analysis (systems biology)

Slide22

genetic code

gene

genome, genomics

large data sets

high throughput

human genome

DNA localized mainly in the nucleus, each nucleus carries the whole genetic information

3.2 billion bp

25 000 – 30 000 genes

ca. 1.5 % codes for proteins, the rest - junk DNA

what is the proteome?

proteomics

Is it more difficult to study genome or proteome?

Slide23

Sequential bioinformatics

reconstruction of sequence fragments

searching for genes and other interesting regions in the genome

junk DNA – 95% of the human genome is made up of non-coding sequences, either with no function or with a function not yet understood

querying huge genomes for a given sequence

comparison of genes within a species – similarities between protein functions

comparison of genes between species – organisms' evolutionary relationships (phylogenetic analysis)

Slide24

Sequence alignment

Procedure of comparing sequences

Point mutations – easy

ACGTCTGATACGCCGTATAGTCTATCT
ACGTCTGATTCGCCCTATCGTCTATCT

More difficult example

ACGTCTGATACGCCGTATAGTCTATCT
CTGATTCGCATCGTCTATCT

However, gaps can be inserted to get something like this

ACGTCTGATACGCCGTATAGTCTATCT
----CTGATTCGC---ATCGTCTATCT

gapless alignment

gapped alignment

insertion × deletion

indel
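The difference between the gapless and the gapped case can be made concrete by counting alignment columns. The sketch below uses the sequences from this slide, read as two rows of equal length; the helper `column_stats` and its dictionary keys are illustrative names, not an established API.

```python
def column_stats(top: str, bottom: str) -> dict:
    """Count match, mismatch and gap columns of two aligned rows."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have equal length")
    stats = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            stats["gap"] += 1       # an indel column
        elif a == b:
            stats["match"] += 1
        else:
            stats["mismatch"] += 1  # a point mutation
    return stats


# Point mutations only: no gaps needed.
gapless = column_stats("ACGTCTGATACGCCGTATAGTCTATCT",
                       "ACGTCTGATTCGCCCTATCGTCTATCT")
# Shorter second sequence: gaps restore the correspondence.
gapped = column_stats("ACGTCTGATACGCCGTATAGTCTATCT",
                      "----CTGATTCGC---ATCGTCTATCT")
print(gapless)  # {'match': 24, 'mismatch': 3, 'gap': 0}
print(gapped)   # {'match': 18, 'mismatch': 2, 'gap': 7}
```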

Slide25

Flavors of sequence alignment

pair-wise alignment

multiple sequence alignment

Slide26

Flavors of sequence alignment

global alignment – align entire sequence

local alignment – stretches of sequence with the highest density of matches are aligned, generating islands of matches or subalignments in the aligned sequences
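The two flavors can be contrasted with a small dynamic-programming sketch in the style of the Needleman-Wunsch (global) and Smith-Waterman (local) algorithms, which the slides do not name. The scoring values (match +1, mismatch -1, gap -1) and the function name `alignment_score` are assumptions chosen for illustration; the function returns only the best score, not the alignment itself.

```python
def alignment_score(a: str, b: str, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best global or local alignment score with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for leading gaps; local alignment does not.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(score[i - 1][j - 1] + pair,  # align the two letters
                       score[i - 1][j] + gap,       # gap in b
                       score[i][j - 1] + gap)       # gap in a
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


global_score = alignment_score("ACGTCTGAT", "CTGAT")
local_score = alignment_score("ACGTCTGAT", "CTGAT", local=True)
print(global_score)  # 1
print(local_score)   # 5
```

With these example strings the global score is pulled down by the four gaps needed to span the longer sequence, while the local score reflects only the perfectly matching island CTGAT.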

Slide27

Identity matrix

Scoring systems I

DNA and protein sequences can be aligned so that the number of identically matching pairs is maximized.

Counting the number of matches gives us a score (3 in this case).

Higher score means better alignment.

This procedure can be formalized using a substitution matrix.

A T T G - - - T
A - - G A C A T
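Scoring with a substitution matrix can be sketched as a sum over alignment columns. The example below assumes the slide's alignment reads ATTG---T over A--GACAT (the extraction is ambiguous here), uses an identity matrix, and scores gap columns as 0; `matrix_score` and `IDENTITY` are illustrative names.

```python
from itertools import product

ALPHABET = "ACGT"
# Identity matrix: 1 for identical letters, 0 otherwise.
IDENTITY = {(a, b): int(a == b) for a, b in product(ALPHABET, repeat=2)}


def matrix_score(top: str, bottom: str, matrix: dict, gap: int = 0) -> int:
    """Sum substitution-matrix scores over the columns of an alignment."""
    total = 0
    for a, b in zip(top, bottom):
        total += gap if "-" in (a, b) else matrix[(a, b)]
    return total


score = matrix_score("ATTG---T", "A--GACAT", IDENTITY)
print(score)  # 3
```

Swapping `IDENTITY` for another matrix changes the scoring scheme without touching the summation.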

Slide28

Scoring systems II

For nucleotide sequences, the identity matrix is usually good enough.

For protein sequences, the identity matrix is not sufficient to describe biological and evolutionary processes.

This is because amino acids are not exchanged with the same probability as can be conceived theoretically.

For example, substitution of aspartic acid (D) by glutamic acid (E) is frequently observed, whereas a change from aspartic acid to tryptophan (W) is very rare.

Why is that?

Triplet-based genetic code: GAT (D) → GAA (E), GAT (D) → TGG (W)

Both D and E have similar properties, but D and W differ considerably. D is hydrophilic, W is hydrophobic; a D → W mutation can greatly alter the 3D structure and consequently the function.
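The codon argument can be checked by counting how many nucleotides differ between the codons quoted on this slide. This is a minimal sketch; `codon_distance` is an illustrative helper name, and the count says nothing about the chemical similarity of the amino acids, which is the slide's second argument.

```python
def codon_distance(codon_a: str, codon_b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


d_to_e = codon_distance("GAT", "GAA")  # aspartic acid -> glutamic acid
d_to_w = codon_distance("GAT", "TGG")  # aspartic acid -> tryptophan
print(d_to_e)  # 1
print(d_to_w)  # 3
```

D → E needs a single point mutation, whereas D → W needs all three positions to change.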

Slide29

Substitution matrices

small, polar

small, nonpolar

polar or acidic

basic

large, hydrophobic

aromatic

Zvelebil, Baum, Understanding bioinformatics

Positive score – frequency of substitutions is greater than would have occurred by random chance.

Zero score – frequency is equal to that expected by chance.

Negative score – frequency is less than would have occurred by random chance.
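The sign convention above follows naturally if a score is the logarithm of the ratio between the observed substitution frequency and the frequency expected by chance, which is one common way such matrices are built. The sketch below assumes that log-odds form; the frequencies are invented for illustration and `log_odds` is a hypothetical helper.

```python
import math


def log_odds(observed: float, expected: float) -> float:
    """Log-odds score: log2 of observed over chance-expected frequency."""
    return math.log2(observed / expected)


expected = 0.01  # hypothetical frequency expected by random chance

more_than_chance = log_odds(0.04, expected)   # substitution seen more often
equal_to_chance = log_odds(0.01, expected)    # substitution seen as expected
less_than_chance = log_odds(0.005, expected)  # substitution seen less often
print(more_than_chance > 0, equal_to_chance == 0, less_than_chance < 0)
```

Published matrices typically scale and round such values to integers; that step is omitted here.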

Slide30

Sequence database search

BLAST

the Google of the sequence world

Slide31

Phylogenetic analysis

Slide32

Structural bioinformatics

the function of a chemical moiety is given by its structure

while DNA structure is “given” (double-helix), RNA and proteins can adopt very different conformations (i.e. specific arrangements of atoms in 3D space)

structural bioinformatics covers

analysis of nucleic acid (NA) and protein structure

prediction of structure from the sequence

Slide33

Protein structure prediction

secondary structure prediction

the conformational state of each residue is predicted as H (helix), E (extended, β-sheet), C (coil)

accuracy: 80%

tertiary structure prediction

why?

many sequences are known, not that many 3D structures have been solved

some proteins (e.g. transmembrane) are difficult to characterize experimentally

many proteins have known function, but unknown structure (which is however needed to understand their behavior in detail)

ab initio, threading, homology modelling
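The accuracy figure quoted for secondary structure prediction is a per-residue measure over the three states H, E and C (often called Q3). A minimal sketch of how such a number is computed follows; both state strings are invented for illustration and `q3_accuracy` is a hypothetical helper name.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state (H, E or C) is correct."""
    if len(predicted) != len(observed):
        raise ValueError("state strings must have equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


# Hypothetical 20-residue example, one state letter per residue.
observed = "CCHHHHHHCCEEEEECCCCC"
predicted = "CCCHHHHCCCEEEECCCCCH"
accuracy = q3_accuracy(predicted, observed)
print(accuracy)  # 0.8
```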

Slide34

CASP

Critical Assessment of Structure Prediction

http://predictioncenter.org/

since 1994, every two years, CASP10 in preparation

predict solved, but not publicly released structures

competition of individual groups in 3D prediction:

human groups – answer in 14 days

software (automated prediction) – answer in 48 hours