Alignment & Secondary Structure

conchita-marotz . @conchita-marotz
Uploaded On 2017-12-15

Presentation Transcript

Slide1

Alignment & Secondary Structure

You have learned about:

Data & databases

Tools

Amino Acids

Protein Structure

Today we will discuss:

Aligning sequences

After this:

You know how to perform structural alignments

You are ready to apply this knowledge in your bioinformatics research project!

Slide2

©CMBI 2011

Why align sequences?

The problem:

There are lots of sequences with unknown structure and/or function

There are a few sequences with known structure and/or function

Alignment can help:

If one of the aligned sequences has a known structure/function, then the alignment gives us insight into structural and/or functional aspects of the other aligned sequence(s)

Transfer of information!

Slide3

Sequence Alignment (1)

A sequence alignment is a representation of a whole series of evolutionary events, which left traces in the sequences.

The purpose of a sequence alignment is to line up all residues in the sequences that were derived from the same residue position in the ancestral gene or protein.

Slide4

©CMBI 2009

Sequence Alignment (2)

gap = insertion or deletion (indel)

[Figure: sequence labels A, B, B, A]

Slide5

Structural alignment

To carry over structural information, we need a structural alignment.

The implicit meaning of placing amino acid residues below each other in the same column of a protein (multiple) sequence alignment is that they are at the equivalent position in the 3D structures of the corresponding proteins!

Slide6

Examples

1) The 3 active site residues, H, D, S, of the serine protease we saw earlier

2) Cysteine bridges (disulfide bridges):

STCTKGALKLPVCRK
TSCTEG--RLPGCKR

Slide7

Transfer of information

Such information can be:

Phosphorylation sites

Glycosylation sites

Stabilizing mutations

Membrane anchors

Ion binding sites

Ligand binding residues

Cellular localization

Typically what one finds in the feature (FT) records of Swissprot!
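
Transferring such a feature through an alignment comes down to finding which residue of the second sequence sits in the same column as the annotated residue of the first. A minimal Python sketch of that lookup, reusing the cysteine-bridge alignment from the Examples slide. The function name `transfer_position` and the annotation "bridge between Cys 3 and Cys 13" are assumptions made for illustration, not data taken from Swissprot.

```python
def transfer_position(aligned_src, aligned_dst, src_pos):
    """Map a 1-based residue number in the source sequence to the
    1-based residue number in the same alignment column of the
    destination sequence. Returns None if that column is a gap."""
    if len(aligned_src) != len(aligned_dst):
        raise ValueError("aligned sequences must have equal length")
    src_count = dst_count = 0
    for a, b in zip(aligned_src, aligned_dst):
        if a != "-":
            src_count += 1
        if b != "-":
            dst_count += 1
        # Stop at the column that holds the requested source residue.
        if a != "-" and src_count == src_pos:
            return dst_count if b != "-" else None
    raise ValueError("src_pos is outside the source sequence")


# Alignment from the cysteine-bridge example.
known = "STCTKGALKLPVCRK"
query = "TSCTEG--RLPGCKR"

# Hypothetical annotation on the known sequence: bridge between Cys 3 and Cys 13.
bridge = [3, 13]
mapped = [transfer_position(known, query, p) for p in bridge]
print(mapped)
```

Residue numbers shift wherever a gap occurs, so the second cysteine gets a different number in the shorter sequence. A residue that faces a gap (for instance residue 7 of the known sequence) has no counterpart, and nothing can be transferred for it.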

Slide8

Significance of alignment

One can only transfer information if the similarity is significantly high between the two sequences.

The “threshold curve” for transferring structural information from one known protein structure to another protein sequence:

If the sequences are > 80 aa long, then > 25% sequence identity is enough to reliably transfer structural information.

Structure is much more conserved than sequence!
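
The rule of thumb above can be checked mechanically once an alignment exists. The sketch below is a simplification under stated assumptions: percent identity is computed over the columns where both sequences have a residue (other definitions exist), "length" is taken to be the number of such columns, and the full threshold curve is reduced to the single 80 aa / 25% cut-off quoted on the slide. The function names are illustrative.

```python
def percent_identity(aligned_a, aligned_b):
    """Return (percent identity, number of compared columns).
    Only columns without a gap in either sequence are compared
    (an assumed definition; others are in use)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0, 0
    identical = sum(1 for a, b in pairs if a == b)
    return 100.0 * identical / len(pairs), len(pairs)


def can_transfer_structure(aligned_a, aligned_b,
                           min_length=80, min_identity=25.0):
    """Apply the rule of thumb: > 80 aligned residues and > 25% identity."""
    identity, n_aligned = percent_identity(aligned_a, aligned_b)
    return n_aligned > min_length and identity > min_identity


# The short cysteine-bridge fragment: high identity, but far too short.
identity, n_aligned = percent_identity("STCTKGALKLPVCRK", "TSCTEG--RLPGCKR")
print(round(identity, 1), n_aligned)
print(can_transfer_structure("STCTKGALKLPVCRK", "TSCTEG--RLPGCKR"))
```

The fragment shows why both conditions matter: a high identity over a handful of residues says little, which is why the length requirement is part of the rule.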

Slide9

Significance of alignment (2)

Slide10

Aligning sequences by hand

Examples: which is the better alignment (left or right)?

1)

CPISRTWASIFRCW    CPISRTWASIFRCW
CPISRT---LFRCW    CPISRTL---FRCW

2)

CPISRTRASEFRCW    CPISRTRASEFRCW
CPISRTK---FRCW    CPISRT---KFRCW

Slide11

Aligning sequences by hand (2)

The procedure for aligning depends on the information available:

In most cases you will start with an alignment program (e.g. CLUSTAL).

Then use your knowledge of the amino acids to improve the alignment, for instance by correcting the position of gaps.

Also make explicit use of the secondary structure preferences of the amino acids, especially for N-termini of helices and beta-turns.

Use 3D information if one or more of the structures in the alignment are known.

Slide12

Helix

Slide13

Positional preferences in helices (1)

Dataset of good helices from PDB files

Count all Asp residues in & before helices

Identify preferential positions for Asp residues

      -4   -3   -2   -1    1    2    3    4    5  total
       -    -    -    -    H    H    H    H    H
ASP   98  110  121  260   98  197  167   49   86   1186

[Figure label: Position 1 in helix]

Slide14

Aligning 2 sequences when sequence identity is low

Helix 1: S G V S P D Q L A A L K L I L E L A L K

Helix 2: G T S L E T A L L M Q I A Q K L I A G

Slide15

Positional preferences in helices (2)

Fill this table for all 20 amino acids

Use this information when aligning helices that have a low percentage of sequence identity

      -4   -3   -2   -1    1    2    3    4    5  total
       -    -    -    -    H    H    H    H    H
ALA  143  148   99   58  189  205  187  241  268   1538
CYS   24   31   29   22   14   17   18   33   17    205
ASP   98  110  121  260   98  197  167   49   86   1186
GLU   91  100   71   71  152  287  269   70  147   1258
(…)
TRP   29   25   29   14   30   26   28   30   29    240
TYR   66   65   75   33   58   44   56   72   48    517

[Figure label: Position 1 in helix]

Slide16

Protein threading

The word threading implies that one drags the sequence (ACDEFG...) step by step through each location on the template
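
The step-by-step dragging can be illustrated with a toy example. Real threading scores a sequence against the environments of a 3D template; the sketch below only borrows the sliding idea and, as an assumption made purely for illustration, scores each placement with the positional helix counts for the six amino acids shown in the preference table (ALA, CYS, ASP, GLU, TRP, TYR). Residues missing from that partial table contribute nothing. The log-odds scoring and all names are hypothetical choices, not a method given in the slides.

```python
import math

# Counts per position, in the table's column order: -4 -3 -2 -1 1 2 3 4 5.
COUNTS = {
    "A": [143, 148, 99, 58, 189, 205, 187, 241, 268],
    "C": [24, 31, 29, 22, 14, 17, 18, 33, 17],
    "D": [98, 110, 121, 260, 98, 197, 167, 49, 86],
    "E": [91, 100, 71, 71, 152, 287, 269, 70, 147],
    "W": [29, 25, 29, 14, 30, 26, 28, 30, 29],
    "Y": [66, 65, 75, 33, 58, 44, 56, 72, 48],
}
# Sequence offsets relative to the residue placed at helix position 1.
OFFSETS = [-4, -3, -2, -1, 0, 1, 2, 3, 4]


def placement_score(seq, start):
    """Score the placement that puts seq[start] at helix position 1."""
    score = 0.0
    for column, offset in enumerate(OFFSETS):
        counts = COUNTS.get(seq[start + offset])
        if counts is None:
            continue  # residue not in the partial table: no evidence
        expected = sum(counts) / len(counts)  # uniform expectation
        score += math.log(counts[column] / expected)
    return score


def best_helix_start(seq):
    """Slide the 9-position window along seq and return
    (index of the residue at helix position 1, score) of the best placement."""
    candidates = range(4, len(seq) - 4)
    if not candidates:
        raise ValueError("sequence too short for the 9-position window")
    best_score, best_start = max((placement_score(seq, s), s)
                                 for s in candidates)
    return best_start, best_score


start, score = best_helix_start("SGVSPDQLAALKLILELALK")
print(start, round(score, 2))
```

With the complete 20-residue table every position in the window would contribute to the score; with only six residues tabulated the result mostly reflects where Asp, Glu and Ala fall.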

Slide17

Aligning 2 helices when sequence identity is low

S G V S P D Q L A A L K L I L E L A L K

[Figure: numbers between -4 and 5 stacked in columns below the residues; column layout not recoverable]

G T S L E T A L L M Q I A Q K L I A G

[Figure: numbers between -4 and 5 stacked in columns below the residues; column layout not recoverable]

Slide18

Aligning 2 helices when sequence identity is low

S G V S P D Q L A A L K L I L E L A L K

[Figure: numbers between -4 and 5 stacked in columns below the residues, some of them highlighted; column layout not recoverable]

G T S L E T A L L M Q I A Q K L I A G

[Figure: numbers between -4 and 5 stacked in columns below the residues, some of them highlighted; column layout not recoverable]

Final alignment:

S G V S P D Q L A A L K L I L E L A L K
- G T S L E T A L L M Q I A Q K L I A G

Slide19

Use of 3D structure info (1)

If you know that in structure 1 the Ala is pointing outside and the Ser is pointing inside:

Where does the Arg in structure 2 go?

(and what will CLUSTAL choose?)

[Figure: panels labeled A and B]

Slide20

Use of 3D-structure info (2)

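Sequence A:  FDICRLPGSAEAV

Sequence B1: FNVCRMP---EAI

Sequence B2: FNVCR---MPEAI

[Figure: residues of sequence A (F-D-...-A-V) and two placements of the sequence B residues V, C, R, M, P, E; an arrow marks the correct alignment, but which one is not recoverable from the transcript]

Slide21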

What you have learned today

A good sequence alignment is necessary for carrying over information between proteins.

Putting amino acids below each other in a sequence alignment implies that you predict that they are at equivalent positions in both proteins.

Alignments can be optimized by using:

secondary structure preferences (especially for helix positioning and prediction of beta-turns)

3D structure info

If the aligned sequences are > 80 aa long, then > 25% sequence identity is enough to reliably transfer structural information.

Slide22

Alignment videos

Swift.cmbi.ru.nl/teach/B1M => Seminars => Link to Aligning video page

Slide23

You are ready to…

Apply these lessons to the practical exercises

Perform your own bioinformatics research project!

Take home lesson:

Please remember to always use all structural information available to you to optimize a sequence alignment. This can be real 3D data, but can also be “just” your own knowledge about the properties and preferences of the amino acids.