
Fission yeast Schizosaccharomyces - PowerPoint Presentation

ava
Uploaded On 2022-05-17



Presentation Transcript

Slide1

Fission yeast: Schizosaccharomyces pombe

Budding yeast: Saccharomyces cerevisiae (sugar fungus)

Proteins dictate function in an organism: what happens as proteins evolve?

In our project, we'll be determining if functional homologs of S. cerevisiae Met proteins are present in S. pombe

Slide2

This semester: Five genes from S. pombe will be transferred to S. cerevisiae

What organism should the class study after we finish S. pombe genes?

A look at the molecular phylogeny should help

Slide3

Is there any correlation between the kinds of amino acid substitutions observed over evolution and their chemistry?

How are bioinformatics tools used to analyze the conservation of protein sequences?

How can I identify regions of proteins that are most strongly conserved and most likely to be important for function?

Slide4

To maintain their function, proteins cannot tolerate drastic changes to their shapes

Amino acid substitutions that significantly perturb the structure of a protein or alter its chemistry can cause the protein to lose function

Met16p from S. cerevisiae complexed with PAP (PDB: 2OQ2)

Slide5

Recall that the final folded form of a protein is determined by its primary sequence

R (“reactive”) groups form a variety of bonds important for structure and function

Slide6

Custom view of Met16p highlights Cys

Protein: backbone view

PAP: ball-and-stick

Cysteine: space-fill

Cys-254 is in close proximity to the end-product, PAP, suggesting that it plays a role in catalysis

Cysteine is one of the most evolutionarily constrained amino acids

Slide7

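[Figure: diagram grouping the amino acids by R-group properties. Region labels: Acidic, Basic, Charged, Polar, Small, Neutral, Aromatic, Hydrophobic]

Acidic: Glu (E), Asp (D)

Basic: Arg (R), Lys (K), His (H)

Other residues placed in the diagram: Asn (N), Gln (Q), Thr (T), Gly (G), Cys (C), Ser (S), Ala (A), Tyr (Y), Val (V), Ile (I), Leu (L), Met (M), Pro (P), Trp (W), Phe (F)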

Amino acids can be grouped according to the chemistry and size of their R groups

Slide8

Most amino acids are abbreviated by their first letter (abundant, hydrophobic ones get preference):

A   Ala   alanine
C   Cys   cysteine
G   Gly   glycine
H   His   histidine
I   Ile   isoleucine
L   Leu   leucine
M   Met   methionine
P   Pro   proline
S   Ser   serine
T   Thr   threonine
V   Val   valine

Phonetic abbreviations:

F   Phe   phenylalanine
R   Arg   arginine

Oddballs (charged, aromatic, some polar):

D   Asp   aspartic acid
E   Glu   glutamic acid
K   Lys   lysine
N   Asn   asparagine
Q   Gln   glutamine
W   Trp   tryptophan
Y   Tyr   tyrosine

The one-letter code needs to be part of a 21st-century biologist's vocabulary
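As a small aid for practicing the one-letter code, the table can be written as a lookup. This is only an illustrative sketch: the names `ONE_TO_THREE` and `to_three_letter` are made up for this example and do not come from the presentation.

```python
# One-letter to three-letter amino acid codes (the 20 standard amino acids).
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_three_letter(sequence):
    """Spell out a one-letter protein sequence using three-letter codes."""
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())


print(to_three_letter("EAGLES"))  # Glu-Ala-Gly-Leu-Glu-Ser
```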

Slide9

A substitution matrix assigns scores for substitutions:

Maximum scores go to the same amino acid (completely conserved, possibly essential)

Positive scores are awarded for common amino acid substitutions, in decreasing order, based on their occurrence in proteins

Negative scores indicate unlikely substitutions

BLOSUM62 (BLOcks SUbstitution Matrix) was derived from substitutions observed in aligned blocks of related proteins, with sequences at least 62% identical clustered together so that close relatives are not over-counted

Studying the evolutionary conservation of amino acids in sequences provides a sense of the importance of the amino acid to protein function

Note the high score for Cys!

The biochemical connection: higher scores are frequently correlated with conservative amino acid substitutions, based on amino acid chemistry and size
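To make the scoring pattern concrete, the sketch below looks up a few BLOSUM62 entries. Only a small hand-copied subset of the matrix is included, and the names `BLOSUM62_SUBSET` and `score` are invented for this illustration.

```python
# A small hand-copied subset of BLOSUM62 (not the full 20 x 20 matrix).
# Keys are alphabetically sorted residue pairs because the matrix is symmetric.
BLOSUM62_SUBSET = {
    ("C", "C"): 9,   # identical Cys: a very high diagonal score
    ("W", "W"): 11,  # identical Trp: the highest score in the matrix
    ("L", "L"): 4,   # identical Leu
    ("I", "L"): 2,   # conservative: both hydrophobic
    ("D", "E"): 2,   # conservative: both acidic
    ("K", "R"): 2,   # conservative: both basic
    ("G", "I"): -4,  # unlikely substitution
    ("D", "L"): -4,  # unlikely substitution: acidic vs hydrophobic
}


def score(a, b):
    """Look up the substitution score for residues a and b (order-independent)."""
    return BLOSUM62_SUBSET[tuple(sorted((a, b)))]


for pair in [("C", "C"), ("L", "I"), ("E", "D"), ("L", "D")]:
    print(pair, score(*pair))
```

Identical residues score highest, chemically similar pairs score modestly positive, and chemically dissimilar pairs score negative.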

Slide10

Is there any correlation between the kinds of amino acid substitutions observed over evolution and their biochemistry?

How are bioinformatics tools used to analyze the conservation of protein sequences?

How can I identify regions of proteins that are most strongly conserved and most likely to be important for function?

Slide11

BLAST

BLAST is an acronym for Basic Local Alignment Search Tool, a computer algorithm for finding homologous sequences in databases

BLASTN compares nucleic acid sequences

BLASTP compares protein sequences

BLOSUM62 is the default scoring matrix for BLASTP

Slide12

$P_{ij}$ is the observed frequency of two amino acids (i and j) replacing each other in homologous sequences

$Q_i$ and $Q_j$ are the probabilities of finding i and j randomly in a sequence

$$\text{Score} = k \log_{10}\left(\frac{P_{ij}}{Q_i \, Q_j}\right)$$

k is a scaling factor used to produce integral values

BLOSUM62 scores relate the observed frequency of a particular substitution to the probability that it occurs by chance, using aligned blocks of related proteins in which sequences at least 62% identical are clustered together

Slide13

Positive and negative scores suggest amino acid changes have been selected for (positive) or against (negative) during evolution

Magnitude of the score suggests the strength of the selection

Score of zero suggests that a particular substitution can be explained by chance alone
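The three cases (positive, zero, negative) can be checked with a short calculation. This is a sketch under stated assumptions: the frequencies are made-up numbers chosen only to show each case, and the scaling factor `k=10.0` is an arbitrary choice, not the value used to build BLOSUM62.

```python
import math


def log_odds_score(p_ij, q_i, q_j, k=10.0):
    """Score = k * log10(P_ij / (Q_i * Q_j)), rounded to the nearest integer."""
    return round(k * math.log10(p_ij / (q_i * q_j)))


# Hypothetical background frequencies: chance pairing = 0.05 * 0.04 = 0.002
q_i, q_j = 0.05, 0.04

print(log_odds_score(0.008, q_i, q_j))   # observed more often than chance: positive
print(log_odds_score(0.002, q_i, q_j))   # observed as often as chance: zero
print(log_odds_score(0.0005, q_i, q_j))  # observed less often than chance: negative
```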

Slide14

BLASTP begins with a query sequence (e.g. your MET sequence)

The query sequence is broken into "words" that will act as seeds in alignments

BLAST searches for matches (or synonyms) in target entries in the database

If a target entry has two or more matches to "words" from the query, the alignment is extended in both directions looking for additional similarity

[Figure: a query divided into words, and target sequences carrying word matches]

Slide15

"Words" are integral to the BLASTP search

BLASTP uses a sliding window to identify words

Consider the sequence:

E A G L E S

BLASTP would break this down into a series of four 3-letter words:

E A G    A G L    G L E    L E S
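The sliding window can be sketched in a few lines. The function name `words` and its `size` parameter are chosen for this illustration only.

```python
def words(sequence, size=3):
    """Slide a window of `size` residues along the sequence, one residue at a time."""
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


print(words("EAGLES"))  # ['EAG', 'AGL', 'GLE', 'LES']
```

A sequence of length n yields n - 2 overlapping 3-letter words, which is why the six-letter example produces four.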

Tip! Use a monospaced (non-proportional) font such as Courier when working with database entries.

The fonts are uglier, but the letters have a constant spacing that generates nice columns!

Next: words are given a numerical score

Slide16

BLASTP uses the BLOSUM62 matrix as its default for assigning values to words

E A G: 5 + 4 + 6 = 15

A G L: 4 + 6 + 4 = 14

G L E: 6 + 4 + 5 = 15

L E S: 4 + 5 + 4 = 13

BLASTP next checks for word synonyms (1-letter replacements) with a score greater than a default threshold of 10

Query word    Synonyms (score)
E A G         K A G (11), E S G (12), E C G (11), E T G (11), E V G (11)
A G L         S G L (11), A G I (12)
G L E         G I E (13), G L D (12), G L Q (12)
L E S         I E S (11)

BLASTP will search for all of these words and synonyms in the protein database

Of the 57 possible 1-letter synonyms for each word (3 positions × 19 alternative amino acids), only a small handful are statistically likely to appear in homologous proteins
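The word scores and the synonym check can be reproduced with a short sketch. It assumes the simplified rule described on this slide (1-letter replacements scoring above 10) and embeds only the five hand-copied BLOSUM62 rows needed for the residues in EAGLES; the names `ROWS`, `word_score`, and `synonyms` are invented for this example.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# BLOSUM62 rows for the residues in EAGLES, hand-copied.
# Column order follows AMINO_ACIDS.
ROWS = {
    "E": [-1, 0, 0, 2, -4, 2, 5, -2, 0, -3, -3, 1, -2, -3, -1, 0, -1, -3, -2, -2],
    "A": [4, -1, -2, -2, 0, -1, -1, 0, -2, -1, -1, -1, -1, -2, -1, 1, 0, -3, -2, 0],
    "G": [0, -2, 0, -1, -3, -2, -2, 6, -2, -4, -4, -2, -3, -3, -2, 0, -2, -2, -3, -3],
    "L": [-1, -2, -3, -4, -1, -2, -3, -4, -3, 2, 4, -2, 2, 0, -3, -2, -1, -2, -1, 1],
    "S": [1, -1, 1, 0, -1, 0, 0, 0, -1, -2, -2, 0, -1, -2, -1, 4, 1, -3, -2, -2],
}
SCORE = {query: dict(zip(AMINO_ACIDS, row)) for query, row in ROWS.items()}


def word_score(query_word, candidate):
    """Sum the substitution scores position by position."""
    return sum(SCORE[q][c] for q, c in zip(query_word, candidate))


def synonyms(query_word, threshold=10):
    """Return 1-letter replacements of query_word scoring above the threshold."""
    found = {}
    for pos, original in enumerate(query_word):
        for aa in AMINO_ACIDS:
            if aa == original:
                continue
            candidate = query_word[:pos] + aa + query_word[pos + 1:]
            s = word_score(query_word, candidate)
            if s > threshold:
                found[candidate] = s
    return found


for word in ["EAG", "AGL", "GLE", "LES"]:
    print(word, word_score(word, word), synonyms(word))
```

Because every 1-letter replacement is tested, the output may include synonyms beyond the sample shown on the slide.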

Slide17

Sequences must have at least two words for further consideration

BLASTP uses word matches as a nucleus and extends them in both directions, looking for additional similarity

As BLASTP extends the alignment out from the match, it calculates a running score – extension stops when the score drops below a threshold value

Penalties are assigned for gaps and mismatches

Plus signs in summary line indicate a positive BLOSUM62 value

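[Figure: example alignment extended in both directions from the word match G L E, which is labeled as the original search word on the target sequence]

Query:   Q A S T L Y E - A G L E S E A T T N - - R R E I
Target:  N A A T Y W D A S G L E S - - - S Q I I R K E L

Summary line symbols, in order: + A + T + + + G L E S E A + + R + E +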
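The extension step can be sketched as follows. This is a simplified illustration, not BLASTP's actual implementation: it extends only to the right, ignores gaps, uses made-up match and mismatch scores in place of BLOSUM62, and stops once the running score falls more than `x_drop` below the best score seen so far. The function name, its parameters, and both sequences are hypothetical.

```python
def extend_right(query, target, q_next, t_next, seed_score,
                 x_drop=5, match=4, mismatch=-2):
    """Ungapped extension to the right of a seed word.

    q_next and t_next are the indices of the first residue after the seed
    in the query and target. Returns (best_score, residues_added).
    """
    best = running = seed_score
    best_len = 0
    i = 0
    while q_next + i < len(query) and t_next + i < len(target):
        running += match if query[q_next + i] == target[t_next + i] else mismatch
        i += 1
        if running > best:
            best, best_len = running, i      # new best: remember how far we got
        elif best - running > x_drop:
            break                            # score has dropped too far: stop
    return best, best_len


query = "AGLESEATTNRR"   # hypothetical sequences sharing the seed word GLE
target = "SGLESEAQWPKL"
print(extend_right(query, target, q_next=4, t_next=4, seed_score=15))
```

With these inputs the extension adds the three matching residues S, E, A and then gives up after three consecutive mismatches.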

Slide18

Is there any correlation between the kinds of amino acid substitutions observed over evolution and their biochemistry?

How are bioinformatics tools used to analyze the conservation of protein sequences?

How can I identify regions of proteins that are most strongly conserved and most likely to be important for function?

Slide19

Highly conserved protein sequences are often essential for function

You will compare sequences of homologous proteins from model organisms

Escherichia coli K-12 (gram negative)

Caenorhabditis elegans

Mus musculus

Arabidopsis thaliana

Bacillus subtilis str. 168 (gram positive)

Slide20

Phylogeny.fr provides tools for preparing multiple sequence alignments and phylogenetic trees

Slide21

Multiple sequence alignments show regions of conservation

Identical amino acids are shown in blue – conservative changes in grey
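One simple way to see which columns of an alignment are conserved is to measure how often the most common residue occurs in each column. The sketch below uses a made-up toy alignment and an invented function name; it is not the method used by Phylogeny.fr.

```python
from collections import Counter

# Hypothetical toy alignment: rows of equal length, '-' marks a gap.
alignment = [
    "MKCLGAE",
    "MRCLGSE",
    "MKCIG-E",
    "MKCLGAD",
]


def column_conservation(rows):
    """For each column, return (most common symbol, fraction of rows sharing it)."""
    result = []
    for column in zip(*rows):
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(rows)))
    return result


for position, (symbol, fraction) in enumerate(column_conservation(alignment), start=1):
    marker = "*" if fraction == 1.0 else " "  # '*' flags fully identical columns
    print(position, symbol, f"{fraction:.2f}", marker)
```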

Slide22

TreeDyn generates a phylogenetic tree

Bootstrap values predict reliability of nodes in the tree (max = 1.0)

Length of branches reflects time since divergence from a node

Length corresponds to 600 million years

Slide23

The WebLogo program provides a graphical depiction of multiple sequence alignments

The size of each amino acid letter reflects the frequency with which that amino acid is found at the position. Note the positions of amino acids with high BLOSUM scores
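Letter sizes in a sequence logo can be approximated with a short calculation. The sketch below follows the classic sequence-logo definition, in which each letter's height is its frequency multiplied by the information content of the column in bits. It assumes a 20-letter alphabet, skips gaps, and omits the small-sample correction that logo tools may apply; the function name is invented for this example.

```python
import math
from collections import Counter


def logo_column(column, alphabet_size=20):
    """Return {residue: letter height in bits} for one alignment column."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return {}
    freqs = {r: n / len(residues) for r, n in Counter(residues).items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    information = math.log2(alphabet_size) - entropy  # bits of conservation
    return {r: f * information for r, f in freqs.items()}


print(logo_column("CCCC"))  # fully conserved column: one tall letter
print(logo_column("LLIV"))  # variable column: several shorter letters
```

A fully conserved column produces a single letter at the maximum height of log2(20) bits, while variable columns produce shorter stacks.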