Lecture 6A – Introduction to Trees & Optimality Criteria

goldengirl
Uploaded On 2020-06-17

Presentation Transcript

Slide1

Lecture 6A – Introduction to Trees & Optimality Criteria

Branches: a fully resolved unrooted tree of $n$ taxa has $2n - 3$ branches.

Branches 1, 2, 4, 6, and 7 are external (leaves); 3 and 5 are internal branches (edges).

Nodes A to E are terminals; x, y, and z are internal nodes (vertices).

Slide2

If we break branch 3, we have two sub-trees: (A,B) and (C,(D,E)).

In Newick format, the full tree is ((A,B),C,(D,E)).
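To make the Newick notation concrete, here is a minimal sketch that parses a names-only Newick string and lists the two leaf sets obtained by breaking each internal branch. The function names (`parse_newick`, `leaves`, `internal_splits`) are illustrative choices, not part of the lecture, and the parser assumes well-formed input without branch lengths or internal labels.

```python
def parse_newick(text):
    """Parse a names-only Newick string into nested tuples of leaf names."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            pos += 1  # consume ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos].strip()

    return parse_node()


def leaves(node):
    """Return the set of leaf names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def internal_splits(tree):
    """Return the pair of leaf sets produced by breaking each internal branch."""
    all_leaves = leaves(tree)
    splits = []

    def visit(node):
        if isinstance(node, str):
            return
        for child in node:
            if not isinstance(child, str):
                inside = leaves(child)
                splits.append((inside, all_leaves - inside))
            visit(child)

    visit(tree)
    return splits


tree = parse_newick("((A,B),C,(D,E));")
for inside, outside in internal_splits(tree):
    print(sorted(inside), "|", sorted(outside))
```

For the five-taxon tree this reports two splits, one per internal branch; the split separating A and B from C, D, and E corresponds to breaking branch 3.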

Slide3

Rooting – The tree is an unrooted tree.

Slide4

Also note that there is free rotation around nodes:

[Figure: example trees labeled (1 2), (1 2 3), and (1 2 3 4)]
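One way to see what free rotation means in practice: two Newick descriptions that differ only in the order of children at each node describe the same tree. The sketch below, using nested tuples and a hypothetical helper named `canonical`, sorts the children at every node so that rotated versions compare equal. It handles rotation only; it does not account for different rootings of the same unrooted tree.

```python
def canonical(node):
    """Return a copy of the tree with the children of every node in sorted order."""
    if isinstance(node, str):
        return node
    # repr() gives a consistent sort key for a mix of leaf names and subtrees.
    return tuple(sorted((canonical(child) for child in node), key=repr))


original = (("A", "B"), "C", ("D", "E"))
rotated = (("E", "D"), "C", ("B", "A"))   # same tree, children rotated
different = (("A", "C"), "B", ("D", "E"))  # a genuinely different tree

print(canonical(original) == canonical(rotated))
print(canonical(original) == canonical(different))
```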

Slide5

Growth of tree space.

Slide6

The Scope of the Problem

Taxa       Unrooted Trees
3          1
4          3
5          15
6          105
7          945
8          10,395
9          135,135
10         2.027 × 10^6
22         3 × 10^23
50         3 × 10^74
100        2 × 10^182
1000       2 × 10^2,860
10 mil     5 × 10^68,667,340
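The counts in this table follow the standard formula for fully resolved unrooted trees, $(2n-5)!! = 3 \times 5 \times \dots \times (2n-5)$. The sketch below computes the exact value for small $n$ and a base-10 logarithm for large $n$. The function names are illustrative, and the logarithmic version relies on the identity $(2n-5)!! = (2n-4)! / (2^{n-2}(n-2)!)$.

```python
import math


def num_unrooted_trees(n_taxa):
    """Exact number of fully resolved unrooted trees: (2n - 5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):  # 3, 5, ..., 2n - 5
        count *= k
    return count


def log10_num_unrooted_trees(n_taxa):
    """log10 of (2n - 5)!!, computed with lgamma so that large n stays cheap."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    # lgamma(m + 1) == ln(m!)
    ln_count = (
        math.lgamma(2 * n_taxa - 3)           # ln((2n - 4)!)
        - (n_taxa - 2) * math.log(2)          # ln(2^(n - 2))
        - math.lgamma(n_taxa - 1)             # ln((n - 2)!)
    )
    return ln_count / math.log(10)


for n in (3, 4, 5, 6, 7, 8, 9, 10):
    print(n, num_unrooted_trees(n))
for n in (22, 50, 100, 1000):
    print(n, "about 10^%.2f" % log10_num_unrooted_trees(n))
```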

Slide7

II. Optimality Criteria

A. Parsimony

First, the score of a tree (i.e., its length) for the entire data set is given by:

$L(t) = \sum_{i=1}^{N} w_i \, l_i$

where $N$ is the number of characters, $l_i$ is the length of character $i$ when optimized on tree $t$, and $w_i$ is the weight we assign to character $i$.
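As a small illustration of the weighted sum, the sketch below adds up per-character lengths with optional weights. The function name `tree_length` and the example numbers are hypothetical and chosen only to show the arithmetic.

```python
def tree_length(char_lengths, weights=None):
    """Weighted sum of per-character lengths; weights default to 1 for every character."""
    if weights is None:
        weights = [1] * len(char_lengths)
    if len(weights) != len(char_lengths):
        raise ValueError("need exactly one weight per character")
    return sum(w * l for w, l in zip(weights, char_lengths))


# Hypothetical lengths of three characters on one tree.
lengths = [2, 1, 3]
print(tree_length(lengths))               # equal weights
print(tree_length(lengths, [1, 2, 0.5]))  # unequal character weights
```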

Slide8

The Fitch Algorithm (1971): state sets and accumulated lengths.

(Unordered states with equal transformation costs)

We erect a state set at each terminal node and assign an accumulated length of zero to terminal nodes. The accumulated length at a node is the minimum number of changes in the subtree descending from that node.

Slide9

The Fitch Algorithm: state sets and accumulated lengths.

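1 – Form the intersection of the state sets of the two daughter nodes. If the intersection is non-empty, set the state set of the internal node equal to the intersection. The accumulated length of the internal node is the sum of those of the daughter nodes.

2 – If the intersection is empty, assign the union of the two daughter state sets to the internal node. The accumulated length is the sum of those of the daughter nodes plus one.

Worked example (annotations from the tree figure):

First internal node: the intersection is empty, so take the union; accumulated length 0 + 0 + 1 = 1.

Second internal node: the intersection is non-empty, so take the intersection; accumulated length 0 + 0 + 0 = 0.

Node joining these two: the intersection is empty, so take the union; accumulated length 1 + 0 + 1 = 2.

So $l_i = 2$.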
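The two rules can be sketched as a short recursive function. This is a minimal illustration, not the lecture's own code. It assumes a rooted binary tree written as nested tuples with each terminal given by its observed state, and it assumes the example tree is ((G,A),(C,C)), which is inferred from the values used in the Sankoff example rather than stated on this slide. The name `fitch` is an illustrative choice.

```python
def fitch(node):
    """Return (state_set, accumulated_length) for a node of a binary tree."""
    if isinstance(node, str):
        # Terminal node: state set is the observed state, accumulated length is zero.
        return {node}, 0
    left, right = node
    left_set, left_len = fitch(left)
    right_set, right_len = fitch(right)
    common = left_set & right_set
    if common:
        # Rule 1: non-empty intersection, no extra step.
        return common, left_len + right_len
    # Rule 2: empty intersection, take the union and add one step.
    return left_set | right_set, left_len + right_len + 1


example_tree = (("G", "A"), ("C", "C"))  # assumed tip states
root_states, length = fitch(example_tree)
print(sorted(root_states), length)
```

On the assumed tree, the intermediate results match the three annotations above: lengths 1 and 0 at the two lower nodes, and 2 at the node joining them.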

Slide10

Sankoff Algorithm – Character-state vectors and step matrices.

Step Matrix – define $c_{ij}$, the cost of a change between states $i$ and $j$ (diagonal entries, shown as --, cost 0):

      A    C    G    T
A    --    4    1    4
C     4   --    4    1
G     1    4   --    4
T     4    1    4   --

Step one: Fill in the character-state vectors for terminal nodes.

Each cell is indexed by $s_k(i)$, the cost of having state $i$ at node $k$.

Slide11

Step two: Fill in vectors for other nodes, descending the tree.

Node 1 ($k = 1$):

$s_1(A) = c_{AG} + c_{AA} = 1 + 0 = 1$

$s_1(C) = c_{CG} + c_{CA} = 4 + 4 = 8$

$s_1(G) = c_{GG} + c_{GA} = 0 + 1 = 1$

$s_1(T) = c_{TG} + c_{TA} = 4 + 4 = 8$

Node 2 ($k = 2$):

$s_2(A) = 4 + 4 = 8$

$s_2(C) = 0 + 0 = 0$

$s_2(G) = 4 + 4 = 8$

$s_2(T) = 1 + 1 = 2$

Slide12

For nodes below, we must calculate the cost for each possible state assignment for the daughter nodes, taking the minimum over the daughter state $j$:

$s_3(A) = \min_j[s_1(j) + c_{Aj}] + \min_j[s_2(j) + c_{Aj}]$

$s_3(C) = \min_j[s_1(j) + c_{Cj}] + \min_j[s_2(j) + c_{Cj}]$

$s_3(G) = \min_j[s_1(j) + c_{Gj}] + \min_j[s_2(j) + c_{Gj}]$

$s_3(T) = \min_j[s_1(j) + c_{Tj}] + \min_j[s_2(j) + c_{Tj}]$

So we fill in the character-state vector for node 3. In each bracketed entry, the first addend comes from the daughter node's vector and the second from the step matrix:

$s_3(A) = \min[1, 12, 2, 12] + \min[8, 4, 9, 6] = 1 + 4 = 5$

$s_3(C) = \min[5, 8, 5, 9] + \min[12, 0, 12, 3] = 5 + 0 = 5$

$s_3(G) = \min[2, 12, 1, 12] + \min[9, 4, 8, 6] = 1 + 4 = 5$

$s_3(T) = \min[5, 9, 5, 8] + \min[12, 1, 12, 2] = 5 + 1 = 6$
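The node-by-node calculation can be sketched as follows. This is a minimal illustration under stated assumptions. The tree ((G,A),(C,C)) is inferred from the worked values for nodes 1 and 2. Terminal vectors use the usual convention of cost 0 for the observed state and infinity for every other state, which the transcript does not spell out. The names `STATES`, `COST`, and `sankoff` are illustrative.

```python
import math

STATES = "ACGT"

# Step matrix from the slides: transitions cost 1, transversions cost 4.
COST = {
    "A": {"A": 0, "C": 4, "G": 1, "T": 4},
    "C": {"A": 4, "C": 0, "G": 4, "T": 1},
    "G": {"A": 1, "C": 4, "G": 0, "T": 4},
    "T": {"A": 4, "C": 1, "G": 4, "T": 0},
}


def sankoff(node):
    """Return the character-state vector {state: cost} for a node."""
    if isinstance(node, str):
        # Terminal node: zero cost for the observed state, infinite otherwise.
        return {s: (0 if s == node else math.inf) for s in STATES}
    daughter_vectors = [sankoff(child) for child in node]
    # For each state i at this node, add up, over the daughters, the cheapest
    # combination of daughter cost s(j) and change cost c_ij.
    return {
        i: sum(min(vec[j] + COST[i][j] for j in STATES) for vec in daughter_vectors)
        for i in STATES
    }


example_tree = (("G", "A"), ("C", "C"))  # assumed tip states
root_vector = sankoff(example_tree)
print(root_vector)
print("character length:", min(root_vector.values()))
```

The length of the character is the smallest entry in the vector at the deepest node, which is why the final step takes the minimum.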

Slide13

So, $l_i = 5$.

Points to note:

1) Two types of weighting are possible: weighting of transformations within characters (which we demonstrated with the step matrix) and weighting among characters, which is reflected in the weighted sum of lengths across characters ($w_i$).

2) One can’t compare tree lengths across weighting schemes. In the first example, with all transformations having the same cost, the length of the character on this tree was 2. In the second, with a 4:1 step matrix to weight transversions, the length was 5.