Genome 540: Discussion Section

Uploaded On 2022-08-03

Presentation Transcript

Slide1

Genome 540: Discussion Section Week 5

Dani Faivre

Slide2

Agenda

HW Quick Recap

HW4 Questions?

HW5 Introduction

Slide3

HW comments

Testing

Use test cases with known output

Descriptions:

Descriptions should reflect some annotation from the GenBank files

It may be that there are no overlapping annotations, in which case you do not have to report one

Working together

It's okay to compare final output, just not code

Slide4

HW4 questions?

Notes:

Assume that any input graph text file lists the vertices in depth order

Write your representation of the graph image in depth order

Make sure you write the sequence graph file in depth order

You can write separate functions for parts 1, 2, and 3 instead of programs, but two programs are recommended.

You can round any floating point numbers, but do include at least 2 decimal places!

What if there are multiple highest weighted paths?

Slide5

HW5

Due 11:59pm on Sunday, Feb 14

Assignment: use D-segment algorithm to identify sequence segments with high copy number.

Input:

File with read start counts at each position along a chromosome (Chromosome\tPosition\tScore)

Scoring scheme

Output:

Number of normal and elevated copy-number segments

List of elevated copy-number segments (start, end, score)

Annotations for the first three segments (look up using UCSC genome browser)

Histograms of read-start counts (i.e. number of positions with 0, 1, 2, and >=3 read-starts) for non-elevated and elevated segments
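To make the input format and the histogram output concrete, here is a minimal Python sketch. It assumes the file has no header line and that the third tab-separated column is an integer read-start count; the function names `read_counts` and `read_start_histogram` are made up for illustration and are not part of the assignment template.

```python
import io
from collections import Counter


def read_counts(handle):
    """Yield (chromosome, position, count) from tab-separated lines.

    Assumption: three columns per line, no header, integer counts.
    """
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chrom, pos, count = line.split("\t")
        yield chrom, int(pos), int(count)


def read_start_histogram(counts):
    """Count positions with 0, 1, 2, and >=3 read starts."""
    hist = Counter()
    for c in counts:
        hist[min(c, 3)] += 1  # everything >= 3 goes in one bucket
    return {"0": hist[0], "1": hist[1], "2": hist[2], ">=3": hist[3]}


# Read-start counts from the 14-position worked example in this section,
# written out in the assumed three-column format.
example_counts = [0, 0, 0, 0, 1, 2, 0, 4, 1, 2, 0, 0, 0, 0]
text = "\n".join(f"chr1\t{i}\t{c}" for i, c in enumerate(example_counts, start=1))

records = list(read_counts(io.StringIO(text)))
histogram = read_start_histogram(count for _, _, count in records)
print(histogram)  # {'0': 9, '1': 2, '2': 2, '>=3': 1}
```

For the assignment you would build one such histogram over positions inside elevated segments and another over the remaining positions.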

Slide6

Checking if you match the template

When testing your code on the example, run ‘diff’ between your output and the sample output

> diff your_output.txt example_output.txt

The only differences should be in the header.
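If you would rather run the same check from inside a test script, the comparison can be sketched with Python's standard `difflib` module. The helper name `diff_ignoring_header` and the one-line header length are assumptions for illustration; adjust `header_lines` to match the actual template.

```python
import difflib


def diff_ignoring_header(mine, expected, header_lines=1):
    """Return unified-diff lines for two outputs, skipping the header.

    Assumption: the header occupies the first `header_lines` lines.
    An empty list means the bodies are identical.
    """
    a = mine.splitlines()[header_lines:]
    b = expected.splitlines()[header_lines:]
    return list(
        difflib.unified_diff(
            a, b, fromfile="your_output", tofile="example_output", lineterm=""
        )
    )


# Hypothetical outputs that differ only in the header line.
mine = "Assignment: HW5 (student A)\n5 10 4.44\n"
expected = "Assignment: HW5 (example)\n5 10 4.44\n"
print(diff_ignoring_header(mine, expected))  # []
```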

Slide7

Diff Example

Slide8

Diff Example

diff -y file1.txt file2.txt

I need to go to the store.              I need to go to the store.
I need to buy some apples.              I need to buy some apples.
                                      > Oh yeah, I also need to buy grated cheese.
When I get home, I'll wash the dog.     When I get home, I'll wash the dog.

Slide9

Maximal segment vs. Maximal D-segment

Maximal segment:

No subsegment has a higher score

No segment properly containing the segment satisfies the above condition

Maximal D-segment:

No subsegment has score < D, where D is the dropoff value

No D-segment properly containing the D-segment satisfies the above condition

The segment score must be >= S, where S >= -D
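The dropoff and threshold conditions can be checked by brute force on a small candidate segment. The Python sketch below does this with a single scan for the lowest-scoring subsegment. It is only an illustration: it does not test the maximality condition, and the function names are hypothetical. The scores are those of positions 5 to 10 in the worked example later in this section, where D = -3 and S = 3.

```python
def min_subsegment_sum(scores):
    """Lowest score of any non-empty contiguous subsegment.

    One pass: `cur` is the lowest-scoring subsegment ending at the
    current position, `best` the lowest seen anywhere so far.
    """
    best = cur = scores[0]
    for s in scores[1:]:
        cur = min(s, cur + s)
        best = min(best, cur)
    return best


def satisfies_d_and_s(scores, D, S):
    """True if no subsegment scores below D and the total is >= S.

    Note: this does NOT check that the segment is maximal.
    """
    return min_subsegment_sum(scores) >= D and sum(scores) >= S


segment = [0.52, 1.1, -0.5, 1.7, 0.52, 1.1]  # positions 5..10 of the example
print(min_subsegment_sum(segment))           # -0.5
print(satisfies_d_and_s(segment, D=-3, S=3)) # True
```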

Slide10

Slide11 to Slide13

[Figures: plots of cumulative score against sequence position, labeled with S, 0, and D]

Slide14 to Slide43

Worked example. Every slide repeats the same input table and parameters; only the algorithm state changes from slide to slide.

position        1     2     3     4     5     6     7     8     9     10    11    12    13    14
# read starts   0     0     0     0     1     2     0     4     1     2     0     0     0     0
score           -0.5  -0.5  -0.5  -0.5  0.52  1.1   -0.5  1.7   0.52  1.1   -0.5  -0.5  -0.5  -0.5

D = -3

S = 3

State shown on each slide:

Slide(s)   max    start   end   cumul
14         0      1       1     0
15-16      0      2       2     0
17-18      0      3       3     0
19-20      0      4       4     0
21-22      0      5       5     0
23-24      0.52   5       5     0.52
25-26      1.62   5       6     1.62
27-28      1.62   5       6     1.12
29-30      2.82   5       8     2.82
31-32      3.34   5       9     3.34
33-34      4.44   5       10    4.44
35-36      4.44   5       10    3.94
37-38      4.44   5       10    3.44
39-40      4.44   5       10    2.94
41-43      4.44   5       10    2.44

Result on Slide43:

D-segment: 5, 10, 4.44 (start, end, max)
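The worked example above can be reproduced with a short Python sketch. The variable names follow the slides (`cumul`, `start`, `end`; `best` stands in for `max`). The slides show the state being reset when the cumulative score falls to 0 or below; resetting on a drop of |D| below the running maximum and reporting at the end of the sequence are assumptions inferred from the D-segment definition and from the final slide. Treat this as one possible implementation, not the official course pseudo-code.

```python
def d_segments(scores, D, S):
    """Return (start, end, max_score) for each reported segment.

    Positions are 1-based, as on the slides. D is the (negative)
    dropoff and S the minimum score for a segment to be reported.
    """
    segments = []
    cumul = best = 0.0
    start = end = 1
    n = len(scores)
    for i, s in enumerate(scores, start=1):
        cumul += s
        if cumul >= best:  # new running maximum: extend the segment to i
            best, end = cumul, i
        # Assumed stopping rule: score back at or below zero, dropped
        # by |D| from the maximum, or reached the end of the sequence.
        if cumul <= 0 or cumul <= best + D or i == n:
            if best >= S:
                segments.append((start, end, round(best, 2)))
            cumul = best = 0.0
            start = end = i + 1  # restart just after the current position
    return segments


scores = [-0.5, -0.5, -0.5, -0.5, 0.52, 1.1, -0.5,
          1.7, 0.52, 1.1, -0.5, -0.5, -0.5, -0.5]
print(d_segments(scores, D=-3, S=3))  # [(5, 10, 4.44)]
```

The scan makes a single left-to-right pass and never revisits earlier positions, so the running time is linear in the sequence length.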

Slide44

Pseudo-code for the D-segment algorithm: