MCB3895-004 Lecture #11


Added : 2016-03-13 Views :15K


Slide1

MCB3895-004 Lecture #11, Sept 30/14

De novo genome assembly using Velvet; using Perl to execute system commands

Slide2

Velvet

Velvet was one of the first widely used de novo assembly programs

http://www.ebi.ac.uk/~zerbino/velvet/Manual.pdf

It is still quite common, although frequently superseded

Slide3

Velvet step #1: De Bruijn graph construction

http://en.wikipedia.org/wiki/Velvet_assembler

Red bases = sequencing error
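As a rough illustration of graph construction, the sketch below breaks reads into k-mers and links k-mers that follow each other in a read. It is a toy, not Velvet's implementation: the sub name build_graph and the two example reads are made up for illustration, and reverse complements, which a real assembler has to handle, are ignored. The second read differs from the first by one base, standing in for a sequencing error.

```perl
use strict;
use warnings;

# Build a toy de Bruijn graph: each k-mer is a node, and an edge joins
# two k-mers that appear consecutively in a read (overlap of k-1 bases).
# Returns (\%count, \%edges): how often each k-mer was seen, and the
# successors of each k-mer.
sub build_graph {
    my ($k, @reads) = @_;
    my (%count, %edges);
    for my $read (@reads) {
        my $prev;
        for my $i (0 .. length($read) - $k) {
            my $kmer = substr($read, $i, $k);
            $count{$kmer}++;
            $edges{$prev}{$kmer} = 1 if defined $prev;
            $prev = $kmer;
        }
    }
    return (\%count, \%edges);
}

# Hypothetical reads: the second has one base changed (T -> A).
my ($count, $edges) = build_graph(3, "ACGTAC", "ACGAAC");
for my $from (sort keys %$edges) {
    my @to = sort keys %{ $edges->{$from} };
    print "$from (seen $count->{$from}x) -> @to\n";
}
```

The changed base makes the graph branch at the k-mer shared by both reads, which is the kind of structure the later cleanup steps deal with.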

Slide4

Velvet step #2: graph simplification

Unbranched paths collapsed into single nodes

http://en.wikipedia.org/wiki/Velvet_assembler
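A minimal sketch of collapsing unbranched paths, assuming a toy graph stored as a hash of k-mer successors (the sub name collapse_paths and the example graph are hypothetical, not taken from Velvet). A node is merged into its predecessor only when the link between them is the sole way out of the predecessor and the sole way into the node. Paths that form a closed loop with no branch are skipped in this simplified version.

```perl
use strict;
use warnings;

# Collapse unbranched paths of a toy k-mer graph into single sequences.
# %$edges maps each k-mer to a hash of its successor k-mers.
sub collapse_paths {
    my ($edges) = @_;
    my (%in, %out, %nodes);
    for my $from (keys %$edges) {
        for my $to (keys %{ $edges->{$from} }) {
            push @{ $out{$from} }, $to;
            push @{ $in{$to} },    $from;
            $nodes{$from} = $nodes{$to} = 1;
        }
    }
    # True when $node has exactly one predecessor and that predecessor
    # has exactly one successor, so the two can be joined safely.
    my $mergeable = sub {
        my ($node) = @_;
        my $preds = $in{$node} || [];
        return 0 unless @$preds == 1;
        return @{ $out{ $preds->[0] } } == 1 ? 1 : 0;
    };
    my @paths;
    for my $start (sort keys %nodes) {
        next if $mergeable->($start);       # not the head of a path
        my ($seq, $node) = ($start, $start);
        while (@{ $out{$node} || [] } == 1 && $mergeable->($out{$node}[0])) {
            $node = $out{$node}[0];
            $seq .= substr($node, -1);      # each step adds one new base
        }
        push @paths, $seq;
    }
    return @paths;
}

# Hypothetical graph: ACG branches into two unbranched chains.
my %toy = (
    ACG => { CGT => 1, CGA => 1 },
    CGT => { GTA => 1 }, GTA => { TAC => 1 },
    CGA => { GAA => 1 }, GAA => { AAC => 1 },
);
print "$_\n" for collapse_paths(\%toy);
```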

Slide5

Velvet step #3: tip removal

Dead end paths removed

http://en.wikipedia.org/wiki/Velvet_assembler

Slide6

Velvet step #4: bubble popping

"Bubbles" caused by sequencing error resolved

http://en.wikipedia.org/wiki/Velvet_assembler

Slide7

Running velvet: single end

Step #1: velveth to set up database

$ velveth SRR826450_1 21 -short -fastq SRR826450_1.fastq

Parameters for directory name, kmer, read type, input file name

Step #2: velvetg to make graph

$ velvetg SRR826450_1 -exp_cov auto

Parameters for directory name, automatic coverage detection

Note: the max kmer size is set during software compilation. In our case the max is kmer=61

Note: only odd kmer numbers are allowed
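The two notes above can be checked in a script before any Velvet command is launched. This is a small sketch under stated assumptions: kmer_ok is a hypothetical helper, and the limit of 61 is simply the value quoted on this slide, passed in as a parameter because it depends on how the local copy of Velvet was compiled.

```perl
use strict;
use warnings;

# Hypothetical helper: check a kmer size against the two notes above.
# $max_kmer is whatever limit the local Velvet build was compiled with.
sub kmer_ok {
    my ($kmer, $max_kmer) = @_;
    return 0 if $kmer !~ /^\d+$/;    # must be a whole number
    return 0 if $kmer % 2 == 0;      # only odd values are allowed
    return $kmer <= $max_kmer ? 1 : 0;
}

for my $kmer (20, 21, 61, 63) {
    printf "kmer=%d %s\n", $kmer, kmer_ok($kmer, 61) ? "ok" : "rejected";
}
```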

Slide8

Running velvet: paired end

In a fit of cussedness, velvet requires its paired-end data in a single interleaved file

Use shuffleSequences_fastq.pl

Copy from /opt/bioinformatics2/velvet_1.2.10_31kmer/contrib/shuffleSequences_fastq/ into your directory

Running velvet is otherwise similar

$ velveth SRR826450 21 -shortPaired -fastq SRR826450_paired.fastq

$ velvetg SRR826450 -exp_cov auto
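To make "interleaved" concrete, here is a sketch of what interleaving two FASTQ files means: one 4-line record from the forward file, then its mate from the reverse file, repeated to the end. This is not the code of shuffleSequences_fastq.pl; interleave_fastq and the tiny in-memory reads are invented for illustration, and the sketch assumes both inputs list the pairs in the same order.

```perl
use strict;
use warnings;

# Interleave two FASTQ streams record by record (4 lines per record).
sub interleave_fastq {
    my ($fwd, $rev, $out) = @_;
    while (1) {
        my @r1 = grep { defined } map { scalar(<$fwd>) } 1 .. 4;
        my @r2 = grep { defined } map { scalar(<$rev>) } 1 .. 4;
        last if !@r1 && !@r2;                 # both inputs finished
        die "Unpaired or truncated record\n" if @r1 != 4 || @r2 != 4;
        print {$out} @r1, @r2;
    }
    return;
}

# Demo on in-memory data; real use would open the two FASTQ files instead.
my $fwd_data = "\@read1/1\nACGT\n+\nIIII\n";
my $rev_data = "\@read1/2\nTTGA\n+\nIIII\n";
my $result   = "";
open my $fwd, "<", \$fwd_data or die "Cannot open forward data: $!";
open my $rev, "<", \$rev_data or die "Cannot open reverse data: $!";
open my $out, ">", \$result   or die "Cannot open output: $!";
interleave_fastq($fwd, $rev, $out);
close $out;
print $result;
```

Stopping with an error when one file runs out before the other is a deliberate choice here: a silent mismatch would produce a file in which reads are paired with the wrong mates.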

Slide9

Velvet output

contigs.fa - contains output contigs

If paired end, contigs are joined into scaffolds using strings of Ns

Log - contains what commands have actually been run along with parameters, results summary
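One way to compare assemblies is to summarize each contigs.fa with a few numbers. The sketch below is a hypothetical helper (fasta_stats is not part of Velvet) that reads any FASTA stream and reports the number of sequences, total bases, N50, and the number of Ns, which reflect the gaps inside scaffolds. The demo uses made-up in-memory sequences; real use would open a contigs.fa file instead.

```perl
use strict;
use warnings;

# Summarize a FASTA stream: sequence count, total bases, N50, and Ns.
sub fasta_stats {
    my ($fh) = @_;
    my @lengths;
    my $n_count = 0;
    my $seq     = "";
    my $flush = sub {                 # record the sequence read so far
        return unless length $seq;
        push @lengths, length $seq;
        $n_count += ($seq =~ tr/Nn//);
        $seq = "";
    };
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) { $flush->() } else { $seq .= $line }
    }
    $flush->();
    my $total = 0;
    $total += $_ for @lengths;
    # N50: going from longest to shortest, the length at which the
    # running sum first reaches half of the total bases.
    my ($running, $n50) = (0, 0);
    for my $len (sort { $b <=> $a } @lengths) {
        $running += $len;
        if ($running * 2 >= $total) { $n50 = $len; last }
    }
    return { contigs => scalar(@lengths), total => $total,
             n50 => $n50, ns => $n_count };
}

my $demo = ">a\nACGTNNACGT\n>b\nACGT\n>c\nAC\n";
open my $fh, "<", \$demo or die "Cannot open demo data: $!";
my $s = fasta_stats($fh);
print "contigs=$s->{contigs} total=$s->{total} N50=$s->{n50} Ns=$s->{ns}\n";
```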

Slide10

Today's assignment part 1

Compare a single-end and paired-end assembly of SRR826450

Use a small kmer (e.g., 21) to keep things computationally simple for now

Slide11

Running commands using Perl

Perl can be used to run terminal commands, just like Bash

Use the system command to run whatever follows as a terminal command

e.g.,

system "ls -l > list";
# lists all of the files in the
# directory where the script was
# run and stores this information
# in a new file "list"
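system also reports whether the command worked, which is worth checking before a script moves on to its next step. The sketch below wraps system in a hypothetical helper, run_or_die, that stops the script on failure. It relies on documented Perl behavior: system returns -1 if the command could not be started, otherwise the wait status, where 0 means success, the low 7 bits hold a signal number, and the exit code is the status shifted right by 8 bits.

```perl
use strict;
use warnings;

# Run a terminal command and stop the script if it fails.
sub run_or_die {
    my ($cmd) = @_;
    my $status = system $cmd;
    die "Could not start '$cmd': $!\n" if $status == -1;
    die sprintf("'%s' died from signal %d\n", $cmd, $status & 127)
        if $status & 127;
    die sprintf("'%s' exited with code %d\n", $cmd, $status >> 8)
        if $status != 0;
    return 1;
}

# Same command as above, but the script now stops if ls fails.
run_or_die("ls -l > list");
```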

Slide12

Using system to run repetitive analyses

A useful way to use this sort of command is in a loop to run the same program multiple times using slightly different parameters

e.g.,

for (my $kmer = 15; $kmer <= 31; $kmer = $kmer + 2){
    print "kmer = $kmer\n";
    system "velveth infile_$kmer $kmer -shortPaired -fastq infile";
    system "velvetg infile_$kmer -exp_cov auto";
}

Slide13

Hints for system loops

Print statements can be helpful to keep track of the loop

Run the loop first with all system commands converted to print statements

Run only a single iteration of the entire loop
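These hints can be combined into one script, sketched here under a few assumptions: velvet_commands and the $dry_run switch are hypothetical names, and "infile" is the same placeholder file name used in the loop on the previous slide. With $dry_run set to 1 the commands are only printed. Calling velvet_commands with the same minimum and maximum gives a single iteration.

```perl
use strict;
use warnings;

# Build the Velvet commands for a range of odd kmer sizes.
sub velvet_commands {
    my ($min, $max, $infile) = @_;
    my @cmds;
    for (my $kmer = $min; $kmer <= $max; $kmer += 2) {
        my $dir = "${infile}_$kmer";    # one output directory per kmer
        push @cmds, "velveth $dir $kmer -shortPaired -fastq $infile";
        push @cmds, "velvetg $dir -exp_cov auto";
    }
    return @cmds;
}

my $dry_run = 1;    # set to 0 to actually run the commands
for my $cmd (velvet_commands(15, 31, "infile")) {
    print "$cmd\n";
    next if $dry_run;
    system($cmd) == 0 or die "Command failed: $cmd\n";
}
```

Building the command strings in one place and deciding separately whether to print or run them means the dry run and the real run are guaranteed to use identical commands.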

Slide14

Today's assignment part 2

Answer the question: what is the optimal kmer size to assemble the SRR826450 paired end reads?

For Thursday's class, read and be ready to discuss:

Magoc et al. 2013 Bioinformatics 29:1718-1725

Bradnam et al. GigaScience 2:10
