Genomics Virtual Lab - PowerPoint Presentation

Uploaded On 2015-10-22




Presentation Transcript

Slide1

Genomics Virtual Lab: analyze your data with a mouse click

Igor Makunin
i.makunin@uq.edu.au
QAAFI, UQ, April 8, 2015

Research Computing Centre @ UQ

Slide2

Genomics Virtual Laboratory

Genome-scale experiments are relatively cheap and very popular
- cost of high-throughput sequencing is going down
- available data (genomes, transcripts, etc.)

Analysis of NGS data is a bottleneck (infrastructure, skills)

Genomics Virtual Lab: take the IT out of Bioinformatics
- web-based resources (biologist-friendly)
- DIY bioinformatics environment (for geeks)

GVL advantages:
- public resources (no charges to users)
- available immediately

Slide3

GVL products and services

Genomics Virtual Lab: genome.edu.au

The main aim: facilitate genomics research in Australia

- Galaxy: tutorials and protocols (nextGen sequencing)
- Galaxy for tutorials: galaxy-tut.genome.edu.au
- Galaxy for full-scale analysis: galaxy-qld.genome.edu.au
- "roll your own" Galaxy on the Australian government funded computer infrastructure (NeCTAR cloud) + ipython Notebook + RStudio
- Deploy your own computer cluster (NeCTAR cloud)
- Mirror of UCSC Genome Browser
- RStudio

Learn, Use, Get, Info

Slide4

Galaxy: what does it look like?

[Figure: Galaxy interface, with labels: Tools, Working window, History]

Slide5

Galaxy: possibilities

You can:
- analyze genome-scale nextGen sequencing data without bash scripting
- work with big datasets, genomic regions, sequences, etc.
- create and use workflows (record steps of your analysis)
- share results and workflows with a user or make them available to anyone

Data import:
- upload through the web interface
- ftp (for big datasets)

Public data:
- UCSC Genome Browser
- UCSC Archaea Microbial data
- EBI SRA

Over 2,000 tools available through the Galaxy tool shed

Slide6

Use: local Galaxy-qld server

GVL Galaxy in Queensland: galaxy-qld.genome.edu.au

- BWA, bowtie, bowtie2
- Velvet (microbial genome assembly)
- Trinity (de novo transcript assembly)
- tophat, tophat2 (RNA-Seq)
- DESeq, edgeR, Cufflinks (differential gene expression)
- Variant detection tools
- Metagenomics tools
- MACS, MACS2, SPP (ChIP-Seq)
- SAMtools
- Picard

100s of users
1000s of jobs per month
up to 1 TB per user (for the UQ users)

Slide7

Data manipulation on Galaxy-qld

GVL Galaxy in Queensland: galaxy-qld.genome.edu.au

Useful tools for data manipulation:
- FASTA manipulation
- MEME (identification of motifs)
- BLAST search
- Text manipulation: add column, merge, cut, trim, compute expression, etc.
- Filter and Sort
- Join, Subtract and Group
- Format conversion (genomics)
- Operate on Genomic Intervals (including Fetch closest feature)
- Statistics

Slide8
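To illustrate what a "Fetch closest feature" operation on genomic intervals does, here is a minimal Python sketch. It is not Galaxy's implementation: the tuple layout, the function names `gap` and `closest_feature`, and the example coordinates are all assumptions made for illustration, and intervals are assumed to be 0-based and half-open, as in BED files.

```python
from typing import List, Optional, Tuple

# Hypothetical record layout: (chrom, start, end, name),
# using 0-based, half-open coordinates as in BED files.
Interval = Tuple[str, int, int, str]


def gap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Bases separating two half-open intervals; 0 if they overlap or touch."""
    if a_end <= b_start:
        return b_start - a_end
    if b_end <= a_start:
        return a_start - b_end
    return 0


def closest_feature(
    query: Interval, features: List[Interval]
) -> Optional[Tuple[Interval, int]]:
    """Return (feature, distance) for the nearest feature on the same chromosome.

    Returns None when no feature shares the query's chromosome.
    On ties, the feature that appears first in the list is kept.
    """
    best: Optional[Tuple[Interval, int]] = None
    for feat in features:
        if feat[0] != query[0]:
            continue  # only compare intervals on the same chromosome
        d = gap(query[1], query[2], feat[1], feat[2])
        if best is None or d < best[1]:
            best = (feat, d)
    return best


# Made-up example data: three genes and one peak.
genes = [
    ("chr1", 100, 200, "geneA"),
    ("chr1", 500, 600, "geneB"),
    ("chr2", 150, 250, "geneC"),
]
print(closest_feature(("chr1", 260, 270, "peak1"), genes))
```

The linear scan is chosen for readability; for large datasets a sorted index per chromosome would be the usual next step.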

Good user practice for Galaxy-qld

GVL Galaxy in Queensland: galaxy-qld.genome.edu.au

- Register with your UQ email and get a bigger disk allocation.
- Use ftp for big datasets – it is faster. Galaxy recognises .gz compression.
- Do not store unneeded datasets. Delete temporary files such as SAM. Purge deleted datasets.
- Do not start many big jobs in parallel (BWA, bowtie, bowtie2, tophat, tophat2, velvet, trinity).
- Create and use workflows for multi-step analysis.
- Specify the quality score encoding for nextGen sequencing data (FASTQ files).

Slide9
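For the FASTQ advice above, it can help to check which quality score encoding a file probably uses before uploading it. The Python sketch below is a heuristic, not a Galaxy tool: the function names and return labels are made up for illustration, and it assumes plain four-line FASTQ records (no wrapped sequence lines). It reads `.gz` files directly, so the data can stay compressed. The heuristic relies on the fact that quality characters below ASCII 59 cannot occur in the offset-64 encodings (Solexa and older Illumina), so seeing one implies Phred+33; otherwise offset 64 is only a likely guess.

```python
import gzip
from typing import Iterator


def open_maybe_gzip(path: str):
    """Open a text file, transparently handling .gz compression."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def quality_lines(path: str, max_reads: int = 10000) -> Iterator[str]:
    """Yield the quality line (4th line) of up to max_reads FASTQ records."""
    with open_maybe_gzip(path) as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_reads:
                break
            if i % 4 == 3:
                yield line.rstrip("\r\n")


def guess_quality_encoding(path: str, max_reads: int = 10000) -> str:
    """Guess the quality offset from the lowest quality character seen.

    Returns "phred33", "offset64-likely", or "unknown" (no quality data).
    These labels are illustrative, not names used by Galaxy.
    """
    lowest = min(
        (ord(ch) for qual in quality_lines(path, max_reads) for ch in qual),
        default=None,
    )
    if lowest is None:
        return "unknown"
    if lowest < 59:
        # Characters this low are impossible with an offset of 64.
        return "phred33"
    # Could still be Phred+33 with uniformly high scores, hence "likely".
    return "offset64-likely"
```

Calling `guess_quality_encoding("reads.fastq.gz")` on a local file (the file name here is hypothetical) returns one of the three labels; an "offset64-likely" result is worth confirming against the sequencing provider's documentation.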

Mirror of UCSC Genome Browser

ucsc.genome.edu.au

- full mirror, regular updates
- keeps user data for a long time

Slide10

Use: RStudio

http://gvl-rstudio.genome.edu.au/rstudio/

- Based on the GVL cluster
- Genome data from Galaxy
- Email help@genome.edu.au to register

Slide11

Genomics Virtual Lab: Learn

Genomics VL site: genome.edu.au

- Easy-to-follow Galaxy tutorials (DIY, online)
- A dedicated Galaxy server: galaxy-tut.genome.edu.au
- Topics: RNA-Seq, variant detection, ChIP-Seq, microbial genome assembly …
- Training through QFAB (with a nominal fee): qfab.org

Slide12

GVL Get: roll your own Galaxy

Default NeCTAR allocation for the UQ users: 2 CPUs, 8 GB RAM

- Start your own virtual computer cluster on the NeCTAR cloud
- Start your own Galaxy on the NeCTAR cloud
  - admin rights (can add tools)
  - as powerful as needed (based on allocation)
  - ability to add worker nodes
  - ipython Notebook
  - RStudio
- Detailed instructions are available on the Genomics VL site
- Follow announcements on the QFAB web site: qfab.org

Slide13

Summary

GVL provides resources for genomics research:
- learn & Galaxy-tut
- local Galaxy-qld
- roll your own

We are interested in users and their feedback:
- What do you want to do? Any special needs? (tools, datasets, resources)
- What do you want to learn?
- Do you want to share / promote your workflows with other people?

Talk to us: Igor Makunin, i.makunin@uq.edu.au

Slide14

Thank you!

GVL site: www.genome.edu.au
Galaxy for tutorials: galaxy-tut.genome.edu.au
Galaxy Queensland: galaxy-qld.genome.edu.au

Contributors and participants: