Failures in interpretation of sequencing results

Uploaded On 2017-04-28



Presentation Transcript

Slide1

Failures in interpretation of sequencing results

v1.0

Laura Biggins

Slide2

Interpretation

Slide3

Library

Contamination

Biological

Interpretation

Technical

Tracking

Interpreting results

QC and visualisation still important

Easy to draw wrong conclusions from the data

Slide4

Interpretation exercise

Interpretation

RNA-seq data

Slide5

Interpretation

Sample 1 genome view

Slide6

Interpretation

Sample 2 genome view

Slide7

Interpretation

Genes upregulated in sample 1

Slide8

Interpretation

Genes downregulated in sample 1

Slide9

Interpretation

Interpretation exercise

Slide10

Functional Enrichment Analysis

Gene ontology analysis

Pathways analysis

Any set of predefined functional categories with genes assigned to the categories

Enrichment test – gene list vs background

Useful and powerful but easy to produce false positives
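To make the "gene list vs background" test concrete, here is a minimal sketch of a one-sided hypergeometric enrichment test, which is one common choice for this kind of analysis. The function name and the numbers are hypothetical and are not taken from the slides. It illustrates how the same gene list can look enriched or unremarkable depending only on the background chosen.

```python
from math import comb


def enrichment_p_value(n_background, n_category, n_list, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_background: genes in the background set
    n_category:   background genes annotated to the category
    n_list:       genes in the gene list (all assumed to be in the background)
    n_overlap:    gene-list genes annotated to the category
    """
    max_overlap = min(n_category, n_list)
    total = comb(n_background, n_list)
    tail = sum(
        comb(n_category, k) * comb(n_background - n_category, n_list - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 200 hits, 5 of them in a 100-gene category.
# Background 1: every annotated gene in the genome (assumed 20,000).
p_whole_genome = enrichment_p_value(20000, 100, 200, 5)
# Background 2: only the genes that could be measured (assumed 2,000),
# all 100 category genes among them.
p_measurable_only = enrichment_p_value(2000, 100, 200, 5)
print(p_whole_genome, p_measurable_only)
```

Against the whole genome about one category gene is expected in the list, so five looks like an enrichment; against the smaller background about ten are expected, so five is not an enrichment at all.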

Interpretation

Slide11

Location bias

GO analysis of all genes on each chromosome (mouse)

Interpretation

Michael Reik

Chr | Category | BH adj p-value
Chr 1 | GO:0050662~coenzyme binding | 1.61E-02
Chr 2 | GO:0007608~sensory perception of smell | 1.54E-64
Chr 3 | GO:0005509~calcium ion binding | 2.32E-01
Chr 4 | GO:0009615~response to virus | 2.58E-07
Chr 5 | GO:0001730~2'-5'-oligoadenylate synthetase activity | 2.44E-04
Chr 6 | GO:0005529~sugar binding | 1.54E-25
Chr 7 | GO:0004984~olfactory receptor activity | 1.08E-30
Chr 8 | GO:0042742~defense response to bacterium | 2.36E-16
Chr 9 | GO:0007608~sensory perception of smell | 2.68E-12
Chr 10 | GO:0008227~amine receptor activity | 3.24E-05
Chr 11 | GO:0045111~intermediate filament cytoskeleton | 1.79E-23
Chr 12 | GO:0034097~response to cytokine stimulus | 2.59E-04
Chr 13 | GO:0000786~nucleosome | 2.47E-17
Chr 14 | GO:0004522~pancreatic ribonuclease activity | 2.09E-22
Chr 15 | GO:0045095~keratin filament | 1.28E-17
Chr 16 | GO:0004869~cysteine-type endopeptidase inhibitor activity | 8.62E-08
Chr 17 | GO:0042611~MHC protein complex | 5.39E-19
Chr 18 | GO:0007156~homophilic cell adhesion | 1.01E-26
Chr 19 | GO:0005506~iron ion binding | 2.12E-07
Chr X | GO:0045449~regulation of transcription | 6.88E-04

Slide12

Mapping to genome

Multi-mapping

GO enrichment for “differentially expressed” genes

- ribosomal categories (p < 1E-20)

- histones & chromatin assembly (p < 1E-7)
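One way to see how much multi-mapping there is in a sample before trusting enrichments like these is to count multi-mapped reads directly. The sketch below is an illustration, not the method used for the slide. It assumes SAM-format text and an aligner that writes the optional NH:i tag (number of reported alignments for the read); not every aligner does, so records without the tag are counted separately. The function name is hypothetical.

```python
def count_multimappers(sam_lines):
    """Count primary mapped alignments as unique, multi-mapped or untagged."""
    counts = {"unique": 0, "multi": 0, "untagged": 0}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800)
        # records so that each mapped read is counted once.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        nh = None
        for tag in fields[11:]:  # optional fields start at column 12
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
        if nh is None:
            counts["untagged"] += 1
        elif nh > 1:
            counts["multi"] += 1
        else:
            counts["unique"] += 1
    return counts


# Tiny made-up example records.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1",
    "r2\t0\tchr2\t200\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:4",
    "r2\t256\tchr3\t300\t1\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:4",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(count_multimappers(example))
```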

Interpretation

Michael Reik

Slide13

GC content
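GC content here means the fraction of G and C bases in a gene's sequence. As a minimal sketch, the code below computes it and pulls out the genes at either extreme, so that a hit list can be compared against them. The default cut-offs of 35% and 60% are the ones used in the GOrilla analyses in this deck; the function names and the input format (a dict of gene name to sequence) are assumptions.

```python
def gc_fraction(seq):
    """Fraction of G/C among the A, C, G, T bases; None if there are none."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")  # ignores N and gaps
    if called == 0:
        return None
    return (seq.count("G") + seq.count("C")) / called


def split_by_gc(gene_seqs, low=0.35, high=0.60):
    """Return (low_gc_genes, high_gc_genes) from a {gene: sequence} dict."""
    low_gc, high_gc = [], []
    for gene, seq in gene_seqs.items():
        gc = gc_fraction(seq)
        if gc is None:
            continue
        if gc < low:
            low_gc.append(gene)
        elif gc > high:
            high_gc.append(gene)
    return low_gc, high_gc


# Made-up toy sequences, far shorter than real genes.
toy = {"geneA": "GGGCCA", "geneB": "ATATAG", "geneC": "ACGT"}
print(split_by_gc(toy))
```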

Interpretation

Slide14

GC content

Interpretation

GOrilla analysis using mouse genes with GC content > 60% (~170 genes)

Slide15

GC content

Interpretation

GOrilla analysis using mouse genes with GC content < 35% (~200 genes)

Slide16

Public RNA-seq

Public RNA-seq data from a range of mouse tissues

Called differentially expressed genes between replicates within datasets

Gene lists enriched in functional categories:

Ribosome

Extracellular

Secreted

Glycoprotein

Myofibril, cytoskeleton

Interpretation

Slide17

GO analysis of genes that appeared in > 5 datasets

Extracellular, glycoprotein categories absent – large, diverse categories
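Selecting the genes that recur across datasets can be sketched as below. It assumes each dataset's replicate-vs-replicate hits are available as a list of gene names; the function name and the toy data are hypothetical, and the default threshold follows the "> 5 datasets" rule above.

```python
from collections import Counter


def recurrent_genes(hit_lists, more_than=5):
    """Return {gene: n_datasets} for genes called in more than `more_than` lists."""
    counts = Counter()
    for hits in hit_lists:
        counts.update(set(hits))  # count each gene at most once per dataset
    return {gene: n for gene, n in counts.items() if n > more_than}


# Made-up hit lists from seven datasets.
toy_lists = [
    ["A", "C", "C"],
    ["A", "C"],
    ["A", "C", "B"],
    ["A", "C", "B"],
    ["A", "C", "B"],
    ["A", "C", "B"],
    ["A", "B"],
]
print(recurrent_genes(toy_lists))
```

Genes that keep turning up between replicates, where no biological difference is expected, are candidates for a watch list to check real hit lists against.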

Interpretation

Public RNA-seq

Slide18

Membrane associated transcripts

Re-sequencing library = same result

Remaking library from tissue = changes gone

Interpretation

Slide19

Sfi1 – spindle associated transcript

Interpretation

Misbehaving genes

Slide20

Misbehaving genes

Titin, USH2A (big!)

Mucin, Mid1, Sfi1 (duplication events)

Olfactory receptors (big families)

Poorly annotated (RIKEN, EST, Gm123, RP11 etc.)
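A hit list can be screened for genes of this kind before any biology is read into it. The sketch below is illustrative only: the symbols and name patterns are assumptions based on common mouse and human naming (Ttn for titin, Olfr for mouse olfactory receptors, Gm plus a number for predicted genes, a Rik suffix for RIKEN cDNAs, an RP11- prefix for clone-based names) and would need adapting to the annotation in use. A flagged gene is a prompt to look at the data, not a reason to discard it automatically.

```python
import re

# Illustrative watch list only; symbols and patterns are assumptions to adapt.
WATCH_SYMBOLS = {"TTN", "USH2A", "MID1", "SFI1"}
WATCH_PATTERNS = {
    "mucin": re.compile(r"^(Muc|MUC)\d+"),
    "olfactory receptor": re.compile(r"^Olfr\d+$"),
    "predicted gene (Gm)": re.compile(r"^Gm\d+$"),
    "RIKEN cDNA": re.compile(r"Rik$"),
    "clone-based (RP11)": re.compile(r"^RP11-"),
}


def flag_misbehaving(genes):
    """Return {gene: reason} for genes matching the watch list."""
    flagged = {}
    for gene in genes:
        if gene.upper() in WATCH_SYMBOLS:
            flagged[gene] = "large or duplicated gene"
            continue
        for reason, pattern in WATCH_PATTERNS.items():
            if pattern.search(gene):
                flagged[gene] = reason
                break
    return flagged


# Made-up hit list.
hits = ["Ttn", "Gm123", "2310057M21Rik", "RP11-34P13.7",
        "Olfr56", "Muc2", "Sfi1", "Actb"]
for gene, reason in flag_misbehaving(hits).items():
    print(gene, "->", reason)
```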

Interpretation

Slide21

Power

Interpretation

RNA-seq

Gene/transcript length

Expression level

Bisulphite-seq

CpG content
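Because power depends on these properties, a quick check is to see whether the rate of called hits changes across bins of the property. The sketch below does this for any per-gene value (transcript length, mean expression, CpG count) using equal-sized bins by rank. The function name and toy numbers are hypothetical.

```python
def hit_rate_by_bin(values, is_hit, n_bins=4):
    """Fraction of genes called as hits in each equal-sized bin of `values`.

    values: one number per gene (e.g. transcript length)
    is_hit: one 0/1 (or bool) per gene, in the same order
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    rates = []
    for b in range(n_bins):
        start = b * len(order) // n_bins
        end = (b + 1) * len(order) // n_bins
        members = order[start:end]
        if members:
            rates.append(sum(is_hit[i] for i in members) / len(members))
        else:
            rates.append(float("nan"))  # more bins than genes
    return rates


# Made-up example: hits concentrated among the longest transcripts.
lengths = [100, 200, 300, 400, 500, 600, 700, 800]
called = [0, 0, 0, 0, 0, 1, 1, 1]
print(hit_rate_by_bin(lengths, called))
```

A hit rate that climbs steadily with length or expression suggests the list reflects where the assay has power as much as where the biology changed.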

Slide22

Confounding factors

Interpretation