Bioinformatics analysis of a sweet potato


Bioinformatics analysis of a sweet potato RNASeq experiment: from sequences to pathways
Rolando Morán 1; María Elena Ochagavía 2; Cathie Martin 3; Aylin Nordelo 1; Chris Watkins 4
1 CIGB, Camagüey, Cuba; 2 CIGB, Havana, Cuba; 3 John Innes Centre, Norwich, UK; 4 The Genome Analysis Centre (TGAC), Norwich, UK

Overview
Sweet potato (Ipomoea batatas L.)
Fifth most important food crop in developing countries.
Rich in complex carbohydrates, dietary fiber and beta-carotene (a provitamin A carotenoid).

General uses
As human food and animal feed.
For industrial production of sugar, starch and alcohol.
For pharmaceutical purposes, as a medicinal plant.

Traditional uses
A decoction of the leaves is used in folk remedies for tumors of the mouth and throat (Hartwell, 1971).
It is also used as a bactericide, demulcent, fungicide and laxative, and against asthma, bug bites, burns, catarrh, ciguatera, convalescence, diarrhea, fever, nausea, splenosis and stomach distress (Duke and Wain, 1981).
In the Kagawa region of Japan, a variety of white sweet potato has been eaten raw to treat anemia, hypertension and diabetes (Ludvik et al., 2004).
The most common use of the roots is to treat constipation (Pereda-Miranda & Bah, 2003).

Sweet potato bioactive compounds
Rev. bras. farmacogn. vol. 22 no. 3, Curitiba, May/June 2012 (Epub 2012)


In Cuba, sweet potato is planted on more than 60 000 ha annually.
Instituto de Investigaciones de Viandas Tropicales (INIVIT).
Variety: CEMSA 78354.

Negative effects of some terpenoids
Some terpenoids are insect attractants.
In particular, boehmeryl acetate, a triterpene produced by sweet potato plants, acts as an ovipositional stimulant for C. formicarius L. (sweet potato weevil).
The sweet potato weevil is considered the most serious pest of sweet potato, with reported losses ranging from 5% to 97% in areas where the weevil occurs.
Triterpenes are produced through the mevalonate (MVA) pathway from farnesyl diphosphate (FDP) in the cytosol.
This pathway is an important target in the fight against the sweet potato weevil.

Transcriptome study goals
Identify enzymes participating in triterpene synthesis, and in other pathways related to plant defense, to fight the sweet potato weevil.
Identify transcription factors related to flavonoids, to improve the benefits of sweet potato as a food crop and/or to develop bioreactors for the industrial production of those compounds.
Identify genes differentially expressed across developmental stages and tissues, to explore which pathways are active under those conditions.

Sweet potato transcriptome studies
NCBI Sequence Read Archive (SRA)

Currently available sweet potato sequences (Sept 29, 2014)
NCBI: 55468 (96.12%) from Tao X et al. 2012
SwissProt: 37
TrEMBL: 882

Cuban sweet potato transcriptome study
Samples: 4 (young leaves (YL), mature leaves (ML), young roots (YR), mature roots (MR)).
Library type: paired end, not strand specific.
Platform: Illumina HiSeq 2000.
Read length: 2x100 nt.

Bioinformatics pipeline
Raw reads
Quality control: Trimmomatic
Transcriptome assembly: Trinity
Transcript abundance estimation: RSEM/bowtie
Differential gene expression: Bioconductor R packages
Functional annotation (Trinotate):
Blastx vs. SwissProt and nr NCBI
Prediction of coding regions: TransDecoder
Predicted proteins vs. SwissProt: Blastp
Scan PFAM with predicted proteins: HMMER
Signal peptides prediction: SignalP
Prediction of transmembrane helices: tmHMM
Prediction of rRNA genes: RNAMMER
Taxonomy profiling and KEGG pathways: MEGAN
Gene networks: BisoGenet
Pipeline: McGill University and Genome Quebec Innovation Center, Montreal, Canada. Nature Protocols 8(8):1494-1512 (2013).

Sequencing results and quality control
Minimum quality required to keep a base = 20.
Minimum length of reads to be kept = 32.
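The two thresholds can be illustrated with a minimal sketch. It assumes Phred+33 quality encoding and trims only from the 3' end of a read; it is not Trimmomatic's own algorithm, and the function name trim_read is hypothetical.

```python
from typing import Optional, Tuple

MIN_QUALITY = 20   # from the slide: minimum quality required to keep a base
MIN_LENGTH = 32    # from the slide: minimum length of reads to be kept
PHRED_OFFSET = 33  # assumption: Phred+33 encoded quality strings


def trim_read(seq: str, qual: str) -> Optional[Tuple[str, str]]:
    """Trim low-quality bases from the 3' end of one read.

    Returns the trimmed (sequence, quality) pair, or None when the
    trimmed read is shorter than MIN_LENGTH and should be discarded.
    """
    end = len(seq)
    # Walk back from the 3' end while the last kept base is below the threshold.
    while end > 0 and ord(qual[end - 1]) - PHRED_OFFSET < MIN_QUALITY:
        end -= 1
    if end < MIN_LENGTH:
        return None
    return seq[:end], qual[:end]
```

A 40-nt read whose last five bases have quality 2 is kept as a 35-nt read; a read left with only 30 good bases is dropped.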

Trinity Assembly results

Gene abundance estimation
TMM normalization of Fragments Per Kilobase of exon per Million fragments mapped (FPKM).
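As a reminder of what the FPKM unit means, here is a minimal sketch of the plain FPKM formula. It does not reproduce RSEM's estimation or the TMM normalization step; the function name and the example numbers are hypothetical.

```python
def fpkm(fragments: float, length_bp: float, total_fragments: float) -> float:
    """Fragments Per Kilobase of exon per Million fragments mapped.

    fragments:       fragments assigned to the transcript
    length_bp:       transcript length in base pairs
    total_fragments: all mapped fragments in the sample
    """
    if length_bp <= 0 or total_fragments <= 0:
        raise ValueError("length_bp and total_fragments must be positive")
    # Divide by length in kilobases (length_bp / 1e3) and by the library
    # size in millions (total_fragments / 1e6): a combined factor of 1e9.
    return fragments * 1e9 / (length_bp * total_fragments)


# Hypothetical example: 500 fragments on a 2 kb transcript, 10 million mapped fragments.
example_value = fpkm(500, 2000, 10_000_000)
```

The scaling makes values comparable across transcripts of different lengths and across libraries of different sizes.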

Sample distances
All unigenes.
Distance: Euclidean on transformed RSEM counts.
Complete-linkage clustering.
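The clustering idea can be sketched in pure Python. The count transformation is not reproduced, the input vectors are assumed to be already transformed, and the analysis itself was run with R packages rather than this code.

```python
import math
from typing import Dict, List, Sequence, Tuple

Merge = Tuple[Tuple[str, ...], Tuple[str, ...], float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(samples: Dict[str, List[float]]) -> List[Merge]:
    """Agglomerative clustering; returns the merges in the order they happen."""
    clusters: List[Tuple[str, ...]] = [(name,) for name in samples]
    merges: List[Merge] = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: the distance between two clusters is the
                # largest distance between any pair of their members.
                d = max(euclidean(samples[a], samples[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

With four samples (YL, ML, YR, MR) the function returns three merges, and the last one joins the two remaining groups.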

Differentially expressed genes (DEGs): leaves vs. roots
DEGs with DESeq adj. p-value < 1x10^-4.
Distance: Euclidean on transformed RSEM counts.
Complete-linkage clustering.
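DESeq reports p-values adjusted with the Benjamini-Hochberg procedure. The sketch below re-implements only that adjustment and the 1x10^-4 cutoff, not the DESeq test itself; the gene names and function names are hypothetical.

```python
from typing import Dict, List

ADJ_P_CUTOFF = 1e-4  # from the slide: DESeq adj. p-value < 1x10^-4


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Go from the largest p-value to the smallest so that each adjusted
    # value is capped by the adjusted values of all larger p-values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(raw_p: Dict[str, float], cutoff: float = ADJ_P_CUTOFF) -> List[str]:
    """Names of genes whose adjusted p-value is strictly below the cutoff."""
    names = list(raw_p)
    adjusted = benjamini_hochberg([raw_p[g] for g in names])
    return [g for g, q in zip(names, adjusted) if q < cutoff]
```

The cutoff is applied to the adjusted values, so a raw p-value just under 1x10^-4 can still fail once the number of tested genes is taken into account.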

DEGs YL vs. ML and YR vs. MR

Functional annotation
Blast results
Matches to nr NCBI/SwissProt (E-value < 1x10^-5): 76523.
Hits to SwissProt: 66175 (86.48%).
Hits to nr NCBI protein database: 39597 (51.74%).
Venn diagram: nr NCBI only: 10348 (14%); both databases: 29249 (38%); SwissProt only: 36926 (48%).
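The E-value filter can be sketched as follows, assuming the BLAST results were saved in the standard 12-column tabular format (E-value in the 11th column). The actual output format used in the study is not stated, and the function name is hypothetical.

```python
from typing import Dict, Iterable, Tuple

MAX_EVALUE = 1e-5  # from the slide: E-value < 1x10^-5


def best_hits(lines: Iterable[str], max_evalue: float = MAX_EVALUE) -> Dict[str, Tuple[str, float]]:
    """Keep, for each query, the hit with the lowest E-value below the cutoff.

    Assumes tab-separated columns in the default tabular order:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best
```

Counting the keys of the returned dictionary for each database gives the number of transcripts with at least one accepted hit in that database.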

Best Blast Hits by taxa

Terpenoid backbone biosynthesis pathway (KEGG Metabolism)
68 transcripts mapped to 30 KO groups.

Terpenoid backbone biosynthesis pathway
[Pathway map. Labels: mevalonate pathway; farnesyl diphosphate synthase; farnesyl pyrophosphate; sesquiterpenoid and triterpenoid biosynthesis.]
Expressed in pheromone gland of Lutzomyia longipalpis (Parasites & Vectors 2013, 6:56).

Sesquiterpenoid and triterpenoid biosynthesis pathways
31 transcripts mapped to 6 KO groups.

Sesquiterpenoid and triterpenoid biosynthesis
[Pathway map. Labels: farnesyl pyrophosphate; farnesyl-diphosphate farnesyltransferase; squalene; squalene monooxygenase; lupeol synthase 2; triterpenoids.]
Lupeol has the potential to act as an anti-inflammatory, anti-microbial, anti-protozoal, anti-proliferative, anti-invasive, anti-angiogenic and cholesterol-lowering agent.

Plant hormone signal transduction pathway (Environmental Information Processing - Signal transduction)
344 transcripts mapped to 42 KO groups.

Plant hormone signal transduction

Plant-pathogen interaction pathway (Organismal System - Environmental adaptation)
328 transcripts mapped to 34 KO groups.

Plant-pathogen interaction pathway

Main flavonoid transcription factors and co-factors
Protein families R2R3-MYB, bHLH and WD40 have been shown to control multiple enzymatic steps in the biosynthetic pathway responsible for the production of flavonoids in many plants.
bHLH (PF00010): basic helix-loop-helix motif.
R2R3-MYB (PF14215).
WD40 (PF00400): a short structural motif of approximately 40 amino acids, often terminating in a tryptophan-aspartic acid (W-D) dipeptide. Tandem copies of these repeats typically fold together to form a beta-propeller.
After scanning PFAM with HMMER using the assembled transcripts, we found:
54 hits to the bHLH PFAM family.
12 hits to the R2R3-MYB family.
111 hits to the WD40 family.

New Ipomoea batatas transcripts
BLAST searches (E-value < 0.001) against known I. batatas genes and proteins:
Trinity.fasta vs. nr NCBI nucleotide db (I. batatas)
Trinity.fasta vs. nr EST (I. batatas)
Trinity.fasta vs. nr NCBI protein db (I. batatas)
Trinity.fasta vs. nr SwissProt db (I. batatas)
Results: 65 with GO transcription factor activity.
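One way to read this slide is that a transcript counts as new when it has no accepted hit in any of the four searches. That reading is an assumption, since the slide does not state the exact criterion, and the function name and identifiers below are hypothetical.

```python
from typing import Iterable, List, Set


def novel_transcripts(all_ids: Iterable[str], *hit_sets: Set[str]) -> List[str]:
    """Transcripts that appear in none of the per-database hit sets.

    all_ids:  identifiers of every assembled transcript
    hit_sets: one set per BLAST search, holding the identifiers of the
              transcripts with a hit below the E-value cutoff
    """
    # Assumption: "new" means no hit against any known I. batatas sequence.
    known: Set[str] = set().union(*hit_sets)
    return sorted(set(all_ids) - known)
```

For example, with transcripts t1 to t4 and hit sets {t1}, {t1, t2}, {} and {t3}, only t4 would be reported as new.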

ECERIFERUM 1 like: Expressed in seedlings, stems, leaves, flowers, fruits and siliques. Not detected in roots, pollen and seeds

Osmotin-like protein TPM-1
Antifungal protein.

Proteinase inhibitor type-2 CEVI57

BisoGenet network of I. batatas orthologues to A. thaliana genes
5775 sweet potato expressed genes have orthologues in A. thaliana (best reciprocal hits method).
5258 found in SysBiomics.
1561 genes involved in 2447 protein interactions.
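The best reciprocal hits criterion can be sketched as follows, assuming the best hit of every sequence has already been extracted from the two BLAST runs (sweet potato vs. A. thaliana and the reverse). The function name and identifiers are hypothetical.

```python
from typing import Dict, Set, Tuple


def reciprocal_best_hits(a_to_b: Dict[str, str], b_to_a: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b.

    a_to_b: best hit in species B for each sequence of species A
    b_to_a: best hit in species A for each sequence of species B
    """
    return {(a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a}
```

A pair is kept only when both directions agree, which is why the method is conservative: a sweet potato transcript whose best A. thaliana hit points back to a different transcript is not called an orthologue.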

BisoGenet network of new genes and their potentially interacting neighbors
Nodes colored with the MultiColoredNodes plugin (Gregor Warsow, Institute for Biostatistics and Informatics in Medicine and Ageing, University of Rostock, Germany).
Colors represent expression level in each sample. Green: low; white: median; red: high.

Network of WD40 proteins and their neighbors

Limitations of this work
No biological replicates (due to cost).
KEGG pathway annotation in MEGAN is outdated (a paid KEGG license is required).
Trinotate annotation is limited to the largest transcript of each Trinity component (a MUGQIC pipeline limitation, to avoid excessive consumption of computing resources).

Conclusions
Major findings:
30538 sweet potato unigenes were found expressed in at least one of the studied tissues.
54 bHLH, 12 R2R3-MYB and 111 WD40 transcription factors were identified. These TF families have previously been associated with flavonoid pathways.
2293 new I. batatas transcripts were discovered, a 4% increase over the transcripts currently deposited in the NCBI nucleotide db.
65 new I. batatas transcription factors were predicted from the GO annotation of their orthologues.
This is the first report of lupeol synthase expression in I. batatas. It suggests that lupeol, a pharmacologically active triterpene with several potential medicinal properties, is produced by I. batatas.
I. batatas enzymes participating in the mevalonate and triterpenoid pathways, which have been reported to be related to insect attraction, were identified. In particular, farnesyl diphosphate synthase (FDPS), the enzyme that produces the precursor of the triterpene pathway, was found highly expressed in all studied tissues. Two FDPS transcripts were overexpressed in young roots and one in young leaves.
Taken together, our results could help to identify target genes to combat the sweet potato weevil, to implement techniques for large-scale production of bioactive compounds in bioreactors, and to conceive new strategies for sweet potato crop improvement.

Authorship contribution
Rolando Morán and Cathie Martin: Design of the strategy to apply NGS methods to study gene expression in different sweet potato tissues and developmental stages. Selection of target pathways to be used as potential applied approaches for crop improvement. The last RNA sample purification and quantification steps were performed by Rolando Morán at Cathie Martin's lab. Apart from Cathie's supervision of the work, the costs of all the analyses were kindly covered by the research budget of Cathie's group.
Chris Watkins: QC of RNA samples to be submitted to RNAseq. Selection of the best method to be applied. Generation of primary data and assembly. Sending the data to Cuba on a hard drive.
Aylin Nordelo: RNA sample purification. Arrangements and sending of samples from CIGB to the John Innes Centre to be processed.
María Elena Ochagavía: Selection of target pathways to be used as potential applied approaches to fight the sweet potato weevil and to find medical bioactive compounds. Development of the bioinformatics analysis.
Acknowledgments
The authors wish to acknowledge the contribution of MSc. Francois Lefebvre, bioinformatician at the McGill University and Génome Québec Innovation Centre (MUGQIC), Montréal, Canada, for training M. E. Ochagavía in the MUGQIC RNA-Seq pipelines, and also Dr. Guillaume Bourque, Bioinformatics Director at MUGQIC, who allowed us to run the MUGQIC De Novo RNASeq pipeline on these data at MUGQIC facilities.