
Creating a … Community Database - PowerPoint Presentation

Kingslayer
Uploaded On 2022-08-03



Presentation Transcript

Slide1

Creating a …

Community Database

Organism-Specific Database

Model-Organism Database

Slide2

Why Create a PGDB?

Perform pathway analyses as part of a genome project

Analyze omics data

Create a central public information resource for the organism, update on ongoing basis

Create a metabolic model

Perform comparative analyses

Slide3

Model Organism Databases

DBs that describe the genome and other information about an organism

Curated by experts for that organism

No one group can curate all the world’s genomes

Distribute workload across a community of experts to create a community resource

Every sequenced organism with an active experimental community requires a MOD

Integrate genome data with information about the biochemical and genetic network of the organism

Integrate literature-based information with computational predictions

Slide4

Rationale for MODs

Each “complete” genome is incomplete in several respects:

40%-60% of genes have no assigned function

Roughly 7% of those assigned functions are incorrect

Many assigned functions are non-specific

MODs are platforms for global analyses of an organism

Interpret omics data in a pathway context

In silico prediction of essential genes

Characterize systems properties of metabolic and genetic networks
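To make the percentages on this slide concrete, here is a small worked example. The genome size of 4,000 genes is a hypothetical figure chosen for illustration, and the 7% error rate is assumed to apply to the genes that do have an assigned function.

```python
# Worked example of the slide's percentages for a HYPOTHETICAL genome.
# Assumption: 4,000 genes (not from the slides); 7% applies to assigned genes.
TOTAL_GENES = 4000


def annotation_gaps(total_genes: int, unassigned_pct: int, error_pct: int = 7):
    """Return (unassigned, assigned, likely_incorrect) gene counts."""
    unassigned = total_genes * unassigned_pct // 100
    assigned = total_genes - unassigned
    likely_incorrect = assigned * error_pct // 100
    return unassigned, assigned, likely_incorrect


# The slide gives a range of 40%-60% of genes with no assigned function.
low = annotation_gaps(TOTAL_GENES, 40)
high = annotation_gaps(TOTAL_GENES, 60)
print("40% unassigned:", low)
print("60% unassigned:", high)
```

Under these assumptions, between 1,600 and 2,400 genes would lack any assigned function, and a further 112 to 168 would carry an incorrect one.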

Slide5

What is Curation?

Ongoing updating and refinement of a PGDB

Correct false-positive and false-negative predictions

Incorporate information from experimental literature

Update genome sequence

Update gene functions, gene positions, gene names

Author comments and citations

Add new pathways, modify existing pathways

Enter information about regulatory networks

Slide6

Issues in Creating Public MODs

Scope/prioritize the project

Identify user community

Obtain buy-in and help from scientific community

Obtain funding

IT: Set up database server, Web server

Hire and train curators

Slide7

Administering Pathway Tools

Slide8

New Pathway Tools Releases

Major releases = External software releases

Twice per year

Announced on ptools-users mailing list

Minor releases twice per year affect only our BioCyc.org Web site and flatfile distributions

We support one prior release only

Releases announced on ptools-users@ai.sri.com

Read release notes at http://brg.ai.sri.com/ptools/release-notes.html

Install process:

Upgrade schema of your DB (software assisted)

Slide9

PGDB Storage: File or Relational Database

File storage:

Advantages:

No RDBMS installation and configuration

Disadvantages:

Must be loaded and saved in its entirety

No transaction history

No concurrent access for multiple users

MySQL storage:

Advantages:

Faster read access, faster saves

Concurrent update access for multiple users

Stores transaction history of all PGDB updates

Disadvantages:

RDBMS must be installed and configured

Slide10

Multiuser Access to PGDBs

PGDB stored within one MySQL server

Each curator installs PTools on their computer

Curator computers query RDBMS server via internet

For each frame access, PTools queries:

In-memory cache, disk cache, RDBMS server

After curator saves changes, all changes made by other users are loaded into curator’s session
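The lookup order named on this slide (in-memory cache, then disk cache, then RDBMS server) can be sketched as a generic tiered cache. This is only an illustration of the idea, not Pathway Tools code: the class `TieredFrameStore`, the `fetch_from_server` callback, and the dictionary standing in for the disk cache are all hypothetical.

```python
from typing import Callable, Dict


class TieredFrameStore:
    """Hypothetical sketch of a memory -> disk -> server lookup chain."""

    def __init__(self, fetch_from_server: Callable[[str], dict]):
        self.memory: Dict[str, dict] = {}
        self.disk: Dict[str, dict] = {}  # stand-in for an on-disk cache
        self.fetch_from_server = fetch_from_server
        self.server_hits = 0  # counts round trips to the (simulated) server

    def get_frame(self, frame_id: str) -> dict:
        # 1. Fastest tier: the in-memory cache.
        if frame_id in self.memory:
            return self.memory[frame_id]
        # 2. Next tier: the disk cache.
        if frame_id in self.disk:
            frame = self.disk[frame_id]
        else:
            # 3. Slowest tier: ask the database server, then cache on disk.
            frame = self.fetch_from_server(frame_id)
            self.server_hits += 1
            self.disk[frame_id] = frame
        # Promote the frame into memory for subsequent accesses.
        self.memory[frame_id] = frame
        return frame


# Usage with a fake server that fabricates a frame for any ID.
store = TieredFrameStore(lambda frame_id: {"id": frame_id})
store.get_frame("FRAME-1")
store.get_frame("FRAME-1")  # served from memory, no second server query
print(store.server_hits)
```

A real multiuser setup would also need to invalidate cached frames when other curators save changes, which this sketch does not attempt.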

Slide11

How to Release a PGDB?

Decide on release frequency and schedule

Don’t wait until it’s perfect to release it!

Quality assurance

Run consistency checker

Tools -> Consistency Checker

Also updates organism-summary statistics

Update publications, authors in organism frame

Update via Organism editor

Create new version of PGDB

ptools-local/pgdbs/yeastcyc/1.0/kb/yeastbase.ocelot

Edit against the new version, release the old version

Author release notes

Register PGDB in SRI PGDB registry

Will allow SRI to include it in BioCyc
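The "Create new version of PGDB" step refers to a versioned directory such as `ptools-local/pgdbs/yeastcyc/1.0/`. The sketch below only illustrates that layout by copying one version directory to a sibling directory. It assumes versions are plain sibling directories, and the helper `copy_pgdb_version` and the version number `1.1` are hypothetical. Pathway Tools may perform additional bookkeeping when it creates a version, so treat its own tooling as the authority.

```python
import shutil
from pathlib import Path


def copy_pgdb_version(organism_dir: Path, old_version: str, new_version: str) -> Path:
    """Copy <organism_dir>/<old_version> to <organism_dir>/<new_version>.

    Hypothetical helper: illustrates the directory layout only.
    """
    src = organism_dir / old_version
    dst = organism_dir / new_version
    if not src.is_dir():
        raise FileNotFoundError(f"No such version directory: {src}")
    if dst.exists():
        raise FileExistsError(f"Version already exists: {dst}")
    shutil.copytree(src, dst)  # copies kb/ and everything else under the version
    return dst


# Example call (paths follow the slide; "1.1" is an assumed next version):
# copy_pgdb_version(Path("ptools-local/pgdbs/yeastcyc"), "1.0", "1.1")
```

Keeping the old directory untouched matches the slide's advice: edit against the new version and release the old one.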

Slide12

Pathway Tools Data Import/Export

File->Export

File->Import

Export/import to/from tab-delimited files

Export to Genbank, GFF3 (soon), SBML, BioPAX

Export to attribute-value files

Attribute-value files can be imported into BioWarehouse

Relational database system for bioinformatics database integration
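Once data has been exported to a tab-delimited file, it can be processed with ordinary scripting tools. The following is a minimal sketch using Python's standard `csv` module. The column names `gene` and `product` and the sample row are invented for illustration; the actual columns depend on what was exported.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_tab_delimited(handle: TextIO) -> List[Dict[str, str]]:
    """Read a tab-delimited file with a header row into a list of dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Hypothetical sample standing in for an exported file.
sample = io.StringIO("gene\tproduct\ngeneA\texample enzyme\n")
for row in read_tab_delimited(sample):
    print(row["gene"], "->", row["product"])
```

For a file on disk, pass `open(path, newline="")` instead of the `io.StringIO` object.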

Slide13

Registry: Public PGDB Sharing

PGDB registry maintained by SRI at URL http://biocyc.org/registry.html

Registry operations

List contents of registry

Download PGDBs listed in the registry

Register PGDBs you have created

Slide14

Registry Details

Why register your PGDB?

Facilitate its download by other scientists

Facilitate its inclusion in BioCyc.org

Why download a PGDB?

Desktop Navigator is faster and provides more functionality than the Web version

Comparative operations

Programmatic querying and processing of PGDB

Slide15

Changes Planned for BioCyc.org

BioCyc will be starting a subscription model July 1

Slide16

Why?

Government funding for databases shrinking

BioCyc funding cut 27% as number of genomes climbed 5X in 5 years

No other foreseeable sources of funding for "Big Knowledge" in life sciences

Goals:

Create high-quality curated EcoCyc-like DBs for many organisms

Couple with extensive user-friendly bioinformatics tools

Slide17

How?

Subscription access to BioCyc.org by institutions, individuals

Subscription rates will depend on usage levels from previous year

EcoCyc and MetaCyc will remain free

Pathway Tools will remain free