
A Provenance assisted Roadmap for - PowerPoint Presentation

Uploaded On 2020-08-26


Life Sciences Linked Open Data Cloud. Ali Hasnain et al., Insight Center for Data Analytics, National University of Ireland, Galway. Agenda: Motivation, Linked Life Sciences Roadmap, Cataloguing and Linking.



Presentation Transcript

Slide1

A Provenance assisted Roadmap for Life Sciences Linked Open Data Cloud

Ali Hasnain et al.

Insight Center for Data Analytics

National University of Ireland, Galway

Slide2

Agenda

Motivation

Linked Life Sciences Roadmap

Cataloguing and Linking

Extending Catalogue – Metadata & Provenance

Query Engine

Results

Slide3

Motivation

Biomedical data is heterogeneous and spread across multiple sources (SPARQL endpoints).

Navigation is a challenge: the sources contain trillions of triples and are represented with insufficient vocabulary reuse.

Biologists sometimes want more information about the data, including its source, creator, and publisher, as well as statistics on its size (Metadata & Provenance).

Slide4

How to deal with heterogeneous data?

DrugBank

DailyMed

ChEBI

KEGG

Reactome

SIDER

BioPAX

Medicare

Slide5

We want to query the content, not the source

Proteins

Molecules

Genes

Diseases

Slide6

A Linked Life Sciences Roadmap

[Figure: roadmap linking content types to concepts and source datasets]

Content types: Proteins, Molecules, Genes, Diseases

Concepts: :Protein, :Molecule, :Gene, :Disease

Datasets: UniProt, PDB, Pfam, PROSITE, ProDom, UniRef, UniParc, DailyMed, DrugBank, ChEMBL, PubChem, KEGG, Gene Ontology, GeneID, Affymetrix, HomoloGene, MGI, Diseasome, SIDER

Slide7

Possible Solutions

To assemble queries over multiple graphs at multiple endpoints, either:

vocabularies and ontologies are reused (“a priori integration”), or

translation maps between different terminologies are created (“a posteriori integration”).
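
The translation-map option can be sketched as a lookup that expands one roadmap concept into the class each source uses for it, then assembles a SPARQL 1.1 federated query. This is only an illustrative sketch, not the authors' implementation: the endpoint URLs, class IRIs, and the `build_federated_query` helper are hypothetical placeholders.

```python
# Hypothetical translation map (a posteriori integration): one roadmap concept
# is mapped to the class each source uses for it. All IRIs are placeholders.
TRANSLATION_MAP = {
    ":Molecule": {
        "http://example.org/drugbank/sparql": "http://example.org/drugbank/Drug",
        "http://example.org/kegg/sparql": "http://example.org/kegg/Compound",
    },
}


def build_federated_query(concept, limit=10):
    """Expand a roadmap concept into a SPARQL 1.1 federated query string."""
    sources = TRANSLATION_MAP[concept]
    # One SERVICE block per endpoint, each using that source's own class.
    blocks = [
        f"  {{ SERVICE <{endpoint}> {{ ?item a <{cls}> }} }}"
        for endpoint, cls in sources.items()
    ]
    body = "\n  UNION\n".join(blocks)
    return f"SELECT ?item WHERE {{\n{body}\n}} LIMIT {limit}"


query = build_federated_query(":Molecule")
print(query)
```

The user asks for `:Molecule` and never names a source; the map decides which endpoints are contacted and which local class is matched at each one.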

Slide8

A priori vs. a posteriori Integration

Slide9

Cataloguing and Linking

Slide10

Describing Datasets: an Extract from the Catalogue

Slide11

Extending Catalogue – Metadata & Provenance
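
A catalogue entry extended with metadata and provenance might look roughly like the sketch below, which records the fields named in the motivation (source, creator, publisher, size) and serialises them as Turtle. The slide content itself is not in the transcript, so this is an assumption throughout: the choice of VoID and Dublin Core terms, the `CatalogueEntry` class, and every IRI and value are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CatalogueEntry:
    """One dataset in the catalogue, with metadata and provenance fields."""
    iri: str
    endpoint: str
    source: str
    creator: str
    publisher: str
    triples: int

    def to_turtle(self):
        """Serialise the entry as Turtle using VoID and Dublin Core terms."""
        return (
            "@prefix void: <http://rdfs.org/ns/void#> .\n"
            "@prefix dcterms: <http://purl.org/dc/terms/> .\n\n"
            f"<{self.iri}> a void:Dataset ;\n"
            f"    void:sparqlEndpoint <{self.endpoint}> ;\n"
            f"    dcterms:source <{self.source}> ;\n"
            f'    dcterms:creator "{self.creator}" ;\n'
            f'    dcterms:publisher "{self.publisher}" ;\n'
            f"    void:triples {self.triples} .\n"
        )


# Placeholder values only; none of these come from the presentation.
entry = CatalogueEntry(
    iri="http://example.org/catalogue/dataset1",
    endpoint="http://example.org/dataset1/sparql",
    source="http://example.org/dataset1/dump",
    creator="Example Creator",
    publisher="Example Publisher",
    triples=1000,
)
turtle = entry.to_turtle()
print(turtle)
```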

Slide12

Slide13

Slide14

Query Engine

http://srvgal86.deri.ie:8000/graph/Granatum

Slide15

Visual & Graphical View

Slide16

SPARQL Endpoints returning results per query

Slide17

Runtimes taken by different queries (Max, Min, Average, Median)
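
The four summary statistics named on this slide can be computed from repeated timings of a query as in the sketch below. The `summarise` helper is hypothetical, and the timings are placeholder numbers chosen for illustration, not measurements from the presentation.

```python
from statistics import mean, median


def summarise(runtimes_ms):
    """Return max, min, average, and median of a list of runtimes."""
    return {
        "max": max(runtimes_ms),
        "min": min(runtimes_ms),
        "average": mean(runtimes_ms),
        "median": median(runtimes_ms),
    }


# Placeholder timings in milliseconds, NOT results from the presentation.
example_runtimes_ms = [120.0, 80.0, 100.0, 300.0]
summary = summarise(example_runtimes_ms)
print(summary)
```

Reporting the median alongside the average is useful here because a single slow endpoint can pull the average well above the typical runtime.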

Slide18