CRISP WP 17 1 / 2 Proposed Metadata Catalogue Architecture Document

Uploaded On 2020-06-17


Presentation Transcript

Slide1

CRISP WP 17 1 / 2

Proposed Metadata Catalogue Architecture Document

Slide2

Work package 17 - IT & DM: Metadata Management and Data Continuum

Objectives: choose and implement data management and metadata mining services, and establish an environment permitting a data continuum from raw data to publications across the participating Research Institutes (RIs): ILL, ESRF, SLHC and EuroFEL.

Task plan:

Evaluate and adapt metadata catalogues according to the RIs' requirements.

Deploy and integrate the metadata catalogue.

Prototype data mining on metadata services.

Bessone Nicola - ESRF

Slide3

Evaluate metadata catalogues: Use cases

Identified a list of requirements based on ILL, ESRF and DESY use cases.

Selected a list of the most suitable metadata catalogue systems on the market.

Matched the requirements against the features offered by the metadata catalogues.

Slide4

Evaluate metadata catalogues: Requirements

AAA

Authentication: modular integration of different authentication systems.

Authorization: customizable access control system.

Accounting: granular logging information levels.
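
A minimal sketch of what "modular integration of different authentication systems" could look like: each back end implements one small interface and the catalogue tries the registered back ends in order. All names here (`Authenticator`, `StaticPasswordAuthenticator`, `AuthService`) are hypothetical and not taken from any catalogue's real API; the accounting part simply uses Python's standard `logging` module.

```python
import hmac
import logging
from abc import ABC, abstractmethod
from typing import Dict, List, Optional

log = logging.getLogger("catalogue.accounting")


class Authenticator(ABC):
    """Hypothetical plug-in interface: one implementation per authentication system."""

    name: str = "base"

    @abstractmethod
    def authenticate(self, username: str, password: str) -> bool:
        """Return True if the credentials are valid for this back end."""


class StaticPasswordAuthenticator(Authenticator):
    """Toy back end holding credentials in memory (illustration only)."""

    name = "static"

    def __init__(self, users: Dict[str, str]) -> None:
        self._users = users

    def authenticate(self, username: str, password: str) -> bool:
        expected = self._users.get(username)
        if expected is None:
            return False
        # Constant-time comparison avoids leaking information through timing.
        return hmac.compare_digest(expected.encode(), password.encode())


class AuthService:
    """Tries each registered plug-in in order and logs the outcome (accounting)."""

    def __init__(self, plugins: List[Authenticator]) -> None:
        self._plugins = plugins

    def login(self, username: str, password: str) -> Optional[str]:
        for plugin in self._plugins:
            if plugin.authenticate(username, password):
                log.info("login ok user=%s via=%s", username, plugin.name)
                return plugin.name
        log.warning("login failed user=%s", username)
        return None
```

Adding another authentication system then only means writing one more `Authenticator` subclass and adding it to the list passed to `AuthService`; the log level of the `catalogue.accounting` logger controls how granular the accounting output is.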

Slide5

Evaluate metadata catalogues: Requirements

Metadata Model

The Core Scientific Metadata Model (CSMD) has already been developed at STFC.

Study

Investigation

Sample

Dataset

Datafile

Parameter
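
To make the hierarchy concrete, here is a simplified sketch of the entities listed above as Python dataclasses. The nesting (a study groups investigations, an investigation holds samples and datasets, a dataset holds datafiles, and parameters attach as name/value pairs) and every field name are assumptions made for illustration; this is not the actual CSMD schema.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Parameter:
    name: str
    value: Union[str, float]
    units: str = ""


@dataclass
class Datafile:
    name: str
    location: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Dataset:
    name: str
    datafiles: List[Datafile] = field(default_factory=list)
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Sample:
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    samples: List[Sample] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Study:
    name: str
    investigations: List[Investigation] = field(default_factory=list)


def count_datafiles(study: Study) -> int:
    """Walk the hierarchy and count every datafile below a study."""
    return sum(
        len(dataset.datafiles)
        for investigation in study.investigations
        for dataset in investigation.datasets
    )
```

Keeping parameters as generic name/value/units triples is one way to let each facility record its own metadata without changing the core entities.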

Slide6

Evaluate metadata catalogues: Requirements

Searching method

Fulfill users' search needs, being easy to use and to access (web).

Provide data mining to Facilities and Scientific management about data use/access/search/modification.

Cross-platform

Service API

Stable set of APIs, ideally programming-language agnostic.

Slide7

Evaluate metadata catalogues: Requirements

Sustainability

Open source

Project organization: actively maintained; release plan (documentation, update mechanism, backward compatibility); patch release process (security, bug fixes).

Cutting-edge technology

License: free of charge

Slide8

Evaluate metadata catalogues: Requirements

Data policy: dynamic authorization system.

Scalability & Performance: ILL hosts ~2,000 experiments per year, producing ~10,000 datasets. Other facilities possibly more.

Data ingestion: manual and automatic, plus possible harvesting (OAI-PMH).

Security: protect intellectual property.
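
The "possible harvest (OAI-PMH)" option could be sketched as follows. `ListRecords`, `metadataPrefix`, `resumptionToken` and the XML namespace come from the OAI-PMH 2.0 protocol; the repository URL used with the helper and the sample response are made up for illustration, and no network call is made.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     resumption_token: Optional[str] = None) -> str:
    """Build a ListRecords request URL.

    Per the protocol, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent again.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text: str) -> Tuple[List[str], Optional[str]]:
    """Return the record identifiers and the resumption token (None when done)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text or ""
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_element.text if token_element is not None and token_element.text else None
    return identifiers, token


# Made-up response fragment, shaped like an OAI-PMH ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:dataset-1</identifier></header></record>
    <record><header><identifier>oai:example.org:dataset-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would call `list_records_url`, fetch the page, pass the body to `parse_list_records`, and repeat with the returned token until it comes back as `None`.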

Slide9

Evaluate metadata catalogues: Metadata catalogue systems

ICAT

DSpace

Fedora

CKAN

Invenio

Tardis

ISPyB

iRODS

SRB-MCAT

MS Zentity

Slide10

Evaluate metadata catalogues: Selection result

Different solutions have been explored; amongst them, ICAT appears to be the only one that currently fits the Data Model requirements. This is the key element for a successful implementation in a reasonable time frame.

Slide11

Evaluate metadata catalogues: ICAT

Authentication plug-in

Rule-based authorization mechanism

Flexible metadata model

Search method: full-text, numerical and string search, and SQL-like query syntax

Set of APIs (Java and Python)

Configurable database (Oracle, PostgreSQL and MySQL)

Federated search via TopCAT

Core Scientific Meta-Data Model (CSMD)
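
As an illustration of what a rule-based authorization mechanism can look like, the sketch below grants access when any rule matches one of the user's groups, the entity type and the requested operation. The `Rule` structure, the group names and the single-letter operation codes are assumptions made for this example, not ICAT's actual rule format.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: members of `group` may perform `operations` on `entity`."""
    group: str
    entity: str                 # e.g. "Dataset"; "*" matches any entity type
    operations: FrozenSet[str]  # subset of {"C", "R", "U", "D"} (create/read/update/delete)


def is_allowed(rules: Iterable[Rule], user_groups: Iterable[str],
               entity: str, operation: str) -> bool:
    """Deny by default; allow as soon as one rule matches."""
    groups = set(user_groups)
    return any(
        rule.group in groups
        and rule.entity in ("*", entity)
        and operation in rule.operations
        for rule in rules
    )


RULES: List[Rule] = [
    Rule("data-managers", "*", frozenset("CRUD")),
    Rule("visitors", "Dataset", frozenset("R")),
]
```

Because rules are plain data, they can be added or removed at run time without touching the checking code, which is one way to meet the "dynamic authorization" requirement.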

Slide12

Evaluate metadata catalogues: ICAT

Plug-in for DAWN/Mantid

Licence: FreeBSD

Web interface: TopCAT

In use at 11+ RIs

Slide13

Evaluate metadata catalogues: ICAT

Work in progress:

Improve web interface (TopCAT)

Possibility to harvest (OAI-PMH)

Installation process

Synonym mechanism

Integration with Umbrella authentication

Slide14

Deploy and integrate ICAT: ESRF - Pilot

[Diagram. Components shown: Spec (three instances); Tomo XML; TomoDB (DB), labelled "Actual TomoDB metadata collect structure"; Tomo to ICAT XML converter; ICAT XML; ICAT XML ingest; SMIS; SMIS to ICAT ingester; ICAT (ICAT API, RDBMS, Web Service API). Steps are numbered 1 to 3.]
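
The "Tomo to ICAT xml converter" step in the pilot could be sketched as a small tag-mapping transform. Neither XML layout is given in the slides, so every element name below (`tomo`, `scanName`, `icatdata`, `dataset`, and so on) is a placeholder and not the real TomoDB or ICAT ingest schema.

```python
import xml.etree.ElementTree as ET

# Placeholder mapping from assumed Tomo XML tags to assumed ICAT XML tags.
TAG_MAP = {
    "scanName": "name",
    "proposal": "investigation",
    "sampleName": "sample",
}


def tomo_to_icat(tomo_xml: str) -> str:
    """Convert one (assumed) Tomo XML record into one (assumed) ICAT dataset record."""
    source = ET.fromstring(tomo_xml)
    root = ET.Element("icatdata")
    dataset = ET.SubElement(root, "dataset")
    for child in source:
        if child.tag in TAG_MAP:
            # Known field: copy it under its ICAT-side name.
            ET.SubElement(dataset, TAG_MAP[child.tag]).text = (child.text or "").strip()
        else:
            # Unknown field: keep it as a generic name/value parameter.
            parameter = ET.SubElement(dataset, "parameter", name=child.tag)
            parameter.text = (child.text or "").strip()
    return ET.tostring(root, encoding="unicode")


SAMPLE_TOMO = """<tomo>
  <scanName>scan_0001</scanName>
  <proposal>demo-proposal</proposal>
  <energy>17.5</energy>
</tomo>"""
```

Calling `tomo_to_icat(SAMPLE_TOMO)` returns the converted record as a string that a separate ingest step could then load; unmapped fields are preserved as parameters rather than dropped.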

Slide15

Deploy and integrate ICAT: ESRF - future

[Diagram. Components shown: Spec sessions (Spec, Spec) and a New Sequencer under the NEW beamline control system; Experiment metadata Management; Scientist controlling the Experiment; Data Manager; WEB Interface; ICAT (ICAT API, RDBMS, Web Service API); SMIS (SMIS API, RDBMS, Web Service API).]

Slide16

Deploy and integrate ICAT: ILL

Data policy published in Dec 2011

Implementation: Oct 2012

ICAT deployment: Dec 2012

Data ingestion ongoing since Nov 2012

Slide17

Future work

Complete the deployment (ingestion) at the participating facilities.

Data mining

Collect use cases from the different facilities.

Currently all use cases are technically simple (no request for correlation, for instance).

Work on the search engine (Lucene).

Reporting
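
A "technically simple" data-mining use case of the kind mentioned above could be as small as counting catalogue events. The sketch below assumes a hypothetical accounting log of `(user, action, dataset_id)` tuples; the event format and the sample data are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical accounting events: (user, action, dataset_id).
Event = Tuple[str, str, str]


def usage_report(events: Iterable[Event]) -> Dict[str, Counter]:
    """Count events per action and per dataset; no correlation between fields."""
    per_action: Counter = Counter()
    per_dataset: Counter = Counter()
    for _user, action, dataset_id in events:
        per_action[action] += 1
        per_dataset[dataset_id] += 1
    return {"per_action": per_action, "per_dataset": per_dataset}


EVENTS: List[Event] = [
    ("alice", "search", "ds-1"),
    ("alice", "download", "ds-1"),
    ("bob", "search", "ds-2"),
]
```

`usage_report(EVENTS)["per_dataset"].most_common()` then gives a ranked list of the most used datasets, which is the sort of figure a reporting page could display.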
