EUDAT AAI for a Collaborative Data Infrastructure

Uploaded On 2020-10-01


Johannes Reetz, EUDAT, VAMP workshop, Helsinki, 30 Sep 2013



Presentation Transcript

Slide1

EUDAT AAI for a Collaborative Data Infrastructure

- Challenges and Approaches -

Johannes Reetz, EUDAT

VAMP workshop, Helsinki, 30 Sep 2013

Slide2

The CDI concept: Collaborative Data Infrastructure

Trust
Data Curation
Data Generators
Users
Common Data Services
Community Support Services

User-focused functionality, data capture & transfer, VREs

Data discovery & navigation, workflow creation, annotation, interpretability

Persistent storage, identification, authenticity, workflow execution, mining

Slide3

Initially six research communities on board

EPOS: European Plate Observatory System
CLARIN: Common Language Resources and Technology Infrastructure
ENES: Service for Climate Modelling in Europe
LifeWatch: Biodiversity Data and Observatories
VPH: The Virtual Physiological Human
INCF: International Neuroinformatics

All share common challenges:

Reference models and architectures
Persistent data identifiers
Metadata management
Distributed data sources
Data interoperability

Slide4

Communities and Data Centers

Identifying basic requirements

Identify commonalities, common data services

Slide5

What community users see …

Today

community data

Community Layer

Community-specific authentication, authorization & single sign-on

Community portal, single credential type

Slide6

What community users see …

Tomorrow

common metadata exploration
common data stage-in and stage-out services
data services for the long-tail data, also from citizen scientists
common replication services with access to distributed storage

Unified Authentication, Authorization & Single Sign-On

community data
community data
Other very useful data

Various community portals, different credential types

EUDAT portal, for non-affiliated users, many credential types

Slide7

from: Analysis of the FIM doc (v0.7, L. Florio et al. 2013)

#1 User friendliness (high)
#2 Browser & non-browser federated access (high)
#3 Multiple technologies with translators including dynamic issue of credentials (medium)
#4 Bridging communities (medium)
#5 Implementations based on open standards and sustainable with compatible licenses (high)
#6 Different Levels of Assurance with provenance (high)
#7 Authorisation under community and/or facility control (high)
#8 Attributes must be able to cross national borders (high)
#9 Well defined semantically harmonised attributes (medium)
#10 Flexible and scalable IdP attribute release policy (medium)

EUDAT supports these requirements, but emphasizes #3 (high), #4 (high) and #9 (high)

Slide8

EUDAT Sites

general data centres, (replica) storages

community centres, repositories

Slide9

Safe Replication Service

Robust, safe and highly available data replication service for small- and medium-sized repositories

To guard against data loss in long-term archiving and preservation
To optimize access for users from different regions
To bring data strategically closer to systems for powerful compute-intensive analysis

PIDs are used to keep track of locations and can provide attributes

[Diagram: EUDAT CDI domain of registered data: a community repository and three data center stores, with PIDs and policy rules]
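The idea that PIDs keep track of where the copies of an object live, so that a user can be pointed to a nearby copy, can be sketched as follows. This is a minimal illustration only: the class `ReplicaRecord`, the region labels, the PID string and the `example.org` URLs are all hypothetical and are not part of the EUDAT services.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReplicaRecord:
    """Storage locations known for one persistent identifier (hypothetical model)."""
    pid: str
    # Maps a site or region label to the URL of the copy held there.
    locations: Dict[str, str] = field(default_factory=dict)


def replicate(record: ReplicaRecord, site: str, url: str) -> None:
    """Record that a copy of the object is now also stored at `url`."""
    record.locations[site] = url


def resolve(record: ReplicaRecord, user_site: str, origin_site: str) -> str:
    """Return the copy at the user's site if one exists, else the original."""
    return record.locations.get(user_site, record.locations[origin_site])


# Hypothetical usage: one original in a community repository, one replica.
record = ReplicaRecord(pid="example/0001")
replicate(record, "community-repo", "https://repo.example.org/obj/0001")
replicate(record, "north", "https://dc-north.example.org/obj/0001")
```

A user at the hypothetical site `north` would be sent to the replica there, while users elsewhere fall back to the original in the community repository.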

Slide10

Use Case: CLARIN – Safe Replication

EPIC PID registry

Slide11

Safe Replication “islands”

[Diagram of community centres (repositories) and general data centres (replica storages). Labels: VPH / VIP; diXa; INCF; EPOS / PP WG7; CLARIN / Replix; ENES / CMIP5, IPCC-AR5; CLARIN / CUNI; EPOS / Orpheus; NeuGrid]

Slide12

Data Staging Service

Support researchers in transferring large data collections from EUDAT storage to HPC facilities

Reliable, efficient, and easy-to-use tools to manage data transfers

Provide the means to ingest computational results into the repository via the EUDAT infrastructure

[Diagram: EUDAT CDI domain of registered data: two data center stores; PRACE with two HPC systems]

Slide13

EUDAT Services (1)

Safe Replication Service
Replicating Data Objects (DO) from a Repository to Replica Storages
Repository & Replica Storage belong to separate administrative zones
Registration of Original DO and Replica

PID / object identifier Service
Create DO handles
Manage / maintain DO handles
Resolve DO handles

Data Staging Service
Replication of data from the domain of registered data (Stage-Out)
Replication of data objects into the domain of registered data (Stage-In)
Replication of not-registered Data Objects between scratch storages

Slide14

Service specific actors/actions (1)

Safe Replication Service
Repository Data Manager replicates
Replica Storage Manager registers DOs
1) (Community) users access data via the repository
2) Users access data via the replica storage

PID (Handle) Service
Repository Data Manager: creates/manages primary object handle
Replica Storage Manager: creates/manages secondary object handles
Users and others resolve the handles (PIDs) to the location of the physical storage

Data Staging
Users access and fetch data from either the repository or the replica storage
Users ingest new data into the repository

[Diagram: EUDAT CDI domain of registered data: a community repository and three data center stores, with PIDs and policy rules]
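The division of roles in the PID (Handle) Service listed above can be sketched with a small in-memory registry: one party creates the primary handle, another adds secondary handles for replicas, and anyone can resolve a handle to a storage location. This is an illustrative sketch, not the Handle System or EPIC API; the class `HandleRegistry`, its method names and the prefix used are assumptions made for the example.

```python
import uuid
from typing import Dict, List


class HandleRegistry:
    """Toy handle registry (hypothetical; not the real Handle System API)."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self._location: Dict[str, str] = {}           # handle -> storage URL
        self._secondaries: Dict[str, List[str]] = {}  # primary -> secondary handles

    def _mint(self, url: str) -> str:
        # Create a new opaque handle under the registry prefix.
        handle = f"{self.prefix}/{uuid.uuid4()}"
        self._location[handle] = url
        return handle

    def create_primary(self, url: str) -> str:
        """Role of the Repository Data Manager: register the original object."""
        handle = self._mint(url)
        self._secondaries[handle] = []
        return handle

    def create_secondary(self, primary: str, url: str) -> str:
        """Role of the Replica Storage Manager: register a replica of `primary`."""
        if primary not in self._secondaries:
            raise KeyError(f"unknown primary handle: {primary}")
        handle = self._mint(url)
        self._secondaries[primary].append(handle)
        return handle

    def resolve(self, handle: str) -> str:
        """Role of users and others: look up the physical storage location."""
        return self._location[handle]

    def replicas_of(self, primary: str) -> List[str]:
        """Return the secondary handles registered for a primary handle."""
        return list(self._secondaries[primary])
```

Keeping the primary-to-secondary relation in the registry is one possible design; it lets a client that only knows the primary handle discover all registered replicas.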

Slide15

Simple Store for “long-tail” data and the citizen scientists

Allow registered users to upload “long-tail” data into the EUDAT store
Enable sharing objects and collections with other researchers
Utilise other EUDAT services to provide reliability and data retention
PIDs are assigned to uploaded DOs

[Diagram: EUDAT CDI domain of registered data: Simple Store portal (simple upload, simple metadata, PID registration) and three data center stores]

Slide16

Joint Metadata Service

Definition of the data sets as objects for entitlement
Find and define collections of scientific data, generated either by various communities or via EUDAT services (e.g. facetted search)
Access those data collections through the references given in the metadata to the relevant data stores

[Diagram: EUDAT CDI domain of registered data: metadata portal, two data center stores and two community repositories]

Slide17

EUDAT Services (2)

Simple Store Service
Repository for registered data with metadata for sharing
Digital objects are registered (handles are assigned)
Fragmented user group: many communities and “citizen scientists” are contributing and retrieving data

EUDATbox Service
Temporary shareable storage space for data, not necessarily registered
User deposits data, not necessarily with metadata
Not a homogeneous user group: many communities, “citizen scientists”

(Joint) Metadata Service
Metadata from various repositories are harvested and collected
Metadata exploration, facetted search: result sets define the data set for entitlement
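How a facetted search over harvested metadata can yield a result set that then serves as the data set for entitlement can be sketched as below. The record fields (`pid`, `community`, `discipline`), the repository names and the function names are hypothetical; real harvesting would typically go through a protocol endpoint rather than in-memory lists.

```python
from typing import Dict, Iterable, List, Set

Record = Dict[str, str]


def harvest(repositories: Dict[str, Iterable[Record]]) -> List[Record]:
    """Collect records from every repository, tagging each with its source."""
    collected: List[Record] = []
    for name, records in repositories.items():
        for rec in records:
            collected.append({**rec, "repository": name})
    return collected


def faceted_search(records: Iterable[Record], **facets: str) -> List[Record]:
    """Keep the records whose fields match every requested facet value."""
    return [r for r in records
            if all(r.get(key) == value for key, value in facets.items())]


def entitlement_set(results: Iterable[Record]) -> Set[str]:
    """The PIDs in a result set: the objects access could be granted on."""
    return {r["pid"] for r in results}


# Hypothetical harvested content from two repositories.
catalogue = harvest({
    "repo-a": [{"pid": "a/1", "community": "CLARIN", "discipline": "linguistics"}],
    "repo-b": [{"pid": "b/1", "community": "ENES", "discipline": "climate"},
               {"pid": "b/2", "community": "CLARIN", "discipline": "linguistics"}],
})
```

Calling `entitlement_set(faceted_search(catalogue, community="CLARIN"))` would give the set of PIDs that a subsequent authorisation decision refers to.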

Slide18

Service specific actors/actions (2)

Simple Store (Repository)
Users deposit data and metadata
Users search for and access data
Repository Storage Manager (needs to create the handle service)

EUDAT box
User deposits data
User shares data by inviting other users
User accesses data

(Joint) Metadata Service
Manager harvests metadata from (many) repositories, also via the replica site

[Diagram: EUDAT CDI domain of registered data: metadata portal, two data center stores and two community repositories]

Slide19

Attribute Provider

AuthZ is either community-managed, or the attributes provided by the user's home IdP are reused

Different types of Identity Providers (AuthN): IdP A, IdP B, IdP C, IdP D
Credential types: eID, shib, OpenID, x.509
Identity credential conversion: consolidated credentials
Communities: attribute providers AtP 1, AtP 2, AtP 3
Zoned credential conversion service: unique user IDs, project-wise mapped to attribute-based access control information
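The credential conversion idea on this slide, in which credentials from different identity providers are consolidated under one unique user ID and that ID is mapped project-wise to access control attributes, can be sketched as follows. Everything here is an assumption made for illustration: the class `CredentialConverter`, the `user-0001` ID format, the attribute string and the example identities do not come from the EUDAT implementation.

```python
from itertools import count
from typing import Dict, Optional, Set, Tuple


class CredentialConverter:
    """Maps external credentials to one internal user ID (illustrative only)."""

    def __init__(self) -> None:
        self._next = count(1)
        # (credential type, external identity) -> internal user ID
        self._user_of: Dict[Tuple[str, str], str] = {}
        # (internal user ID, project) -> access control attributes
        self._attrs: Dict[Tuple[str, str], Set[str]] = {}

    def register(self, idp_type: str, external_id: str,
                 link_to: Optional[str] = None) -> str:
        """Return the user ID for a credential, creating or linking it if new."""
        key = (idp_type, external_id)
        if key not in self._user_of:
            self._user_of[key] = link_to or f"user-{next(self._next):04d}"
        return self._user_of[key]

    def grant(self, user_id: str, project: str, attribute: str) -> None:
        """Attach an attribute to a user within one project."""
        self._attrs.setdefault((user_id, project), set()).add(attribute)

    def attributes(self, user_id: str, project: str) -> Set[str]:
        """Return a copy of the user's attributes for the given project."""
        return set(self._attrs.get((user_id, project), set()))


# Hypothetical usage: two credentials of the same person are consolidated.
converter = CredentialConverter()
alice = converter.register("shib", "alice@uni.example")
converter.register("x.509", "CN=Alice Example", link_to=alice)
converter.grant(alice, "project-a", "replica:read")
```

Whichever credential the user presents, the lookup ends at the same internal ID, so attributes need to be maintained only once per project.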

 

Slide20


Slide21

EUDAT AAI-TF approach


ConSec: Contrail Security code

Slide22

The figure shows the high-level view: SAML is used for authentication (possibly translated from OpenID, not shown); OAuth (version 2) is used for delegation (internally, within the federation); and XACML is used for access control policies. Control (in the workflow sense) flows roughly from left to right and from top to bottom.

Internally, an X.509 certificate with authorisation attributes is generated; this certificate is also managed internally and is thus not usually exposed to (or accessible by) the user. Its purpose is threefold: (a) to ensure that non-HTTP services such as GridFTP and iRODS can be accessed (i.e., outside the OAuth delegation workflow), (b) to allow fine-grained authorisation, and (c) to allow command-line access to services for expert users.

In OAuth, the authorisation server remains the central hub where access is delegated. However, since EUDAT needs finer-grained access control, the generated X.509 certificate also carries authorisation attributes (see below), which are checked against predefined access policies.

The system deployed and used by EUDAT was built by the Contrail project, so we are reusing the Contrail Security (ConSec) code and tools developed within this pilot project. This decision was based on an evaluation of options, in which ConSec promised most of the features required by the EUDAT communities. EUDAT is currently running a ConSec authentication infrastructure for integration at FZJ. EUDAT is currently not running an authorisation infrastructure.
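The step in which authorisation attributes carried by the internal credential are checked against predefined access policies can be sketched as a deny-by-default attribute check. This is a simplified stand-in: it uses neither real X.509 certificates nor XACML, and the types `InternalCredential` and `Policy`, the attribute strings and the resource names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class InternalCredential:
    """Stand-in for the internally generated credential with attributes."""
    subject: str
    attributes: FrozenSet[str]


@dataclass(frozen=True)
class Policy:
    """A predefined rule: attributes required for an action on a resource."""
    resource: str
    action: str
    required: FrozenSet[str]


def is_permitted(cred: InternalCredential, resource: str, action: str,
                 policies: Iterable[Policy]) -> bool:
    """Deny by default; permit if any matching policy's attributes are all held."""
    return any(p.resource == resource
               and p.action == action
               and p.required <= cred.attributes
               for p in policies)


# Hypothetical policies and credential.
policies = [
    Policy("irods:/zone/project-a", "read", frozenset({"project-a:member"})),
    Policy("irods:/zone/project-a", "write",
           frozenset({"project-a:member", "project-a:manager"})),
]
cred = InternalCredential("user-0001", frozenset({"project-a:member"}))
```

With these inputs the member may read but not write, and any resource without a matching policy is refused.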