Friday, March 29, 2013 - PowerPoint Presentation

danika-pritchard
Uploaded On 2019-11-23

Presentation Transcript

Friday, March 29, 2013
ADMIRe: Data catalogues and the data repository
ADMIRe, JISC MRD
Dr Tom Parsons, March 2013

A world-class university
- One of the world’s top 100 universities, Nottingham is recognised globally for ground-breaking research and teaching excellence.
- 40,000 students from more than 150 countries, two overseas campuses and strong links with universities around the world
- Heavily focused on research: Medical & Health Sciences, Sciences, Engineering, Social Sciences and Arts
- Large research income (£100m), primarily from RCUK, UK/EU government, commercial sources and charities

RDM policy
“1.5. The University will provide mechanisms and services for storage, backup, registration, deposit, retention and preservation of research data assets in support of current and future access, during and after completion of research projects.”
Key priorities for ADMIRe:
- Is the current provision good enough?
- Where are the gaps?
- What do we need to provide?

Understanding requirements
Approaches:
- Survey (summer 2012)
- Focus groups (November 2012)
- Interviews (May 2012 onwards)
- Mixture of ADMIRe, in-house, JISC MRD & Sero
Outputs: service model, detailed requirements catalogue, logical models & prototype
Institutional requirements: “Enterprise Architecture compliant”, use and integrate with existing systems

Survey results: Types of data

Survey results: Data storage

Survey results: Metadata …

Sharing data?

Survey results: Total research data estimates
- From the survey’s 366 responses
- 75 GB average (mean/frequency)

Total research data estimates
- 75 GB average × approx. number of PIs & post-grads (4,000) = 300 TB (±90%)
- Large number of unknowns
- A large amount of data, a large number of files and a good case for managing it
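The estimate above can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: the function and constant names are invented here, and it assumes decimal units (1 TB = 1,000 GB), which is what the slide's 300 TB figure implies.

```python
# Illustrative reconstruction of the slide's estimate; names are hypothetical.
MEAN_GB_PER_RESEARCHER = 75   # survey average
RESEARCHERS = 4000            # approx. number of PIs and post-grads
UNCERTAINTY = 0.90            # the slide's +-90%
GB_PER_TB = 1000              # assumption: decimal units


def estimate_total_tb(mean_gb, researchers, uncertainty):
    """Return (low, central, high) estimates of total research data in TB."""
    central = mean_gb * researchers / GB_PER_TB
    return central * (1 - uncertainty), central, central * (1 + uncertainty)


low, central, high = estimate_total_tb(
    MEAN_GB_PER_RESEARCHER, RESEARCHERS, UNCERTAINTY
)
print(f"{low:.0f} TB to {high:.0f} TB (central estimate {central:.0f} TB)")
# prints: 30 TB to 570 TB (central estimate 300 TB)
```

Read this way, the ±90% band spans roughly 30 TB to 570 TB, which is why the slide stresses the large number of unknowns.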

Focus groups to understand more
- Five Faculty-based focus groups (30 people in total)
- Based upon California Digital Library model

Active data

Archive data

Preservation activities
Columns R, S and A are the actors.

No. | Function   | R | S | A | Description | Req. | Freq.
1   | Tag        |   |   |   | Enter metadata describing a bag of research data assets | M | M
2   | Bag        |   |   |   | Zip the data files up in a bag | C | M
3   | Transfer   | + |   |   | Transfer a bag to archival storage | C | M
4   | Ingest     | + |   |   | Ingest a bag into storage | C | M
5   | Update     |   |   |   | Update (enhance, correct) metadata for a stored bag | O | L
6   | GetDOI     |   |   |   | Get (public, private) DOIs for designated assets | C | L
7   | Publish    | + | + |   | Publish assets appropriately on landing pages | C | L
8   | Relocate   |   |   |   | Relocate assets and update locators | O | L
9   | Search     |   |   |   | Search for assets by keyword or field | M | H
10  | Access     |   |   |   | Access metadata and data according to permissions | M | M
11  | Notify     | + | + | + | Notify actors automatically about data events | O | P
12  | Annotate   |   |   |   | Create notes about a bag or its contents | O | L
13  | Check      |   | + |   | Check (verify) that the contents of a bag are in order | M | P
14  | Report     |   |   |   | Run reports on aspects of the system (DOI, bag, user) | O | L
15  | Administer |   |   |   | Administer permissions and system parameters | M | M
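To make the Bag (2) and Check (13) functions concrete, here is a minimal sketch using only the Python standard library. It is not the ADMIRe tooling and not a compliant BagIt implementation: it only borrows BagIt's convention of a `data/` payload directory and a `manifest-sha256.txt` checksum file, and the function names `make_bag` and `check_bag` are hypothetical.

```python
import hashlib
import io
import zipfile

MANIFEST = "manifest-sha256.txt"  # file name borrowed from the BagIt convention


def make_bag(files):
    """Zip {name: bytes} under data/ and add a SHA-256 manifest. Returns zip bytes."""
    buffer = io.BytesIO()
    lines = []
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as bag:
        for name in sorted(files):
            path = f"data/{name}"
            bag.writestr(path, files[name])
            lines.append(f"{hashlib.sha256(files[name]).hexdigest()}  {path}")
        bag.writestr(MANIFEST, "\n".join(lines) + "\n")
    return buffer.getvalue()


def check_bag(bag_bytes):
    """Return True if every payload file is listed and matches its checksum."""
    with zipfile.ZipFile(io.BytesIO(bag_bytes)) as bag:
        names = set(bag.namelist())
        listed = set()
        for line in bag.read(MANIFEST).decode("utf-8").splitlines():
            if not line:
                continue
            digest, path = line.split("  ", 1)
            if path not in names:
                return False  # manifest names a file that is missing
            if hashlib.sha256(bag.read(path)).hexdigest() != digest:
                return False  # content changed since bagging
            listed.add(path)
        payload = {n for n in names if n.startswith("data/")}
        return payload == listed  # no unlisted payload files


bag = make_bag({"results.csv": b"sample,value\n1,0.5\n"})
print(check_bag(bag))
```

Storing checksums alongside the payload is what lets a later Check run detect silent corruption after Transfer or Ingest without needing access to the original files.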

Mapping requirements

Where are we now?

Solutions (each listed with Description, Scope, Interfaces/Integrations and Direct Users)

Data Retention Platform
  Description: A storage platform that enables storage of “unstructured” data files. BPM Metastorm frontend.
  Scope: Storage of files and very basic metadata (file type, size, retention period, user)
  Interfaces/Integrations: AD to support access. (Note that Open Access will be supported by providing a persistent account, used by the Research data web site server, that has read-only access to all “Open” data sets.)
  Direct Users: Researchers

Research data search and retrieve web site
  Description: Web site. Expected to be CMS or possibly SharePoint
  Scope: Web site with relevant information and screens to search and return results
  Interfaces/Integrations:
    1. Data Retention Platform via REST to enable http(s) data transfer.
    2. FAST (embedded function) to allow search from a web page.
    3. Equella (API) to expose metadata onto search results.
    4. Active Directory/LDAP to authenticate file access
  Direct Users: Those searching for data sets

Equella
  Description: Metadata database
  Scope: Stores metadata
  Interfaces/Integrations: See Metastorm, FAST and Research Web Site
  Direct Users: N/A

FAST
  Description: Search engine
  Scope: Provides search results and rich search functionality on the metadata
  Interfaces/Integrations:
    1. Potential federation to Primo
    2. Crawl of Equella
  Direct Users: Anyone

Baggit
  Description: File collection tool
  Scope: Tool to assist researchers in selecting and bringing files into a collection
  Interfaces/Integrations: Linked to from Metastorm
  Direct Users: PI

Solutions, continued (each listed with Description, Scope, Interfaces/Integrations and Direct Users)

DMP Online
  Description: Online tool providing support for creating a Data Management Plan that is managed to ensure Research Council requirements are met
  Scope: Used to create Data Management Plan
  Interfaces/Integrations:
    1. Metastorm will link this within curation workflow
    2. Metastorm will take the XML output of this and read key fields directly to automate some metadata creation in Equella
    3. Metastorm will save the output file of this tool
  Direct Users: PI

DOI
  Description: Online tool for creating a unique digital object identifier
  Scope: Workflow to fork out to this system to allow researcher to create a persistent object identifier.
  Interfaces/Integrations: See Metastorm
  Direct Users: PI

Active File Services
  Description: File services primarily for storage of active (i.e. not curated) files
  Interfaces/Integrations: The source of files for curation (“Bagging”). Selectable by browsing using Baggit tool.
  Direct Users: PI

“Other Repository”
  Interfaces/Integrations: Sometimes selectable by browsing using Baggit tool as the source of files for curation (“Bagging”). However, these may be databases or alternative repositories that are used instead. If used, and where possible, the DOI will point to these.
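The idea of reading key fields from the DMP Online XML output to pre-fill metadata can be sketched as follows. The XML layout, element names and the `extract_key_fields` helper are all hypothetical, invented for illustration; they are not the real DMP Online export format, nor the Metastorm or Equella interfaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical plan export; the real DMP Online XML layout is not shown in the slides.
SAMPLE_DMP_XML = """<plan>
  <title>Example project</title>
  <principal_investigator>A. Researcher</principal_investigator>
  <funder>Example Research Council</funder>
  <retention_years>10</retention_years>
</plan>"""

# Assumed set of "key fields" worth copying into the metadata record.
KEY_FIELDS = ("title", "principal_investigator", "funder", "retention_years")


def extract_key_fields(xml_text):
    """Return a dict of the key fields that are present and non-empty."""
    root = ET.fromstring(xml_text)
    record = {}
    for field in KEY_FIELDS:
        value = root.findtext(field)  # None when the element is absent
        if value is not None and value.strip():
            record[field] = value.strip()
    return record


print(extract_key_fields(SAMPLE_DMP_XML))
```

Anything missing from the plan is simply left out of the record, so a researcher would only need to type in what the plan did not already capture.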

ADMIRe Phasing: Drop 1 (to June 2013)
Objective: Deliver key functions but without over-integration
Deliverables:
1. Instructions and links on web site on how and why to use DMP Online
2. Instructions and links on web site on how and why to use DOI
3. Implementation (but not integration) of Baggit for Research users
4. Delivery of Metadata in Equella
   - Including instructions and links on web site on how and why to use
5. Creation of Research Data Search Page
   - Including instructions and links on web site on how and why to use
   - Implementation of FAST search crawl
   - Embed of FAST in web page
   - Delivery of Results page to include relevant information
6. Metastorm development that:
   - Creates User (PI Researcher) interface to Equella
   - Provides fields to add all metadata into Equella, including Research Project Information, Subject Specific Information, Technical Metadata
   - Allows Researcher to choose when a page is searchable

ADMIRe Phasing: Drop 2 (to Dec 2013)
Deliverables:
1. Delivery of Retention platform
   - Delivered outside of ADMIRe project
2. Delivery of Open Access Platform
   - (Subset of Retention platform)
3. Definition and delivery of:
   - End-to-end workflow automation and integration for data management process with a vision of “Input Once”
   - Integrations of Baggit, Agresso Awards Management, DMP Online, DOI
4. Definition and delivery of a report for Research Councils that:
   - Confirms project adherence (at project close) to funding requirements for data management and access
   - Enables non-conformance to be addressed

Reusable outputs
- Focus groups/interview formats
- Requirements catalogue
- Use cases
- Survey: questions, write-up, etc.
- Software? No…

Questions?
tom.parsons@nottingham.ac.uk
ADMIRe Project Manager