Data Preservation in HEP: Overview

Uploaded On 2023-09-24


Open Data & Round Table on Data Access: What, Why, How. Jamie.Shiers@cern.ch, November 2015. International Collaboration for Data Preservation and Long Term Analysis in High Energy Physics.



Presentation Transcript

1. Data Preservation in HEP: Overview
Open Data & Round Table on Data Access
What, Why, How
Jamie.Shiers@cern.ch, November 2015
International Collaboration for Data Preservation and Long Term Analysis in High Energy Physics

2. CERN Circular Colliders + FCC
[Figure: timeline from 1980 to 2035 showing the design, prototyping, construction and physics phases of LEP, LHC, HL-LHC and a future collider, with a 20-year span marked.]
HEP has a long history of planning, financing and executing multi-decade projects.

3. 2020 Vision for LT DP in HEP
- Long-term – e.g. FCC timescales: disruptive change
- By 2020, all archived data – e.g. that described in DPHEP Blueprint, including LHC data – easily findable, fully usable by designated communities with clear (Open) access policies and possibilities to annotate further
- Best practices, tools and services well run-in, fully documented and sustainable; built in common with other disciplines, based on standards
- DPHEP portal, through which data / tools accessed
- “HEP FAIRport”: Findable, Accessible, Interoperable, Re-usable
- Agree with Funding Agencies clear targets & metrics

4. Use Cases – “all HEP”
- Bit preservation – basically OK (at CERN) but not a formal policy
- On the path to Certification of WLCG “digital repositories” (Tier0/Tier1)
- Preserve data, software, and know-how in the collaborations
- Foundation for long-term DP strategy
- Analysis reproducibility: Data preservation alongside software evolution
- Share data and associated software with (wider) scientific community
- Additional requirements:
- Storage, distributed computing
- Accessibility issues, intellectual property
- Formalising and simplifying data format and analysis procedure
- Documentation
- Open access to reduced data set to general public
- Education and outreach
- Continuous effort to provide meaningful examples and demonstrations
- Strategy and scope in policy documents for LHC collaborations: http://opendata.cern.ch/collection/data-policies
(8/6/2015, DPHEP Collaboration Workshop)

5.

6. LEP Timeline

| Date | Collider (e+e-) | Computing |
|------|-----------------|-----------|
| 1981 | Approved by Council | Card readers still exist! |
| 1983 | Civil Engineering starts | Computing at CERN in the LEP era published |
| 1988 | LEP Tunnel completed | Data Management project requested by experiments |
| 1989 | 1st beams, collisions, 1st and results | Was the s/w really ready? |
| 1992 | | LHC Computing starts; Mainframes replaced by Unix, later PCs |
| 1996 | LEP 2 (W pairs) starts | |
| 2000 | Final run of LEP | HEP gets bitten by Grid |

7. Data Preservation in High Energy Physics
The road to DPHEP
http://dphep.org
(13/11/15, DPHEP/ICFA)

8. Aspects of LT DP
A common approach across the main HEP labs worldwide, including:
- Data (bit preservation) – state of the art at exascale (1PB-10PB-100PB-1EB etc);
- Software (and environment) – combination of validation + virtualisation;
- Documentation (I would say “knowledge”) – digital library technologies + regular testing as part of training and data re-use
LEP – and other Colliders worldwide – allow us to “see into the future” and compare different options for LTDP
Expectation for LEP is that data will be usable (and used) until ~2030 – 3 decades after end of data taking! (Copy on disk + 2 on tape @ CERN!)
Data will (should) be available much longer; “resurrection” of HEP data + software has been demonstrated but requires significant motivation + effort

9. CAP Use Cases (I) (=know-how?)
- The person having done (part of) an analysis is leaving the collaboration and has to hand over the know-how to other collaboration members.
- A newcomer would like to join a group working on some physics subject.
- In a large collaboration, it may occur that two (groups of) people work independently on the same subject.
- There is a conflict between results of two collaborations on the same subject.
DOI: http://dx.doi.org/10.5281/zenodo.33693

10. CAP Use Cases (II)
- A previous analysis has to be repeated
- Data from several experiments, on the same physics subject, have to be statistically combined
- A working group or management member within a collaboration wishes to know who else has worked on a particular dataset, software piece or MC
- Presentation or publication is submitted for internal/collaboration review and approval: lack of comprehensive metadata
- Preparing for Open Data Sharing