Slide1
ProdSys2 Commissioning
Jose Enrique Garcia, IFIC-Valencia
on behalf of the ProdSys2 team
Slide2
What is ProdSys2?
JOSE ENRIQUE GARCIA NAVARRO
New distributed production framework: ProdSys2
DEfT: task request and task definition
JEDI: dynamic job definition and task execution
Integrated with PanDA (replaces Bamboo)
Engine for user analysis tasks
Works closely with Job Transforms & Rucio
Integration with monitoring: BigPanDA
from Kaushik De
Slide3
What is ProdSys2?
[Diagram; labels: DEfT, JEDI, PanDA, BigPanDA]
Slide4
ProdSys2 Team
Coordination: Kaushik De, Alexei Klimentov
DEfT:
MC production request and its processing: M. Borodin
Core SW and communication with JEDI: D. Golubkov
Web UI and Authentication: S. Belov, S. Gayazov
JEDI: Tadashi Maeno
Within the BigPanDA project:
ProdSys2 monitoring: Jaroslava Schovancova, Torre Wenaus, R. Mashinistov
Packaging, Software organization (SVN, github): J. Schovancova
Expertise, integration with other domains/areas: Alessandra Forti, Jose E. Garcia, Nurcan Ozturk, Camille Belanger-Champagne, Bruno Lenzi, David South, Andreu Pacheco, ...
Slide5
DEfT : Database Engine for Tasks
Production request handling interface for production of MC samples, DPDs or data reprocessing.
Requests are placed in a new web interface; the information is stored for review and submission.
Templates contain the parameters needed for task definitions, so that they can be applied easily.
Chain of steps (or slice): a series of step templates applied to input data.
Request: a series of chains.
Post-production interface including standard tasks performed by the production managers (abort, change priorities, …)
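The request / chain / step-template hierarchy described on this slide can be pictured with a minimal Python sketch. All class names, fields and the output naming scheme are hypothetical and chosen only for illustration; they are not the DEfT schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StepTemplate:
    """Reusable set of task parameters for one processing step (hypothetical fields)."""
    name: str
    parameters: dict[str, str] = field(default_factory=dict)


@dataclass
class Chain:
    """A series of step templates applied, in order, to one input dataset."""
    input_dataset: str
    steps: list[StepTemplate] = field(default_factory=list)

    def task_definitions(self) -> list[dict]:
        """Expand the chain into one task definition per step.

        Each step reads the output of the previous one.
        """
        tasks = []
        current_input = self.input_dataset
        for step in self.steps:
            output = f"{current_input}.{step.name}"  # illustrative naming only
            tasks.append({
                "step": step.name,
                "input": current_input,
                "output": output,
                "parameters": dict(step.parameters),
            })
            current_input = output
        return tasks


@dataclass
class Request:
    """A request is a series of chains."""
    chains: list[Chain] = field(default_factory=list)

    def task_definitions(self) -> list[dict]:
        """Collect the task definitions of every chain in the request."""
        return [task for chain in self.chains for task in chain.task_definitions()]
```

In this sketch a template is defined once and reused across chains, which is the convenience the slide attributes to templates.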
Slide6
DEfT : Database Engine for Tasks
Slide7
JEDI : Job Execution and Definition Interface
In production for user analysis since August.
New features available for production:
Dynamic job definition, lost file recovery, network-aware brokerage, log file merging, output merging at T2 before transferring back to T1, ...
Adding flexibility to allow users to specify how secondary datasets are sampled and job parameters are generated
Machinery to retry and merge jobs for event service
Boosting priorities of nearly finished tasks
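As an illustration of the last item, the sketch below raises the priority of a task once most of its jobs are done. The 90% threshold, the boost of 100 and the `Task` fields are assumptions made for this example; the slide does not say how JEDI decides that a task is nearly finished or by how much it boosts.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Minimal task record (hypothetical fields, not the JEDI task table)."""
    name: str
    priority: int
    jobs_total: int
    jobs_done: int


def boosted_priority(task: Task, threshold: float = 0.9, boost: int = 100) -> int:
    """Return the task priority, raised by `boost` for nearly finished tasks.

    A task counts as nearly finished when its done fraction is at least
    `threshold` but it still has jobs left to run.
    """
    if task.jobs_total == 0:
        return task.priority
    done_fraction = task.jobs_done / task.jobs_total
    if done_fraction >= threshold and task.jobs_done < task.jobs_total:
        return task.priority + boost
    return task.priority
```

The intent of such a rule is that the last few jobs of a task do not wait behind newer work, so the task can close sooner.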
Slide8
BigPanDA : PanDA Monitoring
Next generation of PanDA monitoring
Modular, easy to bring up a new project/VO
Clear separation between data access and visualization
Runs on top of Oracle or MySQL DB backends
Describes configuration and modules
Deployed with RPMs
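The separation between data access and visualization can be sketched as follows. This is a generic pattern under stated assumptions, not BigPanDA code: `JobSource`, `InMemoryJobSource` and `render_summary` are hypothetical names, and the in-memory backend stands in for an Oracle or MySQL query layer.

```python
from typing import Protocol


class JobSource(Protocol):
    """Data-access interface: the only thing the visualization layer depends on."""

    def job_counts_by_status(self) -> dict[str, int]:
        ...


class InMemoryJobSource:
    """Stand-in backend; a real one would query a database behind the same method."""

    def __init__(self, statuses: list[str]) -> None:
        self._statuses = list(statuses)

    def job_counts_by_status(self) -> dict[str, int]:
        counts: dict[str, int] = {}
        for status in self._statuses:
            counts[status] = counts.get(status, 0) + 1
        return counts


def render_summary(source: JobSource) -> str:
    """Visualization layer: formats counts without knowing where they came from."""
    counts = source.job_counts_by_status()
    return "\n".join(f"{status}: {n}" for status, n in sorted(counts.items()))
```

Swapping the database backend then means supplying another class with the same method, while the rendering code stays unchanged.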
Slide9
COMMISSIONING
Slide10
Schedule Run-2
[Timeline figure; labels: Run-2 Twiki, NOW, Commissioning, MC15]
Slide11
ProdSys2 Commissioning
[Diagram; labels: MC Production, Group Production, Data Reprocessing; data formats RAW, ESD, AOD, HIST; Merging]
Slide12
MC PRODUCTION
Slide13
Standard MC Production Chain
Event Generation :
Event generation includes single-step and double-step generators. Single-step generators run using only JOs; double-step generators also need input files. Both types have been tested successfully.
Pending : creation of the tarball needed to create the production e-tag (done using scripts)
Simulation :
Simulation transforms (old and new) have been tested successfully.
HITS Merging :
Merging : Currently HITS are merged to a defined number of events using a transform in a separate step. This method already works in the new system.
JEDI Merging : Final details to commission the JEDI internal merging are ongoing.
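Merging HITS files up to a defined number of events can be pictured as a simple grouping problem. The sketch below is an assumption about how such a plan might be built, not the merge transform itself: `plan_merge_jobs` is a hypothetical helper, and it closes a merge job as soon as the running event count reaches the target.

```python
def plan_merge_jobs(files: list[tuple[str, int]], target_events: int) -> list[list[str]]:
    """Group (file name, event count) pairs, in order, into merge jobs.

    Files are added to the current job until it holds at least
    `target_events` events; the last job may hold fewer.
    """
    if target_events <= 0:
        raise ValueError("target_events must be positive")

    jobs: list[list[str]] = []
    current: list[str] = []
    current_events = 0
    for name, n_events in files:
        current.append(name)
        current_events += n_events
        if current_events >= target_events:
            jobs.append(current)
            current, current_events = [], 0
    if current:  # leftover files that did not reach the target
        jobs.append(current)
    return jobs
```

For example, five input files of 100 events each with a target of 200 events give two full merge jobs and one leftover job with a single file.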
Slide14
Standard MC Production Chain
Digi+Reco :
Standard mc12 configuration has been used to test proper production of AODs.
AOD Merging :
Working in ProdSys2
Sample A finished successfully from EVNT to AODs
Slide15
Other MC Production Chains
Upgrade :
Transforms and tags are similar to standard MC production. The main issues in ProdSys came from the size of the output files, memory consumption and job length; these should be handled better by ProdSys2/JEDI.
Overlay :
New transforms (under development) will be needed to produce overlay in ProdSys2.
FTK :
Production involves a long chain and needs checking. With the standard chain validated, improvements should be made (FullChain transforms and JEDI merging).
Slide16
Request placing and post-processing
Request Interface :
Implemented : Change of processing parameters, start chain from any step, finding inputs, …
Pending :
Cloning Request/Task/Chain
Include processing parameters in Templates
List in JIRA : https://its.cern.ch/jira/browse/PRODSYS
Post-processing :
Production post-processing can be done from the web interface. It can be applied to tasks or chains, and requests are logged in JIRA and the panda logger.
Slide17
GROUP PRODUCTION
Slide18
Overall Status
NURCAN OZTURK
Two types of production have been tested or are being tested in ProdSys2.
Derivation (train) production as per the new analysis model for Run-2:
Workflow being implemented to run derivation production on input xAOD (high priority)
Workflow implemented to run derivation production on input NTUP_COMMON (full production not launched yet because of a metadata issue).
Standalone productions (Run-1 style; NTUP_COMMON, NTUP_TOP, NTUP_SUSY, NTUP_TRUTH, etc.):
A handful of these productions have been validated in ProdSys2 by the physics groups themselves and are ready to switch over to ProdSys2 (next slides).
Pending :
Output file merging (JEDI-internal merging enabled for derived outputs)
Replicating output to group disk for standalone productions (missing functionality between ProdSys2 and Rucio)
DEfT to be used by group contacts to place their standalone production requests directly into ProdSys2 (approval done by production managers)
Slide19
Workflows Validated
Not all NTUP types will be tested (similar workflows); ASG gave the green light for the switch-over to ProdSys2.
Slide20
Request Placing
DEfT works well both by uploading the task submission list files and by making the list files on-the-fly using the interface itself.
Tested by group production manager only
Pending :
Group names need to use the same naming convention as in the GLANCE system (metadata group).
Needs to be tested by a few group contacts after required features are added, being discussed in: https://its.cern.ch/jira/browse/PRODSYS-231. High priority, as DPD Savannah will migrate to JIRA and there are no plans to use JIRA for receiving production requests, only as a problem tracking tool.
Parameters required in task definitions should be finalized, for instance:
“lumiblock=yes”: JEDI is supposed to use this internally; lingering issue since last Software week.
“destination” handling: handled by ProdSys2 until the required functionality between ProdSys2 and Rucio is in place?
Slide21
DATA REPROCESSING
Slide22
Data Reprocessing
BRUNO LENZI
Reprocessing requests are somewhat different from MC
Can involve more than one Reco
Merging chain that uses some of the outputs of a previous chain
Implemented and tested successfully (although not extensively) in ProdSys2
Many problems encountered during DC14 with reco and merging software prevented further tests. System not thoroughly validated
Reprocessing wish list: https://twiki.cern.ch/twiki/bin/view/Atlas/ReprocessingProdSys2 including e.g. configuration of merging parameters, file exclusion list, etc.
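Two points from this slide, a merging chain that consumes only some outputs of a previous chain and a file exclusion list, can be combined in a short sketch. The function name, the output layout (format name mapped to file names) and the file names are hypothetical; the slide does not describe how ProdSys2 represents them.

```python
def select_merge_inputs(
    previous_outputs: dict[str, list[str]],
    formats_to_merge: list[str],
    excluded_files: set[str],
) -> dict[str, list[str]]:
    """Pick the inputs of a merging chain from the outputs of a previous chain.

    Only the requested formats are kept, and any file named in the
    exclusion list is dropped. Formats left with no files are omitted.
    """
    selected: dict[str, list[str]] = {}
    for fmt in formats_to_merge:
        files = [f for f in previous_outputs.get(fmt, []) if f not in excluded_files]
        if files:
            selected[fmt] = files
    return selected
```

With reco outputs in ESD, AOD and HIST, requesting only AOD and HIST and excluding one AOD file leaves the remaining AOD files and all HIST files as merge inputs.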
Slide23
Next Steps and Schedule
MC Production :
a. Validate the output of sample A in physics validation (September)
b. Exercise DC14 and MC15 chains to ensure readiness (September)
c. Start processing some production requests in the system (September)
d. Implement missing functionality for event generation and open fully to all requests (October)
e. Implement special cases (September to November)
f. Phase out ProdSys (November)
Group Production :
a. Launch full-scale derivation production, with JEDI-internal merging enabled (this week)
b. Finalize DEfT to be used by group contacts, get it tested by a few group contacts (September)
c. Group contacts use DEfT fully, DPD Savannah migrates to JIRA (October)
d. Same plan as in e. and f. above
Slide24
Next Steps and Schedule
Data Reprocessing :
Test with f-tag (need input from PROC)
Submission of special runs of DC14 (this week)
Re-submission of failed jobs of DC14 (new cache needed)
Suggest keeping ProdSys1 as long as possible (end of the year)