Slide1
Bookkeeping Tutorial
Slide2
Bookkeeping & Monitoring Tutorial
Bookkeeping content
Contains records of all “jobs” and all “files” that are created by production jobs
Job:
  In fact technically a “step” in a workflow
    E.g. “Gauss step”, “Brunel step”…
    For real RAW data: the “job” is in fact a DAQ run
  Has input files (except runs and Gauss)
  Has output files
    Note that files may not be kept (i.e. may have no replica)
    All files are registered in order to keep the full history
  Has metadata
    Location, production number, application, CPUTime, etc.
Files:
  Always defined as output of a “job”
  Files are defined by an LFN (Logical File Name)
  Contain metadata
    Number of events, size, event type, etc.
Slide3
Bookkeeping purpose
Provenance database
  Contains the full history of productions
  Traceability of datasets
User dataset search
  Select a list of files from selection criteria
    Only files with a replica!
  Generate Gaudi configuration file
  Also gives access to the job/file tree
    E.g. investigate history of a file
Production datasets search
  Select the dataset to be processed by production jobs
  Ensures consistency of input files for a production
  Uses the BK API directly to get the list of files
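
The provenance idea above (every file is the output of a job, every job may have input files, and user searches only return files that still have a replica) can be illustrated with a small in-memory model. This is only a sketch: the class names, LFNs, job IDs and the event type value are hypothetical, and it does not use or reproduce the real BK API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileRecord:
    lfn: str                    # Logical File Name
    produced_by: str            # ID of the job (workflow step) that wrote it
    has_replica: bool = True    # False: registered for history only, not kept
    metadata: Dict[str, object] = field(default_factory=dict)


@dataclass
class JobRecord:
    job_id: str
    application: str            # e.g. "Gauss", "Boole", "Brunel"
    input_lfns: List[str] = field(default_factory=list)  # empty for Gauss and runs


class ToyBookkeeping:
    """In-memory stand-in for the provenance database (hypothetical)."""

    def __init__(self) -> None:
        self.jobs: Dict[str, JobRecord] = {}
        self.files: Dict[str, FileRecord] = {}

    def register(self, job: JobRecord, outputs: List[FileRecord]) -> None:
        # Every output file is registered, whether or not a replica is kept.
        self.jobs[job.job_id] = job
        for record in outputs:
            self.files[record.lfn] = record

    def history(self, lfn: str) -> List[str]:
        """Applications that led to `lfn`, most recent first.

        For brevity this follows only the first input of each job.
        """
        chain: List[str] = []
        current = lfn
        while current is not None:
            job = self.jobs[self.files[current].produced_by]
            chain.append(job.application)
            current = job.input_lfns[0] if job.input_lfns else None
        return chain

    def user_search(self, event_type: int) -> List[str]:
        """LFNs matching the event type, restricted to files with a replica."""
        return sorted(
            lfn for lfn, record in self.files.items()
            if record.has_replica and record.metadata.get("event_type") == event_type
        )


bk = ToyBookkeeping()
EVENT_TYPE = 12345678  # hypothetical event type
bk.register(JobRecord("job-1", "Gauss"),
            [FileRecord("/lhcb/example/evt.sim", "job-1", has_replica=False,
                        metadata={"event_type": EVENT_TYPE})])
bk.register(JobRecord("job-2", "Boole", ["/lhcb/example/evt.sim"]),
            [FileRecord("/lhcb/example/evt.digi", "job-2", has_replica=False,
                        metadata={"event_type": EVENT_TYPE})])
bk.register(JobRecord("job-3", "Brunel", ["/lhcb/example/evt.digi"]),
            [FileRecord("/lhcb/example/evt.dst", "job-3",
                        metadata={"event_type": EVENT_TYPE})])

print(bk.history("/lhcb/example/evt.dst"))
print(bk.user_search(EVENT_TYPE))
```

In this toy model the intermediate SIM and DIGI files have no replica, so they remain visible in the history of the DST file but are excluded from the user search.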
Slide4
Bookkeeping partitioning
Configuration name / version
  Real data
    <DAQ partition> / <activity>
  Simulated data
    “MC” / <activity>
    <activity>: “DC06” / “MC09” …
Conditions
  Parameters of initial data
  All subsequent processed data inherit the “conditions”
  Real data
    DAQ conditions
    Beam conditions, energy, magnetic field, detector conditions…
  Simulated data
    Simulation conditions
    Beam energy, magnetic field, luminosity, generator settings…
Slide5
Processing pass
Associated with a level of processing
  Within a given partition (config name / version + conditions)
Corresponds to the whole processing workflow
  Single workflow for a given processing pass
  Compatible versions of applications
  Specifies the processing pass of input data when applicable
Sequence of processing
  Re-processing creates branches
[Diagram: application / output file type pairs: Gauss / SIM, Boole / DIGI, Brunel / DST, DaVinci / ETC, Brunel / DST; stage labels: Sim, Reco, Stripping]
Slide6
Other query parameters
Event type
  File property
  Real data
    90000000: real data full stream
    90000001: real data express stream
    Types to be defined for stripping streams
  Simulated data
    LHCb convention for decay tree
File type
  Data content / format
  Format not yet used
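
To show how the query parameters nest (configuration name / version, conditions, processing pass, event type, file type), here is a minimal sketch that collects them in one record. The class name, the slash-separated rendering and all example values except the two real-data event types are assumptions made for illustration; the slides do not specify how the bookkeeping itself encodes a query.

```python
from dataclasses import dataclass

# The two real-data event types quoted on the slide.
REAL_DATA_EVENT_TYPES = {
    90000000: "real data full stream",
    90000001: "real data express stream",
}


@dataclass(frozen=True)
class DatasetQuery:
    """Hypothetical container for the selection criteria of one dataset."""
    config_name: str       # <DAQ partition> for real data, "MC" for simulation
    config_version: str    # <activity>
    conditions: str
    processing_pass: str
    event_type: int
    file_type: str

    def as_path(self) -> str:
        # Illustrative rendering only: one path element per tree level.
        parts = [self.config_name, self.config_version, self.conditions,
                 self.processing_pass, str(self.event_type), self.file_type]
        return "/" + "/".join(parts)

    def describe_event_type(self) -> str:
        return REAL_DATA_EVENT_TYPES.get(self.event_type, "not a listed real-data type")


query = DatasetQuery(
    config_name="ExamplePartition",      # hypothetical
    config_version="ExampleActivity",    # hypothetical
    conditions="ExampleConditions",      # hypothetical
    processing_pass="ExamplePass",       # hypothetical
    event_type=90000000,
    file_type="DST",
)
print(query.as_path())
print(query.describe_event_type())
```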
Slide7
Running the bookkeeping GUI
Needs a valid Grid certificate
  https://twiki.cern.ch/twiki/bin/view/LHCb/FAQ/Certificate
Needs an X server
On lxplus: lhcb-bkk
  SetupProject Dirac
    Sets up the environment
  If needed: lhcb-proxy-init
    Creates a valid Grid proxy
  dirac-bookkeeping-gui
  Individual commands can be issued from the prompt!
You can also install Dirac locally on your Linux machine:
  https://twiki.cern.ch/twiki/bin/view/LHCb/ProductionProcedures - Installing_DIRAC_on_non_CERN_mac
Slide8
The query tree
Slide9
More info
Right click on:
  Conditions
  Processing pass
Slide10
Event type and file type
Slide11
Dataset selection
[Screenshot label: Logical File name]
Slide12
Limit number of files per page
Slide13
Saving configuration (a.k.a. options) file
Python configuration (default)
  Still possible to create .opts (discouraged!)
  .txt file for just a list of LFNs
All files or selected files (if any)
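
As a rough illustration of the two main output formats, the sketch below writes a Python options fragment and a plain LFN list from a file selection. The slides do not show what the GUI actually writes, so the `EventSelector().Input` layout and the `TYP`/`OPT` values are assumptions based on common Gaudi usage, and the function names and LFNs are hypothetical.

```python
from typing import List, Sequence


def files_to_save(all_files: Sequence[str], selected: Sequence[str]) -> List[str]:
    """Use the selected files if there are any, otherwise all files."""
    return list(selected) if selected else list(all_files)


def python_options(lfns: Sequence[str]) -> str:
    """Render a Python options fragment listing the LFNs as input data."""
    lines = ["from Gaudi.Configuration import *", "", "EventSelector().Input = ["]
    for lfn in lfns:
        # Assumed input descriptor format; not taken from the slides.
        lines.append(f"    \"DATAFILE='LFN:{lfn}' TYP='POOL_ROOTTREE' OPT='READ'\",")
    lines.append("]")
    return "\n".join(lines) + "\n"


def lfn_list(lfns: Sequence[str]) -> str:
    """Render the .txt variant: one LFN per line."""
    return "".join(f"{lfn}\n" for lfn in lfns)


all_lfns = ["/lhcb/example/file1.dst", "/lhcb/example/file2.dst"]  # hypothetical
chosen = files_to_save(all_lfns, selected=["/lhcb/example/file2.dst"])
print(python_options(chosen))
print(lfn_list(chosen))
```

The generated fragment is only text here; running it requires a Gaudi environment.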
Slide14
Advanced saving
Select files for a site (for local usage, not Grid job)
  LFN + XML catalog
    Next slide
  PFN
Slide15
Advanced saving (LFN)
LFNs + XML catalog
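
The role of the XML catalog is to map each LFN to a physical file name (PFN) at the chosen site, so that a local job can keep referring to LFNs. The sketch below builds and reads back such a mapping. It assumes a simplified POOL-style layout (a `File` element holding `physical/pfn` and `logical/lfn`) and omits attributes such as file identifiers; the function names, LFN and PFN are hypothetical, and real PFNs should come from the Storage Element rather than being typed in.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def build_catalog(replicas: Dict[str, str]) -> str:
    """Serialise an LFN -> PFN mapping as a simplified XML catalog."""
    root = ET.Element("POOLFILECATALOG")
    for lfn, pfn in sorted(replicas.items()):
        file_el = ET.SubElement(root, "File")
        physical = ET.SubElement(file_el, "physical")
        ET.SubElement(physical, "pfn", name=pfn)
        logical = ET.SubElement(file_el, "logical")
        ET.SubElement(logical, "lfn", name=lfn)
    return ET.tostring(root, encoding="unicode")


def resolve(catalog_xml: str, lfn: str) -> Optional[str]:
    """Return the PFN registered for `lfn`, or None if it is not in the catalog."""
    root = ET.fromstring(catalog_xml)
    for file_el in root.findall("File"):
        lfn_el = file_el.find("logical/lfn")
        pfn_el = file_el.find("physical/pfn")
        if lfn_el is not None and pfn_el is not None and lfn_el.get("name") == lfn:
            return pfn_el.get("name")
    return None


EXAMPLE_LFN = "/lhcb/example/file1.dst"                        # hypothetical
EXAMPLE_PFN = "root://se.example.invalid//data/file1.dst"      # hypothetical
catalog = build_catalog({EXAMPLE_LFN: EXAMPLE_PFN})
print(catalog)
print(resolve(catalog, EXAMPLE_LFN))
```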
Slide16
Other queries
Select another “tree”
  Different order for the query
Production lookup
  If you are interested in a particular production number
Run lookup
  For real data (currently FEST)
Slide17
Dealing with PFNs or XML catalogs
Using ganga + DIRAC
  Bookkeeping integrated in ganga:
    dataset = browseBK()
  LFN handling is then automatic…
genXMLCatalog
  Same functionality as “Advanced save” of GUI
  Ensures files are available on the specified site
  Gets the PFN from the Storage Element
    Not constructed “by hand”
Slide18
DIRAC Monitoring web portal
Slide19
General information
Entry point to the DIRAC web portal
  http://dirac.cern.ch
Web implementation of (almost) a full desktop application
  Monitoring of productions / jobs
  Accounting (jobs, data management)
  Allows actions to be taken on jobs
Authentication / authorisation is mandatory
  Anonymous access gives minimal information
  Get a certificate and load it in your browser
    https://twiki.cern.ch/twiki/bin/view/LHCb/FAQ/Certificate
DIRAC authorisation through “DIRAC groups”
  Default: lhcb_user
  Other groups: lhcb_prod, dirac_admin…
  Future: specific groups per physics group, PPG (for production authorisation)…
  Capabilities depend on the group
Slide20
The DIRAC portal home page
[Screenshot labels: Identity, DIRAC group, DIRAC instance, Menus]
Slide21
Job Monitoring
[Screenshot labels: Selection, Monitoring info, Actions]
Slide22
Job Monitoring (cont’d)
Selection
  For group lhcb_user, only see your own jobs
  Can select with
    Status
    Site
    Date…
Columns
  Can tailor the columns to be displayed
  Clicking toggles the sorting in the column
Rows
  Jobs displayed in pages (default 25 rows, don’t exceed 100)
  Can scroll pages
Slide23
Logging info
Slide24
Output peeking
Slide25
Attributes
Slide26
Parameters
Slide27
Job statistics
Slide28
Accounting
Gives you access to your jobs
Select parameters:
  Plot
  Time range
  Item to plot against (site, status…)
Selection criteria
  Site
  (Final) Status
Slide29
Accounting screenshots
Slide30
Accounting (cont’d)
Slide31
Job CPU efficiency