Bookkeeping Tutorial - PowerPoint Presentation

briana-ranney
Uploaded On 2016-09-03

Presentation Transcript

Slide1

Bookkeeping Tutorial

Slide2

Bookkeeping & Monitoring Tutorial

Bookkeeping content

- Contains records of all “jobs” and all “files” that are created by production jobs
- Job:
  - Technically a “step” in a workflow, e.g. “Gauss step”, “Brunel step”…
  - For real RAW data, the “job” is in fact a DAQ run
  - Has input files (except runs and Gauss)
  - Has output files
    - Note that files may not be kept (i.e. may have no replica)
    - All files are registered in order to keep the full history
  - Has metadata: location, production number, application, CPUTime, etc.
- Files:
  - Always defined as the output of a “job”
  - Defined by an LFN (Logical File Name)
  - Contain metadata: number of events, size, event type, etc.

Slide3

Bookkeeping purpose

- Provenance database
  - Contains the full history of productions
  - Traceability of datasets
- User dataset search
  - Select a list of files from selection criteria
    - Only files with a replica!
  - Generate a Gaudi configuration file
  - Also gives access to the job/file tree
    - E.g. to investigate the history of a file
- Production dataset search
  - Select the dataset to be processed by production jobs
  - Ensures consistency of input files for a production
  - Uses the BK API directly to get the list of files
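
As a rough illustration of the provenance idea, the sketch below models jobs and files as plain Python records and walks a file's history back through the jobs that produced it. The class, field and function names (`Job`, `FileRecord`, `has_replica`, `ancestors`, `user_visible`) and the toy LFNs are hypothetical and are not the Bookkeeping API. Only the concepts come from the slides: a file is always the output of a job, a file may have no replica, and user searches return only files with a replica.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Job:
    """A bookkeeping 'job': one workflow step (or a DAQ run for RAW data)."""
    name: str
    inputs: List[str] = field(default_factory=list)  # LFNs read by this step


@dataclass
class FileRecord:
    """A file is always registered as the output of one job."""
    lfn: str
    produced_by: Job
    has_replica: bool = True  # False: kept in the records for history only


def ancestors(lfn: str, catalog: Dict[str, FileRecord]) -> List[str]:
    """Return the LFNs in the history of `lfn`, nearest ancestors first."""
    history: List[str] = []
    pending = list(catalog[lfn].produced_by.inputs)
    while pending:
        parent = pending.pop(0)
        if parent in history:
            continue  # already visited through another branch
        history.append(parent)
        pending.extend(catalog[parent].produced_by.inputs)
    return history


def user_visible(catalog: Dict[str, FileRecord]) -> List[str]:
    """User dataset searches only return files that have a replica."""
    return sorted(lfn for lfn, rec in catalog.items() if rec.has_replica)


# Toy chain (hypothetical LFNs): Gauss -> Boole -> Brunel
gauss = Job("Gauss step")
boole = Job("Boole step", inputs=["/toy/evt.sim"])
brunel = Job("Brunel step", inputs=["/toy/evt.digi"])
catalog = {
    "/toy/evt.sim": FileRecord("/toy/evt.sim", gauss, has_replica=False),
    "/toy/evt.digi": FileRecord("/toy/evt.digi", boole, has_replica=False),
    "/toy/evt.dst": FileRecord("/toy/evt.dst", brunel),
}
```

In this toy catalog the intermediate files have no replica, so a user search would list only the DST, while the history of that DST can still be traced back to the Gauss output.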

Slide4

Bookkeeping partitioning

- Configuration name / version
  - Real data: <DAQ partition> / <activity>
  - Simulated data: “MC” / <activity>
    - <activity>: “DC06” / “MC09” …
- Conditions
  - Parameters of initial data
  - All subsequent processed data inherit the “conditions”
  - Real data: DAQ conditions
    - Beam conditions, energy, magnetic field, detector conditions…
  - Simulated data: simulation conditions
    - Beam energy, magnetic field, luminosity, generator settings…

Slide5

Processing pass

- Associated to a level of processing
  - Within a given partition (config name / version + conditions)
- Corresponds to the whole processing workflow
  - Single workflow for a given processing pass
  - Compatible versions of applications
  - Specifies the processing pass of input data when applicable
- Sequence of processing
  - Re-processing creates branches

[Diagram labels: Gauss, SIM, Boole, DIGI, Brunel, DST, DaVinci, ETC, plus a second Brunel, DST pair; pass labels: SimReco, Stripping]

Slide6

Other query parameters

- Event type
  - File property
  - Real data
    - 90000000: real data full stream
    - 90000001: real data express stream
    - Types to be defined for stripping streams
  - Simulated data
    - LHCb convention for decay tree
- File type
  - Data content / format
  - Format not yet used
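
To make the query parameters concrete, here is a minimal sketch of selecting files by event type and file type. The two event-type codes are the ones quoted in this slide. The record layout, the `matches` and `select_lfns` helpers and the toy LFNs are invented for illustration and do not reflect the real Bookkeeping schema or API.

```python
from typing import Dict, Iterable, List

# Event-type codes quoted in the tutorial for real data.
EVENT_TYPES = {
    90000000: "real data full stream",
    90000001: "real data express stream",
}


def matches(record: Dict[str, object], criteria: Dict[str, object]) -> bool:
    """True when the record agrees with every criterion that was given."""
    return all(record.get(key) == value for key, value in criteria.items())


def select_lfns(records: Iterable[Dict[str, object]], **criteria: object) -> List[str]:
    """Return the LFNs of all records matching the selection criteria."""
    return [str(rec["lfn"]) for rec in records if matches(rec, criteria)]


# Hypothetical records; a real query would go through the bookkeeping service.
records = [
    {"lfn": "/toy/full_0001.raw", "event_type": 90000000, "file_type": "RAW"},
    {"lfn": "/toy/express_0001.raw", "event_type": 90000001, "file_type": "RAW"},
    {"lfn": "/toy/full_0001.dst", "event_type": 90000000, "file_type": "DST"},
]

full_stream_raw = select_lfns(records, event_type=90000000, file_type="RAW")
```

Leaving a criterion out means "do not filter on it", which mirrors how each level of the query tree narrows the selection one parameter at a time.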

Slide7

Running the bookkeeping GUI

- Needs a valid Grid certificate
  - https://twiki.cern.ch/twiki/bin/view/LHCb/FAQ/Certificate
- Needs an X server
- On lxplus: lhcb-bkk
  - SetupProject Dirac
    - Sets up the environment
  - If needed: lhcb-proxy-init
    - Creates a valid Grid proxy
  - dirac-bookkeeping-gui
- Individual commands can be issued from the prompt!
- You can also install Dirac locally on your Linux machine:
  - https://twiki.cern.ch/twiki/bin/view/LHCb/ProductionProcedures - Installing_DIRAC_on_non_CERN_mac

Slide8

The query tree

Slide9

More info

Right click on:

- Conditions
- Processing pass

Slide10

Event type and file type

Slide11

Dataset selection

[Screenshot callout: Logical File name]

Slide12

Limit number of files per page

Slide13

Saving configuration (a.k.a. options) file

- Python configuration (default)
- Still possible to create:
  - .opts (discouraged!)
  - .txt file for just a list of LFNs
- All files or selected files (if any)
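
The plain .txt option is simple enough to sketch: the selected files are written when a selection exists, and all files otherwise. The function names (`files_to_save`, `save_lfn_list`), the one-LFN-per-line layout and the toy LFNs are assumptions for illustration; the GUI's actual output format is not shown in the slides.

```python
import os
import tempfile
from typing import List, Optional, Sequence


def files_to_save(all_files: Sequence[str],
                  selected: Optional[Sequence[str]] = None) -> List[str]:
    """Use the selection if there is one, otherwise fall back to all files."""
    return list(selected) if selected else list(all_files)


def save_lfn_list(path: str, all_files: Sequence[str],
                  selected: Optional[Sequence[str]] = None) -> int:
    """Write one LFN per line (assumed layout) and return the number written."""
    lfns = files_to_save(all_files, selected)
    with open(path, "w", encoding="utf-8") as handle:
        for lfn in lfns:
            handle.write(lfn + "\n")
    return len(lfns)


# Hypothetical LFNs, written to a temporary directory.
all_files = ["/toy/a.dst", "/toy/b.dst", "/toy/c.dst"]
target = os.path.join(tempfile.mkdtemp(), "lfns.txt")
written = save_lfn_list(target, all_files, selected=["/toy/b.dst"])
```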

Slide14

Advanced saving

- Select files for a site (for local usage, not Grid job)
- LFN + XML catalog
  - Next slide
- PFN

Slide15

Advanced saving (LFN)

LFNs + XML catalog

Slide16

Other queries

- Select another “tree”
  - Different order for the query
- Production lookup
  - If you are interested in a particular production number
- Run lookup
  - For real data (currently FEST)

Slide17

Dealing with PFNs or XML catalogs

- Using ganga + DIRAC
  - Bookkeeping integrated in ganga: dataset = browseBK()
  - LFN handling is then automatic…
- genXMLCatalog
  - Same functionality as “Advanced save” of GUI
  - Ensures files are available on the specified site
  - Gets the PFN from the Storage Element
    - Not constructed “by hand”
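
The point about not building PFNs by hand can be illustrated as follows: PFNs are looked up per site, and a file without a replica at the requested site is reported rather than guessed. This is only a sketch. The `replicas` dictionary stands in for the answer a Storage Element lookup would give, the XML element names are simplified and hypothetical rather than the real catalog schema, and `build_catalog` is not the ganga or DIRAC `genXMLCatalog` implementation.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Sequence, Tuple


def build_catalog(lfns: Sequence[str], site: str,
                  replicas: Dict[str, Dict[str, str]]) -> Tuple[str, List[str]]:
    """Return (XML text, LFNs with no replica at `site`)."""
    root = ET.Element("catalog", {"site": site})
    missing: List[str] = []
    for lfn in lfns:
        # The PFN comes from the lookup table, never from string manipulation.
        pfn = replicas.get(lfn, {}).get(site)
        if pfn is None:
            missing.append(lfn)
            continue
        entry = ET.SubElement(root, "file")
        ET.SubElement(entry, "lfn", {"name": lfn})
        ET.SubElement(entry, "pfn", {"name": pfn})
    return ET.tostring(root, encoding="unicode"), missing


# Hypothetical replica information for two toy files.
replicas = {
    "/toy/a.dst": {"SITE-A": "protocol://se.site-a.example/toy/a.dst"},
    "/toy/b.dst": {"SITE-B": "protocol://se.site-b.example/toy/b.dst"},
}
xml_text, missing = build_catalog(["/toy/a.dst", "/toy/b.dst"], "SITE-A", replicas)
```

Returning the missing LFNs separately keeps the availability check explicit, so a local job is not started with a catalog that silently omits part of the dataset.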

Slide18

DIRAC Monitoring web portal

Slide19

General information

- Entry point to the DIRAC web portal
  - http://dirac.cern.ch
- Web implementation of (almost) a full desktop application
  - Monitoring of productions / jobs
  - Accounting (jobs, data management)
  - Allows taking actions on jobs
- Authentication / authorisation is mandatory
  - Anonymous access gives minimal information
  - Get a certificate and load it in your browser
    - https://twiki.cern.ch/twiki/bin/view/LHCb/FAQ/Certificate
  - DIRAC authorisation through “DIRAC groups”
    - Default: lhcb_user
    - Other groups: lhcb_prod, dirac_admin
    - Future: specific groups per physics group, PPG (for production authorisation)…
    - Capabilities depend on the group
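
As an illustration of group-dependent capabilities, the sketch below filters a job list by DIRAC group. The only rule taken from this tutorial is that members of `lhcb_user` see only their own jobs. Treating every other group as able to see all jobs is an assumption made here for the example, as are the function name `visible_jobs`, the record fields and the sample users.

```python
from typing import Dict, List

DEFAULT_GROUP = "lhcb_user"  # default DIRAC group named in the tutorial


def visible_jobs(jobs: List[Dict[str, str]], user: str,
                 group: str = DEFAULT_GROUP) -> List[Dict[str, str]]:
    """Return the jobs a user may see under the given group."""
    if group == "lhcb_user":
        # Stated in the tutorial: lhcb_user only sees its own jobs.
        return [job for job in jobs if job["owner"] == user]
    # Assumption for this sketch: other groups see every job.
    return list(jobs)


# Hypothetical job records.
jobs = [
    {"id": "1", "owner": "alice", "status": "Done"},
    {"id": "2", "owner": "bob", "status": "Failed"},
]
own = visible_jobs(jobs, "alice")
everything = visible_jobs(jobs, "alice", group="lhcb_prod")
```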

Slide20

The DIRAC portal home page

[Screenshot callouts: Identity, DIRAC group, DIRAC instance, Menus]

Slide21

Job Monitoring

[Screenshot callouts: Selection, Monitoring info, Actions]

Slide22

Job Monitoring (cont’d)

- Selection
  - For group lhcb_user, only see your own jobs
  - Can select with:
    - Status
    - Site
    - Date…
- Columns
  - Can tailor the columns to be displayed
  - Clicking toggles the sorting in the column
- Rows
  - Jobs displayed in pages (default 25 rows, don’t exceed 100)
  - Can scroll pages

Slide23

Logging info

Slide24

Output peeking

Slide25

Attributes

Slide26

Parameters

Slide27

Job statistics

Slide28

Accounting

- Gives you access to your jobs
- Select parameters:
  - Plot
  - Time range
  - Item to plot against (site, status…)
- Selection criteria
  - Site
  - (Final) Status

Slide29

Accounting screenshots

Slide30

Accounting (cont’d)

Slide31

Job CPU efficiency
