
EDELWEISS data structure - PowerPoint Presentation

Uploaded On 2020-07-03

Presentation Transcript

Slide1

EDELWEISS data structure and analysis framework

Slide2

Motivation to build a new data structure and analysis framework (Kdata)

We had:
  Edw-II data analysis dispersed between Ana and Era
    2 experts (full time analysis)
    Each with their own code: single (few local) user / single programmer
  2010: A. Cox and I struggling to find, to access and to analyze Edw2 data
    Coincidence (muon-veto/bolometer) study as diploma work
  Era: ROOT based, but difficult access, no server with most recent code/data…
  Saclay Ana: Fortran, PAW and C; no PAW support; French comments in code/data…
  Lack of documentation

Task: Get the data

Benjamin Schmidt

J. Cham

Slide3

Motivation to build a new data structure and analysis framework (Kdata)

Short term: facilitate data access
  Build flexible event-based data structure
  Single combined HLA file: muon-veto and bolometer data
  Make code and data easily available
  Documentation
Long term: establish a common collaboration-wide analysis and data storage tool
  Share tasks (calibration, template creation, …) / remove barriers (documentation)
  Allow for upgrade to 100s of detectors: develop automatic processing scheme

Slide4

The general picture – the idea

All software modules (diagram labels):
  KDS: data structure
  KPTA: pulse trace analysis
  Kamping, KSamba, ampToHLA
  Analysis: KDataPy, KQPA
  Stages shown: DAQ, Raw, Amp, HLA
A bit special: standalone code, extensive use of templates

Slide5

Specific known - unknown requirements during Kdata development

Requirements Edw-3:
  10 -> 40 detectors
  Larger workload for debugging, calibration and analysis
  New detector design (channel number/specifics initially unknown)
  New electronics (some specifics unknown)
  1st time resolved ionization signals (trace length?, num traces?)
  Change in analog amplifiers -> signal shape?, trace length?, sampling?
  New efforts to optimize signal treatment needed
  Integrate muon-veto in bolo DAQ

Slide6

Kdata - implementation

The idea: build a data storage and analysis framework (event-based data storage)
  Use ROOT for event-based physics data
    Fast I/O
    Support for LHC lifetime
    Data compression
    Statistics tools
    Well known
  C++ class library for data encapsulation
  Keep it modular
  Keep it flexible and general
  Try to keep it simple
  Keep fully split tree (library independent)
  Document it
  Make it easily accessible

Repository: https://edwdev-ik.fzk.de/SVN_Repository_for_the_KIT_Dark_Matter_Group/KData.html

Slide7

Kdata event structure in detail

Use ROOT types
No nested arrays
  Kdata library not needed to read data
  Longevity of data guaranteed
Kdata coded consistently with the ROOT and Taligent coding style:
  Easier to read/collaborate/check code
  For example:
    classes defined in header .h; implemented in .cxx
    variables start with small f (fChannelName; fAmp; fExtra; …)
    functions start with a capital letter: GetChannelName(); GetTrace(); …
Kds completely implemented with Get…() and Set…() methods
  Tab completion (ipython, root session)

Slide8

Kdata event structure in detail

ROOT TTree with a single event branch
Event with flexible structure:
  Variable-sized TClonesArrays for Bolometer, BoloPulse, PulseAnalysis, Samba and MuonModule information
  Allows changes in hardware (number of bolos, number of channels per bolo, …) without code change in "kds" (data structure source code)!
  Requires some effort to get to know, though

Slide9

Kdata event structure

Logic layout (diagram labels): TTree, KEvent, KBolometerRecord, KBoloPulseRecord (= channel), KPulseAnalysisRecord, KSambaRecord, KMuonModuleRecords

Logic event structure via TRef and TRefArray
  Very powerful: can be spread over files, …
A word of caution though:
  Requires specific handling in event building: never forget to reset the referenced object count with TProcessID::SetObjectCount -> file size blows up otherwise
  Probably most bugs and problems in kds were related to TRef issues
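
The caution about the referenced object count matches the pattern used in ROOT's own Event example: remember TProcessID::GetObjectCount() before building an event and restore it with TProcessID::SetObjectCount() afterwards. The pure-Python sketch below only mimics that bookkeeping so the effect can be seen without ROOT. FakeProcessID, build_event and highest_uid are hypothetical stand-ins, not ROOT or Kdata code.

```python
class FakeProcessID:
    """Hypothetical stand-in for the object counter kept by ROOT's TProcessID."""
    _count = 0

    @classmethod
    def get_object_count(cls):
        return cls._count

    @classmethod
    def set_object_count(cls, n):
        cls._count = n

    @classmethod
    def next_uid(cls):
        # Every referenced object receives the next unique ID.
        cls._count += 1
        return cls._count


def build_event(n_refs, reset):
    """Assign n_refs reference IDs, optionally restoring the counter afterwards."""
    saved = FakeProcessID.get_object_count()
    uids = [FakeProcessID.next_uid() for _ in range(n_refs)]
    if reset:
        # Same idea as calling TProcessID::SetObjectCount(saved) in ROOT.
        FakeProcessID.set_object_count(saved)
    return uids


def highest_uid(n_events, n_refs, reset):
    """Return the largest ID handed out while building n_events events."""
    FakeProcessID.set_object_count(0)
    return max(max(build_event(n_refs, reset)) for _ in range(n_events))
```

With the reset, the IDs stay within 1 to n_refs for every event. Without it they keep growing with the number of events, which is the kind of growth the slide warns about.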

Slide10

Kdata event structure

Logic layout (diagram labels): TTree, KEvent, KBolometerRecord, KBoloPulseRecord (= channel), KPulseAnalysisRecord, KSambaRecord, KMuonModuleRecords

Looping in python:

    for event in filereader:
        for bolo in event.boloRecords():
            for pulse in bolo.pulseRecords():
                for analysis in pulse.analysisRecords():
                    pass  # work with the analysis record here

Looping C++ style in python:

    for i in range(f.GetEntries()):
        f.GetEntry(i)
        event = f.GetEvent()
        for ii in range(event.GetNumBolos()):
            bolo = event.GetBolo(ii)
            samba = bolo.GetSambaRecord()
            print(samba.GetNtpDateSec())
            for iii in range(bolo.GetNumPulseRecords()):
                pulse = bolo.GetPulseRecord(iii)
                trace = pulse.GetTrace()

KPulseAnalysisRecord entries: bandpass analysis, optimal filter, trapezoidal filter, …
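
The nested iteration can be tried without ROOT or KDataPy using small stand-in classes. This is only a sketch: Event, Bolo, Pulse and Analysis are hypothetical substitutes for KEvent, KBolometerRecord, KBoloPulseRecord and KPulseAnalysisRecord. Only the three iterator method names are taken from the slide. The analysis names and amplitudes are made-up illustrative values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Analysis:            # stand-in for KPulseAnalysisRecord
    name: str
    amp: float


@dataclass
class Pulse:               # stand-in for KBoloPulseRecord (one channel)
    channel: str
    analyses: List[Analysis] = field(default_factory=list)

    def analysisRecords(self):
        return iter(self.analyses)


@dataclass
class Bolo:                # stand-in for KBolometerRecord
    detector: str
    pulses: List[Pulse] = field(default_factory=list)

    def pulseRecords(self):
        return iter(self.pulses)


@dataclass
class Event:               # stand-in for KEvent
    bolos: List[Bolo] = field(default_factory=list)

    def boloRecords(self):
        return iter(self.bolos)


def amplitudes(events, channel, analysis_name):
    """Collect the amplitudes of one analysis type for one channel."""
    result = []
    for event in events:
        for bolo in event.boloRecords():
            for pulse in bolo.pulseRecords():
                if pulse.channel != channel:
                    continue
                for analysis in pulse.analysisRecords():
                    if analysis.name == analysis_name:
                        result.append(analysis.amp)
    return result


# Made-up events: the second one contains two detectors.
events = [
    Event([Bolo("FID823", [Pulse("chalA FID823", [Analysis("bandpass", 1.5),
                                                   Analysis("optimal", 1.4)])])]),
    Event([Bolo("FID823", [Pulse("chalA FID823", [Analysis("bandpass", 2.5)])]),
           Bolo("FID808", [Pulse("chalB FID808", [Analysis("bandpass", 9.0)])])]),
]
```

Because each level is a variable-length list, adding a detector or a channel changes only the data, not the loop.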

Slide11

Kdata event structure in detail

Structure subclassed in:
  Raw: KRawEvent, KRawBolometerRecord, …
  Amp: KAmpEvent, KAmpBolometerRecord, …
  HLA: KHLAEvent, KHLABolometerRecord, …

Raw: with pulse traces! No KPulseAnalysisRecords
Amp and HLA: no pulse traces, but KPulseAnalysisRecord
  < 1/10 raw file size
  ~ 1/2 samba file size

With a quick calculation: 2.87 * 356/1850 * 2.35; FWHM 1.04 keV; Ana 1.1 keV

Slide12

Python and KDataPy


Slide13

simpleEventViewer output:


Slide14

Looping utilities: no need to write the looping/plotting

Use KDataPy.util with plotpulse(), looppulse(), loopbolo() and KDataPy.loop_amp with loopchannel(), plotchan_x(), plotchan_x_files(), plotchan_x_dir()

loop_amp to be completed with plotchannel_xy(), … and loop/plotbolo functions. Note that KDataPy.util loopbolo() also works for Amp and HLA data.

Basic usage:

    import ROOT
    import KDataPy.util as ut
    ut.plotpulse("/sps/edelweis/kdata/data/raw/nk23b002_000.root", "chalB FID823")

Documentation

Slide15

Our data acquisition chains revisited

Diagram labels: Samba Macs, Muon Veto DAQ, Bolo-Raw data, Radon; Modane, Lyon, Karlsruhe

Automated:
  proc0: copy to Lyon
  proc1: rootification
  proc2: raw -> amp
  proc3: amp -> hla
  proc4: merge/skim muon/hla bolo data
  spsToHpss: backup on tape drive

Kdata - ROOT on kalinka: our look-up place
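
One way to picture an automated chain like proc0 to proc4 is a per-run record that stores the outcome of every step, so that a rerun resumes at the first step that has not succeeded. The sketch below is an assumption about how such bookkeeping could look, not the actual Kdata scripts. The step names come from the slide, while next_step, run_chain and the "status"/"result"/"error" fields are hypothetical.

```python
PROC_STEPS = ["proc0", "proc1", "proc2", "proc3", "proc4"]


def next_step(doc):
    """Return the first step not marked 'good' in the run record, or None."""
    for step in PROC_STEPS:
        if doc.get(step, {}).get("status") != "good":
            return step
    return None


def run_chain(doc, handlers):
    """Apply the handlers in order and record each outcome in the run record."""
    while True:
        step = next_step(doc)
        if step is None:
            break
        try:
            doc[step] = {"status": "good", "result": handlers[step](doc)}
        except Exception as err:
            # Stop at the first failure; later steps depend on this one.
            doc[step] = {"status": "failed", "error": str(err)}
            break
    return doc


def make_handlers():
    """Dummy handlers that only report which step ran."""
    return {step: (lambda doc, s=step: s + " done") for step in PROC_STEPS}


def failing_proc2(doc):
    """Dummy handler that simulates a failure in the raw -> amp step."""
    raise RuntimeError("simulated failure")
```

A run record that failed in proc2 keeps its proc0 and proc1 results, and next_step() points at proc2 on the next attempt.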

Slide16

Using the Kdata pulse processing library

Adam Cox, our benevolent dictator for life

Slide17

The KPulseAnalysisChain

The kpta-chain is applied before your analysis function
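
The order of operations (processing chain first, user analysis function second) can be sketched in plain Python. This is an illustration under stated assumptions, not the kpta API: baseline_removal, scale and loop_pulses are hypothetical names, and the traces are made-up numbers.

```python
def baseline_removal(trace):
    """Subtract the mean of the first quarter of the trace."""
    n = max(1, len(trace) // 4)
    baseline = sum(trace[:n]) / n
    return [x - baseline for x in trace]


def scale(factor):
    """Return a processor that multiplies every sample by factor."""
    def processor(trace):
        return [factor * x for x in trace]
    return processor


def loop_pulses(pulses, channel, chain, analysis_function):
    """Run the chain on each matching trace, then call the analysis function."""
    results = []
    for name, trace in pulses:
        if name != channel:
            continue
        for processor in chain:
            trace = processor(trace)
        results.append(analysis_function(trace))
    return results


# Made-up (channel name, trace) pairs.
pulses = [
    ("chalA FID823", [10.0, 10.0, 10.0, 10.0, 14.0, 12.0, 11.0, 10.0]),
    ("chalB FID823", [5.0, 5.0, 9.0, 5.0]),
]
peaks = loop_pulses(pulses, "chalA FID823", [baseline_removal, scale(2.0)], max)
```

Here the analysis function is simply max, so peaks holds the largest processed sample of each selected trace. Any callable that takes a processed trace can be hooked in the same way.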

Slide18

Ionisation channel after pattern removal:


Slide19

Advantages – Drawbacks (personal opinion)

Advantages:
  Flexibility of data structure
  Consistency of data structure (over time)
    Same data structure for different detector systems -> great for coincidence studies
    Same data structure for different processing/analyses (bandpass, optimal filter, …)
    Decouple high level analyses from DAQ/processing changes
  Independent kpta library
    Has been reused with (flat) data from EURECA test stand
    Very versatile

Drawbacks:
  Flexibility of data structure comes with some complexity (heaviness)
    Especially TTree.Draw() more complex
  Single raw data folder -> restricted use of ls
  Writing kpta with templates a bit more complex

Slide20

Usage of python

90% of the time python feels like the right solution
  Shorter, more legible code
  Vast set of external libraries
  Extremely handy for scripting
  Basic documentation in python always via '''docstrings'''

Main price: speed
  Circumvent by producing an additional set of data files skimmed by detector
  Future use of pypy + ROOT6
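
Skimming by detector means that a later Python loop touches only the records of the detector it needs. A minimal sketch of the idea follows, using plain tuples instead of Kdata files. skim_by_detector and the event layout are assumptions for illustration, and the numbers are made up.

```python
from collections import defaultdict


def skim_by_detector(events):
    """Split a combined event list into one record list per detector."""
    skimmed = defaultdict(list)
    for event_number, records in events:
        for detector, amplitude in records:
            skimmed[detector].append((event_number, amplitude))
    return dict(skimmed)


# Made-up events: (event number, [(detector, amplitude), ...])
events = [
    (0, [("FID823", 1.2), ("FID808", 0.4)]),
    (1, [("FID823", 3.1)]),
    (2, [("FID808", 2.2)]),
]
per_detector = skim_by_detector(events)
```

An analysis of FID823 then iterates over per_detector["FID823"] only, instead of visiting every record of every event.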

Slide21

But 50x slower

PyPy (JIT compile): 1.06x slower

Slide22


Slide23

CouchDB for everything else and python to glue everything together

Automat database (117 parameters every 20 sec)
dataDB
  Samba header information
    Useful to find data under conditions (temperature, voltage, run_type, …)
  Processing state
    History of processing/file location (complete documentation)
Supplementary processing databases
  Templates, high-/lowpass filter parameters, cuts
  Radon measurements
  …
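
Finding data under given conditions amounts to filtering run documents on their header fields. The sketch below does this over a plain list of dictionaries rather than a real CouchDB view. The field names temperature, voltage and run_type follow the slide, while find_runs, the "run" key and all values are made up for illustration.

```python
def find_runs(docs, **conditions):
    """Return the run names whose document satisfies every condition.

    A condition is either a plain value (equality) or a (low, high) tuple
    (inclusive range).
    """
    selected = []
    for doc in docs:
        for key, wanted in conditions.items():
            value = doc.get(key)
            if isinstance(wanted, tuple):
                low, high = wanted
                if value is None or not (low <= value <= high):
                    break
            elif value != wanted:
                break
        else:
            # No condition failed for this document.
            selected.append(doc["run"])
    return selected


# Made-up run documents.
docs = [
    {"run": "run_a", "run_type": "calibration", "temperature": 0.018, "voltage": 8.0},
    {"run": "run_b", "run_type": "physics", "temperature": 0.018, "voltage": 8.0},
    {"run": "run_c", "run_type": "physics", "temperature": 0.020, "voltage": 4.0},
]
```

For example, find_runs(docs, run_type="physics", temperature=(0.017, 0.019)) selects only the physics run taken inside that temperature window.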


Slide24

A more complex example: heat template fitting code

Three python modules (all part of KDataPy!):
  templateFitSelection.py (loops over data, selects pulses, averages parameters; calls the other scripts)
  pulsetempy.py (performs the template fit)
  uploadAnalyticalTemplateToDB.py (saves the fit parameters to the DB)

Usage:

    import KDataPy.TemplateFitSelection as tfit
    tfit.templateFitSelection('/sps/kdata/data/raw/nk23b002_000.root')
    tfit.run('chalB FID808')

Note that there are some more options, though! The code itself is commented and should help to discover more options.
Sorry: the documentation (web) has not been updated yet.

Slide25

Basic looping once more

More verbose version: use the plotPulseEventViewer module in kanacodewok

    import plotPulseEventViewer as plt
    plt.plotPulseEventViewer('/sps/edelweis/kdata/data/raw/nk23b002_000.root', 'chalA FID823')

Slide26

More advanced usage

Hook in an analysis function

Slide27

Processing – some details

Database driven:
Proc0: scp of samba raw data to ccage (Lyon)
  Task1: change scp account to keitel (all tests finished, batch-, hpss-, …)
  Task2: add md5 checksum test after transfer
Proc1: rootification (Modane), scp to ccage (Lyon)
  Task: transfer rootification to Lyon
Proc2: processing and filtering
  Template fitting tools with DB access implemented
  Adaptation of processing to 8 step function ionization channels
  All data from November processed with KFeldbergKampSite (BW bandpass filter, all channels treated separately): sps/edelweis/kdata/data/amp/Run305
  Task1: automate using DB and redhook.sh script
  Task2: implement KSeebugKampSite (BW bandpass with simultaneous heat-ionization fits)
  Task3: (longer term) revive/debug optimal filter KChamonixKAmpSite

Slide28

Processing – some more details

Proc3: calibration of Amp level files
  Task1: port the Era scripts: perform calibration, store results (calibDB)
  Task2: implement Amp->HLA process using calibDB
Proc4/5/6:
  Tasks: concat/merge/skim data
    What can/should be automated?
  Tasks: facilitate access to data:
    Implement run list based on datadb (see talks by Cecile/Lukas/Valentin)
    Write python utilities to facilitate plotting/looping: KDataPy.util, KDataPy.loop_amp, …
spsToHPSS:
  Fully working
  Task1: nj13b…tar. There is a file that was too big for automatic processing
  Task2: implement md5 checksum test after writing
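
An md5 checksum test after a transfer or a write can be done with Python's standard hashlib module. The sketch below is not the collaboration's script: it copies a local temporary file as a stand-in for the scp or tape transfer, and md5sum, copy_and_verify and demo are hypothetical helper names.

```python
import hashlib
import os
import shutil
import tempfile


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy a file and report whether source and copy have the same MD5."""
    shutil.copyfile(source, destination)
    return md5sum(source) == md5sum(destination)


def demo():
    """Copy a small temporary file, verify it, then corrupt the copy."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "raw.bin")
        copy = os.path.join(workdir, "copy.bin")
        with open(source, "wb") as handle:
            handle.write(b"raw data stand-in" * 1000)
        ok = copy_and_verify(source, copy)
        with open(copy, "ab") as handle:
            handle.write(b"x")  # simulate a corrupted transfer
        corruption_detected = md5sum(source) != md5sum(copy)
    return ok, corruption_detected
```

Reading in chunks keeps memory use flat even for large raw files. For a remote transfer the second digest would have to be computed on the receiving side and compared with the first.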

Slide29

Template fitting

The program is rather verbose!

Slide30

Template fitting

Strong dependence on initial parameters
  Initial params from last fit (pulstemplates db)
  Some tweaking still necessary (larger amplitude…)

Slide31

A useful trick – Quitting your loop


Slide32

Loop-/plotbolo

You need to correlate channels?
  -> skip looping at bolometer level

Slide33


Okay a stupid example, but a quick one

Note the documentation with further examples:

KDataPy Utility functions

Slide34

From theory to practice – Part 2

Working with Amp level data

Structure subclassed in:
  Raw: KRawEvent, KRawBolometerRecord, …
  Amp: KAmpEvent, KAmpBolometerRecord, …
  HLA: KHLAEvent, KHLABolometerRecord, …

Raw: with pulse traces! No KPulseAnalysisRecords
Amp and HLA: no pulse traces, but KPulseAnalysisRecord
  < 1/10 raw file size
  ~ 1/2 samba file size

With a quick calculation: 2.87 * 356/1850 * 2.35; FWHM 1.04 keV; Ana 1.1 keV

Slide35

TTree.Draw() example

With a quick calculation: 2.87 * 356/1850 * 2.35; FWHM 1.04 keV; Ana 1.1 keV

TTree->Draw() command, or rather TChain->Draw() (called from python):

    c.Draw("fPulseAna[].GetAmp()", "fPulseAna[].GetBoloPulseRecord().GetChannelName() == \"slowD FID823\" && fPulseAna[].GetExtra(8)==5 ")
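
Selection strings like this one are easy to get wrong by hand because of the nested quotes. A small helper can assemble them. The sketch below only builds the string and does not need ROOT. channel_selection and its parameters are hypothetical, while the branch name fPulseAna and the method names are those shown on the slide.

```python
def channel_selection(channel, extra_index=None, extra_value=None,
                      branch="fPulseAna"):
    """Build a TTree::Draw selection string for one channel.

    Optionally adds a cut requiring GetExtra(extra_index) == extra_value.
    """
    selection = '%s[].GetBoloPulseRecord().GetChannelName() == "%s"' % (branch, channel)
    if extra_index is not None:
        selection += " && %s[].GetExtra(%d) == %s" % (branch, extra_index, extra_value)
    return selection


expression = "fPulseAna[].GetAmp()"
selection = channel_selection("slowD FID823", extra_index=8, extra_value=5)
# With a TChain named c, as on the slide, one would then call:
#     c.Draw(expression, selection)
```

Using single quotes for the Python string avoids escaping the double quotes that the ROOT expression needs around the channel name.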

Slide36

Using loop_amp


Or – if the automatic binning is too crude:

Slide37

Loop_amp together with file lists/directories

Use loop.plotchan_x_files(["file1.root", "file2.root"], 'channel', …) or use loop.plotchan_x_dir('directory', 'file-pattern', 'channel', …)

[Plots: Entries vs. Amplitude]

Slide38

Plotting a TGraph of two variables – very first example: RMS vs energy

[Plot axis labels: Chi2, Amplitude]

These are just examples
Develop your own "hook-in" functions!
  x_some_function()
  xy_some_function()
  …

Slide39

Calibrated data

ERA calibrated data in Kdata v3.0 format for Run12
  Computing Center in Lyon and at KIT
Ana calibrated data in Kdata (dev-version) for Run20
  https://edwdev-ik.fzk.de/wsvn/EDELWEISS/analysis/kdata/branches/newhla2/
  An initial data set FID804 available at KIT and Lyon: /sps/edelweis/schmidt/AnaToKData/Run20
KData preliminary analysis files of single detectors Run12 – Run20 – Run 304 at KIT

[Figure labels: Hole collecting, Hole veto, Electron veto, Electron collecting]