Slide1
FairRoot Status and plans
Mohammad Al-Turany
6/25/13
M. Al-Turany, ALICE Offline Meeting
1
Slide2
What is the FairRoot framework, and why is it needed?
http://fairroot.gsi.de
A simulation, reconstruction, and analysis framework
Started in 2003 as a two-person project for the CBM experiment at FAIR
A long list of ready-to-use modules and base classes needed by particle physics experiments
Slide3
Current hot topics in FairRoot
Database interface: re-design of the database interface based on TSQLServer
ZeroMQ integration: use of ZeroMQ as a communication layer
Building, testing and quality assurance systems: coverage tests, quality tests and unit tests
Online monitoring: for test beams and detector prototypes
GPU support and integration
Time-based simulation
Slide4
FairRoot Developers
A long list of people have contributed pieces of code to FairRoot since the project started at the end of 2003.
Core Team:
Mohammad Al-Turany (IT)
Denis Bertini (IT)
Florian Uhlig (CBM / IT)
Radek Karabowicz (PANDA / IT)
Dmytro Kresan (R3B / IT)
Tobias Stockmanns (PANDA, FZJ)
People who contributed major features:
Ilse König (HADES)
Volker Friese (CBM)
Olaf Hartman (PANDA)
Students:
Dennis Klein (finished 02.2013)
Alexey Rybalchenko (EE)
Slide5
FairRoot Group at the GSI
Mohammad Al-Turany (IT)
Denis Bertini (IT)
Radoslaw Karabowicz (IT/PANDA)
Dmytro Kresan (IT/R3B)
Anar Manafov (IT)
Alexey Rybalchenko (Master Student)
Yago Gonzalez Rozas (Guest scientist)
Florian Uhlig (IT/CBM)
N.N. (Sep. 2013) (IT)
Slide6
Design
[Figure: layered design of FairRoot. Credit: Florian Uhlig, ROOT Users Workshop, Saas Fee, 13.03.13]
ROOT: TEve, ROOT IO, TGeo, TVirtualMC, Cint, TTree, Proof, …
Libraries: Geant3, Geant4, Geant4_VMC, VGM, …
FairRoot: Run Manager, IO Manager, Runtime DB, DB Interface, Event Display, MC Application, Module, Detector, Task, Magnetic Field, Event Generator, …
Built on FairRoot: CbmRoot, PandaRoot, AsyEosRoot, R3BRoot, SofiaRoot, MPDRoot, FopiRoot, EICRoot
Slide7
FairRoot: Timeline
[Figure: timeline with year markers 2004, 2006, 2010, 2011, 2012, 2013. Milestones shown on the slide:]
Start testing the VMC concept for CBM
First release of CbmRoot
MPD (NICA) also starts using FairRoot
ASYEOS joined (ASYEOSRoot)
GEM-TPC separated from the PANDA branch (FOPIRoot)
PANDA decided to join -> FairRoot: same base package for different experiments
R3B joined
EIC (Electron Ion Collider, BNL): EICRoot
SOFIA (Studies On Fission with Aladin)
ENSAR-ROOT: collection of modules used by structural nuclear physics experiments
Slide8
Database Re-Design
Slide9
Database in FairRoot
The real database in FairRoot is completely hidden from the user and/or software developer.
The runtime database is not a database in the classical sense, but a parameter manager.
It knows the “I/O”s defined by the user and all parameter containers needed for the actual analysis and/or simulation.
It manages the automatic initialization and saving of the parameter containers.
After all initialization, the complete list of runs and related parameter versions is saved either to a database (Oracle, MySQL, …) or to ROOT files.
Slide10
FairRoot DB Design (Old)
[Figure: the FairRoot Run Manager, connected to the RunTime Database (ASCII file with configuration parameters, ROOT file with configuration parameters, Oracle) and to the IO Manager (ROOT file with MC points, digits, etc.)]
Slide11
FairRoot DB extended
[Figure: the FairRoot Run Manager, connected to the RunTime Database (ASCII file with configuration parameters, ROOT file with configuration parameters, and a DB Interface using TSQLServer for Oracle, PostgreSQL, and MySQL) and to the IO Manager (ROOT file with MC points, digits, etc.)]
Slide12
Re-design of the database interface based on the ROOT Database Connectivity (RDBC) API, which provides a uniform interface to Oracle, MySQL, PgSQL
Database interface in FairRoot using TSQLServer (MySQL, Oracle, PostgreSQL, ...)
Allows multiple connections to DBs at runtime
Adds version management: data type (real and/or MC), detector type, date and time range
Reduces SQL coding: simple predefined table, only simple SQL used
Ultimately: generic container, handles write/read access
Slide13
Version Management
[Figure: validity time ranges (UTC) of parameter versions for STS CAL, MVD CAL, and MVD TEMP, with axes Detector, Time, and Version, and a RunID marked at time t]
It must be possible to get a consistent set of information for any date (e.g. the start time of a certain run).
It must be possible to get an answer to the question: 'Which parameters were used when analyzing this run X years ago?' (The calibration might have been optimized several times since this date. Maybe some bugs have been detected and corrected in the meantime.)
Slide14
Version Management: the query process
Context (Timestamp, Detector, Version) is the primary key
The context is converted to a unique SeqNo
The SeqNo is used as the key to access all rows in the main table
The system gives the user access to all such rows
[Figure: the context is matched against the auxiliary validity table (SeqNo, validity frame); the matched SeqNo, e.g. 900001020, selects the rows (SeqNo, Col 1 … Col n) of the main table. Credit: D. Bertini]
Reference: Bigtable: A Distributed Storage System for Structured Data, Google Inc., OSDI 2006
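The query process on the Version Management slide can be illustrated with a small sketch. It is not FairRoot code: the table layout, the names `ValidityRow`, `resolve_seq_no` and `query`, and all parameter values are hypothetical, and plain Python containers stand in for the SQL tables. It only shows the two-step lookup, context to SeqNo through the validity table, then SeqNo to rows in the main table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidityRow:
    """One entry of the (hypothetical) auxiliary validity table."""
    seq_no: int
    detector: str
    version: int
    valid_from: int  # UTC seconds, inclusive
    valid_to: int    # UTC seconds, exclusive


# Auxiliary validity table: maps a context to a SeqNo (made-up values).
VALIDITY = [
    ValidityRow(900001020, "STS", 1, 1000, 2000),
    ValidityRow(900001021, "STS", 1, 2000, 3000),
    ValidityRow(900001030, "MVD", 1, 1500, 2500),
]

# Main table: all parameter rows that share a SeqNo (made-up values).
MAIN = {
    900001020: [("gain", 1.02), ("offset", 0.10)],
    900001021: [("gain", 1.05), ("offset", 0.12)],
    900001030: [("gain", 0.98)],
}


def resolve_seq_no(timestamp, detector, version):
    """Step 1: convert the context (timestamp, detector, version) to a SeqNo."""
    matches = [
        row.seq_no
        for row in VALIDITY
        if row.detector == detector
        and row.version == version
        and row.valid_from <= timestamp < row.valid_to
    ]
    if len(matches) != 1:
        raise LookupError(f"expected one validity frame, found {len(matches)}")
    return matches[0]


def query(timestamp, detector, version):
    """Step 2: use the SeqNo as the key into the main table."""
    return MAIN[resolve_seq_no(timestamp, detector, version)]
```

Asking for the parameters valid at the start time of an old run then reduces to `query(run_start_time, "STS", 1)`, which is what makes the 'which parameters were used X years ago' question answerable.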
Slide15
New data transfer layer for FairRoot
Slide16
The Online Reconstruction and Analysis
[Figure: online data-reduction chains. Labels on the slide: 300 GB/s, 20M Evt/s; < 1 GB/s, 25K Evt/s; 1 TB/s; 1 GB/s; > 60 000 CPU cores or equivalent GPU, FPGA, … (shown twice)]
We have the fastest algorithms, but:
How to distribute the processes?
How to manage the data flow?
How to recover processes when they crash?
How to monitor the whole system?
……
Slide17
Design constraints
Highly flexible: different data paths should be modeled.
Adaptive: sub-systems are continuously under development and improvement.
Should work for simulated and real data: developing and debugging the algorithms.
It should support all possible hardware where the algorithms could run (CPU, GPU, FPGA).
It has to scale to any size, with minimum or ideally no effort.
Slide18
Data transport
How to handle dynamic components, i.e. pieces that go away temporarily?
How to handle messages that we can't deliver immediately, particularly if we're waiting for a component to come back online?
What if we need to use a different network transport, say, multicast instead of TCP unicast? Or IPv6? Do we need to rewrite the applications, or is the transport abstracted in some layer?
Slide19
Before Re-inventing the Wheel
What is available on the market and in the community?
A very promising package: ZeroMQ has been available for 2 years
Do we intend to separate online and offline? NO
Multi-threaded concept or multiple processes based on message queues?
Message-based systems allow us to decouple producers from consumers.
We can spread the work to be done over several processes and machines.
We can manage/upgrade/move around programs (processes) independently of each other.
Slide20
ØMQ (zeromq)
A socket library that acts as a concurrency framework.
Carries messages across inproc, IPC, TCP, and multicast.
Connect N-to-N via fanout, pubsub, pipeline, request-reply.
Async I/O for scalable multicore message-passing apps.
30+ languages including C, C++, Java, .NET, Python.
Most OSes including Linux, Windows, OS X, PPC405/PPC440.
Large and active open source community.
LGPL free software with full commercial support from iMatix.
Slide21
What does it deliver?
It handles I/O asynchronously, in background threads. These communicate with application threads using lock-free data structures, so concurrent ØMQ applications need no locks, semaphores, or other wait states.
Components can come and go dynamically and ØMQ will automatically reconnect. You can start components in any order. You can create "service-oriented architectures" (SOAs) where services can join and leave the network at any time.
When a queue is full, ØMQ automatically blocks senders, or throws away messages, depending on the kind of messaging you are doing (the so-called "pattern").
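The two full-queue behaviours can be illustrated with Python's standard `queue` module. This is an analogy only and does not use the ZeroMQ API: `send_blocking` and `send_lossy` are hypothetical helper names, and the bounded `queue.Queue` stands in for a socket's send buffer. In ZeroMQ itself the choice follows from the socket pattern; a publisher, for example, drops messages where a pipeline sender blocks.

```python
import queue


def send_blocking(q, msg, timeout=None):
    """Back-pressure: wait until the queue has room (raises queue.Full on timeout)."""
    q.put(msg, timeout=timeout)


def send_lossy(q, msg):
    """Lossy delivery: drop the message if the queue is full.

    Returns True if the message was queued, False if it was thrown away.
    """
    try:
        q.put_nowait(msg)
        return True
    except queue.Full:
        return False


if __name__ == "__main__":
    buffer = queue.Queue(maxsize=2)  # stand-in for a bounded send buffer
    accepted = [send_lossy(buffer, i) for i in range(5)]
    print(accepted)  # the first two fit, the remaining three are dropped
```

The lossy variant never stalls the producer, which suits monitoring and logging traffic; the blocking variant never loses data, which suits event processing chains.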
Slide22
What does it deliver?
It does not impose any format on messages. They are blobs of zero bytes to gigabytes in size.
You can use any other product (protocol) on top to represent your data (Google's Protocol Buffers, etc.).
Applications talk to each other over arbitrary transports: TCP, multicast, in-process, inter-process. You don't need to change your code to use a different transport.
Slide23
The built-in core ØMQ patterns are:
Request-reply, which connects a set of clients to a set of services (remote procedure call and task distribution pattern).
Publish-subscribe, which connects a set of publishers to a set of subscribers (data distribution pattern).
Pipeline, which connects nodes in a fan-out / fan-in pattern that can have multiple steps, and loops (parallel task distribution and collection pattern).
Exclusive pair, which connects two sockets exclusively.
Slide24
Current Status
The framework delivers some components which can be connected to each other in order to optimize the data flow topology.
All components share a common base called Device (ZeroMQ class).
Devices are grouped into three categories:
Source: Sampler
Message-based Processor: Sink, BalancedStandaloneSplitter, StandaloneMerger, Buffer
Content-based Processor: Processor
Slide25
Panda Example
[Figure: class diagram of the FairMQ package, distinguishing experiment/detector-specific code from framework classes that can be used directly]
Slide26
Example for Panda online reconstruction hierarchy (scenario)
[Figure: a computing unit fed by the detector or a simulation. MVD pixel data and MVD strip data go to two Clusterers, followed by a Tracker, with a parameter database alongside; the components are connected through REQ/REP and PUB/SUB sockets. Three components publish log messages on XPUB sockets to a Log Aggregate (XSUB/XPUB), which feeds a Log Writer (XSUB).]
Slide27
Correct semantics for logging
Pub/Sub sockets
Never block
Lossy! (if needed)
Buffer sizes / locations configurable
Arbitrary message size
Slide28
Results
A throughput of 940 Mbit/s was measured, which is very close to the theoretical limit of TCP/IPv4/Gigabit Ethernet.
The throughput for the named pipe transport between two devices on one node has been measured at around 1.7 GB/s.
Each message consists of the digits in one PANDA event for one detector, with a size of a few kBytes.
6/25/13
M. Al-Turany, ALICE Offline Meeting
29
ZeroMQ
works on
InfiniBand
but using IP over IB
Slide30
Integrating the existing software
[Figure: two execution paths shown side by side. ROOT (event loop): FairRootManager and FairRunAna drive FairTasks through Init(), Re-Init(), Exec(), Finish(). ZeroMQ: FairMQProcessorTask offers the same Init(), Re-Init(), Exec(), Finish() steps. Data sources: ROOT files, Lmd files, remote event server, …]
Slide31
FairBase/examples/Tutorial3
Slide32
Next to implement
Local and central log processors
Command channels and objects (messages)
Automatic monitoring and configuration (hopefully by the end of this year!)
Slide33
Summary
The ZeroMQ communication layer is integrated into our offline framework (FairRoot).
In the short term we will keep both options: the ROOT-based event loop and concurrent processes communicating with each other via ZeroMQ.
In the long term we are moving away from a single event loop to distributed processes.
Thank you!
Slide34
Native InfiniBand/RDMA is faster than IP over IB
Implementing ZeroMQ over IB verbs will improve the performance.
Slide35
Device
Each processing stage of a pipeline is occupied by a process which executes an instance of the Device class.
Slide36
Sampler
Devices with no inputs are categorized as sources.
A sampler loops (optionally: infinitely) over the loaded events and sends them through the output socket.
A variable event rate limiter has been implemented to control the sending speed.
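The sampler behaviour described on the Sampler slide, looping over loaded events at a limited rate, can be sketched as a generator. This is not the FairMQ sampler: `sample`, its parameters, and the fixed-interval pacing are assumptions made for illustration, and yielding an event stands in for sending it through the output socket. The clock and sleep functions are injectable so the pacing can be checked without waiting.

```python
import itertools
import time


def sample(events, rate_hz, repeat=False, clock=time.monotonic, sleep=time.sleep):
    """Yield events no faster than rate_hz events per second.

    With repeat=True the loaded events are cycled indefinitely.
    """
    interval = 1.0 / rate_hz
    source = itertools.cycle(events) if repeat else iter(events)
    next_send = clock()
    for event in source:
        delay = next_send - clock()
        if delay > 0:
            sleep(delay)       # wait until the next send slot
        yield event            # stand-in for sending on the output socket
        next_send += interval  # schedule the following slot


if __name__ == "__main__":
    for event in sample(["evt0", "evt1", "evt2"], rate_hz=2.0):
        print(event)
```

Scheduling against `next_send` rather than sleeping a fixed interval after each event keeps the average rate stable even when the consumer takes time between events.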
Slide37
Message format (Protocol)
Potentially any content-based processor or any source can change the application protocol. Therefore, the framework provides a generic Message class (FairMQMessage) that works with any arbitrary, contiguous chunk of memory.
One has to pass a pointer to the memory buffer and the size in bytes, and can optionally pass a function pointer to a destructor, which will be called once the message object is discarded.
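The idea of a content-agnostic message with an optional destructor can be sketched as follows. This is not the FairMQMessage interface: the class `Message`, its method names, and the use of a Python bytes-like object in place of a raw pointer are assumptions for illustration. The shape (buffer, size, optional release callback) is similar in spirit to ZeroMQ's C function `zmq_msg_init_data`.

```python
class Message:
    """Wraps a contiguous bytes-like buffer without interpreting its content."""

    def __init__(self, buffer, size, on_discard=None):
        view = memoryview(buffer)
        if not 0 <= size <= view.nbytes:
            raise ValueError("size must lie within the underlying buffer")
        self._buffer = buffer
        self._view = view[:size]       # no copy: a window onto the buffer
        self.size = size
        self._on_discard = on_discard  # optional "destructor" callback

    def data(self):
        """Return a zero-copy view of the payload."""
        if self._view is None:
            raise ValueError("message already discarded")
        return self._view

    def discard(self):
        """Release the payload and call the destructor callback exactly once."""
        if self._view is None:
            return
        self._view.release()
        self._view = None
        callback, self._on_discard = self._on_discard, None
        buffer, self._buffer = self._buffer, None
        if callback is not None:
            callback(buffer)


if __name__ == "__main__":
    pool = []  # hypothetical buffer pool that gets the memory back
    msg = Message(bytearray(b"digits-of-one-event"), 6, on_discard=pool.append)
    print(bytes(msg.data()))
    msg.discard()
    print(len(pool))
```

Handing ownership back through a callback lets the sender reuse or free the memory only after the transport has finished with it, which avoids copying large payloads.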
Slide38
New simple classes without ROOT are used in the Sampler. This enables us to use non-ROOT clients and reduces the message size.
Slide39
Processor design
Slide40
Content-based Processor
The Processor device has at least one input and one output socket.
A task is meant for accessing and potentially changing the message content.
Slide41
Message-based Processor
All message-based processors inherit from Device and operate on messages without interpreting their content.
Four message-based processors have been implemented so far.
Slide42
Example for Fan-out/Fan-in: the data path for load balancing
[Figure: the serial chain MVD data -> Clusterer -> MVD Tracker, compared with a load-balanced path: MVD data -> FairMQBalancedStandaloneSplitter -> three Clusterer instances -> FairMQStandaloneMerger, followed by three Tracker instances]
Slide43
Example for Fan-out/Fan-in: the data path for load balancing
[Figure: the serial chain MVD data -> Clusterer -> MVD Tracker, compared with a load-balanced path: MVD data -> FairMQBalancedStandaloneSplitter -> three Clusterer instances -> FairMQStandaloneMerger -> MVD Tracker]
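The fan-out/fan-in data path can be sketched without any messaging library. The functions below are hypothetical stand-ins, not the FairMQBalancedStandaloneSplitter or FairMQStandaloneMerger classes, and they assume that "balanced" means round-robin distribution, which the slides do not state. Lists stand in for sockets, and messages are never inspected, in line with the message-based processor idea.

```python
import itertools

_MISSING = object()  # marks exhausted inputs while merging


def balanced_split(messages, n_outputs):
    """Fan-out: distribute messages round-robin over n_outputs outputs."""
    outputs = [[] for _ in range(n_outputs)]
    for msg, out in zip(messages, itertools.cycle(outputs)):
        out.append(msg)
    return outputs


def merge(inputs):
    """Fan-in: interleave the messages of all inputs into one stream."""
    merged = []
    for group in itertools.zip_longest(*inputs, fillvalue=_MISSING):
        merged.extend(msg for msg in group if msg is not _MISSING)
    return merged


if __name__ == "__main__":
    events = [f"mvd-event-{i}" for i in range(7)]
    per_worker = balanced_split(events, 3)
    # Each worker would run a clusterer here; the payload is passed through.
    clustered = [[f"clusters({m})" for m in batch] for batch in per_worker]
    print(merge(clustered))
```

In a real deployment the workers run concurrently and finish at different times, so a merger would forward messages in arrival order rather than in the strict interleaving used here.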