GSI Forum: General Discussions » My analysis is slow



My analysis is slow [message #24111]

Thu, 13 June 2019 18:38

Bastian Löher

This should initiate a discussion on how to cope with increasing demands on CPU power during analysis due to

- our increasing adoption of R3BRoot for all our detectors
- our increasing level of analysis (raw -> calib -> tracking)
- our increasing number of channels (100s -> 1000s)
- our increasing accepted trigger rate (1 kHz -> more than 30 kHz)

In the beginning it was barely noticeable that data analysis took time on the CPU, because

- few detectors were actually handled
- only a few tasks per detector were implemented
- incoming event rate was low

This year is a turning point with respect to most of these aspects, which has led to noticeable delays in the data analysis. The online analysis in particular was close to unusable at times during the s444/s473 data taking. This was partly amplified by the low network bandwidth and an unfortunate allocation of resources (analysis, data transport and user home directories all on a shared machine). In s454 the situation was a bit better, but mostly because major detector systems were not participating in the analysis (Neuland, AMS, PSP).

Also, looking at the amount of data collected, we've reached a new record with more than 40 TB of stored lmd files. We've collected many millions of events, and offline data analysis can currently proceed only at a fraction of the acquisition speed. This means that analysing and re-analysing the full data sample takes up a considerable amount of time for the student / post-doc working with the data.

Therefore, I'd like to discuss the options we have for improving the performance of data analysis within R3BRoot and/or with the help of external tools.

A few ideas come to mind very quickly:

- parallelization using PROOF (old-fashioned?)
- fan-out of events from ucesb to several R3BRoot analysis processes, then merging histograms and trees in a final step (ucesb part is already implemented)
- parallelization using the FairRoot framework (using FairMQ, with a control macro to deploy to the batch farm; what is the status there?)
- separating each FairTask into a standalone 'micro-service', which always runs and processes data as soon as it is available (similar to DAQ nodes)
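
To make the fan-out idea a bit more concrete, here is a minimal sketch in plain Python of the pattern only: events are distributed round-robin to several workers, each worker fills its own histogram, and the partial histograms are summed in a final step. This is not ucesb or R3BRoot code. The function names, the binning and the fake event values are all made up for illustration, and threads stand in for what would be separate R3BRoot processes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical histogram definition: 10 bins covering [0, 100).
N_BINS = 10
LO, HI = 0.0, 100.0


def fan_out(events, n_workers):
    """Distribute events round-robin into n_workers separate lists."""
    return [events[i::n_workers] for i in range(n_workers)]


def analyse(events):
    """Stand-in for one analysis process: fill a 1D histogram."""
    hist = [0] * N_BINS
    width = (HI - LO) / N_BINS
    for value in events:
        if LO <= value < HI:
            hist[int((value - LO) / width)] += 1
    return hist


def merge(histograms):
    """Final step: add the partial histograms bin by bin."""
    return [sum(bins) for bins in zip(*histograms)]


def run(events, n_workers=4):
    """Fan out, analyse each chunk in its own worker, then merge."""
    chunks = fan_out(events, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(analyse, chunks))
    return merge(partial)


# Fake "events": 1000 values cycling through 0..99.
events = [float(i % 100) for i in range(1000)]
merged = run(events)
```

The point of the sketch is that histogram filling is additive, so the merged result is identical to what a single process would produce on the full event stream. Anything that depends on event order or on state carried between events would need more care.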

I believe we also have to distinguish between online and offline analysis here, because different boundary conditions apply:

- online does not need full statistics, offline must process every event
- online precision can be lower (e.g. calibration parameters), offline should be as accurate as possible
- online should be single pass, offline can take multiple iterations
- online should result in histograms, offline should produce a tree for further processing
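
As an illustration of the first point, a simple way to exploit "online does not need full statistics" is to prescale the event stream before it reaches the analysis. The sketch below is plain Python with a hypothetical `prescale` helper. The 30 kHz input rate is taken from the numbers above, while the assumption that the online analysis can sustain roughly 1 kHz is purely illustrative.

```python
def prescale(events, factor):
    """Yield every factor-th event, starting with the first one."""
    if factor < 1:
        raise ValueError("prescale factor must be >= 1")
    for index, event in enumerate(events):
        if index % factor == 0:
            yield event


# One second of data at 30 kHz, represented by event numbers only.
incoming = range(30000)

# Assumed online capacity of about 1 kHz, hence a prescale factor of 30.
online_sample = list(prescale(incoming, 30))

# Offline must process every event, which corresponds to a factor of 1.
offline_sample = list(prescale(incoming, 1))
```

A fixed-factor prescaler like this keeps the online load bounded regardless of the trigger rate, at the cost of statistics. Whether it should be applied per trigger type or globally is a separate question.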

Please share your ideas, thoughts, suggestions, because this is the next important step we have to tackle regarding our data analysis.

Bastian


