[FYI] Introducing a python based online for ucesb #1243

klenze · 2025-07-15T15:44:19Z

klenze
Jul 15, 2025

For your information, the following is a copy of a mail I just sent to the analysis WG.

Dear colleagues,
One of the time-honored traditions of R3B is that at some point, DAQ persons write their own analysis framework as an alternative to FairRoot/R3BRoot, and try unsuccessfully to get the collaboration to adopt it.

This here is then me trying exactly that.

Behold pyjsroot [0] (better name pending), a proof of concept online monitor based on my pyh101 ucesb client [1].

A short anecdote: a few weeks ago, I wanted to integrate a new detector prototype (FastTof, which is currently read out with VFTX modules) into an online analysis so that I could check coarse and fine time correlations between it and LOS with a pulser.

In R3BRoot that would have taken 14 new files in the main git (plus two in R3BParams) and edits in five more to add the new detector and histogram time differences between specific detectors. [2]

With pyjsroot, it took me about an hour to create a class for displaying the calibrated time differences of arbitrary TDC channels around the fine- and coarse range, and then five minutes more to specify which detectors and channels should be considered in this case. Imagine my surprise when I noticed I was having fun building an online analysis!

Why Python? See [3] for why I think that C++ is not a suitable analysis language for our collaboration.

At the moment, pyh101 can unpack arbitrary ucesb data encountered in R3B unpackers into nice Python structures, assemble White Rabbit timestamps, apply a fine time calibration for GSI TDC systems (Tamex, VFTX) on the fly (optionally with caching of the calibration), calculate ToT values for Tamex data and identify which detector channels correspond to which trigger channels. This can be expanded upon either from Python or in the CPython module.
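
To make two of these steps concrete, here is a minimal sketch. It is not pyh101's actual code. It assumes that the White Rabbit timestamp arrives as four 16-bit words (lowest word first) and that the fine time calibration uses the common code-density approach, in which pulser hits are taken to be uniformly distributed over one clock period. The function names and the 5 ns default period (one 200 MHz clock cycle) are illustrative choices.

```python
def assemble_wr_timestamp(t1, t2, t3, t4):
    """Combine four 16-bit words (t1 = lowest) into one 64-bit timestamp."""
    for word in (t1, t2, t3, t4):
        if not 0 <= word <= 0xFFFF:
            raise ValueError(f"not a 16-bit word: {word}")
    return (t4 << 48) | (t3 << 32) | (t2 << 16) | t1


def build_fine_time_lut(raw_codes, n_bins, period_ns=5.0):
    """Code-density calibration: return a table mapping fine-time code -> ns.

    Assumes the calibration hits are uniformly distributed over one clock
    period, so the time at the centre of a bin is proportional to the
    number of hits collected below that centre.
    """
    counts = [0] * n_bins
    for code in raw_codes:
        counts[code] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no calibration hits")
    lut, below = [], 0
    for c in counts:
        lut.append(period_ns * (below + 0.5 * c) / total)
        below += c
    return lut
```

Caching the calibration then amounts to storing the returned table and reusing it instead of rebuilding it from the first events of every run.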

pyjsroot can read unpacker data from file or stream and show a few basic LOS and ROLU spectra, a basic time-of-flight spectrum between LOS and FastTof, histogram the TDC differences of pulsed channels to make sure our 200 MHz clocks are in sync, and analyze White Rabbit timestamps.

Given the complexity of our present beam monitoring solution (r4l -[UDP]-> udp_repeater -> udp_reader -[zmq]-> websocket server -[ws]-> browser), I think a pyh101 based solution could greatly reduce complexity at the cost of a somewhat higher delay, likely a few seconds.

Again, the idea of my proof of concept is not that I realistically aim to replace R3BRoot, but simply to challenge the prevailing opinion on what is a normal and healthy amount of boilerplate code. One can build a ucesb reader and jsroot server in a couple of thousand lines of code (i.e. ~1% of the FR+R3BRoot codebase) easily, and you can completely define a histogram in a single statement just fine instead of having to add code in four different places.
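
To illustrate what defining a histogram in a single statement can look like, here is a hypothetical sketch. It is not pyjsroot's actual interface: the `Hist1D` class, its arguments and the event keys `los_t` and `fasttof_t` are all made up for this example.

```python
class Hist1D:
    """Fixed-binning 1D histogram that extracts its own value from each event."""

    def __init__(self, name, nbins, lo, hi, value):
        self.name, self.nbins, self.lo, self.hi = name, nbins, lo, hi
        self.value = value            # callable: event dict -> number or None
        self.counts = [0] * nbins

    def fill(self, event):
        x = self.value(event)
        if x is None or not (self.lo <= x < self.hi):
            return                    # nothing to fill, or out of range
        index = int((x - self.lo) * self.nbins / (self.hi - self.lo))
        self.counts[min(index, self.nbins - 1)] += 1


# The complete definition (name, binning and what to fill) is one statement:
h = Hist1D("los_fasttof_dt", 10, -5.0, 5.0,
           lambda ev: ev["fasttof_t"] - ev["los_t"])

for ev in [{"los_t": 1.0, "fasttof_t": 1.5}, {"los_t": 2.0, "fasttof_t": 0.0}]:
    h.fill(ev)
```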

If you want to give it a try, use the links below and drop me a mail if you run into trouble.

Cheers,
Philipp

[0] https://github.com/klenze/pyjsroot/
[1] https://github.com/klenze/pyh101/
[2] https://github.com/klenze/pyjsroot/blob/master/docs/intro.md
[3] https://github.com/klenze/pyjsroot/blob/master/docs/soapbox.md

YanzhaoW · 2025-07-15T22:58:20Z

YanzhaoW
Jul 15, 2025
Collaborator

Thanks for this interesting post. I have a few questions:

Are you deserializing the binary data from ucesb yourself, instead of using ucesb_client?
Is only one browser client allowed to connect to the http server? Or multiple as well?
For now, each experiment is using its own unpacker. So when we want to deserialize the binary data from a different experiment, we have to use a different version of upexp. Is this the same for your program as well?

> Why Python? See [3] for why I think that C++ is not a suitable analysis language for our collaboration.

I kind of disagree here as someone who doesn't care much about online analysis. Apart from online analysis, R3BRoot also contains the offline analysis (calibration), event reconstruction and simulation. These parts are very computation heavy and mostly deal with millions or even billions of events. C++ is the perfect tool for this. The workloads, data IO, event scheduling, parameter management are handled by the FairRoot framework. Well, I admit that FairRoot is pretty bad and many things are completely over-engineered. But the important thing is that GSI has a dedicated software department to maintain this, and we don't have that many resources.

About Python, I would say it's kinda good and bad. I use PyROOT a lot in JupyterLab for some high level data analysis. It can replace "almost" everything we do in a ROOT macro file. Drawing histograms also becomes easy and fast instead of popping up an ugly TBrowser window and transmitting everything over X11 via a slow internet connection. (Though jsroot is really slow and I always turn it off.) That being said, I have found quite a few problems using Python/PyROOT:

It can't analyze a large number of events. Up to 100k events are "fine" with RDataFrame. Once the number of events is larger than 1 million, it's basically "unplayable". And this already requires me not to use any for-loop equivalents. There is Numba, which can be used to pass a function to RDataFrame, but it's really difficult to use.
Incompatibility between different versions of dependencies. Well, there is nothing to say about Python regarding version management. It's just the worst.

klenze · 2025-07-17T14:16:16Z

klenze
Jul 17, 2025
Author

> Are you deserializing the binary data from ucesb yourself, instead of using ucesb_client?

I use a version of ext_data_client from ucesb which I have slightly modified. Normally, ext_data_client would read the desired field names from your ext_h101_* file preprocessor macros which define what data your compiled code would expect where, then try to match the data format provided by a particular ucesb unpacker to these fields.

I instead just take the fields in the order I receive them from ucesb and use the metadata (field names, types, array sizes; note that this requires breaking abstraction a bit in ext_data_client.h) to decide, during initialization, how each field should be translated into Python objects. I then put all of these pythonizer instances (called *_iteminfo for some reason) into a std::vector and just run through them for every event.
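
The idea can be sketched in Python, although the real implementation lives in the compiled part of pyh101. Everything here is hypothetical and only meant to show the pattern: the metadata layout (name, maximum array length, name of the length field), the classes `ScalarItem` and `ArrayItem`, and the flat list standing in for the event buffer.

```python
class ScalarItem:
    """Converts one scalar field of the event buffer."""

    def __init__(self, name, offset):
        self.name, self.offset = name, offset

    def convert(self, buf, out):
        out[self.name] = buf[self.offset]


class ArrayItem:
    """Converts an array whose current length is stored in another field."""

    def __init__(self, name, offset, len_offset):
        self.name, self.offset, self.len_offset = name, offset, len_offset

    def convert(self, buf, out):
        n = buf[self.len_offset]
        out[self.name] = list(buf[self.offset:self.offset + n])


def build_converters(metadata):
    """Run once at initialization: walk the fields in the order received."""
    converters, offsets, offset = [], {}, 0
    for name, max_len, len_field in metadata:
        offsets[name] = offset
        if max_len is None:                      # scalar field
            converters.append(ScalarItem(name, offset))
            offset += 1
        else:                                    # array field with length field
            converters.append(ArrayItem(name, offset, offsets[len_field]))
            offset += max_len
    return converters


def convert_event(converters, buf):
    """Run for every event: just iterate over the prepared converters."""
    out = {}
    for item in converters:
        item.convert(buf, out)
    return out
```

All decisions about types and layout are made once in `build_converters`; the per-event loop does no name matching at all.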

> Is only one browser client allowed to connect to the http server? Or multiple as well?

I think it is the same as for the R3BOnline jsroot: multiple connections are allowed, and we have not run into a relevant limit yet. With high binning histograms and auto-update you would probably run into bandwidth issues at some point, I guess.

> For now, each experiment is using its own unpacker. So when we want to deserialize the binary data from a different experiment, we have to use a different version of upexp. Is this the same for your program as well?

Yes, only I have no compile-time preconception of what fields a particular unpacker might provide. For example, you could just check if "ROLU2" is a key in the main dict, and if it is, you instantiate a 2nd ROLU online for it. Of course, if your code expects there to be a key like "TPAT" in the dict and it is not, then you will get an exception. And if you had the case where in one unpacker, TPAT, TPATv are an array and in another TPAT is just an integer, then you would get a type error when you try to call len(d["TPAT"]).
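
As a sketch of what this looks like in user code (the function `describe_event` and the contents of the event dict `d` are made up for illustration, not real unpacker output):

```python
def describe_event(d):
    """Inspect an unpacked event without compile-time knowledge of its fields."""
    notes = []
    # Optional detector: only react to a second ROLU if the unpacker provides it.
    if "ROLU2" in d:
        notes.append("second ROLU present")
    # Mandatory field: a missing key raises KeyError right here.
    tpat = d["TPAT"]
    # The same name may be an array in one unpacker and a plain integer in
    # another; calling len() on the integer would raise TypeError.
    if isinstance(tpat, int):
        tpat = [tpat]
    notes.append(f"{len(tpat)} TPAT value(s)")
    return notes
```

Whether to normalize the type as done here or to let the TypeError surface is a design choice; the point is only that both cases are decided at run time.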

> Apart from online analysis, R3BRoot also contains the offline analysis (calibration), event reconstruction and simulation. These parts are very computation heavy and mostly deal with millions or even billions of events. C++ is the perfect tool for this.

I agree that simulation is computationally heavy. However, I imagine that most of the time spent running simulations is actually spent in some GEANT4 calls.

C++ is an acceptable solution if you have experts who can write and review code. As I have argued, our codebase is basically full of lines where it is apparent that the author had no understanding of object ownership. In my mind, that means they should never have been allowed to touch a raw pointer, but anyone working with FairRoot or ROOT will have to touch raw pointers. Sure, you could switch to smart pointers (which would involve rewriting basically every ROOT and FairRoot interface), and train everyone to use them (which would first require teaching them how templates work and thus be about as much effort as training them to use Python), but then you have people using multi-dimensional C arrays, or using sprintf or a myriad of other behaviors which were state of the art in 1995.

UCESB works because it has a maintainer (@inkdot7) who really knows his C/C++ and will carefully review any PRs. If I tried to get him to merge a change where I had just copy-renamed a class, or called free on some random pointer I had gotten from god-knows-where, then it is very unlikely that my code would get merged.

In R3BRoot, there is very little in the way of code reviews.

> The workloads, data IO, event scheduling, parameter management are handled by the FairRoot framework. Well, I admit that FairRoot is pretty bad and many things are completely over-engineered.

I think that FR does a couple of things, some better than others:

FairSoft: Not part of FR proper, but whatever. Actually not bad, a way to get Geant4, ROOT, boost and friends to users which is at least slightly simpler than having them clone and compile them manually.
CMake stuff: Yeah, so if you want to generate rootmap files so people can instantiate classes from cling you have to do some ugly hacks, which cause more trouble and confusion than having people compile their code ever would.
TVirtualMC: I think this was a major selling point back in the day when Geant4 was newish and untrusted and people still wanted to run Geant3 -- by providing a common interface which could be used with either MC backend, you got around having to have a separate version for both (and also FORTRAN). Personally, I have not heard "we ran a Geant3 simulation and" in a decade, everyone seems to use G4. This makes TVirtualMC a liability, a middle-man interface without a purpose. Sure, with ROOT, you can save your geometries to .root and view them in a TBrowser (until it segfaults), but at the end of the day, your geometry is procedurally created by some code (or possibly from a CAD file), and there is no good reason not to just do that at the initialization of your simulation. Getting a single detector class where you can set up your geometry (including sensitive material) and then have FR call you back with a list of hits is nice, but could be accomplished in plain G4 without too much trouble. (Edit: digging a bit more into it, it seems that your Detector class gets called once per vertex, and it is up to you to keep track of state for the event, instead of calling you once per track or once per event. And of course, you need TVirtualMC::GetMC()->TrackPosition() etc because no FairRoot method would be complete without you having to fetch a singleton.)
FairTask: okay, so they have a modular event-processing framework based on calling a method for every task in a vector. Except that they also insist on their two-stage initialization¹. Plus another method for re-initializing and yet another for assigning parameter containers. No, they can not pass you FairRootManager or FairRuntimeDb instances in these methods, because that would cause the stock of Seagate to crash.
FairParams: honestly we would be better off having a random summer student add an ad-hoc implementation based on a randomly chosen XML/yaml/json library.
TTree IO: Okay, I can appreciate that there is likely some glue code involved, but serializing TObjects to TTrees seems like it could be accomplished with plain root without too much effort.
Event data reading: my favorite part. "We fetch you the subevent you were interested in, you just parse it, it's just a few pointer increments and bitshifts, how hard can it be?" Now the R3B solution with one hundred and one ext_h101 files is admittedly messy, but if R3BRoot did have to do the unpacking it would be a hundred times more messy.
FairLogger: also not strictly part of FairRoot. Okay, I guess. Per-task verbosities would have been nice, but they are hardly to blame for the abomination that is R3BLOG².
FairMQ et al: a lot of stuff in FR is not used in R3B, which is fine by me (I just wish FairParams were included in that set), but we can hardly say we depend on it if we are not using it.
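
For comparison with the FairTask item above, the core of a task-based event loop is small. The following is a minimal sketch with hypothetical class names; it does not reflect how FairRoot is implemented. Each task receives everything it needs in its constructor, so there is no separate initialization stage, and a missing parameter stops the program before the event loop starts.

```python
class Calibrate:
    """Task that needs a parameter; fails at construction if it is missing."""

    def __init__(self, params):
        self.gain = params["gain"]   # KeyError here halts before any event is read

    def process(self, event):
        event["energy"] = event["raw"] * self.gain


class Collect:
    """Task that stores one value per event in a caller-provided list."""

    def __init__(self, key, sink):
        self.key, self.sink = key, sink

    def process(self, event):
        self.sink.append(event[self.key])


def run(tasks, events):
    """Call every task, in order, for every event."""
    for event in events:
        for task in tasks:
            task.process(event)


energies = []
run([Calibrate({"gain": 2.0}), Collect("energy", energies)],
    [{"raw": 1.5}, {"raw": 4.0}])
```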

> But the important thing is that GSI has a dedicated software department to maintain this, and we don't have that many resources.

Allow me a logistics metaphor. When the FAIR factory (facility) was planned in 1850 (early 2000s), it was decided that we needed a solution to supply it with raw materials (simulate and analyse event data). At that time, most bulk transport used waterways (compiled languages were standard for serious data analysis), so they created a project called FairShip and designed it according to the local nautical tradition of wooden sailing ships (based their design on ROOT<=5, which also inspired their design mindset despite not being state of the art software design). R3B naturally decided that as a FAIR project, they would use FairShip, because it would be provided to us for free and with good support. The only problem was that the R3B site is about 360km inland, and R3B would be responsible for building a waterway (we would need to write all of the classes which handle our detectors in the boilerplate-heavy FairRoot way). To make matters worse, our architects insist on not building any locks or using excavators (having no templates and separate classes to process/store identical data -- such as Mapped TAMEX data for NeuLAND, TofD, LOS, ROLU, Tofi, Sci2, Sci8). Now you are saying that water transport is still the cheapest way to transport bulk goods (compiled languages have a runtime advantage), and you are certainly correct, but the question is whether for R3B, continuing to dig a meandering canal (trying to use C++ FairStyle, when most contributors only have a rudimentary understanding of C++, or worse are just copying R3BRoot "best practices") is the best option. I am not saying that having people shovel a canal for FairShip can not be done -- this is how the Suez canal was created, after all. I am simply saying that it is a very large and painful undertaking which is not worth the cost, because supplying R3B will not take the bandwidth of the Suez canal.
I have just created a footpath, thereby demonstrating that overland transport is indeed a possibility, even if it would take quite some effort to go from that to a truck-capable road.

(end metaphor, finally).

R3B has -- by preference or by funding constraints -- a grand total of one (1) tenure track on-site DAQ and controls operator position (previously Basti, now Martin). The odds of them hiring a team lead with a strong C++ background to manage the development of the simulation and analysis framework are slim indeed -- after all, analysis can just be done by random grad students without a CS background, right?

From my perspective, FairRoot does not bring enough to the table to be worth the pain. If they were providing excellent data unpacking frameworks (which you could naively expect, given that there is about 10m of corridor between the FR devs and GSI EE, who build most of our electronics), e.g. an uber-ucesb fully integrated in FR which also does fine time calibrations out of the box, that might be reason enough to keep them. But they do not. In fact, GSI EE prefers GO4, a completely different framework which had its own set of problems (e.g. ad-hoc unpacking) last time I checked, which admittedly was some years ago.

"We should do everything in C++ because we need speed" sounds like a terminal case of premature optimization. Nothing is stopping us from starting with an interpreted implementation and replacing bottlenecks with a compiled implementation once we identify them. With R3BRoot, premature optimization is the only kind of optimization which is widely done. At least I have not seen the telltale signs of targeted optimization in any Exec methods I have read (but then again I have not read a lot of NL Exec methods lately).

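One way to identify such bottlenecks in a Python analysis is the profiler from the standard library. A minimal sketch follows; `calibrate` and `process_events` are made-up stand-ins for real per-event work, and `profile_report` is a hypothetical helper name.

```python
import cProfile
import io
import pstats


def calibrate(raw):
    """Stand-in for some per-hit calculation."""
    return raw * 0.5


def process_events(n):
    """Stand-in for an event loop calling calibrate() once per event."""
    return sum(calibrate(i) for i in range(n))


def profile_report(func, *args):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


if __name__ == "__main__":
    total, report = profile_report(process_events, 100_000)
    print(report)
```

The functions that dominate the cumulative time in the report are the candidates for a compiled replacement; everything else can stay interpreted.
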
At the moment, pyjsroot is meant for what it says on the tin, creating online histograms with a minimum of pain, which is useful to me as someone who sometimes still runs DAQs. I have little hope of it providing a fully viable alternative to R3BRoot in the same way that GNU/Linux evolved to be a viable alternative to Microsoft Windows (for many but not all tasks). I would rejoice if my work would be the GNU hurd in that metaphor, pointing in the right direction without ending up as part of the solution itself.

> transmitting everything over X11 via a slow internet connection.

PSA: GSI lxpool computers can be connected to using x2goclient.

https://wiki.r3b-nustar.de/misc/x2go

I have not used RDataFrame or interfaced TTrees from python so far. (I think I did my histogramming in compiled C++ and used pyroot mostly for fitting and pretty-printing root objects.)

¹ This is done to avoid having exceptions or error handling in the constructor, IIRC. Now, exceptions during construction are messy, sure, but only when you plan to catch and survive them. But you are an event processing framework, not an inertial navigation system: if an unexpected error happens, just halt and catch fire.
² R3BLOG(warn, "Apparent streaming into strings has confused "<<fConfused<<" people so far");

YanzhaoW · 2025-07-25T12:10:09Z

YanzhaoW
Jul 25, 2025
Collaborator

Sorry for the late reply.

> I think it is the same as for the R3BOnline jsroot: multiple connections are allowed, and we have not run into a relevant limit yet. With high binning histograms and auto-update you would probably run into bandwidth issues at some point, I guess.

I've never used jsroot (except that I turn it off in JupyterLab because it's too slow and makes my browser laggy). But you mentioned you are also using a websocket for the data transmission to the browser. If so, do you need to manage the multiple connections yourself, including how to establish a connection and how to shut it down when the user closes the tab?

I don't quite understand your metaphor. But personally I think R3BRoot is doing fine at the moment (again speaking of offline analysis) because of its modularity. That means online monitoring, calibration, simulation and event reconstruction of different detectors are separated into independent components. What comes out of those independent components is a ROOT file containing event data (a ROOT Tree, or RNTuple in the future), which will be merged and used for the next level of data analysis. How the ROOT data is generated differs among the WGs. If the code is written in a terrible way, for example with lots of (heap) memory allocations, old ugly C style arrays or new/delete everywhere, they will have a hard time maintaining it themselves. If the code is well-structured and written well, it will be easy for them to maintain. For me, who is only working on NeuLAND, I couldn't care less about the code quality of other detectors (except LOS). This is fair as one gets what one deserves.

Now the issues you have mentioned in the post, in my opinion (probably wrong), are due to the poor "division of labor". If one has to do the high level analysis for physics, and data acquisition of a detector, and calibration, and simulation, of course it's hard for everyone, as each process is handled best by a different tool. If you are doing high level data analysis, which probably most of our remote members are doing, you just need to learn Python and retrieve the data from ROOT files. On the other hand, calibration and simulation need to be done by people with decent knowledge of C++, like how the memory is allocated, how to use the STL algorithms and how the FairRoot framework works. But this area doesn't need many people as the process is basically the same for all experiments.

> UCESB works because it has a maintainer (@inkdot7) who really knows his C/C++ and will carefully review any PRs. If I tried to get him to merge a change where I had just copy-renamed a class, or called free on some random pointer I had gotten from god-knows-where, then it is very unlikely that my code would get merged.

I find it really hard to agree here. From my limited painful experience with UCESB, I think it's a total mess and anything is better than this. It's written in old C (probably compiled by a C++ compiler) in such a way that I have to take some rests while reading the code. We should just get rid of it and dump the raw data directly to the ROOT files.

> CMake stuff: Yeah, so if you want to generate rootmap files so people can instantiate classes from cling you have to do some ugly hacks, which cause more trouble and confusion than having people compile their code ever would.

Hmm, I don't understand what you mean by "ugly hacks". We inject dictionary information directly into the CMake target, like:

root_generate_dictionary(
    Lib_dict
    ${HEADERS}
    MODULE
    R3BLib
    LINKDEF
    R3BLibLinkDef.h)

This is way better and cleaner than the obsolete GNU make setup of UCESB and its home-made language.

> TVirtualMC: I think this was a major selling point back in the day when Geant4 was newish and untrusted and people still wanted to run Geant3 -- by providing a common interface which could be used with either MC backend ...

I agree with this. They should get rid of Geant3 and Fortran stuff. But I could also see the purpose behind embedding the simulation inside a larger framework. For example, I have been using simulation to generate cal level data. To convert the simulated cal level data to the hit level data, I could use the same task that I use in the calibration.

> FairTask: okay, so they have a modular event-processing framework based on calling a method for every task in a vector. Except that they also insist on their two-stage initialization¹. Plus another method for re-initializing and yet another for assigning parameter containers. No, they can not pass you FairRootManager or FairRuntimeDb instances in these methods, because that would cause the stock of Seagate to crash.

Yeah, two-stage initialization is definitely weird. I will probably make a post to discuss this with the FairRoot team. This has nothing to do with FairTask itself, but rather with FairRun. I'm not aware of any problems with throwing exceptions in a constructor. I mean, error handling in the constructor is one of the reasons why exceptions exist.

> FairParams: honestly we would be better off having a random summer student add an ad-hoc implementation based on a randomly chosen XML/yaml/json library.

I guess the reason why they chose ROOT files to store data is again the data parsing. Parsing a nested data structure to a binary file is kind of difficult for XML/yaml/json files. We do have some nice json libraries, like nlohmann-json. But I feel Yaml/Json is more for storing configurations instead of data.

> TTree IO: Okay, I can appreciate that there is likely some glue code involved, but serializing TObjects to TTrees seems like it could be accomplished with plain root without too much effort.

Well, we have R3BIOConnector to make this easier. But FairRoot is going to shift to RNTuple anyway.

> "We should do everything in C++ because we need speed" sounds like a terminal case of premature optimization.

Well, like I said, high level data analysis should be done in Python. But Python is not suitable for any process that needs to analyze more than hundreds of thousands of non-empty events. So use the best tool for the job.
