-
Notifications
You must be signed in to change notification settings - Fork 31
Add a base algorithm to create NTuples #364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
I have played only a little bit with this now and I generally like the concept. There are a few things that are not entirely clear to me yet, and a I stumbled over a few others, where I think we might need to improve a bit.
Some issues with flexibilityIn general the approach has some type-safety built in, e.g. it's not possible to assign a value that is not part of the Runtime type changesConsider a slightly changed example, where the assignment to the PDG field changes type at runtime. // ... all the rest
if (m_counter > 5) {
NTupleMap["PDG"] = particle.getPDG();
} else {
NTupleMap["PDG"] = particle.getMomentum();
}
// ... all the restThis will pass assignment, since The type of the branch will (quite expectedly) be the first type that is used. Conditionally filling slotsAgain consider the slightly changed example, now the following way // ... all the rest
if (m_counter < 5) {
NTupleMap["AConditionallyFilledSlot"] = 42;
} else {
NTuplesMap["AnotherConditionallyFilledSlot"] = 54;
}
// ... all the restThis will result in an output that only has the AConditionallyFilledSlot but not the AnotherConditionallyFilledSlot. It will still compile and run without any problems. However, the AConditionallyFilledSlot field will still contain the same number of entries as all other slots, for the ones where no new value has been assigned, it will simply re-use the one that has been previously assigned. Given, that we have quite a few things in place that mandate that all events should look the same in other places (e.g. for regular EDM4hep output), IMO this breaks that consistency and can lead to quite unexpected results. Some random thoughtsI haven't yet fully though many of these through, but I would still like to write them down, while the impressions are fresh. Maybe they can serve as a starting point for further discussion.
|
Anyone can create an algorithm that creates a TFile, a TTree, creates all the member variables that will hold the values and sets the addresses of the branches to them and then they are filled in the event loop. Therefore, if we provide something here, it has to be different. The following proposal allows
static_cast.NTupleMap["NameOfTheBranch"] = value;Runs single-threaded, to have a solution running multithreaded the options provided by ROOT to write in a multithreaded way should also be studied in addition to an algorithm that can write multithreaded TTrees or RNTuples the same way as in this PR but locking.
Why not use the NTuple writers from Gaudi? None of the existing algorithms will support reading an arbitrary number of collections like we do in this repository. The
GenericNTupleWritertakes inputs directly from the store (see https://gitlab.cern.ch/gaudi/Gaudi/-/blob/master/GaudiTestSuite/tests/pytest/NTuple/test_GenericNTupleWriter_st.py#L69) so it won't understand our collections, doesn't support RNTuples and sets a new address for each event.BEGINRELEASENOTES
std::variantthat allows writing TTrees and RNTuples with simplemap[key] = valueassignments and has some protections at compile time related to output types.ENDRELEASENOTES