-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[RDF] Prepare for snapshot with variations (part 4) #19727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Due to the deprecation marker, the snapshot documentation is parsed incorrectly in doxygen. Here, the main documentation is moved to the non-template overload, and the other overloads are referring to this one using the full argument list. "See above" typically doesn't work, since doxygen sorts the functions.
When a Define is added to a branch that gets varied, a new column reader was created for every variation and for every slot, irrespective of whether the column gets varied or not. For snapshot with variations, this column would be written for every variation, although it always evaluates to the same value. To prevent this duplication, the column readers that point to the same Define are shared after this commit. The readers are still cloned per slot though, so the Define's value caches remain thread safe. This allows SnapshotWithVariations to detect that columns are identical, so they are not written multiple times. The memory savings amount to 24 bytes per suppressed column reader,so they won't have a notable impact.
In order to not write a column multiple times in a snapshot with variations, the RDefineBase pointer needs to stay invariant when a variation is generated. However, the RJittedDefine was returning the pointer to its member, so it looked like the column was affected by variations. Here, this is changed to returning "this" when no variations are in effect.
Given that snapshots with variations will be enabled via RSnapshotOptions, the snapshot overload for Vary can be removed, and a static_assert can be used to signal the failure.
In order to implement snapshots with systematic variations, it is beneficial to place all members related to output branches into structs. At present, this only constitutes a move from member of SnapshotHelper to member of RBranchData struct. The TTree snapshot helpers are changed accordingly.
It is superseded by RBranchData.
Test Results0 tests 0 ✅ 0s ⏱️ Results for commit aed8ece. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last commit LGTM, but since I requested changes in #19726 I will wait for that PR to be merged before continuing
// in the output tree. | ||
auto sizeLeafIt = | ||
std::find_if(outputBranches.begin(), outputBranches.end(), | ||
[&sizeLeafName](RBranchData const &bd) { return bd.fOutputBranchName == sizeLeafName; }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: in RDF there is the practice of using const-qualifiers on the left of the type
[&sizeLeafName](RBranchData const &bd) { return bd.fOutputBranchName == sizeLeafName; }); | |
[&sizeLeafName](const RBranchData &bd) { return bd.fOutputBranchName == sizeLeafName; }); |
if (!sizeLeafIt->fOutputBranch) { | ||
// The size leaf was emplaced, but not initialised yet | ||
const auto sizeTypeStr = ROOT::Internal::RDF::TypeName2ROOTTypeName(sizeLeaf->GetTypeName()); | ||
// Use Original basket size for Existing Branches otherwise use Custom basket Size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Use Original basket size for Existing Branches otherwise use Custom basket Size. | |
// Use original basket size for existing branches otherwise use custom basket size. |
@@ -449,17 +499,16 @@ void ROOT::Internal::RDF::UntypedSnapshotTTreeHelper::UpdateCArraysPtrs(const st | |||
// associated to those is re-allocated. As a result the value of the pointer can change therewith | |||
// leaving associated to the branch of the output tree an invalid pointer. | |||
// With this code, we set the value of the pointer in the output branch anew when needed. | |||
assert(values.size() == fBranches.size()); | |||
assert(values.size() <= fBranchData.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remind me of the reason for this change? IIRC it's because the Snapshot helper will be able to deal both with the nominal-only case and with the case including systematic variations so there may be less values than the number of branches. But just by typing this out something doesn't click in my mind :/
This is part 4 of the work in #19725.
Please review part 2 #19726 first, since the commits will disappear from here once it gets rebased.
What's going to be left here is a refactor of the members of all TTree snapshot helpers, which will allow keeping all necessary data for a column in a single struct. This will simplify snapshot with variations, since each varied columns are created by copying the struct of the original column, and overriding a few members.
If the data hadn't been colocated like it's done here, it would become much harder to keep everything in a consistent state for both nominal and variations.