    cmsrel CMSSW_15_0_10
    cd CMSSW_15_0_10/src
    cmsenv

Note: no need to set up official NanoAOD-tools as it has been integrated into CMSSW.

    git clone https://github.com/colizz/NanoHRT-tools.git PhysicsTools/NanoHRTTools -b dev/nanov15
    scram b -j8

October 2025: This update adapts the framework based on the Run 2 UL setup [1] and two subsequent improvements for early Run 3 (2022/2023, for processing NanoAOD v12) [2,3], and makes it compatible with all currently used NanoAOD versions (NanoAOD v9, v12, v15). When running on NanoAOD v9/v12 samples, the framework gives results consistent with [2,3].

[1] https://github.com/colizz/NanoHRT-tools/tree/dev-UL-0201
[2] https://github.com/lpaizano/NanoHRT-tools/tree/dev/run3
[3] https://github.com/zichunhao/NanoHRT-tools/tree/wz-calibration
Changes:
- Moved files in `src/interface/python/data` from original NanoAOD-tools to NanoHRT-tools if they are not migrated to CMSSW's NanoAOD-tools.
- Specialized support for different NanoAOD versions, including: fatjet taggers, jet b-tag WPs, usage of MET branches, jet corrections.
- Alignment with the latest data campaigns: luminosity values, golden JSON, PU reweighting files (FIXME), lepton ID/isolation, JEC/JER.
- Refactoring of the `qcd` channel.
- Updates to JetID logic: in NanoAOD v12, `Jet_jetId` is preserved but re-computation is recommended.
- Added `jet_veto_maps` following the logic in [2], updated to use the latest minimal jet selection criteria.

**Cross validation with early NanoHRT-tools branches**
1. Validation with Run 2 UL setup for the qcd channel (deriving sfBDT SFs) [1]
[1] https://github.com/colizz/NanoHRT-tools/tree/dev-UL-0201
Configure the `runHeavyFlavTrees.py` script by updating the `default_config` dictionary:

    default_config.update({
        'nano_version': 'V9',
        'fill_sv': True,
    })

Then run the production:

    python runHeavyFlavTrees.py -o /eos/<some-eos-path-on-lxplus>/val_nanov9 --jet-type ak8 --channel qcd --sample-dir samples_nanov9 --year 2018 -n 1

2. Validation with Run 2 UL setup for the muon channel (deriving top/W SFs) [1,1a]
[1] https://github.com/colizz/NanoHRT-tools/tree/dev-UL-0201
[1a] https://github.com/hqucms/NanoHRT-tools/tree/dev/UL
Configure the `runHeavyFlavTrees.py` script by updating the `default_config` dictionary:

    default_config.update({
        'nano_version': 'V9',
    })

Then run the production:

    python runHeavyFlavTrees.py -o /eos/<some-eos-path-on-lxplus>/val_nanov9 --jet-type ak8 --channel muon --sample-dir samples_nanov9 --year 2018 -n 1

3. Validation with early Run 3 setup for the muon channel (deriving top/W SFs) [2]
[2] https://github.com/lpaizano/NanoHRT-tools/tree/dev/run3
Configure the `runHeavyFlavTrees.py` script by updating the `default_config` dictionary:

    default_config.update({
        'nano_version': 'V12',
        'use_existing_jet_ids': True,  # a jetId bug has been identified. Latest recommendation is to re-compute jetId via jet branches (set it to False) but here we use the existing jetId for cross validation
        'jec': True,  # should re-compute JECs for NanoAOD v12
    })

Then run the production:

    python runHeavyFlavTrees.py -o /eos/<some-eos-path-on-lxplus>/val_nanov12 --jet-type ak8 --channel muon --sample-dir samples_nanov12 --year 2022EE -n 1

4. Validation with early Run 3 setup for the qcd channel (deriving sfBDT SFs) [3]
[3] https://github.com/zichunhao/NanoHRT-tools/tree/wz-calibration
Configure the `runHeavyFlavTrees.py` script by updating the `default_config` dictionary:

    default_config.update({
        'nano_version': 'V12',
        'fill_sv': True,
        'custom_tagger_list': ["globalParT_QCD0HF", "globalParT_QCD1HF", "globalParT_QCD2HF", "globalParT_Xbb", "globalParT_Xcc", "globalParT_XbbVsQCD", "globalParT_massRes", "globalParT_massVis"],  # present in DAZSLE custom NanoAOD v12 samples
        'use_existing_jet_ids': True,  # a jetId bug has been identified. Latest recommendation is to re-compute jetId via jet branches (set it to False) but here we use the existing jetId for cross validation
        'jec': True,  # should re-compute JECs for NanoAOD v12
    })

Then run the production:

    python runHeavyFlavTrees.py -o /eos/<some-eos-path-on-lxplus>/val_nanov12 --jet-type ak8 --channel qcd --sample-dir samples_nanov12 --year 2022EE -n 1

**Production recipes for bookkeeping (keep updating)**

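Every recipe in this README is a small set of overrides passed to `default_config.update(...)`. As a rough illustration of how two recipes relate, the sketch below copies the override dictionaries of qcd-channel recipes A and B from this section and lists the keys on which they differ. The names `RECIPE_A`, `RECIPE_B`, and `diff_overrides` are made up for this example and are not part of NanoHRT-tools.

```python
# Overrides copied from qcd-channel recipes A and B in this README.
RECIPE_A = {
    'nano_version': 'V15',
    'fill_sv': True,
    'require_sv_cut': False,
    'run_gen_hadron_nsubs': True,
    'jec': True,
}
RECIPE_B = {
    'nano_version': 'V15',
    'fill_sv': True,
    'jec': True,
}

# Marker for a key that a recipe leaves at its default value.
_MISSING = '<not overridden>'


def diff_overrides(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose overrides differ."""
    diff = {}
    for key in sorted(set(a) | set(b)):
        va = a.get(key, _MISSING)
        vb = b.get(key, _MISSING)
        if va != vb:
            diff[key] = (va, vb)
    return diff


if __name__ == '__main__':
    for key, (va, vb) in diff_overrides(RECIPE_A, RECIPE_B).items():
        print(f'{key}: A={va!r}, B={vb!r}')
```

With the dictionaries as written, only `require_sv_cut` and `run_gen_hadron_nsubs` show up in the difference; everything else is shared between the two recipes.
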
For qcd channel:
A. For generating gen hadron N-subjettiness variables for sfBDT training.
Updating the `default_config` dictionary:

    default_config.update({
        'nano_version': 'V15',
        'fill_sv': True,
        'require_sv_cut': False, 'run_gen_hadron_nsubs': True,  # for qcd channel -> dedicated for generating gen hadron N-subjettiness variables for sfBDT training
        'jec': True,
    })

Running the production (after properly configuring the samples to run in e.g. `run/samples_nanov15/qcd_2024_MC.yaml`):

    python runHeavyFlavTrees.py -o /eos/<some-eos-path-on-lxplus>/20251024_ULNanoV15_gen_hadron_nsubs --jet-type ak8 --channel qcd --sample-dir samples_nanov15 --year 2024 -n 1

B. For nominal qcd channel production.
Updating the `default_config` dictionary:

    default_config.update({
        'nano_version': 'V15',
        'fill_sv': True,
        'jec': True,
    })

Running the production (after properly configuring the samples to run in e.g. `run/samples_nanov15/qcd_2024_MC.yaml`):

    python runHeavyFlavTrees.py -o /eos/<some-eos-path-on-lxplus>/20251024_ULNanoV15 --jet-type ak8 --channel qcd --sample-dir samples_nanov15 --year 2024 -n 1

    python runPostProcessing.py [-i /path/of/input] -o /path/to/output -d datasets.yaml --friend
    -I PhysicsTools.NanoHRTTools.producers.hrtMCTreeProducer hrtMCTree -n 1

To merge the trees, run the same command but add `--post -w ''` (i.e., set `-w` to an empty string, `''`): we do not add the cross sections, but simply reweight signals to match the QCD spectrum afterwards.

Make trees for heavy flavour tagging (bb/cc) or top/W data/MC comparison and scale factor measurement:

    python runHeavyFlavTrees.py -i /eos/uscms/store/user/lpcjme/noreplica/NanoHRT/path/to/input -o /path/to/output
    (--sample-dir custom_samples) --jet-type [ak8,ak15] --channel [photon|qcd|muon|inclusive|higgs|mutagged|simple-matching] --year [2016APV|2016|2017|2018] -n 10
    (--batch) (--run-data) (--run-syst)
    (--condor-extras '+AccountingGroup = "group_u_CMST3.all"')

Command line options:

- the preselection and basic configurations for each channel are coded in `runHRTTrees.py`. Remember to set them to correct values before submitting jobs (the values will go to `metadata.json` after a job is created).
- add `--run-data` to make data trees
- add `--run-syst` to make the systematic trees
- can run data & MC for multiple years together w/ e.g., `--year 2016APV,2016,2017,2018`. The `--run-data` option will be ignored in this case. Add also `--run-syst` to make the systematic trees.
- use `--sample-dir` to specify the directory containing the sample lists. Currently we maintain two sets of sample lists: the default one is under samples_* (e.g. `--sample-dir` [samples_nanov9](run/samples_nanov9)) which is used for running over official NanoAOD datasets remotely, and the other one is custom_samples which is used for running over privately produced NanoAOD datasets locally. To run over the privately produced samples, one needs to add `--sample-dir custom_samples` to the command line.
- the `--batch` option will submit jobs to condor automatically without confirmation
- remove `-i` to run over remote files (e.g., official NanoAOD, or private NanoAOD published on DAS); consider adding `--prefetch` to copy files first before running
- [NEW] use `--condor-extras` to pass extra options to condor job description file.

More options of `runPostProcessing.py` or `runHRTTrees.py` (a wrapper of `runPostProcessing.py`) can be found with `python runPostProcessing.py -h` or `python runHRTTrees.py -h`, e.g.,

- To resubmit failed jobs, run the same command but add `--resubmit`.
- To add cross section weights and merge output trees according to the config file, run the same command but add `--post`. The cross section file to use can be set with the `-w` option.

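Since the options above compose mechanically, it can be convenient to assemble the command in a small script. The following is a minimal sketch that only strings together flags mentioned in this README; the helper `build_command` and its argument names are hypothetical and not part of NanoHRT-tools, and the output path is a placeholder.

```python
import shlex


def build_command(output, channel, years, jet_type='ak8', sample_dir=None,
                  n=1, extra_flags=()):
    """Assemble a runHeavyFlavTrees.py command line as a list of arguments.

    Hypothetical helper: it only combines flags listed in the README and
    does not validate them against the actual script.
    """
    cmd = ['python', 'runHeavyFlavTrees.py',
           '-o', output,
           '--jet-type', jet_type,
           '--channel', channel]
    if sample_dir is not None:
        cmd += ['--sample-dir', sample_dir]
    # Several years are passed as one comma-separated value, e.g. 2016APV,2016
    cmd += ['--year', ','.join(years), '-n', str(n)]
    cmd += list(extra_flags)
    return cmd


base = build_command('/path/to/output', 'muon', ['2018'],
                     sample_dir='samples_nanov9', extra_flags=['--run-data'])
resubmit = base + ['--resubmit']  # same command, resubmit failed jobs
post = base + ['--post']          # same command, merge the output trees

for cmd in (base, resubmit, post):
    print(shlex.join(cmd))
```

Keeping the command as a list and deriving the `--resubmit` and `--post` variants from it mirrors the "run the same command but add ..." instructions and avoids retyping the long argument string.
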
For maximal flexibility, a number of truth-matching variables are defined in `HeavyFlavBaseProducer` for hadronically decaying top quarks and W, Z, Higgs bosons. For W/Z/H we define:

- `fj_idx_dr_X`: deltaR of the fatjet to the nearest hadronically decaying X particle. If found, this `X` is then used to define all the following variables. Default to 99 if no hadronically decaying X in the event.
- `fj_idx_dr_X_daus`: max deltaR between the fatjet and the two quarks from X decay.
- `fj_idx_X_pt`: pt of X
- `fj_idx_X_decay`: max abs(pdgId) of the two quarks from X decay. For H/Z, this means 5: bb, 4: cc, <4: qq. For W, this means 4: cx, <4: qq. Default to 0 if no hadronically decaying X in the event.

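As a small illustration of the `fj_idx_X_decay` encoding, the sketch below turns a decay code into a readable label. The function name `describe_decay` is hypothetical; the mapping follows the description above, and codes it does not list are reported as such rather than guessed.

```python
def describe_decay(boson, decay_code):
    """Translate an fj_idx_X_decay value into a label (illustrative helper).

    boson      -- 'W', 'Z' or 'H'
    decay_code -- max abs(pdgId) of the two quarks from the boson decay,
                  or 0 if there is no hadronically decaying boson.
    """
    if boson not in ('W', 'Z', 'H'):
        raise ValueError(f'unsupported boson: {boson!r}')
    if decay_code == 0:
        return 'no hadronic decay'
    if 1 <= decay_code < 4:
        return 'qq'
    if boson in ('Z', 'H'):
        if decay_code == 4:
            return 'cc'
        if decay_code == 5:
            return 'bb'
    elif decay_code == 4:  # W boson
        return 'cx'
    # Anything else is not covered by the README description.
    return f'unlisted code {decay_code}'


print(describe_decay('H', 5))
print(describe_decay('W', 4))
print(describe_decay('Z', 0))
```
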
Top quark is treated a bit differently:

- `fj_idx_dr_T`: deltaR of the fatjet to the nearest hadronically decaying top quark. If found, this top quark `T` is then used to define all the following variables. Default to 99 if no hadronically decaying top in the event.
- `fj_idx_dr_T_b`: deltaR between the fatjet and the b quark from the hadronic `T` decay.
- `fj_idx_dr_T_Wq_(max|min)`: max|min deltaR between the fatjet and the two quarks from the W decay.
- `fj_idx_T_Wq_(max|min)_pdgId`: pdgId (w/o taking the absolute value) of the corresponding two quarks from W decay.
- `fj_idx_T_pt`: pt of `T`

These variables can be combined to categorize a fatjet (here `fj_1`):

- top-matched: all three quarks contained in the fatjet
  `fj_1_dr_T_b<jetR && fj_1_dr_T_Wq_max<jetR`
- W-matched: only the two W quarks contained, the b quark is outside the jet cone (if the W is from top quark decay)
  `((fj_1_T_Wq_max_pdgId==0 && fj_1_dr_W_daus<jetR) || (fj_1_T_Wq_max_pdgId!=0 && fj_1_dr_T_b>=jetR && fj_1_dr_T_Wq_max<jetR))`
  - [Note] the first part is mainly intended for tW events where the top quark decays leptonically, and the W boson decays hadronically. This can be a sizeable contribution to the W-matched events and needs to be taken into account properly. The trick here makes use of the fact that `fj_1_T_Wq_max_pdgId` is non-zero only if there is a hadronic top in the event.
- unmatched: defined as `(NOT top-matched) and (NOT W-matched)`, i.e.,
  `!(fj_1_dr_T_b<jetR && fj_1_dr_T_Wq_max<jetR) && !((fj_1_T_Wq_max_pdgId==0 && fj_1_dr_W_daus<jetR) || (fj_1_T_Wq_max_pdgId!=0 && fj_1_dr_T_b>=jetR && fj_1_dr_T_Wq_max<jetR))`

[Extra] For selecting specifically W->cx decays from the W-matched jets:
- W(cx)-matched:
  `((fj_1_T_Wq_max_pdgId==0 && fj_1_dr_W_daus<jetR && fj_1_W_decay==4) || (fj_1_T_Wq_max_pdgId!=0 && fj_1_dr_T_b>=jetR && fj_1_dr_T_Wq_max<jetR && (abs(fj_1_T_Wq_max_pdgId)==4 || abs(fj_1_T_Wq_min_pdgId)==4)))`

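The selections above can also be written as plain Python, which may help when checking them on a few jets by hand. This is only a sketch: the function names are hypothetical, the arguments stand for the corresponding `fj_1_*` branches, and the default `jetR=0.8` is an assumption for AK8 jets that should be changed for other jet types.

```python
def classify_fatjet(dr_T_b, dr_T_Wq_max, T_Wq_max_pdgId, dr_W_daus, jetR=0.8):
    """Return 'top-matched', 'W-matched' or 'unmatched' for one fatjet."""
    top_matched = dr_T_b < jetR and dr_T_Wq_max < jetR
    w_matched = (
        # No hadronic top in the event (e.g. tW with a leptonic top).
        (T_Wq_max_pdgId == 0 and dr_W_daus < jetR)
        # Hadronic top whose b quark falls outside the jet cone.
        or (T_Wq_max_pdgId != 0 and dr_T_b >= jetR and dr_T_Wq_max < jetR)
    )
    if top_matched:
        return 'top-matched'
    if w_matched:
        return 'W-matched'
    return 'unmatched'


def is_w_cx_matched(dr_T_b, dr_T_Wq_max, T_Wq_max_pdgId, T_Wq_min_pdgId,
                    dr_W_daus, W_decay, jetR=0.8):
    """W-matched selection restricted to W->cx decays."""
    if T_Wq_max_pdgId == 0:
        return dr_W_daus < jetR and W_decay == 4
    has_charm = abs(T_Wq_max_pdgId) == 4 or abs(T_Wq_min_pdgId) == 4
    return dr_T_b >= jetR and dr_T_Wq_max < jetR and has_charm


# Fully contained hadronic top.
print(classify_fatjet(0.3, 0.5, 4, 0.5))
# W from a hadronic top, b quark outside the cone.
print(classify_fatjet(1.2, 0.4, -3, 0.4))
# Hadronic W with no hadronic top in the event (defaults of 99 and 0).
print(classify_fatjet(99, 99, 0, 0.4))
```
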
- triggers
- lumi values
- golden JSON
- PU rewgt
- lepton ID/ISO
- b-tag WP
- JEC/JER
- MET filters
- MET recipes (if any)
- samples (check also those in PRODUCTION status)