@HarryWang0619

@HarryWang0619 HarryWang0619 commented Jun 3, 2025

Overview

This PR introduces a native Julia implementation of jet flavour tagging, providing a Julia alternative to the existing Python-based functionality in FCCAnalyses.

FCCAnalyses currently provides jet flavour tagging through Python ONNX runtime integration. To enable seamless jet reconstruction and analysis workflows entirely within Julia, we need native Julia implementations that can:

  • Process EDM4hep reconstructed particles directly
  • Interface with pre-trained ParT/ParticleNet ONNX models
  • Provide comparable performance to the existing Python implementation
  • Integrate naturally with the existing JetReconstruction.jl ecosystem

Implementation Details

This implementation provides the core infrastructure for jet flavour tagging through the components and dependencies listed below.

Core Components

  • JetConstituentBuilder.jl: Converts EDM4hep reconstructed particles into a format suitable for ONNX model input
  • JetConstituentUtils.jl: Utility functions for preprocessing jet constituents (normalization, feature extraction, etc.)
  • JetFlavourTagging.jl: Main interface module that orchestrates the flavour tagging workflow
  • JetFlavourHelper.jl: Interface with ONNX Runtime

Dependencies

  • EDM4hep.jl: For reconstructed particle input handling
  • ONNXRunTime.jl: For ONNX model inference
  • JSON.jl (to be confirmed): For reading the JSON configuration that accompanies the ONNX model

Expected

  • Example: An example that demonstrates a full workflow from an EDM4hep ROOT file to the ONNX weight output
  • Tests: Test with a sample EDM4hep file

This is the code part for issue #163

This part acts as the main part of the jet flavour tagging module.
(The ONNX runtime part is still being updated.)

Expected Next:
1. Updated JetFlavourHelper (ONNX Interface)
2. Examples
@HarryWang0619 HarryWang0619 changed the title Jet Flavour Extension (Main part) Jet Flavour Extension Jun 4, 2025
@codecov

codecov bot commented Jun 4, 2025

Codecov Report

❌ Patch coverage is 0% with 1181 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.25%. Comparing base (c4ad094) to head (ca4cdea).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
ext/JetFlavourTagging/JetConstituentUtils.jl 0.00% 906 Missing ⚠️
ext/JetFlavourTagging/JetFlavourHelper.jl 0.00% 250 Missing ⚠️
ext/JetFlavourTagging/JetFlavourTagging.jl 0.00% 12 Missing ⚠️
ext/JetFlavourTagging/JetConstituentBuilder.jl 0.00% 7 Missing ⚠️
src/JetReconstruction.jl 0.00% 6 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #164       +/-   ##
===========================================
- Coverage   76.92%   40.25%   -36.68%     
===========================================
  Files          20       24        +4     
  Lines        1296     2477     +1181     
===========================================
  Hits          997      997               
- Misses        299     1480     +1181     

Check if the latest update of JetReconstruction.jl fixed the formatting for this file
New Files: ONNX inference (JetFlavourHelper)

Changes:
1. Update build_constituents_cluster
2. Changed Bz to Float32 (same as EDM4hep)
3. Changed Track_L to AbstractArray{AbstractFloat}, so that track_L can be passed as a view instead of a copy (more efficient)
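
As an illustration of the view-versus-copy point in change 3, here is a minimal sketch. The function name total_track_length and the numbers are hypothetical and are not taken from the PR. It also assumes the argument is annotated as AbstractVector{<:AbstractFloat}: Julia's type parameters are invariant, so without the `<:` a Vector{Float32} or a view of one would not match the signature.

```julia
# Hypothetical helper, not taken from the PR: sums track lengths without copying.
# `AbstractVector{<:AbstractFloat}` accepts Vector{Float32}, Vector{Float64},
# and views (SubArray) of either.
function total_track_length(track_L::AbstractVector{<:AbstractFloat})
    total = zero(eltype(track_L))
    for l in track_L
        total += l
    end
    return total
end

all_track_L = Float32[1.5, 2.0, 0.5, 4.0]   # made-up values for illustration

# A view shares memory with the parent array; all_track_L[2:3] would allocate a copy.
jet_track_L = @view all_track_L[2:3]

println(total_track_length(jet_track_L))                         # 2.5

# Type parameters are invariant, so the `<:` matters:
println(Vector{Float32} <: AbstractVector{AbstractFloat})        # false
println(Vector{Float32} <: AbstractVector{<:AbstractFloat})      # true
println(typeof(jet_track_L) <: AbstractVector{<:AbstractFloat})  # true
```

Because the view aliases the parent array, writing through jet_track_L also modifies all_track_L, which is worth keeping in mind if any preprocessing step mutates its input.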
Update with 'On working sections'
Add the /example/flavour-tagging/data folder (for local debugging) to .gitignore. This will be removed in favour of a CERN EOS path in future updates.
Add JSON to the library Project.toml, as the ONNX profile needs to be loaded with JSON.
@graeme-a-stewart graeme-a-stewart self-requested a review June 16, 2025 13:47
@graeme-a-stewart
Member

Hello @HarryWang0619 - thanks for starting this PR off. Can you let us know what sort of feedback you're looking for at the moment? The code doesn't seem to be very runnable (simple-flavour-tagging.jl is empty and simple-flavour-tagging.ipynb has a hand crafted path to /Users/harrywanghc/Developer/2025/JuliaHEPForkToMain/JetReconstruction.jl - wouldn't just a simple dev JetReconstruction suffice?). There are also no suitable input files for us to try.

There is a lot of good advice about conventions, documentation and testing in https://juliahep.github.io/JetReconstruction.jl/dev/contributing/.

@graeme-a-stewart
Member

Hello @HarryWang0619 - I am very excited about this PR! I have made a quick pass and left some general comments that I think will point you in the right direction for finishing things.

New example, run with julia --project simple-flavour-tagging.jl
@HarryWang0619
Author

HarryWang0619 commented Jul 17, 2025

Here are some important recent updates.

  1. All uses of LoopVectorization have been removed and replaced by @simd.
  2. Added a new .jl example file in examples/flavour-tagging. To run it, you might need to change the path of the EDM4hep file: cd examples/flavour-tagging; julia --project simple-flavour-tagging.jl
  3. Added a comparison test for one module of Flavour Tagging.
  4. Fixed the dependency issues that caused the previous failures.
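
For reference, the kind of loop that item 1 describes can be sketched as follows. This is a hypothetical kernel (sum_pt2 and its inputs are made up for illustration), not code from the PR. It only uses Base Julia, which is the point of dropping the LoopVectorization dependency.

```julia
# Hypothetical kernel, not taken from the PR: sum of squared transverse momenta.
function sum_pt2(px::AbstractVector{T}, py::AbstractVector{T}) where {T <: AbstractFloat}
    acc = zero(T)
    # eachindex(px, py) throws a DimensionMismatch if the two arrays differ in shape,
    # so every index it yields is valid for both and @inbounds is safe here.
    # @simd allows the compiler to reorder the floating-point reduction.
    @inbounds @simd for i in eachindex(px, py)
        acc += px[i]^2 + py[i]^2
    end
    return acc
end

px = Float32[1, 2, 3]   # made-up values for illustration
py = Float32[0, 1, 2]
println(sum_pt2(px, py))  # 19.0
```

Since @simd permits reassociation of the sum, results can differ in the last bits from a strictly sequential loop on general inputs, which may matter for the comparison test against the Python implementation.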

Questions:

  1. For the EDM4hep file, I think there is a way to read it from a root://xxxxxxx/event_xxxx.root URL, but I am not sure how to set that up.
  2. For the testing: I am building flavour tagging as an extension (I found out that EDM4hep support is also an extension, so I think it makes sense to do the same for jet flavour tagging). As a result, the tests cannot be run with julia --project test/runtests.jl, even if I write the test module. They can only be run with -e using Pkg, Pkg.add(extension libraries). However, that would make the codecov testing fail every time I commit a change. How should I deal with that?
  3. MC Vertex: Currently, the MC vertex is read in a rather awkward way, because the vertex is not recorded under the reconstructed particle. This slows down event processing significantly (it takes about twice as long once the MC vertex is added). Previously, I was using (0,0,0,0) as the vertex. Below is my code:
mcps = RootIO.get(reader, evt, "Particle") # MC particles
MCRecoLinks = RootIO.get(reader, evt, "MCRecoAssociations") # links between MC particles and reco particles
# recps holds the reconstructed particles read earlier for this event
mc_vertices = Vector{LorentzVector{Float32}}(undef, length(recps))
# Default every vertex to (0,0,0,0) so reco particles without an MC link are not left uninitialised
fill!(mc_vertices, LorentzVector(0.0f0, 0.0f0, 0.0f0, 0.0f0))
# Loop over the links directly; no intermediate Dict is needed
for link in MCRecoLinks
    mcp = mcps[link.sim_idx.index + 1] # EDM4hep indices are 0-based, Julia arrays are 1-based
    mc_vertices[link.rec_idx.index + 1] = LorentzVector(mcp.vertex.x, mcp.vertex.y, mcp.vertex.z, mcp.time)
end

TODO (as of Jul 17):

  • Update the get_weights function so that it reads the tags from the ONNX config JSON and feeds them to ONNX accordingly. (It is currently mostly hardcoded.)
  • Figure out a better way to read the MC vertex
  • Some quick variable name fixes once the first two are done

@graeme-a-stewart
Member

Hi @HarryWang0619 - thanks a lot for these recent improvements. I am happy to say that the simple-flavour-tagging.jl script is now working nicely for me (there are still some issues with the notebook version). I like the little bars on the output.

We have an issue that this extension is depending on a lot of other packages so, e.g., the tests don't work and won't work unless we add a lot of other packages into the test environment, which is pretty heavy.

So I now think that this is really a significant enough piece of work that we should put it into a new package, probably called FlavorTagging (I'll yield to US spelling conventions when it comes to code 😄). This package would then have its own dependencies, including JetReconstruction.

Should it be FCCeeFlavorTagging, or can we envisage making this a more general package? In principle the package should allow the user to load whatever ONNX model they like, so it could be (and preferably would be) experiment agnostic. @mattleblanc - do you have thoughts on this? You're more aware of the technical details.

@HarryWang0619 if you have never made a Julia package before it's not that hard, but I can also help with it (or do it and create the core pieces for you to PR to).

I am pretty convinced this is now the correct way to proceed, rather than hanging everything off JetReconstruction.

@graeme-a-stewart graeme-a-stewart marked this pull request as draft July 24, 2025 14:08
@mattleblanc
Contributor

Hi @graeme-a-stewart, that's an interesting suggestion. It's too bad that there can't be a separate testing environment for extensions of the main repository, but I agree that it might be too heavy for everything to go in one spot.

Regarding the scope of a new package, I would like for it to be experiment-agnostic, but the reality is that flavour tagging implementations depend strongly on the experiment you're looking at. Harry's code is designed to run on EDM4hep files, so I would say that JetTaggingFCC.jl might be a sensible way to go? I think other taggers could be easily implemented here, so we probably don't need to pick a side on spelling conventions in the package name.

I've not made a Julia package before, but we can give it a shot if you offer a pointer to some instructions to follow. Should this live in the JuliaHEP ecosystem?

@graeme-a-stewart
Member

Hi @mattleblanc - thanks for your thoughts. I am not afraid of making additional packages in Julia - because of the excellent package manager, dependency management is a breeze.

Probably best to be cautious with the name, so I am happy with JetTaggingFCC.jl. As use cases become clearer, we can then reconsider whether that could be refactored into a shared package. But that could also lead to the same testing nightmare, as you need to pull in all of the experiment-specific EDMs...

There is a really good guide to creating packages here: https://modernjuliaworkflows.org/sharing/.

It may require someone with JuliaHEP admin rights to create the initial repo - I can check later.

@mattleblanc
Contributor

Sounds good. If someone with JuliaHEP admin rights can make the repo and give @HarryWang0619 and me the privileges we need, we can get it set up.

@graeme-a-stewart
Member

https://github.com/JuliaHEP/JetTaggingFCC.jl

I have invited you all (@HarryWang0619 @emadmtr and @mattleblanc) into the JuliaHEP organisation so that we can manage repo permissions more easily.

@HarryWang0619
Author

https://github.com/JuliaHEP/JetTaggingFCC.jl

I have invited you all (@HarryWang0619 @emadmtr and @mattleblanc) into the JuliaHEP organisation so that we can manage repo permissions more easily.

Thank you! I just accepted the invitation and I will follow the routine that you shared to set up the repository!

@graeme-a-stewart
Member

If you're unsure of anything, just ask!

@HarryWang0619
Author

If you're unsure of anything, just ask!

I have just made my first commit following the PkgTemplates.jl setup guide. However, I mistakenly named the default branch Repository-Setup. I wonder can you help me to change the default branch to main? Thank you!

@m-fila
Member

m-fila commented Jul 26, 2025

I wonder can you help me to change the default branch to main? Thank you!

You can rename it in repository Settings > General > Default branch. Docs

@HarryWang0619
Author

Hi Mateusz, thank you for the resource! I cannot see that option on the repository's settings page. I think it's because I don't have admin access to the JetTaggingFCC.jl repo.

@m-fila
Member

m-fila commented Jul 26, 2025

Right, sorry. Changed it for you

@HarryWang0619
Author

Right, sorry. Changed it for you

Thank you! I will be starting to commit to the new repository and I think we can close this PR. (We can move to the other side!)

5 participants