
Slow inference time #170

Open

programath opened this issue Apr 16, 2025 · 10 comments

Comments

@programath

Hi, I am testing the latency of the RFDETRBase model on different platforms (Quadro GV100 and A800). The inference times are 0.25 s and 0.14 s, which is much slower than the reported speed. Is that the normal speed of rf-detr?

The time is calculated as follows:

import time

from PIL import Image

# model is an RFDETRBase instance created earlier
image = Image.open("/vepfs-sf/wtp2/R-Human/images_resized/video3_frame_00068_204.00s.jpg")
time1 = time.time()
for _ in range(10):
    detections = model.predict(image, threshold=0.5)
print("Inference time:", (time.time() - time1) / 10)
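For a steadier measurement, here is a minimal sketch (assuming a CUDA device and the rfdetr Python API) that adds warm-up iterations and synchronizes the GPU so the timer reflects the actual inference work rather than asynchronous kernel launches:

import time

import torch
from PIL import Image
from rfdetr import RFDETRBase

model = RFDETRBase()
image = Image.open("example.jpg")  # any representative test image

# Warm-up: the first calls include CUDA context setup and other one-off overhead.
for _ in range(5):
    model.predict(image, threshold=0.5)

# Synchronize before and after timing so pending CUDA work is fully counted.
torch.cuda.synchronize()
start = time.time()
for _ in range(10):
    detections = model.predict(image, threshold=0.5)
torch.cuda.synchronize()
print("Mean inference time:", (time.time() - start) / 10)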
@picjul

picjul commented Apr 16, 2025

Hi!

The times in the README are measured with TensorRT at FP16 precision. As far as I know, TensorRT is currently not supported as an engine in the model.predict() function.

The times you are measuring are probably based on the PyTorch model in FP32 precision, so it is expected that they are higher.

Am I right?
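To illustrate why the precision matters, here is a rough, illustrative comparison using a stand-in torchvision backbone (not the actual RF-DETR model), timing the same module in FP32 and FP16 on a CUDA device:

import time

import torch
import torchvision

def bench(module, inp, runs=20):
    # Time `runs` forward passes, synchronizing so GPU work is fully counted.
    torch.cuda.synchronize()
    start = time.time()
    with torch.inference_mode():
        for _ in range(runs):
            module(inp)
    torch.cuda.synchronize()
    return (time.time() - start) / runs

model = torchvision.models.resnet50().cuda().eval()
x = torch.randn(1, 3, 640, 640, device="cuda")

print("FP32:", bench(model, x))
print("FP16:", bench(model.half(), x.half()))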

@programath
Author

One more thing: how do I use TensorRT to accelerate RFDETR inference? Is there a tutorial?

programath reopened this Apr 17, 2025
@SkalskiP
Collaborator

Hi @programath 👋🏻, unfortunately, we don't have a tutorial for this. Do you think it would be helpful?

@DatSplit

DatSplit commented Apr 17, 2025

Good afternoon @SkalskiP and @programath,

I've converted custom RF-DETR-B and RF-DETR-L models to TensorRT for deployment on a Jetson Orin Nano 8GB.
I used the trtexec function in rfdetr/deploy/export.py to retrieve the command I should use to convert the ONNX model to TensorRT.

Would it be beneficial if I wrote a tutorial for this?
If so, what details must I at least include? @SkalskiP (reproducible code + explanation + environment details?)
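For reference, a rough sketch of that flow (the export call and file names below are assumptions on my side; the exact conversion command is the one built by the trtexec helper in rfdetr/deploy/export.py):

from rfdetr import RFDETRBase

# Step 1: export the PyTorch model to ONNX (assumed API; check the rfdetr docs
# for the exact signature and the path of the exported file).
model = RFDETRBase()
model.export()

# Step 2: on the target device (e.g. the Jetson, since TensorRT engines are
# hardware-specific), convert the ONNX file to a TensorRT engine, typically
# something like:
#   trtexec --onnx=inference_model.onnx --saveEngine=inference_model.engine --fp16
# The exact command is what the trtexec helper in rfdetr/deploy/export.py builds.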

@SkalskiP
Collaborator

Hi @DatSplit! That would be amazing — really appreciate it! 🔥

Here’s what I’m thinking: we should start breaking out the documentation into separate .md files. Tutorials like this are super valuable, but I’d prefer not to put them directly into the README.md.

Could you open a PR that creates a docs/ directory at the root of the project, and add a new file called export.md where you walk through:

  • Exporting to ONNX
  • Converting to TensorRT
  • Deploying on Jetson

Something like this for structure:

rf-detr/
├── docs/
│   └── export.md
├── README.md
...

Later, we’ll link to docs/export.md from the README.md to keep things clean and easy to navigate.

Let me know what you think!

@DatSplit

Good evening @SkalskiP,

That sounds like a solid plan!
I'll be working on that documentation and will create a PR once I believe it is good enough.
Is there a "deadline" indication? (E.g., in one to two weeks)

@SkalskiP
Collaborator

@DatSplit the sooner the better. It can be really rough. I can help you out with structuring it properly. I mostly care about the steps to follow.

@DatSplit

Good afternoon @SkalskiP,

I have quite a busy weekend, so I'll most likely start working on it Sunday evening or Monday morning. I expect the rough draft to be done Monday morning.

@mirza298

Hi @DatSplit , how has the real-time performance (in FPS) been for the base and large models on your Jetson so far?

@SkalskiP
Collaborator

@DatSplit that sounds perfect! Thank you! 🙏🏻
