Interested in Project 10 and have some clarifying questions (Fine-tuning Vision Language Models (VLMs) for Object Detection and Hierarchical Classification using the OpenVINO Ecosystem) #29355
-
@rajeshgangireddy @samet-akcay this is your potential contributor :)
-
Hi @Jyc323
Let us know if you have more questions.
-
Hi @rajeshgangireddy,
-
Hi @rajeshgangireddy @adrianboguszewski @mlukasze, hope you're doing well! I am Praroop, currently a master's student at Texas A&M, focused on computer vision, multimodal learning, and generative AI. I did some preliminary research and settled on Grounding DINO. I set up the codebase and ran a small fine-tune on the KITTI dataset, training only the decoder. I used an NVIDIA A100 GPU and kept the batch size small, at 6; training used 9 to 10 GB of VRAM. Further, I am planning to:
Would really love to know your thoughts on this. Best,
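For anyone following along, the decoder-only fine-tuning mentioned above could be sketched roughly as below. This is a minimal illustration, not the code used in the experiment: the helper `freeze_all_but`, the `ToyModel` stand-in, and all parameter names are hypothetical, and the real Grounding DINO parameter names may differ. The helper only assumes an object with a `named_parameters()` method yielding `(name, param)` pairs, which is the interface `torch.nn.Module` exposes.

```python
from types import SimpleNamespace


def freeze_all_but(model, trainable_keywords):
    """Enable gradients only for parameters whose name contains a keyword.

    Works with any object exposing named_parameters() -> (name, param) pairs.
    Returns the names of the parameters left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = any(key in name for key in trainable_keywords)
        if param.requires_grad:
            trainable.append(name)
    return trainable


class ToyModel:
    """Stand-in for a detector; the parameter names are hypothetical."""

    def __init__(self):
        self._params = {
            "backbone.layer1.weight": SimpleNamespace(requires_grad=True),
            "transformer.encoder.layers.0.weight": SimpleNamespace(requires_grad=True),
            "transformer.decoder.layers.0.weight": SimpleNamespace(requires_grad=True),
            "transformer.decoder.layers.0.bias": SimpleNamespace(requires_grad=True),
        }

    def named_parameters(self):
        return self._params.items()


model = ToyModel()
trainable_names = freeze_all_but(model, ["decoder"])
print(trainable_names)
```

With a real model, only the parameters that remain trainable would then be passed to the optimizer, which is one reason memory use can stay modest even at small batch sizes.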
-
Dear Rajesh Gangireddy, Laurens Hogeweg, and Samet Akcay,
I’m Jiayu Li, and I have hands-on experience in deep learning, multimodal learning, computer vision, and object detection. I’m very interested in Project 10 – Fine-tuning Vision Language Models (VLMs) for Object Detection and Hierarchical Classification using the OpenVINO Ecosystem.
As I prepare my proposal, I’d love to clarify a few details to ensure I’m aligned with the project’s goals:
I’d really appreciate any guidance you can provide, and I look forward to discussing how I can contribute effectively to this project.
Best regards,
Jiayu Li
@adrianboguszewski @mlukasze Could you please help connect me with the mentors?