Merge pull request #312 from sergiopaniego/add-vlm-grpo

merveenoyan · web-flow · commit af061cd5b5e9 · 2025-08-04T12:57:00.000+02:00
🧑‍🍳 Added `Post training a VLM for reasoning with GRPO using TRL` recipe
diff --git a/notebooks/en/_toctree.yml b/notebooks/en/_toctree.yml
@@ -130,6 +130,8 @@
           title: Fine tuning a VLM for Object Detection Grounding using TRL
         - local: fine_tuning_vlm_mpo
           title: Fine-Tuning a Vision Language Model with TRL using MPO
+        - local: fine_tuning_vlm_grpo_trl
+          title: Post training a VLM for reasoning with GRPO using TRL
 
     - title: Search Recipes
       isExpanded: false
diff --git a/notebooks/en/fine_tuning_vlm_grpo_trl.ipynb b/notebooks/en/fine_tuning_vlm_grpo_trl.ipynb
diff --git a/notebooks/en/index.md b/notebooks/en/index.md
@@ -7,11 +7,11 @@ applications and solving various machine learning tasks using open-source tools
 
 Check out the recently added notebooks:
 
+- [Post training a VLM for reasoning with GRPO using TRL](fine_tuning_vlm_grpo_trl)
 - [Fine-Tuning a Vision Language Model with TRL using MPO](fine_tuning_vlm_mpo)
 - [Fine tuning a VLM for Object Detection Grounding using TRL](fine_tuning_vlm_object_detection_grounding)
 - [Hyperparameter Optimization with Optuna and Transformers](optuna_hpo_with_transformers)
 - [Fine-tuning T5 for Automatic GitHub Tag Generation with PEFT](finetune_t5_for_search_tag_generation)
-- [Documentation Chatbot with Meta Synthetic Data Kit](fine_tune_chatbot_docs_synthetic)
 
 You can also check out the notebooks in the cookbook's [GitHub repo](https://github.com/huggingface/cookbook).