Update SFT QLoRA notebook with 14B model on free Colab #4336

sergiopaniego · 2025-10-24T17:38:09Z

What does this PR do?

Fine tune a 14B model on free Colab using TRL+SFT!
It already uses this PR so its dependent on it

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

HuggingFaceDocBuilderDev · 2025-10-24T17:43:58Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2025-10-24T18:49:35Z

examples/notebooks/sft_trl_lora_qlora.ipynb

+        }
+      ],
+      "source": [
+        "prompt = hf_tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n",


I think you can use llm.chat directly here

qgallouedec · 2025-10-24T18:51:20Z

examples/notebooks/sft_trl_lora_qlora.ipynb

+        "import torch\n",
+        "\n",
+        "llm = LLM(\n",
+        "    model=f\"sergiopaniego/{output_dir}-merged\", # Replace with your HF username or organization\n",


no need to do it right now but in the future it could be nice to centralize every artifact in hf.co/trl-lib

qgallouedec · 2025-10-24T18:56:37Z

For some reason the new trackio space doesn't render (the previous one does)

qgallouedec · 2025-10-24T18:57:19Z

examples/notebooks/sft_trl_lora_qlora.ipynb

+        "    **model_inputs,\n",
+        "    max_new_tokens=512\n",
+        ")\n",
+        "output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()\n",


nit: tolist no needed

Suggested change

"output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()\n",

"output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]\n",

…sft_qlora_notebook

sergiopaniego · 2025-10-27T10:47:34Z

Updated based on comments! Ready for final review/merge

For some reason the new trackio space doesn't render (the previous one does)

I've tested it downloading the updated notebook from here and it shows the Space when opening it from Colab or VS Code. Maybe the Space was on sleeping state at that point.

…sft_qlora_notebook

…_lora_qlora.ipynb`

qgallouedec

super cool notebook, merging now, sorry for late review

sergiopaniego added 3 commits October 24, 2025 19:36

Update SFT QLoRA notebook with 14B model

b17dc1b

Removed output

1edebbc

Add GPU type

74ec737

Added some comments

8f893a8

qgallouedec reviewed Oct 24, 2025

View reviewed changes

qgallouedec and others added 5 commits October 24, 2025 19:07

some cleaning

1e1bd69

Updated based on feedback

41745eb

Merge

f737b03

Merge branch 'sft_qlora_notebook' of github.com:huggingface/trl into …

8fcdd63

…sft_qlora_notebook

Added GPU

ee41d9b

sergiopaniego and others added 8 commits October 27, 2025 11:47

Merge branch 'main' into sft_qlora_notebook

4029374

Updated Open in Colab button

1b58eee

Merge branch 'sft_qlora_notebook' of github.com:huggingface/trl into …

a5be1fe

…sft_qlora_notebook

Merge branch 'main' into sft_qlora_notebook

af75784

Add missing liger-kernel dependency

641e59e

Merge branch 'sft_qlora_notebook' of github.com:huggingface/trl into …

2042d58

…sft_qlora_notebook

Merge branch 'main' into sft_qlora_notebook

ae498ea

`nb-clean clean -M --preserve-cell-outputs examples/notebooks/sft_trl…

9ed6b43

…_lora_qlora.ipynb`

qgallouedec approved these changes Oct 31, 2025

View reviewed changes

qgallouedec merged commit f683420 into main Oct 31, 2025
3 checks passed

qgallouedec deleted the sft_qlora_notebook branch October 31, 2025 02:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Update SFT QLoRA notebook with 14B model on free Colab #4336

Update SFT QLoRA notebook with 14B model on free Colab #4336

Uh oh!

sergiopaniego commented Oct 24, 2025

Uh oh!

HuggingFaceDocBuilderDev commented Oct 24, 2025

Uh oh!

qgallouedec Oct 24, 2025

Uh oh!

qgallouedec Oct 24, 2025

Uh oh!

qgallouedec commented Oct 24, 2025

Uh oh!

qgallouedec Oct 24, 2025

Uh oh!

sergiopaniego commented Oct 27, 2025

Uh oh!

qgallouedec left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

	"output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()\n",
	"output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]\n",

Update SFT QLoRA notebook with **14B** model on free Colab #4336

Update SFT QLoRA notebook with **14B** model on free Colab #4336

Uh oh!

Conversation

sergiopaniego commented Oct 24, 2025

What does this PR do?

Before submitting

Who can review?

Uh oh!

HuggingFaceDocBuilderDev commented Oct 24, 2025

Uh oh!

qgallouedec Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

qgallouedec Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

qgallouedec commented Oct 24, 2025

Uh oh!

qgallouedec Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

sergiopaniego commented Oct 27, 2025

Uh oh!

qgallouedec left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Update SFT QLoRA notebook with 14B model on free Colab #4336

Update SFT QLoRA notebook with 14B model on free Colab #4336