GitHub - toranb/sloth: python sftune, qmerge and dpo scripts with unsloth

Mistral 7B chat fine tuning

SFT with unsloth

git clone git@github.com:toranb/sloth.git
cd sloth
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
## add data.json with instruction, output pairs for supervised fine tune
python3.11 sftune.py
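
The layout that sftune.py expects for data.json is not spelled out here beyond "instruction, output pairs". As a minimal sketch, assuming the file is a JSON list of objects with `instruction` and `output` string keys (an assumption, check sftune.py before relying on it), a small helper can validate the pairs before writing them. The names `validate_pairs` and `write_pairs` are hypothetical and not part of this repo.

```python
import json
from pathlib import Path

# Assumed schema (not confirmed by the repo): a JSON list of objects,
# each with non-empty string values under "instruction" and "output".
REQUIRED_KEYS = ("instruction", "output")


def validate_pairs(pairs):
    """Return the pairs unchanged, raising ValueError on a malformed record."""
    for i, pair in enumerate(pairs):
        for key in REQUIRED_KEYS:
            value = pair.get(key) if isinstance(pair, dict) else None
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"record {i}: missing or empty '{key}'")
    return pairs


def write_pairs(pairs, path="data.json"):
    """Validate the pairs, write them as JSON, and return how many were written."""
    text = json.dumps(validate_pairs(pairs), indent=2, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")
    return len(pairs)
```

Calling `write_pairs([{"instruction": "Say hello", "output": "Hello!"}])` would write a one-record data.json in the current directory.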

Merge from checkpoint (optional)

This command merges a given checkpoint, creating a new model directory

rm -rf model
python3.11 zmerge.py --peft /home/toranb/sloth/workspace/checkpoint-2600
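
The checkpoint number in the path above changes from run to run. As a rough sketch, assuming the workspace directory holds subdirectories named `checkpoint-<step>` (as in the path above), the helper below picks the one with the highest step so it can be passed to `--peft`. The function name `latest_checkpoint` is hypothetical and not part of this repo.

```python
import re
from pathlib import Path

# Matches directory names such as "checkpoint-2600".
CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def latest_checkpoint(workspace):
    """Return the checkpoint-<step> directory with the highest step, or None."""
    best_step, best_path = -1, None
    for child in Path(workspace).iterdir():
        match = CHECKPOINT_RE.match(child.name)
        # Compare steps numerically so checkpoint-2600 beats checkpoint-900.
        if child.is_dir() and match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), child
    return best_path
```

The highest step is not necessarily the best checkpoint, so treat the result as a default rather than a recommendation.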

DPO alignment (optional)

mkdir fin
export DPO=/home/toranb/sloth/model
export JSON=/home/toranb/sloth/dpo.json
export OUTPUTDIR=/home/toranb/sloth/fin
## add dpo.json with prompt, chosen, rejected
python3.11 dpo.py --base $DPO --out $OUTPUTDIR --json $JSON
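
Before a DPO run it can help to sanity check dpo.json. The sketch below assumes the file is a JSON list of objects with `prompt`, `chosen` and `rejected` string keys (an assumption based on the comment above, check dpo.py for the real layout). The name `check_dpo_records` is hypothetical and not part of this repo.

```python
import json

# Assumed schema (not confirmed by the repo): each record is an object
# with non-empty strings under "prompt", "chosen" and "rejected".
DPO_KEYS = ("prompt", "chosen", "rejected")


def check_dpo_records(records):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for i, rec in enumerate(records):
        missing = [k for k in DPO_KEYS
                   if not isinstance(rec.get(k), str) or not rec[k].strip()]
        if missing:
            problems.append(f"record {i}: missing or empty {missing}")
        elif rec["chosen"] == rec["rejected"]:
            # A pair with identical answers carries no preference signal.
            problems.append(f"record {i}: chosen and rejected are identical")
    return problems


def check_dpo_file(path):
    """Load a JSON file and run the same checks on its records."""
    with open(path, encoding="utf-8") as fh:
        return check_dpo_records(json.load(fh))
```

For example, `check_dpo_file("dpo.json")` returns an empty list when every record passes these checks.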

Dataset note

I'm having success with this SFT configuration using a dataset of 21k instruction, output pairs totaling 3 million tokens. The dataset combines 10k pairs from a subset of airoboros with 11k pairs from a proprietary dataset.

Installation note

I want pip install to work from the requirements.txt included here, but sadly that rarely works. I'd ignore that file and start by installing unsloth directly to be sure you have a solid installation.

March 2025: I've had success installing unsloth with uv using these steps with CUDA 12.6 & torch 2.5.1

uv python install 3.11
uv venv
source .venv/bin/activate
uv pip install "unsloth[cu126-ampere-torch250] @ git+https://github.com/unslothai/unsloth.git"
uv pip install torch==2.5.1 xformers ninja setuptools wheel sentencepiece
uv pip install --no-deps trl numpy pytz pandas peft accelerate bitsandbytes
uv pip install datasets transformers
uv pip install rich click pydantic unsloth_zoo
uv pip install flash-attn --no-build-isolation
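
After the install steps, a quick check can confirm the main packages are visible to the active virtual environment. This is a minimal sketch using only the standard library. It looks modules up without importing them, so it does not verify that CUDA or the GPU kernels actually work. The list of import names is my assumption of what the commands above provide (for example, flash-attn is imported as `flash_attn`), and `missing_modules` is a hypothetical helper, not part of this repo.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the packages installed above.
EXPECTED = [
    "unsloth", "torch", "xformers", "trl", "peft", "accelerate",
    "bitsandbytes", "datasets", "transformers", "flash_attn",
]


def missing_modules(names=EXPECTED):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]
```

Running `print(missing_modules())` inside the activated venv prints an empty list when every module is found.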
