Nano-R1: Fine-Tuning the Qwen2.5-3B-Instruct Model

Welcome to the Nano-R1 repository! This project demonstrates how to fine-tune the Qwen2.5-3B-Instruct model with Group Relative Policy Optimization (GRPO) on the GSM8K dataset. This README provides all the information you need to get started with the project.

Table of Contents

  • Introduction
  • Getting Started
  • Features
  • Installation
  • Usage
  • Contributing
  • License
  • Releases
  • Contact

Introduction

Fine-tuning language models for specific tasks is essential for achieving high performance. Qwen2.5-3B-Instruct is an instruction-tuned model built to follow prompts and generate fluent text. This project fine-tunes it with GRPO to improve its performance on GSM8K, a dataset of grade-school math word problems whose reference solutions each end in a single numeric answer.
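As a rough illustration of the training signal, every GSM8K reference answer ends with a line of the form "#### <number>", so a correctness reward can compare the final number in a model completion against that reference. The snippet below is a minimal sketch of such a reward, with illustrative names and reward values; it is not the repository's exact implementation.

import re

def extract_final_number(text):
    # Return the last number mentioned in the text, or None if there is none.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

def correctness_reward(completion, reference_answer):
    # GSM8K references end with '#### <number>'; reward 1.0 on an exact match.
    gold = reference_answer.split("####")[-1].strip().replace(",", "")
    return 1.0 if extract_final_number(completion) == gold else 0.0

# Hypothetical example in GSM8K's answer format:
reference = "Natalia sold 48/2 = 24 clips in May.\nIn total she sold 48 + 24 = 72 clips.\n#### 72"
print(correctness_reward("She sold 72 clips in total, so the answer is 72.", reference))  # 1.0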

Getting Started

To get started with the Nano-R1 project, you will need to clone the repository and install the necessary dependencies. Follow the steps below to set up your environment.

Prerequisites

  • Python 3.9 or higher (recent releases of transformers and trl require at least Python 3.9)
  • pip (Python package installer)
  • Git

Clone the Repository

You can clone the repository using the following command:

git clone https://github.com/Mikesterner87/Nano-R1.git
cd Nano-R1

Install Dependencies

After cloning the repository, install the required packages:

pip install -r requirements.txt

Features

  • Fine-tuning of the Qwen2.5-3B-Instruct model.
  • Implementation of GRPO for effective training.
  • Utilization of the GSM8K dataset for model evaluation.
  • Support for various adapters and configurations.
  • Easy integration with Hugging Face libraries.

Installation

To run the project, you need to install the required libraries. The requirements.txt file contains all necessary dependencies. Use the following command to install them:

pip install -r requirements.txt

Make sure the following libraries are installed (a quick import check is shown after the list):

  • transformers
  • torch
  • safetensors
  • trl
  • text-generation-inference
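
After installing, the following snippet (purely a convenience check, not part of the repository) confirms that the core dependencies import correctly and reports their versions:

import torch
import transformers
import trl

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("trl:", trl.__version__)
print("CUDA available:", torch.cuda.is_available())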

Usage

Once you have set up the environment, you can start fine-tuning the model. Use the following command to begin the training process:

python train.py --dataset gsm8k --model qwen2-5

You can adjust parameters in the train.py script to customize your training process. Refer to the comments in the code for guidance.
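The train.py script defines the actual training pipeline; for orientation, the sketch below shows how a GRPO run on GSM8K can be wired up with TRL's GRPOTrainer. The reward function, hyperparameters, and output path here are illustrative assumptions, not the script's exact configuration.

import re
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# GSM8K questions become prompts; the reference answers are passed to the
# reward function through the dataset's "answer" column.
dataset = load_dataset("gsm8k", "main", split="train")
dataset = dataset.map(lambda x: {"prompt": x["question"]})

def correctness_reward(completions, answer, **kwargs):
    # 1.0 when a completion's last number matches the '#### <number>' reference.
    rewards = []
    for completion, ref in zip(completions, answer):
        gold = ref.split("####")[-1].strip().replace(",", "")
        numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
        rewards.append(1.0 if numbers and numbers[-1] == gold else 0.0)
    return rewards

training_args = GRPOConfig(
    output_dir="nano-r1-qwen2.5-3b",   # illustrative output path
    num_generations=8,                 # completions sampled per prompt
    max_completion_length=256,
    per_device_train_batch_size=8,
    learning_rate=1e-6,
    logging_steps=10,
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-3B-Instruct",
    reward_funcs=correctness_reward,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()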

Example of Fine-Tuning

Here’s a simple example of how to use the fine-tuned model for text generation:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned checkpoint (a local path or a Hugging Face Hub ID)
model_name = "your_fine_tuned_model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize a prompt and generate a completion
input_text = "What is 7 + 5?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

This code snippet demonstrates how to load your fine-tuned model and generate text based on a prompt.
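
To gauge whether fine-tuning helped, a simple accuracy check on the GSM8K test split can compare the model's final numbers against the references. The loop below is a hedged sketch (greedy decoding on a small sample), not the project's official evaluation:

import re
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "your_fine_tuned_model"  # replace with your checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

test_set = load_dataset("gsm8k", "main", split="test").select(range(50))  # small sample

def last_number(text):
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

correct = 0
for example in test_set:
    inputs = tokenizer(example["question"], return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    completion = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    gold = example["answer"].split("####")[-1].strip().replace(",", "")
    correct += int(last_number(completion) == gold)

print(f"Accuracy on {len(test_set)} test questions: {correct / len(test_set):.2%}")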

Contributing

We welcome contributions to the Nano-R1 project. If you would like to contribute, please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/YourFeature).
  3. Make your changes.
  4. Commit your changes (git commit -m 'Add some feature').
  5. Push to the branch (git push origin feature/YourFeature).
  6. Open a pull request.

Please ensure that your code adheres to the existing style and includes tests where applicable.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Releases

The latest releases and downloadable files are available on the repository's Releases page. Download the assets you need, and see the release notes there for details about each release.

Contact

For any questions or feedback, feel free to reach out.

Thank you for your interest in the Nano-R1 project! We hope you find it useful for your fine-tuning tasks.
