Anni is a high-performance code assistant built upon the Qwen3 14B architecture. Fine-tuned on the OpenCodeReasoning-2 dataset, Anni is engineered to excel in deep algorithmic reasoning, competitive programming logic, and the implementation of complex, high-efficiency data structures.
| Property | Value |
|---|---|
| Base Model | Qwen3 14B |
| Model Type | Language Model for Code |
| Context Length | 32,000 tokens |
| Precision | BF16 / safetensors (merged) |
| Inference Framework | vLLM compatible |
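For quick local testing outside the provided scripts, offline batch inference with vLLM looks roughly like the sketch below. The model identifier is a placeholder, not the published repo ID; substitute the actual Hugging Face path or a local directory containing the merged BF16 weights.

```python
# Minimal vLLM offline-inference sketch; the model ID is a placeholder, not the published repo.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-namespace/anni-qwen3-14b",  # assumption: replace with the real repo ID or local path
    dtype="bfloat16",                       # matches the merged BF16 safetensors weights
    max_model_len=32000,                    # context length listed above
)

params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=1024)
prompt = "Write a Python function that returns the n-th Fibonacci number using fast doubling."

outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```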
Get started immediately using the provided Google Colab notebooks:

- GGUF Inference (Recommended): Open the Colab Notebook to run standard inference (a local alternative is sketched after this list).
- vLLM Serving: Open the Colab Notebook to run inference using the vLLM server.
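If you prefer to run the GGUF build locally rather than in Colab, a minimal llama-cpp-python sketch is shown below; the file name is an assumption and depends on which quantization you download.

```python
# Minimal local GGUF inference sketch using llama-cpp-python (pip install llama-cpp-python).
# The model path is a placeholder; point it at the GGUF file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./anni-qwen3-14b.Q4_K_M.gguf",  # assumption: file name depends on the release/quantization
    n_ctx=32000,                                # context window from the model card
    n_gpu_layers=-1,                            # offload all layers to the GPU if one is available
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Implement binary search in Python."}],
    max_tokens=512,
    temperature=0.6,
)
print(result["choices"][0]["message"]["content"])
```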
- Python Dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- System Tools: Ensure `tmux` is installed on your system (required for training scripts).
- Environment Variables: Rename the example environment file and add your API tokens (WandB, HuggingFace, ModelScope); a minimal login sketch follows this list.

  ```bash
  mv config/example.env config/.env
  # Edit config/.env with your keys
  ```

- Training Config: Edit `config/config.yaml` to adjust hyperparameters.
  - Note: Specify the `LOCAL_STORAGE_PATH` in `src/train.py` before starting training.
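As a rough illustration of what the tokens in `config/.env` are used for, the sketch below loads them and logs in to WandB and Hugging Face. The variable names (`WANDB_API_KEY`, `HF_TOKEN`) and the `python-dotenv` dependency are assumptions; check `config/example.env` and the training code for the names this repository actually uses.

```python
# Hypothetical sketch: load tokens from config/.env and authenticate.
# Variable names and the python-dotenv dependency are assumptions, not taken from this repository.
import os

import wandb
from dotenv import load_dotenv          # pip install python-dotenv
from huggingface_hub import login

load_dotenv("config/.env")

wandb.login(key=os.environ["WANDB_API_KEY"])  # experiment tracking
login(token=os.environ["HF_TOKEN"])           # Hugging Face downloads/uploads
```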
To start the training process, run the shell script:

```bash
./scripts/train.sh
```

| File | Description |
|---|---|
| `preprocess.py` | Downloads the OpenCodeReasoning-2 dataset and preprocesses it for training. |
| `train.py` | Downloads the base model and fine-tunes it on the preprocessed dataset. |
| `save.py` | Loads the fine-tuned LoRA adapters and saves the model as merged 16-bit and GGUF formats. |
| `upload.py` | Uploads the merged model to Hugging Face and ModelScope. |
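For context on what `save.py` does, merging LoRA adapters into 16-bit weights with PEFT typically looks like the sketch below. This is not the repository's actual script; the base model ID and the adapter/output paths are placeholders.

```python
# Illustrative LoRA-merge sketch (not the repository's save.py; paths and IDs are placeholders).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-14B", torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-14B")

# Attach the trained adapters, then fold them into the base weights.
merged = PeftModel.from_pretrained(base, "outputs/lora_adapters").merge_and_unload()

merged.save_pretrained("outputs/anni-merged-16bit", safe_serialization=True)
tokenizer.save_pretrained("outputs/anni-merged-16bit")
```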
| File | Description |
|---|---|
| `train.sh` | Runs the training script with specified parameters. |
| `eval.sh` | Evaluates the model on the LiveCodeBench dataset. |
| `serve.sh` | Serves the model using the vLLM server. |
| `terminate_train.sh` | Terminates the training process. |
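Once `serve.sh` has the vLLM server running, you can query it through the OpenAI-compatible endpoint. The port and served model name below are assumptions; match them to whatever `serve.sh` actually configures.

```python
# Query the running vLLM server via its OpenAI-compatible API (pip install openai).
# Base URL and model name are assumptions; adjust them to match serve.sh.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="anni",  # placeholder: use the name the server reports under /v1/models
    messages=[{"role": "user", "content": "Write a C++ segment tree with lazy propagation."}],
    max_tokens=1024,
)
print(response.choices[0].message.content)
```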
The frontend code for Anni is available in the web directory.
👉 View Frontend Documentation
This repository’s model and its training code are released under the MIT License.
All other elements, such as the frontend code, project name, and logo, are trademarks of the developer and owner of this repository (Hans) and may not be used without explicit permission.
The training dataset includes openly licensed sources under CC-BY-4.0, which permits commercial use with attribution.
Attribution:
- OpenCodeReasoning-2 (CC-BY-4.0)
Note: The dataset itself is not included in this model release.
This model may generate incorrect or unsafe code. Evaluate and verify outputs before using them in production.
