Official code repository of One-Pass to Reason: Token Duplication and Block-Sparse Mask for Efficient Fine-Tuning on Multi-Turn Reasoning

devrev/One-Pass-to-Reason

One-Pass-to-Reason

One-Pass to Reason: Token Duplication and Block-Sparse Mask for Efficient Fine-Tuning on Multi-Turn Reasoning
Ritesh Goru, Shanay Mehta, Prateek Jain
Paper: https://arxiv.org/abs/2504.18246
Dataset: https://huggingface.co/datasets/devrev-research/MathChatSync-reasoning

Setup instructions

Run bash setup.sh.

The script will:

  • Clone the appropriate version of LLaMA-Factory
  • Apply our modifications

After setup, enable the special_mask_for_reasoning flag when training to process each multi-turn conversation in a single pass.
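To give a feel for what a "token duplication plus block-sparse mask" setup can look like, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the code this repository patches into the training framework: the function name `build_reasoning_mask`, the block names (`h`, `t`, `r_out`, `r_in`), and the per-turn layout order are all hypothetical. The assumed idea is that each response is stored twice: `r_out` follows the reasoning tokens `t` and can see them, while `r_in` is a copy that never sees reasoning and serves as context for later turns, mirroring the fact that reasoning tokens are discarded in subsequent turns.

```python
import numpy as np


def build_reasoning_mask(turn_lengths):
    """Build a boolean attention mask (True = query row may attend to key column).

    turn_lengths: list of (h_len, t_len, r_len) per turn, where h is the human
    message, t the reasoning tokens and r the response. Assumed layout per
    turn (hypothetical): h, t, r_out, r_in, with r_in a duplicate of r_out.
    """
    kinds, turns = [], []
    for i, (h_len, t_len, r_len) in enumerate(turn_lengths):
        for kind, n in (("h", h_len), ("t", t_len), ("r_out", r_len), ("r_in", r_len)):
            kinds += [kind] * n
            turns += [i] * n
    kinds, turns = np.array(kinds), np.array(turns)

    pos = np.arange(len(kinds))
    causal = pos[:, None] >= pos[None, :]          # never attend to the future

    q_kind, k_kind = kinds[:, None], kinds[None, :]
    q_turn, k_turn = turns[:, None], turns[None, :]

    k_is_context = (k_kind == "h") | (k_kind == "r_in")
    q_is_reasoning = (q_kind == "t") | (q_kind == "r_out")

    # Earlier turns contribute only their reasoning-free context (h and r_in).
    earlier_context = k_is_context & (k_turn < q_turn)

    # Within a turn: everyone sees h; t and r_out see t and r_out;
    # r_in sees only h and itself, never the reasoning.
    same_turn = (k_turn == q_turn) & (
        (k_kind == "h")
        | (q_is_reasoning & ~k_is_context)
        | ((q_kind == "r_in") & (k_kind == "r_in"))
    )
    return causal & (earlier_context | same_turn)


# Two turns: (human, reasoning, response) token counts.
mask = build_reasoning_mask([(2, 3, 2), (1, 2, 1)])
print(mask.astype(int))
```

In a sketch like this, the loss would presumably be computed on the `t` and `r_out` positions only, since `r_in` exists purely as context; how the repository handles labels and position IDs for the duplicated tokens is not stated here, so consult the paper and the applied modifications for the actual behavior.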

Citation:

@article{goru2025efficientsinglepasstrainingmultiturn,
    title={Efficient Single-Pass Training for Multi-Turn Reasoning},
    author={Ritesh Goru and Shanay Mehta and Prateek Jain},
    year={2025},
    eprint={2504.18246},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2504.18246},
}
