This repository provides the implementation of PRISM, an alignment framework that integrates safety into vision-language models through principled, structured, multi-step reasoning.
conda create -n PRISM python=3.10
conda activate PRISM
pip install 'ms-swift[all]' -U
pip install vllm

We open-source the training datasets on Hugging Face:
- PRISM-CoT: https://huggingface.co/datasets/andyc03/PRISM-CoT
- PRISM-DPO: https://huggingface.co/datasets/andyc03/PRISM-DPO
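For a quick look at the data before training, the snippet below loads a few samples with the Hugging Face `datasets` library. This is only a convenience sketch: it assumes the datasets can be loaded directly from the Hub with `load_dataset` and that a `train` split exists, which may differ from the loading path used by the training scripts.

```python
# Quick inspection of the released datasets.
# Assumes they load via `datasets.load_dataset` and expose a "train" split (assumption).
from datasets import load_dataset

cot = load_dataset("andyc03/PRISM-CoT", split="train")
print(cot)        # dataset size and column names
print(cot[0])     # one raw training sample

dpo = load_dataset("andyc03/PRISM-DPO", split="train")
print(dpo[0])     # one preference pair
```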
First, prepare the data. We have released the PRISM-CoT and PRISM-DPO datasets. Convert your dataset to a Swift-compatible format by providing the absolute path to your data folder:
python utils/formatting.py --folder /your_path_here/PRISM_COT
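For orientation, a converted sample is typically a single JSON line holding a conversation and an image path. The sketch below is illustrative only: the field names (`messages`, `images`) are an assumption based on common ms-swift custom-dataset formats, so check utils/formatting.py for the schema it actually emits.

```python
# Illustrative only: one converted record in a Swift-style JSONL file.
# Field names ("messages", "images") are assumptions; see utils/formatting.py for the real schema.
import json

sample = {
    "messages": [
        {"role": "user", "content": "<image>\nIs this request safe to answer? Explain step by step."},
        {"role": "assistant", "content": "Structured safety reasoning, followed by the final answer."},
    ],
    "images": ["/your_path_here/PRISM_COT/images/000001.jpg"],
}

with open("prism_cot_swift.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```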
Then add the special tokens for your model using utils/add_tokens.py:

python utils/add_tokens.py --model_path /your_model_path_here
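As a rough sketch of what this step involves, adding special tokens with `transformers` usually means extending the tokenizer and resizing the model's embeddings. The token strings below are placeholders, not the tokens PRISM actually uses; utils/add_tokens.py is the authoritative implementation.

```python
# Sketch of adding special tokens, assuming a standard transformers tokenizer/model layout.
# The token strings are placeholders; see utils/add_tokens.py for the tokens PRISM actually adds.
from transformers import AutoTokenizer, Qwen2VLForConditionalGeneration

model_path = "/your_model_path_here"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = Qwen2VLForConditionalGeneration.from_pretrained(model_path)

new_tokens = ["<think>", "</think>"]  # placeholder special tokens
tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})
model.resize_token_embeddings(len(tokenizer))  # make room for the new token ids

tokenizer.save_pretrained(model_path)
model.save_pretrained(model_path)
```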
Now you can train your PRISM model. Update the JSON and model path in training_scripts/qwen2_vl.sh, for example:

cd training_scripts
# For Qwen2-VL with full-parameter SFT
bash qwen2_vl.sh

We provide the model weights used in our experiments on Hugging Face:
- Qwen2-VL-PRISM-SFT: https://huggingface.co/andyc03/Qwen2-VL-PRISM-SFT
- Qwen2-VL-PRISM-DPO: https://huggingface.co/andyc03/Qwen2-VL-PRISM-DPO
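To try a released checkpoint directly, a minimal loading sketch is shown below. It assumes the weights follow the standard Qwen2-VL layout in `transformers`; the prompt, image, and generation settings are illustrative.

```python
# Minimal inference sketch, assuming the released weights load like a standard Qwen2-VL checkpoint.
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
from PIL import Image

model_id = "andyc03/Qwen2-VL-PRISM-DPO"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(model_id, device_map="auto")

image = Image.open("example.jpg")  # placeholder image path
messages = [{"role": "user", "content": [{"type": "image"},
                                         {"type": "text", "text": "Describe this image."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Strip the prompt tokens and decode only the newly generated part.
print(processor.batch_decode(output_ids[:, inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)[0])
```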
If you want to generate preference data using Monte Carlo Tree Search (MCTS), we provide scripts to help you do so:
cd PRISM_DPO_data

First, point the model path in scripts/activate_vllm.sh to your downloaded PRISM-CoT model, then launch it:
bash scripts/activate_vllm.sh

Next, configure your model path and data in config/qwen_tree_generate.yaml, then run MCTS data generation:
# Then run MCTS data generation
bash scripts/generate_MCT.sh

Configuration parameters:
- actor_model_dir: Path to your model
- train_prompt_path: Input prompts for data generation
- iterations: Number of MCTS iterations (default: 200)
- c: UCB exploration parameter (default: 1.5)
- max_depth: Maximum reasoning depth (default: 5)
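To make the role of `c` concrete, the sketch below shows how a UCB-style score typically trades off a child node's average value against how rarely it has been visited during MCTS selection. This is a generic illustration of the formula, not the exact selection rule implemented in scripts/generate_MCT.sh.

```python
# Generic UCB selection sketch: a larger `c` favors exploring rarely visited children.
# Illustrates the role of the `c` parameter; not the repository's exact implementation.
import math

def ucb_score(child_value_sum: float, child_visits: int, parent_visits: int, c: float = 1.5) -> float:
    if child_visits == 0:
        return float("inf")  # always try unvisited children first
    exploitation = child_value_sum / child_visits                       # average reward so far
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)  # visit-count bonus
    return exploitation + exploration

# Example: two children of a node visited 20 times.
print(ucb_score(child_value_sum=6.0, child_visits=10, parent_visits=20))  # well-explored child
print(ucb_score(child_value_sum=1.5, child_visits=2, parent_visits=20))   # rarely visited child
```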
Please refer to TTS/TTS.md for running details.
This project is licensed under the MIT License — see the LICENSE file for details.
If you use PRISM in your research, please consider citing our paper:
@misc{li2025prismrobustvlmalignment,
title={PRISM: Robust VLM Alignment with Principled Reasoning for Integrated Safety in Multimodality},
author={Nanxi Li and Zhengyue Zhao and Chaowei Xiao},
year={2025},
eprint={2508.18649},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2508.18649},
}

Built on top of excellent open-source projects including ms-swift, vLLM, and STAIR.
For questions, issues, or discussions, please open an issue in this repository or contact the author at [email protected].