This repository contains the code for the paper "Stay Focused: Problem Drift in Multi-Agent Debate".
This repository is still under construction and subject to change.
The human dataset DRIFTEval is available at DriftEval.json. It includes both the labels and the corresponding explanations for 170 discussion excerpts.
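A minimal sketch for inspecting the dataset in Python, assuming DriftEval.json is in the repository root; the exact fields of each entry (e.g. label and explanation keys) are not specified here, so the snippet only loads the file and prints one entry to reveal them:

import json

# Load DRIFTEval; the entry structure (field names) is an assumption and
# should be checked against the printed sample below.
with open("DriftEval.json", encoding="utf-8") as f:
    data = json.load(f)

print(f"Loaded {len(data)} entries")
# Print one entry to see the actual fields before writing any analysis code.
sample = data[0] if isinstance(data, list) else next(iter(data.items()))
print(sample)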
conda env create -f environment.yaml
To run the code, you need the MALLM framework, which is available here, and it must be running.
Experiment 1 investigates multi-agent debate. Experiment 2 concerns DRIFTJudge and DRIFTPolicy.
First, you need to download the datasets:
python data/data_download.py
Then, you can run the code with the following commands:
Run experiments:
python batch_mallm.py exp1/exp1_batch.json
python batch_mallm.py exp2/exp2_batch.json
Run evaluations:
python exp1_evaluation.py
python exp2_evaluation.py
Create figures:
python exp1_create_figures.py
python exp2_create_figures.py
Coming soon.