MIR-MU/negation

Tool for evaluations of the modified dataset

This tool evaluates the accuracy of instruct models on datasets that contain hypotheses with and without negation.

How to set up environment

# create conda env:
conda create -n nllm python=3.10

# activate the environment
conda activate nllm

# install poetry 
pip install poetry

# install the project dependencies
poetry install --no-root

How to run predictions

First, launch the vLLM server with the desired model:

model_name=Qwen/Qwen2.5-0.5B-Instruct
port=8000
apikey=makesomethingup
gpu=7

CUDA_VISIBLE_DEVICES=$gpu \
HF_CACHE=.cache/ \
  vllm serve $model_name \
  --port $port \
  --api-key $apikey \
  --dtype auto \
  --task generate \
  --max-model-len 1600 \
  --enable-prefix-caching

Some parameters (for example --max-model-len or the GPU index) may need tweaking, depending on your hardware and the model used.

Optionally, add the following arguments for quantization:

  --quantization bitsandbytes --load-format bitsandbytes

Mistral models need these arguments:

  --tokenizer-mode mistral --config-format mistral --load-format mistral

A quantized Mistral model cannot be loaded this way, because --load-format bitsandbytes and --load-format mistral cannot be passed at the same time. In that case, quantize the model yourself with quantize.py and save it to a local path. Then run the vLLM server with the local path to the quantized model, without any of the Mistral-specific arguments.
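Before generating predictions, it can help to confirm that the server is reachable. The following is a minimal sketch, not part of this repository: it assumes the server exposes the OpenAI-compatible /v1/models endpoint and accepts the API key as a Bearer token. The helper names build_models_request and served_models are hypothetical.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the OpenAI-compatible model listing endpoint."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def served_models(base_url: str, api_key: str, timeout: float = 5.0) -> list[str]:
    """Return the model ids reported by the server.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: the response follows the OpenAI list format {"data": [{"id": ...}]}.
    return [entry["id"] for entry in payload["data"]]


# Usage, with values mirroring the shell variables above:
#   served_models("http://localhost:8000/v1", "makesomethingup")
```

If the call raises an error, the server is probably still loading the model or is listening on a different port.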

Once the inference server is running, launch the script that generates predictions:

python run.py http://localhost:${port}/v1 ${apikey} ${model_name} nofever-ces.csv ces_prompt.txt --output_dir ./output ; \
python run.py http://localhost:${port}/v1 ${apikey} ${model_name} nofever-eng.csv eng_prompt.txt --output_dir ./output ; \
python run.py http://localhost:${port}/v1 ${apikey} ${model_name} nofever-ukr.csv ukr_prompt.txt --output_dir ./output ; \
python run.py http://localhost:${port}/v1 ${apikey} ${model_name} nofever-deu.csv deu_prompt.txt --output_dir ./output

To see more options, run python run.py --help.
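The four invocations differ only in the language code, so they can also be generated in a loop. This is a hedged sketch rather than a script shipped with the repository: build_commands and run_all are hypothetical helpers, and the argument order simply copies the commands shown above.

```python
import subprocess

# Language codes taken from the dataset and prompt file names above.
LANGUAGES = ["ces", "eng", "ukr", "deu"]


def build_commands(port, apikey, model_name, output_dir="./output"):
    """Return one run.py argument list per language."""
    base_url = f"http://localhost:{port}/v1"
    return [
        ["python", "run.py", base_url, apikey, model_name,
         f"nofever-{lang}.csv", f"{lang}_prompt.txt",
         "--output_dir", output_dir]
        for lang in LANGUAGES
    ]


def run_all(commands):
    """Run the commands one after another and collect their exit codes.

    Like `;` in the shell, a failing run does not stop the remaining ones.
    """
    return [subprocess.run(cmd, check=False).returncode for cmd in commands]


# Usage:
#   run_all(build_commands(8000, "makesomethingup", "Qwen/Qwen2.5-0.5B-Instruct"))
```

Passing the arguments as a list avoids shell quoting issues if a model name or path contains unusual characters.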

Each run of the script creates three output files (the names below are for the model used above):

  • ./output/Qwen_Qwen2.5-0.5B-Instruct_<timestamp>_P.csv: CSV with the results on the positive hypotheses. Columns: dataset_id, predict_token (True or False), predicted_polarity (the polarity of the hypothesis if predict_token is True, the opposite polarity if it is False), and correct_polarity (the polarity of the actual correct hypothesis).
  • ./output/Qwen_Qwen2.5-0.5B-Instruct_<timestamp>_N.csv: CSV with the results on the negative hypotheses, with the same structure as *_P.csv.
  • ./output/Qwen_Qwen2.5-0.5B-Instruct_<timestamp>_res.json: JSON object containing the accuracy and other information about the run.

<timestamp> is the timestamp of the start of the run.
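The relationship between the CSV columns can be sketched as follows. This is an illustration under stated assumptions, not the code in run.py: the polarity labels "P" and "N" are borrowed from the file suffixes, predict_token is assumed to be stored as the text True or False, and accuracy is assumed to be the share of rows where predicted_polarity equals correct_polarity. The sample rows are made up.

```python
import csv
import io

# Assumed polarity labels, borrowed from the _P / _N file suffixes.
OPPOSITE = {"P": "N", "N": "P"}


def predicted_polarity(hypothesis_polarity: str, predict_token: str) -> str:
    """Keep the hypothesis polarity for a True answer, flip it for a False answer."""
    if predict_token == "True":
        return hypothesis_polarity
    return OPPOSITE[hypothesis_polarity]


def accuracy(rows) -> float:
    """Fraction of rows whose predicted polarity equals the correct polarity."""
    rows = list(rows)
    if not rows:
        return 0.0
    hits = sum(row["predicted_polarity"] == row["correct_polarity"] for row in rows)
    return hits / len(rows)


# Made-up rows standing in for a *_P.csv file. A real file can be read the
# same way with open(path, newline="") in place of io.StringIO.
SAMPLE = """dataset_id,predict_token,predicted_polarity,correct_polarity
1,True,P,P
2,False,N,P
3,False,N,N
4,True,P,N
"""

sample_accuracy = accuracy(csv.DictReader(io.StringIO(SAMPLE)))
print(sample_accuracy)
```

The value actually reported in *_res.json may be computed differently, so treat this only as a way to sanity-check the CSV files.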
