CSI: A Coarse Sense Inventory for 85% WSD
Caterina Lacerra, Michele Bevilacqua, Tommaso Pasini and Roberto Navigli
Sapienza, University of Rome
Department of Computer Science
{lacerra, bevilacqua, pasini, navigli} [at] di.uniroma1.it
This repository contains the code to reproduce the experiments reported in CSI: A Coarse Sense Inventory for 85% Word Sense Disambiguation, by Caterina Lacerra, Michele Bevilacqua, Tommaso Pasini and Roberto Navigli. For further information on this work, please visit our website.
@inproceedings{lacerraetal:2020,
title={ {CSI}: A Coarse Sense Inventory for 85\% Word Sense Disambiguation},
author={Lacerra, Caterina and Bevilacqua, Michele and Pasini, Tommaso and Navigli, Roberto},
booktitle={Proceedings of the Thirty-Fourth {AAAI} Conference on Artificial Intelligence},
pages={8123--8130},
publisher={{AAAI} Press},
year={2020},
doi = {10.1609/aaai.v34i05.6324}
}
Run the Python scripts `src/main_all_words.py`, `src/main_one_out.py` and `src/main_few_shot.py` to reproduce the experiments for the all-words, one-out and few-shot settings (Tables 4 and 6 of the paper).
The arguments for the scripts are the same for each setting:
- `inventory_name` is one of the tested inventories, i.e. `csi`, `wndomains`, `supersenses`, `sensekeys`.
- `model_name` can be either `BertDense` or `BertLSTM`.
- `data_dir` is the path where the data is located (typically `./data`).
- `data_out` is the path of the output folder.
- `wsd_data_dir` is the path where the WSD datasets are located (typically `./wsd_data`).
- `start_from_checkpoint` is set when continuing training from a dumped checkpoint (optional).
- `starting_epoch` is different from 0 only if `start_from_checkpoint` is set; it is the starting epoch for the training (optional).
- `do_eval` is a flag to perform model evaluation only (optional).
- `epochs` is the number of training epochs (optional, 40 by default).
Please note that the few-shot setting continues training from the best epoch achieved with the one-out setting, thus it is necessary to run the one-out script first.
To train a model in the all-words setting with the CSI sense inventory, run
python main_all_words.py --inventory_name=csi --model_name=BertLSTM --data_dir=./data/ --data_output=./output/ --wsd_data_dir=./wsd_data/
To evaluate a previously trained model, just add the `--do_eval` flag:
python main_all_words.py --inventory_name=csi --model_name=BertLSTM --data_dir=./data/ --data_output=./output/ --wsd_data_dir=./wsd_data/ --do_eval
Otherwise, to continue training a model for which checkpoints are available (e.g. from epoch 9):
python main_all_words.py --inventory_name=csi --model_name=BertLSTM --data_dir=./data/ --data_output=./output/ --wsd_data_dir=./wsd_data/ --start_from_checkpoint --starting_epoch=9
The output folder specified with `data_out` will be created and filled with results during training and testing.
For each experiment configuration (i.e. all-words, one-out or few-shot), a folder will be created containing the results for each sense inventory used.
For example, running the all-words experiment with `csi` produces the following layout:
+-- output_folder
|   +-- csi
|   |   +-- weights
|   |   +-- logs
|   |   +-- output_files
|   |   +-- processed_input_files
Checkpoints for each training epoch will be contained inside the weights directory, while the logs directory
will contain logs for TensorBoard.
There will be one tab-separated output file for each test dataset. The format of the files is the following:
flag instance_id predicted_label gold_label
where `flag` is `w` or `c` for wrong and correct instances, respectively, and `instance_id` uniquely identifies
the instance in the dataset.
Please note that the output file for the dev set is computed (and overwritten) at the end of each training epoch,
while the output files for the other datasets are computed at test time.
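Since the first column already flags each instance as wrong or correct, accuracy on a test dataset can be recomputed directly from an output file. A minimal sketch (the helper name is made up; the file layout is the four-column format described above):

```python
import csv

def accuracy_from_output(path):
    """Compute accuracy from a tab-separated output file whose rows are
    flag, instance_id, predicted_label, gold_label ('c' = correct)."""
    correct = total = 0
    with open(path, newline="", encoding="utf-8") as f:
        for flag, instance_id, predicted, gold in csv.reader(f, delimiter="\t"):
            total += 1
            if flag == "c":
                correct += 1
    return correct / total if total else 0.0
```
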
The processed input files, instead, are computed for both the training and the test datasets; their format is the following:
instance_id target_word gold_label target_sentence
Once again, the files are tab-separated.
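A processed input file can be loaded in the same way, e.g. to look up instances by their identifier. A minimal sketch (the helper name is made up; the columns follow the format above):

```python
import csv

def load_processed_inputs(path):
    """Map each instance_id to its (target_word, gold_label, target_sentence)
    triple, read from a tab-separated processed input file."""
    instances = {}
    with open(path, newline="", encoding="utf-8") as f:
        for instance_id, target_word, gold_label, sentence in csv.reader(f, delimiter="\t"):
            instances[instance_id] = (target_word, gold_label, sentence)
    return instances
```
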
The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union’s Horizon 2020 research and innovation programme.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Share - copy and redistribute the material in any medium or format
Adapt - remix, transform, and build upon the material
Attribution - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
NonCommercial - You may not use the material for commercial purposes.
ShareAlike - If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
No additional restrictions - You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
