In proceedings of EMNLP 2019.
For convenience, create a symlink: cd e2e-coref && ln -s ../wiki ./wiki
For pre-training the coreference resolution system, OntoNotes 5.0 is required. [Download] [Create splits]
Data for training the reward models and fine-tuning the coreference resolver (place in <PROJECT_HOME>/data):
- 2M triples for RE-Text [Download]
- 12M triples for RE-KG [Download]
- 60k triples for RE-Joint [Download]
- Development data [Download]
- 10k Wikipedia summaries for fine-tuning [Download]
Note: If you want to make these files from scratch, follow the instructions in the triples folder.
- Best performing reward model (RE-Distill) [Download]
- Best performing coreference resolver (Coref-Distill) [Download]
Unzip Coref-Distill into the e2e-coref/logs folder and run: GPU=x python evaluate.py final
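The `GPU=x` prefix sets an environment variable for that single command. The same invocation can be sketched in Python, which may be handy when scripting several runs. The helper names `build_run` and `run` are hypothetical and not part of this repository; the sketch defaults to a dry run that only prints the command.

```python
import os
import subprocess
from pathlib import Path


def build_run(script, experiment, gpu, repo_dir="e2e-coref"):
    """Return (command, env, cwd) mirroring `GPU=<gpu> python <script> <experiment>`."""
    env = dict(os.environ)
    env["GPU"] = str(gpu)  # same effect as the shell prefix GPU=x
    command = ["python", script, experiment]
    return command, env, Path(repo_dir)


def run(script, experiment, gpu, repo_dir="e2e-coref", dry_run=True):
    """Print the command (dry run) or execute it inside repo_dir."""
    command, env, cwd = build_run(script, experiment, gpu, repo_dir)
    if dry_run:
        print(f"(cd {cwd} && GPU={env['GPU']} {' '.join(command)})")
        return None
    # check=True raises CalledProcessError if the script exits with an error
    return subprocess.run(command, env=env, cwd=cwd, check=True)


if __name__ == "__main__":
    # Hypothetical usage: evaluate the "final" experiment with GPU set to 0
    run("evaluate.py", "final", gpu=0)
```

Passing `dry_run=False` executes the command; the same helper would cover the `train.py` and `finetune.py` invocations, since they follow the same pattern.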
- Download PyTorch-BigGraph embeddings (~40G, place in <PROJECT_HOME>/embeddings) [Download]
- Run wiki/embs.py to create an index of the embeddings (you need to do this only once)
- Run reward module training with: cd wiki/reward && python train.py <dataset-name>
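Because the embeddings download is large and training depends on files placed by hand, a quick check before starting may save a failed run. The following is a minimal sketch, assuming only the two directory names given in this README (`data` and `embeddings` under the project home). The function `missing_inputs` is hypothetical, and it does not check individual file names, which this README does not list.

```python
from pathlib import Path


def missing_inputs(project_home):
    """Return the required directories that are absent or empty."""
    required = ["data", "embeddings"]  # locations named in this README
    problems = []
    for name in required:
        path = Path(project_home) / name
        # A directory counts as missing if it does not exist or has no entries
        if not path.is_dir() or not any(path.iterdir()):
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Assumes the script is run from the project home directory
    missing = missing_inputs(".")
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All input directories are present.")
```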
- Follow e2e-coref/README.md to set up the environment, create ELMo embeddings, etc.
- Run coreference pre-training with: cd e2e-coref && GPU=x python train.py <experiment>
- Start the sling server with: python wiki/reward/sling_server.py
- Change SLING_IP in wiki/reward/reward.py to the IP of the sling server
- Run coreference fine-tuning with: cd e2e-coref && GPU=x python finetune.py <experiment> (see e2e-coref/experiments.conf for the different configurations)
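Fine-tuning depends on the sling server being reachable at the address set in SLING_IP, so it may be worth confirming connectivity first. The sketch below assumes the server accepts TCP connections; this README does not state the protocol or port, so the host and port in the usage example are placeholders, and `server_reachable` is a hypothetical helper.

```python
import socket


def server_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts
        return False


if __name__ == "__main__":
    # Placeholder values: substitute the IP set in SLING_IP and the
    # port the sling server actually listens on (not given in this README).
    host, port = "127.0.0.1", 8080
    status = "reachable" if server_reachable(host, port, timeout=1.0) else "not reachable"
    print(f"{host}:{port} is {status}")
```

A successful connection only shows that something is listening at that address; it does not verify that the service behind it is the sling server.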
- wiki/reward/combine_models.py can be used to distill the various reward models
- e2e-coref/save_weights.py can be used to save the weights of the fine-tuned coreference models so that they can be combined by setting the distill flag in the configuration file
@inproceedings{aralikatte-etal-2019-rewarding,
title = "Rewarding Coreference Resolvers for Being Consistent with World Knowledge",
author = "Aralikatte, Rahul and
Lent, Heather and
Gonzalez, Ana Valeria and
Herschcovich, Daniel and
Qiu, Chen and
Sandholm, Anders and
Ringaard, Michael and
S{\o}gaard, Anders",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
month = nov,
year = "2019",
address = "Hong Kong, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/D19-1118",
doi = "10.18653/v1/D19-1118",
pages = "1229--1235"
}