Official code for the paper "Systematic Investigation of Strategies Tailored for Low-Resource Settings for Low-Resource Dependency Parsing". If you use this code, please cite our paper (see Citations below).

Requirements

  • Python 3.7
  • PyTorch 1.1.0
  • CUDA 9.0
  • Gensim 3.8.1

We assume that conda is already installed. Install the dependencies with:

conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=9.0 -c pytorch
pip install gensim==3.8.1
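
A quick way to confirm the environment (a minimal sketch, not part of the repository; adapt as needed):

# Hypothetical sanity check: verify the versions listed above and CUDA availability.
import torch
import gensim

print("PyTorch:", torch.__version__)            # expected: 1.1.0
print("CUDA toolkit:", torch.version.cuda)      # expected: 9.0
print("CUDA available:", torch.cuda.is_available())
print("Gensim:", gensim.__version__)            # expected: 3.8.1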

Pretrained embeddings for Sanskrit

  • Pretrained FastText embeddings for STBC/VST can be obtained from here. Make sure the .txt file is placed in data/ (see the loading sketch after this list).
  • The main results are reported on the systems trained by combining train and dev splits.
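
As a rough illustration, the downloaded embeddings can be inspected with Gensim, assuming they are in the standard word2vec text format; the filename data/fasttext_sanskrit.txt below is a placeholder for the actual file:

from gensim.models import KeyedVectors

# Placeholder path: replace with the actual .txt file placed under data/.
vectors = KeyedVectors.load_word2vec_format("data/fasttext_sanskrit.txt", binary=False)

# Embedding dimensionality and vocabulary size (gensim 3.x API).
print(vectors.vector_size, len(vectors.vocab))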

How to train the model for Sanskrit

The proposed system consists of two stages: (1) pretraining and (2) integration. To run it, simply execute the bash script run_STBC.sh or run_VST.sh for the respective dataset. These scripts reproduce the results reported in Section 3 and Table 2 of the paper.

bash run_STBC.sh

Citations

@misc{sandhan_systematic,
  doi = {10.48550/ARXIV.2201.11374},
  url = {https://arxiv.org/abs/2201.11374},
  author = {Sandhan, Jivnesh and Behera, Laxmidhar and Goyal, Pawan},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Systematic Investigation of Strategies Tailored for Low-Resource Settings for Low-Resource Dependency Parsing},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}

Acknowledgements

Our ensembled system is built on top of the "DCST Implementation".