- Computes embeddings of Wikipedia texts.
- Based on UBI-AGML-NLP Embeddings, Hugging Face and BERT.
- Code repository
- Processed data at FTP server
- embeddings-wikipedia.ipynb (nbviewer): Jupyter notebook for computing the embeddings
- cosine-similarity-wikipedia-a.ipynb (nbviewer 6e2d52b): Cosine similarity of means of embeddings
- data_access.ipynb (nbviewer): Access to pre-computed data (Jupyter version of data_access.py)
- cosine-similarity-wikipedia-b.ipynb (nbviewer 50edd35): Comparison of cosine similarity and difference value; data investigation
- cosine-similarity-wikipedia-c.ipynb (nbviewer): Further cosine-similarity tests
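
The cosine-similarity notebooks compare means of embeddings. The following is a minimal sketch of that idea, not the notebooks' actual code: the function names `mean_embedding` and `cosine_similarity` and the toy 3-dimensional vectors are assumptions made for illustration, and real BERT embeddings would have far more dimensions. It uses only the Python standard library so it runs without the pre-computed data.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_embedding(embeddings: Sequence[Vector]) -> List[float]:
    """Element-wise mean of equally sized embedding vectors (hypothetical helper)."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    dim = len(embeddings[0])
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError("all embeddings must have the same dimension")
    count = len(embeddings)
    return [sum(vec[i] for vec in embeddings) / count for i in range(dim)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for the embeddings of two groups of texts (assumed data).
texts_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
texts_b = [[1.0, 1.0, 0.0], [1.0, 1.0, 2.0]]

mean_a = mean_embedding(texts_a)  # [0.5, 0.5, 0.0]
mean_b = mean_embedding(texts_b)  # [1.0, 1.0, 1.0]
similarity = cosine_similarity(mean_a, mean_b)
print(f"cosine similarity of means: {similarity:.4f}")
```

A value close to 1 means the two mean vectors point in nearly the same direction, and a value close to 0 means they are nearly orthogonal. For large collections of embeddings, a vectorized library such as NumPy would be the more practical choice.
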
Data Science Group (DICE) at Paderborn University
Machine Learning Group / CoR-Lab at Bielefeld University
This work has been supported by the German Federal Ministry of Education and Research (BMBF) within the project EML4U under grant no. 01IS19080B.