PyTorch Word2Vec implementation

Description

An unofficial PyTorch implementation of the paper "Efficient Estimation of Word Representations in Vector Space", with code for training the model and for demonstrating properties of the trained embeddings. This implementation focuses on the Skip-gram model only.
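As a rough illustration of what the Skip-gram model is trained on, the sketch below builds (center, context) pairs from a tokenized sentence. The function name skipgram_pairs and the fixed window size are assumptions made for this example and are not taken from this repository; the paper samples the window size randomly per word, which is omitted here for simplicity.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for each token and its neighbours.

    Hypothetical helper for illustration only: every token is paired with
    each token at most `window` positions to its left or right.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # left edge of the window
        hi = min(len(tokens), i + window + 1)   # right edge (exclusive)
        for j in range(lo, hi):
            if j != i:                          # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Toy sentence, made up for this example.
example_pairs = skipgram_pairs("soft red cotton dress".split(), window=1)
```

During training, each pair would typically serve as one example: the model receives the center word and is asked to predict the context word.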

Content

Key files:

word2vec.pth: a model pre-trained on the Amazon Fashion dataset with a 4000-word vocabulary.
inference.ipynb: a playground that demonstrates some properties of the model.
train.ipynb: trains word2vec from scratch; use it to customize the training process.
extra/cloud.svg: a t-SNE visualization of the most distinct word clusters.
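One common way to explore the properties of word embeddings is a nearest-neighbour lookup by cosine similarity. The sketch below is a minimal pure-Python version under stated assumptions: the functions cosine and nearest and the toy vectors are hypothetical, invented for this example, and do not reflect the contents of word2vec.pth or the code in inference.ipynb.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest(word: str, embeddings: Dict[str, List[float]], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k words most similar to `word`, best match first."""
    query = embeddings[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != word                      # skip the query word itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors, made up for illustration (not real model output).
toy_embeddings = {
    "red": [0.9, 0.1, 0.0],
    "blue": [0.8, 0.2, 0.1],
    "mother": [0.0, 0.9, 0.3],
    "father": [0.1, 0.8, 0.4],
}

closest_to_red = nearest("red", toy_embeddings, k=1)
```

With real embeddings, the same lookup would be run over the vectors extracted from the trained model, where words from one semantic group are expected to rank close to each other.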

Installation

git clone https://github.com/tejpaper/word2vec.git
cd word2vec
pip install -r requirements.txt

Some clusters


Emotions and feelings
Family
Seasons
Numbers
Colors
Body parts
Clothes
Sizes

License

MIT

