GitHub - sirish999/Image-Captioning-with-CNN-RNN-and-LSTM: Tech Stack: Tensorflow, Keras, Python, CNN-RNN and LSTM, Image processing and NLP Github URL: Project Link • In this project, I have created a neural network architecture to automatically generate captions from images. • After using the Microsoft Common Objects in COntext (MS COCO) dataset to train your network, this project' network can be used on novel images! • I passed all the inputs as a sequence to an LSTM. A sequence looks like this: first a feature vector that is extracted from an input image, then a start word, then the next word, the next word, and so on. • Embedding Dimension: The LSTM is defined such that, as it sequentially looks at inputs, it expects that each individual input in a sequence is of a consistent size and so we embed the feature vector and each word so that they are embed

sirish999 / Image-Captioning-with-CNN-RNN-and-LSTM Public

Notifications You must be signed in to change notification settings
Fork 1
Star 0

Tech Stack: Tensorflow, Keras, Python, CNN-RNN and LSTM, Image processing and NLP Github URL: Project Link • In this project, I have created a neural network architecture to automatically generate captions from images. • After using the Microsoft Common Objects in COntext (MS COCO) dataset to train your network, this project' network can be used…

0 stars 1 fork Branches Tags Activity

Star

Notifications

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
0_Dataset.ipynb		0_Dataset.ipynb
1_Preliminaries.ipynb		1_Preliminaries.ipynb
2_Training.html		2_Training.html
2_Training.ipynb		2_Training.ipynb
3_Inference.html		3_Inference.html
3_Inference.ipynb		3_Inference.ipynb
4_Zip Your Project Files and Submit.ipynb		4_Zip Your Project Files and Submit.ipynb
Readme.txt		Readme.txt
Untitled.ipynb		Untitled.ipynb
data_loader.py		data_loader.py
filelist.txt		filelist.txt
model.py		model.py
project2.zip		project2.zip
training_log.txt		training_log.txt
vocab.pkl		vocab.pkl
vocabulary.py		vocabulary.py

Repository files navigation

Tech Stack: Tensorflow, Keras, Python, CNN-RNN and LSTM, Image processing and NLP

• In this project, I have created a neural network architecture to automatically generate captions from images.

• After using the Microsoft Common Objects in COntext (MS COCO) dataset to train your network, this project' network can be used on novel images!

• I passed all the inputs as a sequence to an LSTM. A sequence looks like this: first a feature vector that is extracted from an input image, then a start word, then the next word, the next word, and so on.

• Embedding Dimension: The LSTM is defined such that, as it sequentially looks at inputs, it expects that each individual input in a sequence is of a consistent size and so we embed the feature vector and each word so that they are embed_size.