The project converts American Sign Language (ASL) hand gesture images to text. It uses a Convolutional Neural Network (CNN) with transfer learning; the pre-trained model used here is VGG16.
About VGG16
VGG16 refers to the VGG model, also called VGGNet. It is a convolutional neural network (CNN) architecture with 16 weight layers, proposed by K. Simonyan and A. Zisserman of Oxford University. VGG16 achieves about 92.7% top-5 test accuracy on ImageNet, a dataset containing more than 14 million images across 1000 object classes, and was one of the top models in the ILSVRC-2014 competition.
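Below is a minimal sketch of how VGG16 can serve as a frozen feature extractor with a new classification head for this task. The head layer sizes and optimizer are illustrative assumptions; the 310 × 310 input size and 26 classes follow the dataset described below.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Load VGG16 pre-trained on ImageNet, without its fully connected "top" layers.
# The 310x310x3 input shape matches the dataset images (an assumption from the README).
base_model = VGG16(weights="imagenet", include_top=False, input_shape=(310, 310, 3))
base_model.trainable = False  # freeze the convolutional base for transfer learning

# Add a small classification head for the ASL letter classes.
model = models.Sequential([
    base_model,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),    # head size is illustrative
    layers.Dropout(0.5),
    layers.Dense(26, activation="softmax"),  # 26 letters of the English alphabet
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```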
- Jupyter Notebook
- Streamlit
Watch the video demo here
sld-demo.webm
The dataset covers all the letters of the English alphabet, providing extensive coverage of American Sign Language (ASL) gestures. The training set contains 12,875 images and the test set contains 4,268 images, each of size 310 × 310 pixels.
Dataset Source
Dataset
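A sketch of how the training and test images might be loaded and preprocessed for VGG16 with Keras; the directory layout (`data/train`, `data/test`) and batch size are assumptions, while the 310 × 310 image size comes from the dataset description above.

```python
import tensorflow as tf
from tensorflow.keras.applications.vgg16 import preprocess_input

IMG_SIZE = (310, 310)   # image size from the dataset description
BATCH_SIZE = 32         # assumed batch size

# Assumes images are organised as data/train/<letter>/... and data/test/<letter>/...
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=BATCH_SIZE, label_mode="categorical")
test_ds = tf.keras.utils.image_dataset_from_directory(
    "data/test", image_size=IMG_SIZE, batch_size=BATCH_SIZE, label_mode="categorical")

# Apply the same preprocessing VGG16 expects (mean subtraction, BGR channel order).
train_ds = train_ds.map(lambda x, y: (preprocess_input(x), y))
test_ds = test_ds.map(lambda x, y: (preprocess_input(x), y))
```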
- Python 3
- NumPy
- Pandas
- Matplotlib
- Pillow
- TensorFlow
- Keras
- OpenCV
- Streamlit
pip install streamlit
cd app
streamlit run app.py
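For context, here is a minimal sketch of the prediction step such a Streamlit app could perform on an uploaded image; the saved-model path, the A–Z label ordering, and the app internals are assumptions, not the actual contents of app.py.

```python
import numpy as np
import streamlit as st
import tensorflow as tf
from PIL import Image
from tensorflow.keras.applications.vgg16 import preprocess_input

LABELS = [chr(c) for c in range(ord("A"), ord("Z") + 1)]  # assumed A-Z label order

model = tf.keras.models.load_model("model.h5")  # hypothetical saved-model path

uploaded = st.file_uploader("Upload an ASL gesture image", type=["jpg", "png"])
if uploaded is not None:
    # Resize to the model's expected input size and show the input to the user.
    img = Image.open(uploaded).convert("RGB").resize((310, 310))
    st.image(img, caption="Input gesture")

    # Preprocess exactly as during training, then predict the letter.
    x = preprocess_input(np.array(img, dtype="float32")[np.newaxis, ...])
    probs = model.predict(x)[0]
    st.write(f"Predicted letter: {LABELS[int(np.argmax(probs))]}")
```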
The model achieved a test accuracy of 96.37% with a test loss of 1.511.
Read the Blog Here
If you have any queries, feedback, or suggestions, feel free to drop a mail at [email protected] :)