A vectorized implementation of the layers (building blocks) of a convolutional neural network, written for faster training and prediction and used to classify handwritten Bengali digits.
Bengali Handwritten Digit Recognition Competition by Bengal.ai
- Convolution: Four (hyper)parameters: a. Number of output channels b. Filter dimension c. Stride d. Padding
- ReLU Activation
- Max-pooling: Two parameters: a. Filter dimension b. Stride
- Flattening layer: Converts a (series of) convolutional filter maps to a column vector.
- Fully-connected layer: a dense layer. One parameter: output dimension.
- Softmax: Converts final layer projections to normalized probabilities.
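As an illustration of what "vectorized" can mean for the layers listed above, here is a minimal NumPy sketch of their forward passes. It is not the repository's code: the function names, the use of `sliding_window_view` with `einsum`, and the row-per-sample layout after flattening (rather than a column vector per sample) are all assumptions made for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_forward(x, w, b, stride=1, padding=0):
    """x: (N, C_in, H, W), w: (C_out, C_in, F, F), b: (C_out,)."""
    if padding:
        x = np.pad(x, ((0, 0), (0, 0), (padding, padding), (padding, padding)))
    f = w.shape[2]
    # Every F x F window as a view: (N, C_in, H_out, W_out, F, F) after striding.
    windows = sliding_window_view(x, (f, f), axis=(2, 3))[:, :, ::stride, ::stride]
    # One contraction over input channels and both filter axes, no Python loops.
    out = np.einsum("nchwij,kcij->nkhw", windows, w)
    return out + b[None, :, None, None]


def relu(x):
    return np.maximum(x, 0)


def maxpool2d_forward(x, size=2, stride=2):
    windows = sliding_window_view(x, (size, size), axis=(2, 3))[:, :, ::stride, ::stride]
    return windows.max(axis=(-2, -1))


def flatten(x):
    # One row of features per sample (assumed layout for this sketch).
    return x.reshape(x.shape[0], -1)


def dense_forward(x, w, b):
    return x @ w + b


def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Tiny end-to-end pass on random data (shapes chosen only for illustration).
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1, 8, 8))
w_conv = rng.standard_normal((3, 1, 3, 3))
b_conv = rng.standard_normal(3)
conv_out = conv2d_forward(x, w_conv, b_conv, stride=1, padding=1)  # (2, 3, 8, 8)
pooled = maxpool2d_forward(relu(conv_out))                         # (2, 3, 4, 4)
flat = flatten(pooled)                                             # (2, 48)
w_fc = rng.standard_normal((48, 10))
b_fc = np.zeros(10)
probs = softmax(dense_forward(flat, w_fc, b_fc))                   # (2, 10)
```

The window view avoids copying the input, so the cost of the convolution is concentrated in the single `einsum` call rather than in nested loops over positions and channels.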
Full Specifications of build: cnn from scratch
- The architecture is the standard LeNet-5.
- The images were preprocessed as follows:
  - resize to 64 by 64
  - invert the pixel colors
  - transpose to the shape (batch, channels, image_dimension, image_dimension)
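The three preprocessing steps can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the input is taken to be a `uint8` batch in (batch, height, width, channels) layout, the resize is a simple nearest-neighbour lookup (the project may well have used an image library instead), and the function name `preprocess` is hypothetical.

```python
import numpy as np


def preprocess(images, size=64):
    """images: uint8 array of shape (batch, height, width, channels), values 0..255."""
    n, h, w, c = images.shape
    # Nearest-neighbour resize: pick one source row/column per target row/column.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = images[:, rows[:, None], cols[None, :], :]   # (batch, size, size, channels)
    # Invert the pixel colors (dark ink on light paper becomes light on dark).
    inverted = 255 - resized
    # Channels-first layout: (batch, channels, size, size).
    return inverted.transpose(0, 3, 1, 2)
```

For example, a batch of shape (32, 180, 180, 1) comes out with shape (32, 1, 64, 64).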
Training used the combined images from 'training-a', 'training-b' and 'training-c', split 80-20 into training and validation portions. The model was trained with mini-batch gradient descent, with the following settings and results:
- Epochs: 24
- batch size: 32
- Learning Rate: 0.1 (tuned)
- Accuracy: 0.9525473399458972
- Validation Loss: -7.247408912811393
- Macro F1 score: 0.9523544234682374
- Training Loss: 2.0793054652897394e-05
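The training procedure described above (80-20 split, mini-batches of 32, learning rate 0.1, 24 epochs) follows the usual mini-batch gradient descent loop. The sketch below shows that loop on a stand-in model: a single softmax layer replaces LeNet-5 so the example stays short and runnable. All function names are hypothetical and the code is not taken from the repository.

```python
import numpy as np


def train_val_split(x, y, val_fraction=0.2, seed=0):
    """Shuffle once, then hold out val_fraction of the samples."""
    idx = np.random.default_rng(seed).permutation(len(x))
    n_val = int(len(x) * val_fraction)
    return x[idx[n_val:]], y[idx[n_val:]], x[idx[:n_val]], y[idx[:n_val]]


def minibatches(x, y, batch_size, rng):
    """Yield shuffled mini-batches; the last one may be smaller."""
    idx = rng.permutation(len(x))
    for start in range(0, len(x), batch_size):
        sel = idx[start:start + batch_size]
        yield x[sel], y[sel]


def sgd_train(x, y, n_classes, epochs=24, batch_size=32, lr=0.1, seed=0):
    """Stand-in model (assumption): one softmax layer instead of LeNet-5."""
    rng = np.random.default_rng(seed)
    w = np.zeros((x.shape[1], n_classes))
    b = np.zeros(n_classes)
    for _ in range(epochs):
        for xb, yb in minibatches(x, y, batch_size, rng):
            z = xb @ w + b
            z -= z.max(axis=1, keepdims=True)
            p = np.exp(z)
            p /= p.sum(axis=1, keepdims=True)
            # Gradient of the mean cross-entropy with respect to the logits.
            p[np.arange(len(yb)), yb] -= 1.0
            grad = p / len(yb)
            # Plain gradient descent step on this mini-batch.
            w -= lr * (xb.T @ grad)
            b -= lr * grad.sum(axis=0)
    return w, b
```

With a real CNN, the body of the inner loop would be replaced by a forward pass through all layers, a backward pass, and the same `parameter -= lr * gradient` update for each layer's weights.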
Testing was done on 'training-d', as instructed by the project supervisor:
- Accuracy: 0.9540704070407041
- Macro F1 score: 0.9541251631006855
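For reference, accuracy and macro F1 can be computed from predicted and true labels as in the sketch below. Macro F1 is the unweighted mean of the per-class F1 scores, so every digit counts equally regardless of how many samples it has. The function names are illustrative and the code is not taken from the repository.

```python
import numpy as np


def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))


def macro_f1(y_true, y_pred, n_classes):
    f1_scores = []
    for k in range(n_classes):
        tp = np.sum((y_pred == k) & (y_true == k))
        fp = np.sum((y_pred == k) & (y_true != k))
        fn = np.sum((y_pred != k) & (y_true == k))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(f1_scores))


y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])
acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred, n_classes=3)
```

In this toy example the per-class F1 scores are 1, 2/3 and 4/5, so the macro F1 is their plain average.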
Full Report on the Results: Training Results