This repository presents a comprehensive approach to emotion classification on text data labeled with six emotions:
- Sadness
- Joy
- Love
- Anger
- Fear
- Surprise
We explore and compare four deep learning architectures, each in its own notebook:
**CNN** (`cnn.ipynb`)
- Embedding layer, convolutional filters, and global max pooling
- Achieves around 90% accuracy (depending on hyperparameters)
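A minimal Keras sketch of this architecture (layer sizes and vocabulary size are illustrative, not necessarily the notebook's exact values):

```python
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE, EMBED_DIM, NUM_CLASSES = 20000, 128, 6  # illustrative values

model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),          # learned word embeddings
    layers.Conv1D(128, 5, activation="relu"),         # filters over local n-gram windows
    layers.GlobalMaxPooling1D(),                      # keep the strongest response per filter
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),  # one probability per emotion
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```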
**LSTM** (`lstm.ipynb`)
- Incorporates LSTM units to capture sequential text patterns
- Dropout for regularization; RMSprop or Adam as optional optimizers
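A comparable sketch with an LSTM layer (unit count and dropout rate are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers, optimizers

model = tf.keras.Sequential([
    layers.Embedding(20000, 128),
    layers.LSTM(128),                 # carries state across the token sequence
    layers.Dropout(0.5),              # regularization against overfitting
    layers.Dense(6, activation="softmax"),
])
# Either RMSprop or Adam can be used here.
model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```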
**BiGRU** (`BiGRU.ipynb`)
- Bidirectional GRU capturing context in both directions
- Approximately 94% accuracy with extensive preprocessing (emoji handling, chat-word cleanup)
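The `Bidirectional` wrapper is the key difference from the LSTM model; a hedged sketch:

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Embedding(20000, 128),
    layers.Bidirectional(layers.GRU(128)),  # reads each sequence forward and backward
    layers.Dense(6, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```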
**BERT** (`Bert.ipynb`)
- Fine-tunes a pre-trained BERT model
- Often yields the highest accuracy (~97%) but requires more computational resources
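A fine-tuning sketch using Hugging Face Transformers (the checkpoint, learning rate, and placeholder data are assumptions; see `Bert.ipynb` for the actual setup):

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = TFAutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=6)

texts = ["i feel so happy today", "i am scared of the dark"]  # placeholder rows
labels = [1, 4]                                               # placeholder label ids

enc = tokenizer(texts, padding=True, truncation=True,
                max_length=128, return_tensors="tf")
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
model.fit(dict(enc), tf.constant(labels), epochs=1, batch_size=32)
```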
The dataset is sourced from:

N. Elgiriyewithana, "Emotion dataset," Kaggle. https://www.kaggle.com/datasets/nelgiriyewithana/emotions (accessed Apr. 2, 2024).

- Approximately 400,000 tweets, each labeled with one of the six emotions.
- Preprocessing includes tokenization, stopword removal, padding, and optional class balancing.
- Data Preprocessing: removing emojis and chat words, balancing classes, tokenizing, and padding (a sketch follows this list).
- Model Development: each notebook details its training setup, hyperparameters (batch size, learning rate, epochs), and the chosen optimizer.
- Evaluation Metrics: accuracy, precision, recall, F1-score, confusion matrix, and in some notebooks ROC curves.
- Comparison: highlights how model complexity and architecture choice affect classification performance.
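A hedged sketch of the preprocessing step (the chat-word map, regex, and sizes are illustrative; the notebooks use larger dictionaries and their own hyperparameters):

```python
import re
import emoji  # pip install emoji
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Illustrative subset of a chat-word dictionary.
CHAT_WORDS = {"u": "you", "gr8": "great", "idk": "i do not know"}

def clean(text: str) -> str:
    text = emoji.demojize(text)  # e.g. a smiley glyph -> ":smiling_face:"
    words = [CHAT_WORDS.get(w, w) for w in text.lower().split()]
    return re.sub(r"[^a-z\s_:]", "", " ".join(words))  # keep demojized tokens intact

texts = ["im feeling gr8 today", "u make me so angry"]  # placeholder rows from text.csv
cleaned = [clean(t) for t in texts]

tokenizer = Tokenizer(num_words=20000, oov_token="<OOV>")
tokenizer.fit_on_texts(cleaned)
X = pad_sequences(tokenizer.texts_to_sequences(cleaned), maxlen=100)  # uniform-length input
```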
Repository structure:

```
.
├── cnn.ipynb    # CNN-based text classification
├── lstm.ipynb   # LSTM-based approach
├── BiGRU.ipynb  # Bidirectional GRU
├── Bert.ipynb   # BERT-based model
├── text.csv     # Kaggle Emotion dataset
└── README.md    # Project documentation
```
- Clone this repository:

  ```bash
  git clone https://github.com/YourUsername/Text-Emotion-Classification-NLP.git
  cd Text-Emotion-Classification-NLP
  ```
- Obtain and place the Kaggle Emotion dataset:
  - Download Nidula Elgiriyewithana's Emotion dataset from Kaggle and save it as `text.csv` in the project root.
- Launch Jupyter Notebook (or JupyterLab):

  ```bash
  jupyter notebook
  ```
- Open any of the notebooks (e.g., `cnn.ipynb`, `lstm.ipynb`):
  - Run the cells in order to load data, build the model, and evaluate results.
  - Each notebook details its own architecture and training routine.
- Compare results:
  - Look at accuracy, F1-scores, and confusion matrices to see how CNN, LSTM, BiGRU, and BERT differ in performance.
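For example, given a trained Keras `model` and a held-out test split (`X_test` and `y_test` are placeholders), scikit-learn produces the per-class metrics. The emotion order below assumes label ids 0–5 map to the list at the top of this README; verify it against the notebook's encoding:

```python
from sklearn.metrics import classification_report, confusion_matrix

EMOTIONS = ["sadness", "joy", "love", "anger", "fear", "surprise"]  # assumed id order

y_pred = model.predict(X_test).argmax(axis=1)  # most probable class per sample
print(classification_report(y_test, y_pred, target_names=EMOTIONS))
print(confusion_matrix(y_test, y_pred))
```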
Summary of results:
- CNN: about 90% accuracy; excels at extracting local features from text.
- LSTM: captures long-term context; accuracy in the low-to-mid 90s.
- BiGRU: often surpasses the LSTM by capturing bidirectional context; ~94% accuracy.
- BERT: typically the best performer (~97% accuracy), though the most computationally intensive.
This project is licensed under the MIT License.