This project focuses on sentiment analysis, specifically three-class sentiment classification of Twitter data.
| Word cloud for negative Tweets | Word cloud for positive Tweets |
|---|---|
| ![]() | ![]() |
- Data Analysis and Preprocessing: Extensive data analysis was carried out, covering data distribution analysis, data cleaning, preprocessing, and feature extraction. This phase also included stopword removal and text normalization to enhance the quality of the dataset.
- Text Vectorization Techniques: Various text vectorization techniques were explored, including TF-IDF, Bag of Words, Word2Vec, and BERT embeddings. Additionally, one-hot encoding was applied to categorical features derived from dates and times, ensuring comprehensive data representation.
- Machine Learning Models: The project involved an in-depth exploration of several machine learning models, including Support Vector Machine (SVM), Naive Bayes, Decision Tree, Extreme Gradient Boosting (XGBoost), and a Neural Network based on BERT embeddings. Each model was applied to sentiment classification.
- Hyperparameter Optimization: Hyperparameters were optimized for each classifier using grid search with 10-fold cross-validation, ensuring that the models achieved their best performance.
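The grid search over a TF-IDF + classifier pipeline described above can be sketched with scikit-learn. This is a minimal illustration, not the project's actual code: the toy corpus and the hyperparameter grid are invented, and 3-fold CV stands in for the 10-fold setup (the toy dataset is too small for 10 folds).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy three-class corpus (hypothetical; the real project uses Twitter data).
texts = [
    "i love this", "great day", "so happy",
    "this is awful", "worst ever", "so sad",
    "it is a phone", "the sky is blue", "meeting at noon",
]
labels = [2, 2, 2, 0, 0, 0, 1, 1, 1]  # 0=negative, 1=neutral, 2=positive

# Vectorizer and classifier chained so that CV folds are vectorized correctly.
pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LinearSVC()),
])

# Exhaustive search over a small illustrative grid, scored with macro-F1.
grid = GridSearchCV(
    pipe,
    param_grid={
        "clf__C": [0.1, 1, 10],
        "tfidf__ngram_range": [(1, 1), (1, 2)],
    },
    cv=3,  # the project used 10-fold cross-validation
    scoring="f1_macro",
)
grid.fit(texts, labels)
print(grid.best_params_)
print(grid.predict(["what a great phone"]))
```

Wrapping the vectorizer inside the pipeline matters: it ensures the TF-IDF vocabulary is refit on each training fold, so no information leaks from the validation folds into the features.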
- Performing three-class sentiment analysis on Twitter data, handling data cleaning, preprocessing, and feature extraction.
- Investigating and implementing diverse text vectorization techniques for a comprehensive understanding of text data.
- Exploring and applying multiple machine learning models for sentiment classification, including SVM, Naive Bayes, Decision Tree, XGBoost, and a Neural Network based on BERT embeddings.
- Optimizing hyperparameters for each classifier through grid search and cross-validation, ensuring that the models were finely tuned for their classification tasks.
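The data cleaning and preprocessing mentioned above can be sketched as a small normalization function. This is an assumed implementation for illustration: the regular expressions and the tiny stopword list are placeholders (a real pipeline would use a full stopword list such as NLTK's).

```python
import re

# Small illustrative stopword list; the project presumably used a full one.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "and", "of", "in", "it"}

def clean_tweet(text: str) -> str:
    """Normalize a raw tweet: lowercase, drop URLs and mentions,
    strip punctuation, and remove stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = re.sub(r"#", " ", text)             # keep hashtag words, drop the symbol
    text = re.sub(r"[^a-z\s]", " ", text)      # drop punctuation, digits, emoji
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(tokens)

print(clean_tweet("@user The movie was GREAT!!! #awesome https://t.co/xyz"))
# → "movie was great awesome"
```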
The following charts summarize the main results.
| F1 score | Accuracy |
|---|---|
| ![]() | ![]() |
Please check the following link for the complete report.



