Game Review Analysis

A comparative analysis of Steam reviews for games with male vs. female protagonists, exploring sentiment, topics, and classification potential.

🔍 Overview

This project analyzes user reviews from Steam games featuring male and female protagonists to identify differences in sentiment, thematic content, and classification performance.
It seeks to explore whether reviews of games with female leads differ significantly from those with male leads, and what insights can be drawn about player reception and potential bias.

Investigates sentiment and topical content differences in game reviews based on protagonist gender.
Offers a machine learning pipeline to predict review polarity (positive or negative).
Aims to surface trends in language, tone, and content focus in gaming discourse.
This was both a learning exercise and an exploration of real-world representation and audience bias in gaming.

🛠️ Tech Stack

Python 3
spaCy (NLP preprocessing)
NLTK (sentiment analysis with VADER)
scikit-learn (ML models, LDA topic modeling)
imbalanced-learn (resampling techniques)
Seaborn / Matplotlib (visualizations)
Pandas / NumPy
Jupyter Notebooks

🚀 Features

Preprocesses user reviews using spaCy and stores metadata for easy manipulation
Performs sentiment analysis using NLTK’s VADER model
Runs 5-fold cross-validation with SVM, Naive Bayes, and Random Forest classifiers
Upsamples minority classes to handle extreme class imbalance
Visualizes sentiment polarity and classification confusion matrices
Extracts review topics using LDA and compares topic distribution by protagonist gender
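
The cross-validation and upsampling steps listed above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes TF-IDF features, a linear SVM, and plain random duplication via scikit-learn's `resample` (swapped in here for imbalanced-learn's resamplers). The helper names `upsample_minority` and `cross_validate_with_upsampling` and the toy reviews are hypothetical. Upsampling is applied only to each training fold, so duplicated reviews never leak into the held-out fold.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import LinearSVC
from sklearn.utils import resample


def upsample_minority(texts, labels, seed=0):
    """Keep every row and add random duplicates until all classes match the largest."""
    texts = np.asarray(texts, dtype=object)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = int(counts.max())
    out_x, out_y = [texts], [labels]
    for cls, count in zip(classes, counts):
        missing = target - int(count)
        if missing > 0:
            extra = resample(texts[labels == cls], replace=True,
                             n_samples=missing, random_state=seed)
            out_x.append(extra)
            out_y.append(np.full(missing, cls))
    return np.concatenate(out_x), np.concatenate(out_y)


def cross_validate_with_upsampling(texts, labels, n_splits=5, seed=0):
    """Return the confusion matrix summed over stratified folds (labels 0 and 1)."""
    texts = np.asarray(texts, dtype=object)
    labels = np.asarray(labels)
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    total = np.zeros((2, 2), dtype=int)
    for train_idx, test_idx in folds.split(texts, labels):
        # Balance the training fold only; the test fold keeps its real class ratio.
        x_train, y_train = upsample_minority(texts[train_idx], labels[train_idx], seed)
        vectorizer = TfidfVectorizer()
        clf = LinearSVC()
        clf.fit(vectorizer.fit_transform(x_train), y_train)
        predicted = clf.predict(vectorizer.transform(texts[test_idx]))
        total += confusion_matrix(labels[test_idx], predicted, labels=[0, 1])
    return total


# Hypothetical toy reviews: 40 positive (1) and 10 negative (0).
texts = [f"great fun game loved level {i}" for i in range(40)]
texts += [f"boring broken game refund level {i}" for i in range(10)]
labels = [1] * 40 + [0] * 10

matrix = cross_validate_with_upsampling(texts, labels)
print(matrix)  # rows: true class (0, 1); columns: predicted class (0, 1)
```

On real data, the summed confusion matrix gives per-class recall, which is more informative than raw accuracy when one class dominates.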

📁 Project Structure

/project-root
│
├── data/                     # Pickled preprocessed review bins
├── MLTable.png               # Summary results for ML models
├── README.md                 # This file
├── LICENSE                   # GPLv3 license
├── requirements.txt          # Python dependencies for the notebook
├── steamScrape.py            # Script to scrape data
├── analysis.html             # Rendered notebook output for easy viewing
└── analysis.ipynb            # Full analysis pipeline

📈 Results

SVM Accuracy: ~95% with upsampling
Statistical Test (two-sample Kolmogorov-Smirnov): Significant difference (p < 0.05) in polarity distributions between male- and female-protagonist reviews
Topic Modeling: Distinct topic distributions; female-lead games showed more discussion around "vibe"-oriented themes
Sentiment Distributions: Reviews of female-led games skewed slightly more positive

Note: Small and imbalanced datasets reduce generalizability. Genre is a confounding variable.
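
For readers unfamiliar with the KS test mentioned above, here is a small sketch of how two polarity distributions might be compared with SciPy's `ks_2samp`. The scores below are randomly generated placeholders standing in for VADER compound scores (which lie in [-1, 1]); they are not the project's data, and the variable names are assumptions.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Placeholder scores only: synthetic stand-ins for per-review VADER compound scores.
male_scores = np.clip(rng.normal(0.30, 0.40, size=500), -1, 1)
female_scores = np.clip(rng.normal(0.40, 0.40, size=300), -1, 1)

# Two-sample KS test: the statistic is the largest gap between the two empirical CDFs.
result = ks_2samp(male_scores, female_scores)
print(f"KS statistic = {result.statistic:.3f}, p-value = {result.pvalue:.4f}")

# Sanity check: a sample compared with itself has no gap between CDFs.
same = ks_2samp(male_scores, male_scores)
print(f"Self-comparison statistic = {same.statistic:.3f}")
```

A small p-value only indicates that the two distributions differ somewhere; it does not say in which direction or why, so confounders such as genre still need separate handling.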

🧠 What I Learned

How to handle small, imbalanced datasets with upsampling techniques
How to interpret and visualize sentiment polarity using lexicon-based models such as VADER
How to implement and evaluate multiple supervised classifiers on sparse review data
Challenges of topic modeling with LDA and the value of more modern transformer-based methods
The importance of dataset scope, control variables (e.g., genre), and reproducibility in social-NLP analysis
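
As a rough illustration of the LDA topic comparison described in this README, the sketch below fits scikit-learn's `LatentDirichletAllocation` on a tiny made-up corpus and averages the per-review topic mixture within each protagonist group. The reviews, group labels, and the choice of two topics are all hypothetical; the notebook's actual preprocessing and settings may differ.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical mini-corpus; real input would be the preprocessed Steam reviews.
reviews = [
    "combat feels tight and the boss fights are hard",
    "great weapons and difficult combat encounters",
    "boss design is brutal but the combat is fair",
    "hard fights and satisfying weapon upgrades",
    "beautiful atmosphere and relaxing music",
    "cozy story with a lovely soundtrack and art",
    "the music and art create a calm atmosphere",
    "lovely story and relaxing exploration",
]
groups = np.array(["male"] * 4 + ["female"] * 4)

# Bag-of-words counts are the usual input for LDA.
counts = CountVectorizer(stop_words="english").fit_transform(reviews)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # one row per review; each row sums to 1

# Average topic mixture per group gives a simple distribution to compare.
group_means = {
    group: doc_topics[groups == group].mean(axis=0)
    for group in ("male", "female")
}
for group, mix in group_means.items():
    print(group, np.round(mix, 3))
```

With a corpus this small the topics are not meaningful; the point is only the shape of the comparison, where each group ends up with one averaged topic distribution.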

📦 Installation & Usage

git clone https://github.com/yourname/gendered-game-review-analysis.git
cd gendered-game-review-analysis
pip install -r requirements.txt
jupyter notebook analysis.ipynb

NathanielH-snek/SteamReviewNLP
