Analyzing Mental Health Posts on Social Media: A Deep Learning Approach to Reddit Post Classification

MINDSCANNER

This project aims to predict a user’s potential mental health condition from the sentiment and linguistic cues found in Reddit posts. Social media platforms, particularly Reddit, are widely used for expressing emotions and seeking support, often anonymously. Users frequently share personal experiences that may indicate an underlying mental health condition such as autism, anxiety, bipolar disorder, borderline personality disorder, depression, or schizophrenia.

To address this, we developed and evaluated deep learning models, including both models built from scratch and fine-tuned transformer-based models. These models were trained on labeled data from dedicated mental health subreddits and optimized to classify the specific condition associated with each post. All models were trained and evaluated on two versions of the original data. Version 1 is the cleaned data, which can be reproduced by running the code here. Version 2 applies the transformation to the cleaned data, and can be reproduced by running the code here.
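
The exact cleaning steps live in the linked code and are not listed in this README. As a rough illustration only, the sketch below shows what cleaning Reddit text for classification might look like. Every rule in it (lowercasing, dropping URLs and `r/` or `u/` mentions, stripping punctuation) and the function name `clean_post` are assumptions, not the project's actual pipeline.

```python
import html
import re

# All patterns below are illustrative assumptions, not the project's rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"\b[ur]/[A-Za-z0-9_-]+")  # r/subreddit or u/user
NON_TEXT_RE = re.compile(r"[^a-z0-9\s']")          # keep letters, digits, apostrophes
SPACE_RE = re.compile(r"\s+")


def clean_post(text: str) -> str:
    """Normalize one Reddit post into lowercase, whitespace-separated words."""
    text = html.unescape(text)        # "&amp;" -> "&"
    text = text.lower()
    text = URL_RE.sub(" ", text)      # drop links
    text = MENTION_RE.sub(" ", text)  # drop subreddit and user mentions
    text = NON_TEXT_RE.sub(" ", text) # drop remaining punctuation and symbols
    return SPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(clean_post("I can't sleep!! See https://example.com &amp; r/Anxiety"))
```

Removing subreddit mentions is a deliberate choice in this sketch: a post that names its own subreddit would otherwise leak the label to the classifier.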

Our work contributes to the field of public mental health by leveraging user-generated content to identify potential mental health conditions. Beyond academic value, this project has real-world implications for improving online mental health support. By accurately classifying Reddit posts according to specific mental health conditions, our models can help guide users toward the most appropriate subreddit communities, where they are more likely to receive relevant support, shared experiences, and information. This can enhance early detection, reduce miscommunication, and improve users’ chances of accessing the right resources or treatment pathways within these online communities.

In total, four models were built from scratch and two transformer models were fine-tuned, then compared to identify the most effective architecture for this task. Our best-performing model, a fine-tuned BERT model, achieved an accuracy of 87.3% on the test set, demonstrating that fine-tuned models can, with minimal code, outperform models built from scratch.
Explore the docs »

Report Bug · Request Feature

Table of Contents

About The Project
Getting Started With Georgios Ioannou Code
- Prerequisites
- Setup
Getting Started With Zechen Yang Code
- Prerequisites
- Setup
Usage
Code With Plotly Graphs
Paper
Contributing
License
Contact

About The Project

Tasks

Tasks
Create GitHub Repository
Brainstorm Project
Find Dataset
EDA
Data Preprocessing and Cleaning
Data Modeling: BiLSTM, CNN+BiLSTM, CNN, BiGRU
Fine-Tune: BERT, MISTRAL-7B
Model Evaluation
Write Paper

Performance Metrics on Version 1 of Data

Model	Accuracy (%)	Precision (%)	Recall (%)	F1 (%)
BiLSTM	85.3	85.3	85.3	85.3
CNN + BiLSTM	84.7	84.7	84.7	84.7
CNN	84.1	84.1	84.1	84.1
BiGRU	85.2	85.2	85.2	85.2
BERT	86.4	86.4	86.4	86.1

Performance Metrics on Version 2 of Data

Model	Accuracy (%)	Precision (%)	Recall (%)	F1 (%)
BiLSTM	84.5	84.5	84.5	84.5
CNN + BiLSTM	83.8	83.8	83.8	83.8
CNN	83.3	83.3	83.3	83.3
BiGRU	84.6	84.6	84.6	84.6
BERT	83	81	79	80

Performance Metrics on Version 2 of Data With Balanced Classes

Model	Accuracy (%)	Precision (%)	Recall (%)	F1 (%)
BERT	87.3	87.5	87.1	87.3
Mistral-7B	69.9	76.6	69.5	70.8
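
The four metrics reported in the tables above can be computed as in the minimal sketch below. The README does not state which averaging scheme was used across the six classes, so both micro and macro averaging are shown. The function name `classification_metrics` and the toy labels are illustrative only.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, average="macro"):
    """Return accuracy, precision, recall and F1 for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def prf(tp_, fp_, fn_):
        precision = tp_ / (tp_ + fp_) if tp_ + fp_ else 0.0
        recall = tp_ / (tp_ + fn_) if tp_ + fn_ else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        return precision, recall, f1

    if average == "micro":
        # Pool the counts over all classes, then compute once.
        p, r, f = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    elif average == "macro":
        # Compute per class, then take the unweighted mean.
        per_class = [prf(tp[c], fp[c], fn[c]) for c in labels]
        p, r, f = (sum(col) / len(labels) for col in zip(*per_class))
    else:
        raise ValueError("average must be 'micro' or 'macro'")

    accuracy = sum(tp.values()) / len(y_true)
    return {"accuracy": accuracy, "precision": p, "recall": r, "f1": f}


if __name__ == "__main__":
    # Toy labels, not project data.
    y_true = ["anxiety", "anxiety", "depression", "depression", "bipolar", "bipolar"]
    y_pred = ["anxiety", "depression", "depression", "depression", "bipolar", "anxiety"]
    print(classification_metrics(y_true, y_pred, average="micro"))
    print(classification_metrics(y_true, y_pred, average="macro"))
```

In single-label multiclass classification, micro-averaged precision, recall, and F1 are all equal to accuracy. This is one possible explanation for rows in which all four values match. Macro averaging weights every class equally, which matters when the classes are imbalanced.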

License

GeorgiosIoannouCoder/mindscanner
