Banking Fraud Detection Project

This is a complete, end-to-end fraud detection project using real-world transactional data. It simulates a practical data analyst workflow in the banking domain, involving Python, SQL, Tableau, and a simple machine learning model. The goal is to detect fraudulent transactions, build insights for decision-making, and demonstrate resume-ready data skills.

Dataset Used

Source: Kaggle Credit Card Fraud Dataset
File: creditcard.csv
Duration: 2 days of anonymized transaction data
Features: 30 columns (V1–V28, Amount, Time), with Class as the fraud label (1 = fraud, 0 = non-fraud)

Project Structure & Workflow

1. `clean_data.py` – Data Cleaning & EDA

Loads and inspects the raw dataset
Checks for duplicates and missing values
Visualizes class imbalance and amount distributions
Outputs: clean_creditcard.csv
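
The checks above could be sketched roughly as follows. This is a minimal illustration using pandas, not the actual contents of `clean_data.py`; the function names `summarize` and `clean_transactions` are hypothetical, and dropping rows with missing values is an assumed cleaning policy.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Report row count, duplicate rows, missing cells, and the fraud share."""
    return {
        "rows": len(df),
        "duplicates": int(df.duplicated().sum()),
        "missing": int(df.isna().sum().sum()),
        # Class is 1 for fraud and 0 for non-fraud, so the mean is the fraud rate.
        "fraud_share": float(df["Class"].mean()),
    }


def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then rows with any missing value (assumed policy)."""
    return df.drop_duplicates().dropna().reset_index(drop=True)
```

With the real data, something like `clean_transactions(pd.read_csv("creditcard.csv")).to_csv("clean_creditcard.csv", index=False)` would produce the cleaned file.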

2. `sql_queries.py` – SQL Integration

Loads the cleaned CSV into an in-memory SQLite database
Runs SQL queries:
- Total fraud vs non-fraud counts
- Average transaction amount by class
Outputs: sql_avg_amount.csv for use in Tableau
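
As a rough sketch of this step, the two queries might look like the following with Python's built-in `sqlite3` module. The table name `transactions` and the helper function names are assumptions made for illustration; the actual `sql_queries.py` may differ.

```python
import sqlite3


def load_transactions(rows):
    """Create an in-memory SQLite table from (Amount, Class) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (Amount REAL, Class INTEGER)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    conn.commit()
    return conn


def count_by_class(conn):
    """Total fraud vs non-fraud counts, as (Class, count) tuples."""
    query = "SELECT Class, COUNT(*) FROM transactions GROUP BY Class ORDER BY Class"
    return conn.execute(query).fetchall()


def avg_amount_by_class(conn):
    """Average transaction amount per class, as (Class, average) tuples."""
    query = "SELECT Class, AVG(Amount) FROM transactions GROUP BY Class ORDER BY Class"
    return conn.execute(query).fetchall()
```

In the real workflow the rows would come from clean_creditcard.csv, and the result of the average query would be written to sql_avg_amount.csv for Tableau.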

3. `fraud_model.py` – Machine Learning

Trains a Logistic Regression model to predict fraud
Handles class imbalance using class_weight='balanced'
Scales features and splits data into training/testing
Evaluates using confusion matrix and classification report
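
A minimal scikit-learn sketch of the steps above is shown here. It is an illustration under stated assumptions rather than the code in `fraud_model.py`: the function name `train_and_evaluate`, the stratified split, the 25% test size, and the use of a pipeline (so the scaler is fitted on the training split only) are all choices made for this example.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_and_evaluate(X, y, test_size=0.25, seed=42):
    """Split, scale, fit a class-weighted logistic regression, and evaluate."""
    # Stratify so the rare fraud class appears in both splits (assumed choice).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    model = make_pipeline(
        StandardScaler(),
        # class_weight="balanced" reweights classes inversely to their frequency.
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return model, confusion_matrix(y_test, y_pred), classification_report(y_test, y_pred)
```

With the cleaned data, `X` would be every column except Class and `y` the Class column. In the returned confusion matrix, rows are true labels and columns are predicted labels, so the bottom-left cell counts missed fraud.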

4. Tableau Dashboard

Visualizes key fraud insights:
- Bar chart: Fraud vs Non-Fraud Count
- Line chart: Transactions Over Time
- KPI Cards: SQL-derived Average Amounts
- Filters: Transaction Amount Range, Class (Fraud vs Non-Fraud)

Step 1: Clone or download the repo

cd fraud_transaction_model/

Step 2: Install dependencies

pip install -r requirements.txt

Step 3: Run everything

python run_all.py

TABLEAU

Worksheet 1

Show how many transactions were fraud vs non-fraud.

Worksheet 2

Show how transactions, especially fraud, evolve over time. The chart plots fraud and non-fraud transaction volume on the same time axis. Fraud is rare (the blue line hugs the X-axis), but its timing can still be analyzed for patterns.
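
The same over-time view can be approximated in Python before building the worksheet. The sketch below assumes the Time column holds seconds elapsed since the first transaction, as in the Kaggle dataset, and buckets it into hours; the function name `hourly_counts` and the hourly granularity are illustrative choices, not part of the project.

```python
import pandas as pd


def hourly_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Count transactions per hour bucket, with one column per Class value."""
    # Integer-divide seconds by 3600 to get an hour index starting at 0.
    hours = (df["Time"] // 3600).astype(int).rename("Hour")
    counts = df.groupby([hours, "Class"]).size()
    # Hours with no transactions of a given class are filled with 0.
    return counts.unstack(fill_value=0)
```

Each row of the result is one hour and each column one class, which maps directly onto the two lines of the chart.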

Worksheet 3

Show the SQL-calculated average transaction amount.

Python Files

clean_data.py Cleans up the CSV dataset.
sql_queries.py Loads the dataset into an in-memory SQLite database and runs queries against it.
fraud_model.py A simple logistic regression model that classifies transactions as fraud or non-fraud. It compensates for the class imbalance with class weighting and achieves strong recall, which is critical for minimizing undetected fraud.
