This project develops a fraud detection model from historical transaction data. The goal is to predict whether a transaction is fraudulent, which makes this a binary classification problem.
The dataset is highly imbalanced, with approximately 97.14% of transactions labeled as non-fraudulent and 2.86% labeled as fraudulent.
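To make the imbalance concrete: a model that labels every transaction as non-fraudulent would already be about 97.14% accurate while catching no fraud at all, so accuracy on its own says little here. The sketch below works through that arithmetic and also derives a negative-to-positive ratio of the kind commonly passed to XGBoost as `scale_pos_weight`. Whether this project applied any such weighting is not stated, so treat that part as an assumption; the function names are hypothetical.

```python
# Class shares taken from the dataset description above.
NON_FRAUD_SHARE = 0.9714
FRAUD_SHARE = 0.0286


def majority_baseline_accuracy(non_fraud_share: float) -> float:
    """Accuracy of a trivial model that predicts 'non-fraud' for every row."""
    return non_fraud_share


def negative_to_positive_ratio(non_fraud_share: float, fraud_share: float) -> float:
    """Ratio of negative to positive examples.

    Assumption: a value like this could be used as XGBoost's
    scale_pos_weight; the source does not say it was.
    """
    return non_fraud_share / fraud_share


baseline = majority_baseline_accuracy(NON_FRAUD_SHARE)
ratio = negative_to_positive_ratio(NON_FRAUD_SHARE, FRAUD_SHARE)

print(f"Baseline accuracy of always predicting non-fraud: {baseline:.2%}")
print(f"Negative-to-positive ratio: {ratio:.2f}")
```

Any candidate model should therefore be judged against this trivial baseline, not against 50%.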
The XGBoost classifier emerged as the top-performing algorithm, achieving high accuracy, precision, recall, and F1 score on both the training and test sets.
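As a reminder of how the four reported metrics relate, here is a minimal sketch that computes them from confusion-matrix counts. The counts are invented purely for illustration (10,000 transactions, 286 of them fraudulent, to match the 2.86% share) and are not results from this project; `ConfusionCounts` and `classification_metrics` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # fraud correctly flagged
    fp: int  # non-fraud incorrectly flagged
    fn: int  # fraud that was missed
    tn: int  # non-fraud correctly passed


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, precision, recall and F1 for the fraud class."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only, NOT the project's actual results.
example = ConfusionCounts(tp=200, fp=50, fn=86, tn=9664)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this made-up example accuracy is far higher than recall, which shows why precision, recall, and F1 are the more informative figures on an imbalanced dataset.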
Data Collection:
Data Cleaning and Preprocessing:
Exploratory Data Analysis (EDA):
Feature Engineering:
Model Selection and Training:
Model Evaluation:
Deployment and Prediction:
Insights Generation:
Temporal Patterns: Transaction month is crucial for detecting fraudulent activity.
Behavioral Characteristics: Features like 'IsOldDevice' and 'webSessWebBrowser' offer insights into fraudster behavior.
Transaction Details: Attributes such as 'V6CF' and 'V3CF' play a critical role in fraud prediction.
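The temporal insight can be illustrated with a small sketch that derives a transaction-month feature from a timestamp and compares fraud rates per month. The records, the field names `timestamp` and `is_fraud`, and both helper functions are hypothetical; the source does not describe the dataset's actual schema or how the month feature was built.

```python
from collections import defaultdict
from datetime import datetime

# Made-up records for illustration; field names are assumptions.
transactions = [
    {"timestamp": "2023-01-15T10:30:00", "is_fraud": 0},
    {"timestamp": "2023-01-20T23:05:00", "is_fraud": 1},
    {"timestamp": "2023-02-03T08:12:00", "is_fraud": 0},
    {"timestamp": "2023-02-11T14:45:00", "is_fraud": 0},
]


def transaction_month(timestamp: str) -> int:
    """Extract the calendar month (1-12) from an ISO 8601 timestamp."""
    return datetime.fromisoformat(timestamp).month


def fraud_rate_by_month(rows: list) -> dict:
    """Return the share of fraudulent transactions for each month seen."""
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for row in rows:
        month = transaction_month(row["timestamp"])
        totals[month] += 1
        frauds[month] += row["is_fraud"]
    return {month: frauds[month] / totals[month] for month in sorted(totals)}


rates = fraud_rate_by_month(transactions)
for month, rate in rates.items():
    print(f"month {month}: fraud rate {rate:.2f}")
```

A per-month breakdown like this is one simple way to check whether a month feature carries signal before handing it to the model.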