This project simulates a real-world marketing experiment in which a machine learning model personalizes call-to-action (CTA) messages and an A/B test validates whether the personalization is effective.
A synthetic dataset of 10,000 users with the following columns:
- `device_type`, `browser`, `traffic_source`
- `scroll_depth`, `time_on_page`
- `signed_up` (target variable)
- A Random Forest classifier predicts each user's probability of signing up.
- Categorical variables are one-hot encoded.
- The model is evaluated using classification metrics and ROC-AUC.
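The modeling steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of `ml_ab_test.py`: the category values, the smaller sample size, the signal used to generate `signed_up`, and the hyperparameters are all assumptions made so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(42)
n = 2_000  # smaller than the 10,000-user dataset, to keep the sketch fast

# Hypothetical category values; the real ones live in user_sessions.csv.
df = pd.DataFrame({
    "device_type": rng.choice(["mobile", "desktop", "tablet"], size=n),
    "browser": rng.choice(["chrome", "safari", "firefox"], size=n),
    "traffic_source": rng.choice(["organic", "ads", "email"], size=n),
    "scroll_depth": rng.uniform(0, 100, size=n),
    "time_on_page": rng.exponential(60, size=n),
})

# Assumed signal: deeper scrolling and longer visits raise signup odds.
logit = -3 + 0.03 * df["scroll_depth"].to_numpy() + 0.01 * df["time_on_page"].to_numpy()
df["signed_up"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

categorical = ["device_type", "browser", "traffic_source"]
numeric = ["scroll_depth", "time_on_page"]

# One-hot encode the categorical columns, pass numeric columns through unchanged.
model = Pipeline([
    ("encode", ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical)],
        remainder="passthrough",
    )),
    ("forest", RandomForestClassifier(n_estimators=100, random_state=42)),
])

X_train, X_test, y_train, y_test = train_test_split(
    df[categorical + numeric], df["signed_up"],
    test_size=0.2, stratify=df["signed_up"], random_state=42,
)
model.fit(X_train, y_train)

# Probability of the positive class (signed_up == 1) for each held-out user.
signup_proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, signup_proba)

print(classification_report(y_test, model.predict(X_test)))
print(f"ROC-AUC: {auc:.3f}")
```

Wrapping the encoder and the classifier in a single `Pipeline` keeps the encoding fitted on training data only, which avoids leaking information from the held-out users.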
- If the model predicts a probability > 0.5 → “Start Now & Save 30%”
- Else → “Learn More About Us”
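The CTA rule above amounts to a single threshold check. A minimal sketch, where the function name `choose_cta` and the constant names are hypothetical and only the two messages and the 0.5 cutoff come from the description:

```python
HIGH_INTENT_CTA = "Start Now & Save 30%"
LOW_INTENT_CTA = "Learn More About Us"


def choose_cta(signup_probability: float, threshold: float = 0.5) -> str:
    """Pick the CTA for a user from the model's predicted signup probability."""
    if not 0.0 <= signup_probability <= 1.0:
        raise ValueError("signup_probability must be between 0 and 1")
    # Strictly greater than the threshold gets the discount offer.
    if signup_probability > threshold:
        return HIGH_INTENT_CTA
    return LOW_INTENT_CTA


for p in (0.82, 0.50, 0.17):
    print(f"{p:.2f} -> {choose_cta(p)}")
```

Note that a probability of exactly 0.5 falls into the "else" branch, since the rule uses a strict inequality.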
- Users are randomly split into two groups:
- A (Control): Fixed CTA.
- B (Test): Personalized CTA.
- Signup outcomes are simulated for each group.
- A two-proportion z-test checks whether the difference in conversion rates is statistically significant.
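The split and the significance test can be sketched as below. This version uses only the standard library so the arithmetic is visible; the conversion rates `RATE_A` and `RATE_B` are assumed values for the simulation, not results from the project. The `proportions_ztest` function in `statsmodels.stats.proportion` performs the same kind of test and is likely the more convenient choice in practice.

```python
import math
import random


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in proportions, using pooled variance."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


random.seed(42)
RATE_A, RATE_B = 0.10, 0.13  # assumed true signup rates, for illustration only

# Randomly assign each user to a group, then simulate a signup outcome.
groups = {"A": [], "B": []}
for _ in range(10_000):
    group = random.choice(["A", "B"])
    rate = RATE_A if group == "A" else RATE_B
    groups[group].append(random.random() < rate)

conv_a, n_a = sum(groups["A"]), len(groups["A"])
conv_b, n_b = sum(groups["B"]), len(groups["B"])
z, p_value = two_proportion_ztest(conv_a, n_a, conv_b, n_b)

print(f"A: {conv_a}/{n_a} = {conv_a / n_a:.3%}")
print(f"B: {conv_b}/{n_b} = {conv_b / n_b:.3%}")
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A p-value below the chosen significance level (0.05 is a common convention) would suggest the difference between the groups is unlikely to be due to chance alone.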
Bar chart showing conversion rates for both groups.
- Install dependencies:
  `pip install pandas numpy scikit-learn statsmodels matplotlib`
- Run the script:
  `python ml_ab_test.py`
- `ml_ab_test.py`: Main Python script
- `user_sessions.csv`: Sample dataset
- `README.md`: Project overview
Created by an AI/ML and data science enthusiast. Feel free to modify and use this project in your own portfolio.