Flexiana/ml-for-marketers-examples

Cross-Price Elasticity Estimation: Modern Methods Showcase

A demonstration of modern methods for estimating cross-price elasticities with Python econometric and machine learning libraries, run side by side on the same synthetic data.

Overview

This project implements working examples for cross-price elasticity estimation using:

  • EconML: Double ML, IV methods, Causal Forests, DR learners
  • PyBLP: BLP (Berry-Levinsohn-Pakes) random-coefficient logit demand models
  • LinearModels: High-dimensional panel regressions with fixed effects and IV/2SLS
  • Statsmodels: AIDS/QUAIDS demand systems
  • PyMC: Bayesian hierarchical elasticity models
  • Scikit-learn/XGBoost: DML pipelines with ML nuisance models

Installation

Recommended: Use a clean virtual environment to avoid dependency conflicts

# Create virtual environment
python -m venv elasticity_env

# Activate environment
source elasticity_env/bin/activate  # On macOS/Linux
# OR
elasticity_env\Scripts\activate     # On Windows

# Install packages
pip install -r requirements.txt

Quick activation (after setup):

# Use the provided script
source activate_env.sh

Quick Start

# Run all examples with comparison
python main_cross_price_elasticity.py

# Or run individual examples:
python data_preparation.py           # Generate synthetic data
python example_econml.py            # EconML causal ML methods
python example_pyblp.py             # PyBLP demand estimation
python example_linearmodels.py      # Panel data methods
python example_statsmodels_aids.py  # AIDS/QUAIDS systems
python example_pymc.py              # Bayesian hierarchical models
python example_sklearn_xgb_dml.py   # ML-based DML

Data

The project uses synthetic retail scanner data with:

  • Multiple products (substitutes and complements)
  • Multiple stores and markets
  • Time periods with seasonality
  • Price variations (promotions, regular prices)
  • Instrumental variables (cost shifters, competitor prices)
  • Consumer demographics for heterogeneous effects
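
As a rough illustration of the kind of data described above, the sketch below simulates a log-log demand equation for one focal product and recovers the elasticities by least squares. It is not the repository's `data_preparation.py`; all variable names, sample sizes, and noise levels are assumptions made for this example, and only the elasticity values (-1.2, 0.4, -0.15) come from this README.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000  # hypothetical number of store-week observations

# Hypothetical log prices: focal product, a substitute, and a complement.
log_p_own = rng.normal(0.0, 0.2, n)
log_p_sub = rng.normal(0.0, 0.2, n)
log_p_comp = rng.normal(0.0, 0.2, n)

# Simple yearly seasonality term (assumed functional form).
week = rng.integers(0, 52, n)
season = 0.3 * np.sin(2 * np.pi * week / 52)

# Log-log demand: the coefficients on log prices are the elasticities.
log_q = (
    2.0
    - 1.2 * log_p_own      # own-price elasticity
    + 0.4 * log_p_sub      # substitute cross-price elasticity
    - 0.15 * log_p_comp    # complement cross-price elasticity
    + season
    + rng.normal(0.0, 0.1, n)
)

# Prices are exogenous here by construction, so plain OLS is enough.
X = np.column_stack([np.ones(n), log_p_own, log_p_sub, log_p_comp, season])
coef, *_ = np.linalg.lstsq(X, log_q, rcond=None)
own, cross_sub, cross_comp = coef[1:4]

print(f"own: {own:.3f}, substitute: {cross_sub:.3f}, complement: {cross_comp:.3f}")
```

Because prices in this toy example are drawn independently of the demand noise, OLS is unbiased. Real scanner data, and data with simulated endogeneity, need the IV or causal ML approaches listed under Methods Demonstrated.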

Methods Demonstrated

1. EconML (Causal Machine Learning)

  • Double ML (DML): Flexible control for confounders using ML
  • IV Methods: Handling endogenous prices with ML first stages
  • Causal Forests: Heterogeneous treatment effects
  • DR Learners: Doubly robust estimation
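
The core idea behind Double ML can be sketched without EconML: predict both price and quantity from the confounders using out-of-fold fits (cross-fitting), then regress the quantity residuals on the price residuals. The sketch below uses NumPy only, with polynomial least squares standing in for the ML nuisance learners; the data-generating process, the helper names `poly_features` and `cross_fit_residuals`, and all constants are assumptions for illustration, not the repository's code.

```python
import numpy as np

def poly_features(w, degree=5):
    """Polynomial basis in the confounder (stand-in for an ML learner)."""
    return np.column_stack([w**d for d in range(degree + 1)])

def cross_fit_residuals(target, w, n_folds=5, seed=0):
    """Out-of-fold residuals of `target` after predicting it from `w`."""
    rng = np.random.default_rng(seed)
    folds = rng.permutation(len(w)) % n_folds
    resid = np.empty_like(target)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        coef, *_ = np.linalg.lstsq(poly_features(w[train]), target[train], rcond=None)
        resid[test] = target[test] - poly_features(w[test]) @ coef
    return resid

rng = np.random.default_rng(1)
n = 4_000
w = rng.uniform(-2, 2, n)  # observed confounder (e.g. a demand index)

# The confounder moves both price and quantity in a non-linear way.
log_p = 0.5 * np.sin(w) + rng.normal(0, 0.2, n)
log_q = -1.2 * log_p + np.cos(w) + 0.8 * np.sin(w) + rng.normal(0, 0.1, n)

# Residual-on-residual regression (partialling out).
res_p = cross_fit_residuals(log_p, w)
res_q = cross_fit_residuals(log_q, w)
theta_dml = (res_p @ res_q) / (res_p @ res_p)

# Naive slope that ignores the confounder, for comparison.
theta_naive = np.polyfit(log_p, log_q, 1)[0]

print(f"naive: {theta_naive:.3f}, cross-fitted: {theta_dml:.3f}")
```

In this simulation the naive slope is pulled far from the true elasticity of -1.2, while the cross-fitted estimate is close to it. Replacing the polynomial fit with gradient boosting or random forests gives the ML-based variant.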

2. PyBLP (Structural Demand)

  • Random Coefficients Logit: Flexible substitution patterns
  • Supply Side: Joint demand-supply estimation
  • Demographics: Consumer heterogeneity
  • Nested Logit: Category-based substitution
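
To see why random coefficients matter, it helps to look at the elasticities implied by a plain logit model, where the cross-price elasticity with respect to product k is the same for every other product. The sketch below computes the closed-form logit elasticity matrix with NumPy; it does not use PyBLP, and the price coefficient, prices, and quality values are hypothetical.

```python
import numpy as np

def logit_shares(delta):
    """Market shares from mean utilities, with an outside good of utility 0."""
    expd = np.exp(delta)
    return expd / (1.0 + expd.sum())

def logit_elasticities(alpha, prices, shares):
    """Elasticity matrix E[j, k] = d ln s_j / d ln p_k for plain logit.

    Own-price:   alpha * p_j * (1 - s_j)
    Cross-price: -alpha * p_k * s_k   (identical for every j != k)
    """
    J = len(prices)
    E = -alpha * np.tile(prices * shares, (J, 1))
    E[np.diag_indices(J)] = alpha * prices * (1.0 - shares)
    return E

alpha = -2.0                          # hypothetical price coefficient
prices = np.array([1.0, 1.5, 2.0])    # hypothetical prices
quality = np.array([1.0, 2.0, 3.0])   # hypothetical non-price utility

shares = logit_shares(quality + alpha * prices)
E = logit_elasticities(alpha, prices, shares)
print(np.round(E, 3))
```

Each column of `E` has identical off-diagonal entries, which is the restrictive substitution pattern that random-coefficient and nested logit models relax.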

3. LinearModels (Panel Methods)

  • Fixed Effects: Entity and time fixed effects
  • 2SLS/IV: Instrumental variables for panels
  • Dynamic Panels: Lagged dependent variables
  • Heterogeneous Effects: Varying elasticities by group
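
The 2SLS logic can be illustrated by hand: project the endogenous price on a cost shifter, then regress quantity on the fitted price. This sketch uses NumPy rather than LinearModels, and the simulated data, coefficients, and the helper `ols` are assumptions for the example. The manual second stage gives the correct point estimate, but its standard errors would be wrong, which is one reason to use a dedicated IV estimator in practice.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(2)
n = 20_000

cost = rng.normal(0, 1, n)          # cost shifter: affects price, not demand
demand_shock = rng.normal(0, 1, n)  # unobserved: affects price and quantity

# Price responds to the demand shock, which makes it endogenous.
log_p = 0.5 * cost + 0.3 * demand_shock + rng.normal(0, 0.1, n)
log_q = 1.0 - 1.2 * log_p + demand_shock

const = np.ones(n)

# OLS is biased because log_p is correlated with the demand shock.
beta_ols = ols(np.column_stack([const, log_p]), log_q)[1]

# First stage: fitted price from the instrument.
Z = np.column_stack([const, cost])
log_p_hat = Z @ ols(Z, log_p)

# Second stage: regress quantity on the fitted price.
beta_2sls = ols(np.column_stack([const, log_p_hat]), log_q)[1]

print(f"OLS: {beta_ols:.3f}, 2SLS: {beta_2sls:.3f}")
```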

4. Statsmodels (Demand Systems)

  • Linear AIDS: Almost Ideal Demand System
  • QUAIDS: Quadratic AIDS with flexible Engel curves
  • Restrictions: Homogeneity, symmetry, adding-up
  • Welfare Analysis: Consumer surplus calculations
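
Once a linear-approximate AIDS model is estimated, elasticities follow from the coefficients and budget shares. The sketch below applies the commonly used approximation formulas for the Stone-index version (uncompensated price elasticity e_ij = -δ_ij + γ_ij/w_i - β_i w_j/w_i, expenditure elasticity 1 + β_i/w_i). The parameter values are made up for illustration and chosen to satisfy adding-up, homogeneity, and symmetry; the function name is hypothetical and is not part of Statsmodels.

```python
import numpy as np

def la_aids_elasticities(gamma, beta, w):
    """Uncompensated price and expenditure elasticities for LA-AIDS.

    gamma: (n, n) price coefficients, beta: (n,) expenditure coefficients,
    w: (n,) budget shares at which elasticities are evaluated.
    """
    gamma = np.asarray(gamma, dtype=float)
    beta = np.asarray(beta, dtype=float)
    w = np.asarray(w, dtype=float)
    n = len(w)
    price = -np.eye(n) + gamma / w[:, None] - np.outer(beta / w, w)
    expenditure = 1.0 + beta / w
    return price, expenditure

# Hypothetical parameters: gamma symmetric with zero row sums, beta sums to zero.
gamma = [[0.05, -0.03, -0.02],
         [-0.03, 0.04, -0.01],
         [-0.02, -0.01, 0.03]]
beta = [0.02, -0.05, 0.03]
w = [0.5, 0.3, 0.2]

price_elast, exp_elast = la_aids_elasticities(gamma, beta, w)
print(np.round(price_elast, 3))
print(np.round(exp_elast, 3))
```

With the restrictions imposed, each row of price elasticities plus the expenditure elasticity sums to zero (homogeneity), and the share-weighted expenditure elasticities sum to one (Engel aggregation).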

5. PyMC (Bayesian Methods)

  • Hierarchical Models: Partial pooling across products
  • Cross-Price Effects: Full elasticity matrices
  • Varying Slopes: Market-specific elasticities
  • Time-Varying: Dynamic elasticity evolution
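
Partial pooling can be illustrated with the normal-normal case, where the posterior mean for each product is a precision-weighted average of its own estimate and the group mean. The sketch below assumes the group mean and standard deviation are known, which a full PyMC model would instead estimate jointly; the function name and all numbers are hypothetical.

```python
def partial_pool(estimates, std_errors, prior_mean, prior_sd):
    """Posterior means under a normal prior and normal likelihood.

    Noisy estimates (large standard error) are pulled harder toward
    the prior mean than precise ones.
    """
    pooled = []
    prior_precision = 1.0 / prior_sd**2
    for est, se in zip(estimates, std_errors):
        data_precision = 1.0 / se**2
        post_mean = (data_precision * est + prior_precision * prior_mean) / (
            data_precision + prior_precision
        )
        pooled.append(post_mean)
    return pooled

# Hypothetical per-product own-price elasticity estimates.
estimates = [-0.6, -1.1, -2.0]
std_errors = [0.5, 0.1, 0.5]

pooled = partial_pool(estimates, std_errors, prior_mean=-1.2, prior_sd=0.3)
for raw, shrunk in zip(estimates, pooled):
    print(f"raw {raw:+.2f} -> pooled {shrunk:+.2f}")
```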

6. ML-Based DML

  • XGBoost DML: Gradient boosting for nuisances
  • LightGBM: High-dimensional features
  • Ensemble Methods: Combining multiple ML models
  • Neural Networks: Deep learning for non-linearities

Key Features

Elasticity Types Estimated

  • Own-price elasticities: Response of quantity to own price
  • Cross-price elasticities: Response of quantity to the prices of other products (positive for substitutes, negative for complements)
  • Income elasticities: Response to consumer income
  • Promotional elasticities: Response of quantity to promotions and other marketing activity

Methodological Advances

  • Endogeneity handling: IV, control functions, panel methods
  • Heterogeneity: Random coefficients, hierarchical models
  • Non-linearities: ML methods, polynomial terms
  • Uncertainty quantification: Bayesian posteriors, bootstrap
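
For the bootstrap side of uncertainty quantification, a minimal sketch is a percentile interval for a log-log slope, obtained by resampling observations with replacement. The data here are simulated under assumed parameters, and the number of replications (500) is arbitrary; the repository's scripts may use a different scheme, such as a block or cluster bootstrap for panel data.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000

# Simulated log-log demand with a true own-price elasticity of -1.2.
log_p = rng.normal(0, 0.2, n)
log_q = 1.0 - 1.2 * log_p + rng.normal(0, 0.2, n)

def slope(x, y):
    """OLS slope of y on x."""
    return np.polyfit(x, y, 1)[0]

point = slope(log_p, log_q)

# Nonparametric bootstrap: resample rows with replacement and re-estimate.
n_boot = 500
boot = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, n)
    boot[b] = slope(log_p[idx], log_q[idx])

lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"elasticity {point:.3f}, 95% percentile interval [{lo:.3f}, {hi:.3f}]")
```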

Output Files

Each method produces:

  • Elasticity estimates with confidence/credible intervals
  • Model diagnostics and convergence checks
  • Visualization plots (saved as PNG)
  • Comparison tables (CSV format)

Main outputs:

  • data/: Generated datasets in various formats
  • *.png: Visualization plots for each method
  • elasticity_comparison.csv: Summary comparison table
  • summary_comparison.png: Overall comparison plot

Results Interpretation

True Values (from data generation)

  • Own-price elasticity: -1.2
  • Within-category cross-price: 0.4 (substitutes)
  • Complement cross-price: -0.15
  • Unrelated products: 0.02
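
To make these values concrete, the sketch below translates each elasticity into the quantity change implied by a 10% price increase, assuming constant-elasticity (log-log) demand. The function name and the 10% scenario are illustrative choices, not part of the repository.

```python
def quantity_change(elasticity: float, price_change: float) -> float:
    """Fractional change in quantity for a fractional price change,
    assuming constant-elasticity demand (Q proportional to P**elasticity)."""
    return (1.0 + price_change) ** elasticity - 1.0

TRUE_ELASTICITIES = {
    "own_price": -1.2,
    "substitute": 0.4,
    "complement": -0.15,
    "unrelated": 0.02,
}

# Effect on the focal product's quantity of a 10% increase in each price.
for name, eps in TRUE_ELASTICITIES.items():
    print(f"{name:>11}: {quantity_change(eps, 0.10):+.2%}")
```

For small price changes the result is close to elasticity times the percentage change; the exact power form matters more for large changes.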

Method Selection Guide

Choose based on your needs:

Scenario                    | Recommended Method
----------------------------|-------------------------
Causal inference focus      | EconML (DML, IV)
Rich substitution patterns  | PyBLP
Panel data                  | LinearModels
Complete demand system      | AIDS/QUAIDS
Uncertainty quantification  | PyMC
High-dimensional controls   | XGBoost/LightGBM DML
Limited assumptions         | ML-based methods

Performance Notes

  • Full execution takes ~10-15 minutes on a modern machine
  • PyBLP and PyMC examples are computationally intensive
  • Reduce data size or iterations for faster testing

Technical Requirements

  • Python 3.8+
  • 8GB RAM minimum (16GB recommended)
  • Multi-core processor for parallel sampling

Troubleshooting

Common issues and solutions:

  1. Memory errors: Reduce number of markets/products in examples
  2. Convergence warnings: Increase iterations or adjust priors
  3. Import errors: Ensure all packages are installed by running pip install -r requirements.txt inside the activated environment

Extensions

Possible extensions to explore:

  • Real data applications (scanner data, e-commerce)
  • Dynamic pricing strategies
  • Competition analysis
  • Demand forecasting
  • Optimal pricing algorithms

References

Key papers for each method:

  • DML: Chernozhukov et al. (2018) "Double/debiased machine learning for treatment and structural parameters"
  • BLP: Berry, Levinsohn & Pakes (1995) "Automobile prices in market equilibrium"
  • AIDS: Deaton & Muellbauer (1980) "An almost ideal demand system"
  • Causal Forests: Athey, Tibshirani & Wager (2019) "Generalized random forests"

License

MIT License - See LICENSE file for details

Contact

For questions or issues, please open a GitHub issue or contact the maintainers.


This showcase demonstrates modern econometric methods for elasticity estimation. The synthetic data and true parameters allow for method validation and comparison.
