A comprehensive demonstration of state-of-the-art methods for estimating cross-price elasticities using Python's leading econometric and machine learning libraries.
This project implements working examples for cross-price elasticity estimation using:
- EconML: Double ML, IV methods, Causal Forests, DR learners
- PyBLP: BLP (Berry-Levinsohn-Pakes) random-coefficient logit demand models
- LinearModels: High-dimensional panel regressions with fixed effects and IV/2SLS
- Statsmodels: AIDS/QUAIDS demand systems
- PyMC: Bayesian hierarchical elasticity models
- Scikit-learn/XGBoost: DML pipelines with ML nuisance models
Recommended: Use a clean virtual environment to avoid dependency conflicts
# Create virtual environment
python -m venv elasticity_env
# Activate environment
source elasticity_env/bin/activate # On macOS/Linux
# OR
elasticity_env\Scripts\activate # On Windows
# Install packages
pip install -r requirements.txt
Quick activation (after setup):
# Use the provided script
source activate_env.sh
# Run all examples with comparison
python main_cross_price_elasticity.py
# Or run individual examples:
python data_preparation.py # Generate synthetic data
python example_econml.py # EconML causal ML methods
python example_pyblp.py # PyBLP demand estimation
python example_linearmodels.py # Panel data methods
python example_statsmodels_aids.py # AIDS/QUAIDS systems
python example_pymc.py # Bayesian hierarchical models
python example_sklearn_xgb_dml.py # ML-based DML
The project uses synthetic retail scanner data with:
- Multiple products (substitutes and complements)
- Multiple stores and markets
- Time periods with seasonality
- Price variations (promotions, regular prices)
- Instrumental variables (cost shifters, competitor prices)
- Consumer demographics for heterogeneous effects
EconML:
- Double ML (DML): Flexible control for confounders using ML
- IV Methods: Handling endogenous prices with ML first stages
- Causal Forests: Heterogeneous treatment effects
- DR Learners: Doubly robust estimation
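The Double ML idea listed above can be sketched without any causal-ML library. This is a minimal illustration, not the code in `example_econml.py`: the data-generating process, the variable names, and the `fit_predict` helper are all hypothetical, and ordinary least squares stands in for the ML nuisance models that a real pipeline would use.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical synthetic data: X confounds both the rival's price and own quantity.
X = rng.normal(size=(n, 3))  # observed confounders (e.g. seasonality, store traits)
log_p_rival = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=n)
true_cross = 0.4  # assumed cross-price elasticity for this illustration
log_q = (true_cross * log_p_rival
         + X @ np.array([1.0, 0.5, -0.7])
         + rng.normal(scale=0.5, size=n))

def fit_predict(X_train, y_train, X_test):
    """Nuisance model: least squares with an intercept.
    Any ML regressor with fit/predict semantics could be swapped in here."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef

# Cross-fitting: residualize each fold with nuisance models fit on the other folds.
folds = np.array_split(rng.permutation(n), 5)
res_q = np.empty(n)
res_p = np.empty(n)
for test_idx in folds:
    train = np.ones(n, dtype=bool)
    train[test_idx] = False
    res_q[test_idx] = log_q[test_idx] - fit_predict(X[train], log_q[train], X[test_idx])
    res_p[test_idx] = log_p_rival[test_idx] - fit_predict(X[train], log_p_rival[train], X[test_idx])

# Final stage: regress quantity residuals on price residuals.
dml_cross = (res_p @ res_q) / (res_p @ res_p)
print(f"DML cross-price elasticity estimate: {dml_cross:.3f}")
```

The estimate is only as credible as the assumption that `X` captures all confounding; when prices are endogenous for other reasons, the IV variants listed above are the relevant tool.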
PyBLP:
- Random Coefficients Logit: Flexible substitution patterns
- Supply Side: Joint demand-supply estimation
- Demographics: Consumer heterogeneity
- Nested Logit: Category-based substitution
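To see why random coefficients are described as giving "flexible substitution patterns", it may help to look at the plain logit case they generalize. The sketch below does not use PyBLP; it assumes a simple utility of the form `xi_j + alpha * p_j` with an outside good, and all numbers are hypothetical. Under these assumptions the cross-price elasticity with respect to product k is the same for every other product, which is the restriction that random coefficients and nesting relax.

```python
import numpy as np

def logit_shares(delta):
    """Shares of the inside goods; the outside good has mean utility 0."""
    expd = np.exp(delta)
    return expd / (1.0 + expd.sum())

def logit_elasticities(alpha, prices, shares):
    """E[j, k] = d ln s_j / d ln p_k for plain logit with price coefficient alpha < 0."""
    n_products = len(prices)
    # Cross-price terms: -alpha * p_k * s_k, identical down each column k.
    E = -alpha * np.tile(prices * shares, (n_products, 1))
    # Own-price terms: alpha * p_j * (1 - s_j).
    E[np.diag_indices(n_products)] = alpha * prices * (1.0 - shares)
    return E

# Hypothetical inputs for three products.
alpha = -2.0
xi = np.array([1.0, 0.5, 0.2])      # product quality terms
prices = np.array([1.0, 0.8, 0.6])
shares = logit_shares(xi + alpha * prices)
E = logit_elasticities(alpha, prices, shares)
print(np.round(E, 3))
```

Each column of `E` has identical off-diagonal entries, so a price change for one product draws demand from all rivals in proportion to their shares, regardless of how similar the products are.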
LinearModels:
- Fixed Effects: Entity and time fixed effects
- 2SLS/IV: Instrumental variables for panels
- Dynamic Panels: Lagged dependent variables
- Heterogeneous Effects: Varying elasticities by group
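The role of entity fixed effects can be illustrated with the within transformation, which is what a fixed-effects panel estimator computes under the hood. This sketch uses NumPy only, not the LinearModels API, and the panel is hypothetical: an unobserved store effect raises both the rival's price and own quantity, so pooled OLS overstates the cross-price elasticity while the within estimator does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n_stores, n_periods = 200, 20
store = np.repeat(np.arange(n_stores), n_periods)  # store id for each observation

# Hypothetical panel: the store effect is correlated with the rival's price.
store_effect = rng.normal(size=n_stores)
log_p_rival = 0.8 * store_effect[store] + rng.normal(scale=0.5, size=store.size)
true_cross = 0.4  # assumed cross-price elasticity
log_q = (true_cross * log_p_rival
         + store_effect[store]
         + rng.normal(scale=0.3, size=store.size))

def demean_by_group(values, groups, n_groups):
    """Subtract each group's mean (the within transformation)."""
    sums = np.bincount(groups, weights=values, minlength=n_groups)
    counts = np.bincount(groups, minlength=n_groups)
    return values - (sums / counts)[groups]

def slope(x, y):
    """Slope from a univariate least-squares regression of y on x."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / (xc @ xc)

pooled = slope(log_p_rival, log_q)
within = slope(demean_by_group(log_p_rival, store, n_stores),
               demean_by_group(log_q, store, n_stores))
print(f"pooled OLS: {pooled:.3f}   within (store FE): {within:.3f}")
```

Fixed effects only remove time-invariant confounders; price changes that respond to period-specific demand shocks still call for the 2SLS/IV approach listed above.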
Statsmodels (AIDS/QUAIDS):
- Linear AIDS: Almost Ideal Demand System
- QUAIDS: Quadratic AIDS with flexible Engel curves
- Restrictions: Homogeneity, symmetry, adding-up
- Welfare Analysis: Consumer surplus calculations
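The link between AIDS coefficients, the theoretical restrictions, and the elasticity matrix can be made concrete with a small calculation. The sketch below uses the widely used linear-approximate (Stone index) elasticity formulas; whether `example_statsmodels_aids.py` uses exactly these formulas is not stated here, and the parameter values are hypothetical, chosen only so that adding-up, homogeneity, and symmetry hold.

```python
import numpy as np

# Hypothetical LA-AIDS parameters for three goods (illustrative values only).
w = np.array([0.5, 0.3, 0.2])           # budget shares, sum to 1
beta = np.array([0.05, -0.02, -0.03])   # expenditure coefficients, sum to 0 (adding-up)
gamma = np.array([                       # price coefficients: symmetric, rows sum to 0
    [0.10, -0.06, -0.04],
    [-0.06, 0.09, -0.03],
    [-0.04, -0.03, 0.07],
])

# Expenditure elasticities: eta_i = 1 + beta_i / w_i
expenditure_elast = 1.0 + beta / w

# Uncompensated (Marshallian) elasticities:
# e_ij = -delta_ij + gamma_ij / w_i - beta_i * w_j / w_i
marshallian = -np.eye(3) + gamma / w[:, None] - np.outer(beta / w, w)

# Compensated (Hicksian) elasticities via the Slutsky equation:
# h_ij = e_ij + eta_i * w_j
hicksian = marshallian + np.outer(expenditure_elast, w)

print("Expenditure elasticities:", np.round(expenditure_elast, 3))
print("Marshallian elasticity matrix:\n", np.round(marshallian, 3))
```

With the restrictions imposed, each row of `marshallian` plus the matching expenditure elasticity sums to zero (homogeneity), and the share-weighted expenditure elasticities sum to one (Engel aggregation), which makes a quick sanity check on estimated coefficients.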
PyMC:
- Hierarchical Models: Partial pooling across products
- Cross-Price Effects: Full elasticity matrices
- Varying Slopes: Market-specific elasticities
- Time-Varying: Dynamic elasticity evolution
Scikit-learn/XGBoost:
- XGBoost DML: Gradient boosting for nuisances
- LightGBM: High-dimensional features
- Ensemble Methods: Combining multiple ML models
- Neural Networks: Deep learning for non-linearities
Elasticity types:
- Own-price elasticities: Response of a product's quantity to its own price
- Cross-price elasticities: Response of a product's quantity to the prices of other products (substitutes or complements)
- Income elasticities: Response to consumer income
- Promotional elasticities: Impact of marketing
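For reference, the price and income elasticities listed above follow the standard textbook definitions, written here in terms of quantity $q_i$, price $p_j$, and income $m$ (the symbols are chosen for this note and are not taken from the project code):

```latex
\varepsilon_{ij} \;=\; \frac{\partial \ln q_i}{\partial \ln p_j}
               \;=\; \frac{\partial q_i}{\partial p_j}\,\frac{p_j}{q_i},
\qquad
\eta_i \;=\; \frac{\partial \ln q_i}{\partial \ln m}
```

The case $i = j$ is the own-price elasticity, normally negative. For $i \neq j$, a positive $\varepsilon_{ij}$ indicates substitutes and a negative value indicates complements.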
Methodological features:
- Endogeneity handling: IV, control functions, panel methods
- Heterogeneity: Random coefficients, hierarchical models
- Non-linearities: ML methods, polynomial terms
- Uncertainty quantification: Bayesian posteriors, bootstrap
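As an illustration of the bootstrap route to uncertainty quantification, the sketch below builds a percentile confidence interval for a log-log slope. The data are hypothetical and the resampling treats observations as independent; scanner panels would more plausibly need a store- or market-level cluster bootstrap, so treat this as the simplest possible version.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Hypothetical data with an assumed cross-price elasticity of 0.4.
log_p = rng.normal(size=n)
log_q = 0.4 * log_p + rng.normal(scale=0.5, size=n)

def slope(x, y):
    """Slope from a univariate least-squares regression of y on x."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / (xc @ xc)

point = slope(log_p, log_q)

# Pairs bootstrap: resample (price, quantity) rows with replacement.
n_boot = 1000
boot = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)
    boot[b] = slope(log_p[idx], log_q[idx])

ci_low, ci_high = np.percentile(boot, [2.5, 97.5])
print(f"elasticity: {point:.3f}, 95% bootstrap CI: [{ci_low:.3f}, {ci_high:.3f}]")
```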
Each method produces:
- Elasticity estimates with confidence/credible intervals
- Model diagnostics and convergence checks
- Visualization plots (saved as PNG)
- Comparison tables (CSV format)
Main outputs:
- `data/`: Generated datasets in various formats
- `*.png`: Visualization plots for each method
- `elasticity_comparison.csv`: Summary comparison table
- `summary_comparison.png`: Overall comparison plot
True parameters in the synthetic data (used for validation):
- Own-price elasticity: -1.2
- Within-category cross-price: 0.4 (substitutes)
- Complement cross-price: -0.15
- Unrelated products: 0.02
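A short worked calculation shows what these magnitudes mean in practice. It reads the values above as constant (log-log) elasticities, which is an assumption about the data-generating process, and the product labels A, B, and C are hypothetical.

```python
import numpy as np

# Elasticities of product A's quantity with respect to the prices of
# [A itself, B (a substitute), C (a complement)], taken from the list above.
elasticities_A = np.array([-1.2, 0.4, -0.15])

# Scenario: the substitute B raises its price by 10%; other prices are unchanged.
price_change = np.array([0.0, 0.10, 0.0])

# Exact change under constant elasticities: product of (1 + dp_j) ** e_j, minus 1.
exact = np.prod((1.0 + price_change) ** elasticities_A) - 1.0

# First-order approximation: elasticity times percentage price change.
approx = elasticities_A @ price_change

print(f"exact change in A's quantity:  {exact:.2%}")
print(f"first-order approximation:     {approx:.2%}")
```

The approximation gives 4%, while the exact constant-elasticity response is slightly smaller, about 3.9%; the gap grows with the size of the price change.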
Choose based on your needs:
Scenario | Recommended Method |
---|---|
Causal inference focus | EconML (DML, IV) |
Rich substitution patterns | PyBLP |
Panel data | LinearModels |
Complete demand system | AIDS/QUAIDS |
Uncertainty quantification | PyMC |
High-dimensional controls | XGBoost/LightGBM DML |
Limited assumptions | ML-based methods |
Performance notes:
- Full execution takes ~10-15 minutes on a modern machine
- PyBLP and PyMC examples are computationally intensive
- Reduce data size or iterations for faster testing
System requirements:
- Python 3.8+
- 8GB RAM minimum (16GB recommended)
- Multi-core processor for parallel sampling
Common issues and solutions:
- Memory errors: Reduce number of markets/products in examples
- Convergence warnings: Increase iterations or adjust priors
- Import errors: Ensure all packages are installed via `pip install -r requirements.txt`
Possible extensions to explore:
- Real data applications (scanner data, e-commerce)
- Dynamic pricing strategies
- Competition analysis
- Demand forecasting
- Optimal pricing algorithms
Key papers for each method:
- DML: Chernozhukov et al. (2018) "Double/debiased machine learning"
- BLP: Berry, Levinsohn & Pakes (1995) "Automobile prices in market equilibrium"
- AIDS: Deaton & Muellbauer (1980) "An almost ideal demand system"
- Causal Forests: Athey et al. (2019) "Generalized random forests"
MIT License - See LICENSE file for details
For questions or issues, please open a GitHub issue or contact the maintainers.
This showcase demonstrates modern econometric methods for elasticity estimation. The synthetic data and true parameters allow for method validation and comparison.