Reproducible data engineering template with:
- Podman Compose: Airflow + Postgres
- VS Code Dev Container (auto start/stop with `shutdownAction: stopCompose`)
- `uv` for Python package & project management
- `ruff` for linting/formatting (replaces black/isort)
- `sqlmodel` for Bronze tables (Pydantic + SQLAlchemy)
- `dbt-core` for Silver/Gold modeling

Prerequisites:
- Podman with the Docker API socket enabled (or Docker), on macOS, Linux, or WSL2.
- DevContainer CLI: `npm install -g @devcontainers/cli` (or VS Code + Dev Containers extension)
- Cookiecutter: `pipx install cookiecutter`
- (Optional) `pyenv` on host; `.python-version` is respected.

- Generate project from template (run from your projects directory):

      # Navigate to where you want the new project created
      cd ~/projects  # or wherever you keep projects

      # Generate from remote template
      cookiecutter https://github.com/Troubladore/data-eng-template

      # Or if you have it locally:
      cookiecutter .

  You'll be prompted to enter:
  - `project_name`: "My Awesome Data Project"
  - `repo_slug`: "my-awesome-data-project" (auto-generated from project name)
  - `python_version`: "3.12" (default)
  - `airflow_version`: "2.9.3" (default)
  - `airflow_executor`: Choose execution model
    - LocalExecutor (default): Runs tasks in parallel using separate processes
    - SequentialExecutor: Runs tasks one at a time (for testing/lightweight setups)
  - `license`: Choose project license
    - Proprietary (default): All rights reserved, no license granted
    - MIT: Permissive open source license
    - Apache-2.0: Permissive with patent protection
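
To illustrate how a `repo_slug` such as "my-awesome-data-project" can be derived from `project_name`, here is a minimal Python sketch. The template's actual derivation is not shown in this README, so the `slugify` function below is a hypothetical stand-in that only reproduces the example values above.

```python
import re


def slugify(project_name: str) -> str:
    """Lowercase the name and join its alphanumeric runs with hyphens.

    Hypothetical helper: an assumed equivalent of the template's slug rule,
    not the template's own code.
    """
    words = re.findall(r"[a-z0-9]+", project_name.lower())
    return "-".join(words)


if __name__ == "__main__":
    print(slugify("My Awesome Data Project"))  # my-awesome-data-project
```

If the suggested slug is not what you want, you can type a different value at the `repo_slug` prompt.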
- Navigate to generated project:

      cd my-awesome-data-project/  # whatever you named it

- Start DevContainer:
  - CLI (recommended): `devcontainer up --workspace-folder .`
  - VS Code: Open project → Reopen in Container (services auto-start)
- Access services:
  - Airflow: http://localhost:8080 (admin/admin)
  - Postgres: `make psql`

The Airflow image installs lightweight extras on boot via `_PIP_ADDITIONAL_REQUIREMENTS`, for dev only. For heavier deps, build a custom image later.

This template includes Astronomer-inspired deployment optimizations:
- 5-15 second deployments vs 5+ minute full rebuilds
- Perfect for iterative DAG development
- Automatic change detection with SHA256 hashing
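
The SHA256-based change detection mentioned above could work along these lines. This is a sketch under stated assumptions, not the template's implementation: the function names, the `*.py` pattern, and the idea of storing the last digest in a state file are all hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path, pattern: str = "*.py") -> str:
    """Return one SHA256 hex digest covering every matching file under root."""
    digest = hashlib.sha256()
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            # Hash the relative path as well, so renames change the digest.
            digest.update(path.relative_to(root).as_posix().encode("utf-8"))
            digest.update(b"\0")
            digest.update(path.read_bytes())
    return digest.hexdigest()


def has_changed(root: Path, state_file: Path) -> bool:
    """Compare the current digest with the stored one, then store the new one."""
    current = tree_digest(root)
    previous = state_file.read_text().strip() if state_file.exists() else ""
    state_file.write_text(current)
    return current != previous
```

A deploy script could call `has_changed(Path("dags"), Path(".dags.sha256"))` (both paths are illustrative) and skip the deployment when it returns `False`.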

    make deploy-dags   # Deploy only DAG files (fastest)
    make deploy        # Auto-detect changes and choose optimal strategy
    make deploy-full   # Full rebuild (dependencies + code)

- Multi-stage builds with dependency separation
- 60-80% faster rebuilds with intelligent caching
- Persistent pip/uv caches in development
- Automatically detects what changed (DAGs, dependencies, code)
- Chooses optimal deployment strategy
- Performance monitoring with timing metrics
- Volume mount caching for local development
- Hot-reload configuration (10-second DAG scanning)
- GitHub Actions CI/CD with registry caching
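
As a rough illustration of how "detect what changed, then choose a strategy" might map onto the `make` targets above, consider the sketch below. The `dags/` directory name, the list of dependency files, and the decision rules are assumptions made for this example; only the target names `deploy-dags` and `deploy-full` come from this README.

```python
from pathlib import PurePosixPath
from typing import List, Optional

# Assumed file names that signal a dependency change (hypothetical list).
DEPENDENCY_FILES = {"pyproject.toml", "uv.lock", "requirements.txt", "Dockerfile"}


def classify(path: str) -> str:
    """Label a changed path as 'dependencies', 'dags', or 'code'."""
    p = PurePosixPath(path)
    if p.name in DEPENDENCY_FILES:
        return "dependencies"
    if p.parts and p.parts[0] == "dags":  # assumes DAGs live under dags/
        return "dags"
    return "code"


def choose_target(changed_paths: List[str]) -> Optional[str]:
    """Pick the cheapest make target that covers every changed path."""
    kinds = {classify(p) for p in changed_paths}
    if not kinds:
        return None  # nothing changed, nothing to deploy
    if kinds == {"dags"}:
        return "deploy-dags"  # DAG-only change: fast path
    return "deploy-full"  # dependencies or other code changed


if __name__ == "__main__":
    print(choose_target(["dags/example_dag.py"]))
    print(choose_target(["dags/example_dag.py", "pyproject.toml"]))
```

Under these assumptions the first call selects `deploy-dags` and the second falls back to `deploy-full`, since a dependency file changed alongside the DAG.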
See full deployment guide: docs/deployment/README.md
See repository tree in this README's template generation.