# experiments-framework
A systematic framework for machine learning experiments with modular workflow patterns. Built from lessons learned through iterative experimentation to enable reproducible research.
- Project Structure
- Scope
- Previous Work
- Quick Start
- Notebook Tools Installation
- Reproducibility Framework
- Scripts
- Workflows
## Project Structure

```
experiments-framework/
├── notebooks/                           # all jupyter notebooks
│   ├── scripts/                         # notebook utility scripts
│   ├── templates/                       # clean starting templates
│   │   ├── 01_preprocessing.working.ipynb  # data prep template
│   │   ├── 02_annotation.working.ipynb     # labeling template
│   │   ├── 03_training.working.ipynb       # model training template
│   │   └── systems.working.ipynb           # infrastructure template
│   ├── machine_learning/                # ml development
│   │   ├── 01_preprocessing.dev.ipynb   # active preprocessing
│   │   ├── 02_annotation.dev.ipynb      # active annotation
│   │   ├── 03_training.dev.ipynb        # active training
│   │   ├── preprocessing/               # preprocessing experiments
│   │   ├── annotation/                  # annotation experiments
│   │   └── training/                    # training experiments
│   └── systems/                         # infrastructure notebooks
│       └── systems.dev.ipynb            # systems development
├── data/                                # data organization
│   ├── raw/                             # original recordings
│   ├── clips/                           # extracted video clips
│   ├── frames/                          # extracted images
│   └── annotations/                     # labels and metadata
├── configs/                             # workflow configurations
├── models/                              # trained ml models
├── scripts/                             # project setup scripts
│   ├── setup_experiments_structure.sh   # creates dirs/notebooks
│   ├── setup_provenance.sh              # tracks repo evolution
│   └── setup_orcid.sh                   # citation setup
├── lib/                                 # reusable code modules
│   └── notebook_tools/                  # notebook utilities
├── references/                          # citations and refs
├── environment.yml                      # conda environment
└── PROVENANCE.md                        # repo history tracking
```
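For orientation, the top-level layout above can be sketched by hand with a single `mkdir -p` call; note that `setup_experiments_structure.sh` automates this (and also seeds the notebooks and hooks), so this is only an illustrative sketch of the tree:

```shell
# Hand-rolled sketch of the directory tree above; the repo's
# setup_experiments_structure.sh automates this plus notebooks and hooks.
mkdir -p \
  notebooks/scripts notebooks/templates \
  notebooks/machine_learning/preprocessing \
  notebooks/machine_learning/annotation \
  notebooks/machine_learning/training \
  notebooks/systems \
  data/raw data/clips data/frames data/annotations \
  configs models scripts lib/notebook_tools references
```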
## Scope

- Modular workflow patterns for preprocessing, annotation, and training
- Clear separation of concerns across directories
- Reproducible structure for researchers and students
- Minimal dependencies with documented environment setup
## Previous Work

- Primary (ACTIVE) development repo → traffic-vision-v0.4
- Prior (DEPRECATED) experimental repo → experiments-test
- Achieved successful vehicle counting on 21 of 30 GDOT traffic camera feeds
- Framework failed due to monolithic notebooks and environment conflicts
- Individual notebook execution became a bottleneck

This repo rebuilds the workflow framework to be modular and scalable.
## Quick Start

1. Clone and set up:

   ```bash
   git clone https://github.com/iTrauco/experiments-framework.git
   cd experiments-framework
   chmod +x scripts/*.sh
   ./scripts/setup_experiments_structure.sh
   ```

2. Track your work:

   ```bash
   ./scripts/setup_provenance.sh
   ```

3. Start developing in the `.dev` notebooks, or copy templates to begin new experiments.
## Notebook Tools Installation

```bash
cd /path/to/notebook_tools
pip install -e .
```

This installs the library in "editable" mode: any changes you make to the code are immediately available without reinstalling.

> ⚠️ **Development Status:** All modules in `lib/` are early-stage development prototypes. Functionality is still being worked out; some modules may be dead code, others are spaghetti. I'm creating modular packages as I identify what's killing my bandwidth.
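If you ever need to confirm that a package really was installed in editable mode, PEP 660 editable installs record `{"dir_info": {"editable": true}}` in the distribution's `direct_url.json`. A minimal check, assuming the package is distributed under the name `notebook_tools` (inferred from this repo's `lib/` tree, not verified):

```python
# Check whether a package was installed in editable mode (pip install -e .).
# The name "notebook_tools" is an assumption based on this repo's lib/ tree.
import json
from importlib.metadata import PackageNotFoundError, distribution

def is_editable(pkg: str) -> bool:
    try:
        dist = distribution(pkg)
    except PackageNotFoundError:
        return False
    # PEP 660 editable installs ship a direct_url.json with dir_info.editable.
    direct_url = dist.read_text("direct_url.json")
    if direct_url is None:
        return False  # regular (non-editable) install
    return bool(json.loads(direct_url).get("dir_info", {}).get("editable", False))

print(is_editable("notebook_tools"))
```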
## Reproducibility Framework

This project uses a Conda environment to manage dependencies for reproducible analysis. Follow these steps to set up the environment:

Prerequisites:

- Anaconda or Miniconda installed on your system
- Git for cloning the repository

1. Clone the repository:

   ```bash
   git clone https://github.com/iTrauco/experiments-framework.git
   cd experiments-framework
   ```

2. Create the Conda environment:

   ```bash
   conda create -n traffic-vision-env python=3.11 -y
   ```

3. Activate the environment:

   ```bash
   conda activate traffic-vision-env
   ```

4. Install baseline packages:

   ```bash
   conda install -c conda-forge jupyter numpy pandas matplotlib seaborn scikit-learn opencv -y
   ```

5. Install deep learning and computer vision packages:

   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
   pip install ultralytics supervision
   ```

6. Launch Jupyter Notebook:

   ```bash
   jupyter notebook
   ```

7. Access the notebook in your browser via the URL displayed in the terminal.
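After installing, a quick sanity check that the core imports resolve can save debugging time later. This is a generic sketch; note that import names differ from package names for some libraries (`opencv` imports as `cv2`, `scikit-learn` as `sklearn`):

```python
# Report which of the environment's core modules fail to import.
import importlib

def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

core = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn",
        "cv2", "torch", "ultralytics", "supervision"]
print(missing_modules(core))  # an empty list means the environment is complete
```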
The environment includes essential data science and computer vision packages:
- Python 3.11
- Jupyter Notebook
- pandas & numpy for data manipulation
- matplotlib & seaborn for visualization
- scikit-learn for traditional ML algorithms
- OpenCV for image and video processing
- PyTorch for deep learning model development
- Ultralytics for YOLO object detection
- Supervision for object tracking utilities
## Scripts

### setup_experiments_structure.sh

Creates the complete project directory structure with an interactive menu. Features:

- Defaults to creating in the parent directory (`../`)
- Remembers the last used location
- Rollback option to undo changes
- Installs GitHub Actions and pre-commit hooks
### setup_provenance.sh

Documents your experimental evolution across repository rebuilds:

- Links to previous repos and branches
- Records what you tested and learned
- Builds a timeline in PROVENANCE.md
### setup_orcid.sh

Adds an ORCID identifier and citation infrastructure to your project.
## Workflows

- Copy templates from `notebooks/templates/` to start new work
- Develop in `.dev` notebooks at the machine_learning level
- Create experimental variations in subdirectories
- Track successful patterns in LESSONS_LEARNED.md files
### GitHub Actions

Automatically converts notebooks to markdown on push for better documentation and diffs.
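One way to wire up push-time conversion is an Actions workflow around `jupyter nbconvert`. The sketch below is illustrative only; the workflow file name and steps are assumptions, not the exact workflow the setup script installs:

```yaml
# .github/workflows/notebooks-to-markdown.yml (illustrative; names assumed)
name: notebooks-to-markdown
on: push
jobs:
  convert:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install nbconvert
      - run: jupyter nbconvert --to markdown notebooks/**/*.ipynb
```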
### Pre-commit Hooks

Validates notebook metadata and cleans outputs before commits.
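A minimal output-stripping setup could look like the following sketch, using the real `nbstripout` pre-commit hook; the hooks actually installed by the setup script may differ:

```yaml
# .pre-commit-config.yaml (illustrative; the script-installed hooks may differ)
repos:
  - repo: https://github.com/kynan/nbstripout
    rev: 0.7.1
    hooks:
      - id: nbstripout  # strips cell outputs before they reach a commit
```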
### Updating the Environment

For collaborators who enhance the environment with additional packages:

```bash
# Export the updated environment
conda activate traffic-vision-env
conda env export > environment.yml
```
This ensures full reproducibility across systems by preserving all dependencies and versions.
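For orientation, an exported `environment.yml` typically looks like the excerpt below. Treat it as illustrative only; the real export pins exact versions and builds for every dependency:

```yaml
# Illustrative excerpt; a real `conda env export` pins exact versions/builds.
name: traffic-vision-env
channels:
  - conda-forge
dependencies:
  - python=3.11
  - jupyter
  - numpy
  - pandas
  - scikit-learn
  - opencv
  - pip:
      - ultralytics
      - supervision
```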
Author: Christopher Trauco | ORCID: 0009-0005-8113-6528