Weights & Biases Efficiency Audit Tool

The Weights & Biases Efficiency Audit Tool is a Python utility that fetches and analyzes historical run data from Weights & Biases (W&B). It helps you understand your machine learning projects by auditing their compute usage and resource utilization.

Features

Fetch historical metrics, parameters, and metadata for all experiments.
Analyze GPU and CPU utilization metrics, including full metric history for GPUs.
Export results to a detailed Excel report that includes the raw data, a summary, and a visual report image.

The audit is designed to help you answer questions such as the following.

Utilization / Cost

Are you using the optimal machine sizes for your workloads?
How often do individual GPUs, or entire machines, sit idle?
How much compute is wasted on idle GPU time?
What's the financial impact of underutilized resources?

Efficiency

What's your overall GPU utilization across all experiments?
How many experiments run with 0% GPU utilization?
Which runs represent the biggest optimization opportunities?
How does your efficiency break down across different runs?

Performance Analysis

Track CPU, GPU memory, and disk utilization
Analyze network I/O patterns
Review system metrics across all experiments
Identify efficiency patterns over time
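As a rough illustration of how per-GPU utilization could be read from W&B system metrics, here is a minimal sketch. It is not this tool's implementation. The key pattern system.gpu.<index>.gpu, the helper names mean_gpu_utilization and fetch_system_rows, and the sample values are assumptions made for illustration; check the metric keys in your own runs before relying on them.

```python
import re
from statistics import mean

# Assumed key format for per-GPU core utilization in W&B system metrics,
# for example "system.gpu.0.gpu". Verify this against your own runs.
GPU_UTIL_KEY = re.compile(r"^system\.gpu\.(\d+)\.gpu$")


def mean_gpu_utilization(rows):
    """Return {gpu_index: mean utilization in percent} from system-metric rows."""
    samples = {}
    for row in rows:
        for key, value in row.items():
            match = GPU_UTIL_KEY.match(key)
            # Skip missing samples (None) and non-numeric values.
            if match and isinstance(value, (int, float)):
                samples.setdefault(int(match.group(1)), []).append(value)
    return {gpu: mean(values) for gpu, values in samples.items()}


def fetch_system_rows(project_path, run_limit=5):
    """Yield (run name, system-metric rows) for the first few runs of a project."""
    import wandb  # imported lazily so the helper above works without wandb

    api = wandb.Api()
    for index, run in enumerate(api.runs(project_path)):
        if index >= run_limit:
            break
        # stream="system" selects machine metrics; pandas=False returns dicts.
        yield run.name, run.history(stream="system", pandas=False)


# Offline demonstration with hand-written rows (made-up values).
demo_rows = [
    {"system.gpu.0.gpu": 80, "system.gpu.1.gpu": 0, "system.cpu": 35.0},
    {"system.gpu.0.gpu": 60, "system.gpu.1.gpu": 0, "system.cpu": 40.0},
]
print(mean_gpu_utilization(demo_rows))
```

In this made-up sample the second GPU never leaves 0%, which is the kind of pattern the audit is meant to surface. Note that run.history returns a sampled history by default, so a full audit would need more than this sketch does.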

Folder Structure

wandb-efficiency-audit/
│
├── wandb_efficiency_audit.py    # Main script
├── generate_report_image.py     # Helper functions for visual report generation
├── fonts/                       # Font files for report generation
├── README.md                    # Documentation
└── pyproject.toml               # Python project configuration

Installation

Prerequisites

Python 3.9 or higher
An active W&B account (or access to public W&B projects)
W&B system metrics monitoring turned on (usually enabled by default)

Setup

Clone this repository:

git clone https://github.com/valohai/wandb-efficiency-audit.git
cd wandb-efficiency-audit

Create a virtualenv and install the required dependencies:

python -m venv venv
source venv/bin/activate  # This depends on your shell
pip install -e .

(Optional) Log in to W&B if accessing private projects:

wandb login

Note: No login is required for public W&B projects.

Usage

Run the script to generate the audit report:

wandb-efficiency-audit --project "entity/project"

To analyze only completed runs (excluding failed/crashed runs):

wandb-efficiency-audit --project "entity/project" --completed-only

The report will be saved as experiment_metrics_summary.xlsx in the current directory.

Output Files

The tool generates two main outputs:

1. experiment_metrics_summary.xlsx: a comprehensive Excel workbook containing:
   - Summary sheet with visual report, key metrics, and methodology
   - Cost analysis and efficiency distribution
   - Example runs with the biggest optimization opportunities
   - Detailed metrics sheet with all raw data
2. Visual report PNG (embedded in the Excel workbook) showing:
   - Total GPU utilization percentage
   - Total GPU idle time
   - Percentage of runs with 0% GPU utilization
   - Cost of idle compute by GPU type
   - Example runs with low utilization
Understanding the Results

Efficiency Score Categories

Excellent (70%+): Optimal GPU utilization
Good (50-70%): Acceptable utilization with minor optimization potential
Fair (30-50%): Significant room for improvement
Poor (10-30%): Major underutilization issues
Critical (<10%): Severe waste, immediate action recommended
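The category ranges above share their boundary values (70%, 50%, 30%, 10%), and the list does not say which side each boundary falls on. The sketch below shows one way to map a utilization percentage to a category, assuming each lower bound is inclusive; the function name efficiency_category is hypothetical and this is not taken from the tool's source.

```python
def efficiency_category(gpu_utilization_pct):
    """Map an average GPU utilization percentage to a category label."""
    if not 0 <= gpu_utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    # Assumption: each lower bound is inclusive (exactly 50% counts as "Good").
    thresholds = [(70, "Excellent"), (50, "Good"), (30, "Fair"), (10, "Poor")]
    for lower_bound, label in thresholds:
        if gpu_utilization_pct >= lower_bound:
            return label
    return "Critical"


for pct in (85, 50, 12.5, 3):
    print(pct, efficiency_category(pct))
```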

Key Metrics Explained

Total GPU Utilization: Average GPU core utilization over time across all runs
Total GPU Idle Time: Total runtime multiplied by (100% - Average GPU utilization)
Runs with 0% GPU Utilization: Share of runs in which at least one GPU showed no core utilization at all for the entire run
Cost of Idle Compute: Estimated cost of unused GPU time based on AWS on-demand pricing
Example runs: Runs with low utilization that are among the longest 25% of runs in the project
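The definitions above can be made concrete with a short worked sketch. It applies the idle-time formula (runtime multiplied by 100% minus average GPU utilization), prices the result, and picks low-utilization runs from the longest quarter of runs. The flat price of 3.00 USD per GPU-hour, the 30% cutoff for "low" utilization, the function names, and the sample runs are all assumptions for illustration; the tool's own price table and selection logic may differ.

```python
# Hypothetical flat price in USD per GPU-hour, for illustration only.
# The tool itself estimates cost from AWS on-demand pricing per GPU type.
ASSUMED_PRICE_PER_GPU_HOUR = 3.00


def gpu_idle_hours(runtime_hours, avg_gpu_utilization_pct):
    """Idle time = runtime x (100% - average GPU utilization)."""
    return runtime_hours * (100 - avg_gpu_utilization_pct) / 100


def idle_cost(runtime_hours, avg_gpu_utilization_pct,
              price_per_hour=ASSUMED_PRICE_PER_GPU_HOUR):
    """Estimated cost of the idle share of a run."""
    return gpu_idle_hours(runtime_hours, avg_gpu_utilization_pct) * price_per_hour


def example_run_candidates(runs, low_utilization_pct=30):
    """Pick low-utilization runs from the longest 25% of runs.

    `runs` is a list of (name, runtime_hours, avg_gpu_utilization_pct) tuples.
    The 30% cutoff for "low" utilization is an assumption.
    """
    by_runtime = sorted(runs, key=lambda run: run[1], reverse=True)
    longest_quarter = by_runtime[:max(1, len(by_runtime) // 4)]
    return [name for name, _, util in longest_quarter if util < low_utilization_pct]


# Made-up runs: (name, runtime in hours, average GPU utilization in %)
demo_runs = [
    ("a", 10, 5), ("b", 2, 0), ("c", 8, 90), ("d", 1, 50),
    ("e", 12, 20), ("f", 3, 10), ("g", 4, 60), ("h", 6, 40),
]
print(gpu_idle_hours(10, 25))
print(idle_cost(10, 25))
print(example_run_candidates(demo_runs))
```

A 10-hour run at 25% average utilization counts as 7.5 idle GPU-hours under this formula. Restricting the examples to the longest runs keeps short, throwaway experiments from crowding out the runs where idle time is most expensive.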

Requirements

The project dependencies are:

wandb: Interact with the W&B tracking server.
pandas: Data processing and analysis.
openpyxl: Generate Excel reports.
pillow: Generate visual report images.
requests: HTTP requests for data submission.
