
Federated Reinforcement Learning for Multi-Cloud Load Balancing and Task Scheduling

This repository implements an academic research prototype for federated reinforcement learning (FRL) across multi-cloud regions and providers, jointly learning load-balancing and task-scheduling policies under latency, SLO, and carbon/energy constraints.

Stack: Python · PyTorch · YAML configs · Docker · GitHub Actions · Integration hooks for Kubernetes/OpenStack · Prometheus metrics adapters
Focus: Reproducible experiments, ablation studies, and paper-ready figures.


Key Features

  • Federated RL (FedAvg + optional FedProx) coordinating regional PPO agents (an aggregation sketch follows this list).
  • Two-tier policy: (1) a load balancer picks a cloud/region; (2) a scheduler assigns the task to a node/pool.
  • Multi-objective rewards combining latency, queueing delay, SLO violations, cost, and carbon intensity (a reward sketch also follows the list).
  • Integration hooks (mock+optional live):
    • Kubernetes: cluster metrics and deployment scaling via the Kubernetes Python client (optional, with safe stubs for dry-run mode).
    • OpenStack: Nova/Neutron stubs for VM placement decisions.
    • Prometheus: metric scraping adapters.
  • Academic package: configs, baselines, ablations, seeding, experiment runner, result exports, and paper materials in docs/.
  • Reproducibility: fixed random seeds, logged configs, deterministic ops where possible.
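
For orientation, here is a minimal sketch of the FedAvg-style server aggregation step, assuming clients return PyTorch state_dicts together with their local sample counts. The function and variable names are illustrative, not this repository's actual API; see src/federated/ for the real server and client logic.

import torch

def fedavg(client_states, num_samples):
    """Sample-weighted average of client model state_dicts (FedAvg)."""
    total = float(sum(num_samples))
    weights = [n / total for n in num_samples]
    avg = {}
    for key in client_states[0]:
        # Weighted sum of this parameter tensor across all clients.
        avg[key] = sum(
            w * state[key].to(torch.float32)
            for state, w in zip(client_states, weights)
        )
    return avg

FedProx keeps this server step unchanged; it instead adds a proximal penalty mu/2 * ||theta - theta_global||^2 to each client's local objective, which damps client drift when regional workloads are heterogeneous.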
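
Likewise, a hedged sketch of how the multi-objective reward can be scalarized from the listed signals. The weights and field names below are placeholders for illustration; the actual coefficients and metric names live in the YAML configs and the simulator.

from dataclasses import dataclass

@dataclass
class StepMetrics:
    latency_ms: float
    queue_delay_ms: float
    slo_violated: bool
    cost_usd: float
    carbon_gco2: float

def reward(m: StepMetrics, w_lat=1.0, w_q=0.5, w_slo=5.0, w_cost=0.1, w_co2=0.1):
    # Negative weighted sum: the agent jointly minimizes latency,
    # queueing delay, SLO violations, cost, and carbon intensity.
    return -(w_lat * m.latency_ms
             + w_q * m.queue_delay_ms
             + w_slo * float(m.slo_violated)
             + w_cost * m.cost_usd
             + w_co2 * m.carbon_gco2)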

Quickstart

# 1) Create env
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# 2) Smoke test (synthetic multi-cloud simulation)
python -m src.run_experiment --config configs/experiments/small_demo.yaml

# 3) Plot results
python -m src.tools.plot_results --input results/small_demo/metrics.csv --out results/small_demo/plots

Note: Kubernetes/OpenStack hooks default to dry-run unless you set INTEGRATION_MODE=live and provide credentials.
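
For illustration, a dry-run guard in a hook might look like the sketch below. INTEGRATION_MODE is the variable mentioned above, while the function name and behavior are hypothetical stand-ins for the hooks under src/hooks/.

import os

def scale_deployment(name: str, replicas: int) -> None:
    """Hypothetical hook: only touches the cluster when explicitly enabled."""
    if os.environ.get("INTEGRATION_MODE") != "live":
        # Safe default: log the intended action and do nothing.
        print(f"[dry-run] would scale deployment {name!r} to {replicas} replicas")
        return
    # Live mode: a real hook would call the Kubernetes API here, e.g. via
    # kubernetes.client.AppsV1Api().patch_namespaced_deployment_scale(...).
    raise NotImplementedError("wire up the kubernetes client for live mode")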


Repository Layout

federated-rl-multicloud/
├── configs/                 # YAML configs for env/agents/experiments
├── docs/                    # Paper materials: abstract, outline, figs (Mermaid), checklist
├── notebooks/               # Minimal notebooks to inspect logs and metrics
├── results/                 # (created at runtime) experiment outputs
├── src/
│   ├── envs/                # Multi-cloud simulator
│   ├── federated/           # Aggregation server & client logic
│   ├── hooks/               # Integration hooks (K8s, OpenStack, Prometheus)
│   ├── models/              # Policy/value networks
│   ├── rl/                  # PPO implementation (minimal)
│   ├── sched/               # Two-tier policy wrapper
│   ├── tools/               # Plotting, seeding, io helpers
│   └── run_experiment.py    # CLI entrypoint
├── tests/                   # Unit tests (smoke-level)
├── .github/workflows/ci.yml # CI: lint + unit tests
├── Dockerfile               # Container to run experiments
├── docker-compose.yaml      # Optional: launches a Prometheus stub & experiment container
├── Makefile
├── requirements.txt
├── LICENSE
└── CITATION.cff

Reproducing Paper Figures

  1. Choose an experiment YAML under configs/experiments/ (e.g., small_demo.yaml, ablation_fedprox.yaml).
  2. Run the experiment.
  3. Use src/tools/plot_results.py to generate latency CDFs, learning curves, and Pareto plots (a minimal CDF example follows this list).
  4. Insert figures into docs/paper/ as instructed in docs/paper/outline.md.
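
As a rough idea of what step 3 computes, here is a minimal empirical latency-CDF plot from a metrics CSV. The column name latency_ms is an assumption for illustration, not a documented schema; check the actual CSV header produced by your run.

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("results/small_demo/metrics.csv")
lat = np.sort(df["latency_ms"].to_numpy())       # column name is an assumption
cdf = np.arange(1, len(lat) + 1) / len(lat)      # empirical CDF

os.makedirs("results/small_demo/plots", exist_ok=True)
plt.plot(lat, cdf)
plt.xlabel("latency (ms)")
plt.ylabel("fraction of requests")
plt.savefig("results/small_demo/plots/latency_cdf.png")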

Safety & Live Integrations

  • The code ships with safe defaults (dry-run). Live integration requires explicit environment variables plus kubeconfig/OpenStack credentials.
  • Review configs/integrations/*.yaml and src/hooks/* before enabling live mode in production environments.

License & Citation

  • Licensed under MIT (see LICENSE).
  • Please cite using CITATION.cff.
