EDIL is a research prototype that pairs latent encoders with a federated averaging loop. The idea: compress and lightly obfuscate data locally, train models on those embeddings across multiple workers, then aggregate weights centrally.
The sections below align terminology with Ratio1.ai’s published concepts for R1FS (private IPFS-like storage) and CStore (distributed in-memory state).
- What it is: A domain autoencoder turns raw inputs (e.g., images) into compact embeddings. Training and aggregation happen on these embeddings instead of raw data.
- Potential benefits:
  - Cuts bandwidth and speeds up local training.
  - Provides mild obfuscation vs. raw data leakage.
  - Allows reuse of a domain encoder across many tasks.
- Limitations:
  - Embeddings are not encryption; a decoder or adversary can often reconstruct inputs.
  - No semantic security, key management, secure aggregation, or DP by default.
  - Workers and the aggregator see plaintext embeddings and weights.
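The limitation that embeddings are not encryption can be illustrated with a toy linear case. The sketch below is not the EDIL autoencoder: the random linear `encoder`, the synthetic low-dimensional data, and all dimensions are assumptions chosen for illustration. It shows that an adversary holding some (embedding, input) pairs can fit a decoder by least squares and reconstruct unseen inputs from their embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): 200 inputs of dimension 64 that lie on an 8-dim subspace.
latent = rng.normal(size=(200, 8))
basis = rng.normal(size=(8, 64))
x = latent @ basis

# Hypothetical linear "encoder" mapping 64-dim inputs to 8-dim embeddings.
encoder = rng.normal(size=(64, 8))
z = x @ encoder

# Adversary: fit a linear decoder on 100 known (embedding, input) pairs.
decoder, *_ = np.linalg.lstsq(z[:100], x[:100], rcond=None)

# Reconstruct the remaining 100 inputs from their embeddings alone.
x_hat = z[100:] @ decoder
rel_err = np.linalg.norm(x_hat - x[100:]) / np.linalg.norm(x[100:])
print(f"relative reconstruction error: {rel_err:.2e}")
```

A nonlinear autoencoder on real images will not be inverted this cleanly, but the same principle applies: anyone who can train a decoder may recover much of the input.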
- Train or load a domain encoder (`SimpleDomainAutoEncoder`) on local data to produce embeddings.
- Shard data across workers using `sample_shards`, respecting worker load percentages.
- Local training per worker: each worker receives initial weights, trains with `SimpleTrainer`, evaluates with `SimpleTester`, and returns updated weights.
- Aggregate models: coordinator averages numpy-formatted state dicts (`aggregate_function`) and repeats for multiple rounds.
- Evaluate end-to-end: combine the domain encoder with the aggregated classifier for inference.
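The sharding and aggregation steps above can be sketched as follows. This is a minimal illustration, not the repository's `sample_shards` or `aggregate_function`: the helper names `shard_by_load` and `average_state_dicts`, their signatures, and the choice to weight the average by shard size are all assumptions.

```python
import numpy as np

def shard_by_load(n_samples, loads, seed=0):
    """Split shuffled sample indices into one shard per worker, sized by load share."""
    loads = np.asarray(loads, dtype=float)
    fractions = loads / loads.sum()
    # Cut points between shards; the last shard takes whatever remains.
    cuts = np.round(np.cumsum(fractions)[:-1] * n_samples).astype(int)
    indices = np.random.default_rng(seed).permutation(n_samples)
    return np.split(indices, cuts)

def average_state_dicts(state_dicts, weights=None):
    """Weighted average of per-worker state dicts (parameter name -> numpy array)."""
    if weights is None:
        weights = np.ones(len(state_dicts))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return {
        name: sum(w * sd[name] for w, sd in zip(weights, state_dicts))
        for name in state_dicts[0]
    }

# Three workers with 50% / 30% / 20% of the load.
shards = shard_by_load(100, loads=[50, 30, 20])

# Stand-in for local training: each worker returns a constant-valued parameter.
worker_states = [{"fc.weight": np.full((2, 2), float(i))} for i in range(3)]

# One aggregation round, weighting each worker by its shard size.
global_state = average_state_dicts(worker_states, weights=[len(s) for s in shards])
```

In a multi-round run, `global_state` would be sent back to the workers as the initial weights for the next round.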
- Prefer the `.devcontainer/` setup for reproducible CUDA/PyTorch tooling (`devcontainer.json`, `Dockerfile`). Launching the repo in a devcontainer pulls the right dependencies without touching the host.
- If you run locally, mirror the dependencies in `edil/experiments`, but expect CUDA defaults in the scripts; switch to CPU in `data_utils.py`/`local_test.py` if needed.
- (Optional) Train the MNIST domain encoder in `edil/experiments/other/ae_test.py` by setting `TRAIN_MODE=True` (stores encoder/decoder in `_cache/`). If you skip this, the demo will try to load pre-existing weights from `_cache/`.
- Launch the federated simulation: `python edil/experiments/local_test.py`. This loads MNIST, shards data per worker, trains per round on embeddings, aggregates weights, and reports accuracy.
- Outputs are printed to stdout; temporary artifacts live in `_cache/`. Use CPU by forcing the device in `data_utils.py`/`local_test.py` if you do not have CUDA.
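For the CPU fallback mentioned above, one possible pattern is a small device-selection helper. `pick_device` and its `force_cpu` flag are hypothetical and do not exist in the repository; how the scripts actually set their device may differ, so treat this as a sketch of the idea rather than a patch.

```python
def pick_device(force_cpu=False):
    """Return "cuda" only if PyTorch is installed, CUDA is usable, and CPU is not forced."""
    if force_cpu:
        return "cpu"
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to place on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device(force_cpu=True)
print(f"using device: {device}")
```

The returned string can be passed to `torch.device(...)` or to `.to(...)` on models and tensors.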
- R1FS as storage plane: Ratio1’s encrypted, sharded, IPFS-like file system. Each file is content-addressed (CID), encrypted, and distributed across edge nodes for redundancy and load balancing. In EDIL, raw data stays on-device; only embeddings or model artifacts are stored/retrieved via R1FS.
- CStore as coordination/state plane: a distributed in-memory database (etcd + Redis–like) used to announce and discover CIDs, share lightweight metadata, and synchronize state across nodes. In EDIL, workers publish CIDs for encoder/model artifacts and pull peer updates via CStore hash sets/keys.
- R1EN as workers: replace local worker objects with R1EN agents exposing train/test RPCs. Sharding logic is reused; transport shifts to the R1EN network.
- Data/share flow:
  - Store encoded shards or model checkpoints in R1FS → obtain CIDs.
  - Announce CIDs through CStore (hash-set style namespaces) so peers can discover updates.
  - Peers fetch artifacts from R1FS using announced CIDs; state consistency comes from CStore replication.
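The store, announce, and fetch flow above can be mimicked with in-memory stand-ins. `FakeR1FS` and `FakeCStore` are hypothetical classes written for this sketch; they are not the Ratio1 APIs. The hex SHA-256 digest is only a simplified substitute for a real CID, and the namespace and key names are made up.

```python
import hashlib

class FakeR1FS:
    """In-memory stand-in for a content-addressed store (not the real R1FS API)."""

    def __init__(self):
        self._blobs = {}

    def add(self, data: bytes) -> str:
        cid = hashlib.sha256(data).hexdigest()  # simplified content address
        self._blobs[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        data = self._blobs[cid]
        # Integrity check: the content must hash back to its address.
        if hashlib.sha256(data).hexdigest() != cid:
            raise ValueError("content does not match CID")
        return data

class FakeCStore:
    """In-memory stand-in for a hash-set style key/value store (not the real CStore API)."""

    def __init__(self):
        self._hsets = {}

    def hset(self, namespace, key, value):
        self._hsets.setdefault(namespace, {})[key] = value

    def hgetall(self, namespace):
        return dict(self._hsets.get(namespace, {}))

r1fs, cstore = FakeR1FS(), FakeCStore()

# 1. A worker stores a checkpoint and obtains its content address.
checkpoint = b"serialized-weights-round-1"
cid = r1fs.add(checkpoint)

# 2. The worker announces the address under a per-round namespace.
cstore.hset("edil:round:1", "worker-0", cid)

# 3. A peer discovers announced addresses and fetches the artifacts.
announced = cstore.hgetall("edil:round:1")
fetched = {worker: r1fs.get(c) for worker, c in announced.items()}
```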
- Security & identity: use mutual TLS and node identity; add attestation where available. R1FS provides content integrity via hashes; encrypt at rest and in transit.
- Privacy hardening: add secure aggregation (HE/SMPC) and differential privacy so embeddings/weights are not exposed in plaintext to peers or the aggregator.
- Operations: schedule across heterogeneous R1EN hardware, manage stragglers/retries, add telemetry and audit logs. Use CStore versioning/keys to roll out/roll back encoders and checkpoints; monitor R1FS performance (bandwidth/latency) for edge constraints.
- Lifecycle: track encoder + classifier versions jointly, propagate to R1ENs, and refresh encoders when accuracy/latency trade-offs drift.
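To give a feel for the secure-aggregation idea named under privacy hardening, the sketch below uses pairwise additive masks that cancel in the sum. This is a deliberately simplified illustration and an assumption about one possible direction, not part of EDIL: real protocols derive masks from pairwise shared secrets, work over finite fields, and handle worker dropouts, none of which is shown here. The function name `masked_updates` is hypothetical.

```python
import numpy as np

def masked_updates(updates, seed=0):
    """Add pairwise masks to each update so that the masks cancel when summed."""
    rng = np.random.default_rng(seed)
    masked = [u.astype(float).copy() for u in updates]
    n = len(updates)
    for i in range(n):
        for j in range(i + 1, n):
            # In a real protocol, workers i and j would derive this mask
            # from a shared secret instead of a central generator.
            mask = rng.normal(size=updates[0].shape)
            masked[i] += mask
            masked[j] -= mask
    return masked

updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
masked = masked_updates(updates)

# The aggregator only sees masked updates, yet their mean equals the true mean.
aggregate = sum(masked) / len(masked)
true_mean = sum(updates) / len(updates)
```

Each individual entry of `masked` differs from the corresponding true update, while the aggregate matches the plain average up to floating-point error.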
- Works as a single-process simulation only.
- No real encryption, secure aggregation, or networking is present.
- Treat latent encoding as an optimization and weak obfuscation, not as a security guarantee. Real “h-encrypted” training requires cryptographic protections.