This project aims to build robust and user-friendly fleet management tooling, tailored for asynchronously managing devices that are capable of and intended to run NixOS. It inherits its motivational roots from NITS.
The logical model encompasses Coordinators, Agents, and Admins. CI/CD systems are merged into the Admin category.
```mermaid
flowchart TD
    Ag[Agents]
    Ad[Admins]
    C[Coordinator]
    DB[(State)]
    C --- DB
    Ag -- report status --> C
    C -- send updates --> Ag
    Ad -- submit updates --> C
```
- Asynchronous update chain: Admin -> Coordinator -> Agent
- Overview of all managed devices and their update status, as close to real-time as technically possible
- No evaluation during the update procedure
- Capability model suitable for organizations
- Audit trail of administrative actions
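As a rough illustration of the asynchronous update chain, the following Rust sketch models the message kinds it implies (Admin submits, Coordinator dispatches, Agent reports back). The type and variant names are hypothetical, not Nix-Fleet's actual protocol.

```rust
// Hypothetical message types for the chain Admin -> Coordinator -> Agent.
// Names and fields are illustrative only, not Nix-Fleet's wire format.

/// Messages an Admin submits to the Coordinator.
#[derive(Debug, PartialEq)]
pub enum AdminMsg {
    /// Offer a new pre-built closure to roll out.
    SubmitUpdate { store_path: String },
}

/// Messages exchanged between the Coordinator and an Agent.
#[derive(Debug, PartialEq)]
pub enum AgentMsg {
    /// Agent reports its currently active closure.
    ReportStatus { active_path: String },
    /// Coordinator instructs the Agent to apply a closure.
    ApplyUpdate { store_path: String },
}

/// The Coordinator turns an admin submission into an agent instruction;
/// persisting it to the state store is elided here.
pub fn dispatch(msg: AdminMsg) -> AgentMsg {
    match msg {
        AdminMsg::SubmitUpdate { store_path } => AgentMsg::ApplyUpdate { store_path },
    }
}
```

Because the chain is asynchronous, the Coordinator can accept a submission even while the target Agent is offline and deliver the instruction once the device reconnects.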
For machines that are not always directly reachable via SSH, or may never be, e.g. if they are behind NAT or even carrier-grade NAT.
Having an intelligent process on-site allows more sophisticated requests to the update repository, as well as more sophisticated update execution and reporting.
The assumption is that the devices are not capable of building the configurations themselves, or that building on them is undesired. Hence there is a need for a binary cache and a trusted signature, and for something like a CI/CD pipeline, or even just an admin who builds, signs, and pushes the binaries.
In this scenario it is redundant to evaluate again on the device, and a trusted signature is already required for the binary cache. It is low-hanging fruit to make the final closure the update payload and to transmit metadata that enables the devices to download and apply the update.
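To make that concrete, here is a hedged Rust sketch of what such update metadata might look like and the steps an agent could derive from it. The struct, its fields, and the exact commands are illustrative assumptions, not the project's actual format.

```rust
// Hypothetical metadata that would let a device fetch and apply a
// pre-built closure without evaluating anything locally.
// Fields and commands are illustrative, not Nix-Fleet's actual format.

#[derive(Debug, Clone, PartialEq)]
pub struct UpdateManifest {
    /// Store path of the pre-built NixOS system closure.
    pub store_path: String,
    /// Binary cache (substituter) to fetch the closure from.
    pub substituter: String,
}

/// Roughly the shell-level steps an agent would perform for this manifest;
/// signature verification happens inside Nix against the cache's trusted key.
pub fn apply_steps(m: &UpdateManifest) -> Vec<String> {
    vec![
        // Substitute the closure from the trusted binary cache.
        format!("nix copy --from {} {}", m.substituter, m.store_path),
        // Activate the new system generation.
        format!("{}/bin/switch-to-configuration switch", m.store_path),
    ]
}
```

A real agent would talk to the Nix daemon directly rather than shelling out, but the two steps, substitute then activate, capture why no on-device evaluation is needed.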
Nix-Fleet continues the closely related NITS experiment and puts a different technological spin on its principles by swapping Go and NATS for Rust and Iroh, for a few reasons. To start with the most subjective: Rust is the general-purpose language the initial author has enjoyed most in recent years for application development. More objectively, it promises easier integration with the Rust-based Iroh, Snix, and NixOps4, all of which are promising integrations at various points down the line. Iroh was selected for its native support for endpoint discovery in any network topology without relying on external overlay networking, and for the ease of building custom protocols on top of it.
Looking at the wider Nix ecosystem, there are open-source tools for pull-based updates that can provide valuable inspiration. The following list analyses each project and the counter-indications that prevent it from being a viable base for the architecture this project aims for. Please raise an issue or pull request if you notice incorrect or missing important information.
| Project | Evaluation | Admin | Server | Agent |
|---|---|---|---|---|
| Bento | on-device | Shell script | SFTP | Same script as Admin, on a systemd timer |
| Comin | on-device | `git commit`/`push` | Git repository | Go agent daemon periodically polls Git repositories |
| npcnix | on-device | Rust CLI "packs" Nix Flake source and uploads it to S3 | (AWS) S3 | Rust agent daemon polls the Nix Flake from S3 |
| NixOS' native `system.autoUpgrade` | on-device | All supported Flake URL types | Depends on flake storage | Shell script on a timer |
The code is grouped by language or framework name.
This repository uses the blueprint structure.
```
/flake.nix
/flake.lock
/nix/                     # blueprint set up underneath here
/Cargo.toml
/Cargo.lock
/rust/                    # all Rust code lives here
/rust/common/Cargo.toml
/rust/common/src/lib.rs
```
This project is currently funded through the NGI Fediversity Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme. Learn more at the NLnet project page.
SPDX-License-Identifier: MIT OR Apache-2.0