A high-performance, RDMA-based distributed storage system.
Blackbird draws inspiration from Microsoft's FaRM and RDMA-based KV stores, among other projects. It also takes cues from Redis for its simplicity and ubiquity. It provides policy-driven data placement, so applications can offload data to a tiered store that manages placement and latency for them.
Use cases: HPC and ML training/inference pipelines, real-time analytics, feature stores, and metadata-heavy services where Redis/Memcached lack tiering or RDMA and Alluxio is too heavyweight.
- RDMA-first performance: UCX (RoCE/InfiniBand) with TCP fallback; zero-copy fast path
- Tiered caching: GPU memory → CPU DRAM → NVMe; policy-driven placement and eviction
- High availability: Keystone control-plane with leader election & failover (etcd)
- Placement engine: Topology-aware worker selection & load balancing
- Batch APIs: High-throughput batched puts/gets/exists
- Observability: Prometheus-style /metrics, health, and cluster stats
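To make the tiered-caching bullet concrete, here is a minimal sketch of one possible placement policy: walk the tiers from fastest to slowest and pick the first one with enough free space. The `Tier`, `TierState`, and `pick_tier` names are hypothetical and are not taken from Blackbird's code; real policies would likely also weigh access frequency and eviction cost.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical tier model for illustration; not Blackbird's actual types.
enum class Tier { Gpu, Dram, Nvme };

struct TierState {
    Tier tier;
    std::size_t capacity_bytes;
    std::size_t used_bytes;
};

// Returns the first tier that can hold `size` bytes, or nullopt if none can.
// `tiers` is expected to be ordered fastest to slowest (GPU, DRAM, NVMe).
std::optional<Tier> pick_tier(const std::vector<TierState>& tiers,
                              std::size_t size) {
    for (const auto& t : tiers) {
        assert(t.used_bytes <= t.capacity_bytes);
        if (t.capacity_bytes - t.used_bytes >= size) {
            return t.tier;
        }
    }
    return std::nullopt;  // caller would have to evict or reject the put
}
```

With a nearly full GPU tier, a small object still lands in GPU memory while a larger one falls through to DRAM or NVMe.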
# 1) Start etcd
etcd --listen-client-urls http://localhost:2379 \
--advertise-client-urls http://localhost:2379
# 2) Start Keystone (example)
./examples/keystone_example --etcd-endpoints localhost:2379
# Keystone:
# - RPC: :9090
# - Metrics (Prometheus): :9091/metrics
// Existence
auto exists = keystone_service->object_exists("my_key");
// Worker lookup
auto workers = keystone_service->get_workers("my_key");
// Put workflow
auto placements = keystone_service->put_start("my_key", data_size, worker_config);
// ... perform UCX transfers to placements ...
auto ok = keystone_service->put_complete("my_key");
// Remove
auto removed = keystone_service->remove_object("my_key");
// Batch ops
auto ex = keystone_service->batch_object_exists(keys);
auto w = keystone_service->batch_get_workers(keys);
auto ps = keystone_service->batch_put_start(keys, sizes, config);
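The put workflow above is split into `put_start` and `put_complete`. A minimal in-memory sketch of that two-phase shape follows. It assumes that an object only becomes visible to `object_exists` after `put_complete`, which the calls above suggest but do not state. `MockKeystone` and `Placement` are hypothetical stand-ins, and the signatures are simplified (no `worker_config`, plain `bool` results).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical placement record: which worker holds how many bytes.
struct Placement {
    std::string worker_id;
    std::size_t length;
};

// In-memory stand-in for the control plane; not the real Keystone service.
class MockKeystone {
public:
    // Phase 1: reserve space and remember the object as pending.
    std::vector<Placement> put_start(const std::string& key, std::size_t size) {
        assert(size > 0);
        std::vector<Placement> placements;
        placements.push_back(Placement{"worker-0", size});
        pending_[key] = placements;
        return placements;
    }

    // Phase 2: called after the data transfer; commits the pending object.
    bool put_complete(const std::string& key) {
        auto it = pending_.find(key);
        if (it == pending_.end()) return false;  // no matching put_start
        committed_[key] = std::move(it->second);
        pending_.erase(it);
        return true;
    }

    bool object_exists(const std::string& key) const {
        return committed_.count(key) > 0;
    }

    bool remove_object(const std::string& key) {
        return committed_.erase(key) > 0;
    }

private:
    std::unordered_map<std::string, std::vector<Placement>> pending_;
    std::unordered_map<std::string, std::vector<Placement>> committed_;
};
```

Keeping pending and committed objects apart means a client that crashes mid-transfer leaves only a pending entry, which a cleanup pass can drop without readers ever seeing partial data.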
Data model basics
- Key: string identifier
- Placements: one-or-more workers per key (policy-driven)
- TTL: optional expiry; Soft pin: opt-out of eviction
- UCX fields: endpoint addresses, rkeys, region descriptors
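The data model above could be represented roughly as follows. This is a sketch under assumptions: the struct and field names are hypothetical, the UCX fields are kept as opaque byte buffers, and soft pin is modelled as affecting eviction only, not TTL expiry.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical layout of the UCX fields a client needs for remote access.
struct UcxRegion {
    std::vector<std::uint8_t> endpoint_address;  // opaque endpoint address bytes
    std::vector<std::uint8_t> packed_rkey;       // packed remote key
    std::uint64_t remote_addr = 0;               // base address of the region
    std::uint64_t length = 0;                    // region size in bytes
};

struct Placement {
    std::string worker_id;
    UcxRegion region;
};

struct ObjectRecord {
    std::string key;                              // string identifier
    std::vector<Placement> placements;            // one or more workers
    std::optional<Clock::time_point> expires_at;  // empty means no TTL
    bool soft_pin = false;                        // true opts out of eviction
};

// An object with a TTL expires once `now` reaches its deadline.
bool is_expired(const ObjectRecord& o, Clock::time_point now) {
    assert(!o.key.empty());
    return o.expires_at.has_value() && now >= *o.expires_at;
}

// Soft-pinned objects are skipped when choosing eviction victims.
bool is_eviction_candidate(const ObjectRecord& o) {
    return !o.soft_pin;
}
```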
# Prometheus metrics
curl -s http://localhost:9091/metrics | head -n 50
Programmatic stats:
auto stats = keystone_service->get_cluster_stats();
if (is_ok(stats)) {
auto s = get_value(stats);
std::cout << "Active clients: " << s.active_clients << "\n";
std::cout << "Total objects: " << s.total_objects << "\n";
std::cout << "Utilization: " << (s.utilization * 100) << "%\n";
}
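The snippet above relies on `is_ok` and `get_value` helpers whose definitions are not shown here. One plausible shape, assuming results are carried in a `std::variant` of a value and an error code, is sketched below. `Result`, `ErrorCode`, the `ClusterStats` layout, and `utilization_percent` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <variant>

// Hypothetical error codes; the real set is not documented here.
enum class ErrorCode { Unavailable, Timeout };

// A result is either a value of type T or an error code.
template <typename T>
using Result = std::variant<T, ErrorCode>;

template <typename T>
bool is_ok(const Result<T>& r) {
    return std::holds_alternative<T>(r);
}

// Precondition: is_ok(r). Checked with assert in debug builds.
template <typename T>
const T& get_value(const Result<T>& r) {
    assert(is_ok(r));
    return std::get<T>(r);
}

// Mirrors the three fields printed in the example above.
struct ClusterStats {
    std::size_t active_clients = 0;
    std::size_t total_objects = 0;
    double utilization = 0.0;  // fraction in [0, 1]
};

// Converts the utilization fraction to a percentage, or nullopt on error.
std::optional<double> utilization_percent(const Result<ClusterStats>& stats) {
    if (!is_ok(stats)) return std::nullopt;
    return get_value(stats).utilization * 100.0;
}
```

Checking `is_ok` before `get_value` keeps error handling explicit at the call site without exceptions, which matches how the example above is written.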
Health signals:
- Client heartbeats & TTL expiry
- Worker liveness & chunk health
- Automatic recovery & cleanup of orphaned placements
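Heartbeat-based liveness with TTL expiry, as listed above, can be sketched as a small tracker that records the last heartbeat per client and reaps entries that have gone silent for longer than the TTL. `HeartbeatTracker` is a hypothetical name; time is passed in explicitly so the logic is easy to test, which is a design choice of this sketch rather than something taken from Blackbird.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical liveness tracker; not Blackbird's actual implementation.
class HeartbeatTracker {
public:
    using Clock = std::chrono::steady_clock;

    explicit HeartbeatTracker(std::chrono::milliseconds ttl) : ttl_(ttl) {
        assert(ttl.count() > 0);
    }

    // Records that `id` was alive at time `now`.
    void heartbeat(const std::string& id, Clock::time_point now) {
        last_seen_[id] = now;
    }

    // Removes and returns every id whose last heartbeat is older than the TTL.
    std::vector<std::string> reap(Clock::time_point now) {
        std::vector<std::string> dead;
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now - it->second > ttl_) {
                dead.push_back(it->first);
                it = last_seen_.erase(it);
            } else {
                ++it;
            }
        }
        return dead;
    }

    std::size_t live_count() const { return last_seen_.size(); }

private:
    std::chrono::milliseconds ttl_;
    std::unordered_map<std::string, Clock::time_point> last_seen_;
};
```

The ids returned by `reap` are the natural input for the recovery step: their pending puts and orphaned placements can be cleaned up once the owner is declared dead.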
Keystone (control plane):
- Object metadata & locations; worker liveness/status
- Placement & load balancing; admission control
- Client/session tracking; automatic failure handling
- TTL/GC of objects; eviction coordination
Workers:
- UCX endpoints; registered memory for RDMA
- Local tier managers (GPU/DRAM/NVMe) with pluggable policies
- Background compaction/defragmentation (future)
etcd:
- Service discovery & registration
- Leader election for Keystone HA
- Distributed configuration and health registry
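Topology-aware worker selection with load balancing can be sketched as a filter followed by a sort: drop workers without enough free space, prefer workers close to the client, then prefer the least utilized. In this sketch "topology" is reduced to a rack label, which is an assumption; `WorkerInfo` and `select_workers` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical worker description used only for this sketch.
struct WorkerInfo {
    std::string id;
    std::string rack;
    std::size_t capacity_bytes;
    std::size_t used_bytes;
};

// Fraction of capacity in use; a zero-capacity worker counts as full.
double utilization(const WorkerInfo& w) {
    if (w.capacity_bytes == 0) return 1.0;
    return static_cast<double>(w.used_bytes) /
           static_cast<double>(w.capacity_bytes);
}

// Returns up to `replicas` worker ids: same-rack workers first, then by
// ascending utilization. Workers lacking free space are excluded.
std::vector<std::string> select_workers(std::vector<WorkerInfo> workers,
                                        const std::string& client_rack,
                                        std::size_t object_size,
                                        std::size_t replicas) {
    workers.erase(
        std::remove_if(workers.begin(), workers.end(),
                       [&](const WorkerInfo& w) {
                           assert(w.used_bytes <= w.capacity_bytes);
                           return w.capacity_bytes - w.used_bytes < object_size;
                       }),
        workers.end());

    std::stable_sort(workers.begin(), workers.end(),
                     [&](const WorkerInfo& a, const WorkerInfo& b) {
                         const bool a_local = a.rack == client_rack;
                         const bool b_local = b.rack == client_rack;
                         if (a_local != b_local) return a_local;
                         return utilization(a) < utilization(b);
                     });

    std::vector<std::string> chosen;
    for (std::size_t i = 0; i < workers.size() && i < replicas; ++i) {
        chosen.push_back(workers[i].id);
    }
    return chosen;
}
```

Sorting by locality before load keeps transfers on the shortest network path when possible, at the cost of letting a local rack fill up before remote racks are used; a production policy would probably blend the two signals.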
- C++20 compiler (GCC ≥10 or Clang ≥12)
- CMake ≥3.20
- UCX ≥1.12
- etcd ≥3.4
- Libraries: glog, nlohmann/json, yaLanTingLibs
git clone https://github.com/blackbird-io/blackbird.git
cd blackbird
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j"$(nproc)"
sudo make install # optional
cmake -S . -B build -DBUILD_TESTS=ON
cmake --build build -j"$(nproc)"
cd build && ctest --output-on-failure
- v0.1: Keystone MVP, basic client SDK, Prometheus metrics
- v0.2: UCX client library GA, placement policies, benchmark suite
- v0.3: Tier managers (GPU/DRAM/NVMe) + compaction/defrag
- v0.4: Security (mTLS), ACLs, encryption-at-rest/in-flight
- v1.0: Stability, perf tuning, operability hardening
Feature | Blackbird | Redis Cluster | Memcached | Alluxio |
---|---|---|---|---|
RDMA Support | ✅ Native | ❌ | ❌ | |
Multi-tier Caching | ✅ | ❌ | ❌ | ✅ |
Service Discovery | ✅ etcd | ❌ | ✅ | |
High Availability | ✅ | ✅ | ❌ | ✅ |
Language | C++20 | C | C | Java/Scala |
We welcome issues and PRs.
- Fork the repo
- Create a branch: git checkout -b feature/awesome
- Add your feature. Run clang-format/cppcheck if available
- Open a PR
See CONTRIBUTING.md and CODE_OF_CONDUCT.md (coming soon).
Apache - see LICENSE.