This repository contains applications deployed on the home-cluster via Flux using GitOps.
⚠️ Attention:
This repository is undergoing active restructuring and refactoring. The documentation does not yet reflect the current state of the repository!
A Kubernetes cluster needs to be bootstrapped with the Cilium CNI and Flux pointing to this repository.
For ksops and Flux to decrypt the initial secrets used to configure the External Secrets Operator with HashiCorp Vault, a Google Cloud Service Account with access to the correct KMS key needs to be set in the flux namespace.
The repository is structured in:
- a common directory containing common applications applied to all clusters
- a templates directory containing common templates used in the cluster, especially for gatus
- a sites directory containing cluster specific applications
The templates directory contains the following templates:
- gatus - templates for the Gatus service status page
- internal-http - templates for internal HTTP services
- internal-tcp - templates for internal TCP services
The repository follows the app-of-apps pattern for each site.
The first Flux Kustomization defined needs to reference the app-of-apps directory in the respective site directory.
These bootstrap the main Flux applications, referring to the respective <PROJECT>/applications/ kustomizations:
- infrastructure: core cluster infrastructure
- core: core applications
- applications: (user) applications running on the cluster/network
- home-assistant: Home Assistant related applications
Each of these applications follows the app-of-apps pattern again using sub-kustomizations defined in the respective application directories.
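As an illustration of the entry point described above, the first Flux Kustomization for a site might look roughly like the manifest printed by the helper below. This is a minimal sketch, not the repository's actual definition: the resource name, the GitRepository name `home-cluster`, the interval, and the exact `sites/<site>/app-of-apps` path are assumptions, and the `flux` namespace is taken from the bootstrap note above.

```shell
#!/bin/sh
# Print a Flux Kustomization manifest that points at a site's app-of-apps directory.
# Assumptions (not taken from the repository): the resource name "<site>-app-of-apps",
# the GitRepository name "home-cluster", the interval, and the exact path layout.
render_app_of_apps() {
  site="$1"
  cat <<EOF
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ${site}-app-of-apps
  namespace: flux
spec:
  interval: 10m
  path: ./sites/${site}/app-of-apps
  prune: true
  sourceRef:
    kind: GitRepository
    name: home-cluster
EOF
}

# Example: render the manifest for the MUC site.
render_app_of_apps muc
```

The output could be piped into `kubectl apply -f -` on a cluster where Flux is already installed; the sub-kustomizations inside the app-of-apps directory would then be reconciled by Flux itself.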
The common directory contains applications and templates that can be applied to all clusters.
The following applications are defined in common/infrastructure/.
- Cilium - Provides the cluster CNI.
- External Secrets Operator - Synchronizes secrets from external stores to Kubernetes Secret objects.
- Generic Device Plugin - Makes custom hardware devices accessible in the cluster.
- Kubelet Serving Cert Approver - Automatically approves kubelet serving certificate signing requests.
- MetalLB - Provides a Kubernetes network load balancer to expose Kubernetes Services.
- Metrics Server - Collects container resource metrics.
- NVIDIA Device Plugin - Makes the NVIDIA GPU accessible in the cluster.
- Reloader - Automatically reloads Kubernetes resources when secrets or configmaps change.
- Rook Ceph - Manages persistent storage in the cluster.
- Traefik - Exposes Kubernetes Ingress resources to the "outside world".
The following applications are defined in common/core/.
- AdGuard External DNS - AdGuard DNS integration for External DNS.
- Cert Manager - Certificate management using ACME Let's Encrypt.
- CloudNativePG - PostgreSQL database operator.
- External DNS - Creates DNS records in Google Cloud DNS domains for publicly reachable services.
- Gatus - Service status page.
- Kyverno - Policy engine designed for Kubernetes.
- Monitoring (Victoria Metrics & Grafana) - Monitoring stack using Victoria Metrics and Grafana.
- Velero - Performs cluster backups.
The following applications are defined in common/applications/.
- Frigate - NVR with real-time object detection for IP cameras.
- InfluxDB - InfluxDB time-series database.
- Ollama - Ollama local LLM model runner.
- Omada Controller - TP-Link Omada Controller.
The following applications are defined in common/home-assistant/.
- EMQX - An MQTT broker.
- Home Assistant - The Home Assistant instance.
  - PostgreSQL instance used as the Home Assistant recorder target, configured via the CloudNativePG operator.
- Node-RED - Automation based on flows and Home Assistant data.
- Telegraf - Forwards Home Assistant state changes to a local InfluxDB instance.
- Z-Wave JS - Full featured Z-Wave Control Panel and MQTT Gateway.
The sites directory contains cluster specific applications.
The MUC site contains the following applications:
The following applications are defined in sites/muc/infrastructure/.
- cilium
- external-secrets
  - External Secrets Stores - Deploys the required ClusterSecretStores and Vault credentials as Kubernetes Secrets.
- kubelet-serving-cert-approver
- generic-device-plugin
- metallb
- metrics-server
- reloader
- traefik
The following applications are defined in sites/muc/core/.
- adguard-external-dns
- cert-manager
- cloudnative-pg
- external-dns
- gatus
- monitoring
- velero
  - Includes deployment of backup schedules.
The following applications are defined in sites/muc/applications/.
- frigate
- external-services
- influxdb
The following applications are defined in sites/muc/home-assistant/.
The VIE site contains the following applications:
The following applications are defined in sites/vie/infrastructure/.
- cilium
- external-secrets
  - External Secrets Stores - Deploys the required ClusterSecretStores and Vault credentials as Kubernetes Secrets.
- kubelet-serving-cert-approver
- metallb
- metrics-server
- nvidia-device-plugin
- reloader
- rook-ceph
- traefik
The following applications are defined in sites/vie/core/.
- cert-manager
- cloudnative-pg
- external-dns
- gatus
- monitoring
- kyverno
- velero
  - Includes deployment of backup schedules.
The following applications are defined in sites/vie/applications/.
- external-services
- Immich - Photo management solution.
- influxdb
- LibreChat - Open-source chat application for AI conversations.
- Mealie - Recipe management application.
- Ollama - Run LLM models locally. (testing)
- omada-controller
The following applications are defined in sites/vie/home-assistant/.
- ecowitt2mqtt - Forwards data received from ecowitt devices to the MQTT broker.
- emqx
- home-assistant
- node-red
- Ring MQTT - Amazon Ring devices to MQTT bridge.
- telegraf
- zwave
- Faster Whisper - Faster Whisper transcription with CTranslate2. (testing)
- OpenWakeWord - An open-source audio wake word (or phrase) detection framework. (testing)
- Piper - A local TTS server. (testing)
The Hochschule Burgenland site contains the applications used in the Hochschule Burgenland lectures.
This site is under active development.
The current backup and restore strategy consists of:
- CloudNativePG backups for persistent PostgreSQL data
- Home Assistant: see next section
- Velero as a second layer disaster recovery for critical workloads
Timewise, the layers of backups follow the strategy:
- 12:00am: in-application backups
- 02:00am: Velero backups
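The 02:00am Velero layer could be expressed as a Velero Schedule resource with a cron expression. The helper below is a minimal sketch under stated assumptions: the schedule name, the `velero` namespace, the included namespace, and the retention (`ttl`) are hypothetical and not taken from this repository, and how the cron time maps to local time depends on the Velero server's time zone.

```shell
#!/bin/sh
# Print a Velero Schedule manifest for a nightly backup at 02:00.
# Assumptions: the schedule name, the "velero" namespace, the included
# namespace, and the TTL are illustrative values only.
render_velero_schedule() {
  name="$1"
  namespace="$2"
  cat <<EOF
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: ${name}
  namespace: velero
spec:
  schedule: "0 2 * * *"
  template:
    includedNamespaces:
      - ${namespace}
    ttl: 168h0m0s
EOF
}

# Example: a nightly schedule for a hypothetical "home-assistant" namespace.
render_velero_schedule nightly-home-assistant home-assistant
```

Running Velero two hours after the in-application backups leaves time for those backups to finish before the second layer captures the workloads.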
Home Assistant related backup and restore is handled via S3 backups.
The following services implement an initContainer as well as a nightly CronJob to back up data to an S3 bucket. If no data is found in the Persistent Volume yet, the data from the S3 bucket is retrieved and copied over, which results in a full restore.
- Ring MQTT
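The restore-on-empty behaviour of such an initContainer can be sketched as a small shell function. This is an illustration only, not the script used in this repository: the function name is hypothetical, and the restore command is passed in as arguments so that any S3 client could be substituted.

```shell
#!/bin/sh
# Restore data into a volume only when the volume is still empty.
# Assumptions: the function name and passing the restore command as
# arguments are illustrative; the real initContainer scripts may differ.
restore_if_empty() {
  data_dir="$1"
  shift
  # "ls -A" lists everything except "." and "..", so empty output means
  # the directory is empty (or does not exist yet).
  if [ -z "$(ls -A "$data_dir" 2>/dev/null)" ]; then
    echo "no data in ${data_dir}, restoring from backup"
    "$@"
  else
    echo "data found in ${data_dir}, skipping restore"
  fi
}

# Demo with a temporary directory and "touch" as a stand-in restore command.
demo_dir="$(mktemp -d)"
restore_if_empty "$demo_dir" touch "$demo_dir/restored"  # empty: restore runs
restore_if_empty "$demo_dir" touch "$demo_dir/again"     # not empty: skipped
rm -rf "$demo_dir"
```

In a real initContainer the stand-in command would be replaced by an S3 download, for example `restore_if_empty /data aws s3 sync s3://<bucket>/<prefix> /data`, where the bucket and prefix are placeholders.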
The following services use API calls to determine whether a backup or restore is necessary.
- Node-RED
- Home Assistant
- Z-Wave JS UI
The following services also have Git repositories to store their configuration which gets pulled in upon start.
- Home Assistant
  - Home Assistant also defines its own backup method via a trigger and a shell_command, and doesn't rely on a CronJob.
- Ring MQTT
KRR (Kubernetes Resource Recommender) can be used to analyze the resource usage of the cluster and to recommend optimized resource requests and limits.
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/krr/refs/heads/main/docs/krr-in-cluster/krr-in-cluster-job.yaml
kubectl logs -l batch.kubernetes.io/job-name=krr > krr.txt
kubectl delete -f https://raw.githubusercontent.com/robusta-dev/krr/refs/heads/main/docs/krr-in-cluster/krr-in-cluster-job.yaml
By default, this generates a text file with the recommendations. To output another format, use the -f flag followed by the desired format.
If using JSON, you can use the jq command to sum up the current and recommended values:
# sum of all current CPU requests
jq '.scans[].object.allocations.requests.cpu | select(. != "?") | select(. != null)' krr.json | awk '{ sum += $0 } END { print sum }'
# sum of all recommended CPU requests
jq '.scans[].recommended.requests.cpu | select(.value != "?") | .value' krr.json | awk '{ sum += $0 } END { print sum }'
# sum of all current memory requests (divided by ~2^30, i.e. roughly GiB)
jq '.scans[].object.allocations.requests.memory | select(. != "?") | select(. != null)' krr.json | awk '{ sum += $0 } END { print sum/1.074e9 }'
# sum of all current memory limits (divided by ~2^30, i.e. roughly GiB)
jq '.scans[].object.allocations.limits.memory | select(. != "?") | select(. != null)' krr.json | awk '{ sum += $0 } END { print sum/1.074e9 }'
# sum of all recommended memory requests (= limits), roughly GiB
jq '.scans[].recommended.requests.memory | select(.value != "?") | .value' krr.json | awk '{ sum += $0 } END { print sum/1.074e9 }'
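The awk step repeated in the commands above can be factored into two small helpers. This is only a sketch: the function names are hypothetical, and it assumes that the memory values in the KRR JSON output are byte counts, which is what the divisor 1.074e9 (approximately 2^30 = 1073741824) suggests. The helpers use the exact value instead of the approximation.

```shell
#!/bin/sh
# Sum one number per line from standard input.
sum_lines() {
  awk '{ sum += $0 } END { print sum + 0 }'
}

# Sum byte values (one per line) and print the total in GiB (1 GiB = 2^30 bytes).
# Assumption: the input values are plain byte counts.
sum_bytes_as_gib() {
  awk '{ sum += $0 } END { print sum / 1073741824 }'
}

# Examples with literal input instead of jq output.
printf '0.5\n0.25\n' | sum_lines                     # prints 0.75
printf '1073741824\n536870912\n' | sum_bytes_as_gib  # prints 1.5
```

With these helpers, a jq filter from above can be piped into `sum_lines` or `sum_bytes_as_gib` instead of repeating the awk program on every line.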
- GitHub Actions lint all YAML files.
- Renovate Bot updates Helm releases, the container images used in the values.yaml files, and GitHub Actions.