GitOps repository for the HP-Fury machine.
To get network access to the machine:
- Log in to tailscale.com using your Xebia e-mail address and install the Tailscale VPN client. Ping Julian to have your Tailscale account approved.
- Once your account is approved, connect to the Tailscale VPN and check that you can reach the machine:

```bash
ping hp-fury.tail6720f8.ts.net
```
To set up access for Kubernetes:
- Install `kubectl` on your machine: `brew install kubernetes-cli`.
- Optional, but highly recommended:
  - Install K9s for an interactive CLI interface for K8s: `brew install k9s`.
  - Install kubectx to easily switch between K8s contexts and namespaces: `brew install kubectx`.
  - Install kustomize to manually deploy applications to the cluster using Helm through kustomize.
  - Install helm so that kustomize can build Helm charts.
- Look for the `HP-Fury - kube-config` secret in the Xebia Data 1Password and copy the `kube-config.yaml` file to `~/.kube/config` on your machine (or merge it with your existing config if you already have one).
- Verify that you can connect to the cluster by running `kubectl cluster-info`. This should return details from the cluster at `hp-fury.tail6720f8.ts.net`. If you see another cluster, check that you have the right context selected for `kubectl`.
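If the wrong context is active (for example because you work with multiple clusters), you can check and switch contexts as sketched below; `<hp-fury-context>` is a placeholder for whatever the context is called in your kube-config:

```bash
# List all contexts known to kubectl; the active one is marked with '*'.
kubectl config get-contexts

# Switch to the HP-Fury context using plain kubectl...
kubectl config use-context <hp-fury-context>

# ...or, if you installed kubectx:
kubectx <hp-fury-context>
```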
If you need SSH access, please ping Julian to set up an account for you on the machine.
When developing on this machine, please adhere to the following guidelines:
- Don't use the default namespace.
- Create separate namespace(s) for your application(s). This can either be:
  - A personal namespace in your name (e.g. `jrderuiter`).
  - A namespace named after the application (e.g. `argocd`).
  - A namespace grouping multiple applications (e.g. `monitoring`).
- Don't put any super sensitive data on the machine.
- Don't store any secrets in Git. (Use sealed-secrets or the secret generator instead, as outlined below).
- Ping our Slack channel `#hp-zp-g5` whenever you plan to use significant resources (such as the GPUs). This is both to check whether they're available and to notify others.
- In general: be mindful of others!
To add an application or other Kubernetes resources:
- Create a branch on this repo (e.g. `feature/add-langfuse`).
- Create a folder under `applications` (e.g. `applications/<name>`) for your resources.
If you're using kustomize (recommended), please use the following folder structure:
- patches/ (directory for kustomize patches, if needed)
- resources/ (directory for your manifest files)
  - namespace.yml
  - ...
- kustomization.yml
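As a minimal sketch (the `my-app` names and the `deployment.yml` file are hypothetical), `resources/namespace.yml` and `kustomization.yml` could look like this:

```yaml
# resources/namespace.yml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
---
# kustomization.yml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: my-app
resources:
  - resources/namespace.yml
  - resources/deployment.yml  # Hypothetical application manifest.
```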
For Helm-based applications, you can use Helm within Kustomize (see `applications/sealed-secrets` for an example).
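A rough sketch of such a `kustomization.yml` with a `helmCharts` entry is shown below; the chart name, repository and values are hypothetical, so refer to `applications/sealed-secrets` for a real, working example:

```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: my-app
resources:
  - resources/namespace.yml
helmCharts:
  - name: my-chart                     # Hypothetical chart name.
    repo: https://charts.example.com   # Hypothetical chart repository.
    version: 1.2.3
    releaseName: my-app
    namespace: my-app
    valuesInline:
      someSetting: some-value          # Hypothetical chart values.
```

Building such an overlay requires kustomize's `--enable-helm` flag, which the deployment command below already includes.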
During development, you can deploy your Kubernetes resources manually using kubectl, e.g.:
```bash
# For plain Kustomize
kubectl apply -k applications/<name>
```

or

```bash
# For Kustomize with Helm
kustomize build --enable-helm applications/<name> | kubectl apply -f -
```
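To remove manually deployed resources again, the same overlays can be passed to `kubectl delete`:

```bash
# For plain Kustomize
kubectl delete -k applications/<name>

# For Kustomize with Helm
kustomize build --enable-helm applications/<name> | kubectl delete -f -
```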
For longer-running applications, you can deploy your resources via ArgoCD. The main advantage of this approach is that ArgoCD will monitor your application and ensure that it remains in sync with the manifests defined in this repository. Additionally, it will monitor the application's health and warn you if there are any problems.
To deploy your application via ArgoCD:
- On your feature branch, define your application resources under `applications/<name>` as outlined above.
- On the main branch, create an `Application` manifest for your application under `cluster/applications`, replacing `<name>`, `<namespace>` and `<branch>` with the correct values:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  # Name of your application.
  name: <name>
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "0"
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/godatadriven/hp-fury.git
    # Branch under which to find your application manifests.
    targetRevision: <branch>
    # Path under which to find your application manifests.
    path: applications/<name>
  destination:
    server: https://kubernetes.default.svc
    # Namespace where your resources should be deployed.
    namespace: <namespace>
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```
- Push your changes.
- Open the ArgoCD UI at https://argocd.hp-fury.internal and log in (username and password are in the Xebia 1PW).
- On the ArgoCD applications page, refresh the `applications` application. ArgoCD should find and start deploying your application.
- If you run into any issues, fix the issues on your feature branch and push the changes.
- When you're done, merge your changes and change the application manifest (`cluster/applications/<name>.yml`) to deploy the application from the main branch instead of your feature branch.
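For example, after merging, the `source` section of the manifest above would end up pointing at the main branch:

```yaml
source:
  repoURL: https://github.com/godatadriven/hp-fury.git
  targetRevision: main
  path: applications/<name>
```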
To safely save secrets alongside your application manifests in Git, we currently provide two different approaches depending on your needs.
If you don't care about the exact value of a secret (for example, if the secret is used for two applications to talk to each other), you can ask kubernetes-secret-generator to generate a random secret for you. See their documentation for more details on how to do so.
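As a rough sketch, assuming the cluster runs the mittwald kubernetes-secret-generator with its default annotation (their documentation is the authority on the exact annotation and options), requesting an auto-generated secret looks roughly like this:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-app-credentials   # Hypothetical secret name.
  namespace: my-app          # Hypothetical namespace.
  annotations:
    # Ask the generator to fill the 'password' field with a random value.
    secret-generator.v1.mittwald.de/autogenerate: password
type: Opaque
data: {}
```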
If you do care about the value of a secret (for example, if you want to use the secret to login to a service), you can use sealed-secrets to encrypt your secret(s) and commit the encrypted values in Git. The sealed-secrets service in the cluster will then automatically decrypt the secrets when your application is deployed.
To encrypt a secret using sealed secrets:
- Install `kubeseal` and `yq` using `brew install kubeseal yq`.
- Encrypt your secret using `kubeseal`, e.g.:

```bash
kubectl create secret generic <secret-name> \
    --namespace <namespace> \
    --dry-run=client -o json \
    --from-literal=<name>=<value> --from-literal=... \
  | kubeseal --controller-namespace sealed-secrets --controller-name sealed-secrets \
  | yq -p json
```
- Add the resulting YAML output to your application resources.
Note that you can't change the namespace of the generated sealed secret. If you want to change the namespace, you'll have to re-encrypt the secrets. (This is intentional to stop other users from using your encrypted secret in their own namespace.)
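The command above outputs a `SealedSecret` manifest that should look roughly like this (the encrypted values are much longer in practice):

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: <secret-name>
  namespace: <namespace>
spec:
  encryptedData:
    <name>: AgBy...   # Encrypted value produced by kubeseal.
  template:
    metadata:
      name: <secret-name>
      namespace: <namespace>
```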
To request a GPU for a container, configure the following resource limit/request on the container:
```yaml
resources:
  limits:
    nvidia.com/gpu: 1
```
This will instruct the gpu-operator to inject the GPU and required libraries into your container.
If you want to expose a service via HTTP(S) (e.g. for a web UI), you need to configure an ingress for the service.
- Pick an address for your application, e.g. `my-app.hp-fury.internal`.
- Add this DNS entry in Pihole by adding it in the `customDnsEntries` section in `applications/pihole/kustomization.yml`. It's probably easiest to make these changes directly on the main branch, in which case ArgoCD should pick up the changes if you refresh Pihole from the UI.
- Add an ingress resource with this address in your application resources:
```yaml
# See applications/argocd/resources/ingress.yml for the full example.
spec:
  ingressClassName: traefik
  rules:
    - host: my-app.hp-fury.internal # DNS address of your application, as added in Pihole.
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app # Name of your service; should point to your service.
                port:
                  name: http # Name of the service port; should match your service.
```
This will allow you to access your service at https://my-app.hp-fury.internal in your browser.
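For reference, a minimal sketch of a Service matching the ingress above might look like this (the `app: my-app` label and port 8000 are hypothetical; adjust them to your deployment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app         # Matches backend.service.name in the ingress.
  namespace: my-app    # Hypothetical namespace.
spec:
  selector:
    app: my-app        # Hypothetical pod label from your deployment.
  ports:
    - name: http       # Matches backend.service.port.name in the ingress.
      port: 80
      targetPort: 8000 # Hypothetical container port.
```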
The below steps were used to bootstrap the machine and are kept for reference.
- Install Tailscale following the official guide.
- Install openssh-server using apt-get and harden the configuration (e.g. public-key authentication only).
- Enable UFW, allowing access for SSH.
- Allow the K3s network ranges in UFW (see the sketch after this list).
- Install K3S following the official guide.
- (Optional) Add your Tailscale DNS name to `/etc/rancher/k3s/config.yaml` and run `k3s certificate rotate` to generate certificates including your Tailscale DNS name:

```yaml
tls-san:
  - "hp-fury.tail6720f8.ts.net"
```
- Create the initial ArgoCD installation using `kubectl apply -k applications/argocd`.
- Use K9s to forward port 8080 from the argocd-server service and open the ArgoCD UI at http://localhost:8080.
- Retrieve the initial login password from the K8s secret `argocd-initial-admin-secret` in the cluster and log in using `admin`/`<initial-password>`.
- In the UI, go to User Info > Update Password and change the admin user password to something secret.
- Configure any required repositories under Settings > Repositories. For GitHub, use HTTPS with a fine-grained access token as password. The token only needs read permissions on the repo content.
- Deploy the applications using `kubectl apply -k cluster`.
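As referenced above, allowing SSH and the K3s network ranges through UFW looked roughly like the sketch below; the CIDRs are the K3s defaults, so adjust them if the cluster uses different ranges:

```bash
sudo ufw allow ssh
sudo ufw allow 6443/tcp                  # K3s API server
sudo ufw allow from 10.42.0.0/16 to any  # K3s pods (default CIDR)
sudo ufw allow from 10.43.0.0/16 to any  # K3s services (default CIDR)
sudo ufw enable
```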
- Run the following commands to install the drivers, as detailed here.
```bash
distro=ubuntu2204
arch=x86_64
wget https://developer.download.nvidia.com/compute/cuda/repos/$distro/$arch/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt install cuda-drivers cuda-toolkit nvidia-gds nvidia-container-toolkit
```
- Check whether the drivers were installed correctly using `nvidia-smi`.
- Enable the nvidia container runtime as the default for k3s by editing `/etc/rancher/k3s/config.yaml` and adding the setting `default-runtime: "nvidia"`.
- Install the gpu-operator from `applications/gpu-operator` (e.g. using ArgoCD).
- Test your installation by running an example application, e.g.:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-operator-test
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # You either need to specify this env var or the resource limit
      # request below to ensure access to the GPU.
      env:
        - name: NVIDIA_VISIBLE_DEVICES
          value: all
      image: nvcr.io/nvidia/cloud-native/gpu-operator-validator:v25.3.0
      command: [/bin/sh, -c]
      args: [vectorAdd]
      resources:
        limits:
          nvidia.com/gpu: 1
  # Needed if the default runtime for k3s isn't set to nvidia.
  # runtimeClassName: nvidia
```
- Before we can install Pihole, we need to disable systemd-resolved to free up port 53 (see the sketch after this list): https://gist.github.com/zoilomora/f7d264cefbb589f3f1b1fc2cea2c844c.
- Next, install Pihole on the cluster (`applications/pihole`), adding any local addresses under `customDnsEntries`.
- Once Pihole is running, check if you can query Pihole using dig, e.g. using `dig google.com @<ip-of-machine-running-pihole>`.
- In the Tailscale admin console, open the DNS tab. Under nameservers, add the IP of the machine running Pihole and enable `Override DNS servers`.
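Freeing up port 53 followed the gist linked above; roughly, and depending on the Ubuntu version, it boils down to something like this sketch:

```bash
# Stop and disable systemd-resolved so its stub listener releases port 53.
sudo systemctl disable --now systemd-resolved

# Point /etc/resolv.conf at a regular resolver instead of the local stub
# (127.0.0.53); the nameserver here is just an example.
echo "nameserver 1.1.1.1" | sudo tee /etc/resolv.conf
```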