Model Validation Controller

This project is a proof of concept based on the sigstore/model-transparency-cli. It provides a Kubernetes/OpenShift operator that validates AI models before workloads pick them up. A webhook injects an init container into pods to perform the validation. A custom resource defines how models are validated, for example with Sigstore or public keys.

Features

Model Validation: Ensures AI models are validated before they are used by workloads.
Webhook Integration: A webhook automatically injects an init container into pods to perform the validation step.
Custom Resource: Configurable ModelValidation custom resource to specify how models should be validated.
- Supports methods such as Sigstore, PKI, or public key validation.

Prerequisites

Kubernetes 1.29+ or OpenShift 4.16+
Proper configuration for model validation (e.g., Sigstore, public keys)
A signed model (see the testdata or examples folder for samples)

Installation

The operator can be installed via kustomize using different deployment overlays.

Production Deployment

For production environments with cert-manager integration:

kubectl apply -k https://raw.githubusercontent.com/sigstore/model-validation-operator/main/config/overlays/production
# or local
kubectl apply -k config/overlays/production

Testing Deployment

For testing environments with manual certificate management:

kubectl apply -k https://raw.githubusercontent.com/sigstore/model-validation-operator/main/config/overlays/testing
# or local
kubectl apply -k config/overlays/testing

Development Deployment

For development environments, deploying the operator without the webhook integration:

kubectl apply -k https://raw.githubusercontent.com/sigstore/model-validation-operator/main/config/overlays/development
# or local
kubectl apply -k config/overlays/development

OLM Deployment

For OpenShift/OLM environments:

kubectl apply -k https://raw.githubusercontent.com/sigstore/model-validation-operator/main/config/overlays/olm
# or local
kubectl apply -k config/overlays/olm

Uninstall

To uninstall the operator, use the same overlay you used for installation:

kubectl delete -k config/overlays/production
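
Because installation and removal must target the same overlay, a small wrapper can make the pairing harder to get wrong. The following is a minimal sketch, not part of the project: the function name mvo_overlay is hypothetical, and it assumes the repository root is the current working directory.

```shell
# Hypothetical helper: run "kubectl apply" or "kubectl delete" against one of
# the four overlays in config/overlays/. Assumes the repository root is the
# current working directory.
mvo_overlay() {
  local action="$1" overlay="$2"

  # Only the two actions used in this README are accepted.
  case "$action" in
    apply|delete) ;;
    *) echo "usage: mvo_overlay apply|delete OVERLAY" >&2; return 2 ;;
  esac

  # Reject overlay names that do not exist in config/overlays/.
  case "$overlay" in
    production|testing|development|olm) ;;
    *) echo "unknown overlay: $overlay" >&2; return 2 ;;
  esac

  kubectl "$action" -k "config/overlays/$overlay"
}
```

With this in place, mvo_overlay apply testing installs the testing overlay and mvo_overlay delete testing removes it again.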

Configuration Structure

The operator uses a kustomize-based overlay configuration structure that separates generated content from environment-specific content:

config/
├── crd/                      # Custom Resource Definitions
├── rbac/                     # RBAC permissions
├── webhook/                  # Webhook configuration
├── manager/                  # Controller manager deployment
├── manifests/                # OLM manifests
├── components/               # Reusable components
│   ├── webhook/              # Webhook service component
│   ├── certmanager/          # Certificate manager component
│   ├── manual-tls/           # Manual TLS configuration
│   ├── metrics-port/         # Metrics configuration
│   └── webhook-replacements/ # Webhook configuration replacements
└── overlays/                 # Environment-specific overlays
    ├── production/           # Production (cert-manager)
    ├── development/          # Development (operator only, no webhooks)
    ├── testing/              # Testing (manual, self-signed certs)
    └── olm/                  # OpenShift/OLM

Certificate Management

The operator supports different certificate management approaches:

Production: Uses cert-manager for automatic certificate management
- ⚠️ Important: The default cert-manager configuration uses self-signed certificates
- For production environments, you should configure cert-manager with a proper CA issuer
Development: Does not use certificates because this overlay contains no webhook configuration
Testing: Uses manual, self-signed certificate management for testing scenarios
OLM: Uses OLM's built-in certificate management for OpenShift deployments

Running the Webhook Server Locally

The webhook server requires TLS certificates. When you run the operator locally, certificates will be generated automatically:

make run

This command will start the webhook server on https://localhost:9443, using the generated certs.

Known limitations

The project is at an early stage and therefore has some limitations.

There is no validation or defaulting for the custom resource.
The validation is namespace scoped and cannot be used across multiple namespaces.
No more than one validation resource can be used per namespace.
There are no status fields for the custom resource.
The model and signature paths must be specified explicitly; there is no auto-discovery.
TLS certificates used by the webhook are self-generated.

Usage

First, a ModelValidation CR must be created as follows:

apiVersion: ml.sigstore.dev/v1alpha1
kind: ModelValidation
metadata:
  name: demo
spec:
  config:
    sigstoreConfig:
      certificateIdentity: "https://github.com/sigstore/model-validation-operator/.github/workflows/sign-model.yaml@refs/tags/v0.0.2"
      certificateOidcIssuer: "https://token.actions.githubusercontent.com"
  model:
    path: /data/tensorflow_saved_model
    signaturePath: /data/tensorflow_saved_model/model.sig

Every pod in the custom resource's namespace that carries the label validation.ml.sigstore.dev/ml: "true" is validated. Pods that receive the label only after they have been created are not validated.

apiVersion: v1
kind: Pod
metadata:
  name: whatever-workload
  labels:
    validation.ml.sigstore.dev/ml: "true"
spec:
  restartPolicy: Never
  containers:
  - name: whatever-workload
    image: nginx
    ports:
    - containerPort: 80
    volumeMounts:
    - name: model-storage
      mountPath: /data
  volumes:
  - name: model-storage
    persistentVolumeClaim:
      claimName: models
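
To see which pods in a namespace carry the validation label, a label selector query is enough. The helper below is a minimal sketch; the function name labeled_pods is hypothetical, while the label key and value are the ones shown above.

```shell
# Hypothetical helper: list the pods in a namespace that carry the
# validation label described above.
labeled_pods() {
  local namespace="$1"
  kubectl get pods -n "$namespace" -l 'validation.ml.sigstore.dev/ml=true' -o name
}
```

For example, labeled_pods testing prints one pod name per line. Since Kubernetes does not allow init containers to be added to an existing pod, a pod that was created without the label would presumably have to be deleted and recreated with the label in its manifest before it can be validated.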

Examples

The examples folder contains example files for testing the operator.

Prerequisites for Examples

Before running the examples, create a namespace for testing (separate from the operator namespace):

kubectl create namespace testing

Important: Do not deploy examples in the operator namespace (e.g., model-validation-operator-system). The operator namespace has the label validation.ml.sigstore.dev/ignore: "true" which prevents the webhook from processing pods in that namespace.
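
Before deploying examples, it can help to confirm that the target namespace does not carry the ignore label. The following is a minimal sketch; the function name namespace_ignored is hypothetical, and the label key is the one named above.

```shell
# Hypothetical helper: succeed (exit status 0) if the namespace carries the
# label validation.ml.sigstore.dev/ignore with the value "true".
namespace_ignored() {
  local namespace="$1" value
  # Dots inside the label key are escaped so that jsonpath treats the key
  # as a single field name.
  value="$(kubectl get namespace "$namespace" \
    -o jsonpath='{.metadata.labels.validation\.ml\.sigstore\.dev/ignore}')"
  [ "$value" = "true" ]
}
```

A call such as namespace_ignored testing && echo "webhook skips this namespace" then gives a quick answer before any pods are created.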

Example Files

prepare.yaml: Contains a persistent volume claim and a job that downloads a signed test model.

kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/prepare.yaml -n testing
# or local
kubectl apply -f examples/prepare.yaml -n testing

verify.yaml: Contains a ModelValidation manifest for this model and a demo pod that carries the label required for validation.

kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/verify.yaml -n testing
# or local
kubectl apply -f examples/verify.yaml -n testing

unsigned.yaml: Contains an example of a pod that would fail validation (for testing purposes).

kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/unsigned.yaml -n testing
# or local
kubectl apply -f examples/unsigned.yaml -n testing

After the examples have been applied, the logs of the generated job should show a successful download:

$ kubectl logs -n testing job/download-extract-model 
Connecting to github.com (140.82.121.3:443)
Connecting to objects.githubusercontent.com (185.199.108.133:443)
saving to '/data/tensorflow_saved_model.tar.gz'
tensorflow_saved_mod  44% |**************                  | 3983k  0:00:01 ETA
tensorflow_saved_mod 100% |********************************| 8952k  0:00:00 ETA
'/data/tensorflow_saved_model.tar.gz' saved
./
./model.sig
./variables/
./variables/variables.data-00000-of-00001
./variables/variables.index
./saved_model.pb
./fingerprint.pb

The operator logs should show that a pod has been modified:

$ kubectl logs -n model-validation-operator-system deploy/model-validation-controller-manager
time=2025-01-20T22:13:05.051Z level=INFO msg="Starting webhook server on :9443"
time=2025-01-20T22:13:47.556Z level=INFO msg="new request, path: /mutate-v1-pod"
time=2025-01-20T22:13:47.557Z level=INFO msg="Execute webhook"
time=2025-01-20T22:13:47.560Z level=INFO msg="Search associated Model Validation CR" pod=whatever-workload namespace=testing
time=2025-01-20T22:13:47.591Z level=INFO msg="construct args"
time=2025-01-20T22:13:47.591Z level=INFO msg="found sigstore config"

Finally, the test pod should be running, and the logs of the injected init container should show that the model was validated successfully.

$ kubectl logs -n testing whatever-workload model-validation
INFO:__main__:Creating verifier for sigstore
INFO:tuf.api._payload:No signature for keyid f5312f542c21273d9485a49394386c4575804770667f2ddb59b3bf0669fddd2f
INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
INFO:__main__:Verifying model signature from /data/model.sig
INFO:__main__:all checks passed
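
Besides reading the logs, the result can be checked from the pod status. The sketch below queries the exit code of the init container named model-validation, which is the container name used in the logs command above. The function name validation_exit_code is hypothetical, and the reading that exit code 0 means success assumes the validation container follows the usual convention.

```shell
# Hypothetical helper: print the exit code of the injected init container
# (named "model-validation" in the logs command above) for a given pod.
# Prints nothing if the container has not terminated yet.
validation_exit_code() {
  local namespace="$1" pod="$2"
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{.status.initContainerStatuses[?(@.name=="model-validation")].state.terminated.exitCode}'
}
```

For the demo pod this would be invoked as validation_exit_code testing whatever-workload.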

If the model has been modified, validation fails and the workload is not executed:

ERROR:__main__:verification failed: the manifests do not match
