diff --git a/site-src/guides/index.md b/site-src/guides/index.md index b0faa4971..b84f90228 100644 --- a/site-src/guides/index.md +++ b/site-src/guides/index.md @@ -83,6 +83,84 @@ Tooling: kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml ``` +### Install the Gateway + + 1. Requirements + - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) Installed. + + Choose one of the following options to install Gateway. + +=== "GKE" + + 1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary. + + See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway) for detailed instructions. + +=== "Istio" + + Please note that this feature is currently in an experimental phase and is not intended for production use. + The implementation and user experience are subject to changes as we continue to iterate on this project. + + 1. Install Istio + + ``` + TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev) + # on Linux + wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz + tar -xvf istioctl-$TAG-linux-amd64.tar.gz + # on macOS + wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz + tar -xvf istioctl-$TAG-osx.tar.gz + # on Windows + wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip + unzip istioctl-$TAG-win.zip + + ./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true + ``` + +=== "Kgateway" + + [Kgateway](https://kgateway.dev/) added Inference Gateway support as a **technical preview** in the + [v2.0.0 release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.0.0). InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published. + + 1. Requirements + + - [Helm](https://helm.sh/docs/intro/install/) installed. + + 2. Set the Kgateway version and install the Kgateway CRDs. + + ```bash + KGTW_VERSION=v2.1.0-main + helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds + ``` + + 3. Install Kgateway + + ```bash + helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true + ``` + +=== "Agentgateway" + + [Agentgateway](https://agentgateway.dev/) is a purpose-built proxy designed for AI workloads, and comes with native support for Inference Gateway. Agentgateway integrates with [Kgateway](https://kgateway.dev/) as it's control plane. InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published. + + 1. Requirements + + - [Helm](https://helm.sh/docs/intro/install/) installed. + + 2. Set the Kgateway version and install the Kgateway CRDs. + + ```bash + KGTW_VERSION=v2.1.0-main + helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds + ``` + + 3. Install Kgateway + + ```bash + helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentGateway.enabled=true + ``` + ### Deploy the InferencePool and Endpoint Picker Extension Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, inferencepool along with provider specific resources. @@ -135,17 +213,13 @@ Tooling: oci://registry.k8s.io/gateway-api-inference-extension/charts/inferencepool ``` -### Deploy an Inference Gateway +### Deploy the Inference Gateway Choose one of the following options to deploy an Inference Gateway. === "GKE" - 1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary. - See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway) - for detailed instructions. - - 2. Deploy Inference Gateway: + 1. Deploy the Inference Gateway: ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml @@ -158,13 +232,13 @@ Tooling: NAME CLASS ADDRESS PROGRAMMED AGE inference-gateway inference-gateway True 22s ``` - 3. Deploy the HTTPRoute + 2. Deploy the HTTPRoute ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml ``` - 4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`: + 3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`: ```bash kubectl get httproute llm-route -o yaml @@ -172,31 +246,7 @@ Tooling: === "Istio" - Please note that this feature is currently in an experimental phase and is not intended for production use. - The implementation and user experience are subject to changes as we continue to iterate on this project. - - 1. Requirements - - - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. - - 2. Install Istio - - ``` - TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev) - # on Linux - wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz - tar -xvf istioctl-$TAG-linux-amd64.tar.gz - # on macOS - wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz - tar -xvf istioctl-$TAG-osx.tar.gz - # on Windows - wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip - unzip istioctl-$TAG-win.zip - - ./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true - ``` - - 3. Deploy Gateway + 1. Deploy the Inference Gateway ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml @@ -209,13 +259,13 @@ Tooling: inference-gateway inference-gateway True 22s ``` - 4. Deploy the HTTPRoute + 2. Deploy the HTTPRoute ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml ``` - 5. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`: + 3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`: ```bash kubectl get httproute llm-route -o yaml @@ -223,28 +273,7 @@ Tooling: === "Kgateway" - [Kgateway](https://kgateway.dev/) added Inference Gateway support as a **technical preview** in the - [v2.0.0 release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.0.0). InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published. - - 1. Requirements - - - [Helm](https://helm.sh/docs/intro/install/) installed. - - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. - - 2. Set the Kgateway version and install the Kgateway CRDs. - - ```bash - KGTW_VERSION=v2.1.0-main - helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds - ``` - - 3. Install Kgateway - - ```bash - helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true - ``` - - 4. Deploy the Gateway + 1. Deploy the Inference Gateway ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml @@ -257,13 +286,13 @@ Tooling: inference-gateway kgateway True 22s ``` - 5. Deploy the HTTPRoute + 2. Deploy the HTTPRoute ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml ``` - 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`: + 3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`: ```bash kubectl get httproute llm-route -o yaml @@ -271,27 +300,7 @@ Tooling: === "Agentgateway" - [Agentgateway](https://agentgateway.dev/) is a purpose-built proxy designed for AI workloads, and comes with native support for Inference Gateway. Agentgateway integrates with [Kgateway](https://kgateway.dev/) as it's control plane. InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published. - - 1. Requirements - - - [Helm](https://helm.sh/docs/intro/install/) installed. - - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed. - - 2. Set the Kgateway version and install the Kgateway CRDs. - - ```bash - KGTW_VERSION=v2.1.0-main - helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds - ``` - - 3. Install Kgateway - - ```bash - helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentGateway.enabled=true - ``` - - 4. Deploy the Gateway + 1. Deploy the Inference Gateway ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/gateway.yaml @@ -304,13 +313,13 @@ Tooling: inference-gateway agentgateway True 22s ``` - 5. Deploy the HTTPRoute + 2. Deploy the HTTPRoute ```bash kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/httproute.yaml ``` - 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`: + 3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`: ```bash kubectl get httproute llm-route -o yaml