Docs: Updates Benchmark Guide #789
Conversation
Signed-off-by: Daneyon Hansen <[email protected]>
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: danehans. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
✅ Deploy Preview for gateway-api-inference-extension ready!
/lgtm
/hold if you want to address the nits.
Thank you for the update, it looks much cleaner!
```bash
git clone https://github.com/kubernetes-sigs/gateway-api-inference-extension
cd gateway-api-inference-extension
```
```diff
-1. Get the target IP. Examples below show how to get the IP of a gateway or a LoadBalancer k8s service.
+1. Get the target IP. The examples below shows how to get the IP of a gateway or a k8s service.
```
```suggestion
1. Get the target IP. The example below shows how to get the IP of a gateway or a k8s service.
```
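Once the target IP is in hand, the benchmark target is just the IP plus the port (port `80` per this PR). A minimal sketch is below; the IP value is a placeholder, and in practice it would come from something like `kubectl get gateway <name> -o jsonpath='{.status.addresses[0].value}'` for a gateway, or `kubectl get service <name> -o jsonpath='{.status.loadBalancer.ingress[0].ip}'` for a LoadBalancer service:

```shell
# Sketch: assemble the benchmark target URL from the IP and port.
# The IP below is a placeholder, not a value from this PR.
IP="10.0.0.1"   # placeholder; fetch the real IP via kubectl as noted above
PORT="80"       # this PR standardizes on port 80
TARGET="http://${IP}:${PORT}"
echo "$TARGET"  # prints http://10.0.0.1:80
```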
```diff
-kubectl scale --replicas=8 -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/gpu-deployment.yaml
+kubectl scale deployment vllm-llama3-8b-instruct --replicas=8
```
nit: I suggest changing replicas to 6, as the example at the end uses 6 replicas and the new regression test PR also uses 6.
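A sketch of the reviewer's suggestion, using the deployment name from the diff in this thread; the command is printed rather than executed so it can be reviewed before running it against a cluster:

```shell
# Scale to 6 replicas so the count matches the example at the end of the
# guide. Echoed instead of run; copy the printed command to apply it.
CMD="kubectl scale deployment vllm-llama3-8b-instruct --replicas=6"
echo "$CMD"
```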
```diff
@@ -37,7 +37,7 @@ spec:
         - name: BACKEND
           value: vllm
         - name: PORT
-          value: "8081"
+          value: "80"
```
Great! 8081 was the port back then when we had envoy patches.
/lgtm
Signed-off-by: Daneyon Hansen <[email protected]>
- `config/manifests/benchmark/benchmark.yaml`: Updates the target port to `80`, since the guide only sets the `target-ip` and most gateways use port 80 by default.
- `site-src/performance/benchmark/index.md`: Provides additional explanation in the steps to help guide users. Adds a note to use the GPU-based vLLM deployment for benchmarking. Updates the `benchmark_id` value to match the labels in `tools/benchmark/benchmark.ipynb`.
- `tools/benchmark/benchmark.ipynb`: Sets the default run id and removes the undefined `INTERACTIVE_PLOT` variable.