The LPG benchmark tool works by sending traffic to the specified target IP and port, and collecting the results. Follow the steps below to run a single benchmark. Multiple LPG instances can be deployed to run benchmarks in parallel against different targets.

1. Get the target IP. The examples below show how to get the IP of a gateway or a k8s service.
```bash
# Get gateway IP
# ...

echo $SVC_IP
```
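The IPs can typically be read straight from the cluster. A minimal sketch — the resource names (`inference-gateway`, `my-model-service`) are placeholders for your own, and the jsonpath expressions assume a standard Gateway API gateway and a `LoadBalancer` k8s service:

```shell
# Gateway API: the assigned address is reported in the gateway status
GW_IP=$(kubectl get gateway inference-gateway \
  -o jsonpath='{.status.addresses[0].value}')
echo $GW_IP

# LoadBalancer service: the external IP is reported in the service status
SVC_IP=$(kubectl get service my-model-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $SVC_IP
```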
1. Then update the `<target-ip>` in `./config/manifests/benchmark/benchmark.yaml` to the value of `$SVC_IP` or `$GW_IP`. Feel free to adjust other parameters such as `request_rates` as well. For a complete list of LPG configurations, refer to the [LPG user guide](https://github.com/AI-Hypercomputer/inference-benchmark?tab=readme-ov-file#configuring-the-benchmark).
1. Start the benchmark tool. `kubectl apply -f ./config/manifests/benchmark/benchmark.yaml`
1. Wait for the benchmark to finish and download the results. Use the `benchmark_id` environment variable to specify what this benchmark is for, for instance `inference-extension` or `k8s-svc`. When the LPG tool finishes benchmarking, it prints a log line `LPG_FINISHED`; the script below watches for that log line and then starts downloading the results.
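The watch-and-download flow can be sketched roughly as follows — the deployment name `benchmark-tool` is an assumption (use whatever name your LPG deployment has), and the script invocation mirrors the `benchmark_id` convention described above:

```shell
# Block until the LPG tool prints its finish marker, then download results.
# 'deployment/benchmark-tool' is a hypothetical name; use your LPG deployment.
kubectl logs -f deployment/benchmark-tool | grep -m 1 "LPG_FINISHED"
benchmark_id='k8s-svc' ./download-benchmark-results.bash
```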
1. After the script finishes, you should see benchmark results under the `./tools/benchmark/output/default-run/k8s-svc/results/json` folder. Here is a [sample json file](./sample.json). Replace `k8s-svc` with `inference-extension` when running an inference extension benchmark.
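Once downloaded, the per-run JSON files can be enumerated programmatically. A minimal sketch, assuming only that the folder above contains plain `.json` files — the helper name is ours, not part of the tool:

```python
import json
from pathlib import Path

def load_benchmark_results(results_dir):
    """Load every JSON results file under results_dir into a list of dicts."""
    results = []
    for path in sorted(Path(results_dir).glob("*.json")):
        with open(path) as f:
            results.append(json.load(f))
    return results

# Example (path from the step above):
# results = load_benchmark_results(
#     "./tools/benchmark/output/default-run/k8s-svc/results/json")
```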
### Tips
* When using a `benchmark_id` other than `k8s-svc` or `inference-extension`, the labels in `./tools/benchmark/benchmark.ipynb` must be updated accordingly to analyze the results.
* You can specify the `run_id="runX"` environment variable when running the `./download-benchmark-results.bash` script. This is useful when you run benchmarks multiple times to get more statistically meaningful results and want to group them accordingly.
* Update the `request_rates` to best suit your benchmark environment.
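For example, two repetitions of the same benchmark might be grouped like this (the run names are illustrative):

```shell
# Repeated runs of the k8s-svc benchmark, grouped by run_id
run_id='run1' benchmark_id='k8s-svc' ./download-benchmark-results.bash
run_id='run2' benchmark_id='k8s-svc' ./download-benchmark-results.bash
```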
### Advanced Benchmark Configurations
Refer to the [LPG user guide](https://github.com/AI-Hypercomputer/inference-benchmark?tab=readme-ov-file#configuring-the-benchmark) for a detailed list of configuration knobs.
## Analyze the results
This guide shows how to run the jupyter notebook using vscode after completing the k8s service and inference extension benchmarks.
1. Create a python virtual environment.
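The virtual-environment step can be sketched as follows; the package list is an assumption — install whatever the notebook's cells actually import:

```shell
# Create and activate an isolated environment for the notebook
python3 -m venv .venv
source .venv/bin/activate
# Assumed dependencies for running benchmark.ipynb in vscode:
# pip install jupyter ipykernel pandas matplotlib
```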
1. Open the notebook `./tools/benchmark/benchmark.ipynb`, and run each cell. At the end you should see a bar chart like below where **"ie"** represents inference extension. This chart is generated using this benchmarking tool with 6 vLLM (v1) model servers (H100 80 GB), [llama2-7b](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf/tree/main) and the [ShareGPT dataset](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json).