@@ -291,7 +291,7 @@ embedding:
291291| `attachments` | list[string] | ❌ | Attachments | ❌ |
292292| `expected_response` | string | 📋 | Expected response for comparison | ❌ |
293293| `expected_intent` | string | 📋 | Expected intent for intent evaluation| ❌ |
294- | `expected_tool_calls` | list[list[dict]] | 📋 | Expected tool call sequences | ❌ |
294+ | `expected_tool_calls` | list[list[list[dict]]] | 📋 | Expected tool call sequences (multiple alternative sets) | ❌ |
295295| `tool_calls` | list[list[dict]] | ❌ | Actual tool calls from API | ✅ (if API enabled) |
296296| `verify_script` | string | 📋 | Path to verification script | ❌ |
297297| `turn_metrics` | list[string] | ❌ | Turn-specific metrics to evaluate | ❌ |
@@ -302,7 +302,7 @@ embedding:
302302Examples
303303> - `expected_response`: Required for `custom:answer_correctness`
304304> - `expected_intent`: Required for `custom:intent_eval`
305- > - `expected_tool_calls`: Required for `custom:tool_eval`
305+ > - `expected_tool_calls`: Required for `custom:tool_eval` (multiple alternative sets format)
306306> - `verify_script`: Required for `script:action_eval` (used when API is enabled)
307307> - `response`: Required for most metrics (auto-populated if API enabled)
308308
@@ -314,15 +314,37 @@ Examples
314314| `[]` (empty list) | Skip evaluation for this turn |
315315| `["metric1", ...]` | Use specified metrics only |
316316
317+ #### Tool Evaluation
318+
319+ The `custom:tool_eval` metric supports flexible matching against multiple alternative patterns:
320+
321+ - **Format**: `[[[tool_calls, ...]], [[tool_calls]], ...]` (a list of alternatives, each a list of sequences, each a list of tool calls)
322+ - **Matching**: Tries each alternative until one matches
323+ - **Use Cases**: Optional tools, multiple approaches, default arguments, skip scenarios
324+ - **Empty Sets**: `[]` represents "no tools" and must come after primary alternatives
325+
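The "tries each alternative until one matches" rule above can be sketched in Python. This is a minimal illustration, not the project's actual implementation: the function names are hypothetical, and it assumes that tool names must match exactly, that expected argument values are applied as regular expressions with `re.search`, and that sequences are compared in order.

```python
import re
from typing import Optional


def call_matches(expected: dict, actual: dict) -> bool:
    """Compare one expected tool call against one actual tool call."""
    if expected.get("tool_name") != actual.get("tool_name"):
        return False
    actual_args = actual.get("arguments", {})
    for key, pattern in expected.get("arguments", {}).items():
        if key not in actual_args:
            return False
        # Assumption: expected values are regex patterns searched in the actual value.
        if re.search(str(pattern), str(actual_args[key])) is None:
            return False
    return True


def alternative_matches(alternative: list, actual_calls: list) -> bool:
    """An alternative matches when every sequence and every call lines up in order."""
    if len(alternative) != len(actual_calls):
        return False
    for expected_seq, actual_seq in zip(alternative, actual_calls):
        if len(expected_seq) != len(actual_seq):
            return False
        if not all(call_matches(e, a) for e, a in zip(expected_seq, actual_seq)):
            return False
    return True


def first_matching_alternative(expected_tool_calls: list, actual_calls: list) -> Optional[int]:
    """Return the index of the first alternative that matches, or None."""
    for index, alternative in enumerate(expected_tool_calls):
        if alternative_matches(alternative, actual_calls):
            return index
    return None


expected = [
    [[{"tool_name": "oc_get", "arguments": {"kind": "pod", "name": "openshift-light*"}}]],
    [[{"tool_name": "kubectl_get", "arguments": {"resource": "pods"}}]],
    [],  # "no tools" alternative, listed last
]
actual = [[{"tool_name": "kubectl_get", "arguments": {"resource": "pods"}}]]
print(first_matching_alternative(expected, actual))
```

Under these assumptions, an empty alternative `[]` matches only when the model made no tool calls at all, and because alternatives are tried in order it acts as a fallback when it is listed last.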
317326#### Tool Call Structure
318327
319328 ```yaml
329+ # Multiple alternative sets format: [[[tool_calls, ...]], [[tool_calls]], ...]
320330 expected_tool_calls:
321-   -
322-     - tool_name: oc_get # Tool name
323-       arguments: # Tool arguments
324-         kind: pod
325-         name: openshift-light* # Regex patterns supported for flexible matching
331+   - # Alternative 1: Primary approach
332+     - # Sequence 1
333+       - tool_name: oc_get
334+         arguments:
335+           kind: pod
336+           name: openshift-light* # Regex patterns supported
337+     - # Sequence 2 (if multiple parallel tool calls needed)
338+       - tool_name: oc_describe
339+         arguments:
340+           kind: pod
341+   - # Alternative 2: Different approach
342+     - # Sequence 1
343+       - tool_name: kubectl_get
344+         arguments:
345+           resource: pods
346+   - # Alternative 3: Skip scenario (optional)
347+     [] # When model has information from previous conversation
326348 ```
327349
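The three-level nesting is easy to get wrong by hand, so a quick structural check before running an evaluation may help. The sketch below is hypothetical and not part of the documented tooling: `validate_expected_tool_calls` is an illustrative name, and it reads "empty sets must come after primary alternatives" as "no non-empty alternative may follow an empty one".

```python
from typing import Any, List


def validate_expected_tool_calls(expected: Any) -> List[str]:
    """Return a list of structural problems; an empty list means the shape looks valid."""
    if not isinstance(expected, list):
        return ["expected_tool_calls must be a list of alternatives"]
    errors: List[str] = []
    seen_empty = False
    for a, alternative in enumerate(expected):
        if not isinstance(alternative, list):
            errors.append(f"alternative {a}: must be a list of sequences")
            continue
        if not alternative:
            seen_empty = True  # the "no tools" alternative
            continue
        if seen_empty:
            errors.append(f"alternative {a}: non-empty alternative listed after an empty one")
        for s, sequence in enumerate(alternative):
            if not isinstance(sequence, list):
                errors.append(f"alternative {a}, sequence {s}: must be a list of tool calls")
                continue
            for c, call in enumerate(sequence):
                if not isinstance(call, dict) or "tool_name" not in call:
                    errors.append(f"alternative {a}, sequence {s}, call {c}: missing tool_name")
    return errors


# The same structure as the YAML example, written as a Python literal.
example = [
    [
        [{"tool_name": "oc_get", "arguments": {"kind": "pod", "name": "openshift-light*"}}],
        [{"tool_name": "oc_describe", "arguments": {"kind": "pod"}}],
    ],
    [[{"tool_name": "kubectl_get", "arguments": {"resource": "pods"}}]],
    [],
]
print(validate_expected_tool_calls(example))
```

A value still written in the older two-level `list[list[dict]]` shape would be reported here, because each of its tool-call dicts sits where a sequence (a list) is expected.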
328350#### Script-Based Evaluations