
Conversation

@asamal4 (Collaborator) commented Nov 13, 2025

Handle the no-tool-call alternative.

Summary by CodeRabbit

Bug Fixes

  • Tool call evaluation now proceeds when tool calls are absent, enabling proper assessment of alternative scenarios where no tools are expected.

Tests

  • Added test coverage for evaluation scenarios with missing tool calls.

@coderabbitai bot (Contributor) commented Nov 13, 2025

Walkthrough

The change removes an early return in _evaluate_tool_calls when tool calls are absent or empty. Instead of immediately returning a 0.0 score, missing/empty tool_calls are now treated as an empty list and forwarded to evaluate_tool_calls for determination. A unit test is added to verify correct handling of None tool_calls with alternative expectations.
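
Since the diff itself is not reproduced in this conversation, here is a minimal before/after sketch of the control flow the walkthrough describes; everything except the `_evaluate_tool_calls` and `evaluate_tool_calls` names is an assumption.

```python
# Illustrative sketch only -- attribute and parameter names other than
# _evaluate_tool_calls / evaluate_tool_calls are assumptions, not the PR diff.

def _evaluate_tool_calls_before(turn_data, evaluate_tool_calls):
    """Old behavior: missing/empty tool calls short-circuit to a 0.0 score."""
    if not turn_data.tool_calls:
        return 0.0
    return evaluate_tool_calls(turn_data.tool_calls, turn_data.expected_tool_calls)


def _evaluate_tool_calls_after(turn_data, evaluate_tool_calls):
    """New behavior: treat missing tool calls as [] and forward them."""
    tool_calls = turn_data.tool_calls or []
    # evaluate_tool_calls can now match the empty list against an empty
    # fallback alternative in expected_tool_calls.
    return evaluate_tool_calls(tool_calls, turn_data.expected_tool_calls)
```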

Changes

  • Production Logic (src/lightspeed_evaluation/core/metrics/custom/custom.py): Removed the early return in _evaluate_tool_calls when tool_calls are missing/empty; they are now treated as an empty list and forwarded to evaluate_tool_calls instead of immediately returning a 0.0 score.
  • Test Coverage (tests/unit/core/metrics/custom/test_custom.py): New unit test module for CustomMetrics; validates that _evaluate_tool_calls correctly handles None tool_calls and matches alternative expectations.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Focus on verifying the control flow change logic is correct and doesn't introduce unintended side effects
  • Confirm the test properly validates the new behavior with None tool_calls and alternative matching expectations

Possibly related PRs

  • Ability to set alternate tool calls for eval #90: Implements multi-alternative/empty-set matching and messaging in evaluate_tool_calls, which complements this change by providing the evaluation logic that now receives the forwarded empty tool_calls
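
The matching logic referenced above is not shown in this PR, so the following is only a rough sketch of the "try each alternative, including an empty fallback" idea; the framework's real expected_tool_calls nesting (per-alternative sequences of calls) is flattened here, and the tool name is hypothetical.

```python
from typing import Any, Callable

ToolCallList = list[dict[str, Any]]


def matches_any_alternative(
    actual: ToolCallList,
    alternatives: list[ToolCallList],
    matches: Callable[[ToolCallList, ToolCallList], bool],
) -> bool:
    """Return True if the actual tool calls satisfy any expected alternative."""
    return any(matches(actual, alternative) for alternative in alternatives)


# With an empty fallback alternative configured, forwarding [] (instead of
# short-circuiting to 0.0) lets a turn with no tool calls still pass.
alternatives = [[{"tool_name": "search"}], []]  # hypothetical primary + empty fallback
assert matches_any_alternative([], alternatives, lambda a, e: a == e) is True
```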

Suggested reviewers

  • VladimirKadlec
  • tisnik

Poem

🐰 No early returns today,
Let empty calls find their way,
Forward to evaluate's care,
Alternatives blooming fair!
Logic flows, not stops—hooray!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title 'handle no tool call alternative' directly describes the main change: removing the early return for missing tool calls to allow evaluation via alternative paths.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24f68c4 and e647043.

📒 Files selected for processing (2)
  • src/lightspeed_evaluation/core/metrics/custom/custom.py (1 hunks)
  • tests/unit/core/metrics/custom/test_custom.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
src/lightspeed_evaluation/**

📄 CodeRabbit inference engine (AGENTS.md)

Add all new evaluation features under src/lightspeed_evaluation/ (do not add new features elsewhere)

Files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Require type hints for all public functions and methods
Use Google-style docstrings for all public APIs
Use custom exceptions from core.system.exceptions for error handling
Use structured logging with appropriate levels

Files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
src/lightspeed_evaluation/core/metrics/custom/**

📄 CodeRabbit inference engine (AGENTS.md)

Add new custom metrics under src/lightspeed_evaluation/core/metrics/custom/

Files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
src/lightspeed_evaluation/core/metrics/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Register new metrics in MetricManager’s supported_metrics dictionary

Files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
tests/unit/**

📄 CodeRabbit inference engine (AGENTS.md)

Place unit tests under tests/unit/ mirroring the source structure

Files:

  • tests/unit/core/metrics/custom/test_custom.py
tests/**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

Name test files as test_*.py

Files:

  • tests/unit/core/metrics/custom/test_custom.py
tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for mocking (pytest-mock’s mocker), not unittest.mock
Name test functions as test_*
Name test classes as Test*
Add comprehensive tests for new features with mocked LLM calls using pytest

Files:

  • tests/unit/core/metrics/custom/test_custom.py
🧠 Learnings (10)
📓 Common learnings
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 47
File: config/system.yaml:78-82
Timestamp: 2025-09-08T11:11:54.516Z
Learning: For the custom:tool_eval metric, when threshold is not specified (None), the system defaults to checking if score > 0, providing less strict evaluation logic compared to exact matching. This allows for more flexible tool call evaluation where partial correctness is acceptable.
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 90
File: src/lightspeed_evaluation/core/models/data.py:198-208
Timestamp: 2025-10-31T11:54:59.126Z
Learning: In the lightspeed_evaluation framework, the expected_tool_calls validator intentionally rejects a single empty set `[[]]` as the only alternative. This is by design: if no tool calls are expected, the tool_eval metric should not be configured for that turn. Empty sets are only valid as fallback alternatives (e.g., `[[[tool_call]], [[]]]`), representing optional tool call scenarios, not as primary or sole expectations.
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 47
File: src/lightspeed_evaluation/pipeline/evaluation/evaluator.py:85-89
Timestamp: 2025-09-10T06:57:46.326Z
Learning: For binary metrics like custom:tool_eval, using an explicit threshold of 0.5 is preferred over None threshold with special case handling. This provides consistent behavior where 0.0 scores fail and 1.0 scores pass.
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 24
File: lsc_agent_eval/README.md:116-136
Timestamp: 2025-08-13T14:07:44.195Z
Learning: In the lsc_agent_eval framework, the expected_tool_calls configuration uses "tool_name" as the key for tool names, not "name". The tool call evaluation implementation specifically looks for the "tool_name" field when comparing expected vs actual tool calls.
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 24
File: lsc_agent_eval/README.md:116-136
Timestamp: 2025-08-13T14:07:44.195Z
Learning: In the lsc_agent_eval framework, the expected_tool_calls configuration and tool call evaluation implementation consistently uses "tool_name" as the key for tool names, not "name". The tool call parsing, comparison, and formatting all expect the "tool_name" field throughout the codebase.
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 47
File: src/lightspeed_evaluation/core/output/generator.py:140-145
Timestamp: 2025-09-11T12:47:06.747Z
Learning: User asamal4 prefers that non-critical comments are sent when actual code changes are pushed, not on unrelated commits.
📚 Learning: 2025-09-08T11:11:54.516Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 47
File: config/system.yaml:78-82
Timestamp: 2025-09-08T11:11:54.516Z
Learning: For the custom:tool_eval metric, when threshold is not specified (None), the system defaults to checking if score > 0, providing less strict evaluation logic compared to exact matching. This allows for more flexible tool call evaluation where partial correctness is acceptable.

Applied to files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
  • tests/unit/core/metrics/custom/test_custom.py
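
A minimal sketch of the pass/fail rule the learning above describes, assuming a float score and an optional per-metric threshold; the comparison operator for the explicit-threshold case is an assumption (the cited learnings only state that 0.0 fails and 1.0 passes against 0.5).

```python
def tool_eval_passes(score: float, threshold: float | None = None) -> bool:
    """Illustrative pass/fail rule for custom:tool_eval (not the framework's code)."""
    if threshold is None:
        # No threshold configured: any positive score passes, so partial
        # correctness is acceptable.
        return score > 0
    # Explicit threshold (e.g. 0.5 for binary metrics); ">=" is assumed here.
    return score >= threshold


assert tool_eval_passes(0.3) is True          # partial credit, no threshold
assert tool_eval_passes(0.0, 0.5) is False    # binary metric: 0.0 fails
assert tool_eval_passes(1.0, 0.5) is True     # binary metric: 1.0 passes
```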
📚 Learning: 2025-10-31T11:54:59.126Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 90
File: src/lightspeed_evaluation/core/models/data.py:198-208
Timestamp: 2025-10-31T11:54:59.126Z
Learning: In the lightspeed_evaluation framework, the expected_tool_calls validator intentionally rejects a single empty set `[[]]` as the only alternative. This is by design: if no tool calls are expected, the tool_eval metric should not be configured for that turn. Empty sets are only valid as fallback alternatives (e.g., `[[[tool_call]], [[]]]`), representing optional tool call scenarios, not as primary or sole expectations.

Applied to files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
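
Written out as Python literals, the shapes from the learning above look like this (the tool name and fields are hypothetical; only the nesting mirrors the learning's examples):

```python
# A tool-call entry; "search" is a hypothetical tool name.
tool_call = {"tool_name": "search"}

# Valid: an empty set only as a fallback alternative ("tool call optional").
valid_expected_tool_calls = [[[tool_call]], [[]]]

# Rejected by the validator: a single empty set as the only alternative.
# If no tool calls are expected at all, don't configure tool_eval for that turn.
invalid_expected_tool_calls = [[]]
```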
📚 Learning: 2025-08-26T11:17:48.640Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 28
File: lsc_eval/runner.py:99-103
Timestamp: 2025-08-26T11:17:48.640Z
Learning: The lsc_eval generic evaluation tool is intended to become the primary evaluation framework, replacing an existing evaluation tool in the lightspeed-evaluation repository.

Applied to files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
📚 Learning: 2025-09-10T06:57:46.326Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 47
File: src/lightspeed_evaluation/pipeline/evaluation/evaluator.py:85-89
Timestamp: 2025-09-10T06:57:46.326Z
Learning: For binary metrics like custom:tool_eval, using an explicit threshold of 0.5 is preferred over None threshold with special case handling. This provides consistent behavior where 0.0 scores fail and 1.0 scores pass.

Applied to files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
📚 Learning: 2025-09-19T12:32:06.403Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 55
File: src/lightspeed_evaluation/pipeline/evaluation/errors.py:18-31
Timestamp: 2025-09-19T12:32:06.403Z
Learning: When analyzing method calls, always examine the complete call site including all parameters before suggesting fixes. In the lightspeed-evaluation codebase, mark_all_metrics_as_error in processor.py correctly passes both resolved_turn_metrics and resolved_conversation_metrics parameters.

Applied to files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
📚 Learning: 2025-09-09T14:58:10.630Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 47
File: src/lightspeed_evaluation/pipeline/evaluation/amender.py:32-41
Timestamp: 2025-09-09T14:58:10.630Z
Learning: In the lightspeed-evaluation framework, when API is enabled, every turn should make a fresh API call regardless of whether the turn already has response or tool_calls data. This ensures consistency and fresh responses for each evaluation run.

Applied to files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
📚 Learning: 2025-08-13T14:07:44.195Z
Learnt from: asamal4
Repo: lightspeed-core/lightspeed-evaluation PR: 24
File: lsc_agent_eval/README.md:116-136
Timestamp: 2025-08-13T14:07:44.195Z
Learning: In the lsc_agent_eval framework, the expected_tool_calls configuration uses "tool_name" as the key for tool names, not "name". The tool call evaluation implementation specifically looks for the "tool_name" field when comparing expected vs actual tool calls.

Applied to files:

  • src/lightspeed_evaluation/core/metrics/custom/custom.py
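
A small illustration of the key naming that learning calls out (the tool name is hypothetical):

```python
expected = {"tool_name": "create_namespace"}  # matched: the comparison reads "tool_name"
not_matched = {"name": "create_namespace"}    # "name" is not the key the evaluation looks for
```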
📚 Learning: 2025-10-16T11:17:19.324Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-16T11:17:19.324Z
Learning: Applies to tests/**/*.py : Add comprehensive tests for new features with mocked LLM calls using pytest

Applied to files:

  • tests/unit/core/metrics/custom/test_custom.py
📚 Learning: 2025-10-16T11:17:19.324Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-evaluation PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-16T11:17:19.324Z
Learning: Applies to src/lightspeed_evaluation/core/metrics/custom/** : Add new custom metrics under src/lightspeed_evaluation/core/metrics/custom/

Applied to files:

  • tests/unit/core/metrics/custom/test_custom.py
🧬 Code graph analysis (1)
tests/unit/core/metrics/custom/test_custom.py (2)
src/lightspeed_evaluation/core/metrics/custom/custom.py (1)
  • _evaluate_tool_calls (173-199)
src/lightspeed_evaluation/core/models/data.py (1)
  • TurnData (35-261)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: tests (3.12)
  • GitHub Check: tests (3.11)
  • GitHub Check: tests (3.13)
  • GitHub Check: mypy
🔇 Additional comments (2)
src/lightspeed_evaluation/core/metrics/custom/custom.py (1)

191-191: LGTM! Correctly handles None/missing tool_calls.

The logic now properly treats None or missing tool_calls as an empty list, allowing evaluate_tool_calls to match against empty alternatives in the expected tool calls configuration. This aligns with the PR objective to support optional tool call scenarios.

tests/unit/core/metrics/custom/test_custom.py (1)

1-38: LGTM! Well-structured test for None tool_calls handling.

The test correctly validates that None tool_calls are handled by matching against the empty alternative. The test structure follows pytest best practices with proper mocking and appropriate assertions.
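
The merged test module is not reproduced in this conversation; below is only a hedged sketch of what such a test could look like. The TurnData field names, the CustomMetrics constructor, and the return shape of _evaluate_tool_calls are all assumptions.

```python
"""Illustrative sketch -- not the actual tests/unit/core/metrics/custom/test_custom.py."""
from lightspeed_evaluation.core.metrics.custom.custom import CustomMetrics
from lightspeed_evaluation.core.models.data import TurnData


def test_evaluate_tool_calls_with_none_and_empty_alternative():
    """None tool_calls should be forwarded as [] and match an empty fallback alternative."""
    metrics = CustomMetrics()  # assumed: no required constructor arguments
    turn = TurnData(           # assumed field names
        turn_id="1",
        query="hello",
        response="No tool was needed.",
        tool_calls=None,                                          # the case this PR enables
        expected_tool_calls=[[[{"tool_name": "search"}]], [[]]],  # empty set as fallback
    )

    result = metrics._evaluate_tool_calls(turn)  # return shape assumed (score or (score, reason))

    assert result  # truthy/positive result: the empty alternative matched
```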



@asamal4 (Collaborator, Author) commented Nov 13, 2025

@VladimirKadlec @tisnik PTAL

@VladimirKadlec (Contributor) left a comment

LGTM

Side note: maybe we should find a way to get rid of all (or at least most of) the Optional attributes 😅

@tisnik (Contributor) left a comment

LGTM

@tisnik merged commit 395631e into lightspeed-core:main on Nov 13, 2025 (15 checks passed).