
Conversation

@thoraxe
Contributor

@thoraxe thoraxe commented Jul 30, 2025

Description

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

There were no related issues or anything. This was just randomly discussed in Slack so I decided to give it a whirl.

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

Tests were written to validate that the parameter (and its absence) works, and the existing tests still pass.

Summary by CodeRabbit

  • New Features

    • Introduced a new option to disable all tools and MCP server integration for both standard and streaming query endpoints by setting a no_tools flag in requests.
  • Bug Fixes

    • Improved default handling to ensure backward compatibility when the new option is not specified.
  • Tests

    • Added comprehensive tests to verify correct behavior when the new option is enabled or disabled.
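Going by the summary above, the request-model change amounts to an optional boolean that defaults off. A minimal stand-in (a plain dataclass here; the real model is pydantic, and only the `no_tools` field comes from this PR — everything else is illustrative):

```python
from dataclasses import dataclass

# Simplified stand-in for the QueryRequest model; only the no_tools
# field reflects the PR, the rest is illustrative.
@dataclass
class QueryRequest:
    query: str
    # Defaults to False so requests that omit it behave exactly as before.
    no_tools: bool = False

# Omitting the flag preserves the old behavior; setting it disables tools.
print(QueryRequest(query="hello").no_tools)                 # False
print(QueryRequest(query="hello", no_tools=True).no_tools)  # True
```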

@coderabbitai
Contributor

coderabbitai bot commented Jul 30, 2025

Walkthrough

A new no_tools boolean flag was introduced in the API request model, endpoint logic, and agent creation functions for both synchronous and streaming query flows. This flag allows bypassing all tool and MCP server integrations. The change is comprehensively tested with new and updated unit tests to verify correct behavior when no_tools is toggled.

Changes

Cohort / File(s) Change Summary
Synchronous Query Endpoint & Agent
src/app/endpoints/query.py
Updated get_agent and retrieve_response to accept and handle a no_tools flag, conditionally disabling tool parsing, MCP headers, and toolgroups.
Streaming Query Endpoint & Agent
src/app/endpoints/streaming_query.py
Extended async get_agent and retrieve_response to support the no_tools flag, disabling tool parsing and MCP/toolgroup logic when set.
API Request Model
src/models/requests.py
Added optional no_tools boolean attribute to QueryRequest model, updated docstring and example schema.
Synchronous Query Endpoint Tests
tests/unit/app/endpoints/test_query.py
Added and updated tests to cover all behaviors of the no_tools flag, including agent creation, response retrieval, and endpoint handler logic.
Streaming Query Endpoint Tests
tests/unit/app/endpoints/test_streaming_query.py
Added and updated async tests for the streaming query endpoint, exercising all code paths affected by the no_tools flag.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API
    participant Agent

    Client->>API: POST /query or /streaming-query (no_tools=[True|False])
    API->>API: Parse QueryRequest (extract no_tools)
    API->>Agent: get_agent(..., no_tools)
    alt no_tools == True
        API->>Agent: create_turn(toolgroups=None)
        API->>Agent: agent.extra_headers = {}
    else no_tools == False
        API->>Agent: create_turn(toolgroups=constructed)
        API->>Agent: agent.extra_headers = MCP headers
    end
    Agent-->>API: Response
    API-->>Client: Response
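The branching in the diagram can be sketched as a small helper. The function name and its arguments (`resolve_tool_config`, `rag_toolgroups`, `mcp_toolgroups`) are assumptions for illustration, not the PR's actual code:

```python
# Illustrative sketch of the no_tools branch shown in the diagram above.
# All names here are assumed, not taken from the real endpoints.
def resolve_tool_config(no_tools, mcp_headers, rag_toolgroups, mcp_toolgroups):
    """Return (extra_headers, toolgroups) to pass to agent.create_turn()."""
    if no_tools:
        # Bypass everything: no MCP headers, no toolgroups.
        return {}, None
    toolgroups = list(rag_toolgroups) + list(mcp_toolgroups)
    return dict(mcp_headers), (toolgroups or None)

print(resolve_tool_config(True, {"h": "v"}, ["rag"], ["mcp"]))  # ({}, None)
print(resolve_tool_config(False, {"h": "v"}, ["rag"], ["mcp"]))
```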

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • umago

Poem

Oh, what a hop in the code today—
The "no_tools" flag now leads the way!
When set to True, the tools all rest,
The agent hops on with just its best.
Tests abound, both swift and keen,
Ensuring queries stay smooth and clean.
🐇✨



📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 221b044 and 61fcaaa.

📒 Files selected for processing (5)
  • src/app/endpoints/query.py (3 hunks)
  • src/app/endpoints/streaming_query.py (3 hunks)
  • src/models/requests.py (3 hunks)
  • tests/unit/app/endpoints/test_query.py (5 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/models/requests.py
  • src/app/endpoints/query.py
  • src/app/endpoints/streaming_query.py
  • tests/unit/app/endpoints/test_streaming_query.py
🔇 Additional comments (9)
tests/unit/app/endpoints/test_query.py (9)

590-590: Good practice: Explicit parameter passing for clarity.

Adding the explicit no_tools=False parameter maintains backward compatibility and makes the test intention clear.


749-749: Consistent parameter updates across tests.

These explicit no_tools=False parameters ensure all existing tests maintain their original behavior and intentions after the function signature change.

Also applies to: 814-814, 887-887


1391-1425: Well-structured test for no_tools=True functionality.

The test properly validates the new no_tools=True parameter by setting up appropriate mocks and verifying the expected response behavior.


1427-1461: Good coverage with complementary test for no_tools=False.

This test ensures that the default behavior with no_tools=False continues to work as expected, providing comprehensive coverage for both parameter values.


1463-1507: Excellent test validating MCP and RAG bypass functionality.

This test effectively verifies that no_tools=True properly bypasses both MCP servers and RAG functionality by checking that extra_headers is empty and toolgroups=None.


1509-1565: Comprehensive test ensuring normal functionality is preserved.

This test properly validates that no_tools=False maintains all expected behavior including MCP headers and combined RAG + MCP toolgroups.


1567-1616: Good test validating tool parser disabling.

This test effectively verifies that no_tools=True results in tool_parser=None being passed to the Agent constructor, ensuring tools are properly disabled at the agent level.


1618-1674: Well-designed test preserving tool parser functionality.

This test ensures that no_tools=False maintains the expected tool parser behavior by properly mocking GraniteToolParser and verifying it's used in agent creation.


1676-1686: Essential backward compatibility test.

This test ensures that existing code continues to work by verifying that no_tools defaults to False when not specified, maintaining backward compatibility.
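The tool-parser checks described in the comments above boil down to something like this sketch (`FakeAgent` and this `get_agent` are stand-ins, not the project's real classes):

```python
# Stand-ins for the real Agent / GraniteToolParser wiring being tested.
class FakeAgent:
    def __init__(self, client, model, tool_parser=None):
        self.tool_parser = tool_parser

def get_agent(client, model, no_tools=False):
    # With no_tools=True the agent gets no tool parser at all.
    tool_parser = None if no_tools else "granite-tool-parser"
    return FakeAgent(client, model, tool_parser=tool_parser)

assert get_agent(None, "model1", no_tools=True).tool_parser is None
assert get_agent(None, "model1", no_tools=False).tool_parser is not None
assert get_agent(None, "model1").tool_parser is not None  # default unchanged
```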



@manstis
Contributor

manstis commented Jul 30, 2025

This should be added to /streaming_query too for consistency.

@thoraxe
Contributor Author

thoraxe commented Jul 30, 2025

This should be added to /streaming_query too for consistency.

Didn't realize that was a different endpoint. I will handle that now.

Contributor

@tisnik tisnik left a comment


LGTM on my side, thank you

Contributor

@umago umago left a comment


Code-wise I think it looks good, but the unit tests need to be fixed

@thoraxe
Contributor Author

thoraxe commented Aug 1, 2025

Code-wise I think it looks good, but the unit tests need to be fixed

What's wrong or missing? Can you be more specific? When run locally, all unit tests pass.

@umago
Contributor

umago commented Aug 1, 2025

@thoraxe the unit tests in the CI are failing, and it looks related to the changes proposed by this PR [0]. I see what you mean about running locally; perhaps we need to dig more into the logs and see what's wrong, but we definitely can't merge while the upstream CI is failing, right?

I can try to take a look and see if I can spot something.

Are you running locally with "make test-unit"?

  _____________ test_streaming_query_endpoint_handler_no_tools_false _____________

mocker = <pytest_mock.plugin.MockerFixture object at 0x7f689bbc3320>

    @pytest.mark.asyncio
    async def test_streaming_query_endpoint_handler_no_tools_false(mocker):
        """Test the streaming query endpoint handler with no_tools=False (default behavior)."""
        mock_client = mocker.AsyncMock()
        mock_async_lsc = mocker.patch("client.AsyncLlamaStackClientHolder.get_client")
        mock_async_lsc.return_value = mock_client
        mock_client.models.list.return_value = [
            mocker.Mock(identifier="model1", model_type="llm", provider_id="provider1"),
        ]
    
        mock_config = mocker.Mock()
        mock_config.user_data_collection_configuration.transcripts_disabled = True
        mocker.patch("app.endpoints.streaming_query.configuration", mock_config)
    
        # Mock the streaming response
        mock_streaming_response = mocker.AsyncMock()
        mock_streaming_response.__aiter__.return_value = iter([])
    
        mocker.patch(
            "app.endpoints.streaming_query.retrieve_response",
            return_value=(mock_streaming_response, "test_conversation_id"),
        )
>       mocker.patch(
            "app.endpoints.streaming_query.select_model_id", return_value="fake_model_id"
        )

tests/unit/app/endpoints/test_streaming_query.py:1630: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.venv/lib/python3.12/site-packages/pytest_mock/plugin.py:439: in __call__
    return self._start_patch(
.venv/lib/python3.12/site-packages/pytest_mock/plugin.py:257: in _start_patch
    mocked: MockType = p.start()
                       ^^^^^^^^^
/usr/lib/python3.12/unittest/mock.py:1606: in start
    result = self.__enter__()
             ^^^^^^^^^^^^^^^^
/usr/lib/python3.12/unittest/mock.py:1458: in __enter__
    original, local = self.get_original()
                      ^^^^^^^^^^^^^^^^^^^

[0] https://github.com/lightspeed-core/lightspeed-stack/actions/runs/16649655486/job/47118796741?pr=318#step:8:392

@umago
Contributor

umago commented Aug 1, 2025

/retest

@umago
Contributor

umago commented Aug 1, 2025

I fetched the change and ran the unit tests locally; it works for me as well. We may be doing something off in the GH workflow.

Contributor

@umago umago left a comment


Issue figured out. The PR needs a rebase onto current main, plus the UT fixes.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 14

🧹 Nitpick comments (19)
src/metrics/utils.py (1)

22-22: Consider handling the type annotation issue more gracefully.

The pyright: ignore[reportAttributeAccessIssue] comment suggests uncertainty about the model object structure. Consider adding proper type annotations or documentation about the expected model structure.

tests/unit/auth/test_jwk_token.py (2)

1-2: Remove unnecessary pylint disable comment

The redefined-outer-name warning is expected in pytest fixtures. Consider configuring pylint to ignore this in test files rather than disabling it inline.

-# pylint: disable=redefined-outer-name
-
 """Unit tests for functions defined in auth/jwk_token.py"""

75-110: Complex but necessary mock setup

The mock server implementation correctly handles the async context manager pattern required by aiohttp. Consider adding a docstring explaining the mock structure for future maintainers.

Add documentation to explain the mock structure:

 def make_signing_server(mocker, key_set, algorithms):
-    """A fake server to serve our signing keys as JWKs."""
+    """A fake server to serve our signing keys as JWKs.
+    
+    Creates a mock aiohttp.ClientSession that returns JWK sets when called.
+    The mock properly implements async context managers for both the session
+    and individual requests.
+    
+    Args:
+        mocker: pytest-mock fixture
+        key_set: List of key dictionaries with 'private_key' and 'kid'
+        algorithms: List of algorithms corresponding to each key
+    
+    Returns:
+        Mock ClientSession class
+    """
src/app/main.py (1)

71-75: Consider placement of route path collection.

The route paths collection logic is placed after router inclusion, which is correct for capturing all registered routes. However, consider moving this closer to where it's used in the middleware for better code organization.

+# Collect app route paths for middleware filtering
+app_routes_paths = [
+    route.path
+    for route in app.routes
+    if isinstance(route, (Mount, Route, WebSocketRoute))
+]
+
 @app.middleware("")
 async def rest_api_metrics(
     request: Request, call_next: Callable[[Request], Awaitable[Response]]
 ) -> Response:
.github/workflows/e2e_tests.yaml (2)

44-44: Fix YAML formatting issues.

Multiple formatting issues detected:

  • Trailing spaces on lines 44, 61, 102, 241, 247
  • Wrong indentation on lines 62, 103, 242 (should be 10 spaces, not 12)
  • Missing newline at end of file (line 272)

Also applies to: 61-62, 102-103, 241-242, 247-247, 272-272


251-255: Replace fixed sleep with proper health checks.

Using a fixed 20-second sleep is fragile. Services may not be ready, or you're waiting unnecessarily.

Consider using docker-compose health checks or a retry loop:

# Wait for services to be healthy
for i in {1..30}; do
  if curl -f http://localhost:8080/v1/models >/dev/null 2>&1; then
    echo "Services are ready!"
    break
  fi
  echo "Waiting for services... ($i/30)"
  sleep 2
done
src/auth/jwk_token.py (1)

103-107: Security consideration for tokens without kid.

When a JWT has no kid, the code returns the first key matching the algorithm. This could be exploited if an attacker can add their own key to the JWK set. Consider adding a configuration option to reject tokens without kid for stricter security.
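One way the suggested strict mode could look, as a sketch under assumed data shapes — key_set entries are modeled as plain dicts with 'kid'/'alg' keys here, not the real jwcrypto key objects:

```python
def select_jwk(jwt_headers, key_set, require_kid=True):
    """Pick a verification key for a token.

    key_set entries are assumed to be dicts with 'kid' and 'alg' keys;
    a real implementation would operate on jwcrypto key objects.
    """
    kid = jwt_headers.get("kid")
    if kid is None:
        if require_kid:
            # Strict mode: never fall back to "first key that matches".
            raise ValueError("token has no 'kid' header; rejected by policy")
        # Permissive fallback: first key matching the token's algorithm.
        alg = jwt_headers.get("alg")
        return next((k for k in key_set if k["alg"] == alg), None)
    return next((k for k in key_set if k["kid"] == kid), None)

keys = [{"kid": "a", "alg": "RS256"}, {"kid": "b", "alg": "ES256"}]
print(select_jwk({"kid": "b"}, keys))  # {'kid': 'b', 'alg': 'ES256'}
```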

tests/unit/utils/test_mcp_headers.py (1)

95-117: Remove duplicate test case.

test_mcp_headers_dependency_empty_headers (lines 95-105) and test_mcp_headers_dependency_mcp_headers_empty (lines 107-117) test the exact same scenario - both have {"MCP-HEADERS": ""}.

Remove one of these duplicate tests.

docs/openapi.md (1)

303-304: Fix markdown table formatting issues.

The metrics endpoint response table has formatting issues that should be corrected.

 | Status Code | Description | Component |
 |-------------|-------------|-----------|
-| 200 | Successful Response | string
- |
+| 200 | Successful Response | string |
docs/getting_started.md (6)

3-48: Fix markdown list indentation in table of contents.

The table of contents has inconsistent indentation that violates markdown formatting standards. Nested list items should use 2-space increments.

Apply this diff to fix the indentation:

 <!-- vim-markdown-toc GFM -->
 
 * [Preface](#preface)
 * [Deployment methods](#deployment-methods)
 * [Integration with Llama Stack framework](#integration-with-llama-stack-framework)
-    * [Llama Stack as a library](#llama-stack-as-a-library)
-    * [Llama Stack as a server](#llama-stack-as-a-server)
+  * [Llama Stack as a library](#llama-stack-as-a-library)
+  * [Llama Stack as a server](#llama-stack-as-a-server)
 * [Local deployment](#local-deployment)
-    * [Llama Stack used as a separate process](#llama-stack-used-as-a-separate-process)
-        * [Prerequisites](#prerequisites)
-        * [Installation of all required tools](#installation-of-all-required-tools)
+  * [Llama Stack used as a separate process](#llama-stack-used-as-a-separate-process)
+    * [Prerequisites](#prerequisites)
+    * [Installation of all required tools](#installation-of-all-required-tools)

Continue this pattern for all nested list items in the table of contents.


62-67: Standardize list marker style for consistency.

The list uses dashes (-) while the rest of the document uses asterisks (*). Maintain consistency throughout the document.

Apply this diff to standardize the list markers:

-* Local deployment
-    - Llama Stack framework is used as a library
-    - Llama Stack framework is used as a separate process (deployed locally)
-* Running from a container
-    - Llama Stack framework is used as a library
-    - Llama Stack framework is used as a separate process
+* Local deployment
+  * Llama Stack framework is used as a library
+  * Llama Stack framework is used as a separate process (deployed locally)
+* Running from a container
+  * Llama Stack framework is used as a library
+  * Llama Stack framework is used as a separate process

103-103: Consider a more concise alternative to "last but not least".

The phrase "last but not least" might be wordy. Consider using "finally" or "additionally" for better conciseness.

-...ally made changes and improvements. And last but not least, it is possible to trace, monitor and debug...
+...ally made changes and improvements. Finally, it is possible to trace, monitor and debug...

969-969: Fix grammar: use "or" instead of "nor".

The phrase "no other packages nor tools" should use "or" for correct grammar.

-...* image is needed and no other packages nor tools need to be installed.
+...* image is needed and no other packages or tools need to be installed.

1523-1523: Use "cannot" instead of "can not".

"Cannot" is the preferred single-word form in formal writing.

-...container and the standard port mapping can not be leveraged there.
+...container and the standard port mapping cannot be leveraged there.

213-213: Add language specifications to code blocks.

Several code blocks are missing language specifications, which affects syntax highlighting and readability.

Apply these changes to add appropriate language specifications:

-    ```
+    ```text
     usage: llama [-h] {model,stack,download,verify-download} ...
     
-    ```
+    ```text
     INFO     2025-07-27 16:56:12,464 llama_stack.cli.stack.run:147 server: No image type or image name provided...
     
-    ```
+    ```text
     INFO     2025-07-29 15:26:20,864 llama_stack.cli.stack.run:126 server: Using run configuration: run.yaml
     
-    ```
+    ```text
     uv run src/lightspeed_stack.py
     
-    ```
+    ```text
     INFO     2025-07-27 16:56:12,464 llama_stack.cli.stack.run:147 server: No image type or image name provided...

Also applies to: 234-234, 399-399, 581-581, 1168-1168

docs/openapi.json (4)

254-257: Example config contradicts schema defaults
feedback_enabled is shown as true in the example payload, yet its schema default (Line 1693) is false. Same for transcripts_enabled (false example vs false default). Please align examples and defaults to avoid confusion in generated docs and SDKs.


596-617: Prometheus endpoint: use canonical content-type
Prometheus expects text/plain; version=0.0.4; charset=utf-8. Consider updating the content definition to that explicit MIME type to prevent tooling from mis-classifying the payload.


1035-1055: Array property lacks max-items guard
categories could theoretically accept hundreds of entries. If that is undesired, add maxItems (e.g., 6) to keep the payload size predictable.


1146-1174: Inference configuration: provide sane defaults
default_model/default_provider are both optional but have no default values. Supplying empty strings in generated configs is common—consider pattern: '^.+$' or explicit "default": null to avoid silent mis-configuration.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b4c0a26 and 221b044.

⛔ Files ignored due to path filters (7)
  • docs/architecture.png is excluded by !**/*.png
  • docs/architecture.svg is excluded by !**/*.svg
  • docs/config.png is excluded by !**/*.png
  • docs/llama_stack_as_library.svg is excluded by !**/*.svg
  • docs/llama_stack_as_service.svg is excluded by !**/*.svg
  • docs/persistent_storage.svg is excluded by !**/*.svg
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (61)
  • .github/workflows/e2e_tests.yaml (1 hunks)
  • CONTRIBUTING.md (1 hunks)
  • Containerfile (2 hunks)
  • Makefile (2 hunks)
  • README.md (6 hunks)
  • docker-compose.yaml (1 hunks)
  • docs/config.puml (4 hunks)
  • docs/getting_started.md (1 hunks)
  • docs/openapi.json (14 hunks)
  • docs/openapi.md (11 hunks)
  • examples/run.yaml (1 hunks)
  • lightspeed-stack.yaml (2 hunks)
  • pyproject.toml (2 hunks)
  • scripts/generate_openapi_schema.py (1 hunks)
  • src/app/endpoints/config.py (1 hunks)
  • src/app/endpoints/feedback.py (2 hunks)
  • src/app/endpoints/metrics.py (1 hunks)
  • src/app/endpoints/query.py (11 hunks)
  • src/app/endpoints/streaming_query.py (10 hunks)
  • src/app/main.py (2 hunks)
  • src/app/routers.py (2 hunks)
  • src/auth/__init__.py (3 hunks)
  • src/auth/jwk_token.py (1 hunks)
  • src/configuration.py (3 hunks)
  • src/constants.py (1 hunks)
  • src/metrics/__init__.py (1 hunks)
  • src/metrics/utils.py (1 hunks)
  • src/models/config.py (7 hunks)
  • src/models/requests.py (9 hunks)
  • src/services/data_collector.py (3 hunks)
  • src/utils/mcp_headers.py (2 hunks)
  • src/version.py (1 hunks)
  • test.containerfile (1 hunks)
  • tests/configuration/lightspeed-stack.yaml (1 hunks)
  • tests/configuration/run.yaml (1 hunks)
  • tests/e2e/features/rest_api.feature (1 hunks)
  • tests/e2e/gen_scenario_list.py (1 hunks)
  • tests/integration/test_configuration.py (1 hunks)
  • tests/integration/test_version.py (1 hunks)
  • tests/unit/__init__.py (1 hunks)
  • tests/unit/app/endpoints/test_config.py (1 hunks)
  • tests/unit/app/endpoints/test_conversations.py (1 hunks)
  • tests/unit/app/endpoints/test_feedback.py (4 hunks)
  • tests/unit/app/endpoints/test_info.py (1 hunks)
  • tests/unit/app/endpoints/test_metrics.py (1 hunks)
  • tests/unit/app/endpoints/test_models.py (5 hunks)
  • tests/unit/app/endpoints/test_query.py (22 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (19 hunks)
  • tests/unit/app/test_routers.py (5 hunks)
  • tests/unit/auth/test_jwk_token.py (1 hunks)
  • tests/unit/metrics/__init__.py (1 hunks)
  • tests/unit/metrics/test_utis.py (1 hunks)
  • tests/unit/models/test_config.py (17 hunks)
  • tests/unit/models/test_requests.py (4 hunks)
  • tests/unit/services/test_data_collector.py (11 hunks)
  • tests/unit/test_client.py (5 hunks)
  • tests/unit/test_configuration.py (9 hunks)
  • tests/unit/utils/test_common.py (7 hunks)
  • tests/unit/utils/test_endpoints.py (1 hunks)
  • tests/unit/utils/test_mcp_headers.py (1 hunks)
  • tests/unit/utils/test_types.py (1 hunks)
✅ Files skipped from review due to trivial changes (25)
  • tests/unit/metrics/init.py
  • src/app/endpoints/config.py
  • src/app/routers.py
  • tests/unit/init.py
  • tests/unit/utils/test_types.py
  • tests/integration/test_version.py
  • scripts/generate_openapi_schema.py
  • src/version.py
  • tests/unit/app/endpoints/test_info.py
  • tests/e2e/features/rest_api.feature
  • tests/unit/utils/test_endpoints.py
  • tests/unit/app/endpoints/test_conversations.py
  • CONTRIBUTING.md
  • tests/e2e/gen_scenario_list.py
  • tests/configuration/lightspeed-stack.yaml
  • examples/run.yaml
  • lightspeed-stack.yaml
  • src/app/endpoints/metrics.py
  • tests/unit/app/test_routers.py
  • tests/unit/test_configuration.py
  • README.md
  • tests/unit/services/test_data_collector.py
  • src/constants.py
  • Containerfile
  • tests/unit/utils/test_common.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/app/endpoints/query.py
  • tests/unit/app/endpoints/test_query.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: in the lightspeed-stack project after pr #278, the `userdatacollection.feedback_storage` property dy...
Learnt from: onmete
PR: lightspeed-core/lightspeed-stack#278
File: src/app/endpoints/feedback.py:113-118
Timestamp: 2025-07-23T14:26:40.340Z
Learning: In the lightspeed-stack project after PR #278, the `UserDataCollection.feedback_storage` property dynamically constructs paths from `user_data_dir` and will always return a valid string, making fallback logic like `or Path("")` redundant.

Applied to files:

  • src/services/data_collector.py
  • tests/integration/test_configuration.py
  • docs/config.puml
  • docs/openapi.json
🧬 Code Graph Analysis (11)
src/app/endpoints/feedback.py (1)
src/configuration.py (2)
  • configuration (48-53)
  • user_data_collection_configuration (72-77)
src/auth/__init__.py (3)
src/configuration.py (2)
  • configuration (48-53)
  • authentication_configuration (88-98)
src/auth/jwk_token.py (1)
  • JwkTokenAuthDependency (112-191)
src/models/config.py (1)
  • jwk_configuration (192-199)
tests/unit/app/endpoints/test_metrics.py (1)
src/app/endpoints/metrics.py (1)
  • metrics_endpoint_handler (14-16)
src/configuration.py (1)
src/models/config.py (1)
  • InferenceConfiguration (220-237)
tests/unit/metrics/test_utis.py (1)
src/metrics/utils.py (1)
  • setup_model_metrics (11-50)
src/app/main.py (4)
src/log.py (1)
  • get_logger (7-13)
src/metrics/utils.py (1)
  • setup_model_metrics (11-50)
src/utils/common.py (1)
  • register_mcp_servers_async (30-49)
src/app/routers.py (1)
  • include_routers (20-38)
tests/unit/test_client.py (1)
src/client.py (4)
  • LlamaStackClientHolder (19-52)
  • get_client (46-52)
  • get_client (81-87)
  • AsyncLlamaStackClientHolder (55-87)
src/metrics/utils.py (3)
src/configuration.py (3)
  • configuration (48-53)
  • llama_stack_configuration (64-69)
  • inference (109-114)
src/client.py (2)
  • LlamaStackClientHolder (19-52)
  • AsyncLlamaStackClientHolder (55-87)
src/log.py (1)
  • get_logger (7-13)
src/models/config.py (1)
src/configuration.py (1)
  • inference (109-114)
src/auth/jwk_token.py (3)
src/auth/interface.py (1)
  • AuthInterface (8-13)
src/auth/utils.py (1)
  • extract_user_token (7-26)
src/models/config.py (1)
  • JwkConfiguration (157-161)
tests/unit/utils/test_mcp_headers.py (1)
src/utils/mcp_headers.py (2)
  • extract_mcp_headers (29-53)
  • mcp_headers_dependency (15-26)
🪛 GitHub Actions: E2E Tests
src/app/main.py

[error] 85-85: Application startup failed due to APIConnectionError: Connection error when attempting to list models in setup_model_metrics.

🪛 YAMLlint (1.37.1)
docker-compose.yaml

[error] 32-32: no new line character at the end of file

(new-line-at-end-of-file)

.github/workflows/e2e_tests.yaml

[error] 44-44: trailing spaces

(trailing-spaces)


[error] 61-61: trailing spaces

(trailing-spaces)


[warning] 62-62: wrong indentation: expected 10 but found 12

(indentation)


[error] 102-102: trailing spaces

(trailing-spaces)


[warning] 103-103: wrong indentation: expected 10 but found 12

(indentation)


[error] 241-241: trailing spaces

(trailing-spaces)


[warning] 242-242: wrong indentation: expected 10 but found 12

(indentation)


[error] 247-247: trailing spaces

(trailing-spaces)


[error] 272-272: no new line character at the end of file

(new-line-at-end-of-file)

🪛 LanguageTool
docs/getting_started.md

[style] ~103-~103: ‘last but not least’ might be wordy. Consider a shorter alternative.
Context: ...ally made changes and improvements. And last but not least, it is possible to trace, monitor and d...

(EN_WORDINESS_PREMIUM_LAST_BUT_NOT_LEAST)


[grammar] ~969-~969: Ensure spelling is correct
Context: ...* image is needed and no other packages nor tools need to be installed. ### Retr...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~1523-~1523: Unless you want to emphasize “not”, use “cannot” which is more common.
Context: ...container and the standard port mapping can not be leveraged there. This configuration ...

(CAN_NOT_PREMIUM)

docs/openapi.md

[style] ~490-~490: Consider using a different adverb to strengthen your wording.
Context: ...-| | incorrect | The answer provided is completely wrong | | not_relevant | This answer do...

(COMPLETELY_ENTIRELY)

🪛 markdownlint-cli2 (0.17.2)
docs/getting_started.md

8-8: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


9-9: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


11-11: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


12-12: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


13-13: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


14-14: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


15-15: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


16-16: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


17-17: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


18-18: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


19-19: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


20-20: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


21-21: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


22-22: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


23-23: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


24-24: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


25-25: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


26-26: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


27-27: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


28-28: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


30-30: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


31-31: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


32-32: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


33-33: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


34-34: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


35-35: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


36-36: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


37-37: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


38-38: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


39-39: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


40-40: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


41-41: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


42-42: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


43-43: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


44-44: Unordered list indentation
Expected: 4; Actual: 8

(MD007, ul-indent)


45-45: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


46-46: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


62-62: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


63-63: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


63-63: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


64-64: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


64-64: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


65-65: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


66-66: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


66-66: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


67-67: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


67-67: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


213-213: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


234-234: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


399-399: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


581-581: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


981-981: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


1168-1168: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

docs/openapi.md

303-303: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe

(MD055, table-pipe-style)


304-304: Table column count
Expected: 3; Actual: 1; Too few cells, row will be missing data

(MD056, table-column-count)


304-304: Tables should be surrounded by blank lines

(MD058, blanks-around-tables)


522-522: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)

🪛 actionlint (1.7.7)
.github/workflows/e2e_tests.yaml

27-27: "github.event.pull_request.head.ref" is potentially untrusted. avoid using it directly in inline scripts. instead, pass it through an environment variable. see https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions for more details

(expression)

🔇 Additional comments (81)
tests/integration/test_configuration.py (1)

58-58: LGTM: Configuration flag inversion correctly implemented.

The test assertion correctly reflects the change from feedback_disabled to feedback_enabled, maintaining the same semantic meaning with inverted logic.

tests/unit/app/endpoints/test_config.py (1)

43-43: LGTM: Test data updated for configuration flag rename.

The change from "feedback_disabled": True to "feedback_enabled": False correctly maintains the same test scenario (feedback disabled) while using the new positive flag naming convention.

src/app/endpoints/feedback.py (2)

47-47: LGTM: Configuration flag usage updated correctly.

The change from negating feedback_disabled to directly returning feedback_enabled correctly reflects the configuration model update while maintaining the same functional behavior.


128-128: LGTM: Typo correction improves logging quality.

Fixed the spelling from "sucessfully" to "successfully".

src/auth/__init__.py (2)

6-6: LGTM: JWK token module import added.

The import statement correctly includes the new jwk_token module to support JWT authentication with JWK.


35-39: LGTM: JWK token authentication case implemented correctly.

The new authentication case properly:

  • Matches the AUTH_MOD_JWK_TOKEN constant
  • Initializes JwkTokenAuthDependency with the correct configuration
  • Passes the virtual_path parameter
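The dispatch described above can be sketched as follows. Only `JwkTokenAuthDependency`, `AUTH_MOD_JWK_TOKEN`, and the `virtual_path` parameter come from the review; the function name, the no-op branch, and the constants' string values are assumptions.

```python
from dataclasses import dataclass

AUTH_MOD_NOOP = "noop"  # hypothetical default module name
AUTH_MOD_JWK_TOKEN = "jwk-token"  # constant from the review; value assumed


@dataclass
class NoopAuthDependency:  # stand-in for the existing no-op dependency
    virtual_path: str


@dataclass
class JwkTokenAuthDependency:  # stand-in for src/auth/jwk_token.py
    config: object
    virtual_path: str


def get_auth_dependency(module: str, config: object, virtual_path: str = "/ls"):
    """Pick the auth dependency matching the configured module."""
    if module == AUTH_MOD_NOOP:
        return NoopAuthDependency(virtual_path=virtual_path)
    if module == AUTH_MOD_JWK_TOKEN:
        # Initialize with the JWK configuration and the virtual path,
        # as the review's bullet points describe.
        return JwkTokenAuthDependency(config=config, virtual_path=virtual_path)
    raise ValueError(f"Unsupported authentication module: {module}")
```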
src/utils/mcp_headers.py (3)

15-15: LGTM: Parameter rename improves code clarity.

Removing the underscore prefix from _request to request is appropriate since the parameter is being used (passed to extract_mcp_headers).


16-16: LGTM: Docstring capitalization improved.

Consistent capitalization of "MCP" improves documentation quality.


26-26: LGTM: Function call updated for renamed parameter.

Correctly updated the function call to use the renamed request parameter.

tests/unit/app/endpoints/test_metrics.py (2)

15-25: Comprehensive metrics validation looks good.

The test thoroughly validates that all expected Prometheus metrics are present in the response body, covering counters, histograms, and gauges for REST API calls, LLM operations, and token counting.


8-8: Passing None as the request is acceptable
The metrics_endpoint_handler (and root_endpoint_handler) signatures include a _request: Request parameter that isn’t used, and all existing unit tests for endpoint handlers consistently pass None in the same way. No change is needed here.

• tests/unit/app/endpoints/test_metrics.py: uses metrics_endpoint_handler(None)
• tests/unit/app/endpoints/test_root.py: uses root_endpoint_handler(request=None)

Makefile (4)

17-17: Good standardization to explicit Python module invocation.

The change from uv run pytest to uv run python -m pytest ensures consistent module loading and follows best practices.


22-22: Consistent with unit test command update.

The integration test command now matches the unit test format using explicit Python module invocation.


84-84: Migration from pdm to uv completed.

The build command now uses uv run instead of pdm run, completing the package manager migration.


87-87: Consistent package manager usage.

The upload command now uses uv run to match other commands in the Makefile.

src/metrics/utils.py (1)

33-44: Logic for default model detection looks correct.

The implementation properly compares the model/provider tuple with the default configuration and sets the metric value accordingly.
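A minimal sketch of that comparison, with the Prometheus gauge replaced by a plain dict; all names here are illustrative rather than the module's actual API.

```python
def default_model_gauge_values(models, default_provider, default_model):
    """Return 1 for the configured default (provider, model) pair and 0
    for every other model — the values setup_model_metrics() would feed
    to the gauge."""
    return {
        (provider_id, model_id): int(
            (provider_id, model_id) == (default_provider, default_model)
        )
        for provider_id, model_id in models
    }
```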

src/configuration.py (3)

15-15: Good addition of InferenceConfiguration import.

The import supports the new inference property added to the configuration class.


88-98: Enhanced authentication configuration validation.

The additional assertion ensures that authentication configuration is not only loaded but also properly initialized, which improves error handling.


108-115: New inference configuration property follows established pattern.

The property implementation is consistent with other configuration accessors and properly handles the optional nature of inference configuration.

src/services/data_collector.py (3)

98-98: Good refactoring from negative to positive flag logic.

The change from feedback_disabled to not feedback_enabled makes the configuration more intuitive while maintaining the same logical behavior.


111-111: Consistent flag inversion for transcripts.

The change from transcripts_disabled to not transcripts_enabled aligns with the feedback flag refactoring.


226-226: Cleanup logic updated consistently.

The cleanup method also uses the positive flag logic, maintaining consistency across all data collection operations.

tests/unit/metrics/test_utis.py (1)

6-69: Well-structured test with comprehensive coverage!

The test properly validates the setup_model_metrics function by:

  • Mocking the client to return a mix of LLM and non-LLM models
  • Verifying that only LLM models trigger metric updates
  • Ensuring the default model receives a metric value of 1 while others receive 0
  • Validating the exact sequence of metric calls
docker-compose.yaml (1)

1-31: Well-structured docker-compose configuration!

The configuration properly sets up both services with:

  • Appropriate build context and image references
  • Correct port mappings (8321 for llama-stack, 8080 for lightspeed-stack)
  • Volume mounts with SELinux contexts (:Z) for enhanced security
  • Proper service dependencies
  • Shared network for inter-container communication
test.containerfile (1)

1-39: Clean containerfile structure!

The containerfile follows good practices:

  • Uses minimal UBI base image for security
  • Properly sets up working directory and environment
  • Installs necessary system and Python dependencies
  • Clear CMD for running the llama stack service
tests/unit/app/endpoints/test_feedback.py (3)

16-26: Configuration tests correctly updated for inverted flag

The tests properly reflect the configuration change from feedback_disabled to feedback_enabled, maintaining the correct logic inversion.


51-86: Excellent use of parameterized testing!

The parameterized test effectively covers:

  • Empty feedback payloads
  • Feedback with negative sentiment and multiple categories
  • Proper mocking and assertions for each scenario

113-157: Comprehensive parameterized test coverage for store_feedback

The test covers diverse feedback scenarios:

  • Traditional text feedback with sentiment
  • New categorized feedback without text
  • Proper verification of stored data structure
tests/unit/auth/test_jwk_token.py (2)

19-73: Well-designed fixture architecture!

The fixtures demonstrate excellent design:

  • Modular key generation with make_key() helper
  • Composable fixtures that build upon each other
  • Proper cleanup of global state with clear_jwk_cache
  • Clear naming conventions for different test scenarios

182-540: Comprehensive test coverage for JWT authentication!

The test suite excellently covers:

  • ✅ Valid token scenarios (single and multi-key)
  • ✅ Authentication failures (expired, invalid signature)
  • ✅ Missing components (headers, claims)
  • ✅ Custom claim configurations
  • ✅ Edge cases (tokens without kid, multiple keys with same algorithm)
  • ✅ Proper HTTP status codes and error messages
src/app/main.py (2)

3-15: LGTM: Clean import organization and appropriate dependencies.

The reorganized imports and new dependencies support the metrics middleware functionality effectively.


43-65: Well-implemented REST API metrics middleware.

The middleware correctly filters requests, measures response durations, and tracks API calls while appropriately excluding the /metrics endpoint from counter increments to avoid noise.
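The timing-and-counting logic can be sketched framework-free. The real code is a Starlette HTTP middleware updating Prometheus metrics; the callback names below are placeholders for those metric objects.

```python
import time


def track_rest_call(path, handler, observe_duration, count_call):
    """Time a request handler, record its duration, and count the call —
    skipping /metrics so Prometheus scrapes don't inflate the counter."""
    start = time.perf_counter()
    status, body = handler()
    observe_duration(path, time.perf_counter() - start)
    if path != "/metrics":
        count_call(path, status)
    return status, body
```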

src/metrics/__init__.py (1)

1-51: Excellent Prometheus metrics implementation.

The metrics definitions follow best practices with:

  • Consistent naming convention with ls_ prefix
  • Appropriate metric types for each use case
  • Well-defined labels for proper dimensionality
  • Clear, descriptive help text

The TODO items for token usage metrics are acceptable for future implementation.

tests/unit/test_client.py (2)

9-33: Excellent test coverage for client error handling.

The new tests properly verify that uninitialized clients raise RuntimeError with appropriate error messages. The regex matching correctly escapes parentheses in the error messages.


47-48: Good enhancement of existing tests.

Adding assertions to verify get_client() returns non-None values after successful initialization strengthens the test coverage and validates the complete client lifecycle.

Also applies to: 63-64, 96-97, 112-113
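The lifecycle those tests exercise can be sketched as below; the class name and message wording are assumptions, not the project's actual code.

```python
class LlamaStackClientHolder:
    """Guard access to a client that must be initialised first."""

    def __init__(self):
        self._client = None

    def init(self, client):
        self._client = client

    def get_client(self):
        # Uninitialised access raises rather than returning None,
        # which is what the new tests assert on.
        if self._client is None:
            raise RuntimeError(
                "LlamaStack client is not initialised (call init() first)"
            )
        return self._client
```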

pyproject.toml (3)

6-20: Good project metadata enhancements.

The addition of maintainers, classifiers, and keywords improves project discoverability and provides useful metadata for package consumers.


32-35: New runtime dependencies support added features appropriately.

The additions of prometheus-client, starlette, aiohttp, and authlib align well with the metrics infrastructure and authentication features being introduced.


74-76: Development dependency updates look reasonable.

The addition of types-cachetools, build, and upgraded twine version support better development and publishing workflows.

tests/configuration/run.yaml (1)

1-125: Well-structured test configuration for llama stack environment.

The configuration provides a comprehensive minimal setup with:

  • Complete API coverage for testing various capabilities
  • SQLite-based stores appropriate for test environments
  • Proper environment variable usage for API key security
  • Reasonable server configuration for testing

The dependency on ${env.OPENAI_API_KEY} assumes this environment variable is available during testing, which should be documented in test setup instructions.

tests/unit/app/endpoints/test_models.py (3)

7-8: LGTM!

The import of APIConnectionError is appropriately added to support the new test case for connection error handling.


54-54: Verify the boolean values match intended behavior.

The configuration keys have been changed from negative flags (transcripts_disabled, feedback_disabled) to positive flags (transcripts_enabled, feedback_enabled). However, the boolean values remain False. Please ensure this aligns with the intended behavior - if transcripts/feedback were previously enabled (with *_disabled: False), they should now be *_enabled: True.

Also applies to: 98-98, 137-137, 181-181


145-145: LGTM!

Good standardization to use mocker.Mock() from pytest-mock instead of direct Mock() import.

tests/unit/models/test_requests.py (8)

8-8: LGTM!

The import of FeedbackCategory is necessary for the new category-related tests.


125-134: LGTM!

Good edge case test for get_documents method with empty attachments.


221-233: LGTM!

The test correctly validates that at least one form of feedback must be provided, including the new categories field.


247-264: LGTM!

Comprehensive test for the new categories feature with multiple category values.


265-274: LGTM!

Good test case for single category feedback.


275-292: LGTM!

Good test to verify duplicate category handling.


293-304: LGTM!

Good validation that empty categories list is properly rejected.


305-358: LGTM!

Excellent test coverage for the categories feature including:

  • Categories-only feedback
  • Mixed feedback types (categories + sentiment + user feedback)
  • Validation of all category enum values
  • Type validation for invalid input

The test suite comprehensively covers the new functionality.

src/auth/jwk_token.py (1)

122-191: LGTM! Comprehensive JWT validation implementation.

Excellent error handling with appropriate HTTP status codes for different failure scenarios:

  • 401 for authentication failures (bad signature, expired token, missing claims)
  • 400 for malformed tokens
  • 500 for internal errors

The claim extraction properly validates required fields and logs successful authentication.
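The mapping from failure class to HTTP status can be sketched with stand-in exception types (authlib's real exception names differ from these).

```python
class BadSignatureError(Exception): ...
class ExpiredTokenError(Exception): ...
class MissingClaimError(Exception): ...
class MalformedTokenError(Exception): ...


def http_status_for(exc: Exception) -> int:
    """401 for authentication failures, 400 for malformed tokens,
    500 for anything unexpected — the scheme the review describes."""
    if isinstance(exc, (BadSignatureError, ExpiredTokenError, MissingClaimError)):
        return 401
    if isinstance(exc, MalformedTokenError):
        return 400
    return 500
```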

tests/unit/utils/test_mcp_headers.py (1)

1-185: LGTM! Comprehensive test coverage.

Excellent test coverage for MCP header utilities including:

  • Valid single and multiple header entries
  • Invalid JSON handling
  • Type validation (list, null)
  • Both sync and async function variants

The tests properly verify that invalid inputs return empty dictionaries as expected.

src/app/endpoints/streaming_query.py (5)

3-19: LGTM: Import additions support new functionality.

The new imports for ast, logging, and additional type hints are appropriate for the enhanced streaming event handling and no_tools feature implementation.


54-87: LGTM: Correct implementation of no_tools parameter.

The get_agent function properly implements the no_tools feature by:

  • Adding the parameter with appropriate default value for backward compatibility
  • Correctly setting tool_parser=None when tools are disabled
  • Enhancing logging for better debugging

The implementation aligns well with the PR objectives.
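A sketch of the tool_parser selection: `GraniteToolParser` appears in the tests reviewed later, but `get_parser`, the model-name check, and the kwargs shape are assumptions for illustration.

```python
class GraniteToolParser:
    """Stand-in for the project's Granite tool-call parser."""

    @staticmethod
    def get_parser(model_id: str):
        # Illustrative: only Granite models get a parser instance.
        return GraniteToolParser() if "granite" in model_id else None


def agent_kwargs(model_id: str, no_tools: bool = False) -> dict:
    # With no_tools=True the agent is created with tool_parser=None,
    # so tool-call parsing never happens.
    tool_parser = None if no_tools else GraniteToolParser.get_parser(model_id)
    return {"model": model_id, "tool_parser": tool_parser}
```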


140-412: Excellent refactoring of streaming event handling.

The reorganization of stream_build_event into specialized helper functions is a significant improvement:

  • Better separation of concerns: Each function handles a specific event type
  • Improved maintainability: Logic is broken into manageable, focused functions
  • Enhanced readability: Clear naming and organization make the code easier to follow
  • Better testability: Individual functions can be tested in isolation

The helper functions are well-organized into logical sections (error, turn, shield, inference, tool execution handling), making the codebase much more maintainable.
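The refactor amounts to a dispatch table; the handler names below mirror the review's grouping (error, turn, shield, inference, tool execution), not the exact function names in the code.

```python
def _handle_error(event): return f"error: {event['detail']}"
def _handle_turn(event): return "turn lifecycle event"
def _handle_shield(event): return "shield call event"
def _handle_inference(event): return "inference delta"
def _handle_tool_execution(event): return "tool execution event"

_HANDLERS = {
    "error": _handle_error,
    "turn": _handle_turn,
    "shield_call": _handle_shield,
    "inference": _handle_inference,
    "tool_execution": _handle_tool_execution,
}


def stream_build_event(event: dict):
    """Route a streaming chunk to the helper for its event type."""
    handler = _HANDLERS.get(event.get("type"))
    return handler(event) if handler else None
```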


432-486: LGTM: Proper metrics integration.

The metrics tracking implementation is well-executed:

  • Correctly tracks both successful LLM calls and failures
  • Uses appropriate labels (provider_id, model_id) for detailed monitoring
  • Integrates seamlessly with the existing codebase patterns
  • Provides good observability for the streaming query endpoint

529-585: LGTM: Comprehensive no_tools implementation.

The retrieve_response function correctly implements the no_tools feature:

When no_tools=True:

  • Properly bypasses MCP headers and toolgroups
  • Clears agent extra headers to prevent tool-related data leakage
  • Sets toolgroups to None to disable tool execution

When no_tools=False:

  • Maintains existing behavior with full tool and MCP support
  • Preserves backward compatibility

The conditional logic is clear and the implementation aligns perfectly with the PR objectives.
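The branching can be sketched as follows; the toolgroup shape and the header key are assumptions based on the review's description, not the endpoint's actual wire format.

```python
def resolve_tooling(no_tools, mcp_headers, mcp_servers, vector_db_ids):
    """Return (extra_headers, toolgroups) for a query."""
    if no_tools:
        # Bypass MCP entirely: no headers leak through, no tools run.
        return {}, None
    # RAG toolgroup first, then one entry per configured MCP server.
    toolgroups = [
        {"name": "builtin::rag/knowledge_search",
         "args": {"vector_db_ids": vector_db_ids}},
    ] + [{"name": server} for server in mcp_servers]
    headers = {"X-MCP-Headers": mcp_headers}  # hypothetical header key
    return headers, toolgroups
```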

docs/openapi.json (6)

4-10: Metadata update looks good
Title, description and version bump are consistent with the broader re-branding effort.


700-709: Verify optional jwk_config interplay with module field
If authentication.module ≠ "jwt", a non-null jwk_config is meaningless. Add oneOf/allOf constraints or at least document the expected combination to prevent invalid configs slipping through validation.


981-993: New FeedbackCategory enum is well-defined
Enumeration and description provide clear semantics.


1200-1239: JWT defaults: confirm claim names
Hard-coding "user_id" / "username" as defaults is fine if the backend honors them. Double-check the auth layer to ensure mismatches don’t surface at runtime.


1240-1258: LivenessResponse schema addition LGTM
Straight-forward and consistent with existing readiness schema.


1690-1710: Positive flag defaults may enable collection unintentionally
Switching from *_disabled to *_enabled flips default semantics. With both flags defaulting to false, behaviour remains unchanged—but any legacy config files that relied on omission to disable collection now require explicit false. Document this breaking change prominently in release notes.

tests/unit/app/endpoints/test_streaming_query.py (12)

3-3: LGTM! Import added for handling timestamps.

The addition of datetime import is appropriate for handling timestamps in the new test cases for turn events.


16-35: LGTM! Comprehensive Llama Stack type imports.

The imports cover all necessary types for creating properly typed streaming response chunks and events, replacing generic mocks with precise type representations.


101-101: LGTM! Configuration key corrected.

The change from "transcripts_disabled": True to "transcripts_enabled": False aligns with the global inversion of user data collection flags.


177-221: LGTM! Well-structured streaming response using proper types.

The test correctly constructs typed AgentTurnResponseStreamChunk objects with appropriate event payloads, providing realistic test data. The explanatory comment about why mocks cannot be used is helpful for future maintainers.


229-231: LGTM! Updated to match new model selection logic.

The patch correctly reflects the change from select_model_id to select_model_and_provider_id returning a tuple of (model_id, provider_id).


269-270: Verify the expected streaming content length.

The assertion expects exactly 5 streaming content chunks and extracts referenced documents from the 5th chunk. Please verify this matches the actual streaming behavior with the new event structure.

Also applies to: 283-284


533-533: LGTM! Consistent addition of no_tools parameter.

The explicit no_tools=False parameter has been consistently added to all get_agent calls, maintaining backward compatibility and testing the default behavior.

Also applies to: 1042-1042, 1110-1110, 1189-1189


650-817: LGTM! Comprehensive test coverage for event types.

The tests thoroughly cover all event types including turn lifecycle, shield violations (with and without), and properly verify:

  • SSE format and content
  • Metric incrementation for validation errors
  • Proper role assignment for different event types

The structured approach with typed objects provides realistic test scenarios.


819-904: LGTM! Well-tested step progress events.

The tests properly cover:

  • Text delta events
  • Tool call delta events with both string and ToolCall object payloads
  • Correct event type and role assignments

Good coverage of the different delta types that can occur during step progress.


1569-1647: LGTM! Proper endpoint handler tests for no_tools parameter.

The tests correctly verify that the streaming query endpoint handler works with both no_tools=True and no_tools=False, returning a StreamingResponse in both cases.


1650-1756: LGTM! Comprehensive tests for no_tools impact on MCP and RAG.

The tests thoroughly verify that:

  • When no_tools=True: MCP headers are empty, toolgroups is None
  • When no_tools=False: MCP headers are populated, toolgroups include both RAG and MCP servers

Excellent coverage of the feature's core functionality.


1759-1874: LGTM! Proper verification of tool_parser behavior.

The tests correctly verify that:

  • When no_tools=True: tool_parser=None is passed to AsyncAgent
  • When no_tools=False: The GraniteToolParser is properly instantiated and passed

This ensures tool parsing is disabled when tools are not needed.

docs/config.puml (1)

1-95: Documentation changes appear unrelated to PR objective

The PlantUML diagram changes introduce JWT/JWK authentication, inference configuration, and rename user data collection flags, but none of these changes relate to the PR's stated objective of adding a no_tools parameter to the /query endpoint. These appear to be unrelated changes that should be in a separate PR.

Likely an incorrect or invalid review comment.

src/models/requests.py (2)

72-73: LGTM! The no_tools parameter is correctly implemented

The addition of the optional no_tools boolean parameter with a default value of False is well-implemented and documented.

Also applies to: 86-86, 102-102


153-167: Feedback categorization changes are out of scope for this PR

The introduction of FeedbackCategory enum and the categories field in FeedbackRequest are unrelated to the PR's objective of adding a no_tools parameter. These substantial changes should be in a separate PR to maintain focus and make reviews easier.

Also applies to: 178-212, 264-291

Likely an incorrect or invalid review comment.

tests/unit/models/test_config.py (1)

1-709: Incorrect missing tests for no_tools parameter

Tests covering the no_tools parameter already exist in the endpoint test suites:

  • tests/unit/app/endpoints/test_query.py
  • tests/unit/app/endpoints/test_streaming_query.py

The test_config.py file is focused on model configuration and is not the appropriate place for no_tools behavior tests. You can safely ignore the original comment.

Likely an incorrect or invalid review comment.

src/models/config.py (2)

98-100: Good addition of file validation

The addition of file existence validation for library_client_config_path is a good improvement that prevents runtime errors.


150-162: Configuration changes unrelated to PR objective

The addition of JwtConfiguration, JwkConfiguration, and InferenceConfiguration classes are substantial changes that don't relate to the PR's objective of adding a no_tools parameter to the /query endpoint. These should be in a separate PR.

Also applies to: 220-238

Likely an incorrect or invalid review comment.

# .github/workflows/e2e_test.yml
name: E2E Tests

on: [push, pull_request_target]

⚠️ Potential issue

Security risk: Running untrusted PR code with secrets access.

Using pull_request_target with checkout of PR head code and access to secrets (OPENAI_API_KEY) is dangerous. Malicious PRs could:

  1. Modify docker-compose files or test code to exfiltrate secrets
  2. Execute arbitrary commands with repository permissions

Consider either:

  • Use pull_request instead of pull_request_target (but you'll lose access to secrets)
  • Keep pull_request_target but don't checkout PR code - only run trusted code from the base repository
  • Implement a two-workflow pattern where untrusted code runs first without secrets, then trusted code validates results

Also applies to: 13-25

🤖 Prompt for AI Agents
In .github/workflows/e2e_tests.yaml around lines 4 and 13 to 25, the workflow
uses pull_request_target event with checkout of PR code and access to secrets,
which poses a security risk. To fix this, change the event from
pull_request_target to pull_request to avoid exposing secrets to untrusted PR
code, or modify the workflow to avoid checking out PR code when using
pull_request_target, ensuring only trusted base repository code runs with
secrets. Alternatively, implement a two-workflow approach where untrusted code
runs first without secrets and trusted code runs later with secrets to validate
results.

Comment on lines 26 to 46
- name: Debug checkout for umago/lightspeed-stack setup-metrics branch
  run: |
    echo "=== GitHub Event Information ==="
    echo "Event name: ${{ github.event_name }}"
    echo "Base repo: ${{ github.repository }}"
    echo "Base SHA: ${{ github.sha }}"
    echo ""
    echo "=== PR Information ==="
    echo "PR head repo: '${{ github.event.pull_request.head.repo.full_name }}'"
    echo "PR head ref: '${{ github.event.pull_request.head.ref }}'"
    echo "PR head SHA: '${{ github.event.pull_request.head.sha }}'"
    echo "PR number: ${{ github.event.pull_request.number }}"
    echo ""
    echo "=== Resolved Checkout Values ==="
    echo "Repository used: ${{ github.event.pull_request.head.repo.full_name || github.repository }}"
    echo "Ref used: ${{ github.event.pull_request.head.ref || github.sha }}"
    echo ""
    echo "=== Expected for umago/lightspeed-stack:setup-metrics ==="
    echo "Should be repo: umago/lightspeed-stack"
    echo "Should be ref: setup-metrics"

⚠️ Potential issue

Fix potential command injection vulnerability.

The static analysis correctly identifies that using ${{ github.event.pull_request.head.ref }} directly in shell commands is unsafe. Malicious branch names could inject commands.

Use environment variables instead:

       - name: Debug checkout for umago/lightspeed-stack setup-metrics branch
+        env:
+          PR_HEAD_REF: ${{ github.event.pull_request.head.ref }}
         run: |
           echo "=== GitHub Event Information ==="
           echo "Event name: ${{ github.event_name }}"
           echo "Base repo: ${{ github.repository }}"
           echo "Base SHA: ${{ github.sha }}"
           echo ""
           echo "=== PR Information ==="
           echo "PR head repo: '${{ github.event.pull_request.head.repo.full_name }}'"
-          echo "PR head ref: '${{ github.event.pull_request.head.ref }}'"
+          echo "PR head ref: '$PR_HEAD_REF'"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Debug checkout for umago/lightspeed-stack setup-metrics branch
  run: |
    echo "=== GitHub Event Information ==="
    echo "Event name: ${{ github.event_name }}"
    echo "Base repo: ${{ github.repository }}"
    echo "Base SHA: ${{ github.sha }}"
    echo ""
    echo "=== PR Information ==="
    echo "PR head repo: '${{ github.event.pull_request.head.repo.full_name }}'"
    echo "PR head ref: '${{ github.event.pull_request.head.ref }}'"
    echo "PR head SHA: '${{ github.event.pull_request.head.sha }}'"
    echo "PR number: ${{ github.event.pull_request.number }}"
    echo ""
    echo "=== Resolved Checkout Values ==="
    echo "Repository used: ${{ github.event.pull_request.head.repo.full_name || github.repository }}"
    echo "Ref used: ${{ github.event.pull_request.head.ref || github.sha }}"
    echo ""
    echo "=== Expected for umago/lightspeed-stack:setup-metrics ==="
    echo "Should be repo: umago/lightspeed-stack"
    echo "Should be ref: setup-metrics"
- name: Debug checkout for umago/lightspeed-stack setup-metrics branch
  env:
    PR_HEAD_REF: ${{ github.event.pull_request.head.ref }}
  run: |
    echo "=== GitHub Event Information ==="
    echo "Event name: ${{ github.event_name }}"
    echo "Base repo: ${{ github.repository }}"
    echo "Base SHA: ${{ github.sha }}"
    echo ""
    echo "=== PR Information ==="
    echo "PR head repo: '${{ github.event.pull_request.head.repo.full_name }}'"
    echo "PR head ref: '$PR_HEAD_REF'"
    echo "PR head SHA: '${{ github.event.pull_request.head.sha }}'"
    echo "PR number: ${{ github.event.pull_request.number }}"
    echo ""
    echo "=== Resolved Checkout Values ==="
    echo "Repository used: ${{ github.event.pull_request.head.repo.full_name || github.repository }}"
    echo "Ref used: ${{ github.event.pull_request.head.ref || github.sha }}"
    echo ""
    echo "=== Expected for umago/lightspeed-stack:setup-metrics ==="
    echo "Should be repo: umago/lightspeed-stack"
    echo "Should be ref: setup-metrics"
🧰 Tools
🪛 actionlint (1.7.7)

27-27: "github.event.pull_request.head.ref" is potentially untrusted. avoid using it directly in inline scripts. instead, pass it through an environment variable. see https://docs.github.com/en/actions/security-for-github-actions/security-guides/security-hardening-for-github-actions for more details

(expression)

🪛 YAMLlint (1.37.1)

[error] 44-44: trailing spaces

(trailing-spaces)

🤖 Prompt for AI Agents
In .github/workflows/e2e_tests.yaml between lines 26 and 46, the use of GitHub
context expressions like ${{ github.event.pull_request.head.ref }} directly in
shell commands can lead to command injection vulnerabilities if branch names are
malicious. To fix this, assign these values to environment variables in the
workflow step and then reference the environment variables inside the shell
script. This approach safely handles the values and prevents injection risks.

Comment on lines 86 to 89

```yaml
feedback_disabled: false
feedback_storage: "/tmp/data/feedback"
transcripts_disabled: false
transcripts_storage: "/tmp/data/transcripts"
```

⚠️ Potential issue

Configuration keys inconsistent with codebase changes.

The configuration uses feedback_disabled and transcripts_disabled (negative flags), but other test files were updated to use positive flags (feedback_enabled, transcripts_enabled). This inconsistency could cause issues.

🤖 Prompt for AI Agents
In .github/workflows/e2e_tests.yaml around lines 86 to 89, the configuration
keys use negative flags feedback_disabled and transcripts_disabled, which is
inconsistent with the rest of the codebase that uses positive flags
feedback_enabled and transcripts_enabled. Update these keys to feedback_enabled
and transcripts_enabled respectively, and adjust their boolean values
accordingly to maintain the intended behavior.


```yaml
networks:
  lightspeednet:
    driver: bridge
```

(no newline at end of file)

⚠️ Potential issue

Add newline at end of file

Add a newline character at the end of the file to follow YAML best practices:

```diff
 networks:
   lightspeednet:
-    driver: bridge
+    driver: bridge
+
```
🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 32-32: no new line character at the end of file

(new-line-at-end-of-file)

🤖 Prompt for AI Agents
In docker-compose.yaml at line 32, add a newline character at the end of the
file to comply with YAML best practices and ensure proper file formatting.

Comment on lines 787 to 796

```json
"inference": {
    "anyOf": [
        {
            "$ref": "#/components/schemas/InferenceConfiguration"
        },
        {
            "type": "null"
        }
    ],
    "default": {}
```

⚠️ Potential issue

no_tools flag missing from QueryRequest schema
The PR objective is to expose an optional no_tools boolean on /query (and /streaming_query). That field is implemented in code but absent from the OpenAPI spec, breaking client generation and contract testing.

Apply something like:

```diff
 @@ "QueryRequest": {
     "properties": {
+        "no_tools": {
+            "type": "boolean",
+            "title": "No Tools",
+            "description": "When true, skip invocation of any external tools during query processing.",
+            "default": false
+        },
```

and remember to extend the examples blocks.

🤖 Prompt for AI Agents
In docs/openapi.json around lines 787 to 796, the QueryRequest schema is missing
the optional boolean field `no_tools`, which is implemented in code but absent
in the OpenAPI spec. Add the `no_tools` field as an optional boolean property to
the QueryRequest schema, ensuring it is included in the `properties` section
with a default value if applicable. Also, update the `examples` blocks to
include examples demonstrating the usage of the `no_tools` flag to keep the
documentation and client generation consistent.
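For reference, the code-side contract the schema should mirror can be sketched as follows. This is a minimal illustration using a stdlib dataclass as a stand-in for the actual Pydantic model; the `query` field and constructor shape are assumptions, only `no_tools` and its default come from the PR:

```python
from dataclasses import dataclass


@dataclass
class QueryRequest:
    """Illustrative stand-in for the Pydantic request model."""

    query: str
    no_tools: bool = False  # default False keeps existing clients working unchanged


# Omitting the flag behaves exactly as before; opting in disables tool use.
default_req = QueryRequest(query="What pods are failing?")
opt_out_req = QueryRequest(query="What pods are failing?", no_tools=True)
```

Because the field defaults to `false`, clients that never send `no_tools` see no behavior change, which is what makes the schema addition backward compatible.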

Comment on lines 25 to 28

```python
default_model_label = (
    configuration.inference.default_provider,  # type: ignore[reportAttributeAccessIssue]
    configuration.inference.default_model,  # type: ignore[reportAttributeAccessIssue]
)
```

⚠️ Potential issue

Add error handling for missing inference configuration.

The code assumes configuration.inference exists and has both default_provider and default_model attributes, but these could be None based on the Optional[InferenceConfiguration] return type.

```diff
+    inference_config = configuration.inference
+    if not inference_config or not inference_config.default_provider or not inference_config.default_model:
+        logger.warning("Default model/provider not configured, all metrics will be set to 0")
+        default_model_label = (None, None)
+    else:
+        default_model_label = (
+            inference_config.default_provider,
+            inference_config.default_model,
+        )
-    default_model_label = (
-        configuration.inference.default_provider,  # type: ignore[reportAttributeAccessIssue]
-        configuration.inference.default_model,  # type: ignore[reportAttributeAccessIssue]
-    )
```
🤖 Prompt for AI Agents
In src/metrics/utils.py around lines 25 to 28, the code assumes
configuration.inference and its attributes default_provider and default_model
always exist, but they can be None. Add error handling to check if
configuration.inference is None or if either attribute is missing before
accessing them. If missing, handle the case gracefully by providing default
values or raising a clear error to avoid runtime exceptions.
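The guarded lookup can also be factored into a small helper so the fallback is testable in isolation. A sketch under the assumption that `inference` is either `None` or an object with optional `default_provider`/`default_model` attributes (the function name is illustrative, not from the PR):

```python
from types import SimpleNamespace
from typing import Optional, Tuple


def default_model_label(inference) -> Tuple[Optional[str], Optional[str]]:
    """Return (provider, model), or (None, None) when either is unset."""
    provider = getattr(inference, "default_provider", None) if inference else None
    model = getattr(inference, "default_model", None) if inference else None
    if not provider or not model:
        return (None, None)
    return (provider, model)


# Missing configuration degrades gracefully instead of raising AttributeError.
configured = SimpleNamespace(default_provider="openai", default_model="gpt-4o")
assert default_model_label(configured) == ("openai", "gpt-4o")
assert default_model_label(None) == (None, None)
```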

```dockerfile
RUN microdnf install -y --nodocs --setopt=keepcache=0 --setopt=tsflags=nodocs \
    python3.12 python3.12-devel python3.12-pip git tar

RUN curl -LsSf https://astral.sh/uv/install.sh | sh
```

⚠️ Potential issue

Verify UV installer integrity

Downloading and executing scripts directly from the internet poses security risks. Consider verifying the installer's checksum or using a specific version.

```diff
-RUN curl -LsSf https://astral.sh/uv/install.sh | sh
+RUN curl -LsSf https://astral.sh/uv/0.5.18/install.sh -o install.sh && \
+    echo "expected-checksum install.sh" | sha256sum -c && \
+    sh install.sh && \
+    rm install.sh
```
🤖 Prompt for AI Agents
In test.containerfile at line 15, the RUN command downloads and executes the UV
installer script directly without verifying its integrity. To fix this, modify
the Dockerfile to download the installer script first, then verify its checksum
against a known value before executing it. Alternatively, specify a fixed
version of the installer to reduce risk. This ensures the script's authenticity
and prevents executing tampered code.

Comment on lines 19 to 37

```dockerfile
RUN uv venv && \
    uv pip install llama-stack \
    fastapi \
    opentelemetry-sdk \
    opentelemetry-exporter-otlp \
    opentelemetry-instrumentation \
    aiosqlite \
    litellm \
    uvicorn \
    blobfile \
    datasets \
    sqlalchemy \
    faiss-cpu \
    mcp \
    autoevals \
    psutil \
    torch \
    peft \
    trl
```

🛠️ Refactor suggestion

Pin package versions for reproducibility and security

Installing packages without version constraints can lead to non-reproducible builds and potential security vulnerabilities.

Consider pinning specific versions for critical packages:

```diff
 RUN uv venv && \
-    uv pip install llama-stack \
+    uv pip install llama-stack==0.1.0 \
-    fastapi \
+    fastapi==0.115.0 \
-    opentelemetry-sdk \
+    opentelemetry-sdk==1.28.0 \
     # ... pin other packages similarly
```

Alternatively, use a requirements.txt file with pinned versions for better maintainability.

🤖 Prompt for AI Agents
In test.containerfile around lines 19 to 37, the RUN command installs multiple
Python packages without specifying versions, which can cause non-reproducible
builds and security risks. To fix this, pin specific versions for each package
by appending version numbers (e.g., package==1.2.3) or switch to using a
requirements.txt file with all dependencies and their versions listed, then
install using pip install -r requirements.txt for better maintainability and
reproducibility.

Comment on lines 203 to 206

```python
with pytest.raises(HTTPException) as e:
    models_endpoint_handler(request)
    assert e.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
    assert e.detail["response"] == "Unable to connect to Llama Stack"
```

⚠️ Potential issue

Fix assertion placement in exception test.

The assertions on lines 205-206 are unreachable because they're inside the with pytest.raises(HTTPException) block. Once the exception is raised, the code exits the block. Move the assertions outside:

```diff
-    with pytest.raises(HTTPException) as e:
+    with pytest.raises(HTTPException) as exc_info:
         models_endpoint_handler(request)
-        assert e.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
-        assert e.detail["response"] == "Unable to connect to Llama Stack"
+    assert exc_info.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
+    assert exc_info.value.detail == "Unable to connect to Llama Stack"
```
🤖 Prompt for AI Agents
In tests/unit/app/endpoints/test_models.py around lines 203 to 206, the
assertions checking the exception's status_code and detail are placed inside the
pytest.raises context manager block, making them unreachable. Move these
assertions outside the with block so they execute after the exception is caught,
ensuring the test properly verifies the exception details.

@@ -0,0 +1,69 @@
"""Unit tests for functions defined in metrics/utils.py"""

⚠️ Potential issue

Fix typo in filename: test_utis.py → test_utils.py

The filename contains a typo. It should be test_utils.py to match the module being tested (metrics/utils.py).

🤖 Prompt for AI Agents
In tests/unit/metrics/test_utis.py at line 1, the filename contains a typo and
should be renamed from test_utis.py to test_utils.py to correctly reflect the
module being tested (metrics/utils.py). Rename the file accordingly to fix the
typo.

@umago umago left a comment

LGTM thank you!

@tisnik tisnik merged commit 1c95a1a into lightspeed-core:main Aug 1, 2025
17 of 18 checks passed