
Conversation

@thoraxe thoraxe commented Oct 31, 2025

Description

Implements the /v1/responses endpoint, providing an OpenAI-compatible API interface while leveraging the existing Lightspeed RAG and LLM integration. An illustrative request is sketched after the list below.

  • Add CreateResponseRequest and OpenAIResponse models following OpenAI spec
  • Implement responses endpoint handler with proper auth and error handling
  • Add OpenAI to Lightspeed request/response mapping utilities
  • Add RESPONSES action to authorization system
  • Include comprehensive unit test coverage (100% for new code)
  • Maintain full compatibility with existing authentication patterns
  • Support referenced documents via metadata field for RAG integration
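
For illustration, a request to the new endpoint might look like the following sketch. The field names (model, input, instructions, temperature, max_output_tokens) follow the OpenAI Responses shape described in this PR; the host, bearer token, and model value are placeholders, not values taken from the code.

# Hypothetical example request; host, token, and model name are placeholders.
import requests

payload = {
    "model": "gpt-4o-mini",               # any model available to the deployment
    "input": "How do I scale a deployment in OpenShift?",
    "instructions": "Answer concisely.",  # mapped to the Lightspeed system prompt
    "temperature": 0.2,
    "max_output_tokens": 512,
}

response = requests.post(
    "http://localhost:8080/v1/responses",         # assumed local service URL
    json=payload,
    headers={"Authorization": "Bearer <token>"},   # auth follows existing patterns
    timeout=60,
)
print(response.status_code)
print(response.json())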

🤖 Generated with Claude Code

Co-Authored-By: Claude [email protected]

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

https://issues.redhat.com/browse/LCORE-901

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added OpenAI-compatible /responses endpoint for response generation with configurable model, instructions, temperature, and output token limits.
    • Added RESPONSES authorization action for endpoint access control.
    • Referenced documents now included in response metadata.
  • Improvements

    • Enhanced error handling with specific HTTP status codes for connection and validation failures.
    • Added LLM failure metrics tracking for monitoring.

@coderabbitai coderabbitai bot commented Oct 31, 2025

Walkthrough

This pull request introduces a new OpenAI-compatible /responses endpoint that accepts OpenAI-formatted requests, translates them into internal QueryRequest objects, retrieves responses via the existing Llama Stack integration, and returns OpenAI-formatted responses. It includes new request/response models, mapping utilities, authorization configuration, and comprehensive test coverage.

Changes

  • Core Endpoint Implementation (src/app/endpoints/responses.py, src/utils/openai_mapping.py): Introduces a new endpoint handler that validates config, extracts auth, maps OpenAI requests to the internal format, retrieves responses via AsyncLlamaStackClientHolder, and maps results back to OpenAI format. Includes error handling for APIConnectionError (500), ValueError/AttributeError/TypeError (422), and OpenAPI response definitions. Also adds two mapping functions: map_openai_to_query_request() and map_query_to_openai_response().
  • Request/Response Models (src/models/requests.py, src/models/responses.py): Adds the CreateResponseRequest model with fields for model, input, instructions, temperature, and max_output_tokens. Introduces five new response models: ResponseContent, ResponseMessage, ResponseOutput, ResponseUsage, and OpenAIResponse, each with field-level and model-level validators for OpenAI compatibility.
  • Authorization & Configuration (src/models/config.py): Adds a new RESPONSES enum member to the Action class to represent the endpoint access permission.
  • Router Registration (src/app/routers.py): Imports the responses module and registers its router with the FastAPI app at prefix "/v1".
  • Endpoint Tests (tests/unit/app/endpoints/test_responses.py): Comprehensive unit tests covering successful response generation, authorization validation, error handling (APIConnectionError, ValueError, AttributeError, TypeError), and correct mapping invocation.
  • Integration & Router Tests (tests/unit/app/test_routers.py, tests/unit/authorization/test_resolvers.py): Updates router tests to verify the new responses router registration and prefix; adds an authorization test verifying RESPONSES action inclusion in AccessRule specifications.
  • Mapping & Model Tests (tests/unit/test_openai_mapping.py, tests/unit/test_openai_requests.py, tests/unit/test_openai_response_models.py): Adds extensive test coverage for the OpenAI mapping functions (bidirectional conversion, ID/timestamp generation, token usage), CreateResponseRequest validation, and response model validators (content type, role, finish_reason, status).
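
To make the request-side pieces above concrete, here is a minimal, self-contained sketch of how CreateResponseRequest and map_openai_to_query_request() might relate. The field names come from the summary above; the class bodies and the internal QueryRequest stand-in are simplified illustrations, not the actual definitions in src/models/requests.py or src/utils/openai_mapping.py.

# Simplified stand-ins; field names from the change summary, bodies assumed.
from pydantic import BaseModel


class QueryRequest(BaseModel):
    """Stand-in for the existing internal request model."""
    query: str                        # assumed field name on the internal model
    system_prompt: str | None = None


class CreateResponseRequest(BaseModel):
    """OpenAI-style request body (subset of the fields named above)."""
    model: str
    input: str | list[str]
    instructions: str | None = None
    temperature: float | None = None
    max_output_tokens: int | None = None


def map_openai_to_query_request(req: CreateResponseRequest) -> QueryRequest:
    """Translate the OpenAI-shaped request into the internal QueryRequest."""
    if not isinstance(req.input, str):
        # The MVP rejects array input; the real code raises a similar error.
        raise ValueError("Array input is not supported in the MVP")
    return QueryRequest(query=req.input, system_prompt=req.instructions or None)


if __name__ == "__main__":
    openai_req = CreateResponseRequest(model="gpt-4o-mini", input="What is OpenShift?")
    print(map_openai_to_query_request(openai_req))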

Sequence Diagram

sequenceDiagram
    participant Client
    participant Endpoint as /responses Endpoint
    participant Mapping as OpenAI Mapping
    participant LlamaStack as Llama Stack Client
    participant Metrics as Metrics

    Client->>Endpoint: POST /v1/responses (OpenAI format)
    Endpoint->>Endpoint: Validate config & auth
    Endpoint->>Mapping: map_openai_to_query_request()
    Mapping->>Mapping: Convert to internal format
    Mapping-->>Endpoint: QueryRequest

    Endpoint->>LlamaStack: retrieve_response(QueryRequest)
    
    alt Success
        LlamaStack-->>Endpoint: QueryResponse
        Endpoint->>Mapping: map_query_to_openai_response()
        Mapping->>Mapping: Generate ID, timestamp, usage
        Mapping-->>Endpoint: OpenAIResponse
        Endpoint-->>Client: 200 OpenAIResponse
    else APIConnectionError
        LlamaStack-->>Endpoint: APIConnectionError
        Endpoint->>Metrics: Increment LLM failure metric
        Endpoint-->>Client: 500 Error Detail
    else Validation Error
        Endpoint->>Endpoint: ValueError/AttributeError/TypeError
        Endpoint-->>Client: 422 Invalid Input
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Error handling paths: Endpoint implements three distinct error scenarios (APIConnectionError with metrics, ValueError/AttributeError/TypeError) requiring careful verification of HTTP status codes and payload structures.
  • Validator logic: Five response models contain cross-field validation rules (validate_type, validate_role, validate_finish_reason, etc.) that must be reviewed for correctness and edge case handling.
  • Bidirectional mapping: Two mapping functions must correctly transform between OpenAI and internal formats, including optional field handling and MVP-specific constraints.
  • Integration points: Interactions with AsyncLlamaStackClientHolder, auth dependencies, and metrics instrumentation need verification against existing patterns.
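
The error-handling paths listed above reduce to a small pattern; a rough sketch follows. The exception class, failure counter, and error detail payloads here are stand-ins rather than the actual identifiers in src/app/endpoints/responses.py.

# Illustrative only: stand-in exception, counter, and error detail payloads.
from fastapi import HTTPException, status


class APIConnectionError(Exception):
    """Stand-in for the Llama Stack client's connection error."""


llm_failures = 0  # placeholder for the real LLM failure metric


async def call_llm_safely(call):
    """Wrap an LLM call with the status-code mapping described above."""
    global llm_failures
    try:
        return await call()
    except APIConnectionError as exc:
        llm_failures += 1  # real code increments an LLM failure metric
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail={"response": "Unable to connect to Llama Stack", "cause": str(exc)},
        ) from exc
    except (ValueError, AttributeError, TypeError) as exc:
        raise HTTPException(
            status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
            detail={"response": "Invalid request", "cause": str(exc)},
        ) from exc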


Suggested reviewers

  • manstis
  • tisnik

Poem

🐰 Hops and cheers for OpenAI's grace,
A compatible endpoint now has its place,
With mapping so clever and models so fine,
Your responses will sparkle and beautifully align!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The PR title "[WIP] Add OpenAI Responses API endpoint with MVP functionality" clearly and specifically describes the main objective of the changeset: introducing a new OpenAI-compatible responses endpoint with minimum viable product scope. The title is concise at 62 characters, avoids vague terminology or noise, and accurately reflects the primary changes across the modified files, including the new endpoint handler, request/response models, mapping utilities, authorization action, and comprehensive test coverage. The [WIP] prefix appropriately signals work-in-progress status. A teammate scanning the repository history would immediately understand this PR adds OpenAI Responses API support to the Lightspeed Stack.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
tests/unit/test_openai_response_models.py (1)

104-113: Consider adding total_tokens validation.

The test acknowledges that total_tokens validation may be added later, but currently allows mismatches between the sum of prompt_tokens + completion_tokens and total_tokens. While this is documented as intentional, consider adding a @model_validator to ensure data integrity.

If you decide to add this validation, you could add this validator to the ResponseUsage model in src/models/responses.py:

@model_validator(mode="after")
def validate_total_tokens(self) -> "ResponseUsage":
    """Validate that total_tokens matches the sum of prompt and completion tokens."""
    expected_total = self.prompt_tokens + self.completion_tokens
    if self.total_tokens != expected_total:
        raise ValueError(
            f"total_tokens ({self.total_tokens}) must equal "
            f"prompt_tokens + completion_tokens ({expected_total})"
        )
    return self
tests/unit/app/endpoints/test_responses.py (1)

24-30: Consider using the shared MOCK_AUTH constant.

The coding guidelines specify: "Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests". While your current MOCK_AUTH uses a UUID format which may be intentional for this specific test, consider whether the shared constant should be used for consistency across the test suite.

As per coding guidelines

src/utils/openai_mapping.py (1)

23-73: Consider handling empty instructions string.

Line 55 directly assigns openai_request.instructions to system_prompt, which could be an empty string if provided. Depending on downstream behavior, an empty string might be treated differently than None. Consider normalizing empty strings to None for consistency.

Apply this diff if empty instructions should be treated as absent:

     # Map OpenAI instructions to Lightspeed system_prompt
-    system_prompt = openai_request.instructions
+    system_prompt = openai_request.instructions if openai_request.instructions else None
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd3291c and 65c6f42.

📒 Files selected for processing (12)
  • src/app/endpoints/responses.py (1 hunks)
  • src/app/routers.py (2 hunks)
  • src/models/config.py (1 hunks)
  • src/models/requests.py (1 hunks)
  • src/models/responses.py (2 hunks)
  • src/utils/openai_mapping.py (1 hunks)
  • tests/unit/app/endpoints/test_responses.py (1 hunks)
  • tests/unit/app/test_routers.py (5 hunks)
  • tests/unit/authorization/test_resolvers.py (1 hunks)
  • tests/unit/test_openai_mapping.py (1 hunks)
  • tests/unit/test_openai_requests.py (1 hunks)
  • tests/unit/test_openai_response_models.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/utils/openai_mapping.py
  • src/models/config.py
  • src/models/requests.py
  • src/models/responses.py
  • src/app/endpoints/responses.py
  • src/app/routers.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/utils/openai_mapping.py
  • src/models/config.py
  • tests/unit/authorization/test_resolvers.py
  • src/models/requests.py
  • tests/unit/test_openai_mapping.py
  • tests/unit/test_openai_response_models.py
  • tests/unit/test_openai_requests.py
  • src/models/responses.py
  • tests/unit/app/test_routers.py
  • src/app/endpoints/responses.py
  • tests/unit/app/endpoints/test_responses.py
  • src/app/routers.py
src/{models/config.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/config.py,configuration.py}: All configuration uses Pydantic models extending ConfigurationBase
Configuration base models must set model_config with extra="forbid" to reject unknown fields

Files:

  • src/models/config.py
src/{models/**/*.py,configuration.py}

📄 CodeRabbit inference engine (CLAUDE.md)

src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)

Files:

  • src/models/config.py
  • src/models/requests.py
  • src/models/responses.py
src/models/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation

Files:

  • src/models/config.py
  • src/models/requests.py
  • src/models/responses.py
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard

Files:

  • tests/unit/authorization/test_resolvers.py
  • tests/unit/test_openai_mapping.py
  • tests/unit/test_openai_response_models.py
  • tests/unit/test_openai_requests.py
  • tests/unit/app/test_routers.py
  • tests/unit/app/endpoints/test_responses.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests

Files:

  • tests/unit/authorization/test_resolvers.py
  • tests/unit/test_openai_mapping.py
  • tests/unit/test_openai_response_models.py
  • tests/unit/test_openai_requests.py
  • tests/unit/app/test_routers.py
  • tests/unit/app/endpoints/test_responses.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/responses.py
  • src/app/routers.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/responses.py
  • src/app/routers.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/responses.py
🧠 Learnings (3)
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/models/**/*.py : Use model_validator and field_validator for Pydantic model validation

Applied to files:

  • src/models/responses.py
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/{models/**/*.py,configuration.py} : Use field_validator and model_validator for custom validation in Pydantic models

Applied to files:

  • src/models/responses.py
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/app/**/*.py : Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Applied to files:

  • src/app/routers.py
🧬 Code graph analysis (8)
src/utils/openai_mapping.py (2)
src/models/requests.py (2)
  • CreateResponseRequest (418-512)
  • QueryRequest (73-225)
src/models/responses.py (5)
  • QueryResponse (194-305)
  • ResponseContent (1177-1224)
  • ResponseMessage (1227-1271)
  • ResponseOutput (1274-1308)
  • ResponseUsage (1311-1347)
tests/unit/authorization/test_resolvers.py (2)
src/models/config.py (2)
  • AccessRule (381-385)
  • Action (329-378)
src/authorization/resolvers.py (7)
  • GenericAccessResolver (146-202)
  • check_access (123-124)
  • check_access (134-138)
  • check_access (171-188)
  • get_actions (127-128)
  • get_actions (140-143)
  • get_actions (190-202)
tests/unit/test_openai_mapping.py (3)
src/models/requests.py (2)
  • CreateResponseRequest (418-512)
  • QueryRequest (73-225)
src/models/responses.py (7)
  • QueryResponse (194-305)
  • OpenAIResponse (1350-1544)
  • ReferencedDocument (179-191)
  • ResponseContent (1177-1224)
  • ResponseMessage (1227-1271)
  • ResponseOutput (1274-1308)
  • ResponseUsage (1311-1347)
src/utils/openai_mapping.py (2)
  • map_openai_to_query_request (23-73)
  • map_query_to_openai_response (76-154)
tests/unit/test_openai_response_models.py (1)
src/models/responses.py (5)
  • OpenAIResponse (1350-1544)
  • ResponseOutput (1274-1308)
  • ResponseMessage (1227-1271)
  • ResponseContent (1177-1224)
  • ResponseUsage (1311-1347)
tests/unit/test_openai_requests.py (1)
src/models/requests.py (1)
  • CreateResponseRequest (418-512)
src/app/endpoints/responses.py (8)
src/authentication/__init__.py (1)
  • get_auth_dependency (14-52)
src/authorization/middleware.py (1)
  • authorize (111-122)
src/client.py (2)
  • AsyncLlamaStackClientHolder (18-55)
  • get_client (49-55)
src/configuration.py (1)
  • configuration (73-77)
src/models/requests.py (1)
  • CreateResponseRequest (418-512)
src/models/responses.py (4)
  • OpenAIResponse (1350-1544)
  • ForbiddenResponse (1120-1142)
  • UnauthorizedResponse (1094-1117)
  • QueryResponse (194-305)
src/utils/endpoints.py (1)
  • check_configuration_loaded (111-123)
src/utils/openai_mapping.py (2)
  • map_openai_to_query_request (23-73)
  • map_query_to_openai_response (76-154)
tests/unit/app/endpoints/test_responses.py (7)
src/app/endpoints/responses.py (1)
  • responses_endpoint_handler (75-192)
src/models/config.py (2)
  • config (140-146)
  • Action (329-378)
src/models/requests.py (1)
  • CreateResponseRequest (418-512)
src/models/responses.py (7)
  • OpenAIResponse (1350-1544)
  • QueryResponse (194-305)
  • ReferencedDocument (179-191)
  • ResponseContent (1177-1224)
  • ResponseMessage (1227-1271)
  • ResponseOutput (1274-1308)
  • ResponseUsage (1311-1347)
src/utils/types.py (1)
  • TurnSummary (89-163)
src/utils/token_counter.py (1)
  • TokenCounter (18-41)
src/client.py (1)
  • get_client (49-55)
src/app/routers.py (1)
tests/unit/app/test_routers.py (1)
  • include_router (36-51)
🪛 GitHub Actions: Python linter
tests/unit/app/test_routers.py

[error] 43-43: Pylint: W0621: Redefining name 'responses' from outer scope (line 9) (redefined-outer-name).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-pr
  • GitHub Check: e2e_tests (azure)
  • GitHub Check: e2e_tests (ci)
🔇 Additional comments (10)
src/app/routers.py (1)

21-21: LGTM!

The import and registration of the responses router follows the existing patterns and is correctly integrated at the /v1 prefix alongside other API endpoints.

Also applies to: 39-39

tests/unit/authorization/test_resolvers.py (1)

344-362: LGTM!

The test properly validates that the RESPONSES action integrates with the authorization system. It confirms access control, permission checks, and action enumeration work as expected.

tests/unit/app/test_routers.py (1)

68-68: LGTM!

The router count and prefix assertions are correctly updated to account for the new responses router.

Also applies to: 77-77, 93-93, 102-102

src/models/config.py (1)

353-354: LGTM!

The RESPONSES action enum member is correctly added with appropriate documentation and value, following the established pattern for action definitions.

tests/unit/test_openai_requests.py (1)

1-140: LGTM!

The test suite comprehensively validates the CreateResponseRequest model, covering:

  • Valid minimal and full configurations
  • Required field validation
  • Boundary conditions for temperature and max_output_tokens
  • Extra field rejection (forbid mode)
  • Input validation for both string and array types
  • Model configuration structure

Test coverage appears complete for the new request model.

tests/unit/test_openai_response_models.py (1)

1-266: LGTM!

The test suite comprehensively validates all OpenAI response models, covering:

  • Valid and invalid type/role/finish_reason/status/object values
  • Empty array/string validation
  • Required field validation
  • Nested model validation
  • Metadata handling

Test coverage appears complete for the new response models.
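
For orientation, a response assembled from these models plausibly serializes to something like the dict below. The field names are taken from the model and test descriptions in this review; the exact nesting and values are guesses, not copied from the implementation.

# Guessed payload shape; field names from the models above, nesting assumed.
import json

example_response = {
    "id": "resp_123",        # generated unique ID
    "object": "response",
    "created": 1730000000,   # generated Unix timestamp
    "model": "gpt-4o-mini",
    "status": "completed",
    "output": [
        {
            "message": {
                "role": "assistant",
                "content": [{"type": "output_text", "text": "Answer text..."}],
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 25, "completion_tokens": 50, "total_tokens": 75},
    "metadata": {
        "referenced_documents": [  # document fields are illustrative
            {"title": "Example doc", "url": "https://example.com/doc"}
        ]
    },
}

print(json.dumps(example_response, indent=2))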

tests/unit/app/endpoints/test_responses.py (1)

140-370: LGTM!

The test suite comprehensively covers the /responses endpoint behavior:

  • Successful response generation with proper OpenAI response structure
  • Authorization enforcement (via signature inspection)
  • API connection error handling with metrics
  • Request and response mapping invocation
  • Validation error handling (ValueError, AttributeError, TypeError)

The tests properly mock dependencies and verify error paths, status codes, and metric updates.

src/utils/openai_mapping.py (2)

76-154: LGTM!

The mapping function correctly converts internal QueryResponse to OpenAI format:

  • Generates unique IDs and timestamps appropriately
  • Maps token usage correctly with calculated totals
  • Preserves referenced documents in metadata
  • Creates proper nested response structure with content, message, and output

The implementation follows the OpenAI specification and handles the MVP requirements well.
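
As a side note for reviewers, ID, timestamp, and total-token generation of this kind typically looks like the snippet below; this is an assumed approach, not code copied from src/utils/openai_mapping.py.

# Assumed approach to generated IDs, timestamps, and token totals.
import time
import uuid

response_id = f"resp_{uuid.uuid4().hex}"   # unique, OpenAI-style identifier (prefix assumed)
created_at = int(time.time())              # Unix creation timestamp
prompt_tokens, completion_tokens = 25, 50  # example values
total_tokens = prompt_tokens + completion_tokens
print(response_id, created_at, total_tokens)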


48-49: MVP limitation properly documented; external ticket verification required.

The code change at lines 48-49 correctly implements the MVP limitation for array input with clear error messaging. The codebase contains comments indicating Phase 2 enhancements are planned (including array input support and explicit model mapping). However, LCORE-901 is not referenced anywhere in the repository—it appears to be tracked in an external or internal ticket system outside the codebase. Verify that Phase 2 plans for array input support are properly tracked in your internal issue management system.

src/app/endpoints/responses.py (1)

31-31: Use module-qualified logger name.

The project's Python guidelines call for logger = logging.getLogger(__name__) so log records inherit the module path correctly. Please switch to the __name__-based logger. As per coding guidelines.

⛔ Skipped due to learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to **/*.py : Use logger = logging.getLogger(__name__) for module logging after import logging
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to **/*.py : Use appropriate logging levels: debug, info, warning, error with clear messages
Learnt from: luis5tb
Repo: lightspeed-core/lightspeed-stack PR: 727
File: src/app/endpoints/a2a.py:43-43
Timestamp: 2025-10-29T13:05:22.438Z
Learning: In the lightspeed-stack repository, endpoint files in src/app/endpoints/ intentionally use a shared logger name "app.endpoints.handlers" rather than __name__, allowing unified logging configuration across all endpoint handlers (query.py, streaming_query.py, a2a.py).

authorized,
metrics,
tools,
responses,

⚠️ Potential issue | 🟡 Minor

Fix the naming collision between imported module and parameter name.

The Pylint error indicates that the responses parameter on line 43 shadows the imported responses module from line 25. While this is in a mock class and may not cause runtime issues, it violates best practices and triggers linting failures.

Apply this diff to rename the parameter:

     def include_router(  # pylint: disable=too-many-arguments
         self,
         router: Any,
         *,
         prefix: str = "",
         tags=None,
         dependencies=None,
-        responses=None,
+        response_models=None,
         deprecated=None,
         include_in_schema=None,
         default_response_class=None,
         callbacks=None,
         generate_unique_id_function=None,
     ) -> None:

Also applies to: 43-43

🤖 Prompt for AI Agents
In tests/unit/app/test_routers.py around lines 25 and 43, the imported module
name responses (line 25) is being shadowed by a parameter named responses in the
mock class at line 43; rename the parameter (e.g., to resp or mock_responses)
and update all references to that parameter inside the class (constructor and
any methods) so they refer to the new name, leaving the module import unchanged;
run the tests/linter to confirm the shadowing warning is resolved.

# For MVP simplicity, use default model/provider selection logic from query.py
# This will be enhanced in Phase 2 to support explicit model mapping
summary, conversation_id, referenced_documents, token_usage = (
await retrieve_response(

Note: this is not using the Responses API from Llama Stack.

@maysunfaisal maysunfaisal commented Nov 3, 2025

Wouldn't this PR require moving Llama Stack to 0.3.x to use the new Llama Stack Responses API (https://llamastack.github.io/docs/api/agents)? The previous Llama Stack Agent APIs are deprecated (https://llamastack.github.io/docs/api-deprecated/agents). In other words, do we need an explicit LCORE /responses endpoint if we switch to Llama Stack 0.3.x?


Depends on what level of completeness LCORE is willing to live with, @maysunfaisal. The Responses API was introduced a few months ago in 0.2.x but was labeled work in progress, with some known bugs and missing pieces, so there could be some staging in play.

The end goal, though, should be a roadmap that, after some intermediate stages, has LCORE leveraging the Llama Stack OpenAI-compatible endpoint rather than the deprecated agent APIs or some other responses API endpoint.

