[WIP] Add OpenAI Responses API endpoint with MVP functionality #749
base: main
Conversation
Implements a /v1/responses endpoint providing an OpenAI-compatible API interface while leveraging the existing Lightspeed RAG and LLM integration.

- Add CreateResponseRequest and OpenAIResponse models following the OpenAI spec
- Implement responses endpoint handler with proper auth and error handling
- Add OpenAI to Lightspeed request/response mapping utilities
- Add RESPONSES action to authorization system
- Include comprehensive unit test coverage (100% for new code)
- Maintain full compatibility with existing authentication patterns
- Support referenced documents via metadata field for RAG integration

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Walkthrough

This pull request introduces a new OpenAI-compatible /v1/responses endpoint that maps OpenAI-format requests onto the existing Lightspeed query pipeline.

Changes
Sequence Diagram

```mermaid
sequenceDiagram
participant Client
participant Endpoint as /responses Endpoint
participant Mapping as OpenAI Mapping
participant LlamaStack as Llama Stack Client
participant Metrics as Metrics
Client->>Endpoint: POST /v1/responses (OpenAI format)
Endpoint->>Endpoint: Validate config & auth
Endpoint->>Mapping: map_openai_to_query_request()
Mapping->>Mapping: Convert to internal format
Mapping-->>Endpoint: QueryRequest
Endpoint->>LlamaStack: retrieve_response(QueryRequest)
alt Success
LlamaStack-->>Endpoint: QueryResponse
Endpoint->>Mapping: map_query_to_openai_response()
Mapping->>Mapping: Generate ID, timestamp, usage
Mapping-->>Endpoint: OpenAIResponse
Endpoint-->>Client: 200 OpenAIResponse
else APIConnectionError
LlamaStack-->>Endpoint: APIConnectionError
Endpoint->>Metrics: Increment LLM failure metric
Endpoint-->>Client: 500 Error Detail
else Validation Error
Endpoint->>Endpoint: ValueError/AttributeError/TypeError
Endpoint-->>Client: 422 Invalid Input
end
```
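Condensed into code, the happy path plus the two error branches look roughly like this. This is a minimal sketch only: the helper names come from this review's code graph, while the import paths, signatures, and the metrics call are assumptions rather than the PR's actual implementation.

```python
"""Sketch of the /v1/responses handler flow shown above (illustrative, not the PR's code)."""
import logging

from fastapi import APIRouter, HTTPException, status
from llama_stack_client import APIConnectionError  # assumed export

from app.endpoints.query import retrieve_response  # assumed location (query.py)
from client import AsyncLlamaStackClientHolder
from models.requests import CreateResponseRequest
from models.responses import OpenAIResponse
from utils.openai_mapping import map_openai_to_query_request, map_query_to_openai_response

logger = logging.getLogger("app.endpoints.handlers")  # shared endpoint logger name
router = APIRouter(tags=["responses"])


@router.post("/responses", response_model=OpenAIResponse)
async def responses_endpoint_handler(request: CreateResponseRequest) -> OpenAIResponse:
    """Translate an OpenAI-style request into the internal query path and back."""
    query_request = map_openai_to_query_request(request)  # OpenAI -> internal format
    client = AsyncLlamaStackClientHolder().get_client()   # holder API per the code graph
    try:
        query_response = await retrieve_response(client, query_request)  # assumed signature
    except APIConnectionError as exc:
        logger.error("Unable to connect to Llama Stack: %s", exc)
        # the real handler also increments the LLM-failure metric here
        raise HTTPException(status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(exc)) from exc
    except (ValueError, AttributeError, TypeError) as exc:
        raise HTTPException(status.HTTP_422_UNPROCESSABLE_ENTITY, detail=str(exc)) from exc
    return map_query_to_openai_response(query_response)   # internal -> OpenAI format
```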
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (3)
tests/unit/test_openai_response_models.py (1)
104-113: Consider adding total_tokens validation.

The test acknowledges that total_tokens validation may be added later, but currently allows mismatches between the sum of prompt_tokens + completion_tokens and total_tokens. While this is documented as intentional, consider adding a @model_validator to ensure data integrity.
If you decide to add this validation, you could add this validator to the ResponseUsage model in src/models/responses.py:
```python
@model_validator(mode="after")
def validate_total_tokens(self) -> "ResponseUsage":
    """Validate that total_tokens matches the sum of prompt and completion tokens."""
    expected_total = self.prompt_tokens + self.completion_tokens
    if self.total_tokens != expected_total:
        raise ValueError(
            f"total_tokens ({self.total_tokens}) must equal "
            f"prompt_tokens + completion_tokens ({expected_total})"
        )
    return self
```
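With such a validator in place, a mismatched usage object would fail fast at construction time, roughly like this (assuming Pydantic v2, which wraps validator errors in ValidationError, and the repo's absolute-import layout):

```python
import pytest
from pydantic import ValidationError

from models.responses import ResponseUsage

# 10 + 5 != 99, so the validator above would reject this instance
with pytest.raises(ValidationError):
    ResponseUsage(prompt_tokens=10, completion_tokens=5, total_tokens=99)
```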
tests/unit/app/endpoints/test_responses.py (1)

24-30: Consider using the shared MOCK_AUTH constant.

The coding guidelines specify: "Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests". While your current MOCK_AUTH uses a UUID format which may be intentional for this specific test, consider whether the shared constant should be used for consistency across the test suite.
As per coding guidelines
src/utils/openai_mapping.py (1)
23-73: Consider handling empty instructions string.

Line 55 directly assigns `openai_request.instructions` to `system_prompt`, which could be an empty string if provided. Depending on downstream behavior, an empty string might be treated differently than `None`. Consider normalizing empty strings to `None` for consistency.

Apply this diff if empty instructions should be treated as absent:
```diff
 # Map OpenAI instructions to Lightspeed system_prompt
-system_prompt = openai_request.instructions
+system_prompt = openai_request.instructions if openai_request.instructions else None
```
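If the diff above is applied, the observable effect would be along these lines (a hypothetical call; it assumes `model` and `input` are the only required request fields and that `QueryRequest` exposes `system_prompt`):

```python
from models.requests import CreateResponseRequest
from utils.openai_mapping import map_openai_to_query_request

# An empty instructions string no longer leaks through as an empty system prompt.
request = CreateResponseRequest(model="gpt-4", input="Hello", instructions="")
query_request = map_openai_to_query_request(request)
assert query_request.system_prompt is None  # previously would have been ""
```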
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- src/app/endpoints/responses.py (1 hunks)
- src/app/routers.py (2 hunks)
- src/models/config.py (1 hunks)
- src/models/requests.py (1 hunks)
- src/models/responses.py (2 hunks)
- src/utils/openai_mapping.py (1 hunks)
- tests/unit/app/endpoints/test_responses.py (1 hunks)
- tests/unit/app/test_routers.py (5 hunks)
- tests/unit/authorization/test_resolvers.py (1 hunks)
- tests/unit/test_openai_mapping.py (1 hunks)
- tests/unit/test_openai_requests.py (1 hunks)
- tests/unit/test_openai_response_models.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)
Files:
- src/utils/openai_mapping.py
- src/models/config.py
- src/models/requests.py
- src/models/responses.py
- src/app/endpoints/responses.py
- src/app/routers.py
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use `logger = logging.getLogger(__name__)` for module logging after `import logging`
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @abstractmethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed
Files:
- src/utils/openai_mapping.py
- src/models/config.py
- tests/unit/authorization/test_resolvers.py
- src/models/requests.py
- tests/unit/test_openai_mapping.py
- tests/unit/test_openai_response_models.py
- tests/unit/test_openai_requests.py
- src/models/responses.py
- tests/unit/app/test_routers.py
- src/app/endpoints/responses.py
- tests/unit/app/endpoints/test_responses.py
- src/app/routers.py
src/{models/config.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/config.py,configuration.py}: All configuration uses Pydantic models extending ConfigurationBase
Configuration base models must set model_config with extra="forbid" to reject unknown fields
Files:
src/models/config.py
src/{models/**/*.py,configuration.py}
📄 CodeRabbit inference engine (CLAUDE.md)
src/{models/**/*.py,configuration.py}: Use @field_validator and @model_validator for custom validation in Pydantic models
Use precise type hints in configuration (e.g., Optional[FilePath], PositiveInt, SecretStr)
Files:
- src/models/config.py
- src/models/requests.py
- src/models/responses.py
src/models/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/models/**/*.py: Pydantic models: use BaseModel for data models and extend ConfigurationBase for configuration
Use @model_validator and @field_validator for Pydantic model validation
Files:
- src/models/config.py
- src/models/requests.py
- src/models/responses.py
tests/{unit,integration}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard
Files:
- tests/unit/authorization/test_resolvers.py
- tests/unit/test_openai_mapping.py
- tests/unit/test_openai_response_models.py
- tests/unit/test_openai_requests.py
- tests/unit/app/test_routers.py
- tests/unit/app/endpoints/test_responses.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests
Files:
- tests/unit/authorization/test_resolvers.py
- tests/unit/test_openai_mapping.py
- tests/unit/test_openai_response_models.py
- tests/unit/test_openai_requests.py
- tests/unit/app/test_routers.py
- tests/unit/app/endpoints/test_responses.py
src/app/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Files:
- src/app/endpoints/responses.py
- src/app/routers.py
src/{app/**/*.py,client.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use async def for I/O-bound operations and external API calls
Files:
- src/app/endpoints/responses.py
- src/app/routers.py
src/app/endpoints/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling
Files:
src/app/endpoints/responses.py
🧠 Learnings (3)
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/models/**/*.py : Use model_validator and field_validator for Pydantic model validation
Applied to files:
src/models/responses.py
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/{models/**/*.py,configuration.py} : Use field_validator and model_validator for custom validation in Pydantic models
Applied to files:
src/models/responses.py
📚 Learning: 2025-09-18T16:46:33.353Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to src/app/**/*.py : Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code
Applied to files:
src/app/routers.py
🧬 Code graph analysis (8)
src/utils/openai_mapping.py (2)
- src/models/requests.py (2): CreateResponseRequest (418-512), QueryRequest (73-225)
- src/models/responses.py (5): QueryResponse (194-305), ResponseContent (1177-1224), ResponseMessage (1227-1271), ResponseOutput (1274-1308), ResponseUsage (1311-1347)

tests/unit/authorization/test_resolvers.py (2)
- src/models/config.py (2): AccessRule (381-385), Action (329-378)
- src/authorization/resolvers.py (7): GenericAccessResolver (146-202), check_access (123-124), check_access (134-138), check_access (171-188), get_actions (127-128), get_actions (140-143), get_actions (190-202)

tests/unit/test_openai_mapping.py (3)
- src/models/requests.py (2): CreateResponseRequest (418-512), QueryRequest (73-225)
- src/models/responses.py (7): QueryResponse (194-305), OpenAIResponse (1350-1544), ReferencedDocument (179-191), ResponseContent (1177-1224), ResponseMessage (1227-1271), ResponseOutput (1274-1308), ResponseUsage (1311-1347)
- src/utils/openai_mapping.py (2): map_openai_to_query_request (23-73), map_query_to_openai_response (76-154)

tests/unit/test_openai_response_models.py (1)
- src/models/responses.py (5): OpenAIResponse (1350-1544), ResponseOutput (1274-1308), ResponseMessage (1227-1271), ResponseContent (1177-1224), ResponseUsage (1311-1347)

tests/unit/test_openai_requests.py (1)
- src/models/requests.py (1): CreateResponseRequest (418-512)

src/app/endpoints/responses.py (8)
- src/authentication/__init__.py (1): get_auth_dependency (14-52)
- src/authorization/middleware.py (1): authorize (111-122)
- src/client.py (2): AsyncLlamaStackClientHolder (18-55), get_client (49-55)
- src/configuration.py (1): configuration (73-77)
- src/models/requests.py (1): CreateResponseRequest (418-512)
- src/models/responses.py (4): OpenAIResponse (1350-1544), ForbiddenResponse (1120-1142), UnauthorizedResponse (1094-1117), QueryResponse (194-305)
- src/utils/endpoints.py (1): check_configuration_loaded (111-123)
- src/utils/openai_mapping.py (2): map_openai_to_query_request (23-73), map_query_to_openai_response (76-154)

tests/unit/app/endpoints/test_responses.py (7)
- src/app/endpoints/responses.py (1): responses_endpoint_handler (75-192)
- src/models/config.py (2): config (140-146), Action (329-378)
- src/models/requests.py (1): CreateResponseRequest (418-512)
- src/models/responses.py (7): OpenAIResponse (1350-1544), QueryResponse (194-305), ReferencedDocument (179-191), ResponseContent (1177-1224), ResponseMessage (1227-1271), ResponseOutput (1274-1308), ResponseUsage (1311-1347)
- src/utils/types.py (1): TurnSummary (89-163)
- src/utils/token_counter.py (1): TokenCounter (18-41)
- src/client.py (1): get_client (49-55)

src/app/routers.py (1)
- tests/unit/app/test_routers.py (1): include_router (36-51)
🪛 GitHub Actions: Python linter
tests/unit/app/test_routers.py
[error] 43-43: Pylint: W0621: Redefining name 'responses' from outer scope (line 9) (redefined-outer-name).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-pr
- GitHub Check: e2e_tests (azure)
- GitHub Check: e2e_tests (ci)
🔇 Additional comments (10)
src/app/routers.py (1)
21-21: LGTM!

The import and registration of the `responses` router follows the existing patterns and is correctly integrated at the `/v1` prefix alongside other API endpoints.

Also applies to: 39-39
tests/unit/authorization/test_resolvers.py (1)
344-362: LGTM!

The test properly validates that the RESPONSES action integrates with the authorization system. It confirms access control, permission checks, and action enumeration work as expected.
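Conceptually, the check looks something like this (a sketch; the resolver and model names come from the code graph below, while the rule construction and exact signatures are assumptions):

```python
from authorization.resolvers import GenericAccessResolver
from models.config import AccessRule, Action

# Hypothetical rule wiring: a role that may call /v1/responses
rules = [AccessRule(role="user", actions=[Action.RESPONSES])]
resolver = GenericAccessResolver(rules)

assert resolver.check_access(Action.RESPONSES, {"user"})
assert Action.RESPONSES in resolver.get_actions({"user"})
```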
tests/unit/app/test_routers.py (1)
68-68: LGTM!

The router count and prefix assertions are correctly updated to account for the new `responses` router.

Also applies to: 77-77, 93-93, 102-102
src/models/config.py (1)
353-354: LGTM!

The RESPONSES action enum member is correctly added with appropriate documentation and value, following the established pattern for action definitions.
tests/unit/test_openai_requests.py (1)
1-140: LGTM!

The test suite comprehensively validates the CreateResponseRequest model, covering:
- Valid minimal and full configurations
- Required field validation
- Boundary conditions for temperature and max_output_tokens
- Extra field rejection (forbid mode)
- Input validation for both string and array types
- Model configuration structure
Test coverage appears complete for the new request model.
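A representative boundary-and-forbid case, sketched with pytest (the 0-2 temperature bound and the field names are assumptions, not confirmed values from the PR):

```python
import pytest
from pydantic import ValidationError

from models.requests import CreateResponseRequest


def test_out_of_range_temperature_is_rejected():
    """Values outside the assumed 0-2 range should fail validation."""
    with pytest.raises(ValidationError):
        CreateResponseRequest(model="gpt-4", input="Hello", temperature=5.0)


def test_extra_fields_are_rejected():
    """Unknown fields fail because the model is configured with extra='forbid'."""
    with pytest.raises(ValidationError):
        CreateResponseRequest(model="gpt-4", input="Hello", not_a_field=True)
```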
tests/unit/test_openai_response_models.py (1)
1-266: LGTM!

The test suite comprehensively validates all OpenAI response models, covering:
- Valid and invalid type/role/finish_reason/status/object values
- Empty array/string validation
- Required field validation
- Nested model validation
- Metadata handling
Test coverage appears complete for the new response models.
tests/unit/app/endpoints/test_responses.py (1)
140-370: LGTM!

The test suite comprehensively covers the /responses endpoint behavior:
- Successful response generation with proper OpenAI response structure
- Authorization enforcement (via signature inspection)
- API connection error handling with metrics
- Request and response mapping invocation
- Validation error handling (ValueError, AttributeError, TypeError)
The tests properly mock dependencies and verify error paths, status codes, and metric updates.
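The AsyncMock pattern called for by the test guidelines looks roughly like this in isolation (a generic sketch using pytest-asyncio, not the PR's actual tests):

```python
import pytest
from unittest.mock import AsyncMock

# Shared auth tuple from the project's test guidelines
MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token")


@pytest.mark.asyncio
async def test_failing_llm_call_raises():
    """An AsyncMock with side_effect raises when awaited, driving the error paths."""
    failing_llm_call = AsyncMock(side_effect=ConnectionError("llama stack unreachable"))
    with pytest.raises(ConnectionError):
        await failing_llm_call()
    failing_llm_call.assert_awaited_once()
```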
src/utils/openai_mapping.py (2)
76-154: LGTM!

The mapping function correctly converts internal QueryResponse to OpenAI format:
- Generates unique IDs and timestamps appropriately
- Maps token usage correctly with calculated totals
- Preserves referenced documents in metadata
- Creates proper nested response structure with content, message, and output
The implementation follows the OpenAI specification and handles the MVP requirements well.
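In outline, that conversion has this shape (a sketch assembled from the model names in the code graph; the constructor fields and attribute names shown are assumptions, not the PR's exact code):

```python
import time
import uuid

from models.responses import OpenAIResponse, QueryResponse, ResponseUsage


def sketch_map_query_to_openai_response(query_response: QueryResponse) -> OpenAIResponse:
    """Illustrative QueryResponse -> OpenAIResponse conversion."""
    prompt_tokens = query_response.input_tokens      # assumed attribute names
    completion_tokens = query_response.output_tokens
    usage = ResponseUsage(
        prompt_tokens=prompt_tokens,
        completion_tokens=completion_tokens,
        total_tokens=prompt_tokens + completion_tokens,  # calculated total
    )
    return OpenAIResponse(
        id=f"resp_{uuid.uuid4().hex}",  # unique per response
        created_at=int(time.time()),    # Unix timestamp
        usage=usage,
        metadata={  # referenced documents preserved for RAG consumers
            "referenced_documents": [
                doc.model_dump() for doc in query_response.referenced_documents
            ]
        },
        # ...nested output/message/content structure omitted for brevity
    )
```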
48-49: MVP limitation properly documented; external ticket verification required.

The code change at lines 48-49 correctly implements the MVP limitation for array input with clear error messaging. The codebase contains comments indicating Phase 2 enhancements are planned (including array input support and explicit model mapping). However, LCORE-901 is not referenced anywhere in the repository—it appears to be tracked in an external or internal ticket system outside the codebase. Verify that Phase 2 plans for array input support are properly tracked in your internal issue management system.
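The guard described here is presumably a small early check, along these lines (illustrative; the exact message text is assumed):

```python
# Inside map_openai_to_query_request -- illustrative, not the exact PR code
if isinstance(openai_request.input, list):
    raise ValueError(
        "Array input is not supported in the MVP; pass a plain string. "
        "Array support is planned for Phase 2."
    )
```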
src/app/endpoints/responses.py (1)
31-31: Use module-qualified logger name.

The project's Python guidelines call for `logger = logging.getLogger(__name__)` so log records inherit the module path correctly. Please switch to the `__name__`-based logger. As per coding guidelines.

⛔ Skipped due to learnings

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to **/*.py : Use `logger = logging.getLogger(__name__)` for module logging after `import logging`

Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-09-18T16:46:33.353Z
Learning: Applies to **/*.py : Use appropriate logging levels: debug, info, warning, error with clear messages

Learnt from: luis5tb
Repo: lightspeed-core/lightspeed-stack PR: 727
File: src/app/endpoints/a2a.py:43-43
Timestamp: 2025-10-29T13:05:22.438Z
Learning: In the lightspeed-stack repository, endpoint files in src/app/endpoints/ intentionally use a shared logger name "app.endpoints.handlers" rather than __name__, allowing unified logging configuration across all endpoint handlers (query.py, streaming_query.py, a2a.py).
```python
    authorized,
    metrics,
    tools,
    responses,
```
Fix the naming collision between imported module and parameter name.
The Pylint error indicates that the `responses` parameter on line 43 shadows the imported `responses` module from line 25. While this is in a mock class and may not cause runtime issues, it violates best practices and triggers linting failures.
Apply this diff to rename the parameter:
```diff
 def include_router(  # pylint: disable=too-many-arguments
     self,
     router: Any,
     *,
     prefix: str = "",
     tags=None,
     dependencies=None,
-    responses=None,
+    response_models=None,
     deprecated=None,
     include_in_schema=None,
     default_response_class=None,
     callbacks=None,
     generate_unique_id_function=None,
 ) -> None:
```

Also applies to: 43-43
🤖 Prompt for AI Agents
In tests/unit/app/test_routers.py around lines 25 and 43, the imported module
name responses (line 25) is being shadowed by a parameter named responses in the
mock class at line 43; rename the parameter (e.g., to resp or mock_responses)
and update all references to that parameter inside the class (constructor and
any methods) so they refer to the new name, leaving the module import unchanged;
run the tests/linter to confirm the shadowing warning is resolved.
```python
# For MVP simplicity, use default model/provider selection logic from query.py
# This will be enhanced in Phase 2 to support explicit model mapping
summary, conversation_id, referenced_documents, token_usage = (
    await retrieve_response(
```
note this is not using the responses API from llamastack
Wouldn't this PR require moving Llama Stack to 0.3.x to use the new Llama Stack Responses API (https://llamastack.github.io/docs/api/agents)? The previous Llama Stack Agent APIs are deprecated (https://llamastack.github.io/docs/api-deprecated/agents). That is, do we need an explicit LCORE /responses endpoint if we switch to Llama Stack 0.3.x?
Depends on what level of completeness LCORE is willing to live with, @maysunfaisal. The responses API was introduced a few months ago in 0.2.x but was labeled work in progress, with some known bugs and missing pieces.
So there could be some staging in play.
But the end goal should be a roadmap that, after some intermediate stages, has LCORE leveraging the Llama Stack OpenAI-API-compatible endpoint rather than the deprecated agent APIs or some other responses API endpoint.
Description
Implements /v1/responses endpoint providing OpenAI-compatible API interface
while leveraging existing Lightspeed RAG and LLM integration.
🤖 Generated with Claude Code
Co-Authored-By: Claude [email protected]
Type of change
Related Tickets & Documents
https://issues.redhat.com/browse/LCORE-901
Checklist before requesting a review
Testing
Summary by CodeRabbit
Release Notes
New Features
Improvements