Skip to content

Conversation

@luis5tb
Copy link
Contributor

@luis5tb luis5tb commented Oct 15, 2025

Description

This PR updates the codebase to use the new vector_stores API instead of the deprecated vector_dbs API, which was removed in llama-stack PR #3774.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

Summary by CodeRabbit

  • Refactor

    • Backend now consistently uses "vector store" identifiers instead of "vector database" identifiers in query and streaming flows.
    • RAG toolgroup selection and construction updated to accept and pass vector store IDs.
    • Conditional handling preserved so toolgroups remain disabled when no IDs are available; end-user behavior unchanged.
  • Tests

    • Unit tests updated to reflect the new vector store data structures and access patterns.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 15, 2025

Walkthrough

Replaces vector DB listing with vector store listing in query endpoints; updates get_rag_toolgroups to accept vector_store_ids, and adapts streaming and non-streaming flows and tests to derive IDs from client.vector_stores.list(). Control flow and toolgroup semantics remain the same.

Changes

Cohort / File(s) Summary
RAG toolgroup parameter migration
src/app/endpoints/query.py
Signature and internals updated: vector_db_idsvector_store_ids; docstring, conditionals, and ToolgroupAgentToolGroupWithArgs usage now pass vector_store_ids.
Streaming endpoint alignment
src/app/endpoints/streaming_query.py
Replaced vector_dbs.list() usage with vector_stores.list(); compute vector_store_ids from response, pass to get_rag_toolgroups, preserve empty-toolgroup → None behavior.
Unit test updates
tests/unit/app/endpoints/test_query.py, tests/unit/app/endpoints/test_streaming_query.py
Tests renamed local vars to vector_store_ids; adjusted mocks to return a .data list from vector_stores.list() and use mock_vector_store.id accessors; assertions updated to expect vector_store_ids in toolgroup args.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client as HTTP Client
  participant Endpoint as query / streaming_query
  participant SDK as API Client
  participant RAG as get_rag_toolgroups
  participant Agent as Toolgroup Agent

  Client->>Endpoint: Request (query / streaming)
  Endpoint->>SDK: call vector_stores.list()
  note right of SDK #D6EAF8: returns wrapper with .data list
  SDK-->>Endpoint: list response (data -> vector_store objects)
  Endpoint->>Endpoint: extract `vector_store_ids` from .data (ids)
  Endpoint->>RAG: get_rag_toolgroups(vector_store_ids)
  RAG-->>Endpoint: Toolgroups or None
  Endpoint->>Agent: Build/run agent with toolgroups
  Agent-->>Client: Response / stream
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Check all occurrences of vector_db_ids were renamed to vector_store_ids.
  • Verify client.vector_stores.list() usage and that tests mock the .data wrapper consistently.
  • Confirm Toolgroup argument types and docstrings align with the new parameter name.

Poem

I hopped from DBs to store-lined trails,
Fresh IDs jingling in my tail,
I nudged a name, kept flow the same,
Tiny hop, a clearer frame.
Cheers from a coding rabbit 🥕🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The PR title "fix: Migrate from deprecated vector_dbs to vector_stores API" directly corresponds to the primary change across all modified files in the changeset. The title accurately summarizes the main objective: replacing the deprecated vector_dbs API with the new vector_stores API throughout the codebase. The changes in query.py, streaming_query.py, and their corresponding tests all implement this same migration, with updates to API calls, parameter names (vector_db_ids to vector_store_ids), and test mocks. The title is clear, concise, specific, and would effectively communicate the purpose of this changeset to teammates reviewing the git history.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a5f1f16 and 7fd8a8f.

📒 Files selected for processing (4)
  • src/app/endpoints/query.py (2 hunks)
  • src/app/endpoints/streaming_query.py (1 hunks)
  • tests/unit/app/endpoints/test_query.py (1 hunks)
  • tests/unit/app/endpoints/test_streaming_query.py (12 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/app/endpoints/streaming_query.py
🧰 Additional context used
📓 Path-based instructions (7)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • tests/unit/app/endpoints/test_streaming_query.py
  • src/app/endpoints/query.py
  • tests/unit/app/endpoints/test_query.py
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests
Do not use unittest in tests; pytest is the standard

Files:

  • tests/unit/app/endpoints/test_streaming_query.py
  • tests/unit/app/endpoints/test_query.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest-mock to create AsyncMock objects for async interactions in tests
Use the shared auth mock constant: MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") in tests

Files:

  • tests/unit/app/endpoints/test_streaming_query.py
  • tests/unit/app/endpoints/test_query.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/app/endpoints/query.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/query.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/query.py
🧬 Code graph analysis (1)
tests/unit/app/endpoints/test_query.py (1)
src/app/endpoints/query.py (1)
  • get_rag_toolgroups (784-811)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: e2e_tests (ci)
  • GitHub Check: e2e_tests (azure)
🔇 Additional comments (4)
src/app/endpoints/query.py (2)

686-689: API migration looks correct.

The migration from vector_dbs.list() to vector_stores.list().data properly adapts to the new API structure where results are wrapped in a response object with a .data attribute, and vector store objects use .id instead of .identifier.


784-810: Function migration is complete and consistent.

The parameter has been properly renamed from vector_db_ids to vector_store_ids throughout the function signature, docstring, and implementation. The args dictionary key at line 805 now uses "vector_store_ids", which should align with the updated Llama Stack builtin RAG tool expectations after the deprecation in llama-stack PR #3774.

tests/unit/app/endpoints/test_query.py (1)

1387-1398: Test updated correctly to match implementation.

The test variable and assertion have been properly updated from vector_db_ids to vector_store_ids, addressing the concern raised in the previous review comment. The assertion on line 1398 now correctly expects result[0]["args"]["vector_store_ids"].

tests/unit/app/endpoints/test_streaming_query.py (1)

443-447: Excellent migration - all test mocks properly updated.

All tests in this file correctly mock the new vector_stores API structure:

  • Using mock_vector_store.id instead of mock_vector_db.identifier
  • Wrapping the list in mock_list_response.data
  • Calling mock_client.vector_stores.list() instead of vector_dbs

This demonstrates the correct pattern that should be followed in test_query.py for consistency.

Also applies to: 488-490, 544-546, 601-603, 660-662, 713-715, 767-769, 1174-1176, 1255-1257, 1319-1321, 1571-1575, 1623-1627


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@luis5tb luis5tb force-pushed the vector_stores branch 2 times, most recently from 93b4b1b to a5f1f16 Compare October 27, 2025 10:49
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52c0f6d and a5f1f16.

📒 Files selected for processing (2)
  • src/app/endpoints/query.py (2 hunks)
  • src/app/endpoints/streaming_query.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/app/endpoints/streaming_query.py
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use absolute imports for internal modules (e.g., from auth import get_auth_dependency)

Files:

  • src/app/endpoints/query.py
src/app/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use standard FastAPI imports (from fastapi import APIRouter, HTTPException, Request, status, Depends) in FastAPI app code

Files:

  • src/app/endpoints/query.py
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: All modules start with descriptive module-level docstrings explaining purpose
Use logger = logging.getLogger(name) for module logging after import logging
Define type aliases at module level for clarity
All functions require docstrings with brief descriptions
Provide complete type annotations for all function parameters and return types
Use typing_extensions.Self in model validators where appropriate
Use modern union syntax (str | int) and Optional[T] or T | None consistently
Function names use snake_case with descriptive, action-oriented prefixes (get_, validate_, check_)
Avoid in-place parameter modification; return new data structures instead of mutating arguments
Use appropriate logging levels: debug, info, warning, error with clear messages
All classes require descriptive docstrings explaining purpose
Class names use PascalCase with conventional suffixes (Configuration, Error/Exception, Resolver, Interface)
Abstract base classes should use abc.ABC and @AbstractMethod for interfaces
Provide complete type annotations for all class attributes
Follow Google Python docstring style for modules, classes, and functions, including Args, Returns, Raises, Attributes sections as needed

Files:

  • src/app/endpoints/query.py
src/{app/**/*.py,client.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Use async def for I/O-bound operations and external API calls

Files:

  • src/app/endpoints/query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

In API endpoints, raise FastAPI HTTPException with appropriate status codes for error handling

Files:

  • src/app/endpoints/query.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: e2e_tests (ci)
  • GitHub Check: e2e_tests (azure)
🔇 Additional comments (2)
src/app/endpoints/query.py (2)

686-688: LGTM! API migration is correct.

The migration from client.vector_dbs.list() to client.vector_stores.list() correctly implements the API update. The list comprehension properly extracts IDs from the response data.


689-689: LGTM! Function call updated correctly.

The call to get_rag_toolgroups now correctly passes vector_store_ids, maintaining consistency with the API migration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant