AshutoshRavindraIwale

Summary

This PR adds a guard in the Chroma vectorstore implementation to ensure that the hnsw:num_threads metadata parameter does not exceed the number of available CPU cores when loading a persisted collection.

Context

Fixes a bug where moving a Chroma collection from a server with N CPU cores to one with fewer cores (M < N) could cause a ValueError if hnsw:num_threads was set higher than supported on the new host. See: langchain-ai/langchain#32678

Changes

  • In langchain_community/vectorstores/chroma.py, hnsw:num_threads in the collection metadata is capped at os.cpu_count() during init whenever the stored value exceeds the number of available cores.
  • Adds/updates tests in tests/unit_tests/vectorstores/test_chroma.py:
    • Verifies capping of excessive thread settings.
    • Ensures valid/low thread counts remain untouched.
    • Checks handling for missing/None collection metadata.
    • Confirms backward compatibility with ChromaDB integrations.
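
The capping behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name cap_hnsw_threads, the available_cores parameter, and the fallback to 1 when os.cpu_count() returns None are all assumptions made for the example.

```python
import os
from typing import Any, Dict, Optional

HNSW_THREADS_KEY = "hnsw:num_threads"


def cap_hnsw_threads(
    metadata: Optional[Dict[str, Any]],
    available_cores: Optional[int] = None,
) -> Optional[Dict[str, Any]]:
    """Return metadata with hnsw:num_threads capped at the core count.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Missing or empty metadata is passed through untouched.
    if not metadata:
        return metadata

    requested = metadata.get(HNSW_THREADS_KEY)
    # Only act on real integer settings (bool is a subclass of int, so skip it).
    if not isinstance(requested, int) or isinstance(requested, bool):
        return metadata

    # os.cpu_count() can return None; falling back to 1 is an assumption.
    cores = available_cores if available_cores is not None else (os.cpu_count() or 1)

    # Valid or low thread counts remain untouched.
    if requested <= cores:
        return metadata

    # Copy before changing so the caller's dict is not mutated.
    capped = dict(metadata)
    capped[HNSW_THREADS_KEY] = cores
    return capped


if __name__ == "__main__":
    # A collection persisted on a 16-core host, loaded on a 4-core host.
    print(cap_hnsw_threads({HNSW_THREADS_KEY: 16}, available_cores=4))
```

Passing available_cores explicitly keeps the sketch testable without patching os.cpu_count(); the actual tests in the PR may take a different approach.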

Testing

  • All tests, including new edge cases, pass locally.
  • Tested against both a real and a mocked ChromaDB client to confirm backward compatibility.

Closes langchain-ai/langchain#32678

Development

Successfully merging this pull request may close these issues.

Error when loading langchain.vectorstores.Chroma on different server with different num of cores