chroma-droid

This PR cherry-picks the commit a5d9241 onto rc/2025-08-29. If there are unresolved conflicts, please resolve them manually.

## Description of changes

This PR will make the push_logs call initialize the log. Existing logs
are unaffected. If a log doesn't exist, a not_found error will be
returned or the response will be empty, as appropriate.
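
The behavior described above can be sketched with a small in-memory model. This is an illustration only, not the rust log service's implementation: `LogStore`, `LogError`, `read_strict`, and `read_lenient` are hypothetical names, and which real call returns not_found versus an empty result is not stated in this PR, so the split below is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical error type; the real service reports gRPC statuses instead.
#[derive(Debug, PartialEq)]
enum LogError {
    NotFound(String),
}

/// Hypothetical in-memory stand-in for the log service, keyed by collection ID.
#[derive(Default)]
struct LogStore {
    logs: HashMap<String, Vec<String>>,
}

impl LogStore {
    /// Pushing creates the log on first use; an existing log is only appended to.
    /// Returns the number of records in the log after the push.
    fn push_logs(&mut self, collection_id: &str, records: Vec<String>) -> usize {
        let log = self.logs.entry(collection_id.to_string()).or_default();
        log.extend(records);
        log.len()
    }

    /// Assumed "strict" read: a missing log is reported as not_found.
    fn read_strict(&self, collection_id: &str) -> Result<usize, LogError> {
        self.logs
            .get(collection_id)
            .map(Vec::len)
            .ok_or_else(|| LogError::NotFound(format!("collection {collection_id} not found")))
    }

    /// Assumed "lenient" read: a missing log yields an empty result.
    fn read_lenient(&self, collection_id: &str) -> Vec<String> {
        self.logs.get(collection_id).cloned().unwrap_or_default()
    }
}
```

In this model, only `push_logs` ever creates a log, so reads never initialize state as a side effect.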

This deletes the paths that push/pull logs in the go log service.

## Test plan

CI

## Migration plan

This is part of the migration to the rust log service.

We need to plan for how to roll it out such that the rust log service
rolls first.

## Observability plan

Watch staging.

## Documentation Changes

N/A

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

Deprecate Go Log Service and Remove Legacy LogService Paths

This PR fully removes support for the legacy Go-based log service ('logservice') from the Chroma codebase, completing the transition to the Rust-based log service. All gRPC endpoints and test paths in the Go log server related to log mutation (PushLogs, PullLogs, ScoutLogs, etc.) now return explicit errors instructing clients that these APIs are no longer available and have migrated to Rust. Python distributed failover and migration tests that relied on the old Go log service are deleted. Configuration and deployment (Tiltfile, config YAMLs) are updated to point exclusively to the Rust log service, and any alternate or fallback host logic is eliminated. Property-based tests, integration tests, and utility code are refactored to align with the new Rust-only log protocol, ensuring the system no longer attempts to communicate with the deprecated Go log service.

Key Changes

• All log mutation-related gRPC endpoints in go/pkg/log/server now immediately return errors indicating deprecation (PushLogs, PullLogs, ScoutLogs, ForkLogs, etc.)
• The Go log service implementation no longer serializes, deserializes, or operates on records; all push/pull code is removed
• Removal of the Python distributed test test_log_failover.py (and associated property tests) that depended on failover or migration logic through the legacy logservice
• Updated configuration files (rust/worker/tilt_config.yaml, rust/frontend/sample_configs/tilt_config.yaml) to reference only rust-log-service; removed alternate host and threshold fields everywhere
• Tiltfile and deployment pipeline refactoring to eliminate logservice as a runtime dependency (including k8s resources)
• Rust-side test, source, and integration points are updated to expect rust-log-service exclusively, including error handling for Go log deprecation messages
• Property test regression seeds are updated to reflect new code paths

Affected Areas

• Go log service server implementation (go/pkg/log/server)
• Python distributed and property tests (chromadb/test/distributed, chromadb/test/property)
• Tilt deployment and YAML configuration (Tiltfile, rust/worker/tilt_config.yaml, rust/frontend/sample_configs/tilt_config.yaml)
• Rust integration tests and libraries relating to log operations
• Proptest regression seed files (rust/log-service/proptest-regressions/lib.txt)

This summary was automatically generated by @propel-code-bot

blacksmith-sh bot commented Aug 29, 2025

4 Jobs Failed:

PR checks / all-required-pr-checks-passed

Step "Decide whether the needed jobs succeeded or failed" from job "all-required-pr-checks-passed" is failing. The last 20 log lines are:

[...]
}
EOM
)"
shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
env:
  GITHUB_REPO_NAME: chroma-core/chroma
  PYTHONPATH: /home/runner/_work/_actions/re-actors/alls-green/release/v1/src
# ❌ Some of the required to succeed jobs failed 😢😢😢

📝 Job statuses:
📝 python-tests → ❌ failure [required to succeed or be skipped]
📝 python-vulnerability-scan → ✓ success [required to succeed or be skipped]
📝 javascript-client-tests → ✓ success [required to succeed or be skipped]
📝 rust-tests → ✓ success [required to succeed or be skipped]
📝 go-tests → ✓ success [required to succeed or be skipped]
📝 lint → ✓ success [required to succeed]
📝 check-helm-version-bump → ⬜ skipped [required to succeed or be skipped]
📝 delete-helm-comment → ✓ success [required to succeed or be skipped]
Error: Process completed with exit code 1.

PR checks / Python tests / test-cluster-rust-frontend (3.9, chromadb/test/property/test_add.py)

Step "Test" from job "Python tests / test-cluster-rust-frontend (3.9, chromadb/test/property/test_add.py)" is failing. The last 20 log lines are:

[...]
INTERNALERROR>     teardown.throw(exception)
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/_pytest/logging.py", line 801, in pytest_runtestloop
INTERNALERROR>     return (yield)  # Run all the tests.
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/pluggy/_callers.py", line 139, in _multicall
INTERNALERROR>     teardown.throw(exception)
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/_pytest/terminal.py", line 688, in pytest_runtestloop
INTERNALERROR>     result = yield
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/pluggy/_callers.py", line 121, in _multicall
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/xdist/dsession.py", line 138, in pytest_runtestloop
INTERNALERROR>     self.loop_once()
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/xdist/dsession.py", line 163, in loop_once
INTERNALERROR>     call(**kwargs)
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/xdist/dsession.py", line 217, in worker_workerfinished
INTERNALERROR>     assert not crashitem, (crashitem, node)
INTERNALERROR> AssertionError: ('chromadb/test/property/test_add.py::test_out_of_order_ids[basic_http_client]', <WorkerController gw7>)
INTERNALERROR> assert not 'chromadb/test/property/test_add.py::test_out_of_order_ids[basic_http_client]'

============================== 1 warning in 8.02s ==============================
Error: Process completed with exit code 3.

PR checks / Python tests / test-rust-bindings (3.9, chromadb/test/property --ignore-glob chromadb/test/property/test_cross_v...

Step "Test" from job "Python tests / test-rust-bindings (3.9, chromadb/test/property --ignore-glob chromadb/test/property/test_cross_v..." is failing. The last 20 log lines are:

[...]
INTERNALERROR>     teardown.throw(exception)
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/_pytest/logging.py", line 801, in pytest_runtestloop
INTERNALERROR>     return (yield)  # Run all the tests.
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/pluggy/_callers.py", line 139, in _multicall
INTERNALERROR>     teardown.throw(exception)
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/_pytest/terminal.py", line 688, in pytest_runtestloop
INTERNALERROR>     result = yield
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/pluggy/_callers.py", line 121, in _multicall
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/xdist/dsession.py", line 138, in pytest_runtestloop
INTERNALERROR>     self.loop_once()
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/xdist/dsession.py", line 163, in loop_once
INTERNALERROR>     call(**kwargs)
INTERNALERROR>   File "/opt/hostedtoolcache/Python/3.9.19/x64/lib/python3.9/site-packages/xdist/dsession.py", line 217, in worker_workerfinished
INTERNALERROR>     assert not crashitem, (crashitem, node)
INTERNALERROR> AssertionError: ('chromadb/test/property/test_fork.py::test_fork[rust_sqlite_ephemeral]', <WorkerController gw6>)
INTERNALERROR> assert not 'chromadb/test/property/test_fork.py::test_fork[rust_sqlite_ephemeral]'

================ 62 passed, 2 xpassed, 2899 warnings in 40.27s =================
Error: Process completed with exit code 3.

1 job failed running on non-Blacksmith runners.


Summary: 1 successful workflow, 1 failed workflow

Last updated: 2025-08-29 18:00:39 UTC

Comment on lines +616 to +617
return Err(Status::not_found(format!(
    "collection {collection_id} not found"

[BestPractice]

The error handling strategy has changed from forwarding to proxy services to returning NOT_FOUND errors. This is a significant behavioral change that could break existing clients expecting the previous forwarding behavior. Consider:

  1. Documenting this breaking change in migration guides
  2. Adding proper error codes that clearly indicate the service migration
  3. Ensuring all callers can handle NOT_FOUND appropriately

return Err(Status::not_found(format!(
    "collection {collection_id} not found (migrated from legacy log service)"
)));

Comment on lines +18 to +21
let fragments = match reader.scan(offset, Limits::UNLIMITED).await {
    Ok(fragments) => fragments,
    Err(Error::UninitializedLog) => vec![],
    Err(err) => return Err(err),

[BestPractice]

The copy function now silently handles UninitializedLog errors by returning an empty fragments vector. This could mask legitimate errors. Consider logging this condition:

let fragments = match reader.scan(offset, Limits::UNLIMITED).await {
    Ok(fragments) => fragments,
    Err(Error::UninitializedLog) => {
        tracing::debug!("Source log is uninitialized, copying empty log");
        vec![]
    },
    Err(err) => return Err(err),
};
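
The pattern under discussion can also be isolated into a small, testable helper. The following is a minimal sketch, not code from this PR: `ScanError` and `fragments_or_empty` are hypothetical stand-ins, and fragments are modeled as plain `u64` values. It returns a flag alongside the fragments so the caller can decide whether to log that the source was uninitialized.

```rust
/// Hypothetical stand-in for the reader's error type.
#[derive(Debug, PartialEq)]
enum ScanError {
    UninitializedLog,
    Other(String),
}

/// Treats an uninitialized source as an empty log and passes every other
/// error through unchanged. The bool is true only when the empty result
/// came from the uninitialized case, so the caller can log that condition.
fn fragments_or_empty(
    scan: Result<Vec<u64>, ScanError>,
) -> Result<(Vec<u64>, bool), ScanError> {
    match scan {
        Ok(fragments) => Ok((fragments, false)),
        Err(ScanError::UninitializedLog) => Ok((Vec::new(), true)),
        Err(err) => Err(err),
    }
}
```

Surfacing the flag keeps the "empty because uninitialized" case distinguishable from "empty because nothing was written", which is the concern raised in the comment.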

@chroma-droid chroma-droid deleted the branch rc/2025-08-29 September 3, 2025 00:01
@rescrv rescrv deleted the hotfix-5369/rc/2025-08-29 branch September 3, 2025 20:00