[ENH]: add retries to Rust client #5641

codetheweb · 2025-10-18T02:45:20Z

Description of changes

Adds retries to the Rust client for all GET requests and for any request that receives a 429 response.

Test plan

How are these changes tested?

Tests pass locally with pytest for Python, yarn test for JS, and cargo test for Rust.

Migration plan

Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?

Observability plan

What is the plan to instrument and monitor this change?

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

codetheweb · 2025-10-18T02:45:38Z

[ENH]: Rust client misc cleanup #5642
[ENH]: add retries to Rust client #5641 👈 (View in Graphite)
[ENH]: add OpenTelemetry metrics to Rust client #5640
[ENH]: add list_collections to Rust client #5639
main

This stack of pull requests is managed by Graphite.

github-actions · 2025-10-18T02:45:55Z

propel-code-bot · 2025-10-18T02:51:50Z

Introduce Retry Logic for Rust Client GET Requests and 429s

This pull request adds configurable retry logic to the Rust ChromaClient. It implements exponential-backoff retries for all GET requests and for any HTTP request that receives a 429 Too Many Requests response. The retry policy is user-adjustable via the new ChromaRetryOptions field in ChromaClientOptions. Non-GET requests are retried only on 429 responses, not on 5xx or other errors, so writes that may already have been applied are not sent twice. Tests using httpmock are included to verify both positive and negative retry scenarios.
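To make the backoff behaviour concrete, here is a minimal, std-only sketch of an exponential delay schedule. It does not use the backon crate or reflect its defaults; `backoff_delays` and its parameters (`min_delay`, `factor`, `max_delay`, `max_retries`) are hypothetical names chosen for illustration.

```rust
use std::time::Duration;

/// Hypothetical helper: the delay to wait before each retry.
/// Delay i is `min_delay * factor^i`, capped at `max_delay`.
fn backoff_delays(
    min_delay: Duration,
    factor: u32,
    max_delay: Duration,
    max_retries: usize,
) -> Vec<Duration> {
    let mut delays = Vec::with_capacity(max_retries);
    let mut next = min_delay;
    for _ in 0..max_retries {
        delays.push(next.min(max_delay));
        // saturating_mul avoids overflow when the retry count is large.
        next = next.saturating_mul(factor);
    }
    delays
}

fn main() {
    let delays = backoff_delays(
        Duration::from_millis(100),
        2,
        Duration::from_secs(1),
        5,
    );
    for (i, delay) in delays.iter().enumerate() {
        println!("retry {} after {:?}", i + 1, delay);
    }
}
```

Production policies usually add random jitter to each delay so that many clients hitting the same rate limit do not retry in lockstep; the sketch leaves that out for clarity.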

Key Changes

• Added ChromaRetryOptions to ChromaClientOptions, enabling user configuration of retry behavior
• Integrated backon crate's ExponentialBuilder for exponential backoff policies in chroma_client.rs
• Updated ChromaClient to retry all GET requests and non-GET requests on 429 responses, with detailed retry notification via tracing
• Wired up the retry metrics increment in metrics.rs and extended metrics to track retries
• Added explicit tests to ensure that (a) GET requests are retried on errors, (b) non-GET requests are retried on 429s, and (c) the retry count and final result are correct
• Updated Cargo.toml and Cargo.lock with httpmock and related dependencies; minor tokio and smallvec upgrades

Affected Areas

• rust/chroma/src/client/chroma_client.rs
• rust/chroma/src/client/options.rs
• rust/chroma/src/client/metrics.rs
• rust/chroma/Cargo.toml
• Cargo.lock

This summary was automatically generated by @propel-code-bot

rust/chroma/src/client/options.rs

propel-code-bot · 2025-10-18T02:55:22Z

rust/chroma/src/client/chroma_client.rs

 }

 #[cfg(test)]
 mod tests {


[TestCoverage]

This is a great addition! The retry logic looks solid and the tests for the success cases are well-written.

To make the test suite even more robust, consider adding a test case to verify that non-idempotent methods (like POST) are not retried on server errors (like 500), as per your retry logic. This ensures the retry mechanism isn't overly aggressive and follows HTTP idempotency best practices.

The suggested test follows common retry library testing patterns seen in crates like tokio-retry and backoff, which emphasize testing both positive and negative retry scenarios. This is particularly important for HTTP clients where retry behavior should respect HTTP method semantics:

#[tokio::test]
#[test_log::test]
async fn test_does_not_retry_non_get_on_500() {
    // Test implementation...
    assert_eq!(mock.calls(), 1); // Ensures only one attempt, no retries
}
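The same negative case can be sketched without a network, as below. This is a std-only illustration, not the crate's test: `Method`, `should_retry`, and `send_with_retry` are hypothetical stand-ins, the real suite uses httpmock, and the predicate simply mirrors the rule stated in the PR (retry on 429, or on any failed GET). Backoff sleeps are omitted.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Method {
    Get,
    Post,
}

/// Mirrors the rule described in the PR: retry on 429, or on any failed GET.
fn should_retry(method: Method, status: u16) -> bool {
    status == 429 || method == Method::Get
}

/// Calls `server` until it returns a 2xx status, the predicate says stop,
/// or `max_retries` is used up. Returns (final status, number of calls).
fn send_with_retry(
    method: Method,
    max_retries: u32,
    mut server: impl FnMut() -> u16,
) -> (u16, u32) {
    let mut calls = 0;
    loop {
        let status = server();
        calls += 1;
        let success = (200..300).contains(&status);
        if success || calls > max_retries || !should_retry(method, status) {
            return (status, calls);
        }
    }
}

fn main() {
    // POST that always fails with 500: exactly one attempt, no retries.
    let (status, calls) = send_with_retry(Method::Post, 3, || 500);
    assert_eq!((status, calls), (500, 1));

    // GET that always fails with 500: first attempt plus three retries.
    let (_, calls) = send_with_retry(Method::Get, 3, || 500);
    assert_eq!(calls, 4);

    // POST that is rate limited once, then succeeds.
    let mut responses = vec![429, 200].into_iter();
    let (status, calls) = send_with_retry(Method::Post, 3, || responses.next().unwrap());
    assert_eq!((status, calls), (200, 2));

    println!("all retry scenarios behaved as expected");
}
```

Counting calls on the fake server plays the role of `mock.calls()` in the suggestion: the assertion on the call count is what distinguishes "failed once and gave up" from "failed after exhausting retries".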

rescrv · 2025-10-18T14:36:22Z

rust/chroma/src/client/chroma_client.rs

+        };
+
+        let response = attempt
+            .retry(&self.retry_policy)


Can the caller opt out of this? It's dangerous and incorrect for a client to retry non-reads.

Sorry, do you mean a user, or something inside the crate calling send? Users can set max_retries to 0. However, this only retries GET requests and 429s, both of which are always safe to retry in our current system.
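As a hedged illustration of that opt-out, the sketch below shows how a retry loop typically treats `max_retries = 0` as "one attempt, no retries". Only the name `max_retries` comes from the discussion; `RetryOptions` and `run_with_retries` are hypothetical, and the real ChromaRetryOptions may have other fields and semantics.

```rust
/// Hypothetical stand-in for the retry options; only `max_retries`
/// is named in the discussion, the rest of the real struct is not shown.
#[derive(Clone, Copy, Debug)]
struct RetryOptions {
    max_retries: u32,
}

/// Runs `op` once, then up to `max_retries` more times while it fails.
/// Returns the last result and how many times `op` was called.
fn run_with_retries<T, E>(
    opts: RetryOptions,
    mut op: impl FnMut() -> Result<T, E>,
) -> (Result<T, E>, u32) {
    let mut calls = 0;
    loop {
        let result = op();
        calls += 1;
        if result.is_ok() || calls > opts.max_retries {
            return (result, calls);
        }
    }
}

fn main() {
    let always_fail = || -> Result<(), &'static str> { Err("connection reset") };

    // max_retries = 0 disables retries: the operation runs exactly once.
    let (res, calls) = run_with_retries(RetryOptions { max_retries: 0 }, always_fail);
    assert!(res.is_err());
    assert_eq!(calls, 1);

    // max_retries = 2 allows the first attempt plus two retries.
    let (_, calls) = run_with_retries(RetryOptions { max_retries: 2 }, always_fail);
    assert_eq!(calls, 3);

    println!("ok");
}
```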

graphite-app · 2025-10-19T18:04:44Z

Merge activity

Oct 19, 6:04 PM UTC: Graphite rebased this pull request, because this pull request is set to merge when ready.
Oct 19, 6:31 PM UTC: @codetheweb merged this pull request with Graphite.

propel-code-bot · 2025-10-19T18:07:16Z

rust/chroma/src/client/chroma_client.rs

+            .when(|err| {
+                err.status()
+                    .map(|status| status == StatusCode::TOO_MANY_REQUESTS)
+                    .unwrap_or_default()
+                    || method == Method::GET
+            })


[BestPractice]

The current retry logic for GET requests is quite broad. It will retry on any error, including 4xx client errors like 404 Not Found or 401 Unauthorized, which are typically not transient and are unlikely to succeed on a retry.

To make the client more robust, it would be better to limit retries for GET requests to:

• Network errors (connection failures, timeouts)
• 5xx server errors (500, 502, 503, 504), which indicate transient server issues
• Specific 4xx codes that may be transient: 408 (Request Timeout), 429 (Too Many Requests)

This avoids wasting time and resources on retries for unrecoverable client-side errors like 401 (Unauthorized), 403 (Forbidden), 404 (Not Found), etc.

Here's a suggested change to make the retry condition more specific:
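The snippet below is only a rough, std-only sketch of the narrower condition described in this comment, not the bot's actual suggestion. `Failure`, `is_transient`, and `should_retry` are hypothetical names; the real closure would inspect the client's own error type and `StatusCode` instead.

```rust
/// Hypothetical outcome of a failed request; the real client's error type differs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Failure {
    /// No HTTP response at all (connection failure, timeout).
    Network,
    /// The server answered with this non-success status code.
    Status(u16),
}

/// True for the failures the comment lists as plausibly transient.
fn is_transient(failure: Failure) -> bool {
    matches!(
        failure,
        Failure::Network | Failure::Status(408 | 429 | 500 | 502 | 503 | 504)
    )
}

/// Narrower condition: 429 is retried for every method,
/// other transient failures only for GET requests.
fn should_retry(is_get: bool, failure: Failure) -> bool {
    failure == Failure::Status(429) || (is_get && is_transient(failure))
}

fn main() {
    // GET: transient failures are retried, client errors are not.
    assert!(should_retry(true, Failure::Network));
    assert!(should_retry(true, Failure::Status(503)));
    assert!(!should_retry(true, Failure::Status(404)));
    assert!(!should_retry(true, Failure::Status(401)));

    // Non-GET: only 429 is retried.
    assert!(should_retry(false, Failure::Status(429)));
    assert!(!should_retry(false, Failure::Status(500)));
    assert!(!should_retry(false, Failure::Network));

    println!("classification behaves as described");
}
```

Compared with the `.when(...)` closure in the diff, the only behavioural change for GET is that non-transient 4xx responses such as 401, 403, and 404 fail immediately instead of consuming the retry budget.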

codetheweb mentioned this pull request Oct 18, 2025

[ENH]: add list_collections to Rust client #5639

Merged

1 task

codetheweb mentioned this pull request Oct 18, 2025

[ENH]: add OpenTelemetry metrics to Rust client #5640

Merged

1 task

codetheweb force-pushed the feat-chroma-rust-client-metrics branch from 8d165c0 to f3c8a82 Compare October 18, 2025 02:47

codetheweb force-pushed the feat-chroma-rust-client-retries branch 3 times, most recently from 78b6eb1 to 263fa95 Compare October 18, 2025 02:50

codetheweb marked this pull request as ready for review October 18, 2025 02:50

propel-code-bot bot reviewed Oct 18, 2025

rust/chroma/src/client/options.rs (outdated, resolved)

propel-code-bot bot reviewed Oct 18, 2025

codetheweb mentioned this pull request Oct 18, 2025

[ENH]: Rust client misc cleanup #5642

Merged

1 task

codetheweb force-pushed the feat-chroma-rust-client-retries branch from 263fa95 to 251cad6 Compare October 18, 2025 03:07

rescrv self-requested a review October 18, 2025 04:17

rescrv approved these changes Oct 18, 2025

codetheweb force-pushed the feat-chroma-rust-client-metrics branch from f3c8a82 to bd81ebe Compare October 19, 2025 17:29

codetheweb force-pushed the feat-chroma-rust-client-retries branch from 251cad6 to 7cde601 Compare October 19, 2025 17:29

codetheweb force-pushed the feat-chroma-rust-client-metrics branch 2 times, most recently from ef08e67 to 16aa07e Compare October 19, 2025 17:30

codetheweb force-pushed the feat-chroma-rust-client-retries branch from 7cde601 to d994a3e Compare October 19, 2025 17:32

codetheweb force-pushed the feat-chroma-rust-client-metrics branch from 16aa07e to 37666f3 Compare October 19, 2025 17:32

codetheweb changed the base branch from feat-chroma-rust-client-metrics to graphite-base/5641 October 19, 2025 18:03

[ENH]: add retries to Rust client

4b08584

codetheweb force-pushed the graphite-base/5641 branch from 37666f3 to f0232ca Compare October 19, 2025 18:04

codetheweb force-pushed the feat-chroma-rust-client-retries branch from d994a3e to 4b08584 Compare October 19, 2025 18:04

graphite-app bot changed the base branch from graphite-base/5641 to main October 19, 2025 18:04

propel-code-bot bot reviewed Oct 19, 2025

codetheweb merged commit 18c5938 into main Oct 19, 2025
70 of 75 checks passed
