[ENH] Explicitly convert SlowDown from S3 to StorageError::Backoff #5377
Conversation
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
Introduce StorageError::Backoff for S3 SlowDown Errors

This pull request introduces a new StorageError::Backoff enum variant to explicitly represent Amazon S3 'SlowDown' (HTTP 429) responses encountered during storage operations. The 'SlowDown' error returned by S3 is now consistently converted into the Backoff error, allowing callers to identify rate-limiting scenarios and handle them appropriately, rather than treating them as generic unexpected errors. The PR modifies the relevant S3 client methods to implement the new error mapping and updates the error code logic accordingly.

Key Changes
• Added a new StorageError::Backoff enum variant and mapped S3 SlowDown responses to it.

Affected Areas
• rust/storage/src/s3.rs

This summary was automatically generated by @propel-code-bot
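For illustration, a minimal caller-side sketch of how the new variant might be consumed is shown below; the with_backoff helper, the retry budget, and the delay values are assumptions made for this example, not code from the PR.

```rust
use std::{future::Future, time::Duration};

// Hypothetical helper (not part of this PR): retry a storage operation
// whenever it reports StorageError::Backoff, i.e. an S3 SlowDown response.
async fn with_backoff<T, F, Fut>(mut op: F) -> Result<T, StorageError>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, StorageError>>,
{
    let mut delay = Duration::from_millis(100);
    for _ in 0..5 {
        match op().await {
            // Rate limited: sleep, then retry with exponential backoff.
            Err(StorageError::Backoff) => {
                tokio::time::sleep(delay).await;
                delay *= 2;
            }
            // Success or any other error goes straight back to the caller.
            other => return other,
        }
    }
    Err(StorageError::Backoff)
}
```

Before this change, callers only saw a generic error for these responses; the explicit variant lets them distinguish rate limiting from genuine failures.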
// Back off and retry---usually indicates an explicit 429/SlowDown.
#[error("Back off and retry---usually indicates an explicit 429/SlowDown.")]
Backoff,
[Documentation]
The new StorageError::Backoff variant lacks documentation. Consider adding a doc comment to clarify when this error should be used and how callers should handle it:
/// Amazon S3 requests that we back off and retry, typically due to rate limiting (429/SlowDown)
#[error("Back off and retry---usually indicates an explicit 429/SlowDown.")]
Backoff,
rust/storage/src/s3.rs (outdated)
} else if err.meta().code() == Some("PreconditionFailed") {
    StorageError::Backoff
[CriticalError]
There's a logical error in the oneshot_upload method. Line 613 checks for "PreconditionFailed" twice, but the second check (lines 613-614) should be checking for "SlowDown" to match the intended behavior:
} else if err.meta().code() == Some("SlowDown") {
    StorageError::Backoff
})
}
_ => {
Err(e) => match e {
[BestPractice]
Great job adding the Backoff error for S3's SlowDown responses. It looks like this check has been added to get, put, delete, and delete_many.
However, to make the change complete, the same check should be applied to other S3 operations where this error could occur. I've identified a few places where it seems to be missing (see the sketch after this comment):
copy()
prepare_multipart_upload()
upload_part()
finish_multipart_upload()
Since rename() uses copy(), it would also benefit from this change. Applying the SlowDown check consistently across all S3 client calls will make the error handling more robust and fully achieve the goal of this PR.
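As a rough illustration of one of the missing spots, copy() could apply the same mapping the PR already uses for get/put/delete. The function name, parameters, and import path below are assumptions based on the patterns visible in this diff, not the actual s3.rs code.

```rust
use std::sync::Arc;
// Brings `.code()` into scope; adjust the path to whatever the crate already
// uses for ProvideErrorMetadata.
use aws_sdk_s3::error::ProvideErrorMetadata;

// Hypothetical shape of a copy() that maps SlowDown to StorageError::Backoff.
async fn copy_with_slowdown_mapping(
    client: &aws_sdk_s3::Client,
    bucket: &str,
    src_key: &str,
    dst_key: &str,
) -> Result<(), StorageError> {
    match client
        .copy_object()
        .bucket(bucket)
        .copy_source(format!("{bucket}/{src_key}"))
        .key(dst_key)
        .send()
        .await
    {
        Ok(_) => Ok(()),
        Err(e) => {
            let inner = e.into_service_error();
            if inner.code() == Some("SlowDown") {
                // Same rate-limit mapping as get/put/delete/delete_many.
                Err(StorageError::Backoff)
            } else {
                tracing::error!("error: {}", inner.to_string());
                Err(StorageError::Generic {
                    source: Arc::new(inner),
                })
            }
        }
    }
}
```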
_ => {
    if inner.code() == Some("SlowDown") {
        Err(StorageError::Backoff)
    } else {
        tracing::error!("error: {}", inner.to_string());
        Err(StorageError::Generic {
            source: Arc::new(inner),
        })
    }
}
[BestPractice]
The logic to check for SlowDown errors is duplicated across multiple functions (get_stream_and_e_tag, fetch_range, oneshot_upload, etc.). To keep the code DRY and ensure consistency, consider extracting this into a private helper function.
For example, you could add this helper to s3.rs:
fn check_common_s3_errors(err: &dyn ProvideErrorMetadata) -> Option<StorageError> {
    if err.code() == Some("SlowDown") {
        return Some(StorageError::Backoff);
    }
    // Add other common checks here in the future
    None
}
Then you could simplify the error handling blocks like this one:
_ => {
    if let Some(storage_err) = check_common_s3_errors(&inner) {
        Err(storage_err)
    } else {
        tracing::error!("error: {}", inner.to_string());
        Err(StorageError::Generic {
            source: Arc::new(inner),
        })
    }
}
This would make the code cleaner and easier to maintain.
The branch was updated from 5275f03 to d53488d.
if inner.code() == Some("SlowDown") {
    Err(StorageError::Backoff)
} else {
    Err(StorageError::Generic {
        source: Arc::new(inner),
    })
}
[BestPractice]
To improve observability and for consistency with the error handling in get_stream_and_e_tag, consider adding an error log in this generic fallback case. While the outer SdkError is logged, logging the specific inner error here provides more direct insight into unexpected S3 issues from this path.
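A hedged sketch of what that could look like, mirroring the log call already used in get_stream_and_e_tag (only the relevant branch is shown):

```rust
if inner.code() == Some("SlowDown") {
    Err(StorageError::Backoff)
} else {
    // Suggested addition: log the specific inner error before wrapping it,
    // so unexpected S3 failures from this path show up in traces.
    tracing::error!("error: {}", inner.to_string());
    Err(StorageError::Generic {
        source: Arc::new(inner),
    })
}
```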
Description of changes
Amazon periodically tells us to slow down and that manifests as a
gruesome error trace. Make a specific error type so callers can handle
the error.
Test plan
CI
Migration plan
N/A
Observability plan
N/A
Documentation Changes
N/A