@xiaoxuandev
Contributor

@xiaoxuandev xiaoxuandev commented Aug 3, 2023

@danielcweeks @jackye1995 @amogh-jahagirdar @nastra
Add retries to both S3InputStream and S3OutputStream. When we encounter network failures (mostly SSLException for server-side connection resets and SocketTimeoutException for client-side connection resets) or any other retriable S3 error (such as a throttling error), we can retry at the operation level instead of failing the entire query.
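As a rough illustration of the kind of check described above, here is a minimal sketch that classifies a failure as retriable when an SSLException or SocketTimeoutException appears anywhere in its cause chain. The class name `RetryClassifier` and the method `isRetryableNetworkFailure` are hypothetical and are not taken from this PR; S3 throttling errors are left out because detecting them needs the AWS SDK exception types.

```java
import java.net.SocketTimeoutException;
import javax.net.ssl.SSLException;

// Hypothetical helper, not the PR's actual shouldRetry implementation.
final class RetryClassifier {
  // Upper bound on cause-chain traversal, to guard against cyclic causes.
  private static final int MAX_CAUSE_DEPTH = 10;

  private RetryClassifier() {}

  /**
   * Returns true if the failure, or any of its causes, is one of the
   * network exceptions named in the description above.
   */
  static boolean isRetryableNetworkFailure(Throwable failure) {
    Throwable current = failure;
    int depth = 0;
    while (current != null && depth < MAX_CAUSE_DEPTH) {
      if (current instanceof SSLException || current instanceof SocketTimeoutException) {
        return true;
      }
      current = current.getCause();
      depth++;
    }
    return false;
  }
}
```

Walking the cause chain matters because client libraries often wrap the underlying network exception in their own exception type.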

@github-actions github-actions bot added the AWS label Aug 3, 2023
this.skipSize = skipSize;
}

public static boolean shouldRetry(Exception exception) {
@nastra
Contributor

nastra commented Aug 4, 2023

It feels like this adds a lot of baggage and complexity, since it requires quite a few configuration options. I was also wondering whether we could leverage the retry behavior of the underlying S3 client by configuring it properly. The default retry behavior is shown in https://github.com/aws/aws-sdk-java-v2/blob/2.20.18/core/sdk-core/src/main/java/software/amazon/awssdk/core/internal/retry/SdkDefaultRetrySetting.java#L72-L84, and a configuration could look similar to what has been proposed in #8043.

@xiaoxuandev
Contributor Author

Thanks, @nastra. I see what you are proposing, but in our case SDK-level retry wouldn't help, because we are using this method:
https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3Client.html#getObject(software.amazon.awssdk.services.s3.model.GetObjectRequest,software.amazon.awssdk.core.sync.ResponseTransformer)
It returns a stream that we then read from. SDK-level retry doesn't cover anything beyond this call, and in our case the exceptions are thrown after the call returns, while we are reading from the stream. That is why we need to add retries here.
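To make the point concrete, here is a minimal sketch of a stream wrapper that retries failures raised during reads by reopening the source at the current byte offset. It is not the code in this PR: `ResumingInputStream`, `RangeOpener`, and `openAt` are hypothetical names, and `openAt` stands in for whatever issues a ranged GET against S3. Backoff and exception filtering are omitted for brevity.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper illustrating read-level retries; not the PR's implementation.
final class ResumingInputStream extends InputStream {
  /** Hypothetical callback: opens the object starting at the given byte offset. */
  interface RangeOpener {
    InputStream openAt(long offset) throws IOException;
  }

  private final RangeOpener opener;
  private final int maxAttempts;
  private InputStream delegate; // opened lazily, dropped after a failure
  private long pos = 0;         // bytes successfully handed to the caller

  ResumingInputStream(RangeOpener opener, int maxAttempts) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    this.opener = opener;
    this.maxAttempts = maxAttempts;
  }

  @Override
  public int read() throws IOException {
    byte[] one = new byte[1];
    int n = read(one, 0, 1);
    return n < 0 ? -1 : one[0] & 0xFF;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        if (delegate == null) {
          delegate = opener.openAt(pos); // resume where the last good read ended
        }
        int n = delegate.read(buf, off, len);
        if (n > 0) {
          pos += n;
        }
        return n;
      } catch (IOException e) {
        last = e;
        closeQuietly(); // discard the broken stream; the next attempt reopens at pos
      }
    }
    throw last;
  }

  @Override
  public void close() {
    closeQuietly();
  }

  private void closeQuietly() {
    if (delegate != null) {
      try {
        delegate.close();
      } catch (IOException ignored) {
        // the stream is already unusable; nothing more to do
      }
      delegate = null;
    }
  }
}
```

The key design choice is tracking `pos` outside the underlying stream, so a failure in the middle of the object costs only a reopen at that offset rather than a restart of the whole read.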

@rdblue
Contributor

rdblue commented Aug 31, 2023

@danielcweeks, what are your thoughts on retries in the S3 layer?

@github-actions

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Sep 13, 2024
@github-actions

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Sep 22, 2024