Skip to content

Conversation

@edgarRd
Copy link
Contributor

@edgarRd edgarRd commented Oct 16, 2024

The retry policy for S3InputStream reads introduced in c0d73f4 (#10433) is not actually re-opening the stream on each retry attempt given that it uses onFailure which, as per the documentation, it is triggered after the whole (retry) policy has failed completely and was unable to produce a successful result, i.e. all retries have been exhausted and failed. With the existing implementation, once a stream fails with something like SSLException due to a socket connection reset, the retries will keep failing (due to not actually reopening the stream) and therefore this won't work. The onFailure call does not work either because by the time openStream is called, the retry policy failure has already been propagated, failing the read operation.
The tests also were not testing that the retry policy actually called openStream(true) to reset the stream, it only tested that the calls went through the retry policy a given number of times, generally always succeeding on the last one.

This PR fixes the problem by using onRetry instead, as per the documentation, to re-open the stream right before we attempt the read again. Also, it fixes the tests by actually checking that the stream is reset (re-opened).

@github-actions github-actions bot added the AWS label Oct 16, 2024
@edgarRd
Copy link
Contributor Author

edgarRd commented Oct 16, 2024

@amogh-jahagirdar @jackye1995 @danielcweeks PTAL. Thanks!

@edgarRd
Copy link
Contributor Author

edgarRd commented Oct 16, 2024

cc @Parth-Brahmbhatt

@edgarRd edgarRd changed the title AWS: Fix S3inputstream retry policy AWS: Fix S3InputStream retry policy Oct 16, 2024
Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this @edgarRd , this was a miss on my part on the original PR. Yeah we need to be using onRetry for triggering the input stream reset, not onFailure. Had some minor comments on the log message, would be great to fix so we can get this correctly into 1.7

Comment on lines +65 to +68
void resetForRetry() throws IOException {
resetForRetryCounter.incrementAndGet();
super.resetForRetry();
}
Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar Oct 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this, this was crucial for verifying the retry behavior actually resets the input stream. Seems like we were just getting lucky on the tests before since they only counted the number of attempts but not what was actually happening in the attempts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That's correct.

@edgarRd
Copy link
Contributor Author

edgarRd commented Oct 16, 2024

@amogh-jahagirdar I've addressed the comment. PTAL - Thanks.

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @edgarRd, really appreciate the fix here! I'll keep it open for a little bit in case anyone else had any comments.

Copy link
Contributor

@singhpk234 singhpk234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well, Thanks @edgarRd !

@amogh-jahagirdar
Copy link
Contributor

amogh-jahagirdar commented Oct 17, 2024

Thanks @edgarRd, I'll go ahead and merge. Thank you @Parth-Brahmbhatt @singhpk234 for reviewing!

@amogh-jahagirdar amogh-jahagirdar merged commit bbbfd1e into apache:main Oct 17, 2024
@edgarRd
Copy link
Contributor Author

edgarRd commented Oct 17, 2024

Thanks for the reviews, @amogh-jahagirdar @singhpk234 and @Parth-Brahmbhatt !

zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants