Fix S3InputStream's handling of large skips #24521
I was able to reproduce this issue, but not in a way that makes it easy to write a regression test. What I did was upload a 10GB uncompressed JSON file to S3 and set up the |
lib/trino-filesystem-s3/src/main/java/io/trino/filesystem/s3/S3InputStream.java
I'm testing this with secrets now
What is the further consequence of this?
@findinpath I think the existing description is exhaustive enough. For the S3 file system, any delayed request makes planning/execution take longer than necessary.
@alexjo2144 thanks, merging
FYI, added a release notes entry
okay @wendigo @alexjo2144
That works for me, thanks Manfred
Description
When the skip(n) method is called, the MAX_SKIP_BYTES check is bypassed, so the call can block for a long time.
Instead of delegating to the underlying stream, skip(n) now only sets the nextReadPosition value. This lets the next read decide whether it is better to keep the existing S3 object stream or open a new one.
This behavior matches the implementations for Azure and GCS.
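For reviewers who want the idea in isolation, here is a minimal sketch of the lazy-skip approach described above. It is not the code in S3InputStream.java: the class name LazySkipInputStream, the byte-array "object", the opens counter, and the MAX_SKIP_BYTES value of 4 are all assumptions made for illustration. Only the names skip(n), MAX_SKIP_BYTES, and nextReadPosition come from this PR.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative stand-in, NOT the real S3InputStream: the "object" is a byte array
// and "opening a stream" is a ranged view over it.
class LazySkipInputStream extends InputStream {
    // Hypothetical threshold; the real MAX_SKIP_BYTES value is not stated in this PR.
    static final long MAX_SKIP_BYTES = 4;

    private final byte[] object;
    private InputStream in;          // current "object stream", null until the first read
    private long streamPosition;     // offset the open stream currently points at
    private long nextReadPosition;   // offset the caller wants the next byte from
    int opens;                       // counts simulated ranged requests

    LazySkipInputStream(byte[] object) {
        this.object = object;
    }

    @Override
    public long skip(long n) {
        if (n <= 0) {
            return 0;
        }
        // Only record the target position; the underlying stream is not touched here.
        long target = Math.min(nextReadPosition + n, object.length);
        long skipped = target - nextReadPosition;
        nextReadPosition = target;
        return skipped;
    }

    @Override
    public int read() throws IOException {
        if (nextReadPosition >= object.length) {
            return -1;
        }
        seekStream();
        int value = in.read();
        if (value >= 0) {
            streamPosition++;
            nextReadPosition++;
        }
        return value;
    }

    // Decide at read time whether to reuse the open stream or open a new one.
    private void seekStream() throws IOException {
        if (in != null && nextReadPosition == streamPosition) {
            return; // already positioned correctly
        }
        long gap = nextReadPosition - streamPosition;
        if (in != null && gap > 0 && gap <= MAX_SKIP_BYTES) {
            // Small forward gap: discard bytes on the existing stream.
            long remaining = gap;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    break;
                }
                remaining -= skipped;
            }
            if (remaining == 0) {
                streamPosition = nextReadPosition;
                return;
            }
        }
        // Large gap, backward seek, or no stream yet: open a new ranged stream.
        if (in != null) {
            in.close();
        }
        int offset = (int) nextReadPosition;
        in = new ByteArrayInputStream(object, offset, object.length - offset);
        streamPosition = nextReadPosition;
        opens++;
    }
}
```

In this sketch a skip never blocks, because it only updates a counter. The cost of a large skip shows up as one new ranged open on the next read, rather than as reading and discarding every skipped byte.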
Additional context and related issues
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: