Skip to content
Merged
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 23 additions & 2 deletions aws/src/main/java/org/apache/iceberg/aws/s3/S3InputStream.java
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import software.amazon.awssdk.core.sync.ResponseTransformer;
import software.amazon.awssdk.http.Abortable;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.NoSuchKeyException;
Expand Down Expand Up @@ -194,9 +195,29 @@ private void openStream() throws IOException {
}
}

private void closeStream() throws IOException {
private void closeStream() {
if (stream != null) {
stream.close();
// if we aren't at the end of the stream, and the stream is abortable, then
// call abort() so we don't read the remaining data with the Apache HTTP client
abortStream();
try {
stream.close();
} catch (Exception e) {
// log at trace level as closing an aborted stream will throw a content length
// check exception with the Apache HTTP client
LOG.trace("Error closing stream", e);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That means we will see exceptions for every stream close if trace is enabled. If it is expected to fail with the content length check, can we at least skip that? Then we don't need to log in trace, but we can log in maybe a warning level.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is how I originally had it but the feedback was to log something. I'd be happy to remove that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@singhpk234 @danielcweeks are you ok if I revert this back to a no-op?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question to ask before this is, why do we need to close it again if it's already aborted? If you see in the Trino logic I referenced, it's an if-else.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would work for the Apache HTTP client, the close doesn't do anything for that. But aborting the stream opened by URL connection client doesn't do anything, so we need to close it in that case.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, that's interesting difference, thanks for the explanation!

}
stream = null;
}
}

private void abortStream() {
try {
if (stream instanceof Abortable && stream.read() != -1) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we want to read one more byte here? It might cause one more request. I think it does not hurt to even abort when it's already fully read.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you abort the connection then it will be invalidated and removed from the pool, so it won't be reused. So this is an optimization to attempt to reuse the connection in cases where it has been fully read.

((Abortable) stream).abort();
}
} catch (Exception e) {
LOG.debug("An error occurred while aborting the stream", e);
Comment thread
jackye1995 marked this conversation as resolved.
Outdated
}
}

Expand Down