AWS: abort S3 input stream on close if not EOS #7262
Conversation
```java
try {
  stream.close();
} catch (Exception e) {
  // close quietly
```
We might want to log the underlying reason as info or warn here, just so we don't silently swallow issues that may come up.
This will always throw on an aborted stream due to a checksum check failure, so it could be very noisy.
I added a trace log
```java
if (stream != null) {
  stream.close();
}
closeStream();
```
This change seems unrelated and I'm not sure what it really adds. It looks like we're dereferencing the stream, but we would still throw in any situation where it would actually have any effect.
stream.close() was being called without the null check in one place, so this was added to prevent a potential NPE; however, this is unrelated to the performance regression.
I rolled this back
Thanks @bryanck, this is a great find and fix! One doubt: will it cause an issue when we switch the client back from Apache to url-connection (let's say via table props), considering this was not an issue with the url-connection client? Also, do we need this in 1.2.1, considering 1.2.0 still shipped with url-connection as the default HTTP client? The regression commit is still in master.
@singhpk234 I thought we wanted to get #7119 into 1.2.1, is that still the case @danielcweeks @nastra? Without that, it becomes more cumbersome to configure when using the AWS bundle, as including the URL connection client on the classpath then requires the default AWS HTTP client to be set as well (e.g. via system properties).
@bryanck apologies, as per my understanding I thought it was the other way around, based on this file: UrlConnectionHttpClient.java. I am also very curious about this statement from the doc: "Input stream that provides access to the unmarshalled POJO response returned by the service in addition to the streamed contents. This input stream should be closed to release the underlying connection back to the connection pool. If it is not desired to read remaining data from the stream, you can explicitly abort the connection via abort(). Note that this will close the underlying connection and require establishing an HTTP connection which may outweigh the cost of reading the additional data." If we go by the last sentence, and IIUC, it says calling abort will not let the connection be reused and the connection pool will have to establish a new HTTP connection; can that cause a regression in some scenario? I am just trying to clear up my understanding here.
My understanding was that 1.2.1 was only for bug fixes for the 1.2.0 release.
@singhpk234 I ran the TPC-DS benchmark with and without the changes in this PR using the Apache HTTP client. Without this change, the result was ~2x slower (as a result of reading far more data). With this change, the result was the same as with the URL connection client (and the amount of data transferred was also the same). Closing a stream opened by the URL connection client won't read to EOS like with the Apache HTTP client, so with this PR the behavior should be better aligned.
@singhpk234 you are right that changing the default HTTP client for a patch release is not appropriate, so we'd leave that out of 1.2.1. We'd like the Apache HTTP client fixes in for 1.2.1, though, so that can be used.
Nice catch. Sorry I totally forgot we did exactly the same abort thing for Trino: https://github.com/trinodb/trino/blob/master/plugin/trino-hive/src/main/java/io/trino/plugin/hive/s3/TrinoS3FileSystem.java#L1579, and also in my old PR https://github.com/apache/iceberg/pull/4912/files#diff-0b632866a3b10fac55c442b08178ec0ac72b3b600878243e15d788a8bd031054R299.
```java
private void abortStream() {
  try {
    if (stream instanceof Abortable && stream.read() != -1) {
```
Why do we want to read one more byte here? It might cause one more request. I think it does not hurt to abort even when it's already fully read.
If you abort the connection then it will be invalidated and removed from the pool, so it won't be reused. So this is an optimization to attempt to reuse the connection in cases where it has been fully read.
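For illustration, a minimal sketch of the full abort-if-not-EOS logic being discussed, as a fragment from a hypothetical S3InputStream-like class (the `Abortable` check and one-byte probe come from the diff above; the surrounding structure and the `LOG` field are assumptions, not the PR's exact code):

```java
import software.amazon.awssdk.http.Abortable;

// Sketch: abort the underlying connection only when the stream was not fully
// consumed. A fully-read stream can go back to the connection pool, so
// aborting it would discard a reusable connection.
private void abortStream() {
  try {
    // read() == -1 means we are already at EOS and can close normally; the
    // probed byte is typically buffered, so this rarely causes network I/O
    if (stream instanceof Abortable && stream.read() != -1) {
      ((Abortable) stream).abort();
    }
  } catch (Exception e) {
    LOG.trace("Error aborting stream", e);
  }
}
```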
```java
} catch (Exception e) {
  // log at trace level as closing an aborted stream will throw a content length
  // check exception with the Apache HTTP client
  LOG.trace("Error closing stream", e);
```
That means we will see exceptions for every stream close if trace is enabled. If it is expected to fail with the content length check, can we at least skip that? Then we don't need to log at trace, but we could log at maybe warning level.
That is how I originally had it, but the feedback was to log something. I'd be happy to remove that.
@singhpk234 @danielcweeks are you ok if I revert this back to a no-op?
I guess the question to ask before this is, why do we need to close it again if it's already aborted? If you see in the Trino logic I referenced, it's an if-else.
That would work for the Apache HTTP client, since close doesn't do anything there. But aborting a stream opened by the URL connection client doesn't do anything, so we need to close it in that case.
I see, that's an interesting difference, thanks for the explanation!
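To make the cross-client behavior concrete, a hedged sketch of how a combined close path could look, using the hypothetical helpers `abortStream()`/`closeStream()` from the diff (not necessarily the PR's exact code):

```java
@Override
public void close() throws IOException {
  // abortStream() is effectively a no-op for a URL-connection-backed stream;
  // closeStream() is needed for the URL connection client, while for an
  // already-aborted Apache stream its close() throws and is swallowed
  abortStream();
  closeStream();
}
```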
```java
try {
  stream.close();
} catch (Exception e) {
  // close quietly, closing an aborted stream will throw a content length
```
Sorry, I guess I did not express it clearly. What I meant was: can we check the exception, and if it is a content length check exception then skip it, otherwise log a warning?
I see. ContentLengthInputStream is an Apache HTTP class, so referring to that will tie S3InputStream to that client.
I could check the name of the class, but that seems like a little bit of a hack.
Ah okay, what does the exception look like? I thought there is some error code that we can get out of the exception that would imply this specific error. Is that not there?
This is what gets thrown. Calling close on the aborted stream will attempt a read which then throws that.
Checking ConnectionClosedException seems good enough to me, because it makes sense that if the connection is already closed then we can omit the error.
I think org.apache.http.ConnectionClosedException is already on the classpath given we have some REST client integration in core, but I might be wrong.
Sounds good, I pushed that change
That is a different version than the one used by the AWS SDK (v5 instead of v4). Also, the AWS bundle shades the library.
Another option is we check which HTTP client is being used and change the close behavior based on that.
jackye1995
left a comment
Thanks, looks good to me!
```java
} catch (IOException e) {
  // the Apache HTTP client will throw a ConnectionClosedException
  // when closing an aborted stream, which is expected
  if (!e.getClass().getSimpleName().equals("ConnectionClosedException")) {
```
Can we just do instanceof here?
Oh okay, I see this thread: #7262 (comment). Since it's specific to the Apache HTTP client, it's not guaranteed it'll be on the classpath, so we can't use instanceof.
That would add a dependency on the AWS-specific Apache HTTP client (v4). Also, the AWS bundle shades the library, so the classes will be different depending on that (different packages).
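For reference, a sketch of the name-based check being discussed, which avoids a compile-time dependency on any particular (possibly shaded) Apache HTTP client package (the surrounding method shape is an assumption, assembled from the diff fragments above):

```java
private void closeStream() throws IOException {
  if (stream != null) {
    try {
      stream.close();
    } catch (IOException e) {
      // compare by simple class name rather than instanceof: the Apache
      // client's ConnectionClosedException may be shaded into a different
      // package in the AWS bundle, so a direct class reference would not match
      if (!e.getClass().getSimpleName().equals("ConnectionClosedException")) {
        throw e;
      }
    }
    stream = null;
  }
}
```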
amogh-jahagirdar
left a comment
LGTM, thanks for the fix @bryanck!
Actually that's a potentially better approach, since we only have 2 officially supported client types. I am a bit worried about the 1 additional byte read; it seems to only benefit … We could get the value from … @bryanck @amogh-jahagirdar any thoughts?
I was hoping we could get it from S3Client, but I didn't see a way. I'll look some more tonight.
The only way I could find to determine the HTTP client type on the S3 client is to use reflection. So we could do that, which is not ideal. Another option is to read the AwsProperties to get the type, but that may not match the actual HTTP client for custom factories. I feel like the current solution is the least invasive for the 1.2.1 release and achieves the desired outcome of performance parity between the two HTTP clients. There are alternatives to detecting the EOS as well. We could get the content length of the response and track the number of bytes read from the stream, but I feel that is heavier than the one-byte read. Generally that byte will be buffered if not at EOS, so it shouldn't trigger any network I/O.
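The content-length alternative mentioned above could, as a rough sketch, look something like the following (illustrative only; `pos` and `contentLength` are hypothetical fields, with the length taken from e.g. `GetObjectResponse#contentLength()`):

```java
// Hypothetical alternative: track the read position against the response's
// content length instead of probing EOS with a one-byte read.
private long pos = 0;
private long contentLength; // e.g. from GetObjectResponse#contentLength()

@Override
public int read() throws IOException {
  int b = stream.read();
  if (b != -1) {
    pos += 1;
  }
  return b;
}

private boolean atEndOfStream() {
  return pos >= contentLength;
}
```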
Commits:
- AWS: abort S3 input stream on close if not EOS
- Close the stream for backwards compatibility
- undo unrelated change
- add trace log
- comment update
- logger updates
- handle connection closed exception
singhpk234
left a comment
LGTM as well (apologies for being late to the party), many thanks @bryanck for this fix!
This PR adds a check when closing an S3 input stream to determine whether the stream is at end-of-stream. If not, it calls abort() on the stream instead of close(). This avoids reading to the end of the stream when closing. The Apache HTTP client will always read the entire stream on close, which results in much more data being read than needed. The URL connection client does not behave this way.
Now that the Apache HTTP client is the default for AWS instead of the URL connection client, this change should prevent a performance regression.
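As an illustration of why this matters, consider a reader that only needs the first few bytes of a large object, a common pattern for columnar formats; with the Apache client, a plain close() drains the remaining body, while abort() drops the connection immediately. A self-contained sketch (bucket and key names are hypothetical):

```java
import software.amazon.awssdk.core.ResponseInputStream;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.GetObjectResponse;

public class AbortOnCloseExample {
  public static void main(String[] args) throws Exception {
    try (S3Client s3 = S3Client.create()) {
      GetObjectRequest request =
          GetObjectRequest.builder().bucket("my-bucket").key("big-object").build();
      ResponseInputStream<GetObjectResponse> stream = s3.getObject(request);
      byte[] header = new byte[16];
      stream.read(header); // read only a small prefix of a large object
      // With the Apache HTTP client, stream.close() here would read the rest
      // of the body to EOS before releasing the connection to the pool;
      // stream.abort() closes the underlying connection without draining it.
      stream.abort();
    }
  }
}
```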