-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HADOOP-16906. Add Abortable.abort() interface for streams to enable output stream to be terminated #2667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ble output stream to be terminated No meaningful tests have been added, as I have no idea where I can add it, and how s3a has been tested with integration test manner. (Tests in TestS3ABlockOutputStream only check simple things with mocking everything, so can't do some write/upload test with it.)
|
I'll migrate @steveloughran 's review comments to the diff in PR. It doesn't seem to be done automatically - these comments have been shown as "normal review comments". |
| } | ||
|
|
||
| @Override | ||
| public void abort() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Migrating comment in HeartSaVioR@63c5588#r46507150
the new "cloud ready" API calls always return a CompletableFuture, to emphasise that the op may take time and to allow the caller to do something while waiting. Would we want to do this here? I'm not convinced it is appropriate. Instead we say
- call must guarantee that after this is invoked,. close() will not materialize the file at its final path
- it may communicate with the store to cancel an operation; which may retry. Errors will be stored.
- there may still/also be async IO to the store after the call returns, but this must maintain the requirement "not visible"
- And close() may do some IO to cancel
| return; | ||
| } | ||
|
|
||
| S3ADataBlocks.DataBlock block = getActiveBlock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Migrating comment in HeartSaVioR@63c5588#r46506593
I think we are going to have to worry about this a bit more, because we may have queued >1 block for upload in a separate thread. They'll maybe need interruption, or at least, when they finish, see if they should immediately cancel the upload. This won't make any difference in the semantics of abort() (the final upload has been killed), I just don't want to run up any bills.
| S3ADataBlocks.DataBlock block = getActiveBlock(); | ||
| try { | ||
| if (multiPartUpload != null) { | ||
| multiPartUpload.abort(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Migrating comment in HeartSaVioR@63c5588#r46506765
ok, don't worry so much about my prev comment. That cancels all the outstanding futures.
| "uploadId", 50000, 1024, inputStream, null, 0L)); | ||
| } | ||
|
|
||
| @Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Migrating comment in HeartSaVioR@63c5588#r46506646
tests are good. We will need to do an ITest too, which can be done in ITestS3ABlockOutputArray
|
|
||
| // This verification replaces testing various operations after calling abort: | ||
| // after calling abort, stream is closed like calling close(). | ||
| intercept(IOException.class, () -> stream.checkOpen()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Migrating comment in HeartSaVioR@63c5588#r46507220
should also verify that stream.write() raises an IOE. We could raise a subclass of IOE to indicate this was a checkOpen failure for a stricter test
|
💔 -1 overall
This message was automatically generated. |
|
Addressed the basic UT, IT, scaled IT. Once the Yetus is happy with the change, I'll remove WIP and ping again for review. |
|
I don't get the warning sign findbugs provided; |
|
💔 -1 overall
This message was automatically generated. |
|
I'll remove WIP tag as I don't have any idea on findbugs failure and I guess I've addressed everything except "cloud-friendly" requirement on API. This doesn't sound something strictly bound to this PR, but please correct me if I'm missing here. |
|
cc. @steveloughran Could you please review this PR? Thanks in advance! |
|
💔 -1 overall
This message was automatically generated. |
mehakmeet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, pending some nits. Also, maybe we should add a test after a path is created and then abort the upload? There are no tests to check abort with pre-existing data on a file. Also, we have object_multipart_aborted statistic in IOStatistics by @steveloughran, which could be a little helpful in asserting the abort while doing a multipart upload.
|
|
||
| import org.apache.hadoop.fs.FSDataOutputStream; | ||
| import org.apache.hadoop.fs.FileSystem; | ||
| import org.apache.hadoop.io.IOUtils; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reposition in org.apache.* block below.
| } | ||
|
|
||
| @Test | ||
| public void testAbortAfterTwoPartUpload() throws Throwable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
javadoc or describe() to explain the test.
| import com.amazonaws.services.s3.model.PutObjectRequest; | ||
| import com.amazonaws.services.s3.model.PutObjectResult; | ||
| import com.amazonaws.services.s3.model.UploadPartRequest; | ||
| import org.apache.hadoop.fs.Abortable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reposition in org.apache.* block below.
mukund-thakur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prod code looks good.
Pending checkstyle as well.
Let me think what can done for better tests. I like mehakmeet's suggention. Also about the find bug.
|
|
||
| @Test | ||
| public void testAbortAfterWrite() throws Throwable { | ||
| Path dest = path("testAbortAfterWrite"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use getMethodName()
| @Test | ||
| public void testAbortAfterWrite() throws Throwable { | ||
| Path dest = path("testAbortAfterWrite"); | ||
| describe(" testAbortAfterWrite"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: A bit more explanatory.
| return true; | ||
|
|
||
| // S3A supports abort. | ||
| case StreamCapabilities.ABORTABLE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can merge both case statements
|
|
||
| S3ADataBlocks.DataBlock block = getActiveBlock(); | ||
| try { | ||
| if (multiPartUpload != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering what happens in case of non multipart upload
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it,we are closing the stream before only.
This is the reason. I think we can ignore this. What do you think @steveloughran ? |
|
Added an extra patch to this, see the PR #2684 . |
|
Thanks @steveloughran for helping! I see you've addressed lots of points especially amazing efforts on doc. Let me close this and jump on your PR. Thanks again! |
NOTE: WIP. DO-NOT-MERGE.
No meaningful tests have been added, as I have no idea where I can add it, and how s3a has been tested
with integration test manner. (Tests in TestS3ABlockOutputStream only check simple things with mocking
everything, so can't do some write/upload test with it.)
I got some review comments from @steveloughran in my commit, and will reflect these review comments.
Once it's done I'll remove WIP.