@steveloughran
Contributor

HADOOP-16190. S3A copyFile operation to include source versionID or etag in the copy request

This patch adds the constraints on the request, and maps a 412 response to a RemoteFileChangedException.

No obvious test for this. The way to do it would be to get an invalid etag/version into the request and see what happens, which would complicate the copy API a bit, but is something we will need for etag/version tracking in s3guard anyway.
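As a rough illustration of the two changes described above (not the code in this patch), the sketch below pins a copy request to the source's version ID when one is available, falls back to the etag otherwise, and maps an HTTP 412 to a dedicated exception. `CopyRequestSketch`, `RemoteFileChangedSketch` and `CopyConstraintSketch` are hypothetical stand-ins written for this example; the patch itself works against the AWS SDK copy request and Hadoop's `RemoteFileChangedException`. The version-ID-first ordering is an assumption.

```java
// Hypothetical stand-in for Hadoop's RemoteFileChangedException.
class RemoteFileChangedSketch extends java.io.IOException {
  RemoteFileChangedSketch(String path, String reason) {
    super(path + ": " + reason);
  }
}

// Hypothetical, simplified copy request: only the two constraint fields matter here.
class CopyRequestSketch {
  String sourceVersionId;  // pins the copy to one specific object version
  String matchingEtag;     // copy succeeds only while the source etag still matches
}

class CopyConstraintSketch {
  static final int SC_PRECONDITION_FAILED = 412;

  /** Prefer the version ID when the store returned one; otherwise use the etag. */
  static CopyRequestSketch constrain(CopyRequestSketch request,
      String versionId, String etag) {
    if (versionId != null) {
      request.sourceVersionId = versionId;
    } else if (etag != null) {
      request.matchingEtag = etag;
    }
    return request;
  }

  /** Map the HTTP status of a failed copy to an exception. */
  static java.io.IOException translate(String path, int statusCode) {
    if (statusCode == SC_PRECONDITION_FAILED) {
      return new RemoteFileChangedSketch(path,
          "source changed during copy (HTTP 412)");
    }
    return new java.io.IOException(
        path + ": copy failed with HTTP " + statusCode);
  }
}
```

A test of exception translation could then assert that `translate(path, 412)` returns the dedicated exception type while other status codes do not.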

Change-Id: I4b229336ba2d57018bd8b66888b807074419598e

@steveloughran
Contributor Author

Initial PR: untested

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 27 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 1001 | trunk passed |
| +1 | compile | 31 | trunk passed |
| +1 | checkstyle | 21 | trunk passed |
| +1 | mvnsite | 35 | trunk passed |
| +1 | shadedclient | 717 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 43 | trunk passed |
| +1 | javadoc | 21 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 29 | the patch passed |
| +1 | compile | 29 | the patch passed |
| +1 | javac | 29 | the patch passed |
| +1 | checkstyle | 18 | the patch passed |
| +1 | mvnsite | 32 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 722 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 46 | the patch passed |
| +1 | javadoc | 21 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 275 | hadoop-aws in the patch passed. |
| +1 | asflicense | 28 | The patch does not generate ASF License warnings. |
| | | 3169 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-606/1/artifact/out/Dockerfile |
| GITHUB PR | #606 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux b3971a01fcbb 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / d60673c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/1/testReport/ |
| Max. process+thread count | 446 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/1/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

@steveloughran
Contributor Author

no, no new tests. Could add one to check exception translation though

…tag in the copy request

@steveloughran force-pushed the s3/HADOOP-16190-copy-etag branch from b19d3fe to 8ca29e7 on March 21, 2019 11:19
@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 26 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 999 | trunk passed |
| +1 | compile | 33 | trunk passed |
| +1 | checkstyle | 23 | trunk passed |
| +1 | mvnsite | 38 | trunk passed |
| +1 | shadedclient | 715 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 43 | trunk passed |
| +1 | javadoc | 25 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 29 | the patch passed |
| +1 | compile | 28 | the patch passed |
| +1 | javac | 28 | the patch passed |
| +1 | checkstyle | 18 | the patch passed |
| +1 | mvnsite | 31 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 741 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 47 | the patch passed |
| +1 | javadoc | 23 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 274 | hadoop-aws in the patch passed. |
| +1 | asflicense | 28 | The patch does not generate ASF License warnings. |
| | | 3200 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-606/2/artifact/out/Dockerfile |
| GITHUB PR | #606 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2c74b7f3fb17 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9f1c017 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/2/testReport/ |
| Max. process+thread count | 445 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/2/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

@steveloughran
Contributor Author

Full test run against an unversioned bucket with s3guard, dynamo and scale options (i.e. uses etags).

All tests worked except for ITestDynamoDBMetadataStoreScale, whose failures are down to me switching to an unlimited PAYG DDB table. See HADOOP-16118

rerunning against a versioned bucket to make sure this codepath is good too.

  1. There are no tests because I can't think of a good test here.
  2. I'm going to add a comment in testing about why it's good to test with versioned buckets

…ry rule. Includes a PNG file of the form to fill in

Change-Id: I7cce8d954da08a9f9e39701e6abaf6b99bdf6cea
@steveloughran
Contributor Author

  • tested at scale against a versioned bucket; all happy apart from those dynamo db failures.
  • Added a section on setting up a versioned bucket for testing, especially setting up a rule to delete old files. You will pay for a day's storage of all the data generated on every test run: your bill is O(runs), with scale test runs costing more. But after 24h with no tests, there is no data to bill for.

FWIW, I'll set up different buckets with different policies, and have my dev hadoop-trunk source tree running unversioned, but the separate branch I use for committing work set up to test against versioned. At least once we've got the version-aware-delete-fake-dirs-code in


AWS S3 and some third party stores support versioned buckets.

Hadoop is adding awareness of this, including

whitespace:end of line

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 29 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 1020 | trunk passed |
| +1 | compile | 30 | trunk passed |
| +1 | checkstyle | 25 | trunk passed |
| +1 | mvnsite | 36 | trunk passed |
| +1 | shadedclient | 700 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 46 | trunk passed |
| +1 | javadoc | 27 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 29 | the patch passed |
| +1 | compile | 28 | the patch passed |
| +1 | javac | 28 | the patch passed |
| +1 | checkstyle | 20 | the patch passed |
| +1 | mvnsite | 33 | the patch passed |
| -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 742 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 50 | the patch passed |
| +1 | javadoc | 23 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 273 | hadoop-aws in the patch passed. |
| +1 | asflicense | 32 | The patch does not generate ASF License warnings. |
| | | 3212 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-606/3/artifact/out/Dockerfile |
| GITHUB PR | #606 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2dc13fc24f1d 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 9f1c017 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| whitespace | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/3/artifact/out/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/3/testReport/ |
| Max. process+thread count | 440 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/3/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

* @throws AmazonClientException on failures inside the AWS SDK
* @throws InterruptedIOException the operation was interrupted
* @throws IOException Other IO problems
*/
Contributor

Add @return javadoc

Contributor Author

done; will review rest of comment.

…the exception, add a test for that and an entry in the troubleshooting_s3a doc

Change-Id: I71a13024a7efe32b56ecfe49fbe79529627ab126
@steveloughran
Contributor Author

I've just updated the patch to address the PR and some other changes it needed: a test and a doc entry.

Also my first attempt at using VS Code as an in-IDE review tool. Mixed opinions there; will need to explore it more, but worth trying out.

Change-Id: I46b3fc65711856b390e1a4d3b4d860113061eac8

/**
* Error message used when mapping a 412 to this exception.
*/

whitespace:end of line


AWS S3 and some third party stores support versioned buckets.

Hadoop is adding awareness of this, including

whitespace:end of line

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 26 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 1003 | trunk passed |
| +1 | compile | 28 | trunk passed |
| +1 | checkstyle | 18 | trunk passed |
| +1 | mvnsite | 36 | trunk passed |
| +1 | shadedclient | 673 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 38 | trunk passed |
| +1 | javadoc | 20 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 27 | the patch passed |
| +1 | compile | 27 | the patch passed |
| +1 | javac | 27 | the patch passed |
| -0 | checkstyle | 15 | hadoop-tools/hadoop-aws: The patch generated 1 new + 10 unchanged - 0 fixed = 11 total (was 10) |
| +1 | mvnsite | 33 | the patch passed |
| -1 | whitespace | 0 | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 738 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 47 | the patch passed |
| +1 | javadoc | 21 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 275 | hadoop-aws in the patch passed. |
| +1 | asflicense | 25 | The patch does not generate ASF License warnings. |
| | | 3134 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-606/4/artifact/out/Dockerfile |
| GITHUB PR | #606 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 4bcdbdabc70e 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 67dd45f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/4/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt |
| whitespace | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/4/artifact/out/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/4/testReport/ |
| Max. process+thread count | 446 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/4/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

copyObjectRequest.setCannedAccessControlList(cannedACL);
copyObjectRequest.setNewObjectMetadata(dstom);
String id = srcom.getVersionId();
if (id != null) {
Contributor

The way I have been approaching this in my work for HADOOP-16085 is to use ChangeDetectionPolicy and ChangeTracker. I was adding new ChangeTracker.maybeApplyConstraint(CopyObjectRequest) and ChangeDetectionPolicy.applyRevisionConstraint(CopyObjectRequest, revisionId) methods to support that.

I call ChangeTracker.processMetadata(srcom) and then ChangeTracker.maybeApplyConstraint(copyObjectRequest). Resulting behavior depends on ChangeDetectionPolicy, but would generally be similar to what you are doing here. A couple of differences:

  • if change.detection.mode=none, my code doesn't add the constraint
  • if change.detection.version.required=true, my code will throw NoVersionAttributeException when the attribute defined by change.detection.source is unavailable

What do you think about that?

Side note: my branch doesn't currently handle exceptions on copy (the 412 error condition) properly. This PR helped remind me of that.
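The policy-driven behaviour described in this comment can be sketched roughly as follows. This is a hedged illustration, not the HADOOP-16085 code: `ChangePolicySketch`, `PolicyCopyRequest` and `NoVersionAttributeSketch` are hypothetical names, and the three constructor arguments stand in for the `change.detection.mode`, `change.detection.source` and `change.detection.version.required` options mentioned above.

```java
// Hypothetical stand-in for the NoVersionAttributeException mentioned above.
class NoVersionAttributeSketch extends java.io.IOException {
  NoVersionAttributeSketch(String message) {
    super(message);
  }
}

// Hypothetical, simplified copy request holding only the constraint fields.
class PolicyCopyRequest {
  String sourceVersionId;
  String matchingEtag;
}

class ChangePolicySketch {
  enum Mode { NONE, SERVER }
  enum Source { ETAG, VERSION_ID }

  final Mode mode;
  final Source source;
  final boolean versionRequired;

  ChangePolicySketch(Mode mode, Source source, boolean versionRequired) {
    this.mode = mode;
    this.source = source;
    this.versionRequired = versionRequired;
  }

  /** Add a constraint to the request only when the policy asks for one. */
  void maybeApplyConstraint(PolicyCopyRequest request, String etag,
      String versionId) throws NoVersionAttributeSketch {
    if (mode == Mode.NONE) {
      return;  // change detection disabled: leave the request untouched
    }
    String revisionId = (source == Source.VERSION_ID) ? versionId : etag;
    if (revisionId == null) {
      if (versionRequired) {
        throw new NoVersionAttributeSketch(
            "no " + source + " attribute on source object");
      }
      return;  // attribute missing but not required: copy unconstrained
    }
    if (source == Source.VERSION_ID) {
      request.sourceVersionId = revisionId;
    } else {
      request.matchingEtag = revisionId;
    }
  }
}
```

The main difference from an unconditional constraint is the early return when the mode is `NONE`, plus the explicit failure when a required attribute is missing.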

Contributor

Something else: I wonder about involving ChangeTracker in exception handling so that it can see this condition and increment the version mismatch counter.

}

@Test
public void test413isPreconditions() throws Exception {
Contributor

@noslowerdna noted this somewhere else, but it looks like 413 should be 412 here.

@ben-roling
Contributor

I like the documentation updates. Thanks for adding those :)

enabled.

A full `hadoop-aws` test run implicitly cleans up all files in the bucket
in `ITestS3AContractRootDir`, so all every test run creates a large set of
Contributor

all every -> every

A full `hadoop-aws` test run implicitly cleans up all files in the bucket
in `ITestS3AContractRootDir`, so all every test run creates a large set of
old (deleted) file versions. To avoid large bills, you must
create a lifecycle rule on the bucket to purge the old versions.
Contributor

nevermind, given the specific steps listed below

@steveloughran added the "fs/s3" label (changes related to hadoop-aws; submitter must declare test endpoint) on Mar 28, 2019

/**
* Error message used when mapping a 412 to this exception.
*/

whitespace:end of line

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 32 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 959 | trunk passed |
| +1 | compile | 28 | trunk passed |
| +1 | checkstyle | 21 | trunk passed |
| +1 | mvnsite | 38 | trunk passed |
| +1 | shadedclient | 697 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 39 | trunk passed |
| +1 | javadoc | 23 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 27 | the patch passed |
| +1 | compile | 26 | the patch passed |
| +1 | javac | 26 | the patch passed |
| -0 | checkstyle | 16 | hadoop-tools/hadoop-aws: The patch generated 1 new + 10 unchanged - 0 fixed = 11 total (was 10) |
| +1 | mvnsite | 30 | the patch passed |
| -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 684 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 47 | the patch passed |
| +1 | javadoc | 20 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 276 | hadoop-aws in the patch passed. |
| +1 | asflicense | 23 | The patch does not generate ASF License warnings. |
| | | 3081 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-606/5/artifact/out/Dockerfile |
| GITHUB PR | #606 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux bb1df5e0708f 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 15d38b1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/5/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt |
| whitespace | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/5/artifact/out/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/5/testReport/ |
| Max. process+thread count | 446 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/5/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

…ream read version mismatch counter

Change-Id: I0e3e5ac94592495d078202a0881aef4d89236e81
@steveloughran
Contributor Author

S3 Ireland, s3guard + ddb + auth. Apart from the DynamoDB per-request billing test failures covered in #647, all good.


/**
* Error message used when mapping a 412 to this exception.
*/

whitespace:end of line

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 27 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 971 | trunk passed |
| +1 | compile | 33 | trunk passed |
| +1 | checkstyle | 20 | trunk passed |
| +1 | mvnsite | 36 | trunk passed |
| +1 | shadedclient | 686 | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 65 | trunk passed |
| +1 | javadoc | 25 | trunk passed |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 30 | the patch passed |
| +1 | compile | 28 | the patch passed |
| +1 | javac | 28 | the patch passed |
| -0 | checkstyle | 18 | hadoop-tools/hadoop-aws: The patch generated 1 new + 10 unchanged - 0 fixed = 11 total (was 10) |
| +1 | mvnsite | 32 | the patch passed |
| -1 | whitespace | 0 | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 719 | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 44 | the patch passed |
| +1 | javadoc | 25 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 280 | hadoop-aws in the patch passed. |
| +1 | asflicense | 25 | The patch does not generate ASF License warnings. |
| | | 3153 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-606/6/artifact/out/Dockerfile |
| GITHUB PR | #606 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 396d24996991 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / d7a2f94 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/6/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt |
| whitespace | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/6/artifact/out/whitespace-eol.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/6/testReport/ |
| Max. process+thread count | 411 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-606/6/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

@ben-roling
Contributor

Hey @steveloughran - what would you like to do with this PR? As I mentioned above, I did things a little differently in #646. Would you like this to go in as-is and get replaced by that strategy when #646 is merged? Or should we pull the bit of #646 relevant to this out into this PR? Or do you have any reservations about how this is currently implemented in #646?

break;

// version/etag id cannot be met in copy.
case 412:
@ehiggs Apr 8, 2019

public final static int PRECONDITION_FAILED = 412;

@steveloughran
Contributor Author

#646 is the big one, which does a lot more than this. This one just hardens copy even without s3guard, when you have a large enough file that the transfer manager decides to do bits in parallel.

I think this one is probably the "straightforward to backport version" which we can pull back into the 3.x line without risk. The #646 patch is something big enough it's going to have to go into trunk, and we'll worry about whether to put it into 3.2.x when and only when we are all happy with it working in our day-to-day experience, tests etc.

Now, if this one went in, and I pulled it back to the older versions, would this change be a problem for you, even if your patch just completely stamps on it? Because once the big patch is in, I'm not going to worry about this one, except in those older releases.

@ben-roling
Contributor

Given you don't have a problem with what I'm doing in #646 going over the top of it later, I have no problem with this going in as-is or being patched into older versions.

I will also note though that this seems somewhat inconsistent with changes already in S3AInputStream from HADOOP-15625. There we only apply constraints indicated by the fs.s3a.change.detection configuration whereas here you apply constraints regardless of that config. You probably already realize this, but just wanted to be sure. Still, I'm not bothered enough by the inconsistency to say you shouldn't go ahead with it if you like.

@steveloughran
Contributor Author

> There we only apply constraints indicated by the fs.s3a.change.detection configuration whereas here you apply constraints regardless of that config. You probably already realize this, but just wanted to be sure. Still, I'm not bothered enough by the inconsistency to say you shouldn't go ahead with it if you like.

I couldn't see a way of not making this mandatory, nor why anyone wouldn't want a single copy to be of a single file rather than a mixture of blocks. If I'd known that this problem existed and could be stopped, I'd have had this patch in a long time ago. I'll probably backport all the way to Hadoop 2.8+ for that reason, which is not something I was planning for the input stream code.
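The "mixture of blocks" problem can be sketched with a toy in-memory store. This is only an illustrative simulation under stated assumptions: `StoreSketch` and `MultipartCopySketch` are hypothetical, parts are copied sequentially rather than in parallel, and the overwrite is injected at a fixed point. It shows that an unconstrained multipart copy can stitch together blocks from two versions, while an etag constraint on each part makes the copy fail instead.

```java
// Hypothetical toy object store holding a single object.
class StoreSketch {
  String etag;
  String[] blocks;

  void put(String newEtag, String... newBlocks) {
    this.etag = newEtag;
    this.blocks = newBlocks;
  }

  /** Return one block; fail like a 412 if an etag constraint no longer matches. */
  String copyPart(int index, String requiredEtag) throws java.io.IOException {
    if (requiredEtag != null && !requiredEtag.equals(etag)) {
      throw new java.io.IOException(
          "412 Precondition Failed: etag is now " + etag);
    }
    return blocks[index];
  }
}

class MultipartCopySketch {
  /**
   * Copy the object part by part. Just before the part at index
   * overwriteBeforePart is read, another writer replaces the source.
   * A null requiredEtag means the copy is unconstrained.
   */
  static String copy(StoreSketch store, String requiredEtag,
      int overwriteBeforePart) throws java.io.IOException {
    int parts = store.blocks.length;
    StringBuilder dest = new StringBuilder();
    for (int i = 0; i < parts; i++) {
      if (i == overwriteBeforePart) {
        store.put("etag-2", "B0", "B1", "B2");  // simulated concurrent overwrite
      }
      dest.append(store.copyPart(i, requiredEtag));
    }
    return dest.toString();
  }
}
```

With a source of `A0`, `A1`, `A2` under `etag-1` and the overwrite injected before part 1, `copy(store, null, 1)` returns `A0B1B2`, whereas `copy(store, "etag-1", 1)` throws on the first part read after the overwrite.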

@ben-roling
Contributor

Yep, no arguments here.

@bgaborg

bgaborg commented Apr 13, 2019

I don't see any integration tests added to test this feature, just a unit test for the Invoker. What is the reason for this?

/**
* Error message used when mapping a 412 to this exception.
*/
public static final String PRECONDITIONS_NOT_MET =

The HTTP response status is "Precondition Failed", so we probably want to use the same string and keep the name.

Check the URI. If using a third-party store, verify that you've configured
the client to talk to the specific server in `fs.s3a.endpoint`.

#### <a name="preconditions_unmet"></a> `RemoteFileChangedException`: `Constraints of request were unsatisfiable`
@ehiggs Apr 15, 2019

The HTTP response is "Precondition Failed", not "unmet". That holds even in the AWS code, so I don't know where "unmet" keeps popping up from.

https://github.com/aws/aws-sdk-java/blob/8a36bb38eed08150faf899d8a5dcf33871ef3f2c/aws-java-sdk-models/src/main/resources/models/models.lex-2017-04-19-model.json#L2517


// out of range. This may happen if an object is overwritten with
// a shorter one while it is being read.
case 416:

RANGE_NOT_SATISFIABLE
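A minimal sketch of the named-constant suggestion from the review comments, assuming the constant names proposed there. `HttpStatusSketch` is a hypothetical class; mapping 416 to `java.io.EOFException` and 412 to a plain `IOException` are assumptions made for this example, not the exceptions the patch necessarily raises.

```java
class HttpStatusSketch {
  // Named after the HTTP reason phrases instead of bare numeric literals.
  static final int PRECONDITION_FAILED = 412;
  static final int RANGE_NOT_SATISFIABLE = 416;

  /** Translate a status code into an exception, switching on the named constants. */
  static java.io.IOException translate(int status, String message) {
    switch (status) {
      case PRECONDITION_FAILED:
        // version/etag constraint of the request could not be met
        return new java.io.IOException("Precondition Failed: " + message);
      case RANGE_NOT_SATISFIABLE:
        // e.g. the object was overwritten with a shorter one while being read
        return new java.io.EOFException("Range Not Satisfiable: " + message);
      default:
        return new java.io.IOException("HTTP " + status + ": " + message);
    }
  }
}
```

Using the constants as `case` labels keeps the switch readable and makes a slip such as 413 for 412 easier to spot.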


(argh, hitting cmd-enter turns this into a review and not a comment like it used to)

@steveloughran
Contributor Author

@ehiggs thanks for the comments, will get back to them

@steveloughran
Contributor Author

closing as a WONTFIX; HADOOP-16085 shows up some issues in the AWS SDK which make this a trickier backport than you'd think

shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
…h- and low-level APIs in YARN and standalone environment

@steveloughran deleted the s3/HADOOP-16190-copy-etag branch on October 15, 2021 19:46