
@bogthe
Contributor

@bogthe bogthe commented Feb 2, 2022

Rebased the Access Point feature (the 2 new commits 648a7ff329d7458d97e8b04461ccdf56eab3456a and e234d31af24a375f3942e5ac147ed54e6587dc0f) onto branch-3.3, right before the AWS SDK upgrade. I was hoping for a quick cherry-pick, but changes that come into effect with the removal of S3Guard conflict with it, so I placed it before that.

Apologies if this isn't the right way to do it! Let me know if there's anything else I can do.

Also created a PR that picks this Access Point feature straight into 3.3.2, which might be of help: #3955

@sunchao @steveloughran

bogthe and others added 23 commits February 2, 2022 16:57
Add support for S3 Access Points. This provides extra security as it
ensures applications are not working with buckets belonging to third parties.

To bind a bucket to an access point, set the access point (ap) ARN,
which must be done for each specific bucket, using the pattern

fs.s3a.bucket.$BUCKET.accesspoint.arn = ARN

* The global/bucket option `fs.s3a.accesspoint.required` mandates
that buckets must declare their access point.
* This is not compatible with S3Guard.

Consult the documentation for further details.

Contributed by Bogdan Stolojan
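
As a rough illustration of the option pattern described in the commit message above, here is a minimal Python sketch of how the per-bucket ARN and the `fs.s3a.accesspoint.required` flag could be resolved from a plain dictionary of settings. It is not the S3A implementation: the helper names (`bucket_option_key`, `lookup`, `access_point_arn`), the bucket name and the ARN value are all hypothetical, and the rule that a per-bucket `fs.s3a.bucket.$BUCKET.*` value takes precedence over the global `fs.s3a.*` value is assumed from S3A's usual per-bucket configuration behaviour.

```python
from typing import Mapping, Optional

FS_S3A_PREFIX = "fs.s3a."
FS_S3A_BUCKET_PREFIX = "fs.s3a.bucket."


def bucket_option_key(bucket: str, option: str) -> str:
    """Build the per-bucket form of an fs.s3a option name (hypothetical helper)."""
    return f"{FS_S3A_BUCKET_PREFIX}{bucket}.{option}"


def lookup(conf: Mapping[str, str], bucket: str, option: str) -> Optional[str]:
    """Return the per-bucket value if set, otherwise the global fs.s3a value.

    Assumption: per-bucket settings override global ones.
    """
    per_bucket = conf.get(bucket_option_key(bucket, option))
    if per_bucket is not None:
        return per_bucket
    return conf.get(FS_S3A_PREFIX + option)


def access_point_arn(conf: Mapping[str, str], bucket: str) -> Optional[str]:
    """Return the access point ARN bound to a bucket, or None if there is none.

    Raises ValueError when access points are required but the bucket
    declares no ARN.
    """
    # The ARN is only read from the per-bucket key, as the commit message
    # says it must be set for each specific bucket.
    arn = conf.get(bucket_option_key(bucket, "accesspoint.arn"))
    required = (lookup(conf, bucket, "accesspoint.required") or "false").strip().lower() == "true"
    if arn is None and required:
        raise ValueError(
            f"bucket {bucket!r} has no access point ARN but "
            "fs.s3a.accesspoint.required is true"
        )
    return arn


if __name__ == "__main__":
    # Illustrative values only: the bucket name and ARN are made up.
    conf = {
        "fs.s3a.bucket.sample-bucket.accesspoint.arn":
            "arn:aws:s3:eu-west-1:123456789012:accesspoint/sample-ap",
        "fs.s3a.accesspoint.required": "true",
    }
    print(access_point_arn(conf, "sample-bucket"))
```

With the global flag set to `true`, asking for a bucket that has no ARN configured raises an error in this sketch, which mirrors the "buckets must declare their access point" behaviour described above.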
…he#3516)


Follow-on to HADOOP-17198. Support S3 Access Points

Contributed by Bogdan Stolojan
With this update, the versions of key shaded dependencies are

  jackson    2.12.3
  httpclient 4.5.13

This backport patch does not include the TestArn changes needed
for the test to work with this version of the SDK; it is only
to be applied to branches without HADOOP-17198. "Support S3 Access Points".
If that patch is backported later, that test suite MUST be
updated to the latest version.

Contributed by Steve Loughran

Change-Id: I8d2b71781ee8472b16469531f9cd0de32dd3356f
Completely removes S3Guard support from the S3A codebase.

If the connector is configured to use any metastore other than
the null and local stores (i.e. DynamoDB is selected) the s3a client
will raise an exception and refuse to initialize.

This is to ensure that there is no mix of S3Guard enabled and disabled
deployments with the same configuration but different hadoop releases
-it must be turned off completely.

The "hadoop s3guard" command has been retained -but the supported
subcommands have been reduced to those which are not purely S3Guard
related: "bucket-info" and "uploads".

This is a major change in terms of the number of files
changed; before cherry picking subsequent s3a patches into
older releases, this patch will probably need backporting
first.

Goodbye S3Guard, your work is done. Time to die.

Contributed by Steve Loughran.
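
The "refuse to initialize" rule in the commit message above can be pictured with a small Python sketch. This is only an illustration under assumptions, not the code in the patch: the option name `fs.s3a.metadatastore.impl` and the two permitted class names are recalled from the S3Guard documentation rather than taken from this PR, and `check_no_s3guard` is a hypothetical helper.

```python
from typing import Mapping

# Assumption: the metastore is selected through this option.
METADATASTORE_KEY = "fs.s3a.metadatastore.impl"

# Assumption: class names of the null and local stores that remain acceptable.
PERMITTED_STORES = frozenset({
    "org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore",
    "org.apache.hadoop.fs.s3a.s3guard.LocalMetadataStore",
})


def check_no_s3guard(conf: Mapping[str, str]) -> None:
    """Raise if the configuration selects a metastore other than null/local.

    An unset or empty option is treated as "no metastore" and accepted.
    """
    store = conf.get(METADATASTORE_KEY, "").strip()
    if store and store not in PERMITTED_STORES:
        raise RuntimeError(
            f"S3Guard is no longer supported; unset {METADATASTORE_KEY} "
            f"(found {store!r})"
        )


if __name__ == "__main__":
    check_no_s3guard({})  # accepted: nothing configured
    try:
        check_no_s3guard({
            METADATASTORE_KEY:
                "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore",
        })
    except RuntimeError as e:
        print("refused:", e)
```

Failing fast at initialization, rather than silently ignoring the setting, matches the stated goal of avoiding a mix of S3Guard-enabled and S3Guard-disabled deployments sharing one configuration.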
…mically (apache#3228). Contributed by Viraj Jasani.

Signed-off-by: He Xiaoqiao <[email protected]>
(cherry picked from commit b038042)
Reviewed-by: Viraj Jasani <[email protected]>
Signed-off-by: Takanobu Asanuma <[email protected]>
(cherry picked from commit f02374d)
…e read correctly (apache#3903)

Contributed by: Anmol Asrani

Change-Id: I6e71bf349a74032f453398c7ae66f9c3305be190
…Report (apache#3714). Contributed by liubingxing.

Signed-off-by: He Xiaoqiao <[email protected]>
(cherry picked from commit d8dea6f)
…n-ui (apache#3890)

Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.13.3 to 1.14.7.
- [Release notes](https://github.com/follow-redirects/follow-redirects/releases)
- [Commits](follow-redirects/follow-redirects@v1.13.3...v1.14.7)

---
updated-dependencies:
- dependency-name: follow-redirects
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Akira Ajisaka <[email protected]>
(cherry picked from commit dae33cf)
See HADOOP-18091. S3A auditing leaks memory through ThreadLocal references

* Adds a new option fs.s3a.audit.enabled to control whether or not auditing
is enabled. This is false by default.

* When false, the S3A auditing manager is NoopAuditManagerS3A,
which was formerly only used for unit tests and
during filesystem initialization.

* When true, ActiveAuditManagerS3A is used for managing auditing,
allowing auditing events to be reported.

* updates documentation and tests.

This patch does not fix the underlying leak. When auditing is enabled,
long-lived threads will retain references to the audit managers
of S3A filesystem instances which have already been closed.

Contributed by Steve Loughran.

Change-Id: I671e594cd59e8ca77a1f65be791ad0ae9530b8d9
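
The switch described in the commit message above amounts to picking one of two audit managers from a boolean option. A minimal Python sketch of that selection follows; the function `select_audit_manager` is hypothetical and simply returns the class name as a string, and the parsing of the option value (only a case-insensitive "true" enables auditing) is an assumption.

```python
from typing import Mapping

AUDIT_ENABLED_KEY = "fs.s3a.audit.enabled"


def select_audit_manager(conf: Mapping[str, str]) -> str:
    """Return the name of the audit manager that would be used.

    The option defaults to false, as stated in the commit message, so an
    unset option selects the no-op manager.
    """
    enabled = conf.get(AUDIT_ENABLED_KEY, "false").strip().lower() == "true"
    return "ActiveAuditManagerS3A" if enabled else "NoopAuditManagerS3A"


if __name__ == "__main__":
    print(select_audit_manager({}))
    print(select_audit_manager({AUDIT_ENABLED_KEY: "true"}))
```

Note that, per the commit message, turning auditing on still exposes the ThreadLocal leak; the option only lets deployments avoid it by leaving auditing off.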
…an EC policy (apache#3899). Contributed by daimin.

Reviewed-by: tomscut <[email protected]>
Signed-off-by: Ayush Saxena <[email protected]>
(cherry picked from commit 5ef335d)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ErasureCodingPolicyManager.java
…3883) (apache#3924)

Reviewed-by: litao <[email protected]>
Signed-off-by: Takanobu Asanuma <[email protected]>
(cherry picked from commit db2c320)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerWithHANameNodes.java

Co-authored-by: qinyuren <[email protected]>
(cherry picked from commit c2ff390)

 Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
…() in ViewFsBaseTest.java (apache#3918). Contributed by Xing Lin. (apache#3929)

Signed-off-by: Ayush Saxena <[email protected]>
(cherry picked from commit 0d17b62)
…lication catalog webapp (apache#2591)

Reviewed-by: Masatake Iwasaki <[email protected]>
(cherry picked from commit 9cb535c)
…che#3850)

Reviewed-by: Fei Hui <[email protected]>
Reviewed-by: litao <[email protected]>
Signed-off-by: Akira Ajisaka <[email protected]>
(cherry picked from commit 39cad5f)
…queues a DatanodeDescriptor on exception (apache#3942)

Signed-off-by: Akira Ajisaka <[email protected]>
(cherry picked from commit 089e06d)
@sunchao
Member

sunchao commented Feb 2, 2022

This seems too disruptive. @bogthe is there any way we can resolve the conflicts due to the removal of S3Guard?

@bogthe
Contributor Author

bogthe commented Feb 3, 2022

@sunchao not really. Whatever you resolve, you can't then pick it onto 3.3.2, since the changes applied onto 3.3 have to be modified because of the S3Guard removal.

The only non-disruptive way is to pick straight into 3.3.2, like this PR is trying to do: #3955

@sunchao
Member

sunchao commented Feb 3, 2022

I see. I'm slightly in favor of putting this in branch-3.3 for now, and then releasing it in Hadoop 3.3.3. What do you think?

My apologies that I didn't realize the matter is a bit complex due to the difference between branch-3.3 and branch-3.3.2. Since at this point we're wrapping up the release of 3.3.2, I'd prefer to avoid extra complexities which could delay the release further.

@steveloughran
Contributor

the 3.3.2 fix isn't that big, it's that because the s3guard removal is in branch-3.3, there's a bit more toe stepping going on there

@bogthe
Contributor Author

bogthe commented Feb 4, 2022

> the 3.3.2 fix isn't that big, it's that because the s3guard removal is in branch-3.3, there's a bit more toe stepping going on there

@sunchao would like to close this PR and use Steve's new one (basically does the thing this one intended to 😄 )

@steveloughran
Contributor

closing this.

@sunchao
Member

sunchao commented Feb 4, 2022

Oh cool, thanks @steveloughran !

@steveloughran
Contributor

merged the other one in
