Backmerge of apache/hadoop:trunk into sumangala-patki/hadoop HADOOP-17912: Conflict resolution #8

Open

saxenapranav wants to merge 124 commits into sumangala17:HADOOP-17912 from saxenapranav:HADOOP-17912-backmerge

Conversation

@saxenapranav

Description of PR

The PR raised for HADOOP-17912 (https://issues.apache.org/jira/browse/HADOOP-17912), apache#3440, has a conflict with apache/hadoop:trunk and hence can't be merged.

This PR backmerges the trunk commits into the sumangala-patki/hadoop:HADOOP-17912 branch and resolves the conflict.

How was this patch tested?

This was tested by running the integration tests against Azure storage accounts (US East).
Non-HNS Endpoint: pranavsaxenanonhns.blob.core.windows.net
HNS Endpoint: pranavsaxenahns.dfs.core.windows.net

The following is the test result:

:::: AGGREGATED TEST RESULT ::::

HNS-OAuth
========================
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 107, Failures: 1, Errors: 0, Skipped: 2
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   ITestAbfsStatistics.testOpenAppendRenameExists:244->Assert.assertTrue:53->Assert.assertTrue:42->Assert.fail:87: THIS SUCCEEDED WHEN RAN INDIVIDUALLY
[INFO]
[ERROR] Tests run: 574, Failures: 1, Errors: 0, Skipped: 26
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   ITestAbfsReadWriteAndSeek.testReadAndWriteWithDifferentBufferSizesAndSeek:69->testReadWriteAndSeek:110 [Retry was required due to issue on server side] expected:<[0]> but was:<[1]>: THIS SUCCEEDED WHEN RAN INDIVIDUALLY
[INFO]
[ERROR] Tests run: 332, Failures: 1, Errors: 0, Skipped: 41

HNS-SharedKey
========================
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 107, Failures: 1, Errors: 0, Skipped: 2
[INFO] Results:
[INFO]
[WARNING] Tests run: 574, Failures: 0, Errors: 0, Skipped: 26
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   ITestAbfsReadWriteAndSeek.testReadAndWriteWithDifferentBufferSizesAndSeek:69->testReadWriteAndSeek:110 [Retry was required due to issue on server side] expected:<[0]> but was:<[1]>: THIS SUCCEEDED WHEN RAN INDIVIDUALLY
[INFO]
[ERROR] Tests run: 332, Failures: 1, Errors: 0, Skipped: 41

NonHNS-SharedKey
========================
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 107, Failures: 1, Errors: 0, Skipped: 2
[INFO] Results:
[INFO]
[WARNING] Tests run: 559, Failures: 0, Errors: 0, Skipped: 268
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   ITestAbfsReadWriteAndSeek.testReadAndWriteWithDifferentBufferSizesAndSeek:69->testReadWriteAndSeek:110 [Retry was required due to issue on server side] expected:<[0]> but was:<[1]>: THIS SUCCEEDED WHEN RAN INDIVIDUALLY
[ERROR]   ITestAbfsRenameStageFailure>TestRenameStageFailure.testResilienceAsExpected:126 [resilient commit support] expected:<[tru]e> but was:<[fals]e>
[ERROR]   ITestAbfsTerasort.test_110_teragen:244->executeStage:211->Assert.assertEquals:647->Assert.failNotEquals:835->Assert.fail:89 teragen(1000, abfs://testcontainer@pranavsaxenanonhns.dfs.core.windows.net/ITestAbfsTerasort/sortin) failed expected:<0> but was:<1>
[ERROR] Errors:
[ERROR]   ITestAbfsJobThroughManifestCommitter.test_0420_validateJob » OutputValidation ...
[ERROR]   ITestAbfsManifestCommitProtocol.testCommitLifecycle » OutputValidation `abfs:/...
[ERROR]   ITestAbfsManifestCommitProtocol.testCommitterWithDuplicatedCommit » OutputValidation
[ERROR]   ITestAbfsManifestCommitProtocol.testConcurrentCommitTaskWithSubDir » OutputValidation
[ERROR]   ITestAbfsManifestCommitProtocol.testMapFileOutputCommitter » OutputValidation ...
[ERROR]   ITestAbfsManifestCommitProtocol.testOutputFormatIntegration » OutputValidation
[ERROR]   ITestAbfsManifestCommitProtocol.testParallelJobsToAdjacentPaths » OutputValidation
[ERROR]   ITestAbfsManifestCommitProtocol.testTwoTaskAttemptsCommit » OutputValidation `...
[INFO]
[ERROR] Tests run: 332, Failures: 3, Errors: 8, Skipped: 46

AppendBlob-HNS-OAuth
========================
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 107, Failures: 1, Errors: 0, Skipped: 2
[INFO] Results:
[INFO]
[WARNING] Tests run: 574, Failures: 0, Errors: 0, Skipped: 26
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR]   ITestAbfsReadWriteAndSeek.testReadAndWriteWithDifferentBufferSizesAndSeek:69->testReadWriteAndSeek:110 [Retry was required due to issue on server side] expected:<[0]> but was:<[1]>: THIS SUCCEEDED WHEN RAN INDIVIDUALLY
[INFO]
[ERROR] Tests run: 332, Failures: 1, Errors: 0, Skipped: 41

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

szilard-nemeth and others added 30 commits May 11, 2022 14:29
… when moving application across queues. Contributed by Andras Gyori
… filtering for one. Contributed by Benjamin Teke
…itted queue name. Contributed by Szilard Nemeth
* Remove redundant strings.h inclusions

* strings.h was included in a bunch of
  C/C++ files where it was redundant.
* Also, strings.h is not available on
  Windows and thus isn't cross-platform
  compatible.

* Build for all platforms in CI

* Revert "Build for all platforms in CI"

This reverts commit 2650f047bd6791a5908cfbe50cc8e70d42c512cb.

* Debug failure on Centos 8

* Skipping pipeline run on
  Centos 7 to debug the
  failure on Centos 8.

* Revert "Debug failure on Centos 8"

This reverts commit e365e34.
Fixes apache#4181

Signed-off-by: Owen O'Malley <oomalley@linkedin.com>
…adoop.metrics2.MetricsException and subsequent java.net.BindException: Address already in use. Contributed by Szilard Nemeth
…n corner cases (apache#4110)

Co-authored-by: Jian Chen <jian.chen@airbnb.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
…che#4280)

* Update QueueConfigurationsPBImpl.java

* Update TestPBImplRecords.java

* Update TestPBImplRecords.java

* Update TestPBImplRecords.java

* Update TestPBImplRecords.java
* Add the changelog and release notes
* add all jdiff XML files
* update the project pom with the new stable version

Change-Id: Iaea846c3e451bbd446b45de146845a48953d580d
Upgrade Apache Xerces Java to 2.12.2 to handle vulnerability CVE-2022-23437

Contributed by Ashutosh Gupta
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
Signed-off-by: Akira Ajisaka <aajisaka@apache.org>
hotcodemacha and others added 30 commits June 20, 2022 11:14
Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
…pache#4365)

* HADOOP-18266. Using HashSet/TreeSet Constructor for hadoop-common

Co-authored-by: Deb <dbsamrat@3c22fba1b03f.ant.amazon.com>
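
For readers unfamiliar with the HADOOP-18266 cleanup above, the change is a straightforward one: replacing create-then-addAll with the collection copy constructor. A minimal illustrative sketch (the names here are mine, not from the patch):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public final class SetConstructorSketch {
  public static void main(String[] args) {
    List<String> hosts = Arrays.asList("nn1", "nn2", "dn1");

    // Before: create an empty set, then copy into it.
    Set<String> before = new HashSet<>();
    before.addAll(hosts);

    // After: the copy constructor sizes the backing table up front
    // and performs the copy in a single step.
    Set<String> after = new HashSet<>(hosts);

    System.out.println(before.equals(after)); // true
  }
}
```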
…r due to HDFS-16563 (apache#4428). Contributed by fanshilun.

Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
…e on MultipleOutputs#close (apache#4247)

Contributed by Ravuri Sushma sree.

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
…ches its max-parallel-apps limit. Contributed by Andras Gyori
…ontributed by Viraj Jasani.

Reviewed-by: Tao Li <tomscut@apache.org>
Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
…ges (apache#4436)

* YARN-9971. YARN Native Service HttpProbe logs THIS_HOST in error messages

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
…-16202 (apache#4472)


Fixing a mockito-based test which broke when HADOOP-16202
changed the methods being invoked.

Contributed by Steve Loughran
…d even if multiple log aggregation file controllers are configured. Contributed by Szilard Nemeth.
part of HADOOP-18103.
Add support for a multiple-range vectored read API in PositionedReadable.
The default implementation iterates through the ranges, reading each synchronously,
but the intent is that FSDataInputStream subclasses can provide more
efficient readers, especially in object store implementations.

Also added an implementation in S3A where smaller ranges are merged and
sliced byte buffers are returned to the readers. All the merged ranges are
fetched from S3 asynchronously.

Contributed By: Owen O'Malley and Mukund Thakur
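
For context, here is a minimal sketch of how a caller can exercise the vectored read API described above. It assumes the names that landed with the feature branch (FileRange.createFileRange, FSDataInputStream.readVectored, and the CompletableFuture-returning getData()); treat it as an illustration rather than a reference:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileRange;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class VectoredReadSketch {
  public static void main(String[] args) throws Exception {
    Path path = new Path(args[0]);
    FileSystem fs = path.getFileSystem(new Configuration());

    // Two disjoint ranges of the same file, requested in one vectored call.
    List<FileRange> ranges = new ArrayList<>();
    ranges.add(FileRange.createFileRange(0, 4096));
    ranges.add(FileRange.createFileRange(1 << 20, 4096));

    try (FSDataInputStream in = fs.open(path)) {
      // The default implementation reads each range sequentially;
      // object store clients such as S3A may fetch them in parallel.
      in.readVectored(ranges, ByteBuffer::allocate);
      for (FileRange range : ranges) {
        ByteBuffer data = range.getData().get(); // blocks until this range completes
        System.out.println("read " + data.remaining()
            + " bytes at offset " + range.getOffset());
      }
    }
  }
}
```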
… maxReadSizeForVectorReads (apache#3964)

Part of HADOOP-18103.
Introduces fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size
to configure the minimum seek and maximum merged read size during a vectored IO
operation in the S3A connector. These properties define how the ranges will be
merged. To completely disable merging, set fs.s3a.vectored.read.max.merged.size to 0.

Contributed By: Mukund Thakur
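
A hedged sketch of how these properties might be set programmatically; the values below are illustrative, not the shipped defaults:

```java
import org.apache.hadoop.conf.Configuration;

public final class S3AVectoredReadConfig {
  public static Configuration withVectoredReadTuning() {
    Configuration conf = new Configuration();
    // Ranges separated by less than the min seek size are merged into one GET.
    conf.set("fs.s3a.vectored.read.min.seek.size", "4K");
    // Upper bound on how large a merged range may grow; 0 disables merging.
    conf.set("fs.s3a.vectored.read.max.merged.size", "1M");
    return conf;
  }
}
```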
part of HADOOP-18103.
Required for the vectored IO feature. Neither of the current buffer pool
implementations is complete: ElasticByteBufferPool doesn't use
weak references and can lead to memory leaks, and
DirectBufferPool doesn't support caller preferences for direct
or heap buffers and only offers fixed-length buffers.

Contributed By: Mukund Thakur
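
To make the weak-reference idea concrete, here is an illustrative sketch of a buffer pool that honors direct/heap preferences and lets the GC reclaim idle buffers; it is not the Hadoop implementation:

```java
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

public class WeakBufferPoolSketch {
  // Separate pools for direct and heap buffers, keyed by capacity.
  // Weak references let the GC reclaim pooled buffers under memory pressure.
  private final TreeMap<Integer, WeakReference<ByteBuffer>> directPool = new TreeMap<>();
  private final TreeMap<Integer, WeakReference<ByteBuffer>> heapPool = new TreeMap<>();

  public synchronized ByteBuffer getBuffer(boolean direct, int length) {
    TreeMap<Integer, WeakReference<ByteBuffer>> pool = direct ? directPool : heapPool;
    // Reuse the smallest pooled buffer that is at least `length` bytes.
    Map.Entry<Integer, WeakReference<ByteBuffer>> entry = pool.ceilingEntry(length);
    if (entry != null) {
      pool.remove(entry.getKey());
      ByteBuffer cached = entry.getValue().get();
      if (cached != null) {        // may already have been collected
        cached.clear();
        return cached;
      }
    }
    return direct ? ByteBuffer.allocateDirect(length) : ByteBuffer.allocate(length);
  }

  public synchronized void putBuffer(ByteBuffer buffer) {
    // Note: a real pool would keep multiple buffers per capacity;
    // this sketch keeps at most one to stay short.
    (buffer.isDirect() ? directPool : heapPool)
        .put(buffer.capacity(), new WeakReference<>(buffer));
  }
}
```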
…#4445)

part of HADOOP-18103.
Handles memory fragmentation in the S3A vectored IO implementation by
allocating buffers of the smaller, user-requested range sizes and directly
filling them from the remote S3 stream, skipping the undesired
data in between ranges.
This patch also aborts active vectored reads when the stream is
closed or unbuffer() is called.

Contributed By: Mukund Thakur
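
As an illustration of the allocation strategy described above (exact-size per-range buffers filled from a single merged-range stream, with the gaps skipped), here is a hedged sketch; it is not the S3A code:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.hadoop.io.IOUtils;

public final class FragmentationSketch {
  /**
   * ranges[i] = {offset, length}, sorted by offset; streamStart is the file
   * offset at which the merged-range stream begins.
   */
  public static byte[][] readChildRanges(InputStream stream, long[][] ranges,
      long streamStart) throws IOException {
    byte[][] out = new byte[ranges.length][];
    long pos = streamStart;
    DataInputStream in = new DataInputStream(stream);
    for (int i = 0; i < ranges.length; i++) {
      IOUtils.skipFully(in, ranges[i][0] - pos); // discard the gap before this range
      out[i] = new byte[(int) ranges[i][1]];     // buffer sized to the user's request
      in.readFully(out[i]);                      // filled directly from the stream
      pos = ranges[i][0] + ranges[i][1];
    }
    return out;
  }
}
```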
This feature adds methods for ranged vectored read operations
in PositionedReadable.

All streams which implement that interface support the new API.

The default implementation reads each range in the vector
sequentially.

However, specific implementations may provide higher performance
versions. This is done in two places

* Local FileSystem/Checksum FileSystem
* The S3A client.

The S3A client first coalesces adjacent and "nearby" ranges
together, then fetches each range in separate HTTP GET requests,
executed in parallel. As such it delivers significant speedups
to applications reading separate blocks of data from the same
file, columnar data format libraries in particular.

This is the merge commit of the feature branch; the work is in

HADOOP-11867. Add a high-performance vectored read API.
HADOOP-18104. S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads.
HADOOP-18107. Adding scale test for vectored reads for large file
HADOOP-18105. Implement buffer pooling with weak references.
HADOOP-18106. Handle memory fragmentation in S3A Vectored IO.

Contributed By: Owen O'Malley and Mukund Thakur
Reviewed-by: Ayush Saxena <ayushsaxena@apache.org>
Signed-off-by: Chris Nauroth <cnauroth@apache.org>
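
The coalescing step mentioned above can be pictured with a short sketch: sort the ranges, then merge neighbors whose gap is below a minimum-seek threshold while keeping each merged range under a maximum size. This is an illustration of the idea, not the S3A code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public final class CoalesceSketch {
  public static final class Range {
    final long offset;
    final long length;
    Range(long offset, long length) { this.offset = offset; this.length = length; }
    long end() { return offset + length; }
  }

  public static List<Range> coalesce(List<Range> input, long minSeek, long maxMerged) {
    List<Range> sorted = new ArrayList<>(input);
    sorted.sort(Comparator.comparingLong((Range r) -> r.offset));
    List<Range> merged = new ArrayList<>();
    Range current = null;
    for (Range r : sorted) {
      if (current != null
          && r.offset - current.end() < minSeek        // "nearby" enough to merge
          && r.end() - current.offset <= maxMerged) {  // merged size stays bounded
        current = new Range(current.offset,
            Math.max(current.end(), r.end()) - current.offset);
      } else {
        if (current != null) {
          merged.add(current);
        }
        current = r;
      }
    }
    if (current != null) {
      merged.add(current);
    }
    return merged;
  }
}
```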
)

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
…e#4484). Contributed by fanshilun.

Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>
…gation (apache#4486)

* YARN-10320. Replace FSDataInputStream#read with readFully in Log Aggregation

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
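
The motivation for the readFully change above is that InputStream.read(byte[]) may legally return fewer bytes than requested without signaling any error, whereas readFully loops until the buffer is completely filled and throws EOFException if the stream ends first. A brief sketch of the contrast:

```java
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;

public final class ReadFullySketch {
  /** Fragile: read() may return fewer than buf.length bytes,
   *  silently leaving the tail of buf unfilled. */
  static int partialRead(FSDataInputStream in, byte[] buf) throws IOException {
    return in.read(buf);
  }

  /** Robust: readFully loops internally until buf is completely filled,
   *  throwing EOFException if the stream ends before that. */
  static void fullRead(FSDataInputStream in, byte[] buf) throws IOException {
    in.readFully(buf);
  }
}
```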
…onStore#confirmMutation (apache#4487)

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
…n some cases (apache#4452)

* HDFS-16633. Reserved Space For Replicas is not released in some cases

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>
Update the dependencies of the LDAP libraries used for testing:

ldap-api.version = 2.0.0
apacheds.version = 2.0.0.AM26

Contributed by Colm O hEigeartaigh.
…omplete state (apache#4331)


ABFS rename fails intermittently when the storage-blob tracking
metadata is in an incomplete state. This surfaces as error code
404 with an error message of "RenameDestinationParentPathNotFound".

To mitigate this issue, when a request fails with this response,
the ABFS client issues a HEAD call on the source file
and then retries the rename operation.

ABFS filesystem statistics track when this occurs with new counters:
  rename_recovery
  metadata_incomplete_rename_failures
  rename_path_attempts

This is a very rare occurrence and appears to be triggered under certain
heavy-load conditions, just as with HADOOP-18163.

Contributed by Mehakmeet Singh.
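
A hedged sketch of the recovery flow described in this commit message; every name here (renameOnce, headOnSource, StorageError) is hypothetical and stands in for the real ABFS client internals:

```java
import java.io.IOException;

public abstract class RenameRecoverySketch {
  /** Hypothetical stand-in for the service error carrying the HTTP status
   *  and the storage error code. */
  public static class StorageError extends IOException {
    final int status;
    final String code;
    StorageError(int status, String code) { this.status = status; this.code = code; }
  }

  abstract void renameOnce(String src, String dst) throws IOException;  // hypothetical
  abstract void headOnSource(String src) throws IOException;            // hypothetical

  boolean renameWithRecovery(String src, String dst) throws IOException {
    try {
      renameOnce(src, dst);
      return true;
    } catch (StorageError e) {
      if (e.status == 404
          && "RenameDestinationParentPathNotFound".equals(e.code)) {
        headOnSource(src);      // refresh the incomplete tracking metadata
        renameOnce(src, dst);   // then retry the rename (counted as rename_recovery)
        return true;
      }
      throw e;
    }
  }
}
```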
…user not present on client (apache#4474). Contributed by swamirishi.
…HBase is down (apache#4492)

Co-authored-by: Ashutosh Gupta <ashugpt@amazon.com>