Closed
23 commits
648a7ff
HADOOP-17198. Support S3 Access Points (#3260)
bogthe Sep 29, 2021
e234d31
HADOOP-17951. Improve S3A checking of S3 Access Point existence (#3516)
bogthe Oct 4, 2021
566506e
HADOOP-18068. upgrade AWS SDK to 1.12.132 (#3864)
steveloughran Jan 18, 2022
ebb2d89
HADOOP-17409. Remove s3guard from S3A module (#3534)
steveloughran Jan 18, 2022
9ec25f9
HDFS-16426. Fix nextBlockReportTime when trigger full block report fo…
liubingxing Jan 19, 2022
804b71b
HDFS-16139. Update BPServiceActor Scheduler's nextBlockReportTime ato…
virajjasani Jul 27, 2021
09f1f95
HDFS-16331. Make dfs.blockreport.intervalMsec reconfigurable (#3676)
tomscut Dec 3, 2021
46750d5
HDFS-16400. Reconfig DataXceiver parameters for datanode (#3843)
tomscut Jan 14, 2022
73fdb84
HDFS-16399. Reconfig cache report parameters for datanode (#3841)
tomscut Jan 19, 2022
9b173ee
HADOOP-18084. ABFS: Add testfilePath while verifying test contents ar…
anmolanmol1234 Jan 19, 2022
405b65a
HDFS-16352. return the real datanode numBlocks in #getDatanodeStorage…
liubingxing Dec 17, 2021
43b10d5
YARN-11065. Bump follow-redirects from 1.13.3 to 1.14.7 in hadoop-yar…
dependabot[bot] Jan 20, 2022
c376714
HADOOP-18094. Disable S3A auditing by default.
steveloughran Jan 24, 2022
2e235a7
HDFS-16430. Add validation to maximum blocks in EC group when adding …
cndaimin Jan 24, 2022
a70a53e
HDFS-16262. Async refresh of cached locations in DFSInputStream (#3527)
bbeaudreault Jan 25, 2022
05ddaff
HDFS-16423. Balancer should not get blocks on stale storages (#3883) …
jojochuang Jan 26, 2022
707364c
HDFS-16398. Reconfig block report parameters for datanode (#3831)
tomscut Jan 26, 2022
3917cb9
HADOOP-18093. Better exception handling for testFileStatusOnMountLink…
xinglin Jan 26, 2022
aa32a77
HDFS-16427. Add debug log for BlockManager#chooseExcessRedundancyStri…
tomscut Jan 27, 2022
25d5865
YARN-10561. Upgrade node.js to 12.22.1 and yarn to 1.22.5 in YARN app…
aajisaka Jan 28, 2022
ceb2b7a
HDFS-16169. Fix TestBlockTokenWithDFSStriped#testEnd2End failure (#3850)
secfree Jan 28, 2022
174eb8b
HDFS-16303. Improve handling of datanode lost while decommissioning (…
KevinWikant Jan 31, 2022
2e29b1c
HDFS-16443. Fix edge case where DatanodeAdminDefaultMonitor doubly en…
KevinWikant Jan 31, 2022
2 changes: 1 addition & 1 deletion NOTICE-binary
@@ -66,7 +66,7 @@ available from http://www.digip.org/jansson/.


AWS SDK for Java
-Copyright 2010-2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+Copyright 2010-2022 Amazon.com, Inc. or its affiliates. All Rights Reserved.

This product includes software developed by
Amazon Technologies, Inc (http://www.amazon.com/).
197 changes: 11 additions & 186 deletions hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
@@ -1220,7 +1220,7 @@
com.amazonaws.auth.AWSCredentialsProvider.

When S3A delegation tokens are not enabled, this list will be used
-to directly authenticate with S3 and DynamoDB services.
+to directly authenticate with S3 and other AWS services.
When S3A Delegation tokens are enabled, depending upon the delegation
token binding it may be used
to communicate wih the STS endpoint to request session/role
@@ -1590,6 +1590,14 @@
implementations can still be used</description>
</property>

+<property>
+<name>fs.s3a.accesspoint.required</name>
+<value>false</value>
+<description>Require that all S3 access is made through Access Points and not through
+buckets directly. If enabled, use per-bucket overrides to allow bucket access to a specific set
+of buckets.</description>
+</property>
+
<property>
<name>fs.s3a.block.size</name>
<value>32M</value>
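
The description of `fs.s3a.accesspoint.required` in the hunk above mentions per-bucket overrides without showing one. A minimal sketch of what that could look like in a site configuration, assuming S3A's usual `fs.s3a.bucket.<bucket-name>.<option>` per-bucket convention; the bucket name `example-bucket` is hypothetical and not part of this change:

```xml
<!-- Require Access Points for all S3A access by default. -->
<property>
  <name>fs.s3a.accesspoint.required</name>
  <value>true</value>
</property>

<!-- Assumed per-bucket override: allow direct access to one bucket only.
     "example-bucket" is a placeholder name. -->
<property>
  <name>fs.s3a.bucket.example-bucket.accesspoint.required</name>
  <value>false</value>
</property>
```

With settings along these lines, direct bucket access would be expected to stay blocked everywhere except the bucket named in the override.
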
@@ -1669,180 +1677,18 @@
</description>
</property>

-<property>
-<name>fs.s3a.metadatastore.authoritative</name>
-<value>false</value>
-<description>
-When true, allow MetadataStore implementations to act as source of
-truth for getting file status and directory listings. Even if this
-is set to true, MetadataStore implementations may choose not to
-return authoritative results. If the configured MetadataStore does
-not support being authoritative, this setting will have no effect.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.metadatastore.metadata.ttl</name>
-<value>15m</value>
-<description>
-This value sets how long an entry in a MetadataStore is valid.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.metadatastore.impl</name>
-<value>org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore</value>
-<description>
-Fully-qualified name of the class that implements the MetadataStore
-to be used by s3a. The default class, NullMetadataStore, has no
-effect: s3a will continue to treat the backing S3 service as the one
-and only source of truth for file and directory metadata.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.metadatastore.fail.on.write.error</name>
-<value>true</value>
-<description>
-When true (default), FileSystem write operations generate
-org.apache.hadoop.fs.s3a.MetadataPersistenceException if the metadata
-cannot be saved to the metadata store. When false, failures to save to
-metadata store are logged at ERROR level, but the overall FileSystem
-write operation succeeds.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.cli.prune.age</name>
-<value>86400000</value>
-<description>
-Default age (in milliseconds) after which to prune metadata from the
-metadatastore when the prune command is run. Can be overridden on the
-command-line.
-</description>
-</property>
-
-
<property>
<name>fs.s3a.impl</name>
<value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
<description>The implementation class of the S3A Filesystem</description>
</property>

-<property>
-<name>fs.s3a.s3guard.ddb.region</name>
-<value></value>
-<description>
-AWS DynamoDB region to connect to. An up-to-date list is
-provided in the AWS Documentation: regions and endpoints. Without this
-property, the S3Guard will operate table in the associated S3 bucket region.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.table</name>
-<value></value>
-<description>
-The DynamoDB table name to operate. Without this property, the respective
-S3 bucket name will be used.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.table.create</name>
-<value>false</value>
-<description>
-If true, the S3A client will create the table if it does not already exist.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.table.capacity.read</name>
-<value>0</value>
-<description>
-Provisioned throughput requirements for read operations in terms of capacity
-units for the DynamoDB table. This config value will only be used when
-creating a new DynamoDB table.
-If set to 0 (the default), new tables are created with "per-request" capacity.
-If a positive integer is provided for this and the write capacity, then
-a table with "provisioned capacity" will be created.
-You can change the capacity of an existing provisioned-capacity table
-through the "s3guard set-capacity" command.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.table.capacity.write</name>
-<value>0</value>
-<description>
-Provisioned throughput requirements for write operations in terms of
-capacity units for the DynamoDB table.
-If set to 0 (the default), new tables are created with "per-request" capacity.
-Refer to related configuration option fs.s3a.s3guard.ddb.table.capacity.read
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.table.sse.enabled</name>
-<value>false</value>
-<description>
-Whether server-side encryption (SSE) is enabled or disabled on the table.
-By default it's disabled, meaning SSE is set to AWS owned CMK.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.table.sse.cmk</name>
-<value/>
-<description>
-The KMS Customer Master Key (CMK) used for the KMS encryption on the table.
-To specify a CMK, this config value can be its key ID, Amazon Resource Name
-(ARN), alias name, or alias ARN. Users only need to provide this config if
-the key is different from the default DynamoDB KMS Master Key, which is
-alias/aws/dynamodb.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.max.retries</name>
-<value>9</value>
-<description>
-Max retries on throttled/incompleted DynamoDB operations
-before giving up and throwing an IOException.
-Each retry is delayed with an exponential
-backoff timer which starts at 100 milliseconds and approximately
-doubles each time. The minimum wait before throwing an exception is
-sum(100, 200, 400, 800, .. 100*2^N-1 ) == 100 * ((2^N)-1)
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.throttle.retry.interval</name>
-<value>100ms</value>
-<description>
-Initial interval to retry after a request is throttled events;
-the back-off policy is exponential until the number of retries of
-fs.s3a.s3guard.ddb.max.retries is reached.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.ddb.background.sleep</name>
-<value>25ms</value>
-<description>
-Length (in milliseconds) of pause between each batch of deletes when
-pruning metadata. Prevents prune operations (which can typically be low
-priority background operations) from overly interfering with other I/O
-operations.
-</description>
-</property>
-
<property>
<name>fs.s3a.retry.limit</name>
<value>7</value>
<description>
Number of times to retry any repeatable S3 client request on failure,
-excluding throttling requests and S3Guard inconsistency resolution.
+excluding throttling requests.
</description>
</property>

@@ -1851,7 +1697,7 @@
<value>500ms</value>
<description>
Initial retry interval when retrying operations for any reason other
-than S3 throttle errors and S3Guard inconsistency resolution.
+than S3 throttle errors.
</description>
</property>

@@ -1874,27 +1720,6 @@
</description>
</property>

-<property>
-<name>fs.s3a.s3guard.consistency.retry.limit</name>
-<value>7</value>
-<description>
-Number of times to retry attempts to read/open/copy files when
-S3Guard believes a specific version of the file to be available,
-but the S3 request does not find any version of a file, or a different
-version.
-</description>
-</property>
-
-<property>
-<name>fs.s3a.s3guard.consistency.retry.interval</name>
-<value>2s</value>
-<description>
-Initial interval between attempts to retry operations while waiting for S3
-to become consistent with the S3Guard data.
-An exponential back-off is used here: every failure doubles the delay.
-</description>
-</property>
-
<property>
<name>fs.s3a.committer.name</name>
<value>file</value>
@@ -137,7 +137,8 @@ internal state stores:

* The internal MapReduce state data will remain compatible across minor releases within the same major version to facilitate rolling upgrades while MapReduce workloads execute.
* HDFS maintains metadata about the data stored in HDFS in a private, internal format that is versioned. In the event of an incompatible change, the store's version number will be incremented. When upgrading an existing cluster, the metadata store will automatically be upgraded if possible. After the metadata store has been upgraded, it is always possible to reverse the upgrade process.
-* The AWS S3A guard keeps a private, internal metadata store that is versioned. Incompatible changes will cause the version number to be incremented. If an upgrade requires reformatting the store, it will be indicated in the release notes.
+* The AWS S3A guard kept a private, internal metadata store.
+Now that the feature has been removed, the store is obsolete and can be deleted.
* The YARN resource manager keeps a private, internal state store of application and scheduler information that is versioned. Incompatible changes will cause the version number to be incremented. If an upgrade requires reformatting the store, it will be indicated in the release notes.
* The YARN node manager keeps a private, internal state store of application information that is versioned. Incompatible changes will cause the version number to be incremented. If an upgrade requires reformatting the store, it will be indicated in the release notes.
* The YARN federation service keeps a private, internal state store of application and cluster information that is versioned. Incompatible changes will cause the version number to be incremented. If an upgrade requires reformatting the store, it will be indicated in the release notes.
@@ -477,19 +477,12 @@ rolled back to the older layout.

##### AWS S3A Guard Metadata

-For each operation in the Hadoop S3 client (s3a) that reads or modifies
-file metadata, a shadow copy of that file metadata is stored in a separate
-metadata store, which offers HDFS-like consistency for the metadata, and may
-also provide faster lookups for things like file status or directory listings.
-S3A guard tables are created with a version marker which indicates
-compatibility.
+The S3Guard metastore used to store metadata in DynamoDB tables;
+as such it had to maintain a compatibility strategy.
+Now that S3Guard is removed, the tables are not needed.

-###### Policy
-
-The S3A guard metadata schema SHALL be considered
-[Private](./InterfaceClassification.html#Private) and
-[Unstable](./InterfaceClassification.html#Unstable). Any incompatible change
-to the schema MUST result in the version number of the schema being incremented.
+Applications configured to use an S3A metadata store other than
+the "null" store will fail.

##### YARN Resource Manager State Store

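The hunk above states that applications configured with a metadata store other than the "null" store will fail. A minimal sketch of a site configuration that stays within that rule, assuming a `core-site.xml` that previously selected an S3Guard store: either delete the property entirely, or leave it at the null store class that `core-default.xml` listed as the default before this change.

```xml
<!-- Safest option: remove fs.s3a.metadatastore.impl from the site configuration.
     If the property has to stay, the null store is the only value the text
     above describes as still accepted. -->
<property>
  <name>fs.s3a.metadatastore.impl</name>
  <value>org.apache.hadoop.fs.s3a.s3guard.NullMetadataStore</value>
</property>
```

The other `fs.s3a.s3guard.*` and `fs.s3a.metadatastore.*` options removed from `core-default.xml` in this change set are likewise candidates for cleanup in site files.
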
@@ -343,7 +343,7 @@ stores pretend that they are a FileSystem, a FileSystem with the same
features and operations as HDFS. This is &mdash;ultimately&mdash;a pretence:
they have different characteristics and occasionally the illusion fails.

-1. **Consistency**. Object stores are generally *Eventually Consistent*: it
+1. **Consistency**. Object may be *Eventually Consistent*: it
can take time for changes to objects &mdash;creation, deletion and updates&mdash;
to become visible to all callers. Indeed, there is no guarantee a change is
immediately visible to the client which just made the change. As an example,
@@ -447,10 +447,6 @@ Object stores have an even vaguer view of time, which can be summarized as
* The timestamp is likely to be in UTC or the TZ of the object store. If the
client is in a different timezone, the timestamp of objects may be ahead or
behind that of the client.
-* Object stores with cached metadata databases (for example: AWS S3 with
-an in-memory or a DynamoDB metadata store) may have timestamps generated
-from the local system clock, rather than that of the service.
-This is an optimization to avoid round-trip calls to the object stores.
 + A file's modification time is often the same as its creation time.
 + The `FileSystem.setTimes()` operation to set file timestamps *may* be ignored.
* `FileSystem.chmod()` may update modification times (example: Azure `wasb://`).
@@ -522,7 +522,7 @@ public void testListOnInternalDirsOfMountTable() throws IOException {
Assert.assertTrue("A mount should appear as symlink", fs.isSymlink());
}

-@Test
+@Test(expected = FileNotFoundException.class)
public void testFileStatusOnMountLink() throws IOException {
Assert.assertTrue("Slash should appear as dir",
fcView.getFileStatus(new Path("/")).isDirectory());
@@ -534,12 +534,7 @@ public void testFileStatusOnMountLink() throws IOException {
checkFileStatus(fcView, "/internalDir/internalDir2/linkToDir3", fileType.isDir);
checkFileStatus(fcView, "/linkToAFile", fileType.isFile);

-try {
-fcView.getFileStatus(new Path("/danglingLink"));
-Assert.fail("Excepted a not found exception here");
-} catch ( FileNotFoundException e) {
-// as excepted
-}
+fcView.getFileStatus(new Path("/danglingLink"));
}

@Test