Contributor

@sumangala-patki sumangala-patki commented Mar 1, 2021

In HDFS, mkdirs on an existing directory path is expected to succeed. To achieve this, the store backend call attempts mkdirs with overwrite=true. On the backend, however, this also executes additional set-properties operations (such as an LMT, i.e. last-modified time, update), which HDFS does not require and which generate unnecessary metadata update traffic.

This PR introduces an option to execute mkdirs with overwrite=false, controlled by a config. The default remains overwrite=true until a related backend deployment is complete.

This PR also fixes a bug where mkdirs on an existing file path returned success instead of throwing an exception.

New config: fs.azure.enable.mkdir.overwrite [true by default]
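
The difference between the two modes can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the ABFS driver or backend code: `MkdirOverwriteModel`, `Entry`, `ConflictException`, the logical clock, and the client-side file check are all hypothetical. The only details taken from the description are that overwrite=true rewrites metadata (LMT) on an existing directory, that overwrite=false leaves it alone, and that mkdirs on an existing file path must fail.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of mkdirs with and without overwrite; all names are hypothetical. */
public class MkdirOverwriteModel {

    /** A stored path: file or directory, with a last-modified time (LMT). */
    public static final class Entry {
        public final boolean isDirectory;
        public final long lastModified;

        public Entry(boolean isDirectory, long lastModified) {
            this.isDirectory = isDirectory;
            this.lastModified = lastModified;
        }
    }

    /** Stands in for an HTTP 409 Conflict returned by the store. */
    public static final class ConflictException extends RuntimeException {
        public ConflictException(String path) {
            super("Path already exists: " + path);
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private long clock = 0; // logical clock standing in for wall time

    public void putFile(String path) {
        entries.put(path, new Entry(false, ++clock));
    }

    public Entry get(String path) {
        return entries.get(path);
    }

    /** Store-side create-directory call (assumed behaviour). */
    private void storeCreateDirectory(String path, boolean overwrite) {
        if (entries.containsKey(path) && !overwrite) {
            throw new ConflictException(path);
        }
        // New directory, or overwrite=true: metadata (LMT) is written either way.
        entries.put(path, new Entry(true, ++clock));
    }

    /** Client-side mkdirs: succeeds on an existing directory, fails on an existing file. */
    public boolean mkdirs(String path, boolean overwrite) {
        Entry existing = entries.get(path);
        if (existing != null && !existing.isDirectory) {
            // Simplification of this model: reject files before calling the store.
            throw new IllegalStateException("Cannot create directory over file: " + path);
        }
        try {
            storeCreateDirectory(path, overwrite);
        } catch (ConflictException e) {
            // overwrite=false on an existing directory: nothing to update, report success.
        }
        return true;
    }

    public static void main(String[] args) {
        MkdirOverwriteModel fs = new MkdirOverwriteModel();
        fs.mkdirs("/data", true);
        long created = fs.get("/data").lastModified;

        fs.mkdirs("/data", false);
        System.out.println("LMT unchanged with overwrite=false: "
            + (fs.get("/data").lastModified == created));

        fs.mkdirs("/data", true);
        System.out.println("LMT updated with overwrite=true: "
            + (fs.get("/data").lastModified > created));
    }
}
```

In this model, both calls on the existing directory return success, but only the overwrite=true call produces a metadata write, which is the traffic the new config is meant to avoid.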

@sumangala-patki sumangala-patki changed the title from "Hadoop 17548" to "HADOOP-17548. ABFS: Config for Mkdir overwrite" Mar 1, 2021
try {
  op.execute();
} catch (AzureBlobFileSystemException ex) {
  String existingResource =
Contributor

First check that the HTTP status code is 409 and (!isFile); only then retrieve the existingResource header.

Contributor Author

done
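
The check order requested in this thread could look roughly like the following. It is a minimal sketch assuming a simplified response model: `ConflictCheckSketch`, `canIgnoreFailure`, and the header name and value constants are hypothetical and are not taken from the ABFS client. The point is only that the status code and the isFile flag are tested before the header is read.

```java
import java.net.HttpURLConnection;
import java.util.Map;

/** Sketch of the check order suggested in review; all names are hypothetical. */
public class ConflictCheckSketch {

    // Assumed header name and value, chosen for this sketch only.
    public static final String EXISTING_RESOURCE_TYPE = "existing-resource-type";
    public static final String DIRECTORY = "directory";

    /**
     * Returns true when a failed create-path call can be treated as success:
     * the request was for a directory, the store answered 409 Conflict,
     * and the resource already at the path is a directory.
     */
    public static boolean canIgnoreFailure(int statusCode, boolean isFile,
                                           Map<String, String> headers) {
        // Cheap checks first: only a 409 on a directory request qualifies.
        if (statusCode != HttpURLConnection.HTTP_CONFLICT || isFile) {
            return false;
        }
        // Only now read the header describing what already exists at the path.
        String existingResource = headers.get(EXISTING_RESOURCE_TYPE);
        return DIRECTORY.equals(existingResource);
    }

    public static void main(String[] args) {
        Map<String, String> dirHeaders = Map.of(EXISTING_RESOURCE_TYPE, DIRECTORY);
        Map<String, String> fileHeaders = Map.of(EXISTING_RESOURCE_TYPE, "file");

        System.out.println(canIgnoreFailure(409, false, dirHeaders));  // mkdirs on existing directory
        System.out.println(canIgnoreFailure(409, false, fileHeaders)); // mkdirs on existing file
        System.out.println(canIgnoreFailure(409, true, dirHeaders));   // file create, not mkdirs
    }
}
```

Any failure that does not pass this check would be rethrown to the caller, so mkdirs on an existing file path surfaces as an exception rather than a success.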

Contributor

@snvijaya snvijaya left a comment

One minor comment posted.
Recommend changing the title and description as below:
Title: Toggle Store Mkdirs request overwrite parameter

Description:
In HDFS, mkdirs on an existing directory path is expected to succeed. To achieve this, the store backend call attempts mkdirs with overwrite=true. On the backend, however, this also executes additional set-properties operations (such as an LMT update), which HDFS does not require and which generate unnecessary metadata update traffic.

This PR introduces an option to execute mkdirs with overwrite=false, controlled by a config. The default remains overwrite=true until a related backend deployment is complete.

This PR also fixes a bug where mkdirs on an existing file path returned success instead of throwing an exception.

New config: fs.azure.enable.mkdir.overwrite [true by default]

@sumangala-patki sumangala-patki changed the title from "HADOOP-17548. ABFS: Config for Mkdir overwrite" to "HADOOP-17548. ABFS: Toggle Store Mkdirs request overwrite parameter" Mar 2, 2021
@sumangala-patki sumangala-patki marked this pull request as ready for review March 2, 2021 11:49
Contributor Author

sumangala-patki commented Mar 5, 2021

TEST RESULTS

HNS Account Location: East US 2
NonHNS Account Location: East US 2, Central US
Overwrite=true

HNS OAuth

[INFO] Tests run: 93, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 513, Failures: 0, Errors: 0, Skipped: 70
[WARNING] Tests run: 257, Failures: 0, Errors: 0, Skipped: 48

HNS SharedKey

[INFO] Tests run: 93, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 513, Failures: 0, Errors: 0, Skipped: 26
[WARNING] Tests run: 257, Failures: 0, Errors: 0, Skipped: 40

Non-HNS SharedKey

[INFO] Tests run: 93, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 504, Failures: 0, Errors: 0, Skipped: 250
[WARNING] Tests run: 257, Failures: 0, Errors: 0, Skipped: 40

@sumangala-patki sumangala-patki marked this pull request as draft March 5, 2021 09:49
Contributor

@snvijaya snvijaya left a comment

+1

@sumangala-patki sumangala-patki marked this pull request as ready for review March 10, 2021 03:47
@surendralilhore surendralilhore merged commit fe633d4 into apache:trunk Mar 14, 2021
sumangala-patki added a commit to sumangala17/hadoop that referenced this pull request Mar 17, 2021
…pache#2729)

Contributed by Sumangala Patki.

(cherry picked from commit fe633d4)
@sumangala-patki sumangala-patki deleted the HADOOP-17548 branch April 11, 2021 04:38
surendralilhore pushed a commit that referenced this pull request May 10, 2021
…2729) (#2781)

Contributed by Sumangala Patki.

(cherry picked from commit fe633d4)
kiran-maturi pushed a commit to kiran-maturi/hadoop that referenced this pull request Nov 24, 2021
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
… parameter (apache#2729) (apache#2781)

Contributed by Sumangala Patki.

(cherry picked from commit fe633d4)

Change-Id: Ibbd82f3cf2d28fa5438c9b1ee6b0e28217dc18df