Merged
75 commits
f5b8fdd
API calls for blob lease operations added
saxenapranav May 9, 2023
0acee48
ListBlobProducer/consumer/queue
saxenapranav May 10, 2023
c02bc3e
use of prod and consumer for rename
saxenapranav May 10, 2023
2e4372d
WIP, made method for renameBlobDir, need to fix redo method
saxenapranav May 10, 2023
b63dd44
small changes for knowing if queue is completed.
saxenapranav May 11, 2023
9d1a271
we need to complete queue when no nextmarker; now src would be rename…
saxenapranav May 11, 2023
295be5a
abfsBlobLease
saxenapranav May 11, 2023
b66bee8
calling the APIs of AbfsBlobLease
saxenapranav May 12, 2023
8ab8cf0
copyBlob and deleteBlob to have leaseId header
saxenapranav May 12, 2023
cea438e
test fix; http verb for releaseLease
saxenapranav May 12, 2023
05505c5
add javadocs; lease of blobs in dir on the basis of isAtomicRename bo…
saxenapranav May 15, 2023
e7625c8
lease in createNonRecursive
saxenapranav May 15, 2023
f222b80
blobConfigTest:mockito.nullable on leaseId; in create() of azureBlobF…
saxenapranav May 15, 2023
4c7b31e
test for parallel rename, append(when file in atomicDir is in process…
saxenapranav May 15, 2023
d624e87
testParallelCreateNonRecursiveToFilePartOfAtomicDirectoryInRename
saxenapranav May 15, 2023
de9d5cb
added tests in ITestAzureBlobFileSystemCreate: for createNonRecursive
saxenapranav May 16, 2023
0b49c60
consumer lag config
saxenapranav May 16, 2023
9b59f66
Added test for listBlobProducer
saxenapranav May 16, 2023
ba226ed
Added test for exception catch
saxenapranav May 16, 2023
2d38311
refactors
saxenapranav May 16, 2023
8235c0e
Merge branch 'ABFS_3.3.2_dev' into ABFS_3.3.2_dev_rename_improvements
saxenapranav May 16, 2023
2360293
minor changes
saxenapranav May 16, 2023
7c5d520
prevent accidental gc close on the producer object
saxenapranav May 16, 2023
0ba5fd4
nits for pr comments. WIP
saxenapranav May 18, 2023
dd187da
pr comments for stringBuilder, javadocs, syntax
saxenapranav May 19, 2023
9065343
small refactors
saxenapranav May 19, 2023
afa734e
revert ignore
saxenapranav May 19, 2023
04c1803
if space
saxenapranav May 19, 2023
ec53bad
remove ONE_MINUTE constant
saxenapranav May 23, 2023
2e1a929
pr small refactors
saxenapranav May 23, 2023
4fae0a7
added javadoc for the create method with new field
saxenapranav May 23, 2023
63a099e
backerge ABFS_3.3.2_dev
saxenapranav May 30, 2023
a01318a
renameBlobExecutorService to not have availableProcessor amount of th…
saxenapranav Jun 6, 2023
bbf03cc
no consumer lag; take from queue size
saxenapranav Jun 6, 2023
13c531e
executorService for each rename. Kill executorService once rename is …
saxenapranav Jun 6, 2023
3774ddd
lease hierarchy WIP
saxenapranav Jun 6, 2023
484ee17
process of renew added; ABfssDfsLease child of AbfsLease; AbfsLease i…
saxenapranav Jun 6, 2023
b08bdc6
AbfsBlobLease to be child of AbfsLease
saxenapranav Jun 6, 2023
e676aa2
isLeaseOnCreateNonRecursive on config
saxenapranav Jun 6, 2023
40be44e
no spawn of thread for non-infite lease acquire
saxenapranav Jun 7, 2023
a9547df
seconds duration, javadoc, test fix
saxenapranav Jun 7, 2023
26b9958
fix tests in ItestListBlobProducer
saxenapranav Jun 7, 2023
7994196
have config checks in createNonRecursive test
saxenapranav Jun 7, 2023
5db2faa
Merge branch 'ABFS_3.3.2_dev' into ABFS_3.3.2_dev_rename_improvements…
saxenapranav Jun 7, 2023
8e6b9cf
test fix in ITestAzureBlobFileSystemLease
saxenapranav Jun 7, 2023
6744352
Merge pull request #11 from saxenapranav/ABFS_3.3.2_dev_rename_improv…
saxenapranav Jun 7, 2023
0d316c4
javadoc in ListBlobQueue AbfsLease; small refactor
saxenapranav Jun 9, 2023
043944a
added a param in javadoc
saxenapranav Jun 13, 2023
31b665e
checkstyles correction
saxenapranav Jun 15, 2023
443d416
use of countDownLatch instead of busywait
saxenapranav Jun 15, 2023
e6718f8
Merge branch 'ABFS_3.3.2_dev' into ABFS_3.3.2_dev_rename_improvements
saxenapranav Jun 19, 2023
81f1d43
refactor for delimeter param in existing test
saxenapranav Jun 19, 2023
06cf503
renameDir not to have sourceBlobProperty since it already has sourceD…
saxenapranav Jun 19, 2023
7d85a89
blobPath in createDestinationPathForBlobPartOfRenameSrcDir
saxenapranav Jun 19, 2023
cf08e12
Merge pull request #12 from saxenapranav/ABFS_3.3.2_dev_rename_impove…
saxenapranav Jun 20, 2023
89b0ace
testBlobRenameOfDirectoryHavingNeighborWithSamePrefix
saxenapranav Jun 20, 2023
2ab43e3
removing throw of runtimeException if renewLEase fail
saxenapranav Jun 20, 2023
a5420f4
lease timer cancel
saxenapranav Jun 20, 2023
c9f9c29
queue spawned only if requeird
saxenapranav Jun 20, 2023
0eb5a94
testBlobRenameCancelRenewTimerForLeaseTakenInAtomicRename
saxenapranav Jun 20, 2023
85f393a
visibleForTesting on getBlobLease
saxenapranav Jun 20, 2023
0903356
assume nonhns for testBlobRenameCancelRenewTimerForLeaseTakenInAtomic…
saxenapranav Jun 21, 2023
c5ae899
testBlobRenameServerReturnsOneBlobPerList
saxenapranav Jun 21, 2023
137eca9
renameBlobExecutorService shutdown if any future task fail
saxenapranav Jun 21, 2023
3cc8c69
when renameDir fails, release srcDir lease.
saxenapranav Jun 22, 2023
a47a508
refactors
saxenapranav Jun 22, 2023
8de26ce
SOURCE_PATH_NOT_FOUND in case src is not present
saxenapranav Jun 22, 2023
51e7e5a
refactors + test adjustments
saxenapranav Jun 22, 2023
14050c1
no need of blobListProperties in RenameAtomicityUtil; general refactors
saxenapranav Jun 22, 2023
7430b1e
testParallelBlobLeaseOnChildBlobInRenameSrcDir only on blob
saxenapranav Jun 22, 2023
aafc099
refactor
saxenapranav Jun 22, 2023
21e4e26
Merge pull request #13 from saxenapranav/ABFS_3.3.2_dev_rename_major_…
saxenapranav Jun 22, 2023
b8b1ef0
testBlobAtomicRenameSrcAndDstAreNotLeftLeased
saxenapranav Jun 23, 2023
b5ec70d
Merge branch 'ABFS_3.3.2_dev' into ABFS_3.3.2_dev_rename_improvements
saxenapranav Jun 23, 2023
3e8207e
remove extra line in import
saxenapranav Jun 23, 2023
@@ -21,6 +21,7 @@
import java.io.IOException;
import java.lang.reflect.Field;

import org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys;
import org.apache.hadoop.fs.azurebfs.services.PrefixMode;
import org.apache.hadoop.thirdparty.com.google.common.annotations.VisibleForTesting;
import org.apache.hadoop.thirdparty.com.google.common.base.Preconditions;
@@ -255,7 +256,7 @@ public class AbfsConfiguration{
private int readAheadQueueDepth;

@IntegerConfigurationValidatorAnnotation(ConfigurationKey = FS_AZURE_BLOB_DIR_RENAME_MAX_THREAD,
- DefaultValue = 0)
+ DefaultValue = DEFAULT_FS_AZURE_BLOB_RENAME_THREAD)
private int blobDirRenameMaxThread;

@LongConfigurationValidatorAnnotation(ConfigurationKey = FS_AZURE_BLOB_COPY_PROGRESS_POLL_WAIT_MILLIS,
@@ -343,6 +344,13 @@ public class AbfsConfiguration{
FS_AZURE_ENABLE_ABFS_LIST_ITERATOR, DefaultValue = DEFAULT_ENABLE_ABFS_LIST_ITERATOR)
private boolean enableAbfsListIterator;

@IntegerConfigurationValidatorAnnotation(ConfigurationKey =
FS_AZURE_PRODUCER_QUEUE_MAX_SIZE, DefaultValue = DEFAULT_FS_AZURE_PRODUCER_QUEUE_MAX_SIZE)
private int producerQueueMaxSize;

@BooleanConfigurationValidatorAnnotation(ConfigurationKey=FS_AZURE_LEASE_CREATE_NON_RECURSIVE, DefaultValue = DEFAULT_FS_AZURE_LEASE_CREATE_NON_RECURSIVE)
private boolean leaseOnCreateNonRecursive;

public AbfsConfiguration(final Configuration rawConfig, String accountName)
throws IllegalAccessException, InvalidConfigurationValueException, IOException {
this.rawConfig = ProviderUtils.excludeIncompatibleCredentialProviders(
@@ -1222,4 +1230,12 @@ public void setOptimizeFooterRead(boolean optimizeFooterRead) {
public void setEnableAbfsListIterator(boolean enableAbfsListIterator) {
this.enableAbfsListIterator = enableAbfsListIterator;
}

public int getProducerQueueMaxSize() {
return producerQueueMaxSize;
}

public boolean isLeaseOnCreateNonRecursive() {
return leaseOnCreateNonRecursive;
}
}
@@ -44,6 +44,7 @@
import java.util.concurrent.Future;

import org.apache.hadoop.fs.azurebfs.contracts.exceptions.InvalidConfigurationValueException;
import org.apache.hadoop.fs.azurebfs.services.AbfsBlobLease;
import org.apache.hadoop.fs.azurebfs.services.BlobProperty;
import org.apache.hadoop.classification.VisibleForTesting;
import org.apache.hadoop.fs.azurebfs.services.OperativeEndpoint;
@@ -114,6 +115,7 @@
import static org.apache.hadoop.fs.CommonConfigurationKeys.IOSTATISTICS_LOGGING_LEVEL;
import static org.apache.hadoop.fs.CommonConfigurationKeys.IOSTATISTICS_LOGGING_LEVEL_DEFAULT;
import static org.apache.hadoop.fs.azurebfs.AbfsStatistic.*;
import static org.apache.hadoop.fs.azurebfs.constants.AbfsHttpConstants.BLOB_LEASE_ONE_MINUTE_DURATION;
import static org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys.FS_AZURE_ENABLE_BLOB_ENDPOINT;
import static org.apache.hadoop.fs.azurebfs.constants.FileSystemUriSchemes.ABFS_DNS_PREFIX;
import static org.apache.hadoop.fs.azurebfs.constants.FileSystemUriSchemes.WASB_DNS_PREFIX;
@@ -424,6 +426,29 @@ private boolean shouldRedirect(FSOperationType type, TracingContext context)
@Override
public FSDataOutputStream create(final Path f, final FsPermission permission, final boolean overwrite, final int bufferSize,
final short replication, final long blockSize, final Progressable progress) throws IOException {
return create(f, permission, overwrite, bufferSize, replication, blockSize,
Collaborator


What is the need for the additional parameter?

Collaborator Author

@saxenapranav saxenapranav May 18, 2023


This method is also called from createNonRecursive, which already checks the parent directory and acquires a lease on it if it is an atomic-rename directory. The new field tells the method that the parent directory exists and has already been checked.

Why should this method not check the parent directory again?

  1. It has already been checked.
  2. The directory is leased, so GetPathStatus on the parent directory will fail if the leaseId is not present in the API call.
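
The two reasons above can be illustrated with a minimal, self-contained sketch. It is not the PR's code: `ParentCheckSketch` and its `calls` list are hypothetical stand-ins that record operations instead of talking to the store, and the parent check is reduced to a boolean argument. It only shows how a private overload with a "parent already checked" flag lets `createNonRecursive` skip the second parent lookup while the public `create` keeps its old behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the file system; records operations instead of calling Azure. */
class ParentCheckSketch {
  // Log of the operations performed, in order, so the flow can be inspected.
  final List<String> calls = new ArrayList<>();

  // Public entry point: nobody has checked the parent yet, so pass false.
  String create(String path) {
    return create(path, false);
  }

  // Verifies the parent itself, then tells create() not to repeat that work.
  String createNonRecursive(String path, boolean parentExists) {
    calls.add("checkParent");
    if (!parentExists) {
      throw new IllegalStateException(
          "Cannot create " + path + ": parent folder does not exist or is a file.");
    }
    return create(path, true);
  }

  // Shared implementation: touches the parent only when the caller has not done so.
  private String create(String path, boolean parentDirPresentChecked) {
    if (!parentDirPresentChecked) {
      calls.add("mkdirs(parent)");
    }
    calls.add("createFile");
    return path;
  }

  public static void main(String[] args) {
    ParentCheckSketch fs = new ParentCheckSketch();
    fs.create("/dir/a");
    fs.createNonRecursive("/dir/b", true);
    System.out.println(fs.calls);
  }
}
```

In this sketch the non-recursive path never reaches `mkdirs(parent)`, which mirrors the stated goal of not calling back into a directory that is currently leased.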



It would be great if this were added as a brief comment.

Collaborator Author


I have added Javadoc to the new method.

progress, false);
}

/**
* Creates a file in the file system with the specified parameters.
* @param f the path of the file to create
* @param permission the permission of the file
* @param overwrite whether to overwrite the existing file if any
* @param bufferSize the size of the buffer to be used
* @param replication the number of replicas for the file
* @param blockSize the size of the block for the file
* @param progress the progress indicator for the file creation
* @param blobParentDirPresentChecked whether the presence of the parent directory
* has already been checked
* @return a FSDataOutputStream object that can be used to write to the file
* @throws IOException if an error occurs while creating the file
*/
private FSDataOutputStream create(final Path f,
final FsPermission permission,
final boolean overwrite, final int bufferSize,
final short replication,
final long blockSize, final Progressable progress, final Boolean blobParentDirPresentChecked) throws IOException {
LOG.debug("AzureBlobFileSystem.create path: {} permission: {} overwrite: {} bufferSize: {}",
f,
permission,
@@ -449,9 +474,11 @@ public FSDataOutputStream create(final Path f, final FsPermission permission, fi

if (prefixMode == PrefixMode.BLOB) {
validatePathOrSubPathDoesNotExist(qualifiedPath, tracingContext);
- Path parent = qualifiedPath.getParent();
- if (parent != null && !parent.isRoot()) {
+ if (!blobParentDirPresentChecked) {
+ Path parent = qualifiedPath.getParent();
+ if (parent != null && !parent.isRoot()) {
mkdirs(parent);
+ }
}
}

@@ -478,14 +505,36 @@ public FSDataOutputStream createNonRecursive(final Path f, final FsPermission pe
TracingContext tracingContext = new TracingContext(clientCorrelationId,
fileSystemId, FSOperationType.CREATE_NON_RECURSIVE, tracingHeaderFormat,
listener);
/*
* Get exclusive access to the folder if it is a directory designated for atomic
* rename. The primary use case is HBase write-ahead log file management.
*/
AbfsBlobLease abfsBlobLease = null;
Collaborator


Since v1 currently doesn't provide this, let's put this functionality behind a config and keep it off by default.

Collaborator Author


Added a config for lease acquisition in createNonRecursive: fs.azure.lease.create.non.recursive (default: false).
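
As a rough illustration of the config gate described in this reply, here is a self-contained sketch. Only the key name `fs.azure.lease.create.non.recursive` and its `false` default come from the thread; `LeaseGateSketch`, `FakeLease`, and `lastLease` are hypothetical, `java.util.Properties` stands in for the Hadoop configuration, and no real lease API is called. The sketch releases the lease in a `finally` block, which is a variation chosen for brevity; the diff itself calls `free()` explicitly on the parent-not-found path and after `create` returns.

```java
import java.util.Properties;

/** Hypothetical model of a config-gated parent lease in a non-recursive create. */
class LeaseGateSketch {
  // Key name taken from the review reply; everything else here is illustrative.
  static final String LEASE_KEY = "fs.azure.lease.create.non.recursive";

  /** Stand-in for a blob lease; tracks state instead of calling the service. */
  static final class FakeLease {
    boolean held = true;

    void free() {
      held = false;
    }
  }

  // Last lease taken (or null), exposed only so the behaviour can be inspected.
  static FakeLease lastLease;

  // The feature is off unless the key is explicitly set to "true".
  static boolean leaseEnabled(Properties conf) {
    return Boolean.parseBoolean(conf.getProperty(LEASE_KEY, "false"));
  }

  static String createNonRecursive(Properties conf, boolean parentIsAtomicDir,
      boolean parentExists) {
    FakeLease lease = null;
    if (parentIsAtomicDir && leaseEnabled(conf)) {
      lease = new FakeLease();
    }
    lastLease = lease;
    try {
      if (!parentExists) {
        throw new IllegalStateException("parent folder does not exist or is a file");
      }
      return "created";
    } finally {
      // Assumption of this sketch: release on every exit path, including failures.
      if (lease != null) {
        lease.free();
      }
    }
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(LEASE_KEY, "true");
    System.out.println(createNonRecursive(conf, true, true));
    System.out.println("lease still held: " + lastLease.held);
  }
}
```

With the key unset, the sketch never constructs a lease, which matches the reviewer's request that the behaviour stay off by default.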

String parentPath = parent.toUri().getPath();
if (getAbfsStore().getPrefixMode() == PrefixMode.BLOB
&& getAbfsStore().isAtomicRenameKey(parentPath)) {
if (getAbfsStore().getAbfsConfiguration().isLeaseOnCreateNonRecursive()) {
abfsBlobLease = new AbfsBlobLease(getAbfsClient(),
parentPath, BLOB_LEASE_ONE_MINUTE_DURATION, tracingContext);
}
}
final FileStatus parentFileStatus = tryGetFileStatus(parent, tracingContext);

- if (parentFileStatus == null) {
+ if (parentFileStatus == null || !parentFileStatus.isDirectory()) {
+ if (abfsBlobLease != null) {
+ abfsBlobLease.free();
+ }
throw new FileNotFoundException("Cannot create file "
- + f.getName() + " because parent folder does not exist.");
+ + f.getName()
+ + " because parent folder does not exist or is a file.");
}

- return create(f, permission, overwrite, bufferSize, replication, blockSize, progress);
+ final FSDataOutputStream outputStream = create(f, permission, overwrite,
+ bufferSize, replication, blockSize, progress, true);
+ if (abfsBlobLease != null) {
+ abfsBlobLease.free();
+ }
+ return outputStream;
}

@Override