-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HADOOP-18457. ABFS: Support for account level throttling #5034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-18457. ABFS: Support for account level throttling #5034
Conversation
|
💔 -1 overall
This message was automatically generated. |
...azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingAnalyzer.java
Show resolved
Hide resolved
...c/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingInterceptFactory.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingInterceptFactory.java
Outdated
Show resolved
Hide resolved
...c/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingInterceptFactory.java
Outdated
Show resolved
Hide resolved
...adoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/constants/TestConfigurationKeys.java
Outdated
Show resolved
Hide resolved
...zure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingIntercept.java
Outdated
Show resolved
Hide resolved
...zure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingIntercept.java
Outdated
Show resolved
Hide resolved
...zure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingIntercept.java
Outdated
Show resolved
Hide resolved
...zure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingIntercept.java
Outdated
Show resolved
Hide resolved
...azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingAnalyzer.java
Outdated
Show resolved
Hide resolved
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
...c/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingInterceptFactory.java
Outdated
Show resolved
Hide resolved
| if (isOperationOnAccountIdle.get()) { | ||
| resumeTimer(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race condition alert:
Should we recall this piece of code after sleep. Reason being, lets say the timerTask is going to setIdle as true, and just before that, another thread comes into suspendIfNecessary -> if will read false and go on to sleep. Now after sleep, what ever metric will be created due to API call, that shall be dropped as on the next API call, resumeTimer would have been called which would need to creation of new blobMetrics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still better approach I feel we can implement is (I just thought of it now), lets have a synchronized block which deals with isOperationOnAccountIdle (either for suspend or resume), because still race conditions can happen. For ex:,
- https://github.com/anmolanmol1234/hadoop/blob/HADOOP-18457_temp/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingAnalyzer.java#L266
- https://github.com/anmolanmol1234/hadoop/blob/HADOOP-18457_temp/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingAnalyzer.java#L133-L136
- https://github.com/anmolanmol1234/hadoop/blob/HADOOP-18457_temp/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingAnalyzer.java#L267-L269
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets have something like saxenapranav@667cafa
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
steveloughran
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good.
There is one remaining thing we need to be sure of, which is "how does the weak reference map and the timer task interact". What I don't want is any thread leakage, or the timer holding a reference which stops any interceptor to be released.
how can we be confident of this?
...azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingAnalyzer.java
Outdated
Show resolved
Hide resolved
...ure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsThrottlingInterceptFactory.java
Outdated
Show resolved
Hide resolved
...op-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemConfigurations.java
Outdated
Show resolved
Hide resolved
...doop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsThrottlingIntercept.java
Show resolved
Hide resolved
|
🎊 +1 overall
This message was automatically generated. |
The timer is not holding any reference of analyzer. If there is an idle period we are anyway cancelling the timer task and scheduling it back only if resumed, so mostly we are not leading to any thread leakages here. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
steveloughran
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
there's a couple of minor code/javadoc style things, but big patches always have those and it's not enough to block the patch.
merging, will try the backport too with a goal of getting into 3.3.5 RC0
|
|
||
| package org.apache.hadoop.fs.azurebfs.services; | ||
|
|
||
| final class AbfsNoOpThrottlingIntercept implements AbfsThrottlingIntercept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a javadoc to say what it is, just to keep the style checkers happy
|
|
||
| final class AbfsNoOpThrottlingIntercept implements AbfsThrottlingIntercept { | ||
|
|
||
| public static final AbfsNoOpThrottlingIntercept INSTANCE = new AbfsNoOpThrottlingIntercept(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be downgraded to package scope, to match the class
| * Resumes the timer if it was stopped. | ||
| */ | ||
| private void resumeTimer() { | ||
| blobMetrics = new AtomicReference<AbfsOperationMetrics>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use <> in the constructor
This allows abfs request throttling to be shared across all abfs connections talking to containers belonging to the same abfs storage account -as that is the level at which IO throttling is applied. The option is enabled/disabled in the configuration option "fs.azure.account.throttling.enabled"; The default is "true" Contributed by Anmol Asrani
This allows abfs request throttling to be shared across all abfs connections talking to containers belonging to the same abfs storage account -as that is the level at which IO throttling is applied. The option is enabled/disabled in the configuration option "fs.azure.account.throttling.enabled"; The default is "true" Contributed by Anmol Asrani
This allows abfs request throttling to be shared across all abfs connections talking to containers belonging to the same abfs storage account -as that is the level at which IO throttling is applied. The option is enabled/disabled in the configuration option "fs.azure.account.throttling.enabled"; The default is "true" Contributed by Anmol Asrani
Added account wise statistics collection in case of throttling
-> Currently the throttling statistics collection was not done at an account level, so if one account would get throttled, it might lead to some requests getting throttled for the other accounts as well. So to remove this kind of confusion, we have added support to collect throttling statistics at account level.