HDDS-9842. [hsync] Checking disk capacity at every write request is expensive for HBase #6381
OK, looks like this is not enough. I need to cache more info to avoid the overhead. Closing it for now; will reopen.

After this change, isVolumeFull() CPU time dropped from >2% to less than 0.2%.
...ainer-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/VolumeUsage.java
    HddsUtils::processForDebug);
    this.cachedVolumeUsage = CacheBuilder.newBuilder()
        .maximumSize(1000)
        .refreshAfterWrite(1, TimeUnit.MINUTES)
1 minute is probably too aggressive?
I agree with @kerneltime here.
Would reducing this to 1 second have much performance impact?
1 second should be fine. We need an expiration to keep it from aggressively refreshing volume stats unnecessarily.
Made it refresh every 1 second, until 1 minute after the last access.
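To make the "refresh every second, stop a minute after the last access" behaviour concrete, here is a minimal standard-library sketch. It is a hand-rolled stand-in, not Guava's implementation and not the code in this PR: the class `RefreshingCache`, its method names, and the injectable clock are all hypothetical. Like a lazily refreshed cache, it only reloads when a caller actually reads a stale entry, and idle entries are dropped so nothing keeps polling a volume nobody writes to. Unlike Guava, the reload here is always synchronous and eviction is an explicit call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch: refresh-after-write plus expire-after-access, stdlib only. */
final class RefreshingCache<K, V> {
  private static final class Entry<V> {
    V value;
    long writtenAtMillis;   // when the value was last loaded
    long accessedAtMillis;  // when the value was last read
  }

  private final Map<K, Entry<V>> entries = new HashMap<>();
  private final Function<K, V> loader;        // the expensive lookup being cached
  private final long refreshAfterWriteMillis; // e.g. 1000 (1 second)
  private final long expireAfterAccessMillis; // e.g. 60000 (1 minute)
  private final LongSupplier clockMillis;     // injectable so tests can control time

  RefreshingCache(Function<K, V> loader, long refreshAfterWriteMillis,
      long expireAfterAccessMillis, LongSupplier clockMillis) {
    this.loader = loader;
    this.refreshAfterWriteMillis = refreshAfterWriteMillis;
    this.expireAfterAccessMillis = expireAfterAccessMillis;
    this.clockMillis = clockMillis;
  }

  /** Returns the cached value, reloading it if it is missing, idle-expired, or stale. */
  synchronized V get(K key) {
    long now = clockMillis.getAsLong();
    Entry<V> e = entries.get(key);
    boolean expired = e != null && now - e.accessedAtMillis >= expireAfterAccessMillis;
    if (e == null || expired) {
      e = new Entry<>();
      entries.put(key, e);
    }
    boolean neverLoaded = e.writtenAtMillis == 0 && e.value == null;
    if (neverLoaded || now - e.writtenAtMillis >= refreshAfterWriteMillis) {
      e.value = loader.apply(key);
      e.writtenAtMillis = now;
    }
    e.accessedAtMillis = now;
    return e.value;
  }

  /** Drops entries not read within the expiry window; returns how many were removed. */
  synchronized int evictIdle() {
    long now = clockMillis.getAsLong();
    int before = entries.size();
    entries.values().removeIf(e -> now - e.accessedAtMillis >= expireAfterAccessMillis);
    return before - entries.size();
  }

  synchronized int size() {
    return entries.size();
  }
}
```

With a 1000 ms refresh and a 60000 ms expiry, a busy volume is re-read at most once per second, while a volume that receives no writes for a minute stops being tracked until the next request arrives.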
…xpensive for HBase.
Fix test failures.
Rewrite using Guava cache.
Change-Id: Ib3cae51246d4a5d65d6386d5e10b804f513d98f0
Revert "Fix test failures."
This reverts commit 869fc65.
Revert "HDDS-9842. [hsync] Checking disk capacity at every write request is expensive for HBase."
This reverts commit d7bff06.
Fix
Change-Id: I27d5cf136dfb09a625372aa88ffcf620627c8ac4
Fix NPE
Change-Id: I47f62066babca29ce7a3f5c49073092493d8d7b0
Fix NPE again.
Change-Id: I48b245506c5a084e545b04814354e0365ea023c5
Checkstyle
Change-Id: I07320a34b44c564660a727eafb4ce3697682f4ee
Add CacheLoader
Change-Id: I59e9f6a1556b094bf7bfdc05426daab165aec974
Cache volume free space.
Change-Id: I4a9fb9ad4c5e7ba95f3db27020910184e7f61a06
(cherry picked from commit 2886408)
Fix NPE.
Change-Id: Iae968baec905250ea3ba0bb47927e12f062fd79a
(cherry picked from commit 60175d1)
Checkstyle.
Change-Id: I5ebce8d3e4b4f4d919d0c455a938558f2c7dc99c
Add unit test and fix bugs.
Change-Id: I334d83b9d2b57f6fe4516fc878f604d503350799
empty statement.
Change-Id: I4f9bf9c5c06182f55a96aec553a76c4928389b45

What changes were proposed in this pull request?
HDDS-9842. [hsync] Checking disk capacity at every write request is expensive for HBase.
Volume space usage is checked on every write request (WriteChunk, PutBlock). That is fine for large payloads, but for the HBase WAL, where a client sends thousands of write requests per second, the overhead is unacceptable.
Proposed solution: cache volume space usage using a Guava Cache, refreshed every minute. The cached value is only used to decide whether the volume is full, so it does not need to be exact. One minute of staleness should be acceptable; if not, even a 1-second interval should be good enough.
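As a rough illustration of why a slightly stale reading is tolerable for the "is the volume full?" decision, the sketch below caches a single free-space value and answers the fullness check from it. It uses only the standard library rather than Guava, and every name in it (`CachedFreeSpace`, `isFull`, `minFreeBytes`, the injectable clock) is an assumption made for the example, not something taken from the patch.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch: answer "is the volume full?" from a periodically refreshed value. */
final class CachedFreeSpace {
  private final LongSupplier loader;    // the expensive query, e.g. a disk usage lookup
  private final LongSupplier clock;     // injectable time source, same unit as refreshInterval
  private final long refreshInterval;   // how stale the cached value may get
  private long cachedBytes;
  private long loadedAt;
  private boolean loaded;

  CachedFreeSpace(LongSupplier loader, long refreshInterval, LongSupplier clock) {
    this.loader = loader;
    this.refreshInterval = refreshInterval;
    this.clock = clock;
  }

  /** Returns the cached free space, reloading only when the value is older than the interval. */
  synchronized long get() {
    long now = clock.getAsLong();
    if (!loaded || now - loadedAt >= refreshInterval) {
      cachedBytes = loader.getAsLong();
      loadedAt = now;
      loaded = true;
    }
    return cachedBytes;
  }

  /** Treats the volume as full once cached free space is at or below the required reserve. */
  boolean isFull(long minFreeBytes) {
    return get() <= minFreeBytes;
  }
}
```

The trade-off is visible in the check itself: between refreshes the answer can lag the real disk state by up to one interval, so the reserve passed as `minFreeBytes` has to be large enough to absorb whatever can be written during that window.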
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-9842
How was this patch tested?
Existing unit tests