HDDS-9842. [hsync] Checking disk capacity at every write request is expensive for HBase #6381

jojochuang · 2024-03-14T21:32:39Z

What changes were proposed in this pull request?

HDDS-9842. [hsync] Checking disk capacity at every write request is expensive for HBase.

Volume space usage is checked on every write request (WriteChunk, PutBlock). That is fine for large payloads, but for the HBase WAL, where a client sends thousands of write requests per second, the overhead is unacceptable.

Proposed solution: cache volume space usage using Guava Cache. The cache is refreshed every minute. The cached value is only used to decide whether the volume is full, so it does not need to be exact. One minute of staleness should be acceptable; if it is not, even a 1-second interval should be good enough.
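To illustrate the idea only: a minimal sketch of a time-bounded cached value, written against the plain JDK rather than Guava so that it is self-contained. The class name `CachedLong`, its constructor parameters, and the injectable clock are hypothetical and are not taken from the patch.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper (not from the patch): caches one long-valued probe. */
final class CachedLong {
  private final LongSupplier source;  // expensive call, e.g. a disk-space query
  private final LongSupplier clock;   // nanosecond clock, injectable for tests
  private final long ttlNanos;        // how long a loaded value stays fresh
  private long value;
  private long loadedAt;
  private boolean loaded;

  CachedLong(LongSupplier source, long ttlNanos, LongSupplier clock) {
    this.source = source;
    this.ttlNanos = ttlNanos;
    this.clock = clock;
  }

  /** Returns the cached value, reloading it only once it is older than the TTL. */
  synchronized long get() {
    long now = clock.getAsLong();
    if (!loaded || now - loadedAt >= ttlNanos) {
      value = source.getAsLong();
      loadedAt = now;
      loaded = true;
    }
    return value;
  }
}
```

With something like `new CachedLong(() -> dir.getUsableSpace(), TimeUnit.SECONDS.toNanos(1), System::nanoTime)`, where `dir` is a `java.io.File`, a burst of write requests would hit the file system at most once per second, at the cost of a free-space figure that can be up to one second stale.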

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9842

How was this patch tested?

Existing unit tests

jojochuang · 2024-03-15T04:17:03Z

OK, it looks like this is not enough. I need to cache more information to avoid the overhead. Closing it for now; will reopen.

jojochuang · 2024-03-15T06:59:32Z

After this change, isVolumeFull() CPU time dropped from more than 2% to less than 0.2%.

jojochuang · 2024-03-15T07:06:41Z

...ainer-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/VolumeUsage.java

kerneltime · 2024-03-20T18:33:02Z

...iner-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/HddsDispatcher.java

            HddsUtils::processForDebug);
+    this.cachedVolumeUsage = CacheBuilder.newBuilder()
+        .maximumSize(1000)
+        .refreshAfterWrite(1, TimeUnit.MINUTES)


1 minute is probably too aggressive?

smengcl · 2024-03-21

I agree with @kerneltime here.

Would reducing this to 1 second have much performance impact?

jojochuang · 2024-03-21

1 second should be fine. It needs an expiration to prevent it from aggressively refreshing volume stats unnecessarily.

jojochuang · 2024-05-17

Made it refresh every second until one minute after the last access.
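The behavior settled on in this thread (refresh after one second, drop the entry one minute after the last access) can be sketched per volume as below. This is a plain-JDK illustration under assumptions, not the patch's Guava-based code: `VolumeSpaceCache`, its loader, and `evictIdle()` are hypothetical names. As in Guava's `refreshAfterWrite`, a stale entry is reloaded only when it is next read, so an unused volume triggers no reloads.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.ToLongFunction;

/** Hypothetical per-volume cache: reload when stale, evict when idle. */
final class VolumeSpaceCache {
  private static final class Entry {
    final long value;
    final long writtenAt;      // when the value was loaded
    volatile long accessedAt;  // when the value was last read
    Entry(long value, long now) {
      this.value = value;
      this.writtenAt = now;
      this.accessedAt = now;
    }
  }

  private final Map<String, Entry> entries = new ConcurrentHashMap<>();
  private final ToLongFunction<String> loader;  // expensive per-volume probe
  private final LongSupplier clock;             // nanosecond clock, injectable
  private final long refreshNanos;
  private final long idleNanos;

  VolumeSpaceCache(ToLongFunction<String> loader, long refreshNanos,
      long idleNanos, LongSupplier clock) {
    this.loader = loader;
    this.refreshNanos = refreshNanos;
    this.idleNanos = idleNanos;
    this.clock = clock;
  }

  /** Returns the cached value for a volume, reloading it if missing or stale. */
  long get(String volume) {
    long now = clock.getAsLong();
    return entries.compute(volume, (k, old) -> {
      if (old == null || now - old.writtenAt >= refreshNanos) {
        return new Entry(loader.applyAsLong(k), now);
      }
      old.accessedAt = now;
      return old;
    }).value;
  }

  /** Drops entries not read within the idle window; call from a periodic task. */
  void evictIdle() {
    long now = clock.getAsLong();
    entries.values().removeIf(e -> now - e.accessedAt >= idleNanos);
  }

  int size() {
    return entries.size();
  }
}
```

Unlike Guava, this sketch reloads synchronously while holding the map entry, so concurrent readers of the same volume wait for the reload instead of being served the old value.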

…xpensive for HBase.
Fix test failures.
Rewrite using Guava cache. Change-Id: Ib3cae51246d4a5d65d6386d5e10b804f513d98f0
Revert "Fix test failures." This reverts commit 869fc65.
Revert "HDDS-9842. [hsync] Checking disk capacity at every write request is expensive for HBase." This reverts commit d7bff06.
Fix Change-Id: I27d5cf136dfb09a625372aa88ffcf620627c8ac4
Fix NPE Change-Id: I47f62066babca29ce7a3f5c49073092493d8d7b0
Fix NPE again. Change-Id: I48b245506c5a084e545b04814354e0365ea023c5
Checkstyle Change-Id: I07320a34b44c564660a727eafb4ce3697682f4ee
Add CacheLoader Change-Id: I59e9f6a1556b094bf7bfdc05426daab165aec974
Cache volume free space. Change-Id: I4a9fb9ad4c5e7ba95f3db27020910184e7f61a06 (cherry picked from commit 2886408)
Fix NPE. Change-Id: Iae968baec905250ea3ba0bb47927e12f062fd79a (cherry picked from commit 60175d1)
Checkstyle. Change-Id: I5ebce8d3e4b4f4d919d0c455a938558f2c7dc99c
Add unit test and fix bugs. Change-Id: I334d83b9d2b57f6fe4516fc878f604d503350799
empty statement. Change-Id: I4f9bf9c5c06182f55a96aec553a76c4928389b45

smengcl · 2024-07-08T22:08:58Z

Closing this one in favor of #6383. Can port that to HDDS-7593 if needed. Thx!

jojochuang requested a review from smengcl March 14, 2024 21:32

jojochuang changed the title ~~Hdds 9842~~ HDDS-9842. [hsync] Checking disk capacity at every write request is expensive for HBase Mar 14, 2024

jojochuang closed this Mar 15, 2024

jojochuang reopened this Mar 15, 2024

jojochuang mentioned this pull request Mar 15, 2024

HDDS-9842. Cache volume capacity and available space #6383

Merged

kerneltime reviewed Mar 20, 2024

kerneltime reviewed Mar 20, 2024

jojochuang added the hbase HBase on Ozone support label Mar 24, 2024

smengcl force-pushed the HDDS-7593 branch from 44b1242 to 3fe5cde Compare March 27, 2024 23:29

jojochuang force-pushed the HDDS-7593 branch from d4314c9 to 8aa8a36 Compare April 11, 2024 22:20

jojochuang force-pushed the HDDS-9842 branch from 7f86181 to e8a5cfe Compare May 13, 2024 23:42

jojochuang force-pushed the HDDS-7593 branch from ea0e389 to 8aa8a36 Compare June 18, 2024 21:54

jojochuang force-pushed the HDDS-9842 branch from 1db9f94 to 4c67513 Compare June 26, 2024 01:02

smengcl closed this Jul 8, 2024
