Conversation

@baohe-zhang commented Aug 13, 2020

What changes were proposed in this pull request?

This is a follow-up to #29149. When writing a large value list, we now divide it into smaller batches of 128 values each and bulk-write those batches one by one. This reduces the memory pressure caused by serialization and gives fairness to other writing threads. The idea was proposed by @mridulm in #29149 (comment).

Update 9/1: After adding the unit test, I found that the count of objects was not updated properly when using writeAll. This PR also fixes that issue by using a hash map to temporarily store and update the delta for each countKey, and then putting each countKey with its delta into the WriteBatch before calling db.write(batch).
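For illustration, here is a minimal sketch of that delta-accumulation idea written directly against the org.iq80.leveldb API; the single countKey parameter and the long-encoding helpers are stand-ins for the kvstore internals, not the actual Spark code:

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.iq80.leveldb.DB;
import org.iq80.leveldb.WriteBatch;

class CountingBatchWriter {
  private final DB db;

  CountingBatchWriter(DB db) {
    this.db = db;
  }

  // Writes all key/value pairs in one batch while keeping the stored count
  // correct: deltas are accumulated per countKey and flushed exactly once.
  void writeAll(List<byte[]> keys, List<byte[]> values, byte[] countKey) throws Exception {
    Map<ByteBuffer, Long> counts = new HashMap<>();
    try (WriteBatch batch = db.createWriteBatch()) {
      for (int i = 0; i < keys.size(); i++) {
        batch.put(keys.get(i), values.get(i));
        // Accumulate a delta instead of staging oldCount + 1 per entity.
        counts.merge(ByteBuffer.wrap(countKey), 1L, Long::sum);
      }
      // Flush the accumulated deltas right before the atomic write, so each
      // countKey advances by the full number of entities in the batch.
      for (Map.Entry<ByteBuffer, Long> e : counts.entrySet()) {
        long updated = currentCount(e.getKey().array()) + e.getValue();
        batch.put(e.getKey().array(),
          ByteBuffer.allocate(Long.BYTES).putLong(updated).array());
      }
      db.write(batch);
    }
  }

  private long currentCount(byte[] countKey) {
    byte[] raw = db.get(countKey);
    return raw != null ? ByteBuffer.wrap(raw).getLong() : 0L;
  }
}
```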

Why are the changes needed?

  1. Fix the count-update bug in writeAll().
  2. Reduce memory pressure and alleviate the fairness issue when batch-writing a very long value list.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test and manual test.

A test case testWriteAll() is added in LevelDBSuite.java.

I compared some write-related benchmark tests in LevelDBBenchmark.java between writeAll() and write(). The tests were run on an HDD that was otherwise idle.

| Benchmark (writing 10000 objects) | writeAll(), ms | write(), ms |
| --- | --- | --- |
| randomUpdatesIndexed | 593.202 | 791.28 |
| randomUpdatesNoIndex | 253.803 | 388.076 |
| randomWritesIndexed | 834.471 | 1290.984 |
| randomWritesNoIndex | 248.064 | 475.399 |
| sequentialUpdatesIndexed | 351.207 | 454.227 |
| sequentialUpdatesNoIndex | 236.391 | 380.359 |
| sequentialWritesIndexed | 433.044 | 782.449 |
| sequentialWritesNoIndex | 229.85 | 523.149 |

@baohe-zhang (Author)

cc @mridulm @HeartSaVioR

@SparkQA commented Aug 13, 2020

Test build #127415 has finished for PR 29425 at commit 17653ba.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

```java
// Partition the value list into a set of 128-value batches. This reduces the
// memory pressure caused by serialization and gives fairness to other writing
// threads when writing a very large list.
for (List<?> batchList : Iterables.partition(entry.getValue(), 128)) {
```
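For context, Guava's Iterables.partition lazily splits an Iterable into consecutive sublists of at most the given size (the last chunk may be shorter). A quick standalone example:

```java
import java.util.Arrays;
import java.util.List;

import com.google.common.collect.Iterables;

public class PartitionDemo {
  public static void main(String[] args) {
    List<Integer> values = Arrays.asList(1, 2, 3, 4, 5);
    // Prints [1, 2], then [3, 4], then [5]: chunks of at most 2 elements.
    for (List<Integer> chunk : Iterables.partition(values, 2)) {
      System.out.println(chunk);
    }
  }
}
```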
@gatorsmile (Member)

I left a comment in #28412 (review). I think we should add unit test cases to verify the code works as designed. We might need to update the implementation to provide private APIs for testing.

@baohe-zhang (Author)

Sure, I will add unit tests.

@baohe-zhang (Author)

Hi @gatorsmile, unit tests for HybridStore were added in #29509, and a unit test for the writeAll() method of LevelDB is added in the current PR.

```java
try (WriteBatch batch = db().createWriteBatch()) {
  updateBatch(batch, valueIter.next(), serializedValueIter.next(), klass,
    naturalIndex, indices);
  db().write(batch);
```
@baohe-zhang (Author)

@HeartSaVioR Here I put db().write(batch) inside the while loop, which means we call db().write() for every entity, similar to removeAllByIndexValues(). I made this change because, after adding the unit test, I found a bug: the count cannot be updated properly with the original writeAll() code. The count is updated in https://github.com/apache/spark/blob/master/common/kvstore/src/main/java/org/apache/spark/util/kvstore/LevelDBTypeInfo.java#L395. Since the delta is overwritten to 1 every time we add a new entity, the count only increases by 1 per db().write(batch) call, even when that call writes multiple entities.
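To make the failure mode concrete, here is a self-contained model of that buggy pattern, with plain maps standing in for LevelDB and the WriteBatch (an illustration only, not the actual LevelDBTypeInfo code):

```java
import java.util.HashMap;
import java.util.Map;

public class CountBugDemo {
  static Map<String, Long> store = new HashMap<>();  // persisted DB state
  static Map<String, Long> batch = new HashMap<>();  // staged writes, invisible to reads

  // Mirrors the buggy pattern: read the persisted count, stage oldCount + 1.
  static void add(String countKey) {
    long oldCount = store.getOrDefault(countKey, 0L);  // always the pre-batch value
    batch.put(countKey, oldCount + 1);                 // overwrites earlier staged puts
  }

  public static void main(String[] args) {
    for (int i = 0; i < 3; i++) {
      add("count");      // three entities added within one batch
    }
    store.putAll(batch); // the equivalent of db.write(batch)
    System.out.println(store.get("count"));  // prints 1, not 3
  }
}
```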

@HeartSaVioR (Contributor)

Thanks for the information. It sounds like ideally we should fix the bug in LevelDBTypeInfo instead. Sorry, I haven't had time to look into the details. If possible, we'd better fix it there; if that's not feasible, please let me know.

@baohe-zhang (Author)

Sounds good, I'll try to fix it.

@baohe-zhang (Author)

Hi @HeartSaVioR, I fixed the bug by using a hash map to store and update the <countKey, delta> pairs within a batch, and using that map to update the counts right before calling db.write(batch). I updated the test case to cover the count-update issue. I also ran benchmark tests for writeAll and put the results in the description section of this PR.

@SparkQA commented Aug 26, 2020

Test build #127897 has finished for PR 29425 at commit ae7d0cb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@baohe-zhang baohe-zhang changed the title [SPARK-32350][FOLLOW-UP] Divide the value list into a set of small batches when bulk writing to LevelDB [SPARK-32350][FOLLOW-UP] Fix count update issue and partition the value list to a set of small batches for LevelDB writeAll Sep 1, 2020
```java
    naturalIndex, indices, counts);
}
for (Map.Entry<ByteBuffer, Long> countEntry : counts.entrySet()) {
  naturalIndex.updateCount(batch, countEntry.getKey().array(), countEntry.getValue());
```
@baohe-zhang (Author) commented Sep 1, 2020

@HeartSaVioR Here I use naturalIndex.updateCount() to put the count information for all indexes into the batch. While implementing this, I noticed we could lift updateCount() and long getCount(byte[] key) from LevelDBTypeInfo.Index up to LevelDBTypeInfo, since these methods don't access any member of LevelDBTypeInfo.Index. That would let us call ti.updateCount() to update the count for all indexes, which would make more sense. It's entirely optional, though; a rough sketch is below.
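A rough sketch of what the lifted methods might look like; the class name and the long encoding are illustrative stand-ins for the real LevelDBTypeInfo fields and the kvstore serializer:

```java
import java.nio.ByteBuffer;

import org.iq80.leveldb.DB;
import org.iq80.leveldb.WriteBatch;

// Neither method touches Index state, so they could live on the type info
// itself rather than on the nested Index class.
class TypeInfoCounts {
  private final DB db;

  TypeInfoCounts(DB db) {
    this.db = db;
  }

  long getCount(byte[] countKey) {
    byte[] data = db.get(countKey);
    return data != null ? ByteBuffer.wrap(data).getLong() : 0L;
  }

  void updateCount(WriteBatch batch, byte[] countKey, long delta) {
    long updated = getCount(countKey) + delta;
    if (updated > 0) {
      // Stage the new count; it becomes visible once db.write(batch) runs.
      batch.put(countKey, ByteBuffer.allocate(Long.BYTES).putLong(updated).array());
    } else {
      batch.delete(countKey);  // drop the key entirely when the count reaches 0
    }
  }
}
```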

@SparkQA commented Sep 2, 2020

Test build #128168 has finished for PR 29425 at commit 5a5f6c9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HeartSaVioR (Contributor)

I'm not 100% sure I'm qualified to review this patch, as it touches the details of the LevelDB KV store implementation.
(My understanding is more about the SHS and the KV store interfaces, rather than the KV store implementations.)

Let me cc @mridulm and @vanzin again. I'll try to find time to look into the details of LevelDB and review again if @mridulm and @vanzin aren't available these days.

@baohe-zhang (Author)

Hi @HeartSaVioR, are there any updates on the review of this PR? Thanks!

@HeartSaVioR (Contributor)

Sorry, I still have several things on my plate and have been struggling with them. You'd better ping @mridulm, as he understands the patch well.

@mridulm, I'd appreciate it if you have time to look into this. Thanks.

@github-actions bot commented Jan 3, 2021

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jan 3, 2021
@github-actions github-actions bot closed this Jan 4, 2021
@pan3793 (Member) commented Mar 25, 2024

I came across this PR by chance; it contains an underlying correctness fix for KVStore#count. @baohe-zhang, would you like to continue it?

also cc @mridulm
