Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce redundant pack reads during building bitmap filter for delta merge case #9876

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

gengliqi
Copy link
Contributor

@gengliqi gengliqi commented Feb 14, 2025

What problem does this PR solve?

Issue Number: close #9875

Problem Summary:
This optimization can provide a significant performance boost when the data being read is not updated frequently.

What is changed and how it works?

Reduce redundant pack reads during bitmap filter building for delta merge case

Run CHBenchmark 1500: Overall 26.5% Improvement

mysql> select sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) delta_rows, TIDB_TABLE from INFORMATION_SCHEMA.TIFLASH_SEGMENTS where TIDB_DATABASE='chbench_delta' group by TIDB_TABLE order by delta_rows;
+------------+------------+
| delta_rows | TIDB_TABLE |
+------------+------------+
|          0 | item       |
|         15 | region     |
|         75 | nation     |
|       4704 | warehouse  |
|      20000 | supplier   |
|      49790 | district   |
|     183924 | new_order  |
|    1022080 | history    |
|    1193288 | orders     |
|    1996224 | customer   |
|    7977425 | stock      |
|   41277141 | order_line |
+------------+------------+
12 rows in set (0.09 sec)

mysql> select sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) delta_rows, sum(stable_rows) stable_rows, 100 * sum(DELTA_MEMTABLE_ROWS+DELTA_PERSISTED_ROWS) / sum(stable_rows) delta_percent from INFORMATION_SCHEMA.TIFLASH_SEGMENTS where TIDB_DATABASE='chbench_delta';
+------------+-------------+---------------+
| delta_rows | stable_rows | delta_percent |
+------------+-------------+---------------+
|   53724666 |  2283695006 |        2.3525 |
+------------+-------------+---------------+
1 row in set (0.09 sec)

The above table shows the delta rows for each table in chbench_delta database.

Question Original Optimization Improvement
Q1 3.30 1.17 182.05%
Q2 0.90 0.88 2.27%
Q3 4.34 1.86 133.33%
Q4 5.11 2.88 77.43%
Q5 14.07 11.56 21.71%
Q6 2.98 1.00 198.00%
Q7 8.01 5.62 42.53%
Q8 5.05 2.83 78.45%
Q9 46.22 44.01 5.02%
Q10 27.34 25.09 8.97%
Q11 0.56 0.46 21.74%
Q12 5.34 3.22 65.84%
Q13 0.75 0.39 92.31%
Q14 4.05 2.04 98.53%
Q15 13.11 9.36 40.06%
Q16 10.24 10.26 -0.19%
Q17 6.88 2.49 176.31%
Q18 35.46 32.14 10.33%
Q19 0.17 0.04 325.00%
Q20 3.72 1.59 133.96%
Q21 11.53 6.66 73.12%
Q22 0.44 0.10 340.00%
Sum 209.57 165.65 26.51%

All Stable Performance Comparison

The following table presents the performance improvements for All Stable compared to both the original version (with Delta) and the optimized version (with Delta):

  • After compacting the data, All Stable only shows a 5.86% improvement compared to the current optimized version with delta data.
  • However, compared to the original version with delta data, All Stable shows a significant 33.93% improvement.

The chbenchmark data was inserted into another database. Then the compact table command is executed. As a result, the data might differ. So there are some differences in the query plans. For example, in Q20, the query plan uses hash partition join and includes late materialization for order_line table in the All Stable version, while original version uses broadcast join and no late materialization for order_line table. This causes Q20 to run slower in All Stable compared to the original version due to exchange overhead and low selectivity.

Query All Stable All Stable Improvement over Original All Stable Improvement over Optimization
Q1 1.00 230.00% 17.00%
Q2 0.87 3.45% 1.15%
Q3 1.39 212.23% 33.81%
Q4 2.71 88.56% 6.27%
Q5 11.10 26.76% 4.14%
Q6 0.76 292.11% 31.58%
Q7 5.47 46.44% 2.74%
Q8 2.39 111.30% 18.41%
Q9 39.85 15.98% 10.44%
Q10 24.50 11.59% 2.41%
Q11 0.51 9.80% -9.80%
Q12 2.99 78.60% 7.69%
Q13 0.38 97.37% 2.63%
Q14 1.86 117.74% 9.68%
Q15 9.02 45.34% 3.77%
Q16 10.32 -0.78% -0.58%
Q17 1.55 343.87% 60.65%
Q18 32.01 10.78% 0.41%
Q19 0.04 325.00% 0.00%
Q20 1.81 105.52% -12.15%
Q21 5.88 96.09% 13.27%
Q22 0.07 528.57% 42.86%
Sum 156.48 33.93% 5.86%

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 14, 2025
@gengliqi gengliqi changed the title Reduce redundant pack reads during bitmap filter building for delta merge case Reduce redundant pack reads during building bitmap filter for delta merge case Feb 14, 2025
@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 17, 2025
Signed-off-by: gengliqi <[email protected]>
@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 19, 2025
Signed-off-by: gengliqi <[email protected]>
Copy link
Contributor

@Lloyd-Pottiger Lloyd-Pottiger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Contributor

ti-chi-bot bot commented Feb 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Lloyd-Pottiger

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Feb 20, 2025
Copy link
Contributor

ti-chi-bot bot commented Feb 20, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-02-20 05:25:25.882352166 +0000 UTC m=+1111768.278574229: ☑️ agreed by Lloyd-Pottiger.

@gengliqi gengliqi force-pushed the mvcc-optimize-skip-pack branch from 112e089 to e36245d Compare February 20, 2025 07:15
Signed-off-by: gengliqi <[email protected]>
@gengliqi gengliqi force-pushed the mvcc-optimize-skip-pack branch from e36245d to b00df5f Compare February 20, 2025 07:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce redundant pack reads during building bitmap filter for delta merge case
3 participants