Improve performance for BaseShowTablesWithSizes query.#15713
Improve performance for BaseShowTablesWithSizes query.#15713arthurschreiber merged 3 commits intomainfrom
BaseShowTablesWithSizes query.#15713Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
|
@shlomi-noach can you take a look? I remember you recently tried to optimize this query as well. |
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
af92c65 to
6bd23ca
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #15713 +/- ##
==========================================
+ Coverage 68.40% 68.95% +0.55%
==========================================
Files 1556 1558 +2
Lines 195121 204509 +9388
==========================================
+ Hits 133479 141026 +7547
- Misses 61642 63483 +1841 ☔ View full report in Codecov by Sentry. |
shlomi-noach
left a comment
There was a problem hiding this comment.
In #13375 I made the split to optimize the query -- kudos on being able to unsplit it and optimize even further. It appears like we don't test this query, or perhaps the test is buried deep down. @arthurschreiber would you mind:
- looking for where this query is tested
- if not already tested, create tests for both partitioned and unpartitioned scenarios
- Add a comment to the query, indicating where this is being tested?
|
@shlomi-noach Thanks for taking a look! I believe this query is implicitly tested via some of the existing test cases, but I can try to find out exactly which test cases cover it. I'll see if I can add tests that at least cover the partitioned / non partitioned table handling a bit better. |
|
@shlomi-noach I found some existing test cases here, so I guess no need test cases need to be added? The coverage seems sufficient IMHO. vitess/go/vt/vttablet/endtoend/misc_test.go Lines 885 to 932 in 6bd23ca |
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
|
@shlomi-noach I pushed some light refactoring of the test case. |
deepthi
left a comment
There was a problem hiding this comment.
Nice work on improving the tests too!
Co-authored-by: Deepthi Sigireddi <deepthi@planetscale.com> Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com> Co-authored-by: Deepthi Sigireddi <deepthi@planetscale.com>
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com> Co-authored-by: Deepthi Sigireddi <deepthi@planetscale.com>
…y. (#15713) (#15795) Signed-off-by: Arthur Schreiber <arthurschreiber@github.com> Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com> Co-authored-by: Deepthi Sigireddi <deepthi@planetscale.com> Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com> Co-authored-by: Deepthi Sigireddi <deepthi@planetscale.com>
Description
BaseShowTablesWithSizesis executed whenever avttabletprocess starts up.We use
vtcomboin our CI environment, with many different keyspaces defined in the topology or created dynamically viaCREATE DATABASE. We noticed that when upgrading from MySQL 5.7 to MySQL 8.0 in our CI environment, thevtcombostartup time and the database setup times in our build jobs started to be almost twice as long as with MySQL 8.0.I was able to trace this down to the differences in the
BaseShowTablesWithSizesquery.The MySQL 8.0 version uses two queries combined via
UNION. The problem here is that each of the two parts of theUNIONends up joininginformation_schema.innodb_tablespaces.information_schema.innodb_tablespacesis a MySQL system table that I believe MySQL fills on-demand with all available tablespace data, and then applies any conditions from theWHEREorONclauses via full table scans. In short, accessing this table is expensive, and accessing it twice in separate queries doubles the cost of accessing it.On my system, I currently have around 14k tablespaces:
Running the original query takes 0.4 seconds:
Running the query I'm proposing here takes half the time:
Related Issue(s)
Checklist
Deployment Notes