Fix: Separate Lock for Keyspace to Update Controller Mapping in Schema Tracking#17873
harshit-gangal merged 1 commit into vitessio:main
…date controller mapping Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Codecov Report
Coverage diff, main vs. #17873:
Coverage: 67.45% -> 67.47% (+0.01%)
Files: 1594 -> 1594
Lines: 259064 -> 259075 (+11)
Hits: 174760 -> 174813 (+53)
Misses: 84304 -> 84262 (-42)
GuptaManan100 left a comment:
While this works, should we get rid of the locking altogether in the codepath that is reading data from vttablets?
Reply: The Table Map is shared between keyspaces, and we run one goroutine per keyspace. The lock is required while updating those maps.
signal func() // a function that we'll call whenever we have new schema data

// map of keyspace currently tracked
trackedMu sync.Mutex
I'm thinking we should make it easy to see in the struct declaration which fields are protected by which mutex. trackedMu is kind of easy, but what the mu mutex is protecting is hard to see here.
…a Tracking (#17873) Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Description
Problem
Previously, the schema tracker used the same lock for two things: the schema update calls to VTTablet via the GetSchema RPC, and access to the keyspace-to-update-controller mapping (the tracked map). These operations are unrelated, but GetSchema RPC calls can be slow for network-related reasons, and the lock stayed held for the duration of the call.
As a result, fetching the update controller from the map was blocked, leading to:
Solution
Introduced a separate mutex for managing access to the tracked map, so that looking up an update controller is no longer blocked by an in-flight GetSchema RPC call. This prevents delays in processing health updates and improves overall stability.
Backport Reason: Health check misses can lead to missing schema changes.
Related Issue(s)
Checklist
Deployment Notes