Fix DISTINCT on ENUM/SET columns by making enums/set hashable #17936
GuptaManan100 merged 3 commits into vitessio:main
@gopoto Looks like you accidentally committed a bunch of test files here that shouldn't be?
Removed the vtroot_6701 tmp and topo directories that were committed in error; these logs and wal files were artifacts of a previous test run and shouldn't be in the tree.
Signed-off-by: Gene Parmesan Thomas <201852096+gopoto@users.noreply.github.com>
17220e3 to f6cc31c
I guess this will have to be backported to older versions as well? 🤔
Is there a way to test the "unknown set" and "unknown enum" value cases?
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #17936      +/-   ##
==========================================
- Coverage   67.56%   67.54%   -0.02%
==========================================
  Files        1597     1597
  Lines      259780   259851     +71
==========================================
+ Hits       175516   175529     +13
- Misses      84264    84322     +58
@gopoto Is this ready for review?
…ashable (#17936) (#17990) Signed-off-by: Gene Parmesan Thomas <201852096+gopoto@users.noreply.github.com> Signed-off-by: Manan Gupta <manan@planetscale.com> Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com> Co-authored-by: Manan Gupta <manan@planetscale.com>
* [release-21.0] Bump to `v21.0.4-SNAPSHOT` after the `v21.0.3` release (vitessio#17766)
  Signed-off-by: Rohit Nayak <rohit@planetscale.com>
  Co-authored-by: Rohit Nayak <rohit@planetscale.com>
* [release-21.0] smartconnpool: Better handling for idle expiration (vitessio#17757) (vitessio#17781)
  Signed-off-by: Vicent Marti <vmg@strn.cat>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Fail assignment expressions with the correct message (vitessio#17752) (vitessio#17776)
  Signed-off-by: Manan Gupta <manan@planetscale.com>
  Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
  Co-authored-by: Manan Gupta <manan@planetscale.com>
* [release-21.0] Multi-tenant workflow SwitchWrites: Don't add denied tables on cancelMigration() (vitessio#17782) (vitessio#17797)
  Signed-off-by: Rohit Nayak <rohit@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] VReplication Atomic Copy Workflows: fix bugs around concurrent inserts (vitessio#17772) (vitessio#17793)
  Signed-off-by: Rohit Nayak <rohit@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Upgrade the Golang version to `go1.23.6` (vitessio#17699)
  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
  Co-authored-by: frouioui <frouioui@users.noreply.github.com>
  Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>
* [release-21.0] Fix a potential connection pool leak. (vitessio#17807) (vitessio#17814)
  Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* backport: support subqueries inside subqueries when merging (Release 21.0) (vitessio#17811)
  Signed-off-by: Andres Taylor <andres@planetscale.com>
* [release-21.0] Fix vtcombo parsing flags incorrectly (vitessio#17743) (vitessio#17820)
  Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
  Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>
* [release-21.0] pool: reopen connection closed by idle timeout (vitessio#17818) (vitessio#17829)
* [release-21.0] Implement temporal comparisons (vitessio#17826) (vitessio#17854)
  Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] evalengine: normalize types during compilation (vitessio#17887) (vitessio#17896)
  Signed-off-by: Vicent Marti <vmg@strn.cat>
  Signed-off-by: Andres Taylor <andres@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
  Co-authored-by: Vicent Marti <vmg@strn.cat>
* [release-21.0] Fix: Separate Lock for Keyspace to Update Controller Mapping in Schema Tracking (vitessio#17873) (vitessio#17885)
  Signed-off-by: Harshit Gangal <harshit@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
  Co-authored-by: Harshit Gangal <harshit@planetscale.com>
* [release-21.0] Upgrade the Golang version to `go1.23.7` (vitessio#17901)
  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
  Co-authored-by: frouioui <frouioui@users.noreply.github.com>
  Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>
* [release-21.0] fix: race on storing schema engine last changed time (vitessio#17914) (vitessio#17917)
  Signed-off-by: Harshit Gangal <harshit@planetscale.com>
  Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
  Co-authored-by: Harshit Gangal <harshit@planetscale.com>
  Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
* [release-21.0] [VTAdmin] Insert into schema cache if exists already and not expired (vitessio#17908) (vitessio#17924)
  Signed-off-by: Frances Thai <notfelineit@gmail.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Bump golang.org/x/net from 0.34.0 to 0.36.0 (vitessio#17958) (vitessio#17960)
  Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] fix flaky test on mysqlshell backup engine (vitessio#17981)
  Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
  Co-authored-by: Renan Rangel <rvrangel@users.noreply.github.com>
* [release-21.0] DML test fix for duplicate column value (vitessio#17980)
  Signed-off-by: Harshit Gangal <harshit@planetscale.com>
* [release-21.0] Fix DISTINCT on ENUM/SET columns by making enums/set hashable (vitessio#17936) (vitessio#17991)
  Signed-off-by: Gene Parmesan Thomas <201852096+gopoto@users.noreply.github.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Fix tablet selection in `vtctld BackupShard` (vitessio#18002) (vitessio#18025)
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Set proper join vars type for the RHS field query in OLAP (vitessio#18028) (vitessio#18038)
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
  Signed-off-by: Andres Taylor <andres@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
  Co-authored-by: Andres Taylor <andres@planetscale.com>
* [release-21.0] Use release branches for upgrade downgrade tests (vitessio#18029) (vitessio#18035)
  Signed-off-by: Manan Gupta <manan@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Test: Increase query timeout to fix flaky test 'TestQueryTimeoutWithShardTargeting' (vitessio#18016) (vitessio#18040)
  Signed-off-by: Harshit Gangal <harshit@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] fix: App and Dba Pool metrics (vitessio#18048) (vitessio#18084)
  Signed-off-by: Harshit Gangal <harshit@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Upgrade the Golang version to `go1.23.8` (vitessio#18092)
  Signed-off-by: GitHub <noreply@github.com>
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
  Co-authored-by: frouioui <frouioui@users.noreply.github.com>
  Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>
* [release-21.0] bugfix: allow window functions when possible to push down (vitessio#18103) (vitessio#18105)
  Signed-off-by: Andres Taylor <andres@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
  Co-authored-by: Andres Taylor <andres@planetscale.com>
* [release-21.0] VDiff: Fix logic for reconciling extra rows (vitessio#17950) (vitessio#18072)
  Signed-off-by: Rohit Nayak <rohit@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] VStream API: Reset stopPos in catchup (vitessio#18119) (vitessio#18122)
  Signed-off-by: Noble Mittal <noblemittal@outlook.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Fix `Reshard Cancel` behavior (vitessio#18020) (vitessio#18080)
  Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
  Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>
* Fix backup shard copy paste error (vitessio#18100)
  Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
* [release-21.0] go/vt/vtgate: take routing rules into account for traffic mirroring (vitessio#17953) (vitessio#17994)
  Signed-off-by: Max Englander <max@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
  Co-authored-by: Max Englander <max@planetscale.com>
* [release-21.0] Bugfix: Missing data when running vtgate outer joins (vitessio#18036) (vitessio#18044)
  Signed-off-by: Andres Taylor <andres@planetscale.com>
  Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
* [release-21.0] Filter out tablets with unknown replication lag when electing a new primary (vitessio#18004) (vitessio#18075)
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
  Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
* [release-21.0] Fix: Ensure Consistent Lookup Vindex Handles Duplicate Rows in Single Query (vitessio#17974) (vitessio#18078)
  Signed-off-by: Harshit Gangal <harshit@planetscale.com>
  Co-authored-by: Harshit Gangal <harshit@planetscale.com>
* [release-21.0] Code Freeze for `v21.0.4` (vitessio#18135)
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
* [release-21.0] Release of `v21.0.4` (vitessio#18136)
  Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
* test fix
  Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
* Revert "test fix"
  This reverts commit 55aec5c.
---------
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Frances Thai <notfelineit@gmail.com>
Signed-off-by: Renan Rangel <rrangel@slack-corp.com>
Signed-off-by: Gene Parmesan Thomas <201852096+gopoto@users.noreply.github.com>
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
Signed-off-by: Max Englander <max@planetscale.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: vitess-bot <139342327+vitess-bot@users.noreply.github.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: frouioui <frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Andrés Taylor <andres@planetscale.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Renan Rangel <rvrangel@users.noreply.github.com>
Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: Max Englander <max@planetscale.com>
Description
Vitess’ DISTINCT (and grouping) primitives rely on `evalengine.NullsafeHashcode` to deduplicate rows coming back from scatter queries. While most value types implement the `hashable` interface so our default hashing can recognize them, both `evalEnum` and `evalSet` did not. The result is that any query that tries to sort or de-dup on an `ENUM` or `SET` column across shards would error out with `unexpected type ENUM`.

This PR adds `Hash` implementations for both enum and set evals so that `DISTINCT`, `GROUP BY`, hash join and similar code paths can hash these types properly. For enum/set values we hash the numeric ordinal/bitset when all values are known, and fall back to hashing the raw string under a binary collation when we can't resolve the value.
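
As a rough illustration of the ordinal-or-raw-string strategy described above, the following self-contained Go sketch hashes hypothetical enum and set values. The types `enumValue` and `setValue`, the tag bytes, the helper names, and the use of FNV-1a are all assumptions made for this example; they are not the evalengine types or the hasher used in this PR.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"strings"
)

// Tag bytes keep resolved and unresolved hashes in separate domains.
// These values are hypothetical, chosen only for this sketch.
const (
	tagResolved byte = 1
	tagRaw      byte = 2
)

// enumValue and setValue are hypothetical stand-ins: the raw string from
// the row plus the members declared on the column (nil when unknown).
type enumValue struct {
	raw     string
	members []string
}

type setValue struct {
	raw     string
	members []string
}

// indexOf returns the position of s in members, or -1 if it is absent.
func indexOf(members []string, s string) int {
	for i, m := range members {
		if m == s {
			return i
		}
	}
	return -1
}

// hashResolved hashes a numeric enum ordinal or set bitset.
func hashResolved(n uint64) uint64 {
	var buf [9]byte
	buf[0] = tagResolved
	binary.LittleEndian.PutUint64(buf[1:], n)
	h := fnv.New64a()
	h.Write(buf[:])
	return h.Sum64()
}

// hashRaw hashes the raw bytes unchanged, which is what equality under a
// binary collation amounts to.
func hashRaw(raw string) uint64 {
	h := fnv.New64a()
	h.Write([]byte{tagRaw})
	h.Write([]byte(raw))
	return h.Sum64()
}

// hashEnum uses the 1-based ordinal when the value is a known member and
// falls back to the raw string otherwise.
func hashEnum(e enumValue) uint64 {
	if i := indexOf(e.members, e.raw); i >= 0 {
		return hashResolved(uint64(i + 1))
	}
	return hashRaw(e.raw)
}

// hashSet builds a bitset from the comma-separated elements and falls
// back to the raw string as soon as one element cannot be resolved.
func hashSet(s setValue) uint64 {
	var bits uint64
	if s.raw != "" {
		for _, el := range strings.Split(s.raw, ",") {
			i := indexOf(s.members, el)
			if i < 0 || i >= 64 {
				return hashRaw(s.raw)
			}
			bits |= 1 << uint(i)
		}
	}
	return hashResolved(bits)
}

func main() {
	colors := []string{"red", "green", "blue"}
	known := enumValue{raw: "green", members: colors}
	unknown := enumValue{raw: "green"}
	fmt.Println(hashEnum(known) == hashResolved(2))
	fmt.Println(hashEnum(unknown) == hashRaw("green"))
	fmt.Println(hashSet(setValue{raw: "red,blue", members: colors}) ==
		hashSet(setValue{raw: "blue,red", members: colors}))
}
```

In this sketch a resolved value and an unresolved value with the same text hash differently because of the tag byte; whether that is desirable depends on how the engine compares such values, which the sketch does not try to model.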

To catch this in end-to-end tests, the `aggregation` test keyspace is updated with a simple `example` table that has an enum column, and a new assertion is added to the DISTINCT test to exercise `SELECT DISTINCT foo FROM example;`, which fails prior to the hashable fix and passes afterwards.
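
To make the before/after behaviour of that assertion concrete, here is a minimal Go sketch of hash-based deduplication over rows gathered from several shards. Everything in it is hypothetical: the `hasher` interface, the `enumValue` and `rawEnum` types, and the `distinct` function are simplified stand-ins and not Vitess code. A value without a hash method triggers an "unexpected type" error, while a hashable value deduplicates cleanly.

```go
package main

import (
	"errors"
	"fmt"
)

// hasher is a hypothetical, simplified counterpart of the hashable idea:
// a value that can produce a key for deduplication.
type hasher interface {
	hashKey() uint64
}

// enumValue models an ENUM cell after the fix: it can be hashed by ordinal.
type enumValue struct {
	ordinal uint16
	label   string
}

func (e enumValue) hashKey() uint64 { return uint64(e.ordinal) }

// rawEnum models an ENUM cell before the fix: it has no hashKey method.
type rawEnum string

var errUnexpectedType = errors.New("unexpected type")

// distinct merges the rows of every shard and keeps the first row seen for
// each hash key. A real engine also has to deal with hash collisions; this
// sketch assumes every distinct value has a unique key.
func distinct(shards ...[]any) ([]any, error) {
	seen := make(map[uint64]struct{})
	var out []any
	for _, rows := range shards {
		for _, row := range rows {
			h, ok := row.(hasher)
			if !ok {
				return nil, fmt.Errorf("%w %T", errUnexpectedType, row)
			}
			key := h.hashKey()
			if _, dup := seen[key]; dup {
				continue
			}
			seen[key] = struct{}{}
			out = append(out, row)
		}
	}
	return out, nil
}

func main() {
	shard1 := []any{enumValue{1, "a"}, enumValue{2, "b"}}
	shard2 := []any{enumValue{2, "b"}, enumValue{3, "c"}}

	rows, err := distinct(shard1, shard2)
	fmt.Println(len(rows), err)

	_, err = distinct([]any{rawEnum("a")})
	fmt.Println(err)
}
```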
Related Issue(s)
Fixes #17676
Deployment Notes
None.
This PR was generated by an AI system in collaboration with maintainers @rbranson, @GuptaManan100, @rohit-nayak-ps, @systay.