v3: support for aggregation and sorting of TEXT #3522
sougou merged 1 commit into vitessio:master from sougou:v3
LGTM
demmer left a comment
Some minor comments but overall this LGTM.
I would really love it if we could find an alternative to weight_string, but as discussed on Slack, that doesn't look possible given the golang utf-8 limitations.
go/sqltypes/result.go
Outdated
Can you rephrase this as "truncated to the specified number of columns"?
I was initially confused thinking that this would truncate the individual fields.
go/vt/vtgate/planbuilder/route.go
Outdated
It seems cleaner, and I suspect easier on the GC, to create weightStrings only when needed rather than always creating the empty array.
Unfortunately, this is a map. Unlike slices, maps have no autocreation support.
Deferring the creation will introduce conditionals in multiple places (nil-check->make).
Besides, this is planbuilder time, which should be low QPS ideally.
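To illustrate the point about maps versus slices, here is a minimal Go sketch. It is not taken from the PR: `lazyAdd` and the variable names are hypothetical. It shows that `append` works on a nil slice, while a write to a nil map needs the nil-check->make step at every call site.

```go
package main

import "fmt"

// lazyAdd is a hypothetical helper showing the nil-check->make pattern
// that deferred map creation would require wherever the map is written.
func lazyAdd(m map[string]int, key string, val int) map[string]int {
	if m == nil {
		m = make(map[string]int)
	}
	m[key] = val
	return m
}

func main() {
	// A nil slice can be appended to directly; append allocates as needed.
	var cols []string
	cols = append(cols, "a")

	// Reading from a nil map is safe, but assigning into it would panic,
	// so every writer has to go through something like lazyAdd.
	var weights map[string]int
	fmt.Println(len(weights), weights["a"])

	weights = lazyAdd(weights, "a", 1)
	fmt.Println(cols, weights)
}
```

Creating the map eagerly avoids repeating that conditional, at the cost of one small allocation at plan-building time.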
go/vt/vtgate/vindexes/lookup_hash.go
Outdated
I assume removing these is intentional (which I support) but that seems orthogonal to these changes.
Yeah. There was too much spam in tests.
This change adds support for aggregating and sorting of columns that may need to follow collation rules.

The feature depends on mysql's weight_string function that returns a lexically comparable value of any text column. If a group by or order by uses a text column like varchar, then v3 will additionally request the weight_string versions of such columns and perform ordering and aggregation using those values instead.

The identification of a text column is currently based on a new "columns" field in the vschema that allows you to specify the type of each column. This part was implemented in a previous PR.

Two approaches were possible:
1. Request additional weight_string values at the time of push-down.
2. Rewire the primitives to request and use the additional weight_string values during the Wireup phase.

I went with option 2 because it minimizes overall impact of the code, which will allow us to yank this behavior out when we implement collation aware sorting in vtgate. Also, since all the weight_string columns get added at the end, we only have to truncate the rows before returning what's needed.

The following changes were made:
* sqltypes: result truncate functionality.
* engine: Add TruncateColumnCount field that can be used to truncate a result if needed.
* planbuilder: Change Wireup to check and request weight_strings. If weight_string was requested, set the row truncation to make sure that the weight_string values don't get passed on.
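The idea of sorting on trailing weight_string columns and then truncating them can be sketched as follows. This is a simplified illustration, not the code in this PR. The `row` type and the `sortByWeight` and `truncateRows` functions are hypothetical stand-ins, and the weight bytes are made up to mimic a case-insensitive collation.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// row is a stand-in for a result row; each cell holds raw bytes.
type row [][]byte

// sortByWeight orders rows by byte-wise comparison of the column at
// weightCol, assumed to hold the weight_string value of a text column.
func sortByWeight(rows []row, weightCol int) {
	sort.SliceStable(rows, func(i, j int) bool {
		return bytes.Compare(rows[i][weightCol], rows[j][weightCol]) < 0
	})
}

// truncateRows keeps only the first n columns of every row, dropping the
// trailing helper columns. n == 0 is treated as "no truncation".
func truncateRows(rows []row, n int) []row {
	if n == 0 {
		return rows
	}
	out := make([]row, len(rows))
	for i, r := range rows {
		if n < len(r) {
			r = r[:n]
		}
		out[i] = r
	}
	return out
}

func main() {
	// Columns: name, weight_string(name). The weights are invented so
	// that "A" and "a" compare equal, as they might under a
	// case-insensitive collation.
	rows := []row{
		{[]byte("b"), []byte{0x42}},
		{[]byte("A"), []byte{0x41}},
		{[]byte("a"), []byte{0x41}},
	}
	sortByWeight(rows, 1)
	for _, r := range truncateRows(rows, 1) {
		fmt.Println(string(r[0]))
	}
}
```

Because the helper columns sit at the end of each row, hiding them from the caller is a single slice operation per row, which is what makes the truncation step cheap.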
@sougou Is this a known bug?