select *: expand column list for tables with authoritative column lists #4336
demmer merged 5 commits into vitessio:master
@demmer @shlomi-noach
cc @tomkrouper
nit: typo Primtive -> Primitive
Here you are checking the number of rows, but the error talks about the number of columns. By design?
actually, result.Rows[0] gets us the first row, which is a slice(array) consisting of column values.
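To make the rows-versus-columns point concrete, here is a minimal sketch. The `result` type and `columnCount` helper are hypothetical stand-ins (plain strings instead of the real value type), not the code under review:

```go
package main

import "fmt"

// result is a hypothetical stand-in for a query result:
// each row is a slice of column values.
type result struct {
	Rows [][]string
}

// columnCount returns the number of columns in the first row.
// It reports an error when there is no first row to inspect.
func columnCount(r result) (int, error) {
	if len(r.Rows) == 0 {
		return 0, fmt.Errorf("result has no rows")
	}
	// Rows[0] is one row, so its length is a column count, not a row count.
	return len(r.Rows[0]), nil
}

func main() {
	r := result{Rows: [][]string{
		{"1", "alice", "a@example.com"},
		{"2", "bob", "b@example.com"},
	}}
	n, err := columnCount(r)
	if err != nil {
		panic(err)
	}
	fmt.Printf("rows=%d columns=%d\n", len(r.Rows), n) // rows=2 columns=3
}
```

So `len(result.Rows)` counts rows, while `len(result.Rows[0])` counts columns, which is why the error message can legitimately talk about columns.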
go/vt/vtgate/planbuilder/expr.go
Outdated
I don't understand most of this code, but my spidey sense is tingling here. Should this if not have a corresponding else?
Never mind, I think I get it now.
You should review #3923 first :)
This PR is on top of that.
go/vt/vtgate/planbuilder/symtab.go
Outdated
Also FWIW I think calling this columnNames is sufficient. It's self-evidently ordered from the fact that it's an array.
proto/vschema.proto
Outdated
It seems to me that is_authoritative is too broad of an attribute name for this feature.
After all the vschema is always authoritative for the sharding controls, etc.
How about we keep this more focused and call the attribute "authoritative_columns" or something of that sort.
Also typo "ot" above.
This is a calculated risk. My prediction is that the next step will be to pull the schema info from vttablet, at which time the meaning of this flag will become more accurate.
If you think it will backfire, I'll change it.
IMO this potential future is all the more reason to use a more targeted configuration now that takes into account future expansions. Specifically, once the schema is propagated from the vttablet, I'd expect that by default we wouldn't want to set this flag and that instead all information would be pulled from the tablets.
What if we replaced the is_authoritative boolean with a schema_source enum containing options:
- `DEFAULT` -- legacy behavior, eventually to be enhanced with schema info pulled from vttablet
- `STATIC` -- the new behavior being added in this PR
- `NONE` -- same as legacy behavior for now, but will disable the feature where schemas are pulled from vttablet.
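A rough sketch of how the three proposed options might be consumed on the Go side. All names here (`SchemaSource`, `expandsStar`, the constants) are hypothetical illustrations of the proposal, not existing vschema fields or planner code:

```go
package main

import "fmt"

// SchemaSource mirrors the proposed schema_source enum (hypothetical).
type SchemaSource int

const (
	// SchemaSourceDefault: legacy behavior; could later use schema info from vttablet.
	SchemaSourceDefault SchemaSource = iota
	// SchemaSourceStatic: trust the column list written in the vschema.
	SchemaSourceStatic
	// SchemaSourceNone: legacy behavior; never pull schema info from vttablet.
	SchemaSourceNone
)

// expandsStar reports whether "select *" would be expanded under this sketch.
// Only the static source carries a column list the planner can trust today.
func expandsStar(s SchemaSource) bool {
	return s == SchemaSourceStatic
}

func main() {
	for _, s := range []SchemaSource{SchemaSourceDefault, SchemaSourceStatic, SchemaSourceNone} {
		fmt.Printf("source=%d expandsStar=%v\n", s, expandsStar(s))
	}
}
```

The point of the enum over a boolean is that `DEFAULT` and `NONE` behave the same now but can diverge later without another schema change.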
If we go down this road we have a decision to make around the weight_string sorting changes made in #3522. IMO a consistent plan would be to couple the enabling of those changes with the changes in this PR and make it so that by default the columns definition in the vschema is ignored, but that if the schema_source is static then we enable both the weight_string replacement and the expanded column list for select * changes in this PR.
That approach would, however, break compatibility for anyone who is dependent on the current weight_string implementation and uses select * (including Slack FWIW) since users would need to be sure to fill in all the column names into the vschema and enable the new feature.
The other approach would be to control only the new feature with the flag and leave the weight_string sorting alone. This is better for backwards compatibility but would be less internally consistent.
As far as I can tell, #3522 and this feature are fairly orthogonal. Column expansion happens early, at the time of select expression parsing, and weight_string decision is made later during order by processing. So, at that time, the columns will already be expanded out, which will cause weight_string to be added if needed.
I'll add a test case to demonstrate this behavior, which will also help me prove to myself that this works as intended.
As for is_authoritative: you seem to think it's not appropriate. I'm fine changing it to authoritative_columns.
The reason I think the two are linked isn't anything related to the code that does the expansion, it's that weight_string ordering and select * expansion both depend on the column definitions being manually populated in the vschema.
So it seems to me that we should take both features into account when we start talking about ways to control whether or not vtgate should respect the column definitions in the vschema to inform the query planner or whether it should operate as best it can without knowing the schema.
I'm ok to go ahead with authoritative_column_list instead of the enum approach outlined above. We may need to revisit in the future assuming we pull the schema list from vttablet.
go/vt/vtgate/planbuilder/symtab.go
Outdated
If this is truly unreachable why not return an appropriate error?
Actually, the code is reachable for joining an information_schema table with another. I've updated the comment. Nevertheless, the case itself is not an error condition. The behavior is as intended. I somehow assumed the higher level checks made this code path unreachable.
go/vt/vtgate/planbuilder/symtab.go
Outdated
Minor nit, isn't this supposed to be friendlier on the GC:
tables := make([]*table, 0, len(st.orderedTables))
Done. I also had to add a guard for length==0, which is why I went with the simpler code at the cost of GC overhead. Without the guard, the nil check fails.
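For readers wondering why the guard matters, here is a small sketch of the nil-versus-empty distinction. The `collectSimple` and `collectPrealloc` helpers are hypothetical; the assumption is that the caller (or a test) compares the returned slice against nil:

```go
package main

import "fmt"

type table struct{ name string }

// collectSimple appends to a nil slice, so it returns nil when names is empty.
func collectSimple(names []string) []*table {
	var tables []*table
	for _, n := range names {
		tables = append(tables, &table{name: n})
	}
	return tables
}

// collectPrealloc reserves capacity up front to avoid regrowth.
// The guard keeps the "nothing found" result nil; without it,
// make would hand back a non-nil empty slice.
func collectPrealloc(names []string) []*table {
	if len(names) == 0 {
		return nil
	}
	tables := make([]*table, 0, len(names))
	for _, n := range names {
		tables = append(tables, &table{name: n})
	}
	return tables
}

func main() {
	fmt.Println(collectSimple(nil) == nil)   // true
	fmt.Println(collectPrealloc(nil) == nil) // true, thanks to the guard
	fmt.Println(make([]*table, 0, 0) == nil) // false: empty but not nil
}
```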
To expand '*' we need a list of ordered columns. Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
If this flag is set for a table, we'll treat the column list as authoritative and expand select *. Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Add logic to use new symtab metadata about authoritative tables to expand 'select *' expressions into the full column list. Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
sougou
left a comment
also renamed orderedTables->tableNames
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
I also had to fixup vtexplain to make it load vschemas with enums properly. Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
@sougou can you write a little guide here on how to use this feature? we just saw one of these and i was about to file a bug when github (very helpfully!) suggested 3788 and then i saw this PR.
#4711 adds vschema DDL commands to update these columns without uploading an entirely new vschema. Some opinions:
Constructs like `select * from t order by col` were previously rejected because VTGate was not able to predetermine the position where `col` would appear in the select list.

This PR allows you to specify a column list for tables as authoritative, which will allow vtgate to expand the `*` expression into the actual column list, thereby allowing it to identify the correct column position for post-processing constructs that refer to such columns.

Change list:
- Add `orderedColumns` to `table` in symtab, and wrap it with an `addColumn` func.
- Add `is_authoritative` to vschema, and transmit the value into `isAuthoritative` in `table`.
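A simplified sketch of the idea, for illustration only. The `table` struct, `expandStar`, and `columnPosition` below are hypothetical and much smaller than the actual symtab code; they only show how an authoritative, ordered column list lets the planner locate `col` in the select list:

```go
package main

import (
	"fmt"
	"strings"
)

// table is a simplified, hypothetical version of the symtab table entry.
type table struct {
	name            string
	columnNames     []string // kept in the order the columns were added
	isAuthoritative bool     // true when the column list is complete
}

// addColumn records a column while preserving insertion order.
func (tbl *table) addColumn(name string) {
	tbl.columnNames = append(tbl.columnNames, name)
}

// expandStar returns the select list that "*" stands for.
// It refuses when the column list is not authoritative.
func expandStar(tbl *table) ([]string, bool) {
	if !tbl.isAuthoritative {
		return nil, false
	}
	return append([]string(nil), tbl.columnNames...), true
}

// columnPosition returns the 0-based position of col in cols, or -1.
func columnPosition(cols []string, col string) int {
	for i, c := range cols {
		if strings.EqualFold(c, col) {
			return i
		}
	}
	return -1
}

func main() {
	tbl := &table{name: "t", isAuthoritative: true}
	for _, c := range []string{"id", "col", "extra"} {
		tbl.addColumn(c)
	}
	cols, ok := expandStar(tbl)
	if !ok {
		fmt.Println("cannot expand *: column list is not authoritative")
		return
	}
	fmt.Printf("select %s from %s order by col\n", strings.Join(cols, ", "), tbl.name)
	fmt.Println("order by position:", columnPosition(cols, "col"))
}
```

With the list marked authoritative, `order by col` resolves to a known position in the expanded select list; without it, the sketch declines to expand, matching the earlier rejection behavior.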