-
-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/{plan,analyzer}: Add IndexedInSubqueryFilter for selecting the subquery results first and then making indexed lookups into the child. #299
Conversation
…bquery results first and then making indexed lookups into the child.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good, just a couple comments
if !isGetField || !isSubquery || !isResolved { | ||
return nil | ||
} | ||
referencesChildRow := nodeHasGetFieldReferenceBetween(subq.Query, len(scope.Schema()), len(scope.Schema())+len(node.Schema())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You want node.Child.Schema here (should be the same since it's a Filter, but in principle this can vary)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it true? It seems like I want to know if it references this node's schema...if this node had more columns than node.Child
, for example, I would will want to know if it referenced those columns...right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A GetField expression always applies to rows coming from the child of the current node. Think about Project nodes. They can have a totally different schema from the node underneath them.
// range [low, high). | ||
func nodeHasGetFieldReferenceBetween(n sql.Node, low, high int) bool { | ||
var found bool | ||
plan.Inspect(n, func(n sql.Node) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can simplifiy this with plan.InspectExpressions (no need to do plan.Inspect and then sql.Inspect separately)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think InspectExpressions
or InspectExpressionWithNode
lets me do the shenanigans with excluding the walk into IndexedInSubqueryFilter
nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's true
func (i *IndexedInSubqueryFilter) RowIter(ctx *sql.Context, row sql.Row) (sql.RowIter, error) { | ||
padded := make(sql.Row, len(row)+i.Padding) | ||
copy(padded[:], row[:]) | ||
var res []interface{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sql.Row?
|
||
type indexedInSubqueryIter struct { | ||
Ctx *sql.Context | ||
Rows []interface{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[]sql.Row
if i.I >= len(i.Rows) { | ||
return nil, io.EOF | ||
} | ||
iter, err := i.Child.RowIter(i.Ctx, sql.NewRow(i.Rows[i.I])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be no need for sql.NewRow here after the above change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know...this is a little janky. EvalMultiple
isn't returning []sql.Row
, it's returning []interface{}
, where each entry in the slice is a sql engine value. But the call to Child.RowIter(...)
is expecting a sql.Row
with a single column which is that value. I think it's typed appropriately as is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a good point
} | ||
|
||
func (i *IndexedInSubqueryFilter) RowIter(ctx *sql.Context, row sql.Row) (sql.RowIter, error) { | ||
padded := make(sql.Row, len(row)+i.Padding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You sure this padding is necessary? A subquery expects to get a row for the outer scope. Why does it need nils on the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have a getField expression that reaches into the nil part of the row given, it's not going to give you a correct result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's necessary because I didn't rewrite the subquery. It's going to append its rows onto the end of the row that comes in, and it's going to be allowed to do field accesses into the prefix and the suffix of that row. Based on the inspection, it shouldn't access these nil
values though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I get it now. The field indexes in the subquery are based off analysis of the subquery expression using the scope node, and you're not giving it this scope node. You're giving it a fake one you construct that has the same length. And these null vals won't get used because this node won't be constructed in that case.
Some time I'm going to have to clean up the analyzer to make scope management more principled / consistent.
@zachmu Took a pass but didn't address everything. TAL if you want :). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// range [low, high). | ||
func nodeHasGetFieldReferenceBetween(n sql.Node, low, high int) bool { | ||
var found bool | ||
plan.Inspect(n, func(n sql.Node) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's true
No description provided.