Determine automatically if push join to table scan #6818
losipiuk wants to merge 2 commits into trinodb:master
Conversation
Force-pushed 5513f17 to 92dbbb0
What if one of the left or right output row counts is known and is larger than the join output row count; why not push down the join in such a case as well?
Yeah, we could, though it is a strictly theoretical case: if we do not know either the left or the right size, we would not know the join size :)
Ah right, I missed that. Any particular reason for basing this on row count instead of size?
Not really. Probably size would be more appropriate. I will see how painful it is to change that.
I plan to review this once that one is merged.
Force-pushed 92dbbb0 to 0e8b453
It is safe to make it the default.
Since stats calculation can be costly (e.g. it can involve a trip to the metastore), short-circuit the calculation as early as you can.
To keep this readable, please extract the condition to a separate method.
Ideally use a switch, and make it exhaustive, future-proofing for the case when we add something like AUTOMATIC_EAGER (which we don't have to add yet, but may want to add in the future).
Never mind; in this case it doesn't matter: this is the only place the enum is used, so there is no way it gets forgotten and not updated.
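For illustration, an exhaustive switch over the mode enum could look like the sketch below. The enum mirror, the method name, and the idea of an AUTOMATIC_EAGER constant are all hypothetical, taken only from this discussion, not from the actual Trino code:

```java
// Hypothetical mirror of the JoinPushdownMode enum from the discussion.
enum JoinPushdownMode { EAGER, AUTOMATIC }

class PushdownModeExample {
    // A switch expression with no default branch is checked for
    // exhaustiveness by the compiler (Java 14+): adding a hypothetical
    // AUTOMATIC_EAGER constant later would turn this into a compile
    // error until the new case is handled.
    static boolean consultStatistics(JoinPushdownMode mode) {
        return switch (mode) {
            case EAGER -> false;
            case AUTOMATIC -> true;
        };
    }
}
```

With only one call site, as noted above, the safety net matters less, but the pattern costs nothing.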
While this is not rocket science, it'd be nice to add a comment, e.g. why we're choosing + over max.
From my perspective it was a "random thought from findepi" (and I don't feel strongly), but still, let's save future readers the suffering and try to word an explanation.
I added some reasoning. Not sure if it is helpful.
Force-pushed 0e8b453 to 70680d7
See the conversation about code-level documentation in the other PR.
Added comment as a separate commit before introducing AUTOMATIC mode.
I think the getJoinPushdownMode should be consulted inside shouldProceedWithPushDown
(or you'd want to rename the method to indicate it's appropriate for "automatic" mode only)
Renamed the method to skipJoinPushdownBasedOnCost (reversing true/false return value semantics), and moved getJoinPushdownMode(context.getSession()) == JoinPushdownMode.AUTOMATIC inside.
Add "automatic" mode of join pushdown operation. In that mode the join will only be pushed down into the table scan if statistics are available for the join node and both source table scan nodes, and if the expected number of rows coming out of the join is less than the total number of rows from both sources.
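The decision described in this commit message could be sketched roughly as follows. This is a simplified stand-in, assuming NaN signals unavailable statistics; the method name follows the rename mentioned above, but the signature and stats plumbing are invented for illustration:

```java
// Sketch of the AUTOMATIC-mode decision: skip pushdown unless statistics
// are known for the join and both sources, and the join output is smaller
// than the combined source output. All names here are hypothetical.
class JoinPushdownCostCheck {
    static boolean skipJoinPushdownBasedOnCost(
            double joinOutputRowCount,   // NaN when statistics are unavailable
            double leftOutputRowCount,
            double rightOutputRowCount)
    {
        // Short-circuit: without statistics we cannot reason about cost,
        // so conservatively skip the pushdown.
        if (Double.isNaN(joinOutputRowCount)
                || Double.isNaN(leftOutputRowCount)
                || Double.isNaN(rightOutputRowCount)) {
            return true;
        }
        // Push down only if the join is expected to reduce the amount of
        // data flowing into Trino, i.e. its output is smaller than the
        // sum of its inputs ("+ over max" per the review discussion).
        return joinOutputRowCount > leftOutputRowCount + rightOutputRowCount;
    }
}
```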
Force-pushed 70680d7 to 2e3ebdf
@@ -135,16 +135,7 @@ public class FeaturesConfig
    private DataSize filterAndProjectMinOutputPageSize = DataSize.of(500, KILOBYTE);
Even if the number of rows after pushdown is smaller than without pushdown, it could significantly increase the CPU overhead of the underlying source (table scans might be much cheaper than joins). I think it would be great to determine what the impact of pushdown is on the underlying connectors. It could be that join pushdown is beneficial only when joins are very non-selective and users don't want the CPU usage of the underlying connector to increase significantly.
Agreed. Yet I would assume that you will still be able to disable pushdown at the per-connector level in configuration, as well as per-query using a session property.
Totally -- #6874 provides both catalog level config and session toggle.
    return true;
}

if (joinOutputSize > leftOutputSize + rightOutputSize) {
Consider adding some factor here, e.g. the pushed-down join should produce 2x fewer rows than in Trino. Such a factor might need to be established empirically.
So you mean to replace left + right with max(left, right) * 0.5? Works for me, given that the current formula is not very scientifically determined.
I think we should do "something reasonable" and iterate.
Yeah, I find an initial factor value of 1.0 as good as 0.5.
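The factor idea from this exchange could be folded into the check like this. PUSHDOWN_FACTOR is an invented knob, not an actual Trino config option; with 1.0 it reproduces the plain left + right comparison, while 0.5 would demand a join twice as selective before pushing down:

```java
// Sketch of a tunable selectivity threshold for join pushdown.
// PUSHDOWN_FACTOR is hypothetical and would need to be established
// empirically, as suggested in the review.
class FactoredPushdownCheck {
    static final double PUSHDOWN_FACTOR = 1.0;

    static boolean skipJoinPushdownBasedOnCost(
            double joinOutputSize, double leftOutputSize, double rightOutputSize)
    {
        // Skip pushdown unless the join output is smaller than the scaled
        // combined input size.
        return joinOutputSize > PUSHDOWN_FACTOR * (leftOutputSize + rightOutputSize);
    }
}
```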
On top of: #6752. Review last commit only.