Skip to content

Fire PushPredicateIntoTableScan rule with change in TableHandle#3470

Merged
martint merged 5 commits intotrinodb:masterfrom
phd3:hive-predicate-pushdown-fix
Apr 23, 2020
Merged

Fire PushPredicateIntoTableScan rule with change in TableHandle#3470
martint merged 5 commits intotrinodb:masterfrom
phd3:hive-predicate-pushdown-fix

Conversation

@phd3
Copy link
Copy Markdown
Member

@phd3 phd3 commented Apr 18, 2020

If there's a constraint in the filter node that the connector can "enforce", PushPredicateIntoTableScan modifies the plan to incorporate the new scan and residue filter.

In case of non-partition pushdown for hive, this is not true. The "compactEffectivePredicate" added to the table handle may or may not be satisfied by the HivePageSource (eg. ORC vs Avro). So we cannot get rid of those predicates from the FilterNode. But even though HiveMetadata::applyFilter returns a ConstraintApplicationResult, PushPredicateIntoTableScan throws it away, since arePlansSame doesn't consider change in table handles, if the enforced constraint hasn't changed. eg. Test added in TestConnectorPushdownRulesWithHive fails without this change.

Predicates on top level columns still get pushed down to ORC/Parquet without this change, because AddExchanges::visitFilter adds the predicates, but I think the predicates should get pushed down in PushPredicateIntoTableScan.

I added connector table handle comparison in arePlansSame. (That failed some tests initially due to the fact that BucketFilter equality was object reference based, now they pass). But I'm not sure if comparing connector table handles is okay, or we need a better approach to tell that there was "some level" of predicate pushdown, even if it's not guaranteed to be enforced.

This change is also required for #1720 and #2672 to work together to pushdown predicates on nested columns.

@cla-bot cla-bot bot added the cla-signed label Apr 18, 2020
@phd3 phd3 changed the title Pushdown unenforced constraints to Hive through PushPredicateIntoTableScan Fire PushPredicateIntoTableScan rule with change in ConnectorTableHandle Apr 18, 2020
@phd3 phd3 requested a review from martint April 18, 2020 00:02
@phd3 phd3 force-pushed the hive-predicate-pushdown-fix branch from 58c1602 to 2941c56 Compare April 18, 2020 21:39
@phd3 phd3 force-pushed the hive-predicate-pushdown-fix branch from 2941c56 to d0e0ed0 Compare April 21, 2020 21:15
@phd3 phd3 changed the title Fire PushPredicateIntoTableScan rule with change in ConnectorTableHandle Fire PushPredicateIntoTableScan rule with change in TableHandle Apr 21, 2020
@phd3 phd3 force-pushed the hive-predicate-pushdown-fix branch 2 times, most recently from 3f7e2aa to 28180d9 Compare April 21, 2020 22:49
@phd3
Copy link
Copy Markdown
Member Author

phd3 commented Apr 21, 2020

I've verified equals and hashCode methods for all ConnectorTableHandle implementations, and made modifications where required to incorporate additional fields for comparison.

Also added TableHandle equality method to address #3470 (comment). Using TableHandle comparison requires ConnectorTableLayoutHandle to be compared too. we want to get rid of the usage altogether now, but I think it can be taken up in a separate PR.

I've added the changes as separate commits right now for ease of review, will coalesce them into one if this feels noisy.

@martint martint self-requested a review April 22, 2020 14:52
@phd3 phd3 force-pushed the hive-predicate-pushdown-fix branch from 3f3554a to df81674 Compare April 22, 2020 22:08
@martint martint merged commit 13d6b29 into trinodb:master Apr 23, 2020
@martint martint added this to the 333 milestone Apr 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants