Fix infinite loop during planning caused by PickTableLayout by LiBrian415 · Pull Request #16627 · prestodb/presto

LiBrian415 · 2021-08-19T01:25:43Z

This diff is a fix for an edge case that causes infinite loops in the PickTableLayout$pushPredicateIntoTableScan.
Specifically, if a presto query contains a predicate that compares a complex field (array, map, ...) with a value of the
same type e.g. some_array_field <> array[], pushPredicateIntoTableScan will infinitely loop until it times out or until
a stackoverflowerror. The fact that these are complex fields might not matter, it's just that the value of these
types are generally already blocks.

While the planning timing out makes sense, the stackoverflowerror needs a bit of explaining. In the constructor
of TableScanNode, it wraps the output and assignment collections in unmodifiablecollections. While this is fine
on it's own, it becomes a problem when you have an infinite loop in pushPredicateIntoTableScan. Looking at the
implementation of pushPredicateIntoTableScan, each the it's called, it creates a new TableScanNode by passing in
values for the previous TableScanNodes. This creates N-depth wrapped unmodifiableCollections where N is the number
of pushPredicateIntoTableScan called on the same node. Because of how unmodifiablecollections are implemented,
this then causes N calls to stream(), toArray(), iterator(), ... each time one is called.

Now, that we know the end result is an infinite loop that causes a timeout or a stackoverflow, the remaining
question is why does this happen in the first place. After some investigation, this seems to happen because of
how domains are gathered during the pushPredicateIntoTableScan. Specifically, using a BASIC_COLUMN_EXTRACTOR,
predicates are extracted as domains if they take the from of a simple <VariableReferenceExpression, Anything>.
This also explains why predicates like map_key(<some_map>) = array[] still work since it'll be more complex than
the pattern above so they won't be pulled in as a domain. When creating a domain, the values are then converted to
their native block types. However, since these fields are generally not hive partitions, when a layout is generated,
they'll be considered as unenforcedConstraints. However, due to the processing of predicate -> domain -> predicate,
the resultingPredicates will have the same value but they're actually different objects. Since these Block subclasses don't
override equals, when the PickTableLayout checks if there has been some transformation (the optimization produced
a different plan structure), it'll always think something has changed. However, the only thing that changed was the
object, but the information contained by the blocks are still the same.

The fix here is just to override equals and hashcode of blocks.

tldr; Marker creation in Domain creation constructs new blocks from old blocks. This causes issues when checking for
block equality as blocks don't override equals.

Note: this wasn't tested against the latest version of Presto.

v-jizhang · 2021-08-27T22:22:06Z

Hi @LiBrian415, I'm trying to understand how an infinite loop in pushPredicateIntoTableScan. Do you have repro steps to share? Thanks

LiBrian415 · 2021-08-31T01:25:46Z

Hi @v-jizhang. Thanks for taking a look. I was able to reproduce it locally on the latest version of presto by following: https://stackoverflow.com/questions/48932907/setup-standalone-hive-metastore-service-for-presto-and-aws-s3. For visibility, I've added the configs and commands that I used below.
hive.properties

connector.name=hive-hadoop2
hive.metastore=file
hive.metastore.catalog.dir=file:///tmp/hive_catalog
hive.metastore.user=<user_name>

create schema hive.default;
use hive.default;
create table t (a array<varchar>);
insert into t (values array['hi']);
select * from t where a <> array[];

After running select * from t where a <> array[];, I'll get a statement is too large (stack overflow during analysis) message and when I use the debugger on intellij, the planner seems to be stuck in the infinite mentioned before. I've only included a table with an array but using a map and comparing the col to map() would also cause the same infinite loop.

v-jizhang · 2021-09-07T23:01:35Z

@LiBrian415 Thanks, I can confirm the infinite loop exists and your fix look good to me.

v-jizhang

Should GroupByIdBlock$hashCode and LazyBlock$hashCode be implemented also? Thanks

kaikalur · 2021-09-08T01:00:45Z

I'm not able to repro with the latest version.

v-jizhang · 2021-09-08T17:15:19Z

I'm not able to repro with the latest version.

I've double checked. It is still in the current release https://prestodb.io/download.html and the current code.
I also tested this PR and I can confirm the bug is fixed by this PR.

v-jizhang

Should GroupByIdBlock$hashCode and LazyBlock$hashCode be implemented also? Thanks

LiBrian415 · 2021-09-09T06:14:29Z

Hi @v-jizhang, thanks for double checking that the issue is in the current release. I don't think overriding the methods for LazyBlock and GroupByIdBlock is necessary. LazyBlock seems to be only used around page sources, which is after the plan/optimization steps, and it's unlikely that there exists a native type that will be converted to a GroupByIdBlock.

tldr; Since the issue is caused by comparing blocks created from the conversation of native types to blocks during the optimization step, we'll only need to override hashCode and equals for blocks that can be potentially created when converting native values to block.

v-jizhang

LGTM

LiBrian415 · 2021-09-10T00:45:12Z

Thanks for the review. :D

rongrong · 2021-10-01T21:20:59Z

@LiBrian415 can you add a test case to AbstractTestQueries? Thanks!

highker · 2021-10-07T19:57:07Z

+1, @LiBrian415, could you add a test case that could repro the issue without your patch?

rohanpednekar · 2022-04-26T19:05:22Z

@LiBrian415 Hope all is well with you. Just checking in to see if you are still planning to work on this PR?
@v-jizhang Do you think you can pick this up further to get it merged if needed? Thanks!

LiBrian415 · 2022-04-27T07:37:02Z

@rohanpednekar Hi, sorry about that. I've been busy with school for the last few months and I'm not sure when I'll have time to finish this up. I'd like to have someone else pick it up if it's still useful. Thanks.

v-jizhang · 2022-05-03T23:01:25Z

Replaced by #17720

Replaces prestodb#16627 Co-authored-by: Brian Li <librian415@gmail.com>

Replaces #16627 Co-authored-by: Brian Li <librian415@gmail.com>

LiBrian415 changed the title ~~Fix infinite loop in planning stage caused by predicates involving array[]~~ Fix infinite loop during planning caused by PickTableLayout Aug 19, 2021

v-jizhang self-requested a review August 27, 2021 22:18

Override equals and hashcode for Blocks (excluding LazyBlock)

35a94ce

LiBrian415 force-pushed the array-predicate-fix branch from 3e76154 to 35a94ce Compare August 31, 2021 01:38

v-jizhang requested changes Sep 8, 2021

View reviewed changes

v-jizhang approved these changes Sep 9, 2021

View reviewed changes

rohanpednekar added the ready-to-merge label Sep 30, 2021

rongrong requested review from highker and mbasmanova October 1, 2021 21:19

highker approved these changes Oct 7, 2021

View reviewed changes

highker self-assigned this Oct 7, 2021

highker removed their assignment Oct 22, 2021

rohanpednekar added awaiting-community-help and removed ready-to-merge labels Apr 26, 2022

v-jizhang mentioned this pull request May 3, 2022

Fix infinite loop during planning caused by PickTableLayout #17720

Merged

v-jizhang closed this May 3, 2022

v-jizhang added a commit to v-jizhang/presto that referenced this pull request May 10, 2022

Fix infinite loop during planning caused by PickTableLayout

cf87ecc

Replaces prestodb#16627 Co-authored-by: Brian Li <librian415@gmail.com>

rschlussel pushed a commit that referenced this pull request May 10, 2022

Fix infinite loop during planning caused by PickTableLayout

bd48fc3

Replaces #16627 Co-authored-by: Brian Li <librian415@gmail.com>

Conversation

LiBrian415 commented Aug 19, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

v-jizhang commented Aug 27, 2021

Uh oh!

LiBrian415 commented Aug 31, 2021

Uh oh!

v-jizhang commented Sep 7, 2021

Uh oh!

v-jizhang left a comment

Choose a reason for hiding this comment

Uh oh!

kaikalur commented Sep 8, 2021

Uh oh!

v-jizhang commented Sep 8, 2021

Uh oh!

v-jizhang left a comment

Choose a reason for hiding this comment

Uh oh!

LiBrian415 commented Sep 9, 2021

Uh oh!

v-jizhang left a comment

Choose a reason for hiding this comment

Uh oh!

LiBrian415 commented Sep 10, 2021

Uh oh!

rongrong commented Oct 1, 2021

Uh oh!

highker commented Oct 7, 2021

Uh oh!

rohanpednekar commented Apr 26, 2022

Uh oh!

LiBrian415 commented Apr 27, 2022

Uh oh!

v-jizhang commented May 3, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

LiBrian415 commented Aug 19, 2021 •

edited

Loading