Skip to content

[Draft] Add support to filter out plans that use Timestamp with Timezone#23179

Draft
kgpai wants to merge 1 commit intoprestodb:masterfrom
kgpai:filter_out_timestamp_tz
Draft

[Draft] Add support to filter out plans that use Timestamp with Timezone#23179
kgpai wants to merge 1 commit intoprestodb:masterfrom
kgpai:filter_out_timestamp_tz

Conversation

@kgpai
Copy link
Copy Markdown
Contributor

@kgpai kgpai commented Jul 11, 2024

Description

This change causes the native worker to throw if Timestamp with Timezone is encountered in the plan. This is gated on the configuration property 'fail_on_timestamp_with_timezone'.

Motivation and Context

Due to issue facebookincubator/velox#10338, there can be cases where comparisons with Timestamp with Timezone can be wrong.

Impact

Will add further details soon.

Test Plan

Working on adding tests.

== NO RELEASE NOTE ==

@kgpai kgpai requested a review from a team as a code owner July 11, 2024 01:13
@kgpai kgpai marked this pull request as draft July 11, 2024 01:14
if (auto joinNode =
std::dynamic_pointer_cast<const core::AbstractJoinNode>(planNode)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can AbstractJoinNode be used standalone?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would cover both HashJoin and MergeJoin.

return true;
}
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AggregationNode::aggregates_::call is a CallTypedExprPtr that can contain a LambdaTypedExprPtr as input. Should we check it too?

@facebook-github-bot
Copy link
Copy Markdown
Collaborator

@gggrace14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@kgpai kgpai force-pushed the filter_out_timestamp_tz branch from 8ef9d99 to 1ec636e Compare July 11, 2024 08:05
@facebook-github-bot
Copy link
Copy Markdown
Collaborator

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@kgpai kgpai force-pushed the filter_out_timestamp_tz branch from 1ec636e to 5c38d5a Compare July 11, 2024 17:38
@facebook-github-bot
Copy link
Copy Markdown
Collaborator

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@kgpai kgpai force-pushed the filter_out_timestamp_tz branch from 5c38d5a to 064a415 Compare July 11, 2024 18:35
@facebook-github-bot
Copy link
Copy Markdown
Collaborator

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@aditi-pandit
Copy link
Copy Markdown
Contributor

@kgpai @gggrace14 @kagamiori : Am curious why the invalidation happens during Prestissimo plan conversion instead of the Presto sql parser/semantic validation of the plan ? This code is brittle and could need a lot of tweaking. Its better to reject such a query at the co-ordinator itself if we are using native engine.

It would be cleaner to drop the Timestamp with Timezone type and functions from the list here for native engine
https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/metadata/BuiltInTypeAndFunctionNamespaceManager.java#L681

https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/metadata/BuiltInTypeAndFunctionNamespaceManager.java#L681

There could be other approaches as well.

@tdcmeehan : Would be great to hear your thoughts.

@kgpai kgpai force-pushed the filter_out_timestamp_tz branch from 064a415 to 1a34455 Compare July 11, 2024 22:03
@facebook-github-bot
Copy link
Copy Markdown
Collaborator

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@amitkdutta
Copy link
Copy Markdown
Contributor

amitkdutta commented Jul 11, 2024

@kgpai @gggrace14 @kagamiori : Am curious why the invalidation happens during Prestissimo plan conversion instead of the Presto sql parser/semantic validation of the plan ? This code is brittle and could need a lot of tweaking. Its better to reject such a query at the co-ordinator itself if we are using native engine.

It would be cleaner to drop the Timestamp with Timezone type and functions from the list here for native engine https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/metadata/BuiltInTypeAndFunctionNamespaceManager.java#L681

https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/metadata/BuiltInTypeAndFunctionNamespaceManager.java#L681

There could be other approaches as well.

@tdcmeehan : Would be great to hear your thoughts.

@aditi-pandit We plan to fix this in velox as quickly as possible. We are putting this interim solution guarded by a query config and intend to remove it as soon as velox support is landed (which we expected to be done in a few weeks). Don't want to add any change in co-ordinator at this moment since its temporary. Ideally we want to make the velox change, but it will require a few weeks.

@kgpai kgpai force-pushed the filter_out_timestamp_tz branch from 1a34455 to 9200982 Compare July 12, 2024 00:19
@facebook-github-bot
Copy link
Copy Markdown
Collaborator

@kgpai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@tdcmeehan
Copy link
Copy Markdown
Contributor

Understood that this is a temporary measure. But I don't understand the hesitation around adding this into the coordinator? It would just be a better check--it would occur during analysis, rather than during execution (which is after resources have been allocated, after queueing, tasks have been spun up and splits have begun to be enumerated). I also think it would just be simpler to put it there.

@amitkdutta
Copy link
Copy Markdown
Contributor

Understood that this is a temporary measure. But I don't understand the hesitation around adding this into the coordinator? It would just be a better check--it would occur during analysis, rather than during execution (which is after resources have been allocated, after queueing, tasks have been spun up and splits have begun to be enumerated). I also think it would just be simpler to put it there.

@tdcmeehan @aditi-pandit Yea coordinator is definitely a better place, I am trying here
#23191
I stumbled upon into two cases (#23191 (comment)), will be really helpful if you can guide and take a look. This change is very important and also time sensitive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants