Skip to content

Add preferred timeout for small dynamic filters#22527

Closed
Dith3r wants to merge 3 commits intotrinodb:masterfrom
Dith3r:ke/df-timeout
Closed

Add preferred timeout for small dynamic filters#22527
Dith3r wants to merge 3 commits intotrinodb:masterfrom
Dith3r:ke/df-timeout

Conversation

@Dith3r
Copy link
Copy Markdown
Member

@Dith3r Dith3r commented Jun 27, 2024

Description

Add a CBO rule to estimate if dynamic filter is worth waiting for. Focus on small tables that are most often used as dimension tables and nodes that generate small number of distinct values.

Rule DeterminePreferredDynamicFilterTimeout could allow us to remove 20s forced wait time for every possible DF in JDBC connectors if table presents the least row count statistic.

Query Baseline duration[s] PR duration[s]
tpcds/q59 12.10 4.55
tpcds/q02 9.63 3.47
tpcds/q36 5.82 2.60
tpcds/q53 8.04 4.80
tpcds/q13 7.7 5.1

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(X) Release notes are required, with the following suggested text:

# General
* Improve performance of queries with selective joins. ({issue}`22527`)

@cla-bot cla-bot bot added the cla-signed label Jun 27, 2024
@github-actions github-actions bot added iceberg Iceberg connector delta-lake Delta Lake connector hive Hive connector mongodb MongoDB connector labels Jun 27, 2024
@Dith3r Dith3r force-pushed the ke/df-timeout branch 6 times, most recently from 3e150d1 to 6ee8f40 Compare July 3, 2024 19:23
@Dith3r Dith3r force-pushed the ke/df-timeout branch 4 times, most recently from a70fc00 to c970010 Compare July 16, 2024 06:08
@Dith3r Dith3r force-pushed the ke/df-timeout branch 3 times, most recently from 41e1320 to 5238c35 Compare July 16, 2024 11:09
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn’t this a Rule or set of them? Visitor-based optimizers are legacy, and we’re moving away from them. We should avoid introducing any new ones.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're trying to change an attribute of the filter node on the probe side scan of join based on the estimated size of the build side of that join. I'm not sure how this would be designed as one or more series of Rules.
Do we have a similar Rule elsewhere that we could use as a reference ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rule approach would require matching against Join/SemiJoin/DynamicFilterSourceNode and processing all probe side FilterNodes, extracting conjuncts to match against join node dynamic filter. It would cause far more lookups and rewrites than extracting all dynamic filter sources and process filter nodes once.

@Dith3r Dith3r force-pushed the ke/df-timeout branch 2 times, most recently from 20ba16f to 09b8a31 Compare July 17, 2024 07:16
@github-actions
Copy link
Copy Markdown

github-actions bot commented Aug 7, 2024

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Aug 7, 2024
@wendigo
Copy link
Copy Markdown
Contributor

wendigo commented Aug 7, 2024

@Dith3r can you rebase?

@wendigo
Copy link
Copy Markdown
Contributor

wendigo commented Aug 8, 2024

@sopel39 ptal

@sopel39
Copy link
Copy Markdown
Member

sopel39 commented Aug 8, 2024

I think @martint has some architecture comments for this PR. @martint are you OK with approach here?

@github-actions github-actions bot removed the stale label Aug 8, 2024
@github-actions
Copy link
Copy Markdown

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions github-actions bot added the stale label Aug 29, 2024
@wendigo
Copy link
Copy Markdown
Contributor

wendigo commented Aug 29, 2024

@Dith3r please rebase :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dith3r could you remind me why it was the case?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you call getPreferredTimeout before acquiring future from isBlocked you can receive future with excluded already resolved DF associated with preferred timeout and wait for DFs that have no preferred timeout set.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If DF is already resolved, then no waiting will happen anyway, right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Add a CBO rule to estimate if dynamic filter is worth waiting for.
Focus on small tables that are most often used as dimension tables.
DynamicFilters.Descriptor::getId,
DynamicFilters.Descriptor::getPreferredTimeout,
(timeout1, timeout2) -> {
if (timeout1.isPresent() && timeout2.isPresent()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could use io.trino.util.Optionals#combine

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would require similar function which is based on OptionalLong.

return !isAtMostScalar(joinNode.getRight()) && !isAtMostScalar(joinNode.getLeft());
}

// Skip for joins with multi keys since output row count stats estimation can wrong due to
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can wrong -> can be wrong

}

// Skip for joins with multi keys since output row count stats estimation can wrong due to
// low correlation between multiple join keys.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should probably: can be wrong due to potentially high correlation between join columns, which might lead to underestimation of join output row count

private static boolean isExpandingPlanNode(PlanNode node)
{
return node instanceof JoinNode
// consider union node and exchange node with multiple sources as expanding since it merge the rows
Copy link
Copy Markdown
Member

@sopel39 sopel39 Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be more like union and exchange nodes that consume from multiple source are considered expanding since they produce more data than their children individually


private DynamicFilterTimeout getBuildSideState(PlanNode planNode, Symbol dynamicFilterSymbol)
{
if (isAtMostScalar(planNode)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment would be nice + commit message

.matches();
}

private static boolean isExpandingPlanNode(PlanNode node)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be moved below usage

@Dith3r Dith3r closed this Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed delta-lake Delta Lake connector hive Hive connector iceberg Iceberg connector mongodb MongoDB connector performance

Development

Successfully merging this pull request may close these issues.

5 participants