Skip to content

Increase default dynamic filter collection limits#17130

Merged
raunaqmorarka merged 4 commits intotrinodb:masterfrom
raunaqmorarka:large-df
Apr 26, 2023
Merged

Increase default dynamic filter collection limits#17130
raunaqmorarka merged 4 commits intotrinodb:masterfrom
raunaqmorarka:large-df

Conversation

@raunaqmorarka
Copy link
Copy Markdown
Member

Description

Increase default dynamic filter collection limits

Additional context and related issues

Potentially alternative to #16875

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix some things. ({issue}`issuenumber`)

Copy link
Copy Markdown
Member

@sopel39 sopel39 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % assuming coordinator is not becoming a bottleneck now. Might be worth running concurrency benchmarks.

@raunaqmorarka raunaqmorarka force-pushed the large-df branch 9 times, most recently from aff6be3 to bd487e9 Compare April 22, 2023 13:05
@raunaqmorarka raunaqmorarka requested a review from sopel39 April 22, 2023 17:12
@raunaqmorarka
Copy link
Copy Markdown
Member Author

lgtm % assuming coordinator is not becoming a bottleneck now. Might be worth running concurrency benchmarks.

I've added a couple of small tweaks and made the increases smaller based on benchmarks. PTAL again

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the trick here? Avoiding stream?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each invocation to getDynamicFilterSummaries was creating a new map of completed dynamic filters and then we would lookup filter in that map.
I just simplified that to lookup in existing map directly.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unrelated change?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once we switch to higher default, the sizing factor on maxDistinctValues doesn't need to be as aggressive.
The previous value of 8192 also turned out to be good in microbenchmarking but a bit over eager in using memory for actual queries. Not big enough to notice any change in peak memory usage but I decided to make it more conservative anyway.

Avoid constructing RangeView
Resizing blockbuilder is not as expensive as rehashing
or a high fill ratio in TypedSet. So we can be more conservative
in it's sizing to stay under size limits
@raunaqmorarka raunaqmorarka merged commit 1dd8893 into trinodb:master Apr 26, 2023
@raunaqmorarka raunaqmorarka deleted the large-df branch April 26, 2023 03:18
@mosabua
Copy link
Copy Markdown
Member

mosabua commented Apr 26, 2023

Do we need docs for any of these? Or updates to existing docs?

@raunaqmorarka
Copy link
Copy Markdown
Member Author

Do we need docs for any of these? Or updates to existing docs?

Some of these properties are mentioned in dynamic filtering docs, we want to avoid advertising these low level knobs as we want users to stick to only using enable-large-dynamic-filters if they need to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants