feat: Add session properties for aggregation compaction #26874
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -544,3 +544,27 @@ output for each input batch. | |
| If this is true, then the protocol::SpatialJoinNode is converted to a | ||
| velox::core::SpatialJoinNode. Otherwise, it is converted to a | ||
| velox::core::NestedLoopJoinNode. | ||
|
|
||
| ``native_aggregation_compaction_bytes_threshold`` | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| * **Type:** ``bigint`` | ||
| * **Default value:** ``0`` | ||
|
|
||
| Native Execution only. Memory threshold in bytes for triggering string compaction | ||
|
Contributor
Do you envision compaction applying to non-string data types? If this is only for strings, we could make the naming specific to string compaction. It's also hard to follow from an end-user perspective. It might be useful to describe how total string storage is calculated so that it's easier to understand how to set this property. If we can compute that from some Velox metrics (available through Prometheus), it would be great to share the computation. Do you have a Velox blog article or documentation for this work? It would be great to link that here.
Contributor
Author
@aditi-pandit, thanks for the great questions.
For approx_most_frequent, the accumulator tracks activeBytes_ (bytes used by strings currently in the summary) and evictedBytes_ (bytes used by evicted/dead strings).
Unfortunately, there's no direct Velox/Prometheus metric for this today. Users may need to estimate based on their data characteristics (e.g., average string size × summary capacity × expected churn).
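The back-of-the-envelope estimate mentioned above can be sketched as follows. This is only an illustration, not a Velox or Presto API: the function name `estimate_string_storage_bytes` and its parameters are hypothetical, and it assumes "expected churn" means a multiplier (at least 1) for how many times, on average, each summary slot is overwritten with a new string during the query.

```python
def estimate_string_storage_bytes(avg_string_bytes, summary_capacity, churn_factor):
    """Rough estimate of total string storage for one accumulator.

    Hypothetical helper, not part of Velox or Presto.
    avg_string_bytes: average size of one string value, in bytes.
    summary_capacity: number of entries the summary keeps.
    churn_factor: assumed multiplier (>= 1) for how often entries are replaced.
    """
    if avg_string_bytes < 0 or summary_capacity < 0:
        raise ValueError("sizes must be non-negative")
    if churn_factor < 1:
        raise ValueError("churn_factor is assumed to be at least 1")
    return int(avg_string_bytes * summary_capacity * churn_factor)


# Example: 64-byte strings, 10,000 summary entries, each slot replaced ~5 times.
estimate = estimate_string_storage_bytes(64, 10_000, 5)
```

A value somewhat below such an estimate might be a reasonable starting point for `native_aggregation_compaction_bytes_threshold`, though the right setting depends on the workload.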
Contributor
This would have been useful information to include in the documentation for the reader.
||
| during global aggregation. When total string storage exceeds this limit and the | ||
| unused memory ratio is high, compaction is triggered to reclaim dead strings. | ||
| Disabled by default (0). Currently only applies to approx_most_frequent aggregate | ||
| with StringView type during global aggregation. | ||
|
|
||
| ``native_aggregation_compaction_unused_memory_ratio`` | ||
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
|
||
| * **Type:** ``double`` | ||
| * **Minimum value:** ``0`` | ||
| * **Maximum value:** ``1`` | ||
| * **Default value:** ``0.25`` | ||
|
|
||
| Native Execution only. Ratio of unused (evicted) bytes to total bytes that triggers | ||
|
Contributor
Here as well, it's difficult to understand how "unused (evicted) bytes" is computed. If we can compute that from some Velox metrics (available through Prometheus), it would be great to share the computation.
||
| compaction. The value is in the range of [0, 1). Currently only applies to | ||
| approx_most_frequent aggregate with StringView type during global aggregation. | ||
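To make the interaction between the two properties concrete, here is a minimal sketch of the trigger condition as the documentation text describes it. It is not the Velox implementation: the function `should_compact` is hypothetical, and it assumes that total string storage is the sum of active and evicted bytes, that the threshold comparison is strict ("exceeds"), and that the ratio comparison is inclusive.

```python
def should_compact(active_bytes, evicted_bytes, bytes_threshold, unused_memory_ratio):
    """Return True if compaction would be triggered under the assumed rule.

    Hypothetical model of the documented behavior, not Velox code.
    active_bytes: bytes used by strings still in the summary.
    evicted_bytes: bytes used by evicted (dead) strings.
    bytes_threshold: native_aggregation_compaction_bytes_threshold (0 disables).
    unused_memory_ratio: native_aggregation_compaction_unused_memory_ratio.
    """
    if bytes_threshold <= 0:
        # A threshold of 0 is the documented "disabled" default.
        return False
    total_bytes = active_bytes + evicted_bytes  # assumption: total = active + evicted
    if total_bytes <= bytes_threshold:
        return False
    return evicted_bytes / total_bytes >= unused_memory_ratio


# 1,000 total bytes with 40% evicted, threshold 512 bytes, ratio 0.25.
triggered = should_compact(600, 400, 512, 0.25)
```

Under these assumptions both conditions must hold at once, so a large accumulator with few dead strings, or a small one with many, would not be compacted.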