Skip to content

Comments

[native] Add session property for partial aggregation and spill#23527

Merged
kewang1024 merged 1 commit intoprestodb:masterfrom
kewang1024:add-session
Aug 28, 2024
Merged

[native] Add session property for partial aggregation and spill#23527
kewang1024 merged 1 commit intoprestodb:masterfrom
kewang1024:add-session

Conversation

@kewang1024
Copy link
Collaborator

@kewang1024 kewang1024 commented Aug 27, 2024

== RELEASE NOTES ==

General Changes
* Add a session property `native_max_partial_aggregation_memory` which specifies presto native max partial aggregation memory when data reduction is not optimal :pr:`23527`
* Add a session property `native_max_extended_partial_aggregation_memory` which specifies presto native max partial aggregation memory when data reduction is optimal :pr:`23527`
* Add a session property `native_max_spill_bytes ` which specifies presto native max allowed spill bytes :pr:`23527`

@kewang1024 kewang1024 requested review from a team as code owners August 27, 2024 08:21
@steveburnett
Copy link
Contributor

Should this be documented on https://prestodb.io/docs/current/presto_cpp/properties.html ?

xiaoxmeng
xiaoxmeng previously approved these changes Aug 27, 2024
Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kewang1024 LGTM. Thanks!

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kewang1024 do we still have the sanity check for the native session conversion in Prestissimo? Thanks!

public static final String NATIVE_EXECUTION_PROCESS_REUSE_ENABLED = "native_execution_process_reuse_enabled";
public static final String NATIVE_DEBUG_VALIDATE_OUTPUT_FROM_OPERATORS = "native_debug_validate_output_from_operators";

public static final String NATIVE_MAX_PARTIAL_AGGREGATION_MEMORY = "native_max_partial_aggregation_memory";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to expose max_extended_partial_aggregation_memory as well? Velox supports to dynamically adjust aggregation memory limit in a range defined by the above two configs? Thanks!

@kewang1024 kewang1024 changed the title [native] Add session property for max_partial_aggregation_memory and max_spill… [native] Add session property for partial aggregation and spill Aug 27, 2024
@kewang1024
Copy link
Collaborator Author

kewang1024 commented Aug 27, 2024

Should this be documented on https://prestodb.io/docs/current/presto_cpp/properties.html ?

Actually this is session property which is different than the configs, so should not be added here.
But we can start a new wiki to document the session properties, do you think you can help with setting up new page? @steveburnett

@kewang1024 kewang1024 force-pushed the add-session branch 2 times, most recently from 9bf6160 to 5839bcb Compare August 27, 2024 23:26
Copy link
Contributor

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kewang1024 for this code.

Also as per @steveburnett's comment we should document atleast native_max_spill_bytes at https://prestodb.io/docs/current/presto_cpp/properties.html

/// The maximum allowed spill file size.
static constexpr const char* kMaxSpillFileSize = "native_max_spill_file_size";

static constexpr const char* kMaxSpillBytes = "native_max_spill_bytes";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to specify all the operators (Hashjoin, HashAgg, OrderBy, RowNumber, Window etc) to which this value applies

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much context on spilling so might not have the full list, @xiaoxmeng do you know the full list?

I can add in a separate PR. We need this change to make an internal release cut.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kewang1024 @aditi-pandit: this applies at task level and is not specific to a particular operator.

Copy link
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kewang1024 This PR needs the release note. Will you be able to add one? Thanks.

@steveburnett
Copy link
Contributor

Should this be documented on https://prestodb.io/docs/current/presto_cpp/properties.html ?

Actually this is session property which is different than the configs, so should not be added here. But we can start a new wiki to document the session properties, do you think you can help with setting up new page? @steveburnett

Like the Presto Properties Reference page, the Presto C++ Properties Reference page begins with the statement "The following pages are not a complete list of all configuration and session properties ".

I know the sentence before that reads "This section describes Presto C++ configuration properties." which is exclusionary, but that could be changed to "This section describes Presto C++ configuration and session properties." and we don't have a separate session properties page in the Presto doc. (Maybe we should.)

I don't think we need a separate page for Presto C++ session properties, so I think all three of these session properties could be added to Presto C++ Properties Reference.

We could highlight the distinction for session properties in the description. For example, in your current draft adding max_spill_bytes to Presto C++ Properties Reference, the description could read "This session property (my new words emphasized here for clarity) specifies the max spill bytes limit set for each query."

If you or others feel strongly we need a separate page for Presto C++ Session Properties, I am open to the idea. Help me understand why a separate page is important.

@jaystarshot jaystarshot mentioned this pull request Nov 1, 2024
25 tasks
@tdcmeehan tdcmeehan added the from:Meta PR from Meta label Dec 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:Meta PR from Meta

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants