Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace supports_bounded_execution with supports_retract_batch #6695

Merged
merged 7 commits into from
Jun 16, 2023
Merged

Replace supports_bounded_execution with supports_retract_batch #6695

merged 7 commits into from
Jun 16, 2023

Conversation

mustafasrepo
Copy link
Contributor

@mustafasrepo mustafasrepo commented Jun 16, 2023

Which issue does this PR close?

Related to #5781

Rationale for this change

With the changes in the #6671, supports_retract_batch method is introduced to the Accumulator trait. With the introduction of supports_retract_batch method, supports_bounded_execution method is no longer necessary for the AggregateExpr trait. (Similar to the case we have move supports_bounded_execution trait from BuiltinWindowFunctionExpr to PartitionEvalautor trait.)

This PR removes supports_bounded_execution method from AggregateExpr and moves its functionality to supports_retract_batch method in the Accumulator` trait for existing accumulators.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

stuartcarnie and others added 3 commits June 14, 2023 17:12
Rationale:

The default implementation of the `Accumulator` trait returns an error
for the `retract_batch` API.
@mustafasrepo mustafasrepo marked this pull request as draft June 16, 2023 06:40
@github-actions github-actions bot added core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Physical Expressions labels Jun 16, 2023
# Conflicts:
#	datafusion/core/src/physical_plan/udaf.rs
#	datafusion/core/src/physical_plan/windows/mod.rs
#	datafusion/core/tests/user_defined_aggregates.rs
@github-actions github-actions bot removed logical-expr Logical plan and expressions core Core DataFusion crate labels Jun 16, 2023
@mustafasrepo mustafasrepo marked this pull request as ready for review June 16, 2023 06:50
@@ -155,8 +155,7 @@ impl WindowExpr for PlainAggregateWindowExpr {
}

fn uses_bounded_memory(&self) -> bool {
self.aggregate.supports_bounded_execution()
&& !self.window_frame.end_bound.is_unbounded()
!self.window_frame.end_bound.is_unbounded()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking through this logic. Actually, as long as end_bound is not bounded (not UNBOUNDED FOLLOWING such as in the form N FOLLOWING). We can produce results without waiting for the whole data to come (If accumulator do not support retract_batch method. We wouldn't be able to run queries in the form M PRECEDING and N FOLLOWING, in this case we will give an error anyway.). Hence here we do not need to check for self.aggregate.supports_bounded_execution() (acc.supports_retract_batch() method with the new API.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me

@@ -139,8 +139,7 @@ impl WindowExpr for SlidingAggregateWindowExpr {
}

fn uses_bounded_memory(&self) -> bool {
self.aggregate.supports_bounded_execution()
&& !self.window_frame.end_bound.is_unbounded()
!self.window_frame.end_bound.is_unbounded()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the case above.

@@ -248,7 +248,7 @@ pub trait PartitionEvaluator: Debug + Send {
/// (3,4),
/// ]
/// ```
fn evaluate_with_rank_all(
fn evaluate_all_with_rank(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, evaluate_all_with_rank is better name. As part of this PR, I changed the method from evaluate_with_rank_all to evaluate_all_with_rank.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @mustafasrepo 🙏

@@ -96,12 +96,6 @@ pub trait AggregateExpr: Send + Sync + Debug + PartialEq<dyn Any> {
false
}

/// Specifies whether this aggregate function can run using bounded memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THis is nice to have moved this logic entirely to the accumulator 👍

@alamb alamb merged commit 8da5f26 into apache:main Jun 16, 2023
21 checks passed
@mustafasrepo mustafasrepo deleted the feature/6671_exp branch July 14, 2023 07:01
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants