Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combine evaluate_stateful and evaluate_inside_range #6665

Merged
merged 6 commits into from
Jun 14, 2023
Merged

Combine evaluate_stateful and evaluate_inside_range #6665

merged 6 commits into from
Jun 14, 2023

Conversation

mustafasrepo
Copy link
Contributor

@mustafasrepo mustafasrepo commented Jun 14, 2023

Which issue does this PR close?

Related to #5781

Rationale for this change

What changes are included in this PR?

This PR implements the proposal in the #PR12

  • evaluate_stateful and evaluate_inside_range are combined under single evaluate methods
  • uses_window_frame and supports_bounded_execution methods are moved from BoundedWindowFunctionExpr to PartitionEvaluator.

With the changes in this PR new Evaluators can be implemented according to table below.

uses_window_frame supports_bounded_execution function_to_implement
false false evaluate_all (if we were to implement PERCENT_RANK it would end up in this quadrant, we cannot produce any result without seeing whole data)
false true evaluate (optionally can also implement evaluate_all for more optimized implementation. However, there will be default implementation that is suboptimal) . If we were to implement ROW_NUMBER it will end up in this quadrant. Example OddRowNumber showcases this use case
true false evaluate (I think as long as uses_window_frame is true. There is no way for supports_bounded_execution to be false). I couldn't come up with any example for this quadrant
true true evaluate. If we were to implement FIRST_VALUE, it would end up in this quadrant

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the physical-expr Physical Expressions label Jun 14, 2023
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me @mustafasrepo and is a very nice cleanup. Thank you!

/// Can this function be evaluated with (only) rank
///
/// If `include_rank` is true, then [`Self::create_evaluator`] must
/// implement [`PartitionEvaluator::evaluate_with_rank`]
/// implement [`PartitionEvaluator::evaluate_with_rank_all`]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to give the same treatment (move to PartitionEvaluator) to include_rank?

Copy link
Contributor Author

@mustafasrepo mustafasrepo Jun 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do so, what do you think about it. Should we do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, I have moved include_rank flag to PartitionEvaluator also.

ScalarValue::iter_to_array(res.into_iter())
} else {
Err(DataFusionError::NotImplemented(
"evaluate_all is not implemented by default".into(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"evaluate_all is not implemented by default".into(),
format!("evaluate_all is not implemented for {} when using window frames", self.name()),

I think the error could be more helpful, but I can also add that as a follow on PR

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me -- thanks again @mustafasrepo

@@ -115,12 +115,6 @@ pub(crate) struct RankEvaluator {
}

impl PartitionEvaluator for RankEvaluator {
fn get_range(&self, idx: usize, _n_rows: usize) -> Result<Range<usize>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@alamb alamb merged commit 4184a7f into apache:main Jun 14, 2023
@mustafasrepo mustafasrepo deleted the feature/refactor_window_eval branch July 25, 2023 05:55
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants