-
Notifications
You must be signed in to change notification settings - Fork 1.9k
TopDown EnforceSorting implementation #5290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@mustafasrepo @ozankabak @yahoNanJing |
|
Thank you, we will digest this in the next several days and leave comments as we make progress. |
|
The PR is very large in scope. It changes parts of the old code (and certainly makes some changes to its tests), and also adds new code (and new tests). It would be much easier to review this if it were broken down to two PRs, where the first one only replicates the current functionality, has no functionality regressions, and does not change any tests at all; with the second PR adding new functionality. Right now, the new rule is significantly longer than the old rule (which is bad), but it offers more functionality (which is great). So is switching from bottom-up to top-down a good change or a bad change? We can't tell easily. Now, let me share my (very) preliminary impression so far after a cursory look: I see that it has better handling of sort preserving merges, smarter push-down of sorts under unions, and adds support for sort merge joins. These are the good bits. The cons are that it seems to lose partition awareness (though I'm not sure about this yet) and it seems to regress on some cases where it was doing better before. I think at least some of these are due to the presumption that there is a global output ordering to preserve, and I am not sure I agree with that. Anyway, we will disentangle and review in detail, but I want to give you a heads up that this will take some time. We will need to analyze every case carefully, go back to the old version of the code (and tests), compare and contrast etc. Before we form an idea on the merits of bottom-up vs. top-down, our goal will be to create two functionally equal implementations passing exactly the same test suite. Without that, it is not possible to objectively decide. Whatever the result on bottom-up vs. top-down is, I think this exercise will end up making the rule better, so that's great 🚀 I will keep you posted as we make progress in the upcoming days. |
Sure, please take your time and I will add more comments to the code to explain the rule process. We can leverage the same framework to do more advanced optimizations like re-ordering(PostgreSQL has this optimization) the multiple window execs in the plan tree and further reduce the number of Sorts. Generally I think the Top-Down based approach is more easy and straightforward to collect and propagate necessary properties and find the global optimal plan. Some UT results are changed. Yes, I think the major arguing point is whether we should preserve output ordering during optimization process or we can trim the unnecessary sort columns. There are some other UT result changes, I am not sure whether they are due to the original bug or the new rule introduced regression, need to double confirm with you and check carefully, especially this one |
|
Regarding the WindowExec/Window expression reverse, do we support the reverse for all the built-in window functions? |
|
Not all built-in window functions are reversible. There is an indicator in the API called |
|
Some future work:
|
|
We started the work to get the two approaches to a comparable state, @mustafasrepo is actively working on it. We will post more updates as we make progress. |
| if prop.sort_options.is_some() { | ||
| PhysicalSortExpr { | ||
| expr: prop.expr.clone(), | ||
| options: prop.sort_options.unwrap(), | ||
| } | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use If let Some(sort_options) = prop.sort_options idiom here. This would remove .unwrap()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
| let impact_result_ordering = plan.output_ordering().is_some() | ||
| || plan.output_partitioning().partition_count() <= 1 | ||
| || plan.as_any().downcast_ref::<GlobalLimitExec>().is_some() | ||
| || plan.as_any().downcast_ref::<LocalLimitExec>().is_some(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Patterns in the form: plan.as_any().downcast_ref::<ExecName>().is_some() can be converted to plan.as_any().is::<ExecName>().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do.
| .plan | ||
| .as_any() | ||
| .downcast_ref::<GlobalLimitExec>() | ||
| .is_some() | ||
| || self | ||
| .plan | ||
| .as_any() | ||
| .downcast_ref::<LocalLimitExec>() | ||
| .is_some() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment above applies here. Also you can construct a util function is_limit to check for Exec is GlobalLimitExec or LocalLimitExec. This util can be used for the check at the PlanWithSortRequirements::init.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| // SortPreservingMergeExec + SortExec(local/global) is the same as the global SortExec | ||
| // Remove unnecessary SortPreservingMergeExec + SortExec(local/global) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of SortPreservingMergeExec + SortExec(local/global), you can use SortExec(local/global) -> SortPreservingMergeExec. Second reflects hierarchy better. I think.
| } else if plan.as_any().downcast_ref::<WindowAggExec>().is_some() | ||
| || plan | ||
| .as_any() | ||
| .downcast_ref::<BoundedWindowAggExec>() | ||
| .is_some() | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can write a util function is_window to check for whether executor is WindowAggExec or BoundedWindowAggExec. There are couple of place this check is done. I think it will be handy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
| let flags = window_plan | ||
| .children() | ||
| .into_iter() | ||
| .map(|child| { | ||
| // If the child is leaf node, check the output ordering | ||
| if child.children().is_empty() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this state we know that executor is window. It has exactly one child. Hence I think we can remove this iteration. We can just use body of map, where child is window_plan.children()[0].clone().
| if requirements_compatible( | ||
| top_requirement, | ||
| child.required_input_ordering()[0].as_deref(), | ||
| || child.equivalence_properties(), | ||
| ) || requirements_compatible( | ||
| child.required_input_ordering()[0].as_deref(), | ||
| top_requirement, | ||
| || child.equivalence_properties(), | ||
| ) || requirements_compatible( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pattern is used a lot. I think you can construct a util to check whether one of the requirements satisfy other (where function does this check). That would be better I think.
| fn extract_window_info_from_plan( | ||
| plan: &Arc<dyn ExecutionPlan>, | ||
| ) -> Option<WindowExecInfo> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this function can return Result<WindowExecInfo>. This change would remove some the unwraps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
| if requirement.sort_options.is_some() { | ||
| self.options == requirement.sort_options.unwrap() | ||
| && self.expr.eq(&requirement.expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use If let Some() idiom here. this would remove .unwrap() in the body.
|
@mingmwang I went through this PR in detail. Thanks for the great work. Let me summarize my observations. The Top-Down approach is better at pushing down Since some tests improve and some regress with this change. I have constructed a unified test bench for these rules (I use union of the tests for In the meantime, This PR has nice changes that do not need to wait for final rule version. Specifically, printing global flag in |
Thanks a lot, I will take a closer look at your comparing result tomorrow. I guess most of the failure case in Top-Down rule is because the current implementation try to keep the final output ordering and will not remove all the SortExec very aggressively. I will also resolve all the review comments tomorrow. Regarding split the PR, I do want to split it into two, because in this PR actually besides the new rule implementation and the added new tests, other changes are very tiny. |
|
|
@mustafasrepo @ozankabak And In the TopDown rule implementation, I can add another configuration flag to enable removing Sorts more aggressively and achieve the same results as the the Bottom-up rule, but we still need to define clearly in which cases Sort can be removed. |
This would be great, thank you. We will perform one more pass of analysis with that and get clarity on what else is left (if any). |
datafusion/core/tests/sql/window.rs
Outdated
| "SortPreservingMergeExec: [c1@0 ASC NULLS LAST]", | ||
| " SortExec: expr=[c1@0 ASC NULLS LAST], global=false", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that final SortPreservingMergeExec in the plan is redundant. Its input has already single partition. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with you. It is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still some optimization room in the TopDown rule regarding the window expr ordering and global ordering.
|
FYI @mingmwang, we created an implementation sketch combining bottom-up and top-down approaches, which passes the whole (unified) test set. You can find it here. We created this so that it can give us a baseline/goal in our quest to achieve full functionality with the simplest design possible. When you add the global ordering/window expr ordering functionality, we will do one more analysis and comparison with that baseline and give more feedback to you. |
Sure, I will take a closer look today and tomorrow. |
|
What is the status of this PR (it looks like it may be waiting to address some more feedback). Shall we mark it as "draft" as we work on it (I am trying to make sure all PRs marked as "ready for review" are actually waiting on review) |
Sounds good to me, we will let you know when we converge |
|
@mingmwang, it seems you are busy these days. I think it might be a good idea to create a PR to get the new/extended test suite and a base (passing) implementation in. Concurrently, we can continue collaborating on this PR to arrive at a more elegant design that passes the new/extended test suite and replace the base implementation. @mustafasrepo worked out the base impl and test suite already, so it will be easy for us to clean it up and get it in quickly. Sounds good? |
|
Just close this PR. |
Which issue does this PR close?
Closes #5289.
Rationale for this change
Reimplement the EnforceSorting rule in a Top-Down approach to add/remove Sort when unnecessary
The new implementation does not need to keep the lineage information and record the position SortExec in another tree structure. The Sort removing and adding is driven by the properties and requirements.
What changes are included in this PR?
Sortis redundant, the input of theSortalready has a finer ordering than thisSortenforces.Sortdoes not impact the final result ordering, some operators likeRepartitionExeccan not maintain the input ordering so that theSortin its descendants can be removedOrdering requirements is pushed down and propagated from the top node to its children and descendants. The basic process is:
Sort, generate new requirements if the current node itself has sort requirements to its input(required_input_ordering). ForUnionExec, also generated new requirements based onUnionExec's output ordering properties to keep the main input ordering semantics correct and trim the unnecessary sort columns.Sort.SortMergeJoinExec,WindowAggExec,SortPreservingMergeExec, etc) itself has its own sort requirements to its input. Check the compatibility of the parent requirements with its own sort requirements.a) If the required input ordering is more specific, do not push down the parent requirements, keep everything unchanged.
b) If the the parent requirements are more specific, push down the parent requirements
c) If they are not compatible, add
Sort, generate new requirements from required input ordering.This is achieved by the combination of
GlobalSortSelection,EnforceDistributionandEnforceSorting.Are these changes tested?
Are there any user-facing changes?