Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve TreeNode and LogicalPlan APIs to accept owned closures, deprecate transform_down_mut() and transform_up_mut() #10126

Merged
merged 2 commits into from
Apr 22, 2024

Conversation

peter-toth
Copy link
Contributor

@peter-toth peter-toth commented Apr 18, 2024

Which issue does this PR close?

Closes #10097, part of: #10121.

Rationale for this change

  • Currently the only difference between TreeNode::transform_down() and TreeNode::transform_down_mut() is that transform_down() accepts an &Fn, while transform_down_mut() accepts an &mut FnMut closure.
    As Fn is subtrait of FnMut, it is possible to use transform_down_mut() instead of transform_down() everywhere. But because transform_down() has a better name, let's fix transform_down() and get rid of transform_down_mut().
    The same applies to TreeNode::transform_up() vs. TreeNode::transform_up_mut() as well.
  • Currently TreeNode::apply(), TreeNode::transform(), TreeNode::transform_down(), TreeNode::transform_up(), TreeNode::transform_down_up() and their LogicalPlan::...with_subqueries() variants use &Fn or &mut FnMut closures, which is a bit cumbersome for the API users. This PR proposes to change these APIs to accept owned FnMut closures.

Please note that this is an API breaking change but the fix is straightforward.

What changes are included in this PR?

This PR:

  • Modifies TreeNode::apply(), TreeNode::transform(), TreeNode::transform_down(), TreeNode::transform_up(), TreeNode::transform_down_up() and their LogicalPlan::...with_subqueries() variants to use owned FnMut closures.
  • Changes all transform_down_mut() calls to transform_down() and transform_up_mut() calls to transform_up().
  • Deprecates transform_down_mut() and transform_up_mut().
  • Removes LogicalPlan::transform_down_mut_with_subqueries() and LogicalPlan::transform_up_mut_with_subqueries() to keep LogicalPlan::..._with_subqueries() APIs in sync with TreeNode APIs. (Please note that transform_down_mut_with_subqueries() and transform_up_mut_with_subqueries() haven't been released yet so we can freely remove them.)

Are these changes tested?

Yes, with existing UTs.

Are there any user-facing changes?

No.

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions physical-expr Physical Expressions optimizer Optimizer rules core Core DataFusion crate labels Apr 18, 2024
@peter-toth
Copy link
Contributor Author

@alamb this PR might conflict with #10035, but I can update mine if yours gets merged first.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @peter-toth -- this is looking really neat. It would be really cool if we could figure out how to avoid neeidng &mut in front of all the closures (and that might actually make the change backwards compatible too).

However, I realize this may not be possible / easy to do. If that is the case, I think this approach would be an improvement over the current state on master.

What do you think?

self,
f: &F,
f: &mut F,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a bit locally to change this to be f: F (aka take an "owned" F) -- which would make it easier to call this API also works -- so the callsites don't need to be changed to be &mut all over the place

However, I tried it and I got some strange errors when instantiating macros 🤔

error: reached the recursion limit while instantiating `<expr::Expr as TreeNode>::transform_up::<&mut &mut &mut &mut ...>`
   --> /Users/andrewlamb/Software/arrow-datafusion/datafusion/common/src/tree_node.rs:184:50
    |
184 |         handle_transform_recursion_up!(self, |c| c.transform_up(&mut f), f)
    |                                                  ^^^^^^^^^^^^^^^^^^^^^^
    |
note: `transform_up` defined here
   --> /Users/andrewlamb/Software/arrow-datafusion/datafusion/common/src/tree_node.rs:180:5
    |
180 | /     fn transform_up<F: FnMut(Self) -> Result<Transformed<Self>>>(
181 | |         self,
182 | |         mut f: F,
183 | |     ) -> Result<Transformed<Self>> {
    | |__________________________________^
    = note: the full type name has been written to '/Users/andrewlamb/Software/arrow-datafusion/target/debug/deps/datafusion_expr-598aae957f8f2540.long-type.txt'


Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I did try this already and run into the same issue...
But now I think it is doable with some smart refactor. Let me update this PR today or tomorrow and then we can decide if this change make sense or not

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d781ca7 changes TreeNode::apply(), TreeNode::transform(), TreeNode::transform_down(), TreeNode::transform_up(), TreeNode::transform_down_up() and their LogicalPlan::...with_subqueries() variants to use owned closures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we like this change I can update the PR description.

@@ -60,7 +60,7 @@ impl PhysicalOptimizerRule for OptimizeAggregateOrder {
plan: Arc<dyn ExecutionPlan>,
_config: &ConfigOptions,
) -> Result<Arc<dyn ExecutionPlan>> {
plan.transform_up(&get_common_requirement_of_aggregate_input)
plan.transform_up(&mut get_common_requirement_of_aggregate_input)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just so weird -- I think most users would expect that this looks like this (though as I mentioned above, I couldn't make it work in a few minutes)

-        plan.transform_up(&mut get_common_requirement_of_aggregate_input)
+        plan.transform_up(get_common_requirement_of_aggregate_input)

}

/// Similarly to [`Self::transform_up`], rewrites this node and its inputs using `f`,
/// including subqueries that may appear in expressions such as `IN (SELECT
/// ...)`.
pub fn transform_up_with_subqueries<F: Fn(Self) -> Result<Transformed<Self>>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@berkaysynnada
Copy link
Contributor

Thanks, @peter-toth—this really simplifies things. I'm wondering if there's an option to avoid changing the transform's Fn to FnMut (thus not requiring &mut for closures) by using the rewrite API for passes that need mutability inevitably. AFAIK rewrite can handle that for the transform_up and transform_down cases.

… transform_down and transform_down_up APIs to accept owned closures
@peter-toth
Copy link
Contributor Author

Thanks, @peter-toth—this really simplifies things. I'm wondering if there's an option to avoid changing the transform's Fn to FnMut (thus not requiring &mut for closures) by using the rewrite API for passes that need mutability inevitably. AFAIK rewrite can handle that for the transform_up and transform_down cases.

My concern is that it would require to refactor all current transform_down_mut() and transform_up_mut() usecases to use the rewrite() API, which is a bigger change required from the API users than just change to transform_down() (or transform_up()) and & to &mut.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @peter-toth -- I think this looks like a major step forward in TreeNode API usability (both due to removing the transform_***_mut variants, as well as accepting closures directly -- very nice 👌 🏆

Thanks, @peter-toth—this really simplifies things. I'm wondering if there's an option to avoid changing the transform's Fn to FnMut (thus not requiring &mut for closures) by using the rewrite API for passes that need mutability inevitably. AFAIK rewrite can handle that for the transform_up and transform_down cases.

@berkaysynnada I think the most recent change to this PR addresses your concern (closures do not need a &mut anymore). Can you take another look?

@alamb this PR might conflict with #10035, but I can update mine if yours gets merged first.

Thanks for the heads up. Since #10126 isn't approved / merged yet I think we can simply merge this one first (and then I can clean up that one with the changes)

cc @liukun4515 @mustafasrepo @ozankabak

@@ -164,7 +164,7 @@ impl ScalarUDFImpl for ScalarFunctionWrapper {
impl ScalarFunctionWrapper {
// replaces placeholders such as $1 with actual arguments (args[0]
fn replacement(expr: &Expr, args: &[Expr]) -> Result<Expr> {
let result = expr.clone().transform(&|e| {
let result = expr.clone().transform(|e| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this is a really nice improvement in usability -- to not have to put a & in front of the closure is 💯

@peter-toth peter-toth changed the title Deprecate TreeNode::transform_down_mut() and TreeNode::transform_up_mut() methods Improve TreeNode and LogicalPlan APIs to accept owned closures Apr 19, 2024
@peter-toth
Copy link
Contributor Author

I've updated the PR title and description to reflect the latest changes.

@alamb alamb changed the title Improve TreeNode and LogicalPlan APIs to accept owned closures Improve TreeNode and LogicalPlan APIs to accept owned closures, deprecate transform_down_mut() and transform_up_mut() Apr 21, 2024
@alamb
Copy link
Contributor

alamb commented Apr 21, 2024

I plan to merge this PR on Monday unless anyone else would like additional time to review or has additional comments

@alamb
Copy link
Contributor

alamb commented Apr 22, 2024

Thanks again @peter-toth -- epic work

@alamb alamb merged commit 3c3cb87 into apache:main Apr 22, 2024
28 checks passed
ccciudatu pushed a commit to hstack/arrow-datafusion that referenced this pull request Apr 26, 2024
…eprecate `transform_down_mut()` and `transform_up_mut()` (apache#10126)

* Deprecate `TreeNode::transform_down_mut()` and `TreeNode::transform_up_mut()` methods

* Refactor `TreeNode` and `LogicalPlan` apply, transform, transform_up, transform_down and transform_down_up APIs to accept owned closures
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Physical Expressions sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deprecate TreeNode transform_xx_mut methods
3 participants