Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add static_name() to ExecutionPlan #10266

Merged
merged 2 commits into from
Apr 29, 2024

Conversation

waynexia
Copy link
Member

Which issue does this PR close?

Closes #10246

Rationale for this change

Add static_name() method to make the name of an execution plan accessible without constructing the struct.

What changes are included in this PR?

A new public API in ExecutionPlan

Are these changes tested?

Yes

Are there any user-facing changes?

Yes

@@ -474,7 +474,7 @@ type SharedGate = Arc<Gate>;

#[cfg(test)]
mod tests {
use std::sync::atomic::{AtomicBool, Ordering};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR but it fails clippy check in my env.

@andygrove
Copy link
Member

@waynexia This looks good, but could you add a test to demonstrate what the output looks like?

Signed-off-by: Ruihang Xia <[email protected]>
@waynexia
Copy link
Member Author

@waynexia This looks good, but could you add a test to demonstrate what the output looks like?

Thanks for reviewing. I added one in fbb5f4a, please check it out

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @waynexia

@waynexia waynexia merged commit c963d21 into apache:main Apr 29, 2024
23 checks passed
@matthewmturner
Copy link
Contributor

This is nice @waynexia

@waynexia waynexia deleted the exec-plan-static-name branch April 29, 2024 13:08
Comment on lines +120 to +125
fn name(&self) -> &'static str
where
Self: Sized,
{
Self::static_name()
}
Copy link
Contributor

@joroKr21 joroKr21 Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we can't call plan.name() inside ExecutionPlanVisitor:

    fn pre_visit(&mut self, plan: &dyn ExecutionPlan) -> Result<bool, Self::Error> {
        let plan_name = plan.name().to_owned();
        ...
    }
error: the `name` method cannot be invoked on a trait object
     |
1552 |         let plan_name = plan.name().to_owned();
     |                              ^^^^
     |
     |
121  |         Self: Sized,
     |               ----- this has a `Sized` requirement

Copy link
Member Author

@waynexia waynexia Jun 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this method cannot be called from a trait object where Sized is not required.

I'm considering making a breaking API change to deprecate the default implementation and provide the static name on each impl. This can avoid the issue we're facing now and also keep other functionalities working like before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's nice to be able to call name() on the dyn object.
As long as we can keep doing that I would be happy 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, I'll submit a patch later this day (or tomorrow).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI #11047

Michael-J-Ward added a commit to Michael-J-Ward/datafusion-python that referenced this pull request Jul 24, 2024
This required trait method was added upstream [0] and recommends to simply forward to `static_name`.

[0]: apache/datafusion#10266
Michael-J-Ward added a commit to Michael-J-Ward/datafusion-python that referenced this pull request Jul 24, 2024
This required trait method was added upstream [0] and recommends to simply forward to `static_name`.

[0]: apache/datafusion#10266
Michael-J-Ward added a commit to Michael-J-Ward/datafusion-python that referenced this pull request Jul 25, 2024
This required trait method was added upstream [0] and recommends to simply forward to `static_name`.

[0]: apache/datafusion#10266
andygrove pushed a commit to apache/datafusion-python that referenced this pull request Jul 31, 2024
* chore: update datafusion deps

* feat: impl ExecutionPlan::static_name() for DatasetExec

This required trait method was added upstream [0] and recommends to simply forward to `static_name`.

[0]: apache/datafusion#10266

* feat: update first_value and last_value wrappers.

Upstream signatures were changed for the new new `AggregateBuilder` api [0].

This simply gets the code to work. We should better incorporate that API into `datafusion-python`.

[0] apache/datafusion#10560

* migrate count to UDAF

Builtin Count was removed upstream.

TBD whether we want to re-implement `count_star` with new API.

Ref: apache/datafusion#10893

* migrate approx_percentile_cont, approx_distinct, and approx_median to UDAF

Ref: approx_distinct apache/datafusion#10851
Ref: approx_median apache/datafusion#10840
Ref: approx_percentile_cont and _with_weight apache/datafusion#10917

* migrate avg to UDAF

Ref: apache/datafusion#10964

* migrage corr to UDAF

Ref: apache/datafusion#10884

* migrate grouping to UDAF

Ref: apache/datafusion#10906

* add alias `mean` for UDAF `avg`

* migrate stddev to UDAF

Ref: apache/datafusion#10827

* remove rust alias for stddev

The python wrapper now provides stddev_samp alias.

* migrage var_pop to UDAF

Ref: apache/datafusion#10836

* migrate regr_* functions to UDAF

Ref: apache/datafusion#10898

* migrate bitwise functions to UDAF

The functions now take a single expression instead of a Vec<_>.

Ref: apache/datafusion#10930

* add missing variants for ScalarValue with todo

* fix typo in approx_percentile_cont

* add distinct arg to count

* comment out failing test

`approx_percentile_cont` is now returning a DoubleArray instead of an IntArray.

This may be a bug upstream; it requires further investigation.

* update tests to expect lowercase `sum` in query plans

This was changed upstream.

Ref: apache/datafusion#10831

* update ScalarType data_type map

* add docs dependency pickleshare

* re-implement count_star

* lint: ruff python lint

* lint: rust cargo fmt

* include name of window function in error for find_window_fn

* refactor `find_window_fn` for debug clarity

* search default aggregate functions by both name and aliases

The alias list no longer includes the name of the function.

Ref: apache/datafusion#10658

* fix markdown in find_window_fn docs

* parameterize test_window_functions

`first_value` and `last_value` are currently failing and marked as xfail.

* add test ids to test_simple_select tests marked xfail

* update find_window_fn to search built-ins first

The behavior of `first_value` and `last_value` UDAFs currently does not match the built-in behavior.
This allowed me to remove `marks=pytest.xfail` from the window tests.

* improve first_call and last_call use of the builder API

* remove trailing todos

* fix examples/substrait.py

* chore: remove explicit aliases from functions.rs

Ref: #779

* remove `array_fn!` aliases

* remove alias rules for `expr_fn_vec!`

* remove alias rules from `expr_fn!` macro

* remove unnecessary pyo3 var-arg signatures in functions.rs

* remove pyo3 signatures that provided defaults for first_value and last_value

* parametrize test_string_functions

* test regr_ function wrappers

Closes #778
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

idea: add another static_name() method to ExecutionPlan trait
4 participants