Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add name method to execution plan #9793

Merged

Conversation

matthewmturner
Copy link
Contributor

Which issue does this PR close?

Closes #9558

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the core Core DataFusion crate label Mar 25, 2024
@@ -113,6 +113,10 @@ pub use datafusion_execution::{RecordBatchStream, SendableRecordBatchStream};
/// [`required_input_distribution`]: ExecutionPlan::required_input_distribution
/// [`required_input_ordering`]: ExecutionPlan::required_input_ordering
pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
/// Short name for the ExecutionPlan, such as 'ParquetExec'.
fn name(&self) -> &'static str {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have a default impl here? the user might forget to override the name

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Id say its LGTM, altough I would probably more incline to use just a default method withstd::any::type_name::<Self>(); so the code will be cleaner, but looks like this option was discussed

@matthewmturner
Copy link
Contributor Author

Id say its LGTM, altough I would probably more incline to use just a default method withstd::any::type_name::<Self>(); so the code will be cleaner, but looks like this option was discussed

After rethinking I think that that option is indeed a better default. I will update to that. FYI @SteveLauC

@matthewmturner
Copy link
Contributor Author

@comphead @SteveLauC I have updated the default impl. Let me know if any more concerns.

@@ -113,6 +113,15 @@ pub use datafusion_execution::{RecordBatchStream, SendableRecordBatchStream};
/// [`required_input_distribution`]: ExecutionPlan::required_input_distribution
/// [`required_input_ordering`]: ExecutionPlan::required_input_ordering
pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
/// Short name for the ExecutionPlan, such as 'ParquetExec'.
fn name(&self) -> &'static str {
let full_name = std::any::type_name::<Self>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realized that users can implement their own implementations if they don't want this method to involve any computation, nice!

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @matthewmturner I think we really close.
Reg to test, wondering can it be smaller? I'd say we dont need the full implementations for trait methods, just stubs because you testing out just .name() field and its overrides?

Instead of implementations you can put macros like todo()!, 'unreacheable()!', 'not_implemented()!'

@matthewmturner
Copy link
Contributor Author

@comphead sure, i didnt realize those could override the type analysis. updating now

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @matthewmturner lgtm
1 more thing is probably we need to ensure the name is unique? if implementation will be separated across different crates and users can develop their own implementations it can possibly be that the name will not be unique for 2 different structs

@matthewmturner
Copy link
Contributor Author

@comphead thats a good / interesting point. what comes to mind is perhaps datafusion statically defining / guaranteeing uniqueness of builtin exec names (perhaps could be automated?) and then users would be required to ensure uniqueness of their own execs. im not sure we would want to add some runtime check for that.

i think this could be a follow on PR but let me know if you think otherwise.

@comphead
Copy link
Contributor

I'm okay with the followup, as there is only 1 use case for now, but we need to keep this in mind

@alamb alamb merged commit 3a1e3ad into apache:main Mar 28, 2024
24 checks passed
@alamb
Copy link
Contributor

alamb commented Mar 28, 2024

Thanks everyone!

Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Apr 1, 2024
* Add name method to execution plan

* Cleanup

* Change default impl

* Add tests

* Clippy

* Use unimplemented macro

* Fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add name method to ExecutionPlan trait
4 participants