Library Guide: Extending DataFusion's operators: custom LogicalPlan and ExecutionPlans #7308

Open
alamb opened this issue Aug 16, 2023 · 2 comments
Labels
documentation (Improvements or additions to documentation), enhancement (New feature or request)

Comments

@alamb
Contributor

alamb commented Aug 16, 2023

Is your feature request related to a problem or challenge?

Part of #7014

If we want to have DataFusion used as the core of many new systems, we need it to be as easy as possible for someone to get their idea working on top of DataFusion.

Thanks to @tshauck we now have a basic Library Users Guide ❤️, and this ticket describes how to expand it.

Describe the solution you'd like

Fill in the content of https://arrow.apache.org/datafusion/library-user-guide/extending-operators.html

We can draw inspiration from https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/user_defined/user_defined_plan.rs

Example Outline

  1. Introduce an example plan node that cannot be expressed with existing relational operators (maybe pivot rows to columns, like here)
  2. Show how to define the user-defined logical extension node
  3. Show how to use an extension physical planner to plan such a node (example here)
  4. Show how to create a simplified execution plan / stream
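
To make the outline above concrete, here is a minimal sketch of the three layers (logical node, planner, execution) for a rows-to-columns pivot. It deliberately uses only the Rust standard library and does not use DataFusion's actual traits or types: `PivotNode`, `PivotExec`, `plan_pivot`, and `LongRow` are hypothetical names invented for illustration, and a real guide would implement the corresponding DataFusion extension traits and operate on Arrow record batches instead.

```rust
use std::collections::BTreeMap;

/// One input row in "long" form: (row key, column name, value).
/// Hypothetical stand-in for a record batch.
type LongRow = (String, String, i64);

/// Hypothetical logical node: describes *what* to compute, not how.
#[derive(Debug, Clone, PartialEq)]
struct PivotNode {
    /// Output columns to produce, in order.
    columns: Vec<String>,
}

/// Hypothetical physical operator produced from a `PivotNode`.
#[derive(Debug)]
struct PivotExec {
    columns: Vec<String>,
}

/// Stand-in for an extension planner: maps the logical node to an operator.
fn plan_pivot(node: &PivotNode) -> PivotExec {
    PivotExec {
        columns: node.columns.clone(),
    }
}

impl PivotExec {
    /// Pivot rows to columns: one output row per key, one cell per column.
    /// Missing cells are `None`; a later duplicate overwrites an earlier one.
    /// Output rows are sorted by key because a `BTreeMap` is used.
    fn execute(&self, input: &[LongRow]) -> Vec<(String, Vec<Option<i64>>)> {
        let mut out: BTreeMap<String, Vec<Option<i64>>> = BTreeMap::new();
        for (key, col, value) in input {
            // Skip values for columns the node did not ask for.
            if let Some(idx) = self.columns.iter().position(|c| c == col) {
                let cells = out
                    .entry(key.clone())
                    .or_insert_with(|| vec![None; self.columns.len()]);
                cells[idx] = Some(*value);
            }
        }
        out.into_iter().collect()
    }
}
```

Calling `plan_pivot(&node).execute(&rows)` with a node that requests columns `q1` and `q2` yields one row per key with two cells each. The split mirrors the outline: the logical node carries only the description, the planner decides which operator runs it, and the operator holds the row-by-row logic.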

The examples directory holds many more examples: https://github.com/apache/arrow-datafusion/tree/main/datafusion-examples

Describe alternatives you've considered

No response

Additional context

No response

@alamb added the documentation, enhancement, and devrel labels on Aug 16, 2023
@brayanjuls
Contributor

I was investigating pivoting in the DataFrame API and found that some of the links in this issue are broken. I am leaving the replacements here for anyone who works on this in the future:

  1. pivot rows to columns, link
  2. how to use extension physical planner, link

@alamb
Contributor Author

alamb commented Aug 19, 2024

Thanks @brayanjuls

@alamb removed the devrel label on Oct 21, 2024