Avoid recalculating nodes in a graph #696
Comments
Design document here: https://docs.google.com/document/d/1vqMtAv8bNpw-ARTt-MPZwQQGZlSJUJb64hGspEHB8dE/edit?usp=sharing
CALCITE - Optimized plan
JSON graph representations will be attached.
We should use Calcite to tell us which relational algebra nodes are shared. Through some experiments I have found the following:
Then we can get a plan that looks like this:
Notice that we now have an
Note that the suggestion here comes in part from what we learned while experimenting with the work started here: https://github.com/BlazingDB/blazingsql/pull/704/files
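To make "shared relational algebra nodes" concrete, here is a minimal sketch of spotting subtrees that appear under more than one parent in a plan. It is only an illustration: `PlanNode`, `digest` and `shared_subtrees` are hypothetical names, not Calcite or BlazingSQL APIs, and the string digest is an assumed stand-in for whatever identity Calcite actually reports.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanNode:
    """Hypothetical stand-in for a relational algebra node (not a Calcite class)."""
    op: str
    inputs: tuple = ()


def digest(node):
    """Build a canonical string for a subtree; equal subtrees get equal digests."""
    return f"{node.op}({','.join(digest(child) for child in node.inputs)})"


def shared_subtrees(root):
    """Return digests of subtrees that are reached from more than one parent."""
    counts = defaultdict(int)

    def visit(node):
        d = digest(node)
        counts[d] += 1
        # Descend only on the first visit, so that only the top of a shared
        # subtree is reported, not every node underneath it.
        if counts[d] == 1:
            for child in node.inputs:
                visit(child)

    visit(root)
    return sorted(d for d, c in counts.items() if c > 1)


# Usage: a join whose two inputs are the same filtered scan.
scan = PlanNode("scan_orders")
filt = PlanNode("filter", (scan,))
join = PlanNode("join", (filt, filt))
print(shared_subtrees(join))
```

In this sketch the filter (together with the scan beneath it) is the node whose result could be computed once and reused.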
Using the new information we can get from Calcite, we can know which kernels should be reused. This means that a kernel can now output to more than one kernel. There are many ways we can do that. To help describe the options, let's assume there is a kernel 1A that outputs to two kernels, 2A and 2B.
Option 1:
Option 2:
We should go with Option 1. I have been thinking about the implementation details, and there are a couple of things to keep in mind. Additionally, the logic that receives a message that is sent to a specific cache (i.e.
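For the 1A, 2A, 2B scenario described earlier, a rough sketch of one kernel feeding two downstream kernels might look like the following. This is not a rendering of Option 1 or Option 2 (their details are not reproduced in this thread); it just assumes, for illustration, one output cache per consumer. `Cache` and `run_kernel_1a` are hypothetical names, and the doubling step is a placeholder for the real kernel work.

```python
import queue


class Cache:
    """Hypothetical stand-in for a cache between kernels: a thread-safe queue."""

    def __init__(self, name):
        self.name = name
        self._q = queue.Queue()

    def put(self, batch):
        self._q.put(batch)

    def finish(self):
        # A None sentinel marks the end of the stream.
        self._q.put(None)

    def drain(self):
        """Collect every batch up to the end-of-stream sentinel."""
        out = []
        while (batch := self._q.get()) is not None:
            out.append(batch)
        return out


def run_kernel_1a(batches, outputs):
    """Compute each batch once and hand the same result to every output cache."""
    for batch in batches:
        result = [x * 2 for x in batch]  # placeholder for the real work
        for cache in outputs:
            # The same object is shared, so consumers must treat it as read-only.
            cache.put(result)
    for cache in outputs:
        cache.finish()


# Usage: kernel 1A feeds the input caches of kernels 2A and 2B.
to_2a, to_2b = Cache("2A"), Cache("2B")
run_kernel_1a([[1, 2], [3]], [to_2a, to_2b])
print(to_2a.drain(), to_2b.drain())
```

The point of the sketch is only that 1A's work runs once per batch regardless of how many kernels consume it; whether results are shared or copied per consumer is a separate design decision.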
Tasks for this: