Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve efficiency of multiple optimizer passes #3892

Closed
andygrove opened this issue Oct 19, 2022 · 3 comments · Fixed by #5623
Closed

Improve efficiency of multiple optimizer passes #3892

andygrove opened this issue Oct 19, 2022 · 3 comments · Fixed by #5623
Labels
enhancement New feature or request optimizer Optimizer rules

Comments

@andygrove
Copy link
Member

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Improve efficiency of multiple optimizer passes

Describe the solution you'd like

Thanks to @isidentical for this suggestion.

Instead of having the optimizer decide when it is done by seeing if the last pass changed the plan or not, based on the Display representation of the plan, it might also make sense to compute a unique plan id (bottom up) so that we can also use this to detect optimization cycles.

A very basic example is (assuming each letter is a unique plan id) A -> B -> C -> A -> B -> [max passes times more], where even though the previous plan is different from the current one we would still need to exit the loop. Having a unique id would mean we can just store a set somewhere and check against if known_plans.contains(new_plan.id) and it would break the loop.

Describe alternatives you've considered

Additional context
Discussion at https://github.com/apache/arrow-datafusion/pull/3880/files#r998491734

@andygrove andygrove added enhancement New feature or request optimizer Optimizer rules labels Oct 19, 2022
@Dandandan
Copy link
Contributor

Wouldn't this be a hash?

@isidentical
Copy link
Contributor

Wouldn't this be a hash?

Pretty much! I am not sure if there is anything internally that would prevent them being 'hashable' (as in getting consistent summary), but if there is none we can even implement native rust Hash and just use a hash-set.

@HaoYang670
Copy link
Contributor

As this is very similar to the optimizer in compilers, I guess we can also find some inspiration from how compilers judge to finish the optimizing process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request optimizer Optimizer rules
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants