Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display: Support preserve_partitioning on SortExec physical plan. #10153

Merged
merged 8 commits into from
Apr 30, 2024

Conversation

kavirajk
Copy link
Contributor

@kavirajk kavirajk commented Apr 20, 2024

Which issue does this PR close?

Closes #10138

Rationale for this change

Display of SortExec execution plan doesn't have visibility on preserve_partitioning field

SortExec: TopK(fetch=10000), expr=[NASME@0 ASC,VISITS@1 DESC NULLS LAST]

What changes are included in this PR?

Only important change made in this PR is

Display of SortExec execution plan now support preserve_partitioning field

SortExec: TopK(fetch=10000), expr=[NASME@0 ASC,VISITS@1 DESC NULLS LAST], preserve_partitioning=[true] 

Everything else is either updating tests manually or via some script. Following are the steps I used to update tests automatically (thanks @alamb for the tip)

  1. cargo test --test sqllogictests -- --complete - that updates all .slt files except ones from tpch benchmarks
  2. Generate tpch benchmarks data using steps mentioned here.
  3. INCLUDE_TPCH=true cargo test --test sqllogictests -- --complete - to update tpch benchmarks tests outputs.

Are these changes tested?

Yes

Are there any user-facing changes?

May be, if user try to format the Sort execution plan.

@github-actions github-actions bot added the core Core DataFusion crate label Apr 20, 2024
@alamb
Copy link
Contributor

alamb commented Apr 23, 2024

I think you can update the tests using completion mode: https://github.com/apache/datafusion/tree/main/datafusion/sqllogictest#updating-tests-completion-mode

That probably will get this PR mostly working

Fixes: apache#10138

`Display` of SortExec execution plan now support `preserve_partitioning` field

Signed-off-by: Kaviraj <[email protected]>
Signed-off-by: Kaviraj <[email protected]>
@kavirajk kavirajk force-pushed the kavirajk/sortexec-physical-plan-format branch from 56b18c7 to 6dbe69a Compare April 28, 2024 15:04
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Apr 28, 2024
@kavirajk
Copy link
Contributor Author

kavirajk commented Apr 28, 2024

@alamb thanks for the suggestion 👍 . Quick follow up. That command you mentioned sqllogictests --complete only updates the .slt files that are part of datafusion/sqllogictest crate itself.

I don't see it updates any of the other test cases where we are asserting the physical plan string representation. That still needs manual updates correct (around 79 tests were failing)?.

I started updating those, just wanted to check if there are any easy/automated way that I'm not aware of :)

@kavirajk kavirajk marked this pull request as ready for review April 28, 2024 22:51
@@ -868,11 +868,12 @@ impl DisplayAs for SortExec {
match t {
DisplayFormatType::Default | DisplayFormatType::Verbose => {
let expr = PhysicalSortExpr::format_list(&self.expr);
let preserve_partioning = self.preserve_partitioning;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE for reviewers: These changes in sort.rs is the only real change for this PR.

Everything else is just update tests either manually or automatically (see PR description for steps I used to update those)

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kavirajk -- this looks like an epic PR (lots of tests to update!)

I think this really improves the readability of the plans and will help make understanding them much easier.

@alamb
Copy link
Contributor

alamb commented Apr 29, 2024

I took the liberty of merging up from main to this branch to ensure we didn't have any logical conflicts (e.g. newly added explain plans for example). I think we can merge it once CI passes

@alamb alamb merged commit 369e201 into apache:main Apr 30, 2024
23 checks passed
@alamb
Copy link
Contributor

alamb commented Apr 30, 2024

Thanks again @kavirajk

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Display] show preserve_partitioning in sortExec
2 participants