Skip to content

Conversation

@wiedld
Copy link

@wiedld wiedld commented Feb 12, 2025

Follow up to #58, which demonstrates:

  • (behavior): the output of EnforceDistribution
  • (bug): how this output was then made invalid in the subsequent EnforceSorting.

In this PR, we tried to recreate on the latest apache main branch. It was found that:

  • (behavior): the output of EnforceDistribution is different
  • (bug): bug still occurs. If we provide the same valid plan, the EnforceSorting makes it invalid.

Followups:

Upstream/apache fix is going to focus on the EnforceSorting bug.
Separately, going to make some of the test setup cleanups from the EnforceDistribution learnings.

@github-actions github-actions bot added the core label Feb 12, 2025
Comment on lines +3486 to +3493
/// Same starting plan as [`repartitions_for_aggregate_after_sorted_union`], but adds a projection
/// as well between the union and aggregate. This change the outcome:
///
/// * we still get repartitioning
/// * we still get another sort added
/// * we no longer get 2 additional sorts pushed down below the union, after the first run
#[test]
fn repartitions_for_aggregate_after_sorted_union_projection() -> Result<()> {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This no longer inserts the coalesce. Even after 2 runs of enforce distribution.

}

#[tokio::test]
async fn test_preserve_needed_coalesce() -> Result<()> {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this bug still occurs. EnforceSorting can make an valid plan, become invalid.

Comment on lines +3419 to +3424
// Demonstrate: plan changes are not idempotent.
//
// After the second run, the optimizer:
// * replaces the SPM with a sort, and removes the sorts pushed down below the union.
// * swaps the repartition vs SPM.
let expected_after_second_run = &[
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proof that it's not idempotent.

Comment on lines +3530 to +3534
// Demonstrate: plan changes are not idempotent.
//
// After the second run, the optimizer:
// * replaces the SPM with a sort.
let expected_after_second_run = &[
Copy link
Author

@wiedld wiedld Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is proof that it's not idempotent? 🤔

@wiedld wiedld closed this Feb 28, 2025
@wiedld wiedld mentioned this pull request Jul 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants