Skip to content

Conversation

@ulysses-you
Copy link
Contributor

@ulysses-you ulysses-you commented Aug 2, 2022

this is for backport #37330 into branch-3.3

What changes were proposed in this pull request?

Optimize Global sort to RepartitionByExpression, for example:

Sort local             Sort local
  Sort global    =>      RepartitionByExpression

Why are the changes needed?

If a global sort below a local sort, the only meaningful thing is it's distribution. So this pr optimizes that global sort to RepartitionByExpression to save a local sort.

Does this PR introduce any user-facing change?

we fix a bug in #37250 and that pr backport into branch-3.3. However, that fix may introduce performance regression. This pr itself is only to improve performance but in order to avoid the regression, we also backport this pr. see the details #37330 (comment)

How was this patch tested?

add test

Closes #37330 from ulysses-you/optimize-sort.

Authored-by: ulysses-you [email protected]
Signed-off-by: Wenchen Fan [email protected]

### What changes were proposed in this pull request?

Optimize Global sort to RepartitionByExpression, for example:
```
Sort local             Sort local
  Sort global    =>      RepartitionByExpression
```

### Why are the changes needed?

If a global sort below a local sort, the only meaningful thing is it's distribution. So this pr optimizes that global sort to RepartitionByExpression to save a local sort.

### Does this PR introduce _any_ user-facing change?

no, only improve performance

### How was this patch tested?

add test

Closes apache#37330 from ulysses-you/optimize-sort.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@github-actions github-actions bot added the SQL label Aug 2, 2022
@ulysses-you
Copy link
Contributor Author

cc @cloud-fan

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you describe some context, please? The current PR description is only describing improvements which is not qualified for legitimate backporting. I believe you have a reason to fix some bugs or regression, @ulysses-you and @cloud-fan .

no, only improve performance

@ulysses-you
Copy link
Contributor Author

@dongjoon-hyun yes, the story is we fix a bug in #37250 and that pr backport into branch-3.3. However, that fix may introduce performance regression. This pr itself is only to improve performance but in order to avoid the regression, we also backport this pr. see the details #37330 (comment)

@dongjoon-hyun
Copy link
Member

Than you, @ulysses-you . Please put the explanation into the PR description.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @huaxingao , too.

@ulysses-you
Copy link
Contributor Author

@dongjoon-hyun thank you, I have updated the description

@cloud-fan
Copy link
Contributor

@cloud-fan
Copy link
Contributor

merging to 3.3!

cloud-fan pushed a commit that referenced this pull request Aug 3, 2022
this is for backport #37330 into branch-3.3
### What changes were proposed in this pull request?

Optimize Global sort to RepartitionByExpression, for example:
```
Sort local             Sort local
  Sort global    =>      RepartitionByExpression
```

### Why are the changes needed?

If a global sort below a local sort, the only meaningful thing is it's distribution. So this pr optimizes that global sort to RepartitionByExpression to save a local sort.

### Does this PR introduce _any_ user-facing change?

we fix a bug in #37250 and that pr backport into branch-3.3. However, that fix may introduce performance regression. This pr itself is only to improve performance but in order to avoid the regression, we also backport this pr. see the details #37330 (comment)

### How was this patch tested?

add test

Closes #37330 from ulysses-you/optimize-sort.

Authored-by: ulysses-you <ulyssesyou18gmail.com>
Signed-off-by: Wenchen Fan <wenchendatabricks.com>

Closes #37373 from ulysses-you/SPARK-39911-3.3.

Authored-by: ulysses-you <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@cloud-fan cloud-fan closed this Aug 3, 2022
@ulysses-you ulysses-you deleted the SPARK-39911-3.3 branch August 3, 2022 03:14
@dongjoon-hyun
Copy link
Member

Thank you all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants