RepartitionExec incorrectly reports metrics for all partitions against a single partition #10015

Closed
alamb opened this issue Apr 9, 2024 · 0 comments · Fixed by #10025
Labels
bug Something isn't working

alamb commented Apr 9, 2024

Describe the bug

@crepererum notes in #10009 (comment):

This was broken before: the first parameter was set to partition, i.e. the first partition that initializes the state. That's clearly wrong. I now initialize it to 0 and will fix the tracking properly in a follow-up PR.
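The failure mode described in the quote can be illustrated with a minimal, standalone sketch. It does not use DataFusion's actual types: `SharedState`, `record_buggy`, and `record_per_partition` are hypothetical names, and a plain `HashMap` stands in for the metrics registry. The assumption is only that some shared state is created lazily by whichever partition runs first, and that the metric label is captured at that moment.

```rust
use std::collections::HashMap;

/// Hypothetical shared state, created once by whichever partition calls first.
struct SharedState {
    /// Partition label captured when the state was initialized.
    label: usize,
}

/// Buggy pattern: every caller records against the label captured by the
/// first caller, so all time ends up on a single partition.
fn record_buggy(
    state: &mut Option<SharedState>,
    registry: &mut HashMap<usize, u64>,
    partition: usize,
    nanos: u64,
) {
    let shared = state.get_or_insert_with(|| SharedState { label: partition });
    *registry.entry(shared.label).or_insert(0) += nanos;
}

/// Expected pattern: each caller records against its own partition.
fn record_per_partition(registry: &mut HashMap<usize, u64>, partition: usize, nanos: u64) {
    *registry.entry(partition).or_insert(0) += nanos;
}

fn main() {
    let mut state = None;
    let mut buggy = HashMap::new();
    let mut fixed = HashMap::new();

    // Partition 2 happens to run first and initializes the shared state.
    for partition in [2usize, 0, 1] {
        record_buggy(&mut state, &mut buggy, partition, 10);
        record_per_partition(&mut fixed, partition, 10);
    }

    println!("buggy: {:?}", buggy);
    println!("fixed: {:?}", fixed);
}
```

In the buggy variant, which partition receives all the time depends on scheduling order, so the reported numbers are misleading and not even stable between runs.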

To Reproduce

No response

Expected behavior

The metrics for each partition should be attributed to that partition, not to whichever partition initialized the shared state.

Additional context

Found on #10009

@alamb alamb added the bug Something isn't working label Apr 9, 2024
@crepererum crepererum self-assigned this Apr 10, 2024
crepererum added a commit to crepererum/arrow-datafusion that referenced this issue Apr 10, 2024
`RepartitionExec` is somewhat special. While most execs operate on
"input partition = output partition", `RepartitionExec` drives all of
its work using input-bound tasks. The metrics "fetch time" and
"repartition time" therefore have to be accounted for the input
partition, not for the output partition. The only metric that has an
input & output partition label is the "send time".

Fixes apache#10015.
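
The labeling scheme in the commit message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the fix: `RepartitionTimings` and `run_input_task` are hypothetical, the durations are made up, and plain `HashMap`s stand in for DataFusion's metrics set. The point is only the keys: fetch and repartition time are keyed by input partition, while send time is keyed by an (input, output) pair.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical container for the three timings named in the commit message.
#[derive(Default)]
struct RepartitionTimings {
    /// Time spent pulling batches from the input, keyed by input partition.
    fetch_time: HashMap<usize, Duration>,
    /// Time spent splitting batches, keyed by input partition.
    repartition_time: HashMap<usize, Duration>,
    /// Time spent sending to an output, keyed by (input, output) partition.
    send_time: HashMap<(usize, usize), Duration>,
}

impl RepartitionTimings {
    fn add_fetch(&mut self, input: usize, d: Duration) {
        *self.fetch_time.entry(input).or_default() += d;
    }

    fn add_repartition(&mut self, input: usize, d: Duration) {
        *self.repartition_time.entry(input).or_default() += d;
    }

    fn add_send(&mut self, input: usize, output: usize, d: Duration) {
        *self.send_time.entry((input, output)).or_default() += d;
    }
}

/// Simulates one input-bound task: it fetches and repartitions on behalf of
/// its input partition, then sends one slice to every output partition.
/// The durations are placeholder values for illustration.
fn run_input_task(timings: &mut RepartitionTimings, input: usize, num_outputs: usize) {
    timings.add_fetch(input, Duration::from_millis(5));
    timings.add_repartition(input, Duration::from_millis(2));
    for output in 0..num_outputs {
        timings.add_send(input, output, Duration::from_millis(1));
    }
}

fn main() {
    let mut timings = RepartitionTimings::default();
    let (num_inputs, num_outputs) = (2, 3);
    for input in 0..num_inputs {
        run_input_task(&mut timings, input, num_outputs);
    }
    println!("fetch entries: {}", timings.fetch_time.len());
    println!("repartition entries: {}", timings.repartition_time.len());
    println!("send entries: {}", timings.send_time.len());
}
```

With 2 input and 3 output partitions, this layout yields one fetch and one repartition entry per input partition and one send entry per (input, output) pair.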
alamb pushed a commit that referenced this issue Apr 10, 2024
Fixes #10015.