minor: non-overlapping repart_time and send_time metrics #11440

Merged
merged 1 commit into from
Jul 16, 2024

@korowa korowa commented Jul 12, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

Currently, the repart_time and send_time metrics for RepartitionExec may overlap significantly, because the repart_time timer is bound to the iterator over repartitioned batches (partition_iter), which is also used for sending these batches.

This PR limits the time accounted for by the repart_time timer.

What changes are included in this PR?

Repartition time now accounts only for the time required to calculate how the row indices of an input batch are distributed across output partitions, plus the time required to produce the output batches.
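
To illustrate the idea, here is a minimal, standalone sketch using only the Rust standard library. It is not the code from this PR and does not use DataFusion's metrics API: `Timer`, `repartition`, `send`, and `run` are hypothetical names, the modulo partitioning is a stand-in for the real hashing logic, and the sleep merely simulates a slow output channel. The point is only that the repartition timer stops before any sending starts, so the two measurements cannot overlap.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a metrics timer: accumulates elapsed wall time.
#[derive(Default)]
struct Timer {
    total: Duration,
}

impl Timer {
    /// Runs `f`, adds its elapsed time to this timer, and returns its result.
    fn time<T>(&mut self, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = f();
        self.total += start.elapsed();
        out
    }
}

/// Hypothetical repartitioning step: assigns each row to partition
/// `value % n` and builds one output batch per partition.
fn repartition(batch: &[u64], n: usize) -> Vec<Vec<u64>> {
    let mut out = vec![Vec::new(); n];
    for &v in batch {
        out[(v % n as u64) as usize].push(v);
    }
    out
}

/// Hypothetical send step; the sleep simulates a slow output channel.
fn send(_partition: usize, _batch: Vec<u64>) {
    std::thread::sleep(Duration::from_millis(5));
}

/// Returns (repart_time, send_time) for one input batch.
fn run(batch: &[u64], n: usize) -> (Duration, Duration) {
    let mut repart_time = Timer::default();
    let mut send_time = Timer::default();

    // Only index calculation and output batch construction are timed here.
    let parts = repart_time.time(|| repartition(batch, n));

    // The repartition timer is no longer running; sending is timed separately.
    for (i, part) in parts.into_iter().enumerate() {
        send_time.time(|| send(i, part));
    }

    (repart_time.total, send_time.total)
}

fn main() {
    let batch: Vec<u64> = (0..1000).collect();
    let (repart, send) = run(&batch, 4);
    println!("repart_time = {repart:?}, send_time = {send:?}");
}
```

If the timer instead wrapped a lazy iterator that is consumed inside the send loop, every slow send would also be charged to the repartition timer, which is the overlap described in the rationale.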

Are these changes tested?

Only manually.

Are there any user-facing changes?

No

@korowa korowa changed the title minor: split repartition time and send time metrics minor: non-overlapping repart_time and send_time metrics Jul 12, 2024
@alamb alamb left a comment

Makes sense to me -- thank you @korowa

datafusion/physical-plan/src/repartition/mod.rs (outdated, resolved)
datafusion/physical-plan/src/repartition/mod.rs (outdated, resolved)
@alamb alamb merged commit 2837e02 into apache:main Jul 16, 2024
23 checks passed
alamb commented Jul 16, 2024

Thanks @korowa

findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 17, 2024
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 18, 2024
wiedld pushed a commit to influxdata/arrow-datafusion that referenced this pull request Jul 31, 2024