Make HttpRemoteTaskWithEventLoop::addSplit() update pending split stats immediately #26126

Merged
spershin merged 1 commit into prestodb:master from spershin:FixHttpRemoteTaskWithEventLoop
Sep 23, 2025

@spershin
Contributor

Description

Introducing HttpRemoteTaskWithEventLoop to solve lock contention had an unexpected side effect: the split statistics in the class, which the split scheduling code relies on, were no longer updated immediately.
This could lead the split scheduling code to add an overwhelming number of splits to a single worker. Cases of 1200 splits sent in a single message have been observed, whereas a typical message contains 1-5 splits.

The fix updates the critical statistics synchronously and leaves the rest of the work to run asynchronously.
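
As a rough illustration of the idea (not the actual Presto code), the sketch below bumps the pending split counters on the calling thread and hands the remaining work to a single-threaded executor that stands in for the event loop. The class name `PendingSplitStatsSketch`, the `queuedOnEventLoop` counter, and the use of plain `Long` weights instead of `Split` objects are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the remote task; names are illustrative only.
class PendingSplitStatsSketch
{
    // Counters that a scheduler thread could read at any time.
    private final AtomicInteger pendingSourceSplitCount = new AtomicInteger();
    private final AtomicLong pendingSourceSplitsWeight = new AtomicLong();

    // Assumption: a single daemon thread plays the role of the event loop.
    private final ExecutorService eventLoop = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "event-loop-sketch");
        thread.setDaemon(true);
        return thread;
    });

    // Number of splits the event loop has processed so far (illustrative).
    private final AtomicInteger queuedOnEventLoop = new AtomicInteger();

    void addSplits(List<Long> splitWeights)
    {
        if (splitWeights.isEmpty()) {
            return;
        }
        int count = splitWeights.size();
        long weight = 0;
        for (long splitWeight : splitWeights) {
            weight += splitWeight;
        }

        // Synchronous part: the counters are visible to the caller immediately.
        pendingSourceSplitCount.addAndGet(count);
        pendingSourceSplitsWeight.addAndGet(weight);

        // Asynchronous part: everything else is deferred to the event loop.
        eventLoop.execute(() -> queuedOnEventLoop.addAndGet(count));
    }

    int getPendingSourceSplitCount()
    {
        return pendingSourceSplitCount.get();
    }

    long getPendingSourceSplitsWeight()
    {
        return pendingSourceSplitsWeight.get();
    }

    int getQueuedOnEventLoop()
    {
        return queuedOnEventLoop.get();
    }

    // Stops the event loop and waits briefly for deferred work to finish.
    void shutdown()
    {
        eventLoop.shutdown();
        try {
            eventLoop.awaitTermination(5, TimeUnit.SECONDS);
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args)
    {
        PendingSplitStatsSketch task = new PendingSplitStatsSketch();
        task.addSplits(Arrays.asList(100L, 100L, 50L));
        // The counters already reflect the new splits, even if the event loop has not run yet.
        System.out.println("pending count:  " + task.getPendingSourceSplitCount());
        System.out.println("pending weight: " + task.getPendingSourceSplitsWeight());
        task.shutdown();
        System.out.println("queued on loop: " + task.getQueuedOnEventLoop());
    }
}
```

The point of the split is that a scheduler reading the counters right after `addSplits()` returns sees the new splits, so it has no reason to keep piling more onto the same worker while the event loop catches up.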

Test Plan

This has been tested in a heavily loaded production shadow environment, where we previously saw severe skew in split distribution.

== NO RELEASE NOTE ==

@spershin spershin requested a review from a team as a code owner September 23, 2025 03:20
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label Sep 23, 2025
@sourcery-ai
Contributor

sourcery-ai bot commented Sep 23, 2025

Reviewer's Guide

This PR refactors HttpRemoteTaskWithEventLoop to use atomic counters for pending split stats, updates those stats synchronously in addSplits(), and streamlines split queue space threshold management to prevent skewed split distribution.

File-Level Changes

Change: Replace volatile split counters with atomic types and update stats immediately on split addition
Details:
  • Convert pendingSourceSplitCount and pendingSourceSplitsWeight to AtomicInteger/AtomicLong
  • Compute new split count/weight before event loop and atomically add them
  • Call updateTaskStats() synchronously when new splits arrive
Files: HttpRemoteTaskWithEventLoop.java

Change: Refactor split queue space threshold logic and listener notification
Details:
  • Replace OptionalLong threshold with whenSplitQueueWeightThreshold field
  • Introduce setSplitQueueWeightThreshold() and splitQueueHasSpace() helpers
  • Simplify whenSplitQueueHasSpace() to check via splitQueueHasSpace() and register listeners accordingly
Files: HttpRemoteTaskWithEventLoop.java

Change: Accumulate and apply removal of acknowledged splits with atomic updates
Details:
  • Accumulate removed splits count/weight during processTaskUpdate
  • Atomically decrement counters, then call updateTaskStats() and updateSplitQueueSpace() only if removals occurred
Files: HttpRemoteTaskWithEventLoop.java

Change: Clear atomic counters in cleanup path
Details:
  • Reset pendingSourceSplitCount and pendingSourceSplitsWeight via .set(0)
  • Ensure updateTaskStats() is invoked after reset
Files: HttpRemoteTaskWithEventLoop.java
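
The threshold and acknowledgement changes listed above could look roughly like the following sketch. It is an assumption-heavy illustration, not the code from the PR: the class `SplitQueueSpaceSketch`, the `acknowledge()` method, and the use of `CompletableFuture` as the "has space" signal are hypothetical, and only the helper name `splitQueueHasSpace()` is borrowed from the summary.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of threshold-based "queue has space" notification.
class SplitQueueSpaceSketch
{
    private final AtomicLong pendingSourceSplitsWeight = new AtomicLong();

    // Guarded by "this". A negative value means no threshold was registered yet.
    private long weightThreshold = -1;
    private CompletableFuture<Void> whenHasSpace = CompletableFuture.completedFuture(null);

    // Called when splits are added; mirrors the synchronous counter update.
    void addWeight(long weight)
    {
        pendingSourceSplitsWeight.addAndGet(weight);
    }

    // Returns a future that is done as soon as pending weight is below the threshold.
    synchronized CompletableFuture<Void> whenSplitQueueHasSpace(long threshold)
    {
        weightThreshold = threshold;
        if (splitQueueHasSpace()) {
            return CompletableFuture.completedFuture(null);
        }
        if (whenHasSpace.isDone()) {
            whenHasSpace = new CompletableFuture<>();
        }
        return whenHasSpace;
    }

    // Accumulates the removed weight first, then applies one atomic decrement.
    synchronized void acknowledge(long[] removedWeights)
    {
        long removed = 0;
        for (long weight : removedWeights) {
            removed += weight;
        }
        if (removed == 0) {
            return; // nothing was acknowledged, so skip the notification path
        }
        pendingSourceSplitsWeight.addAndGet(-removed);
        if (splitQueueHasSpace()) {
            whenHasSpace.complete(null);
        }
    }

    // Only called while holding the lock on "this".
    private boolean splitQueueHasSpace()
    {
        return pendingSourceSplitsWeight.get() < weightThreshold;
    }

    public static void main(String[] args)
    {
        SplitQueueSpaceSketch queue = new SplitQueueSpaceSketch();
        queue.addWeight(300);
        CompletableFuture<Void> space = queue.whenSplitQueueHasSpace(200);
        System.out.println("has space before ack: " + space.isDone());
        queue.acknowledge(new long[] {100, 50});
        System.out.println("has space after ack:  " + space.isDone());
    }
}
```

Accumulating the removed weight and applying it in one decrement keeps the notification check to a single comparison per update, which is the behaviour the third row of the summary describes.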

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `presto-main/src/main/java/com/facebook/presto/server/remotetask/HttpRemoteTaskWithEventLoop.java:558-567` </location>
<code_context>
             return;
         }

+        int count = 0;
+        long weight = 0;
+        for (Entry<PlanNodeId, Collection<Split>> entry : splitsBySource.asMap().entrySet()) {
+            PlanNodeId sourceId = entry.getKey();
+            Collection<Split> splits = entry.getValue();
+
+            if (tableScanPlanNodeIds.contains(sourceId)) {
+                count += splits.size();
+                weight += splits.stream().map(Split::getSplitWeight)
+                        .mapToLong(SplitWeight::getRawValue)
+                        .sum();
+            }
+        }
+        if (count != 0) {
+            pendingSourceSplitCount.addAndGet(count);
+            pendingSourceSplitsWeight.addAndGet(weight);
</code_context>

<issue_to_address>
**issue (bug_risk):** Potential for double-counting splits if addSplits is called concurrently.

Because pendingSplits is not thread-safe, concurrent calls to addSplits may cause race conditions. Please either document that addSplits must be single-threaded or switch to a thread-safe collection for pendingSplits.
</issue_to_address>
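
For context on the counter part of this concern, the sketch below shows that `AtomicInteger.addAndGet()` on its own does not lose updates when several threads call it at once. It says nothing about whether pendingSplits itself is safe under concurrent access; the class `ConcurrentCounterSketch` and its method are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demonstration: atomic increments from several threads are not lost.
class ConcurrentCounterSketch
{
    static int runConcurrentAdds(int threads, int addsPerThread)
    {
        AtomicInteger pendingCount = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < addsPerThread; j++) {
                    pendingCount.addAndGet(1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join makes the worker's updates visible here
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return pendingCount.get();
    }

    public static void main(String[] args)
    {
        System.out.println(runConcurrentAdds(4, 10_000));
    }
}
```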

@tdcmeehan
Contributor

Thanks for the investigation @spershin. Does this relate in any way to the recent work on task-based scheduling for Prestissimo clusters?

@spershin spershin merged commit 6243a1b into prestodb:master Sep 23, 2025
76 of 77 checks passed
Labels

from:Meta PR from Meta

4 participants