Skip to content

Implement task level retries#9818

Merged
martint merged 12 commits intotrinodb:masterfrom
arhimondr:implement-task-level-retries
Jan 21, 2022
Merged

Implement task level retries#9818
martint merged 12 commits intotrinodb:masterfrom
arhimondr:implement-task-level-retries

Conversation

@arhimondr
Copy link
Copy Markdown
Contributor

No description provided.

@cla-bot cla-bot bot added the cla-signed label Oct 29, 2021
@arhimondr arhimondr force-pushed the implement-task-level-retries branch from eb068b3 to 5ae17d5 Compare October 29, 2021 19:16
@arhimondr arhimondr requested a review from linzebing October 29, 2021 19:23
@arhimondr arhimondr force-pushed the implement-task-level-retries branch from 5ae17d5 to 10eaae8 Compare November 1, 2021 21:31
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this state is not necessary and adds confusion --- even for UI display. If the stage has finished its tasks, then it's finished; in the case of a retry, we should either show it's RUNNING again, or we make it explicit saying that it's a retry.

Suggested change
* Stage is finished running existing tasks but more tasks could be scheduled in the future.
* Stage has finished running existing tasks but more tasks could be scheduled in the future.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the stage has finished its tasks, then it's finished; in the case of a retry, we should either show it's RUNNING again, or we make it explicit saying that it's a retry.

That's a good point. I also don't really like the "PENDING" state. The problem is that we need to have a terminal state, a state that would indicate that no more tasks can be scheduled in a given stage. This is needed for the final stage info creation that would create a final summary of runtime statistics for a given stage. I was thinking about something like COMPLETED / FINISHED, but the semantic difference between these two words is too subtle, and I thought i may introduce even more confusion.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be declared as void

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to be consistent with the other two methods, cancelTask and abortTask

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need ImmutableSet.copyOf here? I don't see a potential race condition.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to find a number of unique nodes. The immutability doesn't have any special meaning here, just a useful constructor method for a set. Ideally it should be refactored to provide a number of unique nodes explicitly, as a constructor parameter. But the refactor wasn't trivial. I decided to delay the refactor until later since the impact of this de-duplication is not that high (it is only called once per stage and fixed mappings is rather a corner case and are rare).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be more idiomatic for that purpose:

bucketToNode.stream()
    .distinct()
    .count();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we call scheduleTask with the last two parameters as ImmutableMultimap.of() very often, does it make sense to create an overloaded version of scheduleTask?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about that. There's a subtle trade-off though. Overloads tend to add complexity, as now you need to think if there's any semantic difference between two overloads and what that semantic difference might be. Another trade-off is that overloads add extra overhead when trying to search through the code where the method is used. You need to check each overload separately.

@arhimondr arhimondr force-pushed the implement-task-level-retries branch 3 times, most recently from 5cc34ca to ea58f80 Compare November 2, 2021 18:18
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this needs to be synchronized now?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method has been made public, and now it can be called by any thread, thus synchronization must be enforced (similarly to cancel and abort)

}

@Override
public synchronized void noMoreTasks()
Copy link
Copy Markdown
Member

@linzebing linzebing Nov 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can retry tasks still be added after noMoreTasks gets called? Or it will only be called after all tasks finish/fail?

If it's the former, then I don't see why we need this method; if it's the latter, why call checkInputFinished in taskFailed/taskFinished?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can retry tasks still be added after noMoreTasks gets called?

No. After this method is called no new tasks can be added.

if it's the latter, why call checkInputFinished in taskFailed/taskFinished?

The noMoreTasks might be called when no future retries are anticipated (when retries are disabled or when number of attempts is exhausted). Yet some tasks might still be running.

Multimap<PlanNodeId, Split> tableScanSplits = batchTask.getSplits();
Multimap<PlanNodeId, Split> remoteSplits = createRemoteSplits(batchTask.getShuffleInputs());

Multimap<PlanNodeId, Split> taskSplits = ImmutableListMultimap.<PlanNodeId, Split>builder()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding, one of tableScanSplits and remoteSplits will be empty because a batch task either reads from table scans or shuffle files.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of bucketed or collocated join it can read both, a table (thus the splits) and a remote exchange (if one of the tables is not bucketed and must be repartitioned)

StageId stageId = stageExecution.getStageId();
allStages.put(stageId, stageExecution);
if (fragment.getPartitioning().isCoordinatorOnly()) {
coordinatorStagesInTopologicalOrder.add(stageExecution);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is reverse topological order, not sure if we can have better naming here to avoid confusion

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a little non intuitive. But the way the plan is structured is top down. The root node has references to it's children. Since we are following the reference direction it is still a topological order (despite the plan leaves come last)

log.warn(failureInfo.toException(), "Task failed: %s", taskId);
ErrorCode errorCode = failureInfo.getErrorCode();
if (remainingRetryAttempts > 0 && (errorCode == null || errorCode.getType() != USER_ERROR)) {
remainingRetryAttempts--;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So remainingRetryAttempts is on a stage-basis (instead of task-basis)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it is a stage based, but I'm not sure if that's the right strategy. We should discuss what is the right strategy.

@arhimondr arhimondr force-pushed the implement-task-level-retries branch 2 times, most recently from f716aff to 01b40d3 Compare November 4, 2021 07:08
@arhimondr arhimondr force-pushed the implement-task-level-retries branch 2 times, most recently from 1016d56 to 2a9c5fa Compare November 5, 2021 01:29
@arhimondr arhimondr force-pushed the implement-task-level-retries branch from 2a9c5fa to 149fafb Compare November 5, 2021 18:30
@arhimondr arhimondr mentioned this pull request Nov 10, 2021
31 tasks
@arhimondr arhimondr force-pushed the implement-task-level-retries branch from 149fafb to 390c682 Compare November 15, 2021 17:10
@arhimondr arhimondr force-pushed the implement-task-level-retries branch from 54fb567 to 6d4a9b3 Compare January 13, 2022 22:29
@arhimondr arhimondr force-pushed the implement-task-level-retries branch from 6d4a9b3 to 0cf6b73 Compare January 14, 2022 18:34
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOG

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it is inconsistent across the code base. But there are 260 matches when I search for Logger log and only 33 when I search for Logger LOG, so it looks like Logger log is preferred.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm aware of the inconsistency. But it should be LOG per convention since it's a static final variable. We should not contribute to the unconventional usage.

Copy link
Copy Markdown
Contributor Author

@arhimondr arhimondr Jan 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, code style convention says The names of variables declared class constants and of ANSI constants should be all uppercase with words separated by underscores ("_").: https://www.oracle.com/java/technologies/javase/codeconventions-namingconventions.html

However whether a Logger can be considered a constant is controversial. Generally constants are expected to be inherently immutable objects. Declaring something as private static final does not make something a constant. The declaration may also be used for static fields assigned once and accessible only from within a class.

For example codestyle document from Google says:

Constant names use CONSTANT_CASE: all uppercase letters, with each word separated from the next by a single underscore. But what is a constant, exactly?
Constants are static final fields whose contents are deeply immutable and whose methods have no detectable side effects. This includes primitives, Strings, immutable types, and immutable collections of immutable types. If any of the instance's observable state can change, it is not a constant. Merely intending to never mutate the object is not enough.

And they explicitly mention examples that shouldn't be considered constants and declaring a logger is one of them:

static final Logger logger = Logger.getLogger(MyClass.getName());

https://google.github.io/styleguide/javaguide.html#s5.2.4-constant-names

I don't have a particularly strong opinion here. It feels either should do as long as it is used consistently. In the current state it feels like Logger log will be more consistent with the other places in the code base.

If you feel particularly strong I can do the rename. Regardless we should probably follow up with a PR that would rename all other places to make it consistent and add a style check rule.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does "the implementation" know if an attempt succeeded or failed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified

Comment on lines 25 to 32
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of these? Every object implements these methods, so they don't impose any requirement or constraint on implementations.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it is impossible to enforce presence of these methods compile time. The idea behind leaving them explicitly declared in this interface is to make it more difficult to miss for somebody who is going to be implementing this interface. Though yeah, it doesn't provide any strong guarantees.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be called just exchangeClientSupplier.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we don't have anything named "client" for external exchange IMO it still improves readability as it emphasizes that this supplies a client specifically for direct exchange.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm aware of the inconsistency. But it should be LOG per convention since it's a static final variable. We should not contribute to the unconventional usage.

@arhimondr arhimondr force-pushed the implement-task-level-retries branch 3 times, most recently from 3b607d7 to 15ce61f Compare January 19, 2022 18:43
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be more idiomatic for that purpose:

bucketToNode.stream()
    .distinct()
    .count();

Comment on lines +350 to +364
cancelRunningTasks(abort);
cancelBlockedFuture();
releaseAcquiredNode();
closeTaskSource();
closeSinkExchange();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if a failure in closing would affect the query results (e.g., incomplete results), otherwise, we should just log an error and ignore.

Streaming upload to S3 allocates a 16MB buffer (by default) for each
output stream. Failure recovery tests create a table partitioned into
~60 partitions. Since for each partition at least one file
must be created the engine has to allocate ~1GB of buffer space. These
buffer allocations push the memory reservation beyond the maximum heap
size.
To avoid a clash when both testTargetMaxFileSizePartitioned and
testTargetMaxFileSize are executed concurrently
@arhimondr arhimondr force-pushed the implement-task-level-retries branch from 15ce61f to 991145a Compare January 20, 2022 21:46
@martint martint merged commit f81af8f into trinodb:master Jan 21, 2022
@github-actions github-actions bot added this to the 369 milestone Jan 21, 2022
This was referenced Jan 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants