Add memory limit check for HashBuilderOperator's spilling #17905
Merged
highker merged 1 commit into prestodb:master on Jul 5, 2022
pgupta2
reviewed
Jun 20, 2022
presto-main/src/main/java/com/facebook/presto/operator/HashBuilderOperator.java
highker
reviewed
Jun 21, 2022
presto-main/src/main/java/com/facebook/presto/operator/HashBuilderOperator.java
presto-main/src/main/java/com/facebook/presto/operator/HashBuilderOperator.java
presto-main/src/test/java/com/facebook/presto/operator/TestHashJoinOperator.java
presto-main/src/main/java/com/facebook/presto/operator/HashBuilderOperator.java
I can merge the PR once @pgupta2 feels it's in good shape.
3dddc2e to 9f11460
kewang1024
reviewed
Jun 24, 2022
Collaborator
According to the release guidance, can we make the release note more user-facing (probably also change the commit title and PR title)
"Add memory limit check for HashBuilderOperator's spilling and fail fast if exceeding to avoid unnecessary processing"
presto-main/src/main/java/com/facebook/presto/operator/HashBuilderOperator.java
9f11460 to 51db7f1
Contributor
Author
Thanks for the review. Updated the wording following your suggestion.
pgupta2
approved these changes
Jun 27, 2022
When the HashBuilderOperator spills more data than allowed memory it fails fast to avoid unnecessary processing
51db7f1 to 4c5173a
Bug description
Currently, `HashBuilderOperator` spills data regardless of memory limits until all pages are processed. When the operator spills more data than the allowed memory limit, it fails during unspilling because the spilled data cannot fit into memory. This leads to unnecessary data processing and spilling even though the task is bound to fail eventually.
Example stack trace from a task that spilled 11 GB of data with a 5 GB memory limit.
Fix
The fix tracks the future memory footprint of spilled data and fails immediately when the spilled data exceeds `max-memory-per-node`. This saves a significant amount of resources that would otherwise be spent processing and spilling pages, only to fail at the end.
Introducing a new error message for that case.
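To make the fail-fast idea concrete, here is a minimal sketch of the bookkeeping described above. It is not the code from this PR: the class `SpillBudgetTracker`, its methods, the exception type, and the message text are all hypothetical, and the limit is passed in as a plain byte count standing in for `max-memory-per-node`.

```java
// Hypothetical illustration only: none of these names come from the Presto codebase.
final class SpillBudgetTracker
{
    // Stand-in for the configured per-node memory limit, in bytes.
    private final long maxMemoryBytes;
    // Running total of bytes spilled so far, i.e. what unspilling would need in memory.
    private long spilledBytes;

    SpillBudgetTracker(long maxMemoryBytes)
    {
        if (maxMemoryBytes < 0) {
            throw new IllegalArgumentException("maxMemoryBytes is negative");
        }
        this.maxMemoryBytes = maxMemoryBytes;
    }

    // Call before spilling a batch of pages. Throws as soon as the running
    // total would exceed the limit, instead of failing later during unspilling.
    void recordSpill(long bytes)
    {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes is negative");
        }
        long updated = Math.addExact(spilledBytes, bytes);
        if (updated > maxMemoryBytes) {
            throw new IllegalStateException(String.format(
                    "Spilled data (%d bytes) exceeds memory limit (%d bytes)",
                    updated, maxMemoryBytes));
        }
        spilledBytes = updated;
    }

    long getSpilledBytes()
    {
        return spilledBytes;
    }
}
```

With a 5 GB limit, recording two 3 GB spills would throw on the second call, so the operator stops at that point rather than continuing to spill data it can never unspill.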
Test plan
Notes