Skip to content

Add memory limit check for HashBuilderOperator's spilling#17905

Merged
highker merged 1 commit intoprestodb:masterfrom
vanekjar:hash-operator-spill
Jul 5, 2022
Merged

Add memory limit check for HashBuilderOperator's spilling#17905
highker merged 1 commit intoprestodb:masterfrom
vanekjar:hash-operator-spill

Conversation

@vanekjar
Copy link
Copy Markdown
Contributor

@vanekjar vanekjar commented Jun 18, 2022

Bug description

Currently, HashBuilderOperator spills data regardless of memory limits until all pages are processed. When the operator spills more data than allowed memory limit it fails during unspilling because spilled data cannot fit into memory. It leads to unecessary data processing and spilling even though the task is bound to fail eventually.

Example stacktrace from task that spilled 11 GB data with 5 GB memory limit.

com.facebook.presto.ExceededMemoryLimitException: Query exceeded per-node user memory limit of 5GB [Allocated: 22kB, Delta: 11.02GB, Top Consumers: {ScanFilterAndProjectOperator=192.36MB, GenericPartitioningSpiller=188.37MB, HashBuilderOperator=37.87MB}, Details: [{"taskId":"2.0.537","reservation":"226.24MB","topConsumers":[{"type":"LookupJoinOperator","planNodeId":"16","reservations":["74.66MB","50.69MB","32.74MB","30.29MB"],"total":"188.37MB"},{"type":"HashBuilderOperator","planNodeId":"16","reservations":["12.21MB","10.60MB","9.04MB","6.04MB"],"total":"37.87MB","info":"LEFT;PARTITIONED;"}]}]]
	at com.facebook.presto.ExceededMemoryLimitException.exceededLocalUserMemoryLimit(ExceededMemoryLimitException.java:49)
	at com.facebook.presto.memory.QueryContext.enforceUserMemoryLimit(QueryContext.java:496)
	at com.facebook.presto.memory.QueryContext.updateUserMemory(QueryContext.java:192)
	at com.facebook.presto.memory.QueryContext$QueryMemoryReservationHandler.reserveMemory(QueryContext.java:462)
	at com.facebook.presto.memory.context.RootAggregatedMemoryContext.updateBytes(RootAggregatedMemoryContext.java:37)
	at com.facebook.presto.memory.context.ChildAggregatedMemoryContext.updateBytes(ChildAggregatedMemoryContext.java:38)
	at com.facebook.presto.memory.context.ChildAggregatedMemoryContext.updateBytes(ChildAggregatedMemoryContext.java:38)
	at com.facebook.presto.memory.context.ChildAggregatedMemoryContext.updateBytes(ChildAggregatedMemoryContext.java:38)
	at com.facebook.presto.memory.context.ChildAggregatedMemoryContext.updateBytes(ChildAggregatedMemoryContext.java:38)
	at com.facebook.presto.memory.context.SimpleLocalMemoryContext.setBytes(SimpleLocalMemoryContext.java:82)
	at com.facebook.presto.operator.OperatorContext$InternalLocalMemoryContext.setBytes(OperatorContext.java:672)
	at com.facebook.presto.operator.HashBuilderOperator.unspillLookupSourceIfRequested(HashBuilderOperator.java:547)
	at com.facebook.presto.operator.HashBuilderOperator.finish(HashBuilderOperator.java:470)
	at com.facebook.presto.operator.Driver.processInternal(Driver.java:474)
	at com.facebook.presto.operator.Driver.lambda$processFor$9(Driver.java:307)
	at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:728)
	at com.facebook.presto.operator.Driver.processFor(Driver.java:300)

Fix

The fix tracks future memory footprint of spilled data and fails immediately when spilled data exceeds max-memory-per-node. It saves significant amount of resources that would be otherwise used for processing and spilling pages with inevitable failure at the end.

Introducing new error message for that case

PrestoSparkNonRetryableExecutionException: Query exceeded per-node user memory limit of 5GB [Spilled: 5.00GB, Operator: HashBuilderOperator] 

Test plan

  • Unit test
  • Manual end-to-end test

Notes

== RELEASE NOTES ==

General Changes
* Add memory limit check for HashBuilderOperator's spilling and fail fast if exceeding to avoid unnecessary processing.

@vanekjar vanekjar requested a review from a team as a code owner June 18, 2022 04:22
@vanekjar vanekjar requested a review from NikhilCollooru June 18, 2022 04:22
@vanekjar vanekjar requested a review from highker June 20, 2022 20:13
@highker
Copy link
Copy Markdown

highker commented Jun 21, 2022

I can merge the PR once @pgupta2 feels it's in good shape

@vanekjar vanekjar force-pushed the hash-operator-spill branch from 3dddc2e to 9f11460 Compare June 21, 2022 23:06
@highker highker requested a review from pgupta2 June 23, 2022 12:46
Copy link
Copy Markdown
Collaborator

@kewang1024 kewang1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the release guidance, can we make the release note more user-facing (probably also change the commit title and PR title)
"Add memory limit check for HashBuilderOperator's spilling and fail fast if exceeding to avoid unnecessary processing"

@kewang1024 kewang1024 self-requested a review June 24, 2022 18:39
Copy link
Copy Markdown
Collaborator

@kewang1024 kewang1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

@vanekjar vanekjar force-pushed the hash-operator-spill branch from 9f11460 to 51db7f1 Compare June 27, 2022 17:44
@vanekjar vanekjar changed the title HashBuilderOperator obeys memory limit when spilling data Add memory limit check for HashBuilderOperator's spilling Jun 27, 2022
@vanekjar
Copy link
Copy Markdown
Contributor Author

According to the release guidance, can we make the release note more user-facing (probably also change the commit title and PR title) "Add memory limit check for HashBuilderOperator's spilling and fail fast if exceeding to avoid unnecessary processing"

Thanks for the review. Updated the wording following your suggestion.

When the HashBuilderOperator spills more data than allowed memory it
fails fast to avoid unnecessary processing
@vanekjar vanekjar force-pushed the hash-operator-spill branch from 51db7f1 to 4c5173a Compare June 27, 2022 22:49
@highker highker merged commit 04e4900 into prestodb:master Jul 5, 2022
@vanekjar vanekjar deleted the hash-operator-spill branch July 5, 2022 22:09
@highker highker mentioned this pull request Jul 6, 2022
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants