Fix memory accounting leak in aggregation #12333
Merged
arhimondr merged 6 commits into trinodb:master on May 13, 2022
linzebing
approved these changes
May 11, 2022
Member
Failure may be related
21ae538 to
479c5a1
Contributor
Author
losipiuk
reviewed
May 13, 2022
Member
nit: This is always called with 1; maybe replace with processOnce()?
Contributor
Author
I was thinking about that. It feels like processOnce is less intuitive. Though I don't have a strong opinion here.
losipiuk
approved these changes
May 13, 2022
Nested operators are used to compute column level statistics on write
An interrupt might occur when isFinishedInternal is executed. If it occurs at that step it should still be post-processed correctly.
Some tests are written to fail with an external failure, which causes tasks to be retried several times.
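The commit note about nested operators can be illustrated with a small sketch. It assumes a parent context that tracks the contexts it hands out to nested operators and destroys them together with itself. SketchOperatorContext, createNested, and getTotalReservedBytes are hypothetical names made up for this illustration; they are not Trino's OperatorContext API, and the actual fix may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-operator context; not Trino's OperatorContext.
final class SketchOperatorContext
{
    private final String name;
    private final List<SketchOperatorContext> nestedContexts = new ArrayList<>();
    private long reservedBytes;
    private boolean destroyed;

    SketchOperatorContext(String name)
    {
        this.name = name;
    }

    // Registers a context for a nested operator so the parent can destroy it later.
    SketchOperatorContext createNested(String nestedName)
    {
        checkNotDestroyed();
        SketchOperatorContext nested = new SketchOperatorContext(name + "/" + nestedName);
        nestedContexts.add(nested);
        return nested;
    }

    // Accounts for additional memory; rejected once the context is destroyed.
    void reserve(long bytes)
    {
        checkNotDestroyed();
        reservedBytes += bytes;
    }

    // Idempotent: releases this context's memory and destroys every nested context.
    void destroy()
    {
        if (destroyed) {
            return;
        }
        destroyed = true;
        reservedBytes = 0;
        for (SketchOperatorContext nested : nestedContexts) {
            nested.destroy();
        }
    }

    boolean isDestroyed()
    {
        return destroyed;
    }

    // Sum of memory held by this context and all nested contexts.
    long getTotalReservedBytes()
    {
        long total = reservedBytes;
        for (SketchOperatorContext nested : nestedContexts) {
            total += nested.getTotalReservedBytes();
        }
        return total;
    }

    private void checkNotDestroyed()
    {
        if (destroyed) {
            throw new IllegalStateException("operator context " + name + " is already destroyed");
        }
    }
}
```

In this sketch, destroying the writer's context also destroys the statistics context created from it, so a late update from the nested operator fails fast instead of leaving bytes accounted for.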
479c5a1 to
e2c54df
Contributor
Author
CI: #12385
Description
All the memory pool related failures from #11275 that I checked are INSERT queries with a memory leak reported in either an AggregationOperator or a HashAggregationOperator.
I managed to reproduce it locally only once, by running the INSERT INTO invalid_partition_value VALUES (4, 'test' || chr(13)) query in a loop and checking whether the cluster memory pools are free upon completion.
The issue is incredibly difficult to reproduce, as two things must go wrong:
- A TableWriterOperator instance must happen to get used after close (fixed by "Ensure no methods are called on operator after close")
- The TableWriterOperator must trigger an internal aggregation state update that causes its memory context to get updated
Under normal circumstances the OperatorContext is destroyed during operator close, which prevents any further memory allocations for that operator. However, the OperatorContext for a nested operator within the TableWriterOperator wasn't being properly closed (fixed by "Destroy OperatorContext for nested operators").
I also found that memory can still be reserved after a memory context is closed (by using tryReserve). Fixed by "Ensure no memory can be allocated after memory context close".
Fix
Core
Under rare circumstances DML queries (INSERT, CTAS, etc.) could fail to release reserved memory, which could eventually result in the cluster running out of memory.
Related issues, pull requests, and links
Fixes #11275
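The description's last point, that memory could still be reserved through tryReserve after a memory context was closed, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: SketchMemoryContext, its byte limit, and its method signatures are hypothetical, and throwing on use after close is one possible policy (returning false would be another).

```java
// Hypothetical sketch of a memory context that rejects reservations after close.
// The class name, limit, and method signatures are assumptions for illustration.
final class SketchMemoryContext
{
    private final long limitBytes;
    private long reservedBytes;
    private boolean closed;

    SketchMemoryContext(long limitBytes)
    {
        this.limitBytes = limitBytes;
    }

    // Strict path: throws when the limit would be exceeded.
    synchronized void reserve(long bytes)
    {
        if (!tryReserve(bytes)) {
            throw new IllegalStateException("memory limit of " + limitBytes + " bytes exceeded");
        }
    }

    // Best-effort path: returns false when the limit would be exceeded.
    // The closed check happens here too, so this path cannot bypass close().
    synchronized boolean tryReserve(long bytes)
    {
        checkNotClosed();
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes is negative");
        }
        if (reservedBytes + bytes > limitBytes) {
            return false;
        }
        reservedBytes += bytes;
        return true;
    }

    // Idempotent: frees everything and blocks any further reservation.
    synchronized void close()
    {
        if (closed) {
            return;
        }
        closed = true;
        reservedBytes = 0;
    }

    synchronized long getReservedBytes()
    {
        return reservedBytes;
    }

    private void checkNotClosed()
    {
        if (closed) {
            throw new IllegalStateException("memory context is already closed");
        }
    }
}
```

Routing both reservation paths through the same closed check is the design choice being illustrated: a guard on only the strict path would leave the best-effort path able to account memory that nothing will ever free.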
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
(x) No release notes entries required.
( ) Release notes entries required with the following suggested text: