Extend memory exceeded errors with more details #16297
arhimondr merged 1 commit into prestodb:master
4248452 to 12ca84b
presto-benchmark/src/main/java/com/facebook/presto/benchmark/AbstractOperatorBenchmark.java
Maybe rename to EXCEEDED_MEMORY_LIMIT_ERRORS_VERBOSE_LOGGING_ENABLED
EXCEEDED_MEMORY_LIMIT_ERRORS_VERBOSE_LOGGING_ENABLED may sound a little misleading, as it suggests that the logging verbosity is being changed. Setting this property to true won't increase logging verbosity; it only adds extra details to error messages.
Why is this done this way rather than passing it when creating the QueryContext object?
Unfortunately, there's a design flaw in how we create QueryContext. In Presto classic, the QueryContext object is shared between all tasks of the same query and is created when the first task of a query is scheduled on the node. While in practice two tasks of the same query shouldn't have different session properties, the interface allows it. Since the hack ...
We are already using a similar hack for setting memory limits: https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/execution/SqlTaskManager.java#L405
Perhaps we need to refactor it at some point. But that is probably beyond the scope of this PR.
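To make the constraint above concrete, here is a minimal sketch of a per-query context that is created lazily by whichever task arrives first, so a session-derived flag cannot be passed through the constructor in any meaningful way. It assumes the flag is instead applied through a setter each time a task is registered. All names here (`SharedQueryContextSketch`, `contextForTask`, `setVerboseErrors`) are hypothetical and are not taken from the Presto code base.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not the actual Presto classes.
final class SharedQueryContextSketch {
    // Stand-in for the per-query context shared by all tasks of one query.
    static final class QueryContext {
        private volatile boolean verboseErrors;

        void setVerboseErrors(boolean enabled) {
            this.verboseErrors = enabled;
        }

        boolean isVerboseErrors() {
            return verboseErrors;
        }
    }

    private final Map<String, QueryContext> contexts = new ConcurrentHashMap<>();

    // The first task of a query creates the context; later tasks reuse it.
    // The session-derived flag is applied after lookup (assumed behavior),
    // because later tasks never reach the constructor.
    QueryContext contextForTask(String queryId, boolean verboseFromSession) {
        QueryContext context = contexts.computeIfAbsent(queryId, id -> new QueryContext());
        context.setVerboseErrors(verboseFromSession);
        return context;
    }

    public static void main(String[] args) {
        SharedQueryContextSketch manager = new SharedQueryContextSketch();
        QueryContext first = manager.contextForTask("query_1", true);
        QueryContext second = manager.contextForTask("query_1", true);
        System.out.println(first == second);
        System.out.println(first.isVerboseErrors());
    }
}
```

In this sketch the last task to register wins if two tasks disagree, which is exactly the ambiguity the interface permits.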
presto-main/src/main/java/com/facebook/presto/memory/QueryContext.java
presto-main/src/main/java/com/facebook/presto/operator/TaskContext.java
12ca84b to d9f18c3
presto-main/src/main/java/com/facebook/presto/operator/TaskContext.java
presto-main/src/test/java/com/facebook/presto/operator/GroupByHashYieldAssertion.java
This is needed to simplify debugging of out-of-memory errors. The verbose
details are completely optional and disabled by default to avoid issues
related to displaying and storing a verbose error message.
The details contain the memory reservation for each operator. Having these
details allows us to more easily identify the root cause of a
memory-related failure, as they contain the following:
- Task id
- Memory allocation breakdown by operator instances
- Plan Node Id
- Additional details, such as type of join or whether the distinct
accumulator is used in hash aggregation
Here is an example of an error message with details:
```
Query exceeded per-node total memory limit of 50MB [Allocated: 49.41MB, Delta: 4.12MB, Top Consumers: {HashAggregationOperator=30.85MB, InMemoryHashAggregationBuilder=18.56MB, ScanFilterAndProjectOperator=468B}, Details: [ {
"taskId" : "0.0.0",
"reservation" : "49.41MB",
"topConsumers" : [ {
"type" : "HashAggregationOperator",
"planNodeId" : "3",
"reservations" : [ "9.78MB", "7.79MB", "7.47MB", "2.93MB", "1.81MB", "1.08MB", "0B", "0B", "0B", "0B", "0B", "0B", "0B", "0B", "0B", "0B" ],
"total" : "30.85MB"
}, {
"type" : "HashAggregationOperator",
"planNodeId" : "138",
"reservations" : [ "18.56MB" ],
"total" : "18.56MB"
}, {
"type" : "ScanFilterAndProjectOperator",
"planNodeId" : "153",
"reservations" : [ "468B" ],
"total" : "468B"
} ]
} ]]
```
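The shape of the `topConsumers` entries in the example above can be reproduced with a small grouping step. The following is a rough sketch, assuming that reservations are grouped by operator type and plan node id, that each group lists its per-instance reservations in descending order, and that groups are ordered by total. The class and method names are hypothetical, and the byte values are plain longs rather than formatted sizes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the actual Presto implementation.
final class MemoryBreakdownSketch {
    // Memory reserved by a single operator instance, in bytes.
    static final class Reservation {
        final String operatorType;
        final String planNodeId;
        final long bytes;

        Reservation(String operatorType, String planNodeId, long bytes) {
            this.operatorType = operatorType;
            this.planNodeId = planNodeId;
            this.bytes = bytes;
        }
    }

    // All instances of one operator type under one plan node.
    static final class Consumer {
        final String type;
        final String planNodeId;
        final List<Long> reservations = new ArrayList<>();
        long total;

        Consumer(String type, String planNodeId) {
            this.type = type;
            this.planNodeId = planNodeId;
        }
    }

    // Groups reservations by (type, planNodeId) and sorts by total, largest first.
    static List<Consumer> topConsumers(List<Reservation> input) {
        Map<String, Consumer> grouped = new LinkedHashMap<>();
        for (Reservation r : input) {
            String key = r.operatorType + "#" + r.planNodeId;
            Consumer consumer = grouped.computeIfAbsent(
                    key, k -> new Consumer(r.operatorType, r.planNodeId));
            consumer.reservations.add(r.bytes);
            consumer.total += r.bytes;
        }
        List<Consumer> result = new ArrayList<>(grouped.values());
        for (Consumer consumer : result) {
            consumer.reservations.sort(Comparator.reverseOrder());
        }
        result.sort(Comparator.comparingLong((Consumer c) -> c.total).reversed());
        return result;
    }

    public static void main(String[] args) {
        List<Reservation> input = new ArrayList<>();
        input.add(new Reservation("HashAggregationOperator", "3", 100));
        input.add(new Reservation("ScanFilterAndProjectOperator", "153", 5));
        input.add(new Reservation("HashAggregationOperator", "3", 300));
        input.add(new Reservation("HashAggregationOperator", "138", 250));
        for (Consumer c : topConsumers(input)) {
            System.out.println(c.type + " planNodeId=" + c.planNodeId
                    + " reservations=" + c.reservations + " total=" + c.total);
        }
    }
}
```

Grouping by plan node as well as by type keeps two uses of the same operator in different parts of the plan separate, as with plan nodes 3 and 138 in the example message.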
d9f18c3 to 5786c94