Conversation
Some queries use too many drivers for their leaf splits, which causes issues like ExceededMemoryLimit. This needs to be controllable.
If query-max-memory-per-node already reaches the node's memory cap, it cannot be increased further.
Is the minimum value 1? When I reduced that number below 4, I noticed that many deadlocks began to occur.
If you still have driver slots left, there can always be another query that pushes memory utilization over the node's total memory. Why not reduce
We can confine certain heavy queries to lower concurrency. That way, other light queries can run with better performance and stability.
What prevents a user from issuing two heavy queries that together push the node beyond its memory limit?
Some heavy queries can exceed the memory limit due to memory accumulated in ScanFilterAndProjectOperator. That can be mitigated by confining maxDriversPerQuery, particularly when the filter's selectivity is high, so the output buffer fills up slowly.
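The scaling argument above can be sketched with a back-of-envelope calculation. All numbers here are invented for illustration (they are not from the PR): if each leaf driver's ScanFilterAndProjectOperator buffers pages while a selective filter drains the output buffer slowly, peak scan memory grows roughly linearly with the number of concurrent drivers.

```java
// Illustrative sketch with assumed numbers: peak buffered scan memory
// scales with the number of concurrent leaf drivers, so capping
// drivers per query caps that memory.
public class DriverMemoryExample {
    // Peak buffered bytes if every driver accumulates perDriverBytes.
    static long peakBytes(int drivers, long perDriverBytes) {
        return (long) drivers * perDriverBytes;
    }

    public static void main(String[] args) {
        long perDriverBytes = 64L << 20;  // assume ~64 MB buffered per driver
        long nodeMemoryLimit = 2L << 30;  // assume a 2 GB per-query node limit

        for (int drivers : new int[] {8, 16, 32, 64}) {
            long peak = peakBytes(drivers, perDriverBytes);
            System.out.println(drivers + " drivers -> ~" + (peak >> 20) + " MB peak"
                    + (peak > nodeMemoryLimit ? " (exceeds limit)" : ""));
        }
    }
}
```

Under these assumed numbers, a query with 64 leaf drivers would exceed the 2 GB limit, while capping it at 16 drivers keeps it well under.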
Yes. However, as you mentioned, what prevents two such heavy queries from running on the same node and exceeding the node's memory cap (one of the queries will fail anyway)?
public static int getMaxDriversPerQuery(Session session)
You should only be able to reduce max_drivers_per_query, not increase it beyond the value in FeatureConfig.
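The reduce-only rule the reviewer asks for could look like the sketch below. This is a hypothetical stand-in, not the actual Presto code: the class name, the config constant, and the nullable-Integer session override are all placeholders for illustration.

```java
// Hypothetical sketch of the reviewer's rule: a session-level
// max_drivers_per_query may only reduce the FeatureConfig default,
// never raise it above that value.
public class MaxDriversExample {
    // Stand-in for the value configured in FeatureConfig.
    static final int CONFIG_MAX_DRIVERS_PER_QUERY = 16;

    // Resolve the effective limit from an optional session override.
    static int getMaxDriversPerQuery(Integer sessionValue) {
        if (sessionValue == null) {
            return CONFIG_MAX_DRIVERS_PER_QUERY;  // no override: use the config default
        }
        if (sessionValue < 1) {
            throw new IllegalArgumentException("max_drivers_per_query must be at least 1");
        }
        // Clamp: reductions pass through, increases fall back to the config cap.
        return Math.min(sessionValue, CONFIG_MAX_DRIVERS_PER_QUERY);
    }

    public static void main(String[] args) {
        System.out.println(getMaxDriversPerQuery(null)); // 16 (config default)
        System.out.println(getMaxDriversPerQuery(4));    // 4  (reduction allowed)
        System.out.println(getMaxDriversPerQuery(64));   // 16 (increase clamped)
    }
}
```

An alternative design would be to reject an increase with an error instead of silently clamping; either way, the session value can never exceed the config.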
👋 @JunhyungSong - this PR is inactive and doesn't seem to be under development. If you'd like to continue work on this at any point in the future, feel free to re-open.