Add listener model for TaskThresholdMemoryRevokingScheduler #15505
Conversation
Not too sure what we think of this way of alerting the TaskThresholdMemoryRevokingScheduler. It's not ideal, but it is the cleanest approach I could think of; the alternative is to pass the listener around somehow instead of piggybacking on MemoryPool. What I don't like about this is that MemoryPool now holds two disparate sets of listeners, and technically this is a task/operator-context allocation.
I also thought about adding the listener to SqlTaskManager, but that would require a fairly large refactor since OperatorContext has no access to SqlTask/SqlTaskExecution/SqlTaskManager, so I didn't end up taking that route.
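To make the trade-off concrete, here is a minimal sketch of the "piggyback on MemoryPool" approach being discussed. The names and signatures are illustrative, not the PR's actual code; the point is that MemoryPool ends up carrying a second, task-level listener set alongside its existing pool-level listeners.

```java
import java.util.ArrayList;
import java.util.List;

public class MemoryPoolListenerSketch {
    // Hypothetical listener invoked when a task's revocable memory reservation changes.
    interface TaskRevocableMemoryListener {
        void onMemoryReserved(String taskId, long revocableBytes);
    }

    // Simplified MemoryPool carrying the extra set of task-level listeners,
    // which is the design concern raised in the comment above.
    static class MemoryPool {
        private final List<TaskRevocableMemoryListener> taskListeners = new ArrayList<>();

        void addTaskRevocableMemoryListener(TaskRevocableMemoryListener listener) {
            taskListeners.add(listener);
        }

        void reserveRevocable(String taskId, long bytes) {
            // Notify synchronously on allocation instead of waiting for a polling pass.
            for (TaskRevocableMemoryListener listener : taskListeners) {
                listener.onMemoryReserved(taskId, bytes);
            }
        }
    }

    static long lastReportedBytes = -1;

    public static void main(String[] args) {
        MemoryPool pool = new MemoryPool();
        pool.addTaskRevocableMemoryListener((taskId, bytes) -> lastReportedBytes = bytes);
        pool.reserveRevocable("task-1", 1024);
        System.out.println(lastReportedBytes); // prints 1024
    }
}
```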
highker
left a comment
Maybe add a simple application of this use case so we can tell whether the design makes sense. So far I think it's mostly legit, but I would love to see it running with some operators.
Okay cool, I will go ahead and add tests that demonstrate this then.
Why not just keep SqlTaskManager here directly?
Testing: without parameterizing it, it is difficult to make this part of the code modular and testable.
See the @VisibleForTesting constructor for reference.
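The @VisibleForTesting-constructor pattern suggested here might look like the sketch below: the production constructor keeps the real dependency, while a package-private constructor takes the task source as a parameter so tests can inject a stub. The class and supplier names are hypothetical, not taken from the PR.

```java
import java.util.List;
import java.util.function.Supplier;

public class SchedulerSketch {
    // Stand-in for something like SqlTaskManager::getAllTasks in real code.
    private final Supplier<List<String>> currentTasksSupplier;

    // Production entry point: keeps the real task source directly.
    public SchedulerSketch() {
        this(() -> List.of()); // would delegate to the real SqlTaskManager here
    }

    // @VisibleForTesting (Guava annotation omitted to keep this self-contained):
    // tests use this constructor to supply a stubbed task list.
    SchedulerSketch(Supplier<List<String>> currentTasksSupplier) {
        this.currentTasksSupplier = currentTasksSupplier;
    }

    public int taskCount() {
        return currentTasksSupplier.get().size();
    }

    public static void main(String[] args) {
        // A test injects a fake task source without needing a real task manager.
        SchedulerSketch scheduler = new SchedulerSketch(() -> List.of("task-a", "task-b"));
        System.out.println(scheduler.taskCount()); // prints 2
    }
}
```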
presto-main/src/main/java/com/facebook/presto/execution/MemoryRevokingScheduler.java
presto-main/src/main/java/com/facebook/presto/memory/TaskRevocableMemoryListener.java
Related pending changes: #15569. Be careful about changing memory-revoking behavior in light of the fact that memory revoking has mostly been disabled by accident since its inception.
I saw your PR earlier, @pettyjamesm. Can you explain further what you mean by "mostly been disabled"? I figured your PR mostly removes the overcounting caused by a query repeatedly counting memory from all of its other tasks. Wouldn't that mean revoking is triggered more frequently, not "disabled"?
The trigger to attempt revoking has been hyperactive, but actual revoking has been underactive because of the overcounting. Very little revoking has ever occurred in typical production environments because it was effectively short-circuited by the overcounting and the early-termination logic.
I see what is going on. For each task we first get its root QueryContext and visit its children, so every task belonging to the QueryContext visits and triggers revoking on the same tasks each time. This explains why things work mostly fine in prod/testing when only aggregation spilling is enabled, but it is definitely a problem with join spilling especially. It is worth testing without this PR to see whether there is still any merit to the listener model after incorporating #15569.
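The over-triggering described above can be illustrated with a toy calculation (not Presto code): because each task's check walks all children of the shared QueryContext, n tasks in a query produce n * n revoke attempts per scheduling pass instead of n.

```java
public class OvercountSketch {
    // Counts revoke attempts under the visiting pattern described in the
    // comment above: every task re-visits every sibling of its QueryContext.
    static int revokeAttempts(int tasksInQuery) {
        int attempts = 0;
        for (int task = 0; task < tasksInQuery; task++) {
            // Each task's check visits ALL children of the shared QueryContext,
            // including tasks that were already visited by its siblings.
            for (int sibling = 0; sibling < tasksInQuery; sibling++) {
                attempts++;
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // 4 tasks sharing one QueryContext yield 16 attempts, not 4.
        System.out.println(revokeAttempts(4)); // prints 16
    }
}
```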
@pettyjamesm, is spilling working well at Amazon? For FB, this will be the first time we turn on spilling in prod. Though we haven't yet, there are quite a few bugs we are fixing.
@highker The biggest problem we've had with spilling has always been the limits on when spilling can occur. In Athena v1, the scenarios that could spill were extremely narrow, so spilling to disk was pretty rare. In v2 the potential spilling opportunities are broader, but still somewhat limited in the version of Presto it's based on. I can't personally recall spill-to-disk ever causing a query failure, but if there were subtle behavioral issues they may have simply gone unnoticed. What bugs have you run into?
@pettyjamesm Many of the bugs we have fixed had to do primarily with memory reporting: some operators under-reported the memory they used, so spilling was not triggered when it should have been. Most of the spilling bugs we have fixed are unnoticeable in environments where Presto has a sufficiently large heap. They are much more serious in low-memory use cases, though, since poor spilling can easily lead to worker heap OOMs.
Test plan: deployed to a cluster, Travis, wrote tests.
Previously, TaskThresholdMemoryRevokingScheduler had only a polling-based model for detecting when a task allocates too much memory. This change introduces a listener-based model instead, which ensures we do not have to wait for the polling thread to be scheduled; revoking is scheduled immediately.
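The commit description above can be sketched as follows. This is an illustrative contrast of the two trigger models, not Presto's actual code: under polling, a revoke decision waits for the next scheduled check, while the listener path reacts synchronously as soon as an allocation pushes the task over its threshold. The threshold value and method names are hypothetical.

```java
public class RevokeTriggerSketch {
    // Hypothetical per-task revocable-memory threshold.
    static final long TASK_THRESHOLD_BYTES = 1_000;

    static boolean revokeRequested = false;

    // Listener path: invoked synchronously on each allocation, so the
    // revoke request is scheduled immediately instead of waiting for a
    // polling thread to wake up and re-scan all tasks.
    static void onTaskMemoryReserved(long totalTaskBytes) {
        if (totalTaskBytes > TASK_THRESHOLD_BYTES) {
            revokeRequested = true; // immediately schedule revoking
        }
    }

    public static void main(String[] args) {
        onTaskMemoryReserved(500);   // below threshold: no revoke scheduled
        System.out.println(revokeRequested); // prints false
        onTaskMemoryReserved(1_500); // crosses threshold: revoke scheduled at once
        System.out.println(revokeRequested); // prints true
    }
}
```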