[core] Limit core worker gRPC reply threads to 2 by default #58771
`src/ray/rpc/server_call.cc`

```diff
@@ -14,6 +14,7 @@
 #include "ray/rpc/server_call.h"
 
+#include <atomic>
 #include <memory>
 
 #include "ray/common/ray_config.h"
@@ -22,9 +23,18 @@ namespace ray {
 namespace rpc {
 namespace {
 
+std::atomic<ServerCallThreadPoolMode> &ThreadPoolMode() {
+  static std::atomic<ServerCallThreadPoolMode> mode(
+      ServerCallThreadPoolMode::SYSTEM_COMPONENT);
+  return mode;
+}
+
 std::unique_ptr<boost::asio::thread_pool> &_GetServerCallExecutor() {
   static auto thread_pool = std::make_unique<boost::asio::thread_pool>(
-      ::RayConfig::instance().num_server_call_thread());
+      ThreadPoolMode().load(std::memory_order_acquire) ==
+              ServerCallThreadPoolMode::CORE_WORKER
+          ? ::RayConfig::instance().core_worker_num_server_call_thread()
+          : ::RayConfig::instance().num_server_call_thread());
   return thread_pool;
 }
@@ -36,7 +46,14 @@ void DrainServerCallExecutor() { GetServerCallExecutor().join(); }
 
 void ResetServerCallExecutor() {
   _GetServerCallExecutor() = std::make_unique<boost::asio::thread_pool>(
-      ::RayConfig::instance().num_server_call_thread());
+      ThreadPoolMode().load(std::memory_order_acquire) ==
+              ServerCallThreadPoolMode::CORE_WORKER
+          ? ::RayConfig::instance().core_worker_num_server_call_thread()
+          : ::RayConfig::instance().num_server_call_thread());
 }
 
+void SetServerCallThreadPoolMode(ServerCallThreadPoolMode mode) {
+  ThreadPoolMode().store(mode, std::memory_order_release);
+}
+
 }  // namespace rpc
```
`src/ray/rpc/server_call.h`

```diff
@@ -50,6 +50,8 @@ enum class ClusterIdAuthType {
 /// This pool is shared across gRPC servers.
 boost::asio::thread_pool &GetServerCallExecutor();
 
+enum class ServerCallThreadPoolMode { SYSTEM_COMPONENT = 0, CORE_WORKER = 1 };
+
 /// Drain the executor.
 void DrainServerCallExecutor();
@@ -59,6 +61,10 @@ void DrainServerCallExecutor();
 /// because they are global.
 void ResetServerCallExecutor();
 
+/// Set which config this process uses for the global reply thread pool.
+/// Call before the first GetServerCallExecutor().
+void SetServerCallThreadPoolMode(ServerCallThreadPoolMode mode);
```
**Contributor:** Needing to set this at the right time is a little weird. I guess you could avoid it by passing in the type of server from `ServerCallImpl` as an arg or template param, but that would mean a new template param or arg all the way down the stack, and I don't know if we want that just for this.

**Contributor:** If it "helps", we already have this pattern for setting vars for gRPC elsewhere... though that is perhaps a weak justification.

**Member (author):** Thank you! Let me rethink this. The reason I went with this approach is that I saw a similar pattern in the core worker code, so I followed that and added

**Member (author):** OK, I don't really see a better option here besides (1) "set before using" or (2) "pass the parameter all the way down the stack". Given the size of this change and the current setup, we already require `InitializeSystemConfig()` to be called before `GetServerCallExecutor()`, and we also have `service_handler_.WaitUntilInitialized()`, so I think it's quite safe to call. Let me know what you think.

**Contributor:** Ya, that's OK for now; if we use this pattern more, we can think more about it.
```diff
 /// Represents state of a `ServerCall`.
 enum class ServerCallState {
   /// The call is created and waiting for an incoming request.
```
**Member (author):** This change does alter the meaning of `num_server_call_thread`, but I believe that's exactly what we want.
**Contributor:** Let's nix "etc." There is no "etc." after raylet and GCS, and this specifically doesn't affect workers, so let's not make it confusing.
**Member (author):** Sure!