This asynchronous method calls the synchronous code worker.generate_gate(params), which seems to cause service congestion. In this case, await acquire_worker_semaphore() seems to have no effect, because only one model is ever running in the service at a time?
Is this right?
@merrymercy worker.generate_gate(params) is running in the main thread. The program is not blocked by the semaphore; it is blocked because this synchronous call occupies the main thread until it returns. StreamingResponse starts a new thread, so /worker_generate_stream is not blocked.
Example:
First request: POST /worker_generate, with a generation that takes as long as possible to respond.
Second request: POST /worker_get_status. Until the first request completes, the second request is blocked.
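The stall described in this example can be reproduced without the real service. The following is only a minimal sketch using plain asyncio: generate_gate here is a hypothetical stand-in that sleeps for 0.5 s, and api_generate / api_get_status are illustrative coroutines, not the actual FastChat handlers.

```python
import asyncio
import time


def generate_gate(params):
    # Hypothetical stand-in for worker.generate_gate: blocking, synchronous work.
    time.sleep(0.5)
    return {"text": "done", "error_code": 0}


async def api_generate(params):
    # The synchronous call runs directly inside the coroutine, so the event
    # loop cannot schedule anything else until it returns.
    return generate_gate(params)


async def api_get_status(start):
    # Report how long after `start` this handler actually got to run.
    return time.monotonic() - start


async def main():
    start = time.monotonic()
    first = asyncio.create_task(api_generate({}))
    second = asyncio.create_task(api_get_status(start))
    await first
    return await second


status_delay = asyncio.run(main())
print(f"status handler ran after {status_delay:.2f}s")
```

In this sketch the status coroutine only gets to run after the blocking call has finished, which matches the behaviour seen with the two requests.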
Solution:
I don't know much about Python. Would it be an elegant fix to run generate_gate outside the main thread?
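One possible way to do this, as a rough sketch only: hand the blocking call to a worker thread with asyncio.to_thread (Python 3.9+), so the event loop stays free for other requests. The names below are assumptions for illustration: generate_gate is a hypothetical stand-in that sleeps, and a plain asyncio.Semaphore stands in for acquire_worker_semaphore(); this is not the project's actual code.

```python
import asyncio
import time


def generate_gate(params):
    # Hypothetical stand-in for worker.generate_gate: blocking, synchronous work.
    time.sleep(0.5)
    return {"text": "done", "error_code": 0}


async def api_generate(params, semaphore):
    # The semaphore (a simplified stand-in for acquire_worker_semaphore)
    # limits how many generations run at once.
    async with semaphore:
        # Run the blocking call in a worker thread; the event loop keeps
        # serving other coroutines while it waits for the result.
        return await asyncio.to_thread(generate_gate, params)


async def api_get_status(start):
    # Report how long after `start` this handler actually got to run.
    return time.monotonic() - start


async def main():
    semaphore = asyncio.Semaphore(1)
    start = time.monotonic()
    first = asyncio.create_task(api_generate({}, semaphore))
    second = asyncio.create_task(api_get_status(start))
    result = await first
    return result, await second


result, status_delay = asyncio.run(main())
print(f"status handler ran after {status_delay:.3f}s, result={result}")
```

With the call moved off the main thread, the status coroutine runs almost immediately, and the semaphore starts to matter because several generate requests can now be in flight at the same time.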
The code in question appears in both model_worker.py and multi_model_worker.py.