
Commit 68643c5

dominicshanshan authored and mikeiovine committed
[https://nvbugs/5648685][fix] Fix openAI server waiting time to avoid large model weight loading out time (NVIDIA#9254)
Signed-off-by: Wangshanshan <[email protected]>
1 parent cd20c50 commit 68643c5

File tree: 1 file changed (+1, -1)


tests/unittest/llmapi/apps/openai_server.py (1 addition, 1 deletion)

@@ -16,7 +16,7 @@

 class RemoteOpenAIServer:
     DUMMY_API_KEY = "tensorrt_llm"
-    MAX_SERVER_START_WAIT_S = 600  # wait for server to start for 600 seconds
+    MAX_SERVER_START_WAIT_S = 7200  # wait for server to start for 7200 seconds (~ 2 hours) for LLM models weight loading

     def __init__(self,
                  model: str,
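The constant above typically bounds a readiness-polling loop: the test harness repeatedly probes the server until it responds or the deadline passes, and the commit raises that deadline from 10 minutes to 2 hours so that large-model weight loading does not trip a spurious timeout. A minimal sketch of such a wait loop (hypothetical — the actual `RemoteOpenAIServer` polling code is not shown in this diff, and `is_ready` is an assumed caller-supplied probe):

```python
import time

MAX_SERVER_START_WAIT_S = 7200  # generous bound to cover large model weight loading


def wait_for_server(is_ready, poll_interval_s: float = 1.0,
                    timeout_s: float = MAX_SERVER_START_WAIT_S) -> None:
    """Poll `is_ready()` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_ready():
            return  # server is up
        time.sleep(poll_interval_s)
    raise TimeoutError(f"server did not start within {timeout_s} s")


# Usage: a trivially ready probe returns immediately.
wait_for_server(lambda: True)
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline immune to wall-clock adjustments during a long wait.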

0 commit comments