Ensure server warmup before benchmark #91

Merged: 2 commits, May 28, 2024

JoeZijunZhou
Collaborator

No description provided.

@JoeZijunZhou JoeZijunZhou marked this pull request as ready for review May 28, 2024 20:36
@hosseinsarshar

I think 10 seconds is too long; 2-5 seconds worked perfectly for me.

@liurupeng

After Vivian added the AOT support, could we use that to identify whether the replica has warmed up?

Collaborator

@FanhaiLu1 FanhaiLu1 left a comment

Instead of sleeping for x seconds, can you wait until all the warmup requests have returned all their tokens?

@JoeZijunZhou
Collaborator Author

After Vivian added the AOT support, could we use that to identify whether the replica has warmed up?

Yes, that would be the ideal signal to resolve this issue.

@JoeZijunZhou
Collaborator Author

Instead of sleeping for x seconds, can you wait until all the warmup requests have returned all their tokens?

There is a case where the warmup requests finish before the server warmup is complete. Vivian is working on getting a server-warmup-complete signal from the engine. This is a temporary workaround.
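
For illustration only, a minimal sketch of the two approaches discussed here, combined: wait for every warmup request to return, then pause for a fixed grace period in case the server is still warming up. The names `send_warmup_request`, `warm_up`, and `grace_seconds` are hypothetical and are not taken from this PR's diff; the request function is a stub standing in for a real call to the server.

```python
import asyncio


async def send_warmup_request(prompt: str) -> list[str]:
    """Hypothetical stub for a real request; returns the response tokens."""
    await asyncio.sleep(0.01)  # simulate request latency
    return [f"token-for-{prompt}"]


async def warm_up(prompts: list[str], grace_seconds: float) -> int:
    """Send all warmup requests, wait for every response, then pause."""
    # Wait until every warmup request has returned its tokens.
    responses = await asyncio.gather(
        *(send_warmup_request(p) for p in prompts)
    )
    # Assumed workaround: the requests may finish before the server itself
    # is warm, so sleep for a fixed period before starting the benchmark.
    await asyncio.sleep(grace_seconds)
    return len(responses)
```

Once the engine exposes a warmup-complete signal, the fixed sleep could presumably be replaced by a wait on that signal.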

@JoeZijunZhou JoeZijunZhou merged commit a223df9 into main May 28, 2024
3 checks passed
@JoeZijunZhou JoeZijunZhou deleted the zijun/fix-warmup branch May 28, 2024 22:58
4 participants