
Use parameter based num as inference request max output length #10

Merged
1 commit merged into AI-Hypercomputer:main on Mar 14, 2024

Conversation

FanhaiLu1 (Collaborator)

This feature adds a max_output_length arg and uses it as the inference request max output length. The default value is 1024, which is the same as MLPerf.

Before this PR, we used the dataset target output as the inference request max output length. That's not right for the reasons below:

  • In a real inference use case, users don't know the max output length in advance.
  • It doesn't match MLPerf, which uses 1024 as the max output length.

In the long term, we may refactor the benchmark to make sure it's generic and aligned with industry settings.
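As a rough illustration of the change (not the actual benchmark code; the flag name, `InferenceRequest`, and `build_requests` are hypothetical), a parameter-based max output length could be wired in like this:

```python
# Minimal sketch: expose --max-output-length (default 1024, matching MLPerf)
# and apply it to every inference request instead of the dataset's
# per-sample target output length.
import argparse
from dataclasses import dataclass


@dataclass
class InferenceRequest:
    prompt: str
    max_output_length: int  # cap on generated tokens for this request


def parse_args():
    parser = argparse.ArgumentParser(description="Benchmark runner")
    parser.add_argument(
        "--max-output-length",
        type=int,
        default=1024,  # MLPerf-style default described in the PR
        help="Max output length used for every inference request.",
    )
    return parser.parse_args()


def build_requests(prompts, max_output_length):
    # Before this PR, the dataset's target output length was used here;
    # now one parameter-based value applies to all requests.
    return [InferenceRequest(p, max_output_length) for p in prompts]


if __name__ == "__main__":
    args = parse_args()
    requests = build_requests(["example prompt"], args.max_output_length)
    print(requests)
```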

FanhaiLu1 merged commit 8289e65 into AI-Hypercomputer:main on Mar 14, 2024
3 checks passed
FanhaiLu1 deleted the benchmark branch on March 14, 2024 at 22:29