
Use parameter based num as inference request max output length #10

Merged
1 commit merged into AI-Hypercomputer:main on Mar 14, 2024

Conversation

FanhaiLu1 (Collaborator)

This feature adds a max_output_length arg and uses it as the inference request max output length. The default value is 1024, which is the same as MLPerf.

Before this PR, we used the dataset target output as the inference request max output length. That's not right for the reasons below:

  • In a real inference use case, users don't know the max output length in advance.
  • It doesn't match MLPerf, which uses 1024 as the max output length.

In the long term, we may refactor the benchmark to make sure it's generic and aligned with industry settings.
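As a rough illustration of the change (not the actual benchmark code; the flag name, `InferenceRequest`, and `build_requests` are hypothetical), a parameter-based max output length could be wired in like this:

```python
# Minimal sketch: expose --max-output-length (default 1024, matching MLPerf)
# and apply it to every inference request instead of the dataset's
# per-sample target output length.
import argparse
from dataclasses import dataclass


@dataclass
class InferenceRequest:
    prompt: str
    max_output_length: int  # cap on generated tokens for this request


def parse_args():
    parser = argparse.ArgumentParser(description="Benchmark runner")
    parser.add_argument(
        "--max-output-length",
        type=int,
        default=1024,  # MLPerf-style default described in the PR
        help="Max output length used for every inference request.",
    )
    return parser.parse_args()


def build_requests(prompts, max_output_length):
    # Before this PR, the dataset's target output length was used here;
    # now one parameter-based value applies to all requests.
    return [InferenceRequest(p, max_output_length) for p in prompts]


if __name__ == "__main__":
    args = parse_args()
    requests = build_requests(["example prompt"], args.max_output_length)
    print(requests)
```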

FanhaiLu1 merged commit 8289e65 into AI-Hypercomputer:main on Mar 14, 2024
3 checks passed
FanhaiLu1 deleted the benchmark branch on March 14, 2024 at 22:29