Adding support for bf16_full_eval #610
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,15 +15,15 @@ | |
| MODELS_TO_TEST = { | ||
| "bf16": [ | ||
| ("facebook/bart-large-cnn", "Habana/bart", 4.691, 26.0688, 2, 1), | ||
| - ("t5-3b", "Habana/t5", 2.28, 21.56, 2, 1), | ||
| + ("t5-3b", "Habana/t5", 2.88, 21.56, 2, 1), | ||
| ], | ||
| } | ||
| else: | ||
| # Gaudi1 CI baselines | ||
| MODELS_TO_TEST = { | ||
| "bf16": [ | ||
| ("facebook/bart-large-cnn", "Habana/bart", 2.588, 26.0688, 2, 1), | ||
| - ("t5-3b", "Habana/t5", 0.585, 21.72, 2, 1), | ||
| + ("t5-3b", "Habana/t5", 0.98, 21.56, 2, 1), | ||
|
Comment on lines 17 to +26
Collaborator
For Gaudi1, I get a RougeLsum of 21.3831 and a throughput of 1.005. It doesn't matter much since the test passes (no need to update the numbers).
Author
I didn't get a different RougeLsum. When I added the perf numbers, I ran the test twice to check and got the same RougeLsum both times. Let me check again.
Collaborator
Interesting, did you run it with Synapse 1.13?
Author
I can see the variation now. However, I'm also seeing variation on v1.9-release for the test "test_run_summarization_t5-small_multi_card". Can you confirm whether it's the same on your end?
Collaborator
I cannot run multi-card tests on my Gaudi2 instance at the moment, but if you observed the same behavior for "test_run_summarization_t5-small_multi_card", it means that this "issue" was already there before. |
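For context on why a slightly different measurement can still pass: a minimal sketch of a tolerance-based baseline check, using the Gaudi1 numbers quoted in the thread. The helper name `meets_baseline` and the 0.95 factor are hypothetical, chosen for illustration only; the actual thresholds used by the test suite are not shown in this diff.

```python
def meets_baseline(measured: float, baseline: float, factor: float = 0.95) -> bool:
    """Return True if `measured` is at least `factor` times `baseline`.

    `factor` is an assumed tolerance, not a value taken from the PR.
    """
    return measured >= factor * baseline


# Gaudi1 values from the review thread, checked against the new baselines.
throughput_ok = meets_baseline(measured=1.005, baseline=0.98)
rouge_ok = meets_baseline(measured=21.3831, baseline=21.56)
print(throughput_ok, rouge_ok)
```

Under this assumed tolerance, both reported values clear the baseline, which is consistent with the remark that the numbers do not need updating.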
||
| ], | ||
| } | ||
|
|
||
|
|
@@ -76,6 +76,8 @@ def _test_text_summarization( | |
|
|
||
| if not deepspeed: | ||
| command.append("--bf16") | ||
| + if model_name == "t5-3b": | ||
| + command.append("--bf16_full_eval") | ||
|
|
||
| with TemporaryDirectory() as tmp_dir: | ||
| command.append(f"--output_dir {tmp_dir}") | ||
|
|
||
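The second hunk can be read as the following flag-selection logic. This is a sketch rather than the test's actual code: the function name `build_precision_flags` is hypothetical, and because indentation is lost in the flattened diff, nesting the `t5-3b` check under `if not deepspeed` is an assumption.

```python
from typing import List


def build_precision_flags(model_name: str, deepspeed: bool) -> List[str]:
    """Collect the precision-related CLI flags for one test run."""
    flags: List[str] = []
    if not deepspeed:
        flags.append("--bf16")
        # Assumed nesting: full bf16 evaluation is requested only for t5-3b.
        if model_name == "t5-3b":
            flags.append("--bf16_full_eval")
    return flags


print(build_precision_flags("t5-3b", deepspeed=False))
print(build_precision_flags("facebook/bart-large-cnn", deepspeed=False))
```

Keeping the model-specific flag in one small branch means the baselines for other models, such as `facebook/bart-large-cnn`, are unaffected by the change.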