Add an option to use fp8-all-gather only without fp8 computation. #1093
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1093
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 5c5f4f2 with merge base ae77f40.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D63056142
CI error looks relevant. Maybe take a look before landing?
7857bb1 to 2950a2e

…torch#1093)

Summary: The implementation reuses the `WeightWithDynamicFloat8CastTensor` class and the `Float8Linear` module. I added an if-else branch in the existing `Float8Linear` module to reuse our existing logic to handle the different casting cases, such as pre-/post-forward for delayed scaling and pre-computed amax for fp8 all-gather.

Reviewed By: weifengpy
Differential Revision: D63056142
2950a2e to 0f3ef23
0f3ef23 to 5c5f4f2
Summary:
The implementation reuses the `WeightWithDynamicFloat8CastTensor` class and the `Float8Linear` module. I added an if-else branch in the existing `Float8Linear` module to reuse our existing logic to handle the different casting cases, such as pre-/post-forward for delayed scaling and pre-computed amax for fp8 all-gather.

Differential Revision: D63056142
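For readers unfamiliar with the feature: the point of an "fp8-all-gather only" mode is that the weight is cast to fp8 just for the communication step (halving all-gather bandwidth under FSDP2), and then dequantized back to high precision before the matmul, so the computation itself is unchanged. The sketch below illustrates only the control-flow idea of the if-else branch; all names are hypothetical, the quantization is simulated with a symmetric scale on plain Python lists, and the real implementation uses `torch.float8_e4m3fn` tensors inside `Float8Linear`:

```python
# Hypothetical sketch of an "fp8 all-gather only" branch.
# Not the actual torchao code: fp8 casting is simulated with a
# symmetric scale, and the all-gather is a no-op placeholder.

E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn

def quantize_fp8(values):
    """Return (scale, payload): values scaled into the e4m3 range and rounded."""
    amax = max(abs(v) for v in values) or 1.0
    scale = E4M3_MAX / amax
    payload = [round(v * scale) for v in values]  # round() stands in for the fp8 cast
    return scale, payload

def dequantize_fp8(scale, payload):
    """Recover high-precision values from the scaled payload."""
    return [p / scale for p in payload]

def linear_forward(x, weight, use_fp8_all_gather_only):
    # The weight is always communicated in fp8 to save bandwidth.
    scale_w, w_fp8 = quantize_fp8(weight)
    gathered = w_fp8  # placeholder for the FSDP2 all-gather

    if use_fp8_all_gather_only:
        # New branch: dequantize after the gather and compute in high precision.
        w = dequantize_fp8(scale_w, gathered)
        return sum(xi * wi for xi, wi in zip(x, w))

    # Existing branch: quantize the activation too, accumulate in the
    # payload domain, and divide out both scales at the end.
    scale_x, x_fp8 = quantize_fp8(x)
    acc = sum(xi * wi for xi, wi in zip(x_fp8, gathered))
    return acc / (scale_w * scale_x)
```

Both branches share the quantized all-gather; only the point at which the weight is dequantized differs, which is why the PR can reuse the existing casting logic with a single branch.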