Add AQT tensor parallel for float8_dynamic_quant #1078

jainapurva · 2024-10-15T01:01:44Z

Added tensor parallel support for float8_dynamic_activation_float8_weight:

- Supports PerTensor scaling
- Supports PerRow scaling

Added op implementations:

- aten.slice.Tensor
- aten.view.default
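To make the two granularities concrete, here is a minimal pure-Python sketch (not torchao code) of how a per-tensor scale and a per-row scale might be computed for a small weight, and how each behaves when the weight is sharded along dim 0. The helper names and the `amax / 448` scale convention are assumptions for illustration; 448 is the largest finite value of the float8 e4m3fn format.

```python
# Illustrative sketch only: plain lists stand in for tensors, and the
# helper names below are hypothetical, not part of torchao.
FP8_E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn


def per_tensor_scale(weight):
    """One scale for the whole weight (a 0-dim scale in tensor terms)."""
    amax = max(abs(v) for row in weight for v in row)
    return amax / FP8_E4M3_MAX


def per_row_scale(weight):
    """One scale per row (a 1-D scale in tensor terms)."""
    return [max(abs(v) for v in row) / FP8_E4M3_MAX for row in weight]


weight = [
    [1.0, -2.0, 0.5],
    [448.0, 3.0, -4.0],
    [0.25, 0.125, -0.5],
    [-8.0, 7.0, 6.0],
]

tensor_scale = per_tensor_scale(weight)  # a single float
row_scales = per_row_scale(weight)       # one float per row

# Shard along dim 0 across 2 ranks: each rank takes a contiguous block of rows.
world_size = 2
rows_per_rank = len(weight) // world_size
shards = [
    weight[r * rows_per_rank:(r + 1) * rows_per_rank]
    for r in range(world_size)
]
# Per-row scales are sharded with the rows they describe.
row_scale_shards = [
    row_scales[r * rows_per_rank:(r + 1) * rows_per_rank]
    for r in range(world_size)
]
```

In this sketch, per-row scales travel with their rows when the weight is sharded, while the per-tensor scale is a single value that every shard reuses unchanged.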

pytorch-bot · 2024-10-15T01:01:48Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1078

✅ No Failures

As of commit 3dfc799 with merge base e7b33bc:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2024-10-16T00:11:34Z

torchao/dtypes/affine_quantized_tensor.py

+                    return return_and_correct_aliasing(
+                        func, args, kwargs, args[0]._apply_fn_to_data(lambda x: aten.slice.Tensor(x, dim, start, end, step))
+                    )
+                else:


what is this case btw, is this ndim==0 or some other case?

jainapurva

Yes, it's ndim == 0. I've updated the condition to check for ndim == 0 explicitly instead of relying on a generic else.

jerryzh168

looks good, thanks!
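The distinction discussed in this thread can be sketched with a toy example. The code below is not the PR's implementation; it is a pure-Python stand-in, with hypothetical class and function names, that assumes a per-row scale is 1-D and a per-tensor scale is 0-dim. It shows why a dim-0 slice would treat the two cases differently: a 1-D scale can be sliced together with the data, whereas a 0-dim scale has no dimension to slice and is reused as-is.

```python
# Illustrative sketch only: a toy stand-in for a quantized tensor impl.
# The class and function names are hypothetical, not the torchao API.
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ToyFloat8Impl:
    data: List[List[float]]           # 2-D quantized payload
    scale: Union[float, List[float]]  # float = per-tensor, list = per-row


def scale_ndim(scale) -> int:
    """Return 1 for a per-row (list) scale and 0 for a scalar scale."""
    return 1 if isinstance(scale, list) else 0


def slice_dim0(t: ToyFloat8Impl, start: int, end: int) -> ToyFloat8Impl:
    """Slice rows [start:end]; the step is fixed at 1 in this sketch."""
    sliced_data = t.data[start:end]
    if scale_ndim(t.scale) == 1:
        # Per-row: one scale entry per row, so slice it with the data.
        return ToyFloat8Impl(sliced_data, t.scale[start:end])
    # Per-tensor (ndim == 0): a scalar scale cannot be sliced; reuse it.
    return ToyFloat8Impl(sliced_data, t.scale)


per_row = ToyFloat8Impl([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [0.1, 0.2, 0.3])
per_tensor = ToyFloat8Impl([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], 0.5)

row_slice = slice_dim0(per_row, 1, 3)
tensor_slice = slice_dim0(per_tensor, 1, 3)
```

Checking for ndim == 0 explicitly, rather than falling through a bare else, makes it clear which scale layouts the slice handles.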

jainapurva requested review from jerryzh168 and drisspg October 15, 2024 01:01

facebook-github-bot added the CLA Signed label Oct 15, 2024

Float8 tensor parallel for aqt_dynamic_act_weight

6314d88

jainapurva force-pushed the float8_dyn_aqt_tp branch from 21711e0 to 6314d88 October 15, 2024 01:03

Float8 tensor parallel for aqt_dynamic_act_weight

26d84b5

jainapurva force-pushed the float8_dyn_aqt_tp branch from 3b5c8f9 to e0966cc October 15, 2024 23:46

jerryzh168 reviewed Oct 16, 2024

Added support for PerRow granularity

3dfc799

jainapurva force-pushed the float8_dyn_aqt_tp branch from e0966cc to 3dfc799 October 16, 2024 02:51

jerryzh168 approved these changes Oct 16, 2024

jainapurva marked this pull request as ready for review October 16, 2024 04:22

jainapurva merged commit 7a35695 into main Oct 16, 2024
17 checks passed

jainapurva mentioned this pull request Oct 16, 2024

Tensor Parallelism Support for AffineQuantizedTensor #988

Open

6 tasks
