[StaticQuant] Update how block_size is calculated with Observers #815
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/815
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit de0b7fd with merge base 848e123.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed 82265e7 to 2579f23 (compare)
Force-pushed 591472c to 155e41c (compare)
Looks great, thanks for adding docs and the additional test! I had one comment inline; please address it before merging.
Force-pushed 155e41c to 7caa703 (compare)
stack-info: PR: #815, branch: drisspg/stack/10
Force-pushed 7caa703 to de0b7fd (compare)
…ytorch#815)

Wrap the attempt to load a model in `try {} catch (std::runtime_error) {}` and attempt to create the model on GPU first, since attempting to load a CPU model on CUDA destroys the CUDA context (bugs/fixes against PyTorch are coming, tracked in pytorch/pytorch#126547).

Also, fix two bugs in the repo:
- Initialize `Tokenizer::initialized_` to false
- Change the name of the tokenizer file in a workflow from `tokenizer.bin` to `tokenizer.model`

Fixes pytorch/torchchat#709

Test plan:
```
python3 torchchat.py export --checkpoint-path checkpoints/stories15M/model.pth --output-dso-path model_cpu.so --device cpu
python3 torchchat.py export --checkpoint-path checkpoints/stories15M/model.pth --output-dso-path model.so
./cmake-out/aoti_run ./model.so -z checkpoints/stories15M/tokenizer.model
./cmake-out/aoti_run ./model_cpu.so -z checkpoints/stories15M/tokenizer.model
```
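The GPU-first-with-fallback idea from the referenced commit can be sketched as follows. This is a minimal illustration only, not the actual torchchat/aoti_run code: `Device`, `Model`, `load_model`, and `load_with_fallback` are hypothetical stand-ins for the runner's real types. The point is simply that the CUDA attempt happens first and a `std::runtime_error` triggers a retry on CPU.

```cpp
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the runner's real model/loader types.
enum class Device { CUDA, CPU };

struct Model {
  Device device;
};

// Hypothetical loader: throws std::runtime_error when the shared object was
// not exported for the requested device.
std::unique_ptr<Model> load_model(const std::string& path, Device device) {
  if (path.find("_cpu") != std::string::npos && device == Device::CUDA) {
    throw std::runtime_error("model was exported for CPU");
  }
  return std::make_unique<Model>(Model{device});
}

// Attempt the CUDA load first and fall back to CPU when it throws, mirroring
// the try/catch pattern described in the commit message.
std::unique_ptr<Model> load_with_fallback(const std::string& path) {
  try {
    return load_model(path, Device::CUDA);
  } catch (const std::runtime_error& e) {
    std::cerr << "CUDA load failed (" << e.what() << "), retrying on CPU\n";
    return load_model(path, Device::CPU);
  }
}

int main() {
  auto model = load_with_fallback("./model_cpu.so");
  std::cout << "loaded on "
            << (model->device == Device::CUDA ? "CUDA" : "CPU") << "\n";
  return 0;
}
```

Catching `std::runtime_error` (rather than a broader `catch (...)`) keeps the fallback limited to the load-failure case the commit describes, so unrelated errors still propagate.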
Stacked PRs:
[StaticQuant] Update how block_size is calculated with Observers