Add semi-structured sparsity to hf eval #576

jcaip · 2024-07-30T21:53:50Z

Summary:

This PR adds Hugging Face eval support for semi-structured (2:4) sparsity, using the nm-testing/SparseLlama-3-8B-pruned_50.2of4 checkpoint published by neuralmagic. We can accelerate this checkpoint with to_sparse_semi_structured with minimal accuracy loss.

Also adds an example notebook that outlines how to do text generation with a Hugging Face model and shows an 80 tok/s -> 87 tok/s speedup (roughly 1.09x).
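
For readers unfamiliar with the "2of4" in the checkpoint name: in a 2:4 semi-structured sparse weight, every aligned group of 4 values holds at most 2 nonzeros. The sketch below is a pure-Python illustration of that pattern only. The helper names `is_2_of_4_sparse` and `prune_2_of_4` are hypothetical, and the magnitude-based pruning is an assumption chosen for simplicity, not the method used to produce the neuralmagic checkpoint or anything in this PR.

```python
# Illustrative sketch of the 2:4 pattern; names and pruning rule are hypothetical.

def is_2_of_4_sparse(row):
    """Return True if every aligned group of 4 values has at least 2 zeros."""
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    return all(
        sum(1 for v in row[i:i + 4] if v == 0) >= 2
        for i in range(0, len(row), 4)
    )


def prune_2_of_4(row):
    """Keep the 2 largest-magnitude values in each group of 4, zero the rest.

    Assumption: plain magnitude pruning, used here only to produce a valid
    2:4 pattern for demonstration.
    """
    if len(row) % 4 != 0:
        raise ValueError("row length must be a multiple of 4")
    pruned = []
    for i in range(0, len(row), 4):
        group = row[i:i + 4]
        # Indices of the two entries with the largest absolute value.
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        pruned.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return pruned


dense_row = [0.1, -0.9, 0.4, 0.05, 2.0, 0.0, -0.3, 0.7]
sparse_row = prune_2_of_4(dense_row)
print(sparse_row)
print(is_2_of_4_sparse(dense_row), is_2_of_4_sparse(sparse_row))
```

A check like this can be useful before converting weights, since a tensor that does not already follow the 2:4 pattern would lose information when compressed.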

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:


pytorch-bot · 2024-07-30T21:53:53Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/576

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fc82248 with merge base 5f35645:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

msaroufim · 2024-07-30T22:48:23Z

scripts/hf_eval.py

    with torch.no_grad():
        result = evaluate(
            HFLM(
                pretrained=model.to(device),
                tokenizer=tokenizer,
-                batch_size=batch_size,
+                batch_size="auto",
+                max_batch_size=2048,


jcaip · 2024-08-02T19:07:17Z

| (m, k, n) | dense_time | sparse_time | dense_memory | sparse_memory | speedup |
| --- | --- | --- | --- | --- | --- |
| (1, 4096, 4096) | 0.026794 | 0.048279 | 0.050628 | 0.136668 | 0.553 |
| (1, 4096, 1024) | 0.014931 | 0.048054 | 0.025443 | 0.046953 | 0.310 |
| (1, 4096, 14336) | 0.098401 | 0.077046 | 0.134575 | 0.435717 | 1.277 |
| (1, 14336, 4096) | 0.084088 | 0.153705 | 0.134534 | 0.435533 | 0.547 |
| (8, 4096, 4096) | 0.028638 | 0.045108 | 0.050800 | 0.136783 | 0.635 |
| (8, 4096, 1024) | 0.015131 | 0.044912 | 0.025529 | 0.047025 | 0.337 |
| (8, 4096, 14336) | 0.099945 | 0.073439 | 0.135034 | 0.435975 | 1.364 |
| (8, 14336, 4096) | 0.083806 | 0.148976 | 0.134850 | 0.435791 | 0.563 |
| (16, 4096, 4096) | 0.029405 | 0.045662 | 0.050996 | 0.136979 | 0.642 |
| (16, 4096, 1024) | 0.020930 | 0.045057 | 0.025628 | 0.047123 | 0.463 |
| (16, 4096, 14336) | 0.100186 | 0.073811 | 0.135558 | 0.436499 | 1.365 |
| (16, 14336, 4096) | 0.084441 | 0.150410 | 0.135210 | 0.436151 | 0.561 |
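
One way to read the table above is to recompute the ratio per shape and see where the sparse kernel wins. The sketch below assumes the speedup column is approximately `dense_time / sparse_time`. That is an inference from the numbers, not something stated in the comment, and the recomputed ratios match the reported column only to within about 0.01. The values are copied from the table, and the names `ROWS` and `recomputed_speedup` are made up for this illustration.

```python
# Rows copied from the table: ((m, k, n), dense_time, sparse_time, reported_speedup).
ROWS = [
    ((1, 4096, 4096), 0.026794, 0.048279, 0.553),
    ((1, 4096, 1024), 0.014931, 0.048054, 0.310),
    ((1, 4096, 14336), 0.098401, 0.077046, 1.277),
    ((1, 14336, 4096), 0.084088, 0.153705, 0.547),
    ((8, 4096, 4096), 0.028638, 0.045108, 0.635),
    ((8, 4096, 1024), 0.015131, 0.044912, 0.337),
    ((8, 4096, 14336), 0.099945, 0.073439, 1.364),
    ((8, 14336, 4096), 0.083806, 0.148976, 0.563),
    ((16, 4096, 4096), 0.029405, 0.045662, 0.642),
    ((16, 4096, 1024), 0.020930, 0.045057, 0.463),
    ((16, 4096, 14336), 0.100186, 0.073811, 1.365),
    ((16, 14336, 4096), 0.084441, 0.150410, 0.561),
]


def recomputed_speedup(dense_time, sparse_time):
    """Assumed definition: values above 1.0 mean the sparse kernel is faster."""
    return dense_time / sparse_time


# Shapes where the sparse path beats the dense path.
faster_shapes = [
    shape for shape, dense, sparse, _ in ROWS
    if recomputed_speedup(dense, sparse) > 1.0
]

# Largest disagreement between the recomputed ratio and the reported column.
max_gap = max(
    abs(recomputed_speedup(dense, sparse) - reported)
    for _, dense, sparse, reported in ROWS
)

print(faster_shapes)
print(round(max_gap, 4))
```

Under that assumption, only the (m, 4096, 14336) shapes come out ahead in these measurements. Every other shape listed is slower with the sparse kernel.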

msaroufim · 2024-08-06T02:09:36Z

tutorials/huggingface_24sparse_example.ipynb

@@ -0,0 +1,383 @@
+{


Really not a fan of checking in notebooks. Mind turning this into a Python file? There should be a button to do this in Jupyter. Alternatively, link to the tutorial hosted on Google Colab.

jcaip · 2024-08-06

Sure, that's fine with me.

msaroufim · 2024-08-08T22:00:48Z

tutorials/huggingface_24sparse_example.py

+torch.sparse.SparseSemiStructuredTensor._FORCE_CUTLASS = True
+torch.set_float32_matmul_precision('high')
+
+def timed(fn):


we already have timing utilities in torchao/utils.py

Add hf example for semi-structured sparsity

8d64eca


facebook-github-bot added the CLA Signed label Jul 30, 2024

msaroufim reviewed Jul 30, 2024

jcaip added 2 commits August 2, 2024 11:37

Merge branch 'main' into jcaip/hf-sparse-example

67b8814

updated notebook

22acc9d

jcaip and others added 7 commits August 2, 2024 12:07

update

de03379

update hf example

122faaa

Update version.txt

20821d3

Merge branch 'main' into jcaip/hf-sparse-example

e30c587

update hf_eval changes

c0b3a03

update

0729156

Merge branch 'main' into jcaip/hf-sparse-example

843be9a

jcaip marked this pull request as ready for review August 6, 2024 00:44

jcaip changed the title ~~[wip] Add hf example for semi-structured sparsity~~ Add semi-structured sparsity to hf eval Aug 6, 2024

msaroufim reviewed Aug 6, 2024

jcaip added 2 commits August 8, 2024 13:02

remove notebook and add script

7e8ffc4

fix merge conflict

fc82248

msaroufim reviewed Aug 8, 2024

msaroufim approved these changes Aug 23, 2024

msaroufim merged commit eaf2908 into main Aug 23, 2024
14 checks passed

msaroufim deleted the jcaip/hf-sparse-example branch August 23, 2024 17:36
