
Add segment-anything-fast perf/acc benchmarks to torchao #457

Merged: 13 commits into main, Jul 2, 2024
Conversation

@jcaip (Contributor) commented Jun 28, 2024

This PR adds segment-anything-fast evaluation to torchao, along with benchmarks for int8 quantization + 2:4 sparsity.

With this we can run combined perf/accuracy benchmarks for segment-anything. This should give us a starting point for the relative perf vs relative acc graph for PTC.

| Model Type | Technique                                                                                            | img/s | memory (MiB) | mIoU   | relative speedup | relative accuracy |
|------------|------------------------------------------------------------------------------------------------------|-------|--------------|--------|------------------|-------------------|
| ViT-h      | baseline (bfloat16, max-autotune)                                                                    | 22.75 | 15172        | 0.5811 |                  |                   |
|            | int8 dynamic quant (attn + mlp)                                                                      | 24.91 | 15154        | 0.5822 | 1.09x            | 100.19%           |
|            | 2:4 sparsity (mlp only)                                                                              | 24.81 | 15632        | 0.5672 | 1.10x            | 97.61%            |
|            | 2:4 sparsity (attn + mlp)                                                                            | 24.30 | 13429        | 0.5306 | 1.07x            | 91.31%            |
|            | int8 dynamic quant (attn)<br>int8 dynamic quant + 2:4 sparsity (mlp lin1)<br>2:4 sparsity (mlp lin2) | 26.46 | 14865        | 0.5668 | 1.16x            | 97.54%            |

This PR just copies over the evaluation scripts. Eventually I think we should move over the modeling code too, but I plan to do that in a subsequent PR.
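
For readers who want a concrete picture of the 2:4 sparsity rows above, here is a minimal, self-contained sketch of applying semi-structured (2:4) sparsity to the MLP linears of a model using PyTorch's `to_sparse_semi_structured`. This is an illustration only, not the torchao / segment-anything-fast benchmark code itself; the magnitude-based pruning helper and the `"mlp"` name filter are assumptions.

```python
import torch
from torch.sparse import to_sparse_semi_structured

def prune_to_2_4(weight: torch.Tensor) -> torch.Tensor:
    # Naive magnitude-based 2:4 pruning: in every group of 4 weights,
    # zero out the 2 with the smallest absolute value.
    w = weight.detach().clone()
    groups = w.view(-1, 4)
    drop = groups.abs().argsort(dim=1)[:, :2]  # indices of the 2 smallest per group
    groups.scatter_(1, drop, 0.0)
    return w

def sparsify_mlp_2_4(model: torch.nn.Module) -> torch.nn.Module:
    # Swap the dense weights of MLP linear layers for 2:4 semi-structured
    # sparse tensors. Requires CUDA and a supported dtype (fp16/bf16/int8).
    for name, mod in model.named_modules():
        if isinstance(mod, torch.nn.Linear) and "mlp" in name:
            mod.weight = torch.nn.Parameter(
                to_sparse_semi_structured(prune_to_2_4(mod.weight))
            )
    return model
```

The "2:4 sparsity (mlp only)" row corresponds to this kind of transformation on lin1/lin2 of each transformer block (in bfloat16 on GPU), while the "attn + mlp" rows also sparsify the attention projections; the int8 rows additionally apply torchao's int8 dynamic quantization.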

@pytorch-bot (bot) commented Jun 28, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/457

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit ad8b42f with merge base 5d22ad2:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label Jun 28, 2024
@jcaip marked this pull request as draft June 28, 2024 00:03
@jcaip marked this pull request as ready for review June 28, 2024 02:18
@jcaip changed the title from "[wip] Add segment-anything-fast benchmarks to torchao" to "Add segment-anything-fast perf/acc benchmarks to torchao" Jun 28, 2024

| qkv | proj | lin1 | lin2 | time | memory | img/s |
| ---- | ---- | ---- | ---- | ---- | ------ | ----- |
| None | None | None | None | 1361.73 | 15.81 | 23.50 |
@jcaip (Contributor, Author) commented Jun 28, 2024
These numbers are a bit higher than the ones reported in the new benchmark script; I'm pretty sure this is because we reuse the same example for benchmarking, not because of any changes to the code.

I grabbed the output of TORCH_LOGS=output_code for both the new and old benchmark scripts and diffed them; they look pretty much identical:

[Screenshot: diff of the generated output code, 2024-06-27]
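
For reference, the output-code comparison described above can be reproduced with a sketch like the following (the tiny model and input are placeholders, not the SAM image encoder):

```python
import torch

# Programmatic equivalent of running with TORCH_LOGS=output_code: ask
# Inductor to log the generated kernels so two runs can be dumped and diffed.
torch._logging.set_logs(output_code=True)

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU())
model = model.cuda().bfloat16()
compiled = torch.compile(model, mode="max-autotune")
compiled(torch.randn(8, 64, device="cuda", dtype=torch.bfloat16))
```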

@jcaip requested review from HDCharles and msaroufim June 28, 2024 12:58
@@ -20,29 +20,32 @@ More concretely, we hope to provide tutorials and APIs for both sparse kernels (

## Success Stories

-#### segment-anything
+#### segment-anything-fast
(Member) commented:

can you update the main README as well with some of these results

# run_once(qkv="quant", proj="quant", lin1="quant+sparse (cusparselt)", lin2="quant+sparse (cusparselt)"),
# run_once(qkv="quant+sparse (cutlass)", proj="quant+sparse (cutlass)", lin1="quant+sparse (cutlass)", lin2="quant+sparse (cutlass)"),
]
ALL_RUNS = [run_once(qkv="quant", proj="quant", lin1="quant+sparse (cusparselt)", lin2="sparse (cusparselt)")]
(Member) commented:

could be cleaner?

@jcaip (Contributor, Author) replied:

Let me just remove this file; it's unnecessary now that we have the SAF eval code. I just left it in the PR so I could pull torch logs.

@msaroufim (Member) left a comment:

some minor nits

@jcaip (Contributor, Author) commented Jul 2, 2024:

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).

Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team. Advanced debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator) commented:

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).

@jcaip (Contributor, Author) commented Jul 2, 2024:

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours).

@msaroufim (Member) commented:

@huydhn @clee2000 another example of mergebot not merging stuff

@jcaip merged commit f22e8e8 into main Jul 2, 2024 (13 checks passed)
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024
* scrub & reformat code

* use full paths

* set tiktoken init to False, not None to align with new tokenizer chatting logic
Labels: CLA Signed, Merged
Projects: none yet
4 participants