fixing peak memory stats for benchmark #353

HDCharles · 2024-06-12T23:36:02Z

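Summary: The reported peak memory was being reached during model load rather than during model runtime. This is a problem because users can load the model to cpu/meta, which significantly reduces memory usage during model load and quantization, so a peak recorded at load time says little about what the model needs at runtime.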

Test Plan: sh benchmarks.sh

Reviewers:

Subscribers:

Tasks:

Tags:
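The idea in the summary, reading the peak only after the load phase is over, can be illustrated with a CPU-only analogy. The actual change to the benchmark script is not shown in this conversation, so the sketch below is not the PR's code: it uses the standard library's `tracemalloc` in place of the CUDA allocator, and `runtime_peak_bytes`, `load`, and `run` are hypothetical names invented for the illustration.

```python
import tracemalloc


def runtime_peak_bytes(load_fn, run_fn, reset_after_load=True):
    """Return the peak traced allocation in bytes.

    With reset_after_load=True the peak counter is cleared once loading
    is done, so the result reflects only the run phase.
    """
    tracemalloc.start()
    try:
        model = load_fn()
        if reset_after_load:
            # Requires Python 3.9+. Sets the recorded peak to the current usage.
            tracemalloc.reset_peak()
        run_fn(model)
        _current, peak = tracemalloc.get_traced_memory()
        return peak
    finally:
        tracemalloc.stop()


def load():
    # Hypothetical load step: a large scratch buffer that is freed on return,
    # standing in for temporary memory used while loading or quantizing.
    scratch = bytearray(50_000_000)
    return bytearray(1_000_000)


def run(model):
    # Hypothetical run step with a much smaller working set.
    activations = bytearray(5_000_000)
    return len(model) + len(activations)


if __name__ == "__main__":
    print("peak with reset:   ", runtime_peak_bytes(load, run))
    print("peak without reset:", runtime_peak_bytes(load, run, reset_after_load=False))
```

On a GPU benchmark the analogous step would presumably be a call to `torch.cuda.reset_peak_memory_stats()` after load and quantization, followed by reading `torch.cuda.max_memory_reserved()` once generation has finished. Both functions exist in PyTorch, but where this PR places them is an assumption.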

pytorch-bot · 2024-06-12T23:36:05Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/353

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit f7620fe with merge base 950a893:

NEW FAILURE - The following job has failed:

Run Regression Tests / test (CPU 2.3, linux.4xlarge, torch==2.3.0 --index-url https://download.pytorch.org/whl/cpu, cpu) / linux-job (gh)
curl: (22) The requested URL returned error:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

msaroufim · 2024-06-13T00:42:28Z

torchao/_models/llama/benchmark_results.txt

@@ -1,8 +1,16 @@
-20240610164534, tok/s= 94.91, mem/s=1424.58 GB/s, peak_mem=16.43 GB, model_size=15.01 GB quant: None, mod: Meta-Llama-3-8B, compile: True, compile_prefill: False, dtype: torch.bfloat16, device: cuda repro: python generate.py --checkpoint_path ../../../../gpt-fast/checkpoints/meta-llama/Meta-Llama-3-8B/model.pth --device cuda --precision torch.bfloat16 --compile --num_samples 5 --max_new_tokens 200 --top_k 200 --temperature 0.8


Should we keep this file, do you think? It feels subsumed by the table, which is significantly clearer to read.

having full repros of everything can be nice
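Since each line of `benchmark_results.txt` carries both the metrics and a full repro command, the file can also be read mechanically. The following is a minimal sketch, assuming every line follows the layout of the one quoted in the diff above; `parse_result_line` and the regular expressions are hypothetical helpers, not part of torchao.

```python
import re

# The result line quoted in the diff above, copied verbatim.
SAMPLE = (
    "20240610164534, tok/s= 94.91, mem/s=1424.58 GB/s, peak_mem=16.43 GB, "
    "model_size=15.01 GB quant: None, mod: Meta-Llama-3-8B, compile: True, "
    "compile_prefill: False, dtype: torch.bfloat16, device: cuda "
    "repro: python generate.py --checkpoint_path "
    "../../../../gpt-fast/checkpoints/meta-llama/Meta-Llama-3-8B/model.pth "
    "--device cuda --precision torch.bfloat16 --compile --num_samples 5 "
    "--max_new_tokens 200 --top_k 200 --temperature 0.8"
)

# Numeric fields are written as "name=value", optionally padded with spaces.
METRIC = re.compile(r"(tok/s|mem/s|peak_mem|model_size)=\s*([0-9.]+)")
# Settings are written as "name: value" and end at a comma, a space, or end of text.
SETTING = re.compile(
    r"\b(quant|mod|compile_prefill|compile|dtype|device):\s*(\S+?)(?=,|\s|$)"
)


def parse_result_line(line):
    """Split one benchmark result line into a dict of metrics, settings, and repro."""
    # Everything after " repro: " is the command; keep it out of the field search
    # so that flags such as "--device cuda" are not mistaken for settings.
    head, _, repro = line.partition(" repro: ")
    record = {
        "timestamp": head.split(",", 1)[0].strip(),
        "repro": repro.strip(),
    }
    for key, value in METRIC.findall(head):
        record[key] = float(value)
    for key, value in SETTING.findall(head):
        record[key] = value
    return record


if __name__ == "__main__":
    for key, value in parse_result_line(SAMPLE).items():
        print(f"{key}: {value}")
```

Keeping the repro command as a separate field preserves the benefit mentioned in the reply above, while the numeric fields can feed a table like the one in the README.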

msaroufim

Very helpful, thanks! You can ignore the CI failure since this is a docs-only change.

facebook-github-bot added the CLA Signed label Jun 12, 2024

improve language

f7620fe

HDCharles requested review from msaroufim, supriyar and jerryzh168 June 12, 2024 23:37

msaroufim reviewed Jun 13, 2024

msaroufim approved these changes Jun 13, 2024

msaroufim merged commit ead8cc8 into main Jun 13, 2024
12 of 13 checks passed

msaroufim deleted the 078 branch June 13, 2024 00:44
