SAM2 Fast AMG: memory profiling and more compile #1296

cpuhrsch · 2024-11-16T00:41:40Z

More changes to reduce latency.

pytorch-bot · 2024-11-16T00:41:43Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1296

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

[DomainsOnly] Jobs fail with GLIBC version not found

❌ 1 New Failure

As of commit 4391732 with merge base 06ad55a:

NEW FAILURE - The following job has failed:

Run Regression Tests / test (CUDA 2.4, linux.g5.12xlarge.nvidia.gpu, torch==2.4.0, cuda, 12.1) / linux-job (gh)
##[error]fatal: couldn't find remote ref refs/pull/1296/merge

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Fix pytorch#1296 Align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5

* add pp_dim, distributed, num_gpus, num_nodes as cmd line args
* add tp_dim
* add elastic_launch
* working, can now launch from cli
* Remove numpy < 2.0 pin to align with pytorch (pytorch#1301)
  Fix pytorch#1296 Align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5
* Update torchtune pin to 0.4.0-dev20241010 (pytorch#1300)
  Co-authored-by: vmpuri <[email protected]>
* Unbreak gguf util CI job by fixing numpy version (pytorch#1307)
  Setting numpy version to be the range required by gguf: https://github.com/ggerganov/llama.cpp/blob/master/gguf-py/pyproject.toml
* Remove apparently-unused import torchvision in model.py (pytorch#1305)
  Co-authored-by: vmpuri <[email protected]>
* remove global var for tokenizer type + patch tokenizer to allow list of sequences
* make pp tp visible in interface
* Add llama 3.1 to dist_run.py
* [WIP] Move dist inf into its own generator
* Add initial generator interface to dist inference
* Added generate method and placeholder scheduler
* use prompt parameter for dist generation
* Enforce tp>=2
* Build tokenizer from TokenizerArgs
* Disable torchchat format + constrain possible models for distributed
* disable calling dist_run.py directly for now
* Restore original dist_run.py for now
* disable _maybe_parallelize_model again
* Reenable arg.model_name in dist_run.py
* Use singleton logger instead of print in generate
* Address PR comments; try/expect in launch_dist_inference; added comments

---------

Co-authored-by: lessw2020 <[email protected]>
Co-authored-by: Mengwei Liu <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: vmpuri <[email protected]>
Co-authored-by: Scott Wolchok <[email protected]>

cpuhrsch added 7 commits November 14, 2024 00:26

Memory profiling and reduced memory

8c12fb1

uncomment max_memory_allocated

2a2e034

More annotations and bigger batch benchmarks

c2e818e

TODO and cleanup

9e8aa3c

float16 interpolate and more compile

f5ac8b5

More annotations

89d3ca1

compile the flatten

e2fbfad

facebook-github-bot added the "CLA Signed" label Nov 16, 2024

cpuhrsch added the "topic: improvement" label Nov 16, 2024

jerryzh168 approved these changes Nov 16, 2024

cpuhrsch added 3 commits November 15, 2024 18:14

Transforms device

ebd4534

First results

1a2e613

Updated results

4391732

cpuhrsch merged commit d4ca98f into pytorch:main Nov 16, 2024
14 of 15 checks passed

sunjiweiswift pushed a commit to sunjiweiswift/ao that referenced this pull request Nov 25, 2024

SAM2 Fast AMG: memory profiling and more compile (pytorch#1296)

01a7b47

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024

Remove numpy < 2.0 pin to align with pytorch (pytorch#1301)

dd9747f

Fix pytorch#1296 Align with https://github.com/pytorch/pytorch/blame/main/requirements.txt#L5
