Enable CPU Offload for Intel GPU #1324
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1324
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 03ac00f with merge base 478d15b.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 755414f to 742ffbf.
Thanks for the feature addition! Hopefully once device-agnostic API support arrives, we can eliminate the if-else checks 😆
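For context, a hypothetical illustration of the per-device branching this comment refers to (the helper below is made up for illustration, not code from this PR):

```python
import torch

# Hypothetical example of the if-else device checks that a
# device-agnostic API could eliminate; not the actual PR code.
def synchronize(device: torch.device) -> None:
    if device.type == "cuda":
        torch.cuda.synchronize()
    elif device.type == "xpu":
        torch.xpu.synchronize()
    else:
        raise ValueError(f"Unsupported device: {device}")
```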
@dbyoung18 Can you run ruff format? Otherwise, everything else looks good already!
Done with ruff format. Hope the bnb issue can be resolved soon. Thanks again for your review and quick feedback :)
@dbyoung18 Can you merge from main? #1343 should fix the bnb issue. Also, can you update the doc here? https://github.com/pytorch/ao/tree/main/torchao/prototype/low_bit_optim#optimizer-cpu-offload After that we are good to merge 😃
Force-pushed from 649b00b to 9d15d4b.
Done for both. We plan to gradually support torchao and PyTorch core on Intel GPU. This PR covers CPU offload only; I will look into the remaining parts of the low-bit optimizers as a next step. Since we are also in the process of upstreaming the FlashAttention backend to PyTorch core (targeting v2.6 or v2.7), I would like to add benchmark data to the README once that is ready. So for now, I have only modified the README so that the CPU-offload section covers XPU. Thanks for the review, and I look forward to making further contributions soon. 😃
Sounds good! The low-bit optimizers rely entirely on the tensor-subclass + torch.compile() stack, so as long as there is a triton build that supports the XPU backend, they should work out of the box! (A quick smoke test for that assumption is sketched below.)
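A minimal smoke test for that assumption might look like this (a sketch; it assumes a PyTorch build with XPU support and a triton build with an XPU backend for torch.compile):

```python
import torch

# Sketch: if torch.compile can compile and run a trivial function on XPU,
# the triton-XPU stack the low-bit optimizers depend on is likely working.
if torch.xpu.is_available():
    compiled = torch.compile(lambda x: x * 2 + 1)
    out = compiled(torch.randn(8, device="xpu"))
    print(out.device)  # e.g. xpu:0
```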
Background
The current CPU Offload in torchao only supports the CUDA backend. We would like to add support for Intel GPU with the device option "xpu"; a rough usage sketch follows.
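A minimal sketch of the intended usage, based on the CPUOffloadOptimizer API documented in the low_bit_optim README linked above (the exact XPU handling added by this PR may differ):

```python
import torch
from torchao.prototype.low_bit_optim import CPUOffloadOptimizer

# Sketch: run the model on an Intel GPU while the optimizer state and the
# optimizer step stay on the CPU. Assumes a PyTorch build with XPU support.
model = torch.nn.Linear(512, 512).to("xpu")
optim = CPUOffloadOptimizer(model.parameters(), torch.optim.AdamW, lr=1e-3)

x = torch.randn(16, 512, device="xpu")
model(x).sum().backward()
optim.step()
optim.zero_grad()
```

Offloading trades extra host-device transfers for GPU memory savings, so it pays off mainly when optimizer state dominates memory usage.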
Details