[low-bit optim] Fix Adam4bit support on PyTorch 2.3 and 2.4. Update AdamFp8 torch requirement #755
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/755
Note: Links to docs will display an error until the docs builds have been completed.
❌ 2 New Failures as of commit d83a1c1 with merge base ba2d3b1. The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Converting this to draft since I'm also investigating torch version support for the FP8 optimizer. The FP8 optimizer has never run in CI due to the sm89 constraint.
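As a minimal sketch of the sm89 constraint mentioned above (a hypothetical guard, not the torchao test code): FP8 kernels require a GPU with compute capability at least 8.9, so a check along these lines could gate the FP8 optimizer tests in CI.

```python
import torch

def has_fp8_support() -> bool:
    # Hypothetical helper: FP8 needs compute capability >= (8, 9),
    # e.g. Ada (sm89) or newer; CI runners without such GPUs never exercise it.
    if not torch.cuda.is_available():
        return False
    return torch.cuda.get_device_capability() >= (8, 9)

if has_fp8_support():
    print("FP8 optimizer tests can run on this GPU")
else:
    print("Skipping FP8 optimizer tests (requires sm89+)")
```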
Fixed the issue with 4-bit Adam. 4-bit Adam now works with PyTorch 2.3 as it did in the past. Hopefully CI is green. The issue seems to be related to pytorch/pytorch#128649. I feel somewhat conflicted about this change, since the optimizer state is now flattened instead of having the same shape as the param. I will try a better solution in the future. I think it also has to do with dynamic compile. 4-bit optim is giving us a lot of headaches 🤣.
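To illustrate the trade-off described above, here is a minimal sketch (hypothetical names, not the torchao implementation) of keeping the optimizer state as a flat 1D buffer and only viewing it back to the param shape inside the step, which keeps torch.compile happy at the cost of the state no longer mirroring the param's shape.

```python
import torch

class FlatExpAvg:
    """Hypothetical optimizer-state holder: stores exp_avg as a flat buffer."""

    def __init__(self, param: torch.Tensor):
        # State is flattened instead of matching param.shape.
        self.data = torch.zeros(param.numel(), device=param.device)
        self.shape = param.shape

    def as_param_shape(self) -> torch.Tensor:
        # View back to the param's shape only when the update is applied.
        return self.data.view(self.shape)

p = torch.randn(4, 8)
exp_avg = FlatExpAvg(p)
p.add_(exp_avg.as_param_shape(), alpha=-1e-3)  # toy update using the viewed state
```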
[low-bit optim] Fix Adam4bit support on PyTorch 2.3 and 2.4. Update AdamFp8 torch requirement (pytorch#755)
* update doc on torch version
* update doc
* update
* fix 4-bit problem
* update doc
* update
See #744 (comment)