
[NF4] .to() fixes #1312

Open · gau-nernst wants to merge 2 commits into main

Conversation

gau-nernst (Collaborator)

Fixes #1310

pytorch-bot bot commented Nov 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1312

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit f309aea with merge base 26648c2:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label (authors need to sign the CLA before a PR can be reviewed) on Nov 19, 2024
gau-nernst added the topic: bug fix label on Nov 19, 2024
gau-nernst marked this pull request as ready for review on Nov 19, 2024 at 17:38


@implements_torch_function(torch.Tensor.cuda)
def function_cuda(*args, **kwargs):
Contributor

Does call_from_inner_tensors not work for this function?

gau-nernst (Collaborator, Author)

call_from_inner_tensors() does not call the method on the .scaler_mean and .nf4 attributes, hence I use __tensor_flatten__ instead.
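
For reference, a minimal sketch of that approach, assuming NF4Tensor follows the usual tensor-subclass convention where __tensor_flatten__ returns the inner-tensor attribute names plus a metadata context and __tensor_unflatten__ rebuilds the subclass (illustrative only, not the exact diff):

```python
@implements_torch_function(torch.Tensor.cuda)
def function_cuda(*args, **kwargs):
    nf4tensor = args[0]
    # __tensor_flatten__ enumerates every inner tensor, including
    # .scaler_mean and .nf4, which call_from_inner_tensors would skip.
    attrs, ctx = nf4tensor.__tensor_flatten__()
    inner_tensors = {
        name: getattr(nf4tensor, name).cuda(*args[1:], **kwargs)
        for name in attrs
    }
    return NF4Tensor.__tensor_unflatten__(inner_tensors, ctx, None, None)
```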

Contributor

I see. That's a bit of an anti-pattern, since call_from_inner_tensors is typically applied to all of the inner tensors rather than only the sharding-specific ones. But this makes sense. Maybe we should update it to take a flag that controls whether the non-sharded attributes are ignored, as sketched below.
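
Something along these lines, purely hypothetical — neither the flag nor this exact signature exists in torchao today:

```python
def call_from_inner_tensors(nf4tensor, method_name, args, kwargs, all_tensors=False):
    attrs, _ = nf4tensor.__tensor_flatten__()
    if not all_tensors:
        # Current behavior: only the shard-specific inner tensors,
        # skipping .scaler_mean and .nf4.
        attrs = [a for a in attrs if a not in ("scaler_mean", "nf4")]
    for name in attrs:
        inner = getattr(nf4tensor, name)
        method = getattr(inner, method_name)
        setattr(nf4tensor, name, method(*args, **kwargs))
```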

Contributor

Otherwise, thank you. Yeah, I really didn't like .to() calling dequant secretly, so I think this aligns better with what we've seen in the rest of the library.

gau-nernst (Collaborator, Author)

Yeah, I also wrote down some of my thoughts in #1310.

To clarify, this PR does not change the behavior of .to() secretly calling dequant when a dtype is specified. Apart from the fixes for the .scaler_mean and .nf4 attributes, this PR only changes .cuda() to not dequantize (previously .cuda() would propagate to aten._to_copy, which dequantizes), making it consistent with .cpu() as well as with the general rule of not dequantizing when no dtype is specified.
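
Concretely, the expected behavior would look something like this (a sketch; to_nf4 is torchao's NF4 constructor, and the comments reflect the intent described above):

```python
import torch
from torchao.dtypes.nf4tensor import to_nf4

x = to_nf4(torch.randn(512, 512, dtype=torch.bfloat16))

y = x.cuda()              # no dtype given: stays NF4, inner tensors move to CUDA
z = y.cpu()               # likewise stays NF4 when moving back to CPU
w = x.to(torch.bfloat16)  # dtype given: dequantizes (unchanged by this PR)
```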

drisspg (Contributor) commented Nov 19, 2024

@ebsmothers Are there any ways to validate this against some tune flows to make sure it doesn't break anything?

Labels

CLA Signed · topic: bug fix
Development

Successfully merging this pull request may close these issues.

[NF4] Various bugs in how NF4 handles .to() to move to a different device
3 participants