[NF4] `.to()` fixes #1312

gau-nernst · 2024-11-19T15:37:13Z

Fixes #1310

pytorch-bot · 2024-11-19T15:37:18Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1312

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

[DomainsOnly] Jobs fail with GLIBC version not found

✅ No Failures

As of commit f309aea with merge base 26648c2:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

drisspg · 2024-11-19T22:35:17Z

torchao/dtypes/nf4tensor.py

+
+
+@implements_torch_function(torch.Tensor.cuda)
+def function_cuda(*args, **kwargs):


Does `call_from_inner_tensors` not work for this function?

`call_from_inner_tensors()` does not call the method on the `.scaler_mean` and `.nf4` attributes, so I use `__tensor_flatten__` instead.
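The difference can be sketched in plain Python. This is only an illustration, not torchao's code: `Buf`, `ToyNF4`, `apply_to`, and the attribute names `data` and `scales` are hypothetical stand-ins, and device names are plain strings. The one real convention assumed is that `__tensor_flatten__` returns the list of inner tensor attribute names plus a context object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buf:
    """Stand-in for an inner tensor; it only records a device name."""
    device: str = "cpu"

    def to(self, device):
        return Buf(device)


class ToyNF4:
    """Hypothetical wrapper holding several inner buffers (not the real NF4Tensor)."""

    # "data" and "scales" are made-up names; "scaler_mean" and "nf4" come from the thread.
    INNER = ["data", "scales", "scaler_mean", "nf4"]

    def __init__(self, **bufs):
        for name in self.INNER:
            setattr(self, name, bufs[name])

    def __tensor_flatten__(self):
        # Names of every inner buffer, plus a context dict (empty here).
        return list(self.INNER), {}


def apply_to(t, names, fn):
    """Apply fn to the named buffers and copy the rest unchanged."""
    all_names, _ = t.__tensor_flatten__()
    return ToyNF4(**{n: fn(getattr(t, n)) if n in names else getattr(t, n) for n in all_names})


t = ToyNF4(**{n: Buf() for n in ToyNF4.INNER})

# A helper that visits only a subset leaves the other buffers behind.
partial = apply_to(t, ["data", "scales"], lambda b: b.to("cuda"))

# Driving the move from __tensor_flatten__ covers every buffer.
full = apply_to(t, t.__tensor_flatten__()[0], lambda b: b.to("cuda"))
```

In the sketch, `partial.nf4` and `partial.scaler_mean` stay on `"cpu"` while every buffer of `full` ends up on `"cuda"`, which is the kind of mismatch described in the comment above.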

I see. That's another anti-pattern, since `call_from_inner_tensors` is typically applied to all the inner tensors, not just the ones specific to sharding. But this makes sense. Maybe we should update it and add a flag that says whether to ignore sharding.

Otherwise, thank you. Yeah, I really didn't like `.to()` secretly calling dequant, so I think this aligns better with what we've seen in the rest of the library.

Yeah, I also wrote down some of my thoughts in #1310.

To clarify, this PR does not change the behavior of "`.to()` secretly calling dequant" when a dtype is specified. Apart from the fixes for the `.scaler_mean` and `.nf4` attributes, this PR only changes `.cuda()` so that it no longer dequantizes (previously `.cuda()` would propagate to `aten._to_copy`, which dequantizes). This makes it consistent with `.cpu()`, as well as with the general rule of not dequantizing when no dtype is specified.
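The rule described above (move without dequantizing when no dtype is given, dequantize only when a dtype is specified) can be sketched with a toy class. This is a hedged illustration in plain Python, not the `NF4Tensor` implementation: `ToyQuantized`, `Plain`, the single-scale quantization scheme, and the string device and dtype names are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Plain:
    """Stand-in for an ordinary (dequantized) tensor."""
    values: List[float]
    dtype: str
    device: str


@dataclass(frozen=True)
class ToyQuantized:
    """Hypothetical quantized container: integer codes plus one scale."""
    codes: List[int]
    scale: float
    device: str = "cpu"

    def dequantize(self) -> List[float]:
        return [c * self.scale for c in self.codes]

    def to(self, device: Optional[str] = None, dtype: Optional[str] = None):
        target = device or self.device
        if dtype is None:
            # No dtype requested: move only, stay quantized.
            return replace(self, device=target)
        # dtype requested: dequantize and return a plain tensor.
        return Plain(self.dequantize(), dtype, target)

    def cuda(self):
        # Same rule as .to(device) without a dtype, mirroring .cpu().
        return self.to(device="cuda")

    def cpu(self):
        return self.to(device="cpu")


q = ToyQuantized(codes=[1, 2, 3], scale=0.5)
moved = q.cuda()                    # still a ToyQuantized, now on "cuda"
converted = q.to(dtype="float32")   # a Plain object holding dequantized values
```

Routing `.cuda()` and `.cpu()` through the same dtype-less path keeps the two methods symmetric, so a device move alone never changes the representation.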

drisspg · 2024-11-19T22:36:00Z

@ebsmothers Are there any ways to validate this against some tune flows to make sure it doesn't break anything?

initial fixes

bfa1594

facebook-github-bot added the CLA Signed label Nov 19, 2024

run ruff

f309aea

gau-nernst added the topic: bug fix label Nov 19, 2024

gau-nernst marked this pull request as ready for review November 19, 2024 17:38

jerryzh168 requested a review from drisspg November 19, 2024 22:28

drisspg reviewed Nov 19, 2024


drisspg approved these changes Nov 19, 2024
