
DC-AE Sana produces NaN values after decoding the latent obtained from encoding an image #96

Open
ZouYa99 opened this issue Dec 18, 2024 · 6 comments
Labels
bug (Something isn't working)

Comments

@ZouYa99

ZouYa99 commented Dec 18, 2024

I ran the following code, but the "output" is all NaN values. I'm confused. Can you help me?

import torch
import torchvision.transforms as transforms
from PIL import Image
from torchvision.utils import save_image

from efficientvit.ae_model_zoo import DCAE_HF
from efficientvit.apps.utils.image import DMCrop

dc_ae = DCAE_HF.from_pretrained("/data/dc-ae-f32c32-sana-1.0")

device = torch.device("cuda")
dc_ae = dc_ae.to(device).eval()

transform = transforms.Compose([
    DMCrop(512), # resolution
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])
image = Image.open("assets/fig/girl.png")
x = transform(image)[None].to(device)
latent = dc_ae.encode(x)
output = dc_ae.decode(latent)
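For what it's worth, a quick check with torch.isnan can narrow down whether the NaNs first appear in the input, the encoder output, or the decoder output. This is just a debugging sketch (count_nans is a helper I made up, not part of efficientvit); with the setup above you would call it on x, latent, and output in turn:

```python
import torch

def count_nans(name: str, t: torch.Tensor) -> int:
    """Print and return how many NaN entries a tensor contains."""
    n = int(torch.isnan(t).sum())
    print(f"{name}: {n} NaNs out of {t.numel()} elements")
    return n

# With the snippet above, run:
#   count_nans("input x", x)
#   count_nans("latent", latent)
#   count_nans("output", output)
# The first stage with a nonzero count is where the problem starts.

# Self-contained toy demonstration:
t = torch.tensor([0.0, float("nan"), 1.0])
count_nans("toy", t)  # one NaN entry
```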
@lawrence-cj
Collaborator

Hi, can you check if dc_ae.dtype is FP32 or BF16? @ZouYa99

@ZouYa99
Author

ZouYa99 commented Dec 20, 2024

It is torch.float32. @lawrence-cj

@lawrence-cj
Collaborator

@chenjy2003 Junyu, can you help here?

lawrence-cj added the bug ("Something isn't working") label on Dec 20, 2024.
@chenjy2003
Contributor

Hi @ZouYa99, I ran the same code (except replacing /data/dc-ae-f32c32-sana-1.0 with mit-han-lab/dc-ae-f32c32-sana-1.0), and the result is normal. I'm not sure whether something is wrong with your local checkpoint. Could you please elaborate on how you downloaded the weights and check that they are correct?
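One way to check whether the local checkpoint is corrupted is to scan the loaded model for non-finite weights. This is only a sketch (check_weights is a hypothetical helper, not part of efficientvit or diffusers); with the original snippet you would call check_weights(dc_ae), and an empty result means the weights themselves look fine:

```python
import torch

def check_weights(model: torch.nn.Module) -> list[str]:
    """Return the names of floating-point tensors in the state dict
    that contain NaN or Inf values."""
    bad = []
    for name, t in model.state_dict().items():
        if t.is_floating_point() and not torch.isfinite(t).all():
            bad.append(name)
    return bad

# Toy demonstration: corrupt one weight and confirm it is reported.
lin = torch.nn.Linear(2, 2)
with torch.no_grad():
    lin.weight[0, 0] = float("nan")
print(check_weights(lin))  # the corrupted "weight" tensor is flagged
```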

@ZouYa99
Author

ZouYa99 commented Dec 24, 2024

I'm sorry to say that my machine cannot fetch "mit-han-lab/dc-ae-f32c32-sana-1.0" directly because of network restrictions, so I downloaded the model again from https://huggingface.co/mit-han-lab/dc-ae-f32c32-sana-1.0. But the NaN problem persists.

@chenjy2003
Contributor

@ZouYa99 I'm still not sure about the cause of this NaN issue. Maybe something is wrong with the environment. Since our models have been merged into diffusers, I would recommend giving that a try. Please upgrade diffusers first with pip install -U diffusers.

from PIL import Image
import torch
import torchvision.transforms as transforms
from torchvision.utils import save_image
from diffusers import AutoencoderDC

device = torch.device("cuda")
dc_ae: AutoencoderDC = AutoencoderDC.from_pretrained("mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers", torch_dtype=torch.float32).to(device).eval()

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(0.5, 0.5),
])

image = Image.open("assets/fig/girl.png")
x = transform(image)[None].to(device)
latent = dc_ae.encode(x).latent
y = dc_ae.decode(latent).sample
save_image(y * 0.5 + 0.5, "demo_dc_ae.png")
