
mps not supported now on 2.1.66 #690

Closed
vicento opened this issue Oct 15, 2023 · 51 comments
Labels
bug Something isn't working

Comments

@vicento commented Oct 15, 2023

Device mps:0 does not support the torch.fft functions used in the FreeU node, switching to CPU.
How can i get full MPS support on my silicon mac ?

@lllyasviel (Owner)

Does that appear only when you use FreeU, or do you always see it?

@vicento (Author) commented Oct 15, 2023

If I use FreeU I get this:
Device mps:0 does not support the torch.fft functions used in the FreeU node, switching to CPU.
Without FreeU I don't see the MPS error. I am on 2.1.675 now.

@guoreex commented Oct 17, 2023

macOS 14.0
MacBook Pro M1, 16 GB RAM

[Fooocus Model Management] Moving model(s) has taken 61.51 seconds
[Sampler] Fooocus sampler is activated.
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
0%| | 0/30 [00:00<?, ?it/s]/Volumes/1TSSD/AI-project/fooocus/Fooocus/modules/anisotropic.py:132: UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.)
s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)
3%|█▍ | 1/30 [01:40<48:41, 100.73s/it]

@Gitterman69

App started successful. Use the app with http://127.0.0.1:7860/ or 127.0.0.1:7860
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 7.0
[Parameters] Seed = 4730616638981956459
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 20
[Fooocus] Initializing ...
[Fooocus] Loading models ...
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] New suffix: intricate, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, unreal engine 5, 8 k, art by artgerm and greg rutkowski and alphonse mucha
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] New suffix: extremely detailed, artstation, 8 k, sensual lighting, incredible art, wlop, artgerm
[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
Preparation time: 3.79 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.02916753850877285, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 60.01 seconds
[Sampler] Fooocus sampler is activated.
0%| | 0/30 [00:00<?, ?it/s]/Applications/Fooocus/modules/anisotropic.py:132: UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.)
s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)
7%|██▋ | 2/30 [04:50<1:07:49, 145.34s/it]
User stopped
Total time: 355.80 seconds

@Gitterman69

if i use freeU i get this : Device mps:0 does not support the torch.fft functions used in the FreeU node, switching to CPU. without freeU doesn't see mps error i am on 2.1.675 now

same here :( i want to use fooocus on my macbook m1 so bad :)

@Gitterman69

yes - this is so sad :( @lllyasviel heeeeeeeeeelp

@Gitterman69

bump

@tiancool

Watching this, same problem here.

@Gitterman69

bump

@none2serious

same.

@waterydan commented Nov 29, 2023

Same here, Fooocus v2.1.8241.
Looks like PyTorch has added support: pytorch/pytorch#110829

@robpal commented Nov 30, 2023

bump

@Tygrha commented Nov 30, 2023

bump

@Tygrha commented Nov 30, 2023

[Fooocus] Encoding positive #1 ...
[Fooocus] Encoding positive #2 ...
[Fooocus] Encoding negative #1 ...
[Fooocus] Encoding negative #2 ...
[Parameters] Denoising Strength = 1.0
[Parameters] Initial Latent shape: Image Space (1152, 896)
Preparation time: 20.42 seconds
[Sampler] refiner_swap_method = joint
[Sampler] sigma_min = 0.0291671771556139, sigma_max = 14.614643096923828
Requested to load SDXL
Loading 1 new model
[Fooocus Model Management] Moving model(s) has taken 113.59 seconds
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
0%| | 0/30 [00:00<?, ?it/s]/Users/ty/Desktop/Fooocus-main/modules/anisotropic.py:132: UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.)
s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)
7%|██████ | 2/30 [06:31<1:30:11, 193.25s/it]

@ambraxia

The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU

The same error

193.25s/it

The same speed

Bump

@Tygrha commented Nov 30, 2023

If my understanding is correct, this looks like a compatibility issue between PyTorch and Apple silicon.

Does anyone know whether there is a way to adjust Fooocus so that it avoids 'aten::std_mean.correction' and uses some other calculation that is supported?
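For anyone who wants to experiment, here is a minimal sketch of the kind of change that would take, assuming the call site in modules/anisotropic.py shown in the tracebacks above. Computing std and mean as two separate reductions instead of the fused torch.std_mean avoids dispatching aten::std_mean; whether the separate ops run natively on MPS still depends on your PyTorch build, so treat this as a hypothetical workaround, not a tested patch.

```python
import torch

def std_mean_fallback(g: torch.Tensor):
    # Hypothetical drop-in for the fused call in modules/anisotropic.py:
    #   s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)
    # Two separate reductions avoid the single aten::std_mean op.
    # torch.std defaults to correction=1, matching torch.std_mean.
    m = torch.mean(g, dim=(1, 2, 3), keepdim=True)
    s = torch.std(g, dim=(1, 2, 3), keepdim=True)
    return s, m
```

On CPU the two forms produce identical results, so the change only affects which kernel is dispatched.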

@dch09 commented Dec 1, 2023

Bump, I'd love to get an update on that.

@emilmammadov

any update?

@muratdemirkiran33

any update?

@Gitterman69

bumpedy bump bump - can we haz some mac arm love plz

@gregplumbly

UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/_temp/anaconda/conda-bld/pytorch_1701416305940/work/aten/src/ATen/mps/MPSFallback.mm:13.)
s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)

@Kolch commented Dec 4, 2023

same here

@BoomerCorp

Same on Intel iMac 2020 with AMD Graphics?

@winnduu commented Dec 5, 2023

MacBook M3 Pro suffers from the same problem, so I guess the whole lineup of Apple silicon chips is affected.

@amediantsev commented Dec 7, 2023

Same on MacBook Pro Apple M2 Pro, macOS Sonoma Version 14.1.

... UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.) ...

@bhardwajshash

Same issue! Bump

@semyonf commented Dec 18, 2023

Same

@oscody commented Dec 20, 2023

same

@Dkray commented Dec 21, 2023

same
M2 Pro, Fooocus version: 2.1.854

@Gitterman69 commented Dec 21, 2023 via email

@Dkray commented Dec 22, 2023

The most interesting thing is that PyTorch works correctly. The test script from the Apple site produces the correct result: tensor([1.], device='mps:0')
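For reference, the check in question is roughly the following, a sketch of the snippet from Apple's "Accelerated PyTorch training on Mac" page, with a CPU fallback added here so it runs on any machine:

```python
import torch

# Pick the MPS device when the backend is available; otherwise fall
# back to CPU so the snippet still runs on non-Apple hardware.
device = "mps" if torch.backends.mps.is_available() else "cpu"
x = torch.ones(1, device=device)
print(x)  # on Apple silicon: tensor([1.], device='mps:0')
```

Passing this check only means basic tensor ops work on MPS; it says nothing about individual operators like aten::std_mean or torch.fft, which is why the fallback warnings still appear.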

@GuiFV commented Dec 23, 2023

up

@SimcoeHops

Experiencing the same "Device mps:0 does not support the torch.fft functions used in the FreeU node, switching to CPU" even though a test script's output indicates Metal is installed with GPU support (tensor([1.], device='mps:0')). I'm running ComfyUI on an M2 MBP, 96 GB, etc.

@igm503 commented Dec 24, 2023

Unfortunately, implementing fft on the MPS backend is pretty difficult.

I think the reason no one has taken up adding fft support to the MPS backend is a lack of support from Apple. I could be wrong about this, but my understanding is that CUDA has fft built in or available via a library, whereas the Fourier-transform functionality the relevant Apple libraries have for the M-family GPUs is relatively limited as of now, so an implementation in PyTorch might require a full or near-full custom fft algorithm. I don't see why it couldn't be done, but it'll take a hero.
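In the meantime, the practical workaround is the CPU round-trip that the "switching to CPU" message describes. Sketched explicitly as a hypothetical helper (not Fooocus's actual code): attempt the native transform, and on devices where torch.fft is unsupported, run it on CPU and move the result back.

```python
import torch

def fft2_with_cpu_fallback(x: torch.Tensor) -> torch.Tensor:
    # Try the native transform first; on devices without torch.fft
    # support (e.g. mps), fall back to CPU and move the result back.
    try:
        return torch.fft.fft2(x)
    except Exception:
        return torch.fft.fft2(x.cpu()).to(x.device)
```

The device-to-host-to-device copies are exactly why FreeU gets so much slower on Macs when this path is taken.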

@dfl commented Dec 29, 2023

@igm503 I wonder if VkFFT would fit the bill?
c.f. https://news.ycombinator.com/item?id=36968582

@mashb1t added the labels bug (Something isn't working) and help wanted (Extra attention is needed) on Dec 31, 2023
@schuster-rainer

On the latest version of Fooocus performance improved a lot. I'm running with:

PYTORCH_ENABLE_MPS_FALLBACK=1 python webui.py --disable-offload-from-vram

as I have an M1 with 64 GB. My iterations with LCM are ~1.25 per second for a 768x1344 image. Non-LCM renders take roughly double the time.

@Dkray commented Jan 17, 2024

on the latest version of fooocus performance improved a lot. I'm running with:

PYTORCH_ENABLE_MPS_FALLBACK=1 python webui.py --disable-offload-from-vram

as I have a M1 with 64GB. my iterations with LCM are ~1.25 per second for a 768x11344 image. Non LCM renders take roughly double the time.

my result is - UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.)
s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)
3%|█▎ | 1/30 [03:40<1:46:36, 220.57s

@schuster-rainer commented Jan 17, 2024

on the latest version of fooocus performance improved a lot. I'm running with:
PYTORCH_ENABLE_MPS_FALLBACK=1 python webui.py --disable-offload-from-vram
as I have a M1 with 64GB. my iterations with LCM are ~1.25 per second for a 768x11344 image. Non LCM renders take roughly double the time.

my result is - UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.) s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True) 3%|█▎ | 1/30 [03:40<1:46:36, 220.57s

That's correct; hence the fallback variable. I'm running an M1 Max on Sonoma (macOS 14). If you have an older OS or a slower CPU it might take longer. If you're on a MacBook, make sure you're not in energy-saving mode, and check whether you've overridden the sampler in 'Developer Debug Mode'; samplers are slower or faster depending on which one you're using.

@Dkray commented Jan 17, 2024

I'm running a M1 Max on Sonoma (MacOS 14)

M2 Pro on Sonoma )

@Tiago-Zarzavidjian

on the latest version of fooocus performance improved a lot. I'm running with:
PYTORCH_ENABLE_MPS_FALLBACK=1 python webui.py --disable-offload-from-vram
as I have a M1 with 64GB. my iterations with LCM are ~1.25 per second for a 768x11344 image. Non LCM renders take roughly double the time.

my result is - UserWarning: The operator 'aten::std_mean.correction' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.) s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True) 3%|█▎ | 1/30 [03:40<1:46:36, 220.57s

I'm getting this exact error on my M3 Pro on Sonoma 14.2.1

@igm503 commented Jan 23, 2024

Would need to look into it, but adding aten::std_mean to the MPS backend doesn't seem too difficult, if it isn't already there in a nightly.

@vicento (Author) commented Jan 24, 2024

I'd be happy to see someone do it. It's not difficult for those who know how.

@hike2008

up

@legz commented Feb 18, 2024

Same here on Mac mini M2 / Sonoma 14.2.1

@igm503 commented Feb 19, 2024

FYI, the warning about CPU fallback for aten::std_mean isn't an error; if that's all you're seeing, the program is running fine, just more slowly than it would if that op were implemented in the PyTorch MPS backend.

I'm going to check it out tomorrow and see if I have time to add support for it/if it's already supported in a nightly.

Also FYI, if y'all are still encountering problems with torch.fft support on the MPS device, there's a pull request in the pytorch repo right now that will add more support for fft functions on the macbooks, so the next pytorch nightly releases might be worth trying.

@igm503 commented Feb 19, 2024

aten::std_mean is indeed in the latest nightly version of pytorch. Not sure when it'll make it into a release version.
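To check whether your own build has it, here is a quick probe; treat it as a sketch. Nightlies install via the command shown on pytorch.org (e.g. pip3 install --pre torch --index-url https://download.pytorch.org/whl/nightly/cpu), and with PYTORCH_ENABLE_MPS_FALLBACK=1 set, an unsupported op emits the familiar UserWarning rather than erroring out.

```python
import warnings
import torch

# Probe whether this PyTorch build runs std_mean on MPS without the
# CPU-fallback UserWarning seen in the logs above. Falls back to CPU
# on non-Apple machines so the snippet runs anywhere.
device = "mps" if torch.backends.mps.is_available() else "cpu"
g = torch.randn(1, 3, 8, 8, device=device)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    s, m = torch.std_mean(g, dim=(1, 2, 3), keepdim=True)
fell_back = any("MPS backend" in str(w.message) for w in caught)
print(torch.__version__, device, "cpu fallback:", fell_back)
```

If it prints "cpu fallback: False" on an MPS device, your build dispatches the op natively.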

@legz commented Feb 21, 2024

FYI, the warning about cpu fallback for aten::std_mean isn't an error--if that's all you're seeing, then the program is running fine, just slower than it would be if that op was implemented for the PyTorch MPS backend.

You're absolutely right, but 30+ minutes per image on an M2 with 16 GB is not really usable.
Thanks for the additional info about the nightly version of PyTorch!

@mashb1t removed the help wanted (Extra attention is needed) label on Feb 22, 2024
@mashb1t modified the milestone: 2.3.0 (draft) on Mar 10, 2024
@whz739723619

same

@Gitterman69

any news?!

@mashb1t (Collaborator) commented Jun 4, 2024

@Gitterman69 in general the MPS backend is supported (beta), but not all functions are optimized. Please find the instructions on how to install Fooocus for macOS / the MPS backend at https://github.com/lllyasviel/Fooocus?tab=readme-ov-file#mac and discuss problems in #129

@mashb1t closed this as completed on Jul 26, 2024
@medienbueroleipzig commented Nov 10, 2024

The operator is simply not yet supported by Apple; that's all. You can tinker with it as much as you want: --disable-offload-from-vram only shortens model loading and has no effect on the missing support for the PyTorch operator aten::std_mean.correction.
Read more here: http://medienbüro-leipzig.de/index.php?action=faq&cat=10&id=24&artlang=en
