
RTX 2060 tried everything won't run since 11/28 #1621

Closed
AFOLcast opened this issue Dec 28, 2023 · 16 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@AFOLcast

Read Troubleshoot

[x] I admit that I have read the Troubleshoot before making this issue.

Describe the problem
Started with a clean re-install and followed all the troubleshooting steps. Swap memory is at 44000–60000. Tried with and without the old xformers; the most recent run was with the new xformers. It hangs.

Full Console Log

D:\Fooocus>.\python_embeded\python.exe -s Fooocus\entry_with_update.py
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\entry_with_update.py']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.855
Running on local URL: http://127.0.0.1:7865

To create a public link, set share=True in launch().
Total VRAM 6144 MB, total RAM 16200 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 NVIDIA GeForce RTX 2060 : native
VAE dtype: torch.float32
Using pytorch cross attention
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids', 'cond_stage_model.clip_l.logit_scale'}
Base model loaded: D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.72 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 8128164886135262337
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] cute puppy, fine intricate, elegant, highly detailed, symmetry, sharp focus, majestic, amazing bright colors, radiant light, vivid color, coherent, dazzling, brilliant, colorful, very scientific background, professional, winning, open artistic, deep aesthetic, magical, scenic, thought complex, extremely cool, creative, cinematic, singular, best, real, imagined, dramatic
[Fooocus] Preparing Fooocus text #2 ...
Traceback (most recent call last):
File "D:\Fooocus\Fooocus\modules\async_worker.py", line 806, in worker
handler(task)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\Fooocus\modules\async_worker.py", line 408, in handler
expansion = pipeline.final_expansion(t['task_prompt'], t['task_seed'])
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\Fooocus\extras\expansion.py", line 117, in call
features = self.model.generate(**tokenized_kwargs,
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\generation\utils.py", line 1572, in generate
return self.sample(
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\generation\utils.py", line 2619, in sample
outputs = self(
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\gpt2\modeling_gpt2.py", line 1080, in forward
transformer_outputs = self.transformer(
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\gpt2\modeling_gpt2.py", line 903, in forward
outputs = block(
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\gpt2\modeling_gpt2.py", line 391, in forward
attn_outputs = self.attn(
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\gpt2\modeling_gpt2.py", line 332, in forward
attn_output, attn_weights = self._attn(query, key, value, attention_mask, head_mask)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\gpt2\modeling_gpt2.py", line 202, in _attn
mask_value = torch.full([], mask_value, dtype=attn_weights.dtype).to(attn_weights.device)
RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Total time: 1874.34 seconds
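Aside: on Windows, a CUDA "launch timed out and was terminated" error often means the TDR (Timeout Detection and Recovery) watchdog reset a GPU kernel that ran longer than the allowed delay (2 seconds by default). Whether a custom delay is configured can be checked from cmd; if the query reports no TdrDelay value, the OS default applies:

reg query "HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers" /v TdrDelay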

@mashb1t
Collaborator

mashb1t commented Dec 28, 2023

Same error as in AUTOMATIC1111/stable-diffusion-webui#2144, where one of the solutions was to do exactly what your error output suggests:

For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Can you please add this to your startup command (either directly or in run.bat) and check again?

CUDA_LAUNCH_BLOCKING=1 .\python_embeded\python.exe -s Fooocus\entry_with_update.py

@AFOLcast
Author

AFOLcast commented Dec 28, 2023 via email

@mashb1t
Collaborator

mashb1t commented Dec 28, 2023

Sure, happy to explain.
According to your console log, you start Fooocus by executing this line (either manually or via run.bat) in D:\Fooocus:
.\python_embeded\python.exe -s Fooocus\entry_with_update.py
My proposal is to prefix it with CUDA_LAUNCH_BLOCKING=1, as suggested in the error output you provided, for further debugging and analysis. This may even solve your issue completely, but let's test.

CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

To do so, you can either execute the mentioned command directly in D:\Fooocus or adjust the existing line in your run.bat file.

Hope this explanation helps you understand what it does.

@AFOLcast
Author

Did as you suggested. Maybe too literally. Got this error message.

D:\Fooocus>CUDA_LAUNCH_BLOCKING=1 .\python_embeded\python.exe -s Fooocus\entry_with_update.py
'CUDA_LAUNCH_BLOCKING' is not recognized as an internal or external command,
operable program or batch file.

D:\Fooocus>pause
Press any key to continue . . .

@AFOLcast
Author

Trying it as two statements:

D:\Fooocus>set CUDA_LAUNCH_BLOCKING=1

D:\Fooocus> .\python_embeded\python.exe -s Fooocus\entry_with_update.py
Already up-to-date

@mashb1t
Collaborator

mashb1t commented Dec 28, 2023

Yeah, the syntax I mentioned is for Linux, sorry.
Content of my run.bat file:

set CUDA_LAUNCH_BLOCKING=1
.\python_embeded\python.exe -s Fooocus\entry_with_update.py <args here>
pause
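As an aside, the inline CUDA_LAUNCH_BLOCKING=1 <command> prefix is POSIX shell syntax, which is why cmd rejected it. If you launch from PowerShell instead of cmd, a rough equivalent sketch would be:

$env:CUDA_LAUNCH_BLOCKING = "1"
.\python_embeded\python.exe -s Fooocus\entry_with_update.py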

@AFOLcast
Author

It's running now. I won't know for a little while whether it will bomb out or not. Even with Afterburner it's slow. But I do great work with Fooocus, so I'm REALLY trying to make this happen.

@AFOLcast
Author

Failed. Here's the console:

Microsoft Windows [Version 10.0.22631.2861]
(c) Microsoft Corporation. All rights reserved.

D:\Fooocus>CUDA_LAUNCH_BLOCKING=1
'CUDA_LAUNCH_BLOCKING' is not recognized as an internal or external command,
operable program or batch file.

D:\Fooocus>set CUDA_LAUNCH_BLOCKING=1

D:\Fooocus> .\python_embeded\python.exe -s Fooocus\entry_with_update.py
Already up-to-date
Update succeeded.
[System ARGV] ['Fooocus\entry_with_update.py']
Python 3.10.9 (tags/v3.10.9:1dd9be6, Dec 6 2022, 20:01:21) [MSC v.1934 64 bit (AMD64)]
Fooocus version: 2.1.855
Running on local URL: http://127.0.0.1:7865

To create a public link, set share=True in launch().
Total VRAM 6144 MB, total RAM 16200 MB
Set vram state to: NORMAL_VRAM
Always offload VRAM
Device: cuda:0 NVIDIA GeForce RTX 2060 : native
VAE dtype: torch.float32
Using pytorch cross attention
Refiner unloaded.
model_type EPS
UNet ADM Dimension 2816
Using pytorch attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using pytorch attention in VAE
extra {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale', 'cond_stage_model.clip_g.transformer.text_model.embeddings.position_ids'}
Base model loaded: D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors
Request to load LoRAs [['sd_xl_offset_example-lora_1.0.safetensors', 0.1], ['None', 1.0], ['None', 1.0], ['None', 1.0], ['None', 1.0]] for model [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors].
Loaded LoRA [D:\Fooocus\Fooocus\models\loras\sd_xl_offset_example-lora_1.0.safetensors] for UNet [D:\Fooocus\Fooocus\models\checkpoints\juggernautXL_version6Rundiffusion.safetensors] with 788 keys at weight 0.1.
Fooocus V2 Expansion: Vocab with 642 words.
Fooocus Expansion engine loaded for cuda:0, use_fp16 = True.
Requested to load SDXLClipModel
Requested to load GPT2LMHeadModel
Loading 2 new models
[Fooocus Model Management] Moving model(s) has taken 0.92 seconds
App started successful. Use the app with http://127.0.0.1:7865/ or 127.0.0.1:7865
[Parameters] Adaptive CFG = 7
[Parameters] Sharpness = 2
[Parameters] ADM Scale = 1.5 : 0.8 : 0.3
[Parameters] CFG = 4.0
[Parameters] Seed = 8000631531285694637
[Parameters] Sampler = dpmpp_2m_sde_gpu - karras
[Parameters] Steps = 30 - 15
[Fooocus] Initializing ...
[Fooocus] Loading models ...
Refiner unloaded.
[Fooocus] Processing prompts ...
[Fooocus] Preparing Fooocus text #1 ...
[Prompt Expansion] cute puppy, fine detail, intricate, elegant, dynamic, vibrant color, highly detailed, symmetry, sharp focus, beautiful, divine, professional, ambient light, cute, magical, vivid, artistic, true magic, pure, full background, dramatic, shining, epic, great composition, cinematic, winning, perfect, rational, scenic, lively, novel, atmosphere, best
[Fooocus] Preparing Fooocus text #2 ...
[Prompt Expansion] cute puppy, intricate, elegant, highly detailed, wonderful colors, sweet, sharp focus, symmetry, fine detail, colorful, professional, extremely luxury, stunning, enhanced quality, very inspirational, color, winning, epic, cinematic, amazing, creative, beautiful, pure, attractive, cute, best, light, hopeful, thought, iconic, clear, perfect, luxurious
[Fooocus] Encoding positive #1 ...
[Fooocus Model Management] Moving model(s) has taken 0.28 seconds
Traceback (most recent call last):
File "D:\Fooocus\Fooocus\modules\async_worker.py", line 806, in worker
handler(task)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\Fooocus\modules\async_worker.py", line 415, in handler
t['c'] = pipeline.clip_encode(texts=t['positive'], pool_top_k=t['positive_top_k'])
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\Fooocus\modules\default_pipeline.py", line 190, in clip_encode
cond, pooled = clip_encode_single(final_clip, text)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\Fooocus\Fooocus\modules\default_pipeline.py", line 148, in clip_encode_single
result = clip.encode_from_tokens(tokens, return_pooled=True)
File "D:\Fooocus\Fooocus\ldm_patched\modules\sd.py", line 131, in encode_from_tokens
cond, pooled = self.cond_stage_model.encode_token_weights(tokens)
File "D:\Fooocus\Fooocus\ldm_patched\modules\sdxl_clip.py", line 54, in encode_token_weights
g_out, g_pooled = self.clip_g.encode_token_weights(token_weight_pairs_g)
File "D:\Fooocus\Fooocus\modules\patch_clip.py", line 57, in patched_encode_token_weights
out, pooled = self.encode(to_encode)
File "D:\Fooocus\Fooocus\ldm_patched\modules\sd1_clip.py", line 191, in encode
return self(tokens)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\Fooocus\modules\patch_clip.py", line 143, in patched_SDClipModel_forward
outputs = self.transformer(input_ids=tokens, attention_mask=attention_mask,
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\clip\modeling_clip.py", line 822, in forward
return self.text_model(
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\clip\modeling_clip.py", line 740, in forward
encoder_outputs = self.encoder(
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\clip\modeling_clip.py", line 654, in forward
layer_outputs = encoder_layer(
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\clip\modeling_clip.py", line 393, in forward
hidden_states = self.mlp(hidden_states)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\transformers\models\clip\modeling_clip.py", line 350, in forward
hidden_states = self.fc2(hidden_states)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "D:\Fooocus\python_embeded\lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "D:\Fooocus\Fooocus\ldm_patched\modules\ops.py", line 45, in forward
return torch.nn.functional.linear(input, weight, bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
Total time: 732.56 seconds

@mashb1t
Collaborator

mashb1t commented Dec 28, 2023

This could be a problem with an outdated CUDA version, as you don't seem to be using the one-click-installer files (run.bat etc.).
Which CUDA (11.8 / 12.1 / other) and PyTorch versions are you using?

@AFOLcast
Author

Yes, I am using the run.bat files; I just didn't know to use "set" the first time. How do I check the CUDA and PyTorch versions? I simply did a clean install of the most recent version, and I'm using the most recent NVIDIA driver as well. Hmmm. Photoshop just crapped out saying my GPU is not current. Could that have happened from the set CUDA_LAUNCH_BLOCKING=1 command? Gonna restart. Things are getting wonky.

@mashb1t
Collaborator

mashb1t commented Dec 28, 2023

For me, this can be checked in the folder Fooocus\python_embeded\Lib\site-packages.
There should be a folder torch and, next to it, another folder named torch-2.1.0+cu121.dist-info (torch 2.1.0 & CUDA 12.1).
If this folder does not exist, you might have another version installed and the folder might be named differently.
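The same information can also be printed with the embedded interpreter (a quick sketch, assuming torch imports cleanly); it should output something like 2.1.0+cu121 12.1 NVIDIA GeForce RTX 2060:

.\python_embeded\python.exe -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.get_device_name(0))"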

@AFOLcast
Author

Indeed. This is what I found: torch-2.1.0+cu121.dist-info

@mashb1t added the bug (Something isn't working) and help wanted (Extra attention is needed) labels on Dec 28, 2023
@mashb1t
Collaborator

mashb1t commented Dec 28, 2023

Sorry, I sadly don't have a direct solution to this. Maybe somebody else has additional input.

@AFOLcast
Author

Do you have any idea if setting low VRAM might affect this? Or how to accomplish that?

@mashb1t
Collaborator

mashb1t commented Dec 28, 2023

You can certainly try setting --always-low-vram and running again, but I doubt that this will help. Let's give it a shot!
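A sketch of the adjusted run.bat, assuming the flag is passed straight through to the entry script:

set CUDA_LAUNCH_BLOCKING=1
.\python_embeded\python.exe -s Fooocus\entry_with_update.py --always-low-vram
pause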

@AFOLcast
Author

OK, it's working now. I was always bad at the scientific method; I never test one thing at a time.

I reinstalled the NVIDIA driver.
I checked all my CUDA settings in the NVIDIA control panel. Some of them had changed without my noticing, perhaps from a recent update. I made sure python.exe was set to use only the NVIDIA GPU.
I tried first with the "new" CUDA 12 xformers. Bombed out. Tried with the "old" CUDA 11 xformers. Worked like a champ.

Now, I had done ALL of this and more over the last several weeks, and it never worked before. But it is now working with the latest version, 2.1.855, and the "old" CUDA 11 xformers.

Couldn't be happier.

Marking closed.
