GPU no longer available after update [Intel IPEX] #2261

Closed
bvhari opened this issue Apr 11, 2024 · 3 comments
Labels: help wanted, Troubleshooting

Comments

@bvhari

bvhari commented Apr 11, 2024

I get this warning when I run ./gui.sh --use-ipex:
WARNING Torch reports GPU not available

I already tried running ./setup.sh --use-ipex

If I manually activate the venv, open Python, and import intel_extension_for_pytorch, it loads without any issue and my GPU is visible in torch.xpu.get_device_properties(0).
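The manual check described above can be sketched as a small script run inside the activated venv. This is only an illustration: the helper name `xpu_report` is hypothetical, and it assumes `torch.xpu` exposes `is_available()` and `device_count()` alongside the `get_device_properties()` call mentioned in the report. Import failures are caught so that a broken install produces a readable message instead of a traceback.

```python
import importlib.util


def xpu_report() -> dict:
    """Collect basic facts about XPU visibility without raising.

    Hypothetical helper for illustration; not part of kohya_ss.
    """
    report = {
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "ipex_installed": importlib.util.find_spec("intel_extension_for_pytorch")
        is not None,
        "xpu_available": False,
        "devices": [],
        "error": None,
    }
    if not (report["torch_installed"] and report["ipex_installed"]):
        return report

    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401  (import is needed for its side effects)

        if hasattr(torch, "xpu") and torch.xpu.is_available():
            report["xpu_available"] = True
            for index in range(torch.xpu.device_count()):
                report["devices"].append(str(torch.xpu.get_device_properties(index)))
    except (ImportError, OSError, RuntimeError) as exc:
        # An "undefined symbol" failure like the one below surfaces as ImportError.
        report["error"] = f"{type(exc).__name__}: {exc}"
    return report


if __name__ == "__main__":
    for key, value in xpu_report().items():
        print(f"{key}: {value}")
```

A passing result here only describes the libraries this particular Python process loaded, so it does not prove that gui.sh will see the same libraries.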

Everything was working fine before the recent update, which installed a different version of intel_extension_for_pytorch.

If I ignore the warning and run a LoRA training anyway, I get this error:

Traceback (most recent call last):
  File "kohya_ss/venv/bin/accelerate", line 5, in <module>
    from accelerate.commands.accelerate_cli import main
  File "kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 23, in <module>
    from accelerate.commands.test import test_command_parser
  File "kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/test.py", line 20, in <module>
    from accelerate.test_utils import execute_subprocess_async
  File "kohya_ss/venv/lib/python3.10/site-packages/accelerate/test_utils/__init__.py", line 23, in <module>
    from .scripts import test_script, test_sync, test_ops  # isort: skip
  File "kohya_ss/venv/lib/python3.10/site-packages/accelerate/test_utils/scripts/test_script.py", line 45, in <module>
    if is_xpu_available():
  File "kohya_ss/venv/lib/python3.10/site-packages/accelerate/utils/imports.py", line 290, in is_xpu_available
    import intel_extension_for_pytorch  # noqa: F401
  File "kohya_ss/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/__init__.py", line 94, in <module>
    from .utils._proxy_module import *
  File "kohya_ss/venv/lib/python3.10/site-packages/intel_extension_for_pytorch/utils/_proxy_module.py", line 2, in <module>
    import intel_extension_for_pytorch._C
ImportError: kohya_ss/venv/lib/libmkl_sycl_data_fitting.so.4: undefined symbol: _ZN4sycl3_V17handler19getMaxWorkGroups_v2Ev
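The missing symbol in the last line is a C++ name mangled under the Itanium ABI. Decoding it shows which function libmkl_sycl_data_fitting.so.4 expected to find, which hints at a mismatch between that MKL library and the SYCL runtime that was actually loaded. The sketch below is a deliberately minimal decoder that handles only plain nested names of the form `_ZN<len><name>...E`; the function name `demangle_nested_name` is hypothetical, and a real tool such as `c++filt` covers far more cases.

```python
def demangle_nested_name(symbol: str) -> str:
    """Decode a simple Itanium-mangled nested name into a::b::c form.

    Only handles `_ZN` followed by length-prefixed identifiers and a
    closing `E`; the parameter encoding after `E` is ignored.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError(f"not a nested Itanium-mangled name: {symbol!r}")

    pos = 3  # skip the "_ZN" prefix
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        # Read the decimal length prefix of the next identifier.
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])
        name = symbol[end:end + length]
        if len(name) != length:
            raise ValueError("truncated identifier in mangled name")
        parts.append(name)
        pos = end + length

    if not parts or pos >= len(symbol) or symbol[pos] != "E":
        raise ValueError("unsupported mangling (expected closing 'E')")
    return "::".join(parts)


if __name__ == "__main__":
    print(demangle_nested_name("_ZN4sycl3_V17handler19getMaxWorkGroups_v2Ev"))
    # prints: sycl::_V1::handler::getMaxWorkGroups_v2
```

The decoded name is a member of the SYCL `handler` class, so the library that failed to resolve it is the SYCL runtime rather than MKL itself.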

Specs: Intel ARC 770 LE, Windows 10, WSL2, Ubuntu 22.04.3, Python 3.10.12

bmaltais added the help wanted and Troubleshooting labels Apr 11, 2024
@bmaltais
Owner

I can't test and fix this as I don't have this GPU. Someone with such a system will need to figure out the solution...

@Disty0
Contributor

Disty0 commented Apr 11, 2024

Remove the venv folder. This was noted in the PR notes: #2181

You can also use the DISABLE_VENV_LIBS=1 environment variable to force gui.sh to use the system OneAPI instead of the libraries in the venv.
Manually activating the venv won't activate the MKL / DPCPP in the venv, so it will fall back to the system OneAPI (if it exists and is activated).
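As a rough sketch of the environment-variable workaround, the snippet below launches gui.sh with DISABLE_VENV_LIBS=1 set for that process only. The helper names `build_launch` and `launch` are hypothetical; the variable name and the `--use-ipex` flag are taken from this thread. In a shell the equivalent is `DISABLE_VENV_LIBS=1 ./gui.sh --use-ipex`.

```python
import os
import subprocess


def build_launch():
    """Return the command and environment for starting the GUI.

    Hypothetical helper: copies the current environment and adds the
    DISABLE_VENV_LIBS=1 variable mentioned above.
    """
    env = dict(os.environ)
    env["DISABLE_VENV_LIBS"] = "1"
    command = ["./gui.sh", "--use-ipex"]
    return command, env


def launch(repo_dir: str) -> int:
    """Run gui.sh from the given kohya_ss checkout and return its exit code."""
    command, env = build_launch()
    completed = subprocess.run(command, cwd=repo_dir, env=env, check=False)
    return completed.returncode
```

Setting the variable per process, rather than exporting it globally, keeps the default venv-isolated behaviour for every other invocation.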

Side note: Kohya SS doesn't use the system OneAPI, so you can remove it and reclaim 16 GB of space if you want.
We only install the bits we need in the venv. This method uses 2.5 GB instead of 16 GB, and it lets us isolate the MKL / DPCPP versions from the rest of the system.

@bvhari
Author

bvhari commented Apr 12, 2024

@Disty0 Tried on a fresh WSL2 setup and it is working now
Thanks!

@bvhari bvhari closed this as completed Apr 12, 2024