
ollama: oneapi version support? #327999

Open
KenMacD opened this issue Jul 17, 2024 · 7 comments
Labels
0.kind: enhancement Add something new

Comments

@KenMacD
Contributor

KenMacD commented Jul 17, 2024

I see the ollama package has CUDA and ROCm variants. Ollama now appears to support a oneAPI variant as well. From some tests, it appears this would require access to the libze_intel_gpu.so library, which I see in intel-compute-runtime.drivers.

I tested setting OLLAMA_INTEL_GPU=1 (with OLLAMA_DEBUG=1) in the module settings and see the following logs:

ollama[x]: time=x level=DEBUG source=gpu.go:488 msg="discovered GPU libraries" paths=[]
ollama[x]: time=x level=DEBUG source=gpu.go:435 msg="Searching for GPU library" name=libze_intel_gpu.so
ollama[x]: time=x level=DEBUG source=gpu.go:454 msg="gpu library search" globs="[/var/lib/private/ollama/libze_intel_gpu.so* /usr/lib/x86_64-linux-gnu/libze_intel_gpu.so* /usr/lib*/libze_intel_gpu.so*]"
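For reference, a minimal NixOS sketch of the configuration being tested above (untested assumption: the `drivers` output of `intel-compute-runtime` contains `libze_intel_gpu.so`, and the loader search path is an acceptable way to expose it, since ollama only globs a few fixed paths):

```nix
{ pkgs, ... }:
{
  services.ollama = {
    enable = true;
    environmentVariables = {
      OLLAMA_INTEL_GPU = "1"; # ask ollama to probe for Intel GPUs
      OLLAMA_DEBUG = "1";     # emit the library-search logs shown above
      # ollama's globs don't cover the Nix store, so expose the compute
      # runtime's driver directory via the dynamic loader search path.
      LD_LIBRARY_PATH = "${pkgs.intel-compute-runtime.drivers}/lib";
    };
  };
}
```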

It'd be nice to see Intel/oneAPI/SYCL support added.

@KenMacD KenMacD added the 0.kind: bug Something is broken label Jul 17, 2024
@abysssol abysssol added 0.kind: enhancement Add something new and removed 0.kind: bug Something is broken labels Jul 17, 2024
@abysssol
Contributor

I agree that oneAPI support seems like a good idea, especially given the (admittedly unlikely) possibility of wider industry adoption by AMD and Nvidia. However, I don't have access to a supported device for testing, oneAPI support is quite new in ollama (and possibly still incomplete or buggy), and I have no experience with oneAPI yet. So I wouldn't expect oneAPI support in nix particularly soon; it'll probably be at least a couple of weeks.

If you can help with testing, or want to open a pull request, I'd be happy to work together with you to add support sooner than I'd be able to do myself.

@MordragT

I have it packaged for oneAPI here: https://github.com/MordragT/nixos/blob/master/pkgs/by-name/ollama-sycl/default.nix
I'm pretty sure that, if you want to compile from source, you will need either the proprietary DPC++ compiler or at least Intel's open-source LLVM SYCL compiler. My package uses the former. However, I wanted to try out the open-source LLVM SYCL compiler before attempting to upstream it.

@RonnyPfannschmidt
Contributor

Is there an easy way to pull this package in for custom flakes? I'd like to get GPU speedup on my system, as CPU is just too slow for sensible usage.

@MordragT

MordragT commented Oct 7, 2024

Is there an easy way to pull this package in for custom flakes? I'd like to get GPU speedup on my system, as CPU is just too slow for sensible usage.

Yes, I provide an overlay, so in theory you should just be able to do:

inputs.ollama.url = "github:MordragT/nixos";

...

import nixpkgs { ... overlays = [ ollama.overlays.default ]; };

...
  services.ollama = {
    enable = true;
    package = pkgs.ollama-sycl;
    environmentVariables = {
      OLLAMA_INTEL_GPU = "1";
    };
    loadModels = [
    ];
  };
  systemd.services.ollama.serviceConfig.MemoryDenyWriteExecute = lib.mkForce false;
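Putting the fragments above together, a fuller flake sketch might look like this (an untested assembly: the hostname `myhost` is hypothetical, and it assumes the overlay exposes `pkgs.ollama-sycl` as described):

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    ollama.url = "github:MordragT/nixos";
  };

  outputs = { self, nixpkgs, ollama, ... }: {
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        ({ pkgs, lib, ... }: {
          # the overlay adds pkgs.ollama-sycl to the package set
          nixpkgs.overlays = [ ollama.overlays.default ];

          services.ollama = {
            enable = true;
            package = pkgs.ollama-sycl;
            environmentVariables.OLLAMA_INTEL_GPU = "1";
          };

          # the SYCL runner needs writable+executable memory mappings,
          # so relax this systemd hardening option
          systemd.services.ollama.serviceConfig.MemoryDenyWriteExecute =
            lib.mkForce false;
        })
      ];
    };
  };
}
```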

@RonnyPfannschmidt
Contributor

Thanks, I'll try to integrate that with my home modules.

@jiriks74
Contributor

jiriks74 commented Dec 9, 2024

Hello,
as of now I own a laptop with an Intel Arc A370M dGPU. I could try to test some things for you (if the dGPU will work properly...).

@peigongdsd
Contributor

Sadly this is not working for me; here is the log:

Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama4065539553/runners/oneapi/ollama_llama_server
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [oneapi]"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=DEBUG source=payload.go:45 msg="Override detection logic by setting OLLAMA_LLM_LIBRARY"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=DEBUG source=sched.go:105 msg="starting llm scheduler"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=INFO source=gpu.go:200 msg="looking for compatible GPUs"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=WARN source=gpu.go:669 msg="unable to locate gpu dependency libraries"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=DEBUG source=gpu.go:86 msg="searching for GPU discovery libraries for NVIDIA"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=DEBUG source=gpu.go:468 msg="Searching for GPU library" name=libcuda.so*
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=WARN source=gpu.go:669 msg="unable to locate gpu dependency libraries"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.908+08:00 level=DEBUG source=gpu.go:491 msg="gpu library search" globs="[libcuda.so* /run/opengl-driver/lib/libcuda.so* /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=DEBUG source=gpu.go:525 msg="discovered GPU libraries" paths=[]
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=DEBUG source=gpu.go:468 msg="Searching for GPU library" name=libcudart.so*
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=WARN source=gpu.go:669 msg="unable to locate gpu dependency libraries"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=DEBUG source=gpu.go:491 msg="gpu library search" globs="[libcudart.so* /run/opengl-driver/lib/libcudart.so* /tmp/ollama4065539553/runners/cuda*/libcudart.so* /usr/local/cuda/lib64/libcudart.so* /usr/lib/x86_64-linux-gnu/nvidia/current/libcudart.so* /usr/lib/x86_64-linux-gnu/libcudart.so* /usr/lib/wsl/lib/libcudart.so* /usr/lib/wsl/drivers/*/libcudart.so* /opt/cuda/lib64/libcudart.so* /usr/local/cuda*/targets/aarch64-linux/lib/libcudart.so* /usr/lib/aarch64-linux-gnu/nvidia/current/libcudart.so* /usr/lib/aarch64-linux-gnu/libcudart.so* /usr/local/cuda/lib*/libcudart.so* /usr/lib*/libcudart.so* /usr/local/lib*/libcudart.so*]"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=DEBUG source=gpu.go:525 msg="discovered GPU libraries" paths=[]
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=DEBUG source=gpu.go:468 msg="Searching for GPU library" name=libze_intel_gpu.so*
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=WARN source=gpu.go:669 msg="unable to locate gpu dependency libraries"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=DEBUG source=gpu.go:491 msg="gpu library search" globs="[libze_intel_gpu.so* /run/opengl-driver/lib/libze_intel_gpu.so* /usr/lib/x86_64-linux-gnu/libze_intel_gpu.so* /usr/lib*/libze_intel_gpu.so*]"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=DEBUG source=gpu.go:525 msg="discovered GPU libraries" paths=[]
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=DEBUG source=amd_linux.go:371 msg="amdgpu driver not detected /sys/module/amdgpu"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=INFO source=gpu.go:347 msg="no compatible GPUs were discovered"
Dec 12 18:33:17 ThinkBookX ollama[316098]: time=2024-12-12T18:33:17.909+08:00 level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="31.0 GiB" available="19.7 GiB"

Looks like there are still libraries missing.
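A quick way to check whether the driver library is where ollama's globs expect it (a sketch; the glob paths are taken from the log above, and the suggestion to use `hardware.graphics.extraPackages` is my assumption about how `/run/opengl-driver/lib` gets populated on NixOS):

```shell
# Check whether the Level Zero GPU driver library exists in the one
# NixOS-relevant path from ollama's search globs.
found=$(ls /run/opengl-driver/lib/libze_intel_gpu.so* 2>/dev/null)
if [ -n "$found" ]; then
  echo "found: $found"
else
  echo "libze_intel_gpu.so not found in /run/opengl-driver/lib"
fi
```

If it is missing, adding `intel-compute-runtime` to `hardware.graphics.extraPackages` may populate that directory, though I haven't verified this with ollama's oneAPI runner.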
