
Vulkan #15

Merged: 10 commits into main on Nov 12, 2024

Conversation

@AsbjornOlling (Contributor)

This PR adds the vulkan feature to the llama-cpp-2 dependency.

It also adds a few vulkan dependencies to the shell.nix environment.

@AsbjornOlling (Contributor, Author)

I can no longer reproduce the build-time issue I had before. Strange, it seems to just work?

I also added the three vulkan dependencies to the default nix derivation's buildInputs, which seems to fix nix compilation as well.
They probably don't all need to be in nativeBuildInputs AND buildInputs, but this seems to work.

@AsbjornOlling (Contributor, Author) commented Nov 5, 2024

ggerganov/llama.cpp#9582
Oh wow...
I hit this issue locally. Luckily it's not an issue in CI.

I will add ulimit -n 2048 to our shell.nix

@AsbjornOlling (Contributor, Author)

Running all tests concurrently crashes in CI as well as on my machine locally. I added --test-threads=1 (i.e. cargo test -- --test-threads=1) to fix this for now.

Now for a strange one: the tests run in the CI runner, but I get different token outputs on the CI runner than I do on my machine locally.

@AsbjornOlling (Contributor, Author) commented Nov 7, 2024

Ok so, I set up the nix derivation to also run the unit tests (inside the nix build sandbox), using a software renderer from Mesa.

It reproduces the same error that we see in GitHub Actions. As in, the exact same nonsense tokens.

Run nix build to reproduce the error locally.

@AsbjornOlling (Contributor, Author)

OOOhkay, I think I have a lead: it has to do with over-allocating the "GPU".

It runs fine and passes both tests if I set with_n_gpu_layers(0).

How do we address this? Can we ask the system how much VRAM is available and estimate how many layers fit?
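
Something like the rough sketch below could work, with some loud caveats: with_n_gpu_layers is the llama-cpp-2 builder call we already use, but available_vram_bytes and the per-layer size are hypothetical placeholders. As far as I know llama-cpp-2 doesn't expose a VRAM query, so that number would have to come from a Vulkan device-memory query or from user configuration.

use llama_cpp_2::model::params::LlamaModelParams;

// Hypothetical helper: there is no VRAM query in llama-cpp-2, so this would
// need to be backed by a Vulkan device-memory query (or a user setting).
fn available_vram_bytes() -> u64 {
    8 * 1024 * 1024 * 1024 // placeholder: pretend we have 8 GiB
}

// Rough estimate of how many layers fit, leaving some headroom for the
// KV cache and other allocations. per_layer_bytes would have to come from
// the model's GGUF metadata or a measurement.
fn estimate_gpu_layers(per_layer_bytes: u64, total_layers: u32) -> u32 {
    let budget = available_vram_bytes().saturating_sub(512 * 1024 * 1024);
    ((budget / per_layer_bytes) as u32).min(total_layers)
}

fn main() {
    // Made-up numbers, just to show the flow.
    let n_gpu_layers = estimate_gpu_layers(200 * 1024 * 1024, 32);
    let _params = LlamaModelParams::default().with_n_gpu_layers(n_gpu_layers);
    // ...then load the model with these params as usual.
}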

@AsbjornOlling (Contributor, Author)

I think that the reason I'm not hitting this locally is just that I have plenty of VRAM for the small models we run on my machine.

@AsbjornOlling (Contributor, Author)

But it's not just a question of trying to load a model that's bigger than the VRAM available.

I tried loading Gemma 2 27B, which is way bigger than the VRAM I have on my laptop, but it just fills my GPU and then offloads the rest to my system RAM. That's using Vulkan with my AMD 7700S.

So maybe the error only shows up with a particular software renderer?

@volesen (Contributor) commented Nov 12, 2024

Maybe we should add the following to Cargo.toml and use no features by default.

[target.'cfg(not(target_os = "macos"))'.dependencies]
llama-cpp-2 = { version = "0.1.83", features = ["vulkan"] }

Setting the vulkan feature requires that a Vulkan SDK is found for macOS builds. That is, features = ["vulkan"] is not ignored on macOS.

However, I am not sure whether that is blocked by rust-lang/cargo#1197

@AsbjornOlling changed the title from "Draft: vulkan" to "Vulkan" on Nov 12, 2024
@emilnorsker (Contributor)

Libraries in the .gdextension file should be updated to reflect the new folder structure as well.

Updated:
[libraries]
linux.debug.x86_64 = "res://../nobodywho/target/debug/libnobodywho.so"
linux.release.x86_64 = "res://../nobodywho/target/release/libnobodywho.so"
windows.debug.x86_64 = "res://../nobodywho/target/debug/nobodywho.dll"
windows.release.x86_64 = "res://../nobodywho/target/release/nobodywho.dll"
macos.debug = "res://../nobodywho/target/debug/libnobodywho.dylib"
macos.release = "res://../nobodywho/target/release/libnobodywho.dylib"
macos.debug.arm64 = "res://../nobodywho/target/debug/libnobodywho.dylib"
macos.release.arm64 = "res://../nobodywho/target/release/libnobodywho.dylib"

otherwise cool, lgtm!

@AsbjornOlling merged commit 75a2c28 into main on Nov 12, 2024. 6 checks passed.