Currently, Windows and Linux x64 prebuilds with CUDA are available. It would be nice to support more backends and arch/platform combinations, but also to leave an escape hatch for anybody wanting to do builds themselves, for greater flexibility.
Some of the difficulties I know of are:
- CUDA prebuilds are huge and can easily hit the npm package size limit.
- Total size is multiplied by the number of platform/arch combinations we want to ship (I think downloading on install alleviates this, but the downloads are still pretty large?)
- It looks like separate Vulkan and CUDA builds are necessary as well, which would multiply the number of prebuilds again (but fortunately Vulkan binaries are not that large).
- Upstream changes frequently and prebuilds might be permanently out of date.
@giladgd created a very thoughtful (best of all worlds?) solution for that over at withcatai/node-llama-cpp - could take inspiration from it. Maybe you're interested in collaborating or sharing some wisdom on this, Gilad?
The gist of it, as far as I've read:
- node-llama-cpp publishes several separate prebuild packages, each containing platform/arch conditionals in its package.json. Backends are split into different packages as well (see the package.json sketch after this list).
- At runtime, the user decides which prebuild - or custom build - should be require'd (see the loader sketch below).
- For custom builds, there are CLI commands to download a llama.cpp release and build it with cmake-js.
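
For reference, npm's `os` and `cpu` package.json fields can express those platform/arch conditionals: a package listed in `optionalDependencies` gets installed only on matching platforms and is silently skipped elsewhere. A minimal sketch with hypothetical package names (the actual node-llama-cpp packages are named differently); each prebuild package could look like:

```json
{
  "name": "@example/llama-cuda-linux-x64",
  "version": "0.1.0",
  "os": ["linux"],
  "cpu": ["x64"],
  "main": "llama.node"
}
```

and the main package could pull them all in as optional dependencies, so each user only ends up downloading the prebuilds that match their machine:

```json
{
  "name": "@example/llama",
  "optionalDependencies": {
    "@example/llama-cuda-linux-x64": "0.1.0",
    "@example/llama-cuda-win32-x64": "0.1.0",
    "@example/llama-vulkan-linux-x64": "0.1.0",
    "@example/llama-vulkan-win32-x64": "0.1.0"
  }
}
```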
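
The runtime selection could then just probe those optional packages in preference order, with an environment variable as the escape hatch for custom builds. A rough sketch, assuming the hypothetical package and variable names above (this is not the actual node-llama-cpp API):

```js
// loader.js - pick whichever native addon is actually available.
// Package names and LLAMA_ADDON_PATH are illustrative, not real APIs.
function loadBackend() {
  // Escape hatch: user points at their own cmake-js build output.
  if (process.env.LLAMA_ADDON_PATH) {
    return require(process.env.LLAMA_ADDON_PATH);
  }
  // Otherwise try backend-specific prebuild packages, preferring CUDA.
  const suffix = `${process.platform}-${process.arch}`;
  const candidates = [
    `@example/llama-cuda-${suffix}`,
    `@example/llama-vulkan-${suffix}`,
    `@example/llama-cpu-${suffix}`,
  ];
  for (const name of candidates) {
    try {
      // Present only if npm installed it for this platform/arch.
      return require(name);
    } catch {}
  }
  throw new Error("No prebuilt binary found; build one with cmake-js");
}

module.exports = loadBackend();
```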