Merged
52 changes: 32 additions & 20 deletions packages/qvac-lib-infer-llamacpp-llm/NOTICE
@@ -37,10 +37,14 @@ Third-Party Model Licenses

dolphin-mixtral-2x7b-dop-Q2_K-00001-of-00005
https://huggingface.co/jmb95/laser-dolphin-mixtral-2x7b-dpo-GGUF
gpt-oss-120b-Q4_K_M-00001-of-00002
https://huggingface.co/unsloth/gpt-oss-120b-GGUF
gpt-oss-20b-GGUF
https://huggingface.co/unsloth/gpt-oss-20b-GGUF
laser-dolphin-mixtral-2x7b-dpo-GGUF
https://huggingface.co/jmb95/laser-dolphin-mixtral-2x7b-dpo-GGUF
LightOnOCR-2-1B-ocr-soup-GGUF
https://huggingface.co/noctrex/LightOnOCR-2-1B-ocr-soup-GGUF
Llama_3.2_1B_Intruct_Tool_Calling_V2-GGUF
https://huggingface.co/mav23/Llama_3.2_1B_Intruct_Tool_Calling_V2-GGUF
Qwen3-0.6B-GGUF
@@ -66,6 +70,11 @@ Third-Party Model Licenses
SmolVLM2-500M-Video-Instruct-GGUF
https://huggingface.co/ggml-org/SmolVLM2-500M-Video-Instruct-GGUF

--- cc-by-4.0 (Creative Commons Attribution 4.0 International) ---

AfriqueGemma-4B-GGUF
https://huggingface.co/mradermacher/AfriqueGemma-4B-GGUF

--- health-ai-developer-foundations (HAI-DEF Terms of Use) ---

medgemma-4b-it-GGUF
@@ -82,6 +91,15 @@ Third-Party Model Licenses
Llama-3.2-1B-Instruct-Q4_0-00001-of-00008
https://huggingface.co/bartowski/Llama-3.2-1B-Instruct-GGUF

--- mit (MIT License) ---

bitnet_b1_58-3B-TQ2_0
https://huggingface.co/1bitLLM/bitnet_b1_58-3B
bitnet_b1_58-large-TQ2_0
https://huggingface.co/1bitLLM/bitnet_b1_58-large
bitnet_b1_58-xl-TQ2_0
https://huggingface.co/1bitLLM/bitnet_b1_58-xl

=========================================================================
Modification Notice
=========================================================================
@@ -107,30 +125,28 @@ JavaScript Dependencies
@qvac/dl-hyperdrive@0.1.1
@qvac/error@0.1.1
@qvac/infer-base@0.1.1
@qvac/infer-base@0.2.2
@qvac/infer-base@0.3.0
@qvac/logging@0.1.0
@qvac/response@0.1.2
adaptive-timeout@1.0.1
https://github.com/holepunchto/adaptive-timeout
b4a@1.8.0
b4a@1.7.5
https://github.com/holepunchto/b4a
bare-addon-resolve@1.10.0
https://github.com/holepunchto/bare-addon-resolve
bare-buffer@3.4.4
https://github.com/holepunchto/bare-buffer
bare-env@3.0.0
https://github.com/holepunchto/bare-env
bare-events@2.4.2
https://github.com/holepunchto/bare-events
bare-events@2.8.2
https://github.com/holepunchto/bare-events
bare-fs@4.5.5
bare-fs@4.5.4
https://github.com/holepunchto/bare-fs
bare-hrtime@2.1.1
https://github.com/holepunchto/bare-hrtime
bare-module-resolve@1.12.1
https://github.com/holepunchto/bare-module-resolve
bare-os@3.7.0
bare-os@3.6.2
https://github.com/holepunchto/bare-os
bare-path@3.0.0
https://github.com/holepunchto/bare-path
@@ -150,7 +166,7 @@ JavaScript Dependencies
https://github.com/holepunchto/bare-url
blind-relay@1.4.0
https://github.com/holepunchto/blind-relay
compact-encoding@2.19.0
compact-encoding@2.18.0
https://github.com/compact-encoding/compact-encoding
device-file@2.3.1
https://github.com/holepunchto/device-file
@@ -167,22 +183,18 @@ JavaScript Dependencies
hypercore-id-encoding@1.3.0
https://github.com/holepunchto/hypercore-id-encoding
hypercore-storage@2.4.1
hyperdrive@13.3.0
hyperdrive@13.2.1
https://github.com/holepunchto/hyperdrive
hyperschema@1.20.1
hyperschema@1.19.1
https://github.com/holepunchto/hyperschema
index-encoder@3.4.0
https://github.com/holepunchto/index-encoder
mirror-drive@1.13.0
mirror-drive@1.12.0
https://github.com/holepunchto/mirror-drive
noise-handshake@4.2.0
https://github.com/holepunchto/noise-handshake
quickbit-native@2.4.8
https://github.com/holepunchto/quickbit-native
rabin-native@2.0.0
https://github.com/holepunchto/rabin-native
rabin-stream@2.0.0
https://github.com/holepunchto/rabin-stream
rache@1.0.0
https://github.com/holepunchto/rache
refcounter@1.0.0
@@ -239,7 +251,7 @@ JavaScript Dependencies
https://github.com/holepunchto/corestore
debounceify@1.1.0
https://github.com/mafintosh/debounceify
dht-rpc@6.26.3
dht-rpc@6.26.1
https://github.com/mafintosh/dht-rpc
fast-fifo@1.3.2
https://github.com/mafintosh/fast-fifo
@@ -251,13 +263,13 @@ JavaScript Dependencies
https://github.com/mafintosh/generate-string
hyperbee@2.27.3
https://github.com/holepunchto/hyperbee
hypercore@11.26.0
hypercore@11.24.0
https://github.com/holepunchto/hypercore
hypercore-crypto@3.6.1
https://github.com/mafintosh/hypercore-crypto
hyperdht@6.29.0
https://github.com/holepunchto/hyperdht
hyperswarm@4.17.0
hyperswarm@4.16.0
https://github.com/holepunchto/hyperswarm
is-options@1.0.2
https://github.com/mafintosh/is-options
@@ -354,10 +366,10 @@ Python Dependencies
numpy
https://numpy.org

--- mpl-2.0 AND mit ---
--- mit (MIT License) ---

tqdm
https://tqdm.github.io
llama-cpp-python
https://pypi.org/project/llama-cpp-python/


=========================================================================
19 changes: 11 additions & 8 deletions packages/qvac-lib-infer-llamacpp-llm/README.md
@@ -16,8 +16,8 @@ This native C++ addon, built using the `Bare` Runtime, simplifies running Large
- [7. Run Inference](#7-run-inference)
- [8. Release Resources](#8-release-resources)
- [API behavior by state](#api-behavior-by-state)
- [Quickstart Example](#quickstart-example)
- [Fine-tuning](#fine-tuning)
- [Quickstart Example](#quickstart-example)
- [Other Examples](#other-examples)
- [Architecture](#architecture)
- [Benchmarking](#benchmarking)
@@ -266,6 +266,14 @@ The following table describes the expected behavior of `run` and `cancel` depend
When `run()` is called while another job is active, the implementation first waits briefly for the previous job to settle. This preserves single-job behavior while still failing fast when the instance is busy. If the second run cannot be accepted (timeout or addon busy rejection), it throws:
- `"Cannot set new job: a job is already set or being processed"`
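
A caller that may overlap jobs can treat this rejection as a retryable condition. The following is a minimal sketch, not part of the library: `isBusyError` and `runWhenIdle` are hypothetical helper names, and matching on the message text is an assumption, since the README only documents the message string and not an error code or class.

```javascript
// Message quoted in the README for a rejected concurrent run().
const BUSY_MESSAGE = 'Cannot set new job: a job is already set or being processed'

// Returns true when an error looks like the busy rejection described above.
// Assumption: the rejection is an Error whose message contains the quoted text.
function isBusyError (err) {
  return err instanceof Error && err.message.includes(BUSY_MESSAGE)
}

// Hypothetical helper: calls `startJob` (any function that invokes run())
// and retries a few times with a fixed delay while the instance is busy.
// Any other error, or a busy error after the last retry, is rethrown.
async function runWhenIdle (startJob, { retries = 3, delayMs = 200 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await startJob()
    } catch (err) {
      if (!isBusyError(err) || attempt >= retries) throw err
      await new Promise(resolve => setTimeout(resolve, delayMs))
    }
  }
}
```

A call such as `runWhenIdle(() => model.run(prompt))` would then wait out a short overlap instead of failing on the first rejection; `model` and `prompt` here stand in for whatever instance and input the application already has.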


## Fine-tuning

The library supports **LoRA finetuning** of GGUF models: train small adapter weights on top of a base model, then save the adapter and load it at inference time via the `lora` config option. You can pause and resume training from checkpoints.

For the full API, dataset format, parameters, and examples, see the **[Finetuning guide](docs/finetuning.md)**.


## Quickstart Example

Clone the repository and navigate to it:
@@ -283,11 +291,6 @@ Run the quickstart example (uses examples/quickstart.js):
npm run quickstart
```

## Fine-tuning

The library supports **LoRA finetuning** of GGUF models: train small adapter weights on top of a base model, then save the adapter and load it at inference time via the `lora` config option. You can pause and resume training from checkpoints.

For the full API, dataset format, parameters, and examples, see the **[Finetuning guide](docs/finetuning.md)**.

## Other examples

@@ -296,8 +299,8 @@ For the full API, dataset format, parameters, and examples, see the **[Finetunin
- [Multi-Cache](./examples/multiCache.js) – Demonstrates session handling and caching capabilities.
- [Native Logging](./examples/nativelog.js) – Demonstrates C++ addon logging integration.
- [Tool Calling](./examples/toolCalling.js) – Demonstrates tool calling capabilities.
- [LoRA Finetuning](./examples/simple-lora-finetune.js) – Basic LoRA finetuning.
- [LoRA Finetuning Pause/Resume](./examples/simple-lora-finetune-pause-resume.js) – Pause and resume finetuning.
- [LoRA Finetuning](./examples/finetune/simple-lora-finetune.js) – Basic LoRA finetuning.
- [LoRA Finetuning Pause/Resume](./examples/finetune/simple-lora-finetune-pause-resume.js) – Pause and resume finetuning.
- [LoRA Inference](./examples/simple-lora-inference.js) – Inference with a finetuned LoRA adapter.

## OCR with Vision-Language Models