Closed
106 commits
f014e66
chore[bc]: remove BaseInference inheritance and WeightsProvider from …
donriddo Apr 7, 2026
541e3a4
chore[bc]: remove BaseInference inheritance and WeightsProvider from …
donriddo Apr 8, 2026
7494997
fix: correct FinetuneProgress and finetune terminal handling in outpu…
donriddo Apr 9, 2026
9dd5690
fix: update embed examples to use new constructor shape
donriddo Apr 9, 2026
ace7b57
fix: update all LLM examples and model-loading test to new constructo…
donriddo Apr 9, 2026
91c2375
fix: update sharded model test to download shards to disk first
donriddo Apr 9, 2026
921871c
chore[bc]: remove BaseInference inheritance from diffusion addon
donriddo Apr 9, 2026
0e57194
feat[api]: expose backendDevice stat in SDK for LLM and embed addons
donriddo Apr 7, 2026
c76b809
fix: restore JSDoc comments in index.js and index.d.ts
donriddo Apr 9, 2026
a419b58
fix: update embed benchmark tooling to new constructor shape
donriddo Apr 9, 2026
502be84
fix: update LLM benchmark tooling to new constructor shape
donriddo Apr 9, 2026
2871e15
fix: pass typed config object to embed addon and restore addonCtor pa…
donriddo Apr 10, 2026
b3dc214
fix: update LLM perf benchmark sweep and judge to new constructor shape
donriddo Apr 10, 2026
efaf21e
fix: pass no-mmap as empty string flag in embed benchmark config
donriddo Apr 10, 2026
5a32ab7
docs: update embed README and data-flows for new constructor pattern
donriddo Apr 10, 2026
a087ffd
docs: update LLM README, finetuning, and afriquegemma docs for new co…
donriddo Apr 10, 2026
3b27c36
docs: update SD README for new constructor pattern
donriddo Apr 10, 2026
71d885b
fix: guard SD _runInternal against run-before-load with clear error
donriddo Apr 10, 2026
ebcf734
fix: update LLM prepare-prompts and verify-prompts to new constructor
donriddo Apr 10, 2026
0ada4a5
fix: update LLM finetuning unit tests to new constructor and exclusiv…
donriddo Apr 10, 2026
26bb3c0
docs: update LLM architecture, data-flows, finetuning, README sharded…
donriddo Apr 10, 2026
8cf9e0e
docs: update embed architecture, data-flows, README sharded contract
donriddo Apr 10, 2026
0d1c713
feat[api]: add embedWithStats client helper and backendDevice tests
donriddo Apr 10, 2026
e49b76f
merge: chore/embed-addon-interface-refactor for SDK migration
donriddo Apr 10, 2026
7ddcafd
merge: chore/llm-addon-interface-refactor for SDK migration
donriddo Apr 10, 2026
d270bb4
merge: chore/sd-addon-interface-refactor for SDK migration
donriddo Apr 10, 2026
3df1fcf
chore[bc]: migrate SDK plugins to new addon constructor shape
donriddo Apr 10, 2026
975ee3d
fix: drop loader destructuring from embed multi-instance test
donriddo Apr 10, 2026
2e8d063
docs: align LLM finetuning docs and mobile README with new constructor
donriddo Apr 10, 2026
0175b02
feat[api]: re-export embedWithStats and EmbedStats from SDK root entr…
donriddo Apr 10, 2026
c7e51a3
docs: align SD architecture.md with new constructor and composition p…
donriddo Apr 10, 2026
f5c9426
chore[bc]: address PR #1496 review findings and bump to 0.2.0
donriddo Apr 10, 2026
b12f22a
chore[bc]: address PR #1493 review findings and bump to 0.14.0
donriddo Apr 10, 2026
c9cfb3e
chore[bc]: address PR #1494 review findings and bump to 0.15.0
donriddo Apr 10, 2026
bca021d
refactor: move LLM C++ event normalization into addon.js
donriddo Apr 10, 2026
b213769
refactor: move embed C++ event normalization into addon.js
donriddo Apr 10, 2026
7083f7e
refactor: move SD C++ event normalization into addon.js
donriddo Apr 10, 2026
90a07d3
fix: address PR #1494 second-round review findings
donriddo Apr 10, 2026
eeba652
docs: address PR #1493 second-round review findings
donriddo Apr 10, 2026
e506204
fix: address PR #1496 second-round review findings
donriddo Apr 10, 2026
eb71bf4
feat[api]: expose openclCacheDir / cache-type-k / cache-type-v config…
donriddo Apr 10, 2026
211d3ad
Merge remote-tracking branch 'upstream/main' into chore/sdk-expose-ba…
donriddo Apr 10, 2026
b6c493d
fix: use dot notation for openclCacheDir now that GGMLConfig types it
donriddo Apr 10, 2026
b6bbc60
Merge remote-tracking branch 'origin/main' into chore/embed-addon-int…
donriddo Apr 10, 2026
61350ae
Merge remote-tracking branch 'origin/main' into chore/llm-addon-inter…
donriddo Apr 10, 2026
4a3a261
Merge remote-tracking branch 'origin/main' into chore/sd-addon-interf…
donriddo Apr 10, 2026
7bbb40d
Merge remote-tracking branch 'origin/main' into chore/sdk-addon-inter…
donriddo Apr 10, 2026
a36b943
Merge branch 'chore/embed-addon-interface-refactor' into chore/sdk-ad…
donriddo Apr 10, 2026
406444a
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 10, 2026
4c84c04
Merge branch 'chore/sd-addon-interface-refactor' into chore/sdk-addon…
donriddo Apr 10, 2026
4388126
fix: extract pickPrimaryGgufPath, restore multiModal example, fix docs
donriddo Apr 12, 2026
ab71710
fix: remove task-doc reference and refactor-narration comments
donriddo Apr 12, 2026
f699479
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 12, 2026
c3808c6
Merge branch 'chore/sd-addon-interface-refactor' into chore/sdk-addon…
donriddo Apr 12, 2026
7282513
fix: restore camelCase-to-snake_case regex in transformLlmConfig
donriddo Apr 12, 2026
bd341ff
test: split backendDevice schema tests into per-module test files
donriddo Apr 12, 2026
d27dea9
fix[bc]: return { embedding, stats } from embed() instead of raw vectors
donriddo Apr 12, 2026
91abf51
Merge branch 'chore/sdk-expose-backend-device-stat' into chore/sdk-ad…
donriddo Apr 12, 2026
47b4034
fix: remove internal task-doc reference from mapAddonEvent JSDoc
donriddo Apr 12, 2026
9088c9e
fix: correct version in architecture.md and remove stale dl-filesyste…
donriddo Apr 12, 2026
7772fbe
chore[bc]: add SDK CHANGELOG for 0.9.0 and bump version
donriddo Apr 12, 2026
da70bcf
Merge branch 'chore/embed-addon-interface-refactor' into chore/sdk-ad…
donriddo Apr 12, 2026
9409079
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 12, 2026
11a5735
Merge branch 'chore/sdk-expose-backend-device-stat' into chore/sdk-ad…
donriddo Apr 12, 2026
9c318c1
fix: restore openclCacheDir after camelCase-to-snake_case transform
donriddo Apr 12, 2026
48b41d6
Merge branch 'chore/sdk-expose-backend-device-stat' into chore/sdk-ad…
donriddo Apr 12, 2026
187e694
fix: align _hasActiveResponse clearing with embed pattern
donriddo Apr 14, 2026
69dae27
fix: accurate README busy-state, log rejected inferences, add mapAddo…
donriddo Apr 14, 2026
8d12649
fix: throw on second load() instead of silently unload+reload
donriddo Apr 14, 2026
754da22
fix: throw on second load(), log rejected responses, add mapAddonEven…
donriddo Apr 14, 2026
45d3b19
fix: throw on second load(), log rejected responses, add mapAddonEven…
donriddo Apr 14, 2026
7cd5ac4
Merge branch 'chore/embed-addon-interface-refactor' into chore/sdk-ad…
donriddo Apr 14, 2026
b38bbdd
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 14, 2026
a974699
Merge branch 'chore/sd-addon-interface-refactor' into chore/sdk-addon…
donriddo Apr 14, 2026
334950a
Merge remote-tracking branch 'upstream/main' into chore/llm-addon-int…
donriddo Apr 14, 2026
00ee6d2
Merge remote-tracking branch 'upstream/main' into chore/sdk-expose-ba…
donriddo Apr 14, 2026
ee12523
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 14, 2026
52226ab
Merge branch 'chore/sdk-expose-backend-device-stat' into chore/sdk-ad…
donriddo Apr 14, 2026
7a88c8d
fix: correct expandGGUFIntoShards JSDoc to say "first shard-matching …
donriddo Apr 14, 2026
51bb1e3
fix: restore JSDoc on run() that was dropped during BaseInference rem…
donriddo Apr 14, 2026
b0a6d08
fix: restore JSDoc on run() that was dropped during BaseInference rem…
donriddo Apr 14, 2026
4ce007c
Merge branch 'chore/sd-addon-interface-refactor' into chore/sdk-addon…
donriddo Apr 14, 2026
dfd4973
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 14, 2026
80c63a8
Merge remote-tracking branch 'upstream/main' into chore/llm-addon-int…
donriddo Apr 14, 2026
79176eb
Merge remote-tracking branch 'upstream/main' into chore/sdk-expose-ba…
donriddo Apr 14, 2026
fc5191a
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 14, 2026
fbb22b9
Merge branch 'chore/sdk-expose-backend-device-stat' into chore/sdk-ad…
donriddo Apr 14, 2026
8d4fb1a
chore: revert SDK version + CHANGELOG, collapse EmbedStats re-export …
donriddo Apr 14, 2026
c131cef
Merge branch 'chore/sdk-expose-backend-device-stat' into chore/sdk-ad…
donriddo Apr 14, 2026
8131641
fix: migrate afriquegemma-edge-cases test to new addon constructor
donriddo Apr 14, 2026
73790c8
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 14, 2026
96fc198
fix: extract pickPrimaryGgufPath, document network-streaming loss, wa…
donriddo Apr 14, 2026
5063a20
fix: correct CHANGELOG error quote and remove dead files.model fallback
donriddo Apr 14, 2026
264585e
Merge branch 'chore/embed-addon-interface-refactor' into chore/sdk-ad…
donriddo Apr 14, 2026
7b9462f
Merge branch 'chore/sd-addon-interface-refactor' into chore/sdk-addon…
donriddo Apr 14, 2026
4d3a303
fix: make load() a silent no-op when already loaded (ReadyResource pa…
donriddo Apr 15, 2026
b51ca89
fix: make load() idempotent when already loaded
donriddo Apr 15, 2026
f6f424d
fix: make load() idempotent when already loaded
donriddo Apr 15, 2026
ad8b757
Merge branch 'chore/embed-addon-interface-refactor' into chore/sdk-ad…
donriddo Apr 15, 2026
4eeded4
Merge branch 'chore/llm-addon-interface-refactor' into chore/sdk-addo…
donriddo Apr 15, 2026
afb8e3c
Merge branch 'chore/sd-addon-interface-refactor' into chore/sdk-addon…
donriddo Apr 15, 2026
51a9e9e
fix: update SDKModule.embed type for new { embedding, stats? } return…
donriddo Apr 15, 2026
3490c52
Merge branch 'chore/sdk-expose-backend-device-stat' into chore/sdk-ad…
donriddo Apr 15, 2026
1fcb083
Merge remote-tracking branch 'upstream/main' into chore/sdk-addon-int…
donriddo Apr 15, 2026
92e423e
Merge remote-tracking branch 'upstream/main' into chore/sdk-addon-int…
donriddo Apr 15, 2026
8712fb3
chore: flip addon deps from file:../ to semver ranges
donriddo Apr 20, 2026
77 changes: 77 additions & 0 deletions packages/lib-infer-diffusion/CHANGELOG.md
@@ -1,5 +1,82 @@
# Changelog

## [0.3.0] - 2026-04-15

This release migrates the diffusion addon off `BaseInference` inheritance and onto the composable `createJobHandler` + `exclusiveRunQueue` utilities from `@qvac/infer-base@^0.4.0`. The constructor signature is replaced with a single object whose `files` field carries absolute paths for every model component, mirroring the parallel embed and LLM addon refactors. This is a breaking change: every caller must update.

## Breaking Changes

### Constructor signature: single object with `files` instead of `(args, config)`

`ImgStableDiffusion` now takes a single `{ files, config, logger?, opts? }` object. The old `diskPath` + `modelName` + per-component filename pattern is gone; callers pass absolute paths directly via `files`. Companion model fields are renamed (`clipLModel` → `clipL`, `clipGModel` → `clipG`, `t5XxlModel` → `t5Xxl`, `llmModel` → `llm`, `vaeModel` → `vae`).

```js
// BEFORE (≤ 0.1.x)
const model = new ImgStableDiffusion({
  diskPath: '/models',
  modelName: 'flux-2-klein-4b-Q8_0.gguf',
  llmModel: 'Qwen3-4B-Q4_K_M.gguf',
  vaeModel: 'flux2-vae.safetensors',
  logger: console
}, { threads: 8 })

// AFTER (0.3.0)
const model = new ImgStableDiffusion({
  files: {
    model: '/models/flux-2-klein-4b-Q8_0.gguf',
    llm: '/models/Qwen3-4B-Q4_K_M.gguf',
    vae: '/models/flux2-vae.safetensors'
  },
  config: { threads: 8 },
  logger: console,
  opts: { stats: true }
})
```

### `BaseInference` inheritance removed

`ImgStableDiffusion` no longer extends `BaseInference`. The class composes `createJobHandler` and `exclusiveRunQueue` from `@qvac/infer-base@^0.4.0` directly. The public lifecycle (`load` / `run` / `cancel` / `unload` / `getState`) is unchanged in shape; only construction differs. Internal helpers like `_withExclusiveRun` and `_outputCallback` are removed.

### Caller owns absolute paths: addon no longer joins `diskPath` + filename

Callers that previously relied on the addon to resolve `path.join(diskPath, filename)` must now do that resolution themselves before constructing the model.
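A minimal migration sketch for that resolution step, assuming Node's built-in `path` module (the README in this PR uses `bare-path` for the same `path.join` call). `toFilesObject` and `RENAMES` are hypothetical names introduced for illustration and are not part of the addon; the mapping itself follows the renames listed in this changelog.

```javascript
const path = require('path') // assumption: use 'bare-path' instead when running under Bare

// Old argument name -> new `files` key, per the renames listed in this changelog.
const RENAMES = {
  modelName: 'model',
  clipLModel: 'clipL',
  clipGModel: 'clipG',
  t5XxlModel: 't5Xxl',
  llmModel: 'llm',
  vaeModel: 'vae'
}

// Hypothetical helper: build the new `files` object from old-style arguments
// by joining `diskPath` with each filename that was supplied.
function toFilesObject (oldArgs) {
  const baseDir = path.resolve(oldArgs.diskPath)
  const files = {}
  for (const [oldKey, newKey] of Object.entries(RENAMES)) {
    if (typeof oldArgs[oldKey] === 'string') {
      files[newKey] = path.join(baseDir, oldArgs[oldKey])
    }
  }
  return files
}

const files = toFilesObject({
  diskPath: '/models',
  modelName: 'flux-2-klein-4b-Q8_0.gguf',
  llmModel: 'Qwen3-4B-Q4_K_M.gguf',
  vaeModel: 'flux2-vae.safetensors'
})
```

The resulting `files` object can then be passed to the constructor as `new ImgStableDiffusion({ files, config, logger })`.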

### `getState()` returns a narrower shape

`getState()` previously returned `{ configLoaded, weightsLoaded, destroyed }` (the three-field shape from `BaseInference`). It now returns `{ configLoaded }` only. The `weightsLoaded` and `destroyed` fields are gone: `weightsLoaded` collapsed into `configLoaded` because the refactored `load()` does both in one step, and `destroyed` is no longer tracked since `unload()` resets `configLoaded` and nulls the addon handle instead. Callers reading `state.weightsLoaded` or `state.destroyed` must switch to `state.configLoaded`.
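For callers with readiness checks scattered across a codebase, a small shim can centralize the switch. This is only a sketch: `isLoaded` is a hypothetical helper name, and it reads nothing but `configLoaded`, the one field the new `getState()` is documented to return.

```javascript
// Hypothetical compatibility shim for code written against the old
// three-field state object. It tolerates a missing state object.
function isLoaded (state) {
  return Boolean(state && state.configLoaded)
}

// Old check:  state.weightsLoaded && !state.destroyed
// New check:  isLoaded(model.getState())
```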

## Features

### Constructor input validation

The constructor now throws `TypeError('files.model must be an absolute path string')` when `files.model` is missing or not a string, or `TypeError('files.model must be an absolute path (got: <value>)')` when supplied as a relative path. This produces a clear error for callers porting old code instead of a confusing `Cannot read properties of undefined`. The same validation applies to optional companion fields (`clipL`, `clipG`, `t5Xxl`, `llm`, `vae`) when supplied.
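The checks described above could look roughly like the following. This is a sketch rather than the addon's actual code: `assertAbsolutePath` and `validateFiles` are hypothetical names, Node's `path.isAbsolute` stands in for whatever the addon uses, and only the two `files.model` messages are quoted from this changelog (messages for the companion fields are assumed to follow the same template).

```javascript
const path = require('path') // assumption: 'bare-path' in the real addon

const OPTIONAL_FIELDS = ['clipL', 'clipG', 't5Xxl', 'llm', 'vae']

// Hypothetical validator for a single `files` entry.
function assertAbsolutePath (name, value) {
  if (typeof value !== 'string') {
    throw new TypeError(`files.${name} must be an absolute path string`)
  }
  if (!path.isAbsolute(value)) {
    throw new TypeError(`files.${name} must be an absolute path (got: ${value})`)
  }
}

// `model` is always required; companion fields are checked only when supplied.
function validateFiles (files) {
  const f = files || {}
  assertAbsolutePath('model', f.model)
  for (const name of OPTIONAL_FIELDS) {
    if (f[name] !== undefined) assertAbsolutePath(name, f[name])
  }
}
```

Running the check in the constructor, before any native call, is what turns a late `Cannot read properties of undefined` into an error that names the offending field.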

### `run()`-before-`load()` guard

Calling `run()` before `load()` now throws `Error('Addon not initialized. Call load() first.')` instead of crashing in native code. Covered by a new regression test in `test/integration/api-behavior.test.js`.

### `load()` is now idempotent when already loaded

A second `load()` call on an already-loaded instance is now a silent no-op instead of unloading and reloading. This aligns with the ReadyResource pattern used elsewhere in QVAC and prevents accidental double-loads from triggering expensive work. Callers that intentionally want to swap weights must call `unload()` first (which clears `configLoaded`) and then `load()` again.
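The lifecycle rules in this section (the `run()` guard, the no-op second `load()`, and `unload()` before a reload) can be sketched as a small state machine. Everything here is illustrative: `FakeNativeAddon` and `LifecycleSketch` are hypothetical stand-ins, `loadCount` exists only to make the behavior observable, and the real class additionally serializes calls through `exclusiveRunQueue`, which this sketch omits.

```javascript
// Hypothetical stand-in for the native binding.
class FakeNativeAddon {
  async load () {}
  async run (params) { return { prompt: params.prompt } }
  async unload () {}
}

class LifecycleSketch {
  constructor () {
    this.addon = null
    this.configLoaded = false
    this.loadCount = 0 // illustration only: counts real (non no-op) loads
  }

  async load () {
    if (this.configLoaded) return // second load() is a silent no-op
    this.addon = new FakeNativeAddon()
    await this.addon.load()
    this.configLoaded = true
    this.loadCount++
  }

  async run (params) {
    if (!this.addon) throw new Error('Addon not initialized. Call load() first.')
    return this.addon.run(params)
  }

  async unload () {
    if (!this.addon) return
    await this.addon.unload()
    this.addon = null // later run() calls hit the guard above
    this.configLoaded = false
  }

  getState () {
    return { configLoaded: this.configLoaded }
  }
}
```

With this shape, `load(); load()` performs one load, while `load(); unload(); load()` performs two, which matches the "call `unload()` first to swap weights" guidance.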

### Broader split-layout detection

`isSplitLayout` now also triggers when only `clipL` or `clipG` is supplied. This closes a footgun where a FLUX.1 caller passing `{ model, clipL, clipG, vae }` (without `t5Xxl`) would silently mis-route the diffusion model into the all-in-one `path` parameter and fail to load.
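One way to picture the broadened check is as a predicate over the `files` object. This is a guess at the shape, not the addon's implementation: the changelog confirms only that `clipL` and `clipG` now count, so treating `t5Xxl` and `llm` as pre-existing triggers is an assumption, and `diffusionModelPath` is a hypothetical name for the split-layout parameter (only `path` is named in the text above).

```javascript
// Assumed set of companion text encoders whose presence marks a split layout.
const SPLIT_TRIGGERS = ['clipL', 'clipG', 't5Xxl', 'llm']

// True when any companion encoder is supplied alongside the diffusion model.
function isSplitLayout (files) {
  return SPLIT_TRIGGERS.some((key) => Boolean(files[key]))
}

// Route `files.model` to the matching native parameter.
function routeModel (files) {
  return isSplitLayout(files)
    ? { diffusionModelPath: files.model } // hypothetical parameter name
    : { path: files.model } // all-in-one checkpoint
}
```

Under these assumptions, a FLUX.1-style `{ model, clipL, clipG, vae }` object is routed as a split layout even though `t5Xxl` is absent.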

## Bug Fixes

### `unload()` clears the addon reference

`unload()` now sets `this.addon = null` after `await this.addon.unload()`, so post-unload `cancel()` / `run()` calls hit the explicit `if (!this.addon)` guard rather than dereferencing a disposed native handle.

### Unknown addon events no longer pollute the output stream

`_addonOutputCallback` previously had a fallthrough that pushed any non-error / non-image / non-stats event into `response.output` (including `null` and `undefined`). It now logs unknown events at debug level and does not feed them into the active response.
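The sketch below shows how a callback might drop unknown events. `mapAddonEvent` is reproduced from the `addon.js` change in this PR so that the example runs on its own; `handleAddonEvent`, the `response` object shape, and the event names used with it are hypothetical stand-ins, not the addon's real `_addonOutputCallback`.

```javascript
// Reproduced from addon.js in this PR.
function mapAddonEvent (rawEvent, rawData, rawError) {
  if (typeof rawEvent === 'string' && rawEvent.includes('Error')) {
    return { type: 'Error', data: rawData, error: rawError }
  }
  if (rawData instanceof Uint8Array || typeof rawData === 'string') {
    return { type: 'Output', data: rawData, error: null }
  }
  if (rawData && typeof rawData === 'object') {
    return { type: 'JobEnded', data: rawData, error: null }
  }
  return null
}

// Hypothetical consumer: unknown events are logged and never reach `output`.
function handleAddonEvent (response, logger, rawEvent, rawData, rawError) {
  const mapped = mapAddonEvent(rawEvent, rawData, rawError)
  if (mapped === null) {
    logger.debug('Ignoring unknown addon event', rawEvent)
    return
  }
  if (mapped.type === 'Error') {
    response.errors.push(mapped.error)
  } else if (mapped.type === 'Output') {
    response.output.push(mapped.data)
  } else {
    response.stats = mapped.data // 'JobEnded' carries the terminal stats payload
  }
}
```

Because `null` and `undefined` payloads map to `null`, they are logged and discarded instead of being appended to the response output.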

## Pull Requests

- [#1496](https://github.com/tetherto/qvac/pull/1496) - chore[bc]: diffusion addon interface refactor - remove BaseInference

## [0.2.0] - 2026-04-15

### Added
41 changes: 24 additions & 17 deletions packages/lib-infer-diffusion/README.md
@@ -176,28 +176,35 @@ const path = require('bare-path')
const MODELS_DIR = path.resolve(__dirname, './models')
const args = {
  logger: console,
-  diskPath: MODELS_DIR,
-  modelName: 'flux-2-klein-4b-Q8_0.gguf',
-  llmModel: 'Qwen3-4B-Q4_K_M.gguf', // Qwen3 text encoder for FLUX.2 [klein]
-  vaeModel: 'flux2-vae.safetensors'
+  files: {
+    model: path.join(MODELS_DIR, 'flux-2-klein-4b-Q8_0.gguf'),
+    llm: path.join(MODELS_DIR, 'Qwen3-4B-Q4_K_M.gguf'), // Qwen3 text encoder for FLUX.2 [klein]
+    vae: path.join(MODELS_DIR, 'flux2-vae.safetensors')
+  },
+  config: { threads: 8 },
+  opts: { stats: true }
}
```

| Property | Required | Description |
|----------|----------|-------------|
-| `diskPath` | ✅ | Local directory where model files are already stored |
-| `modelName` | ✅ | Diffusion model file name (all-in-one for SD1.x/2.x; diffusion-only GGUF for FLUX.2) |
+| `files` | ✅ | Object of absolute paths to model files (see below) |
+| `files.model` | ✅ | Absolute path to diffusion model file (all-in-one for SD1.x/2.x; diffusion-only GGUF for FLUX.2) |
+| `files.clipL` | - | Absolute path to separate CLIP-L text encoder (SD3) |
+| `files.clipG` | - | Absolute path to separate CLIP-G text encoder (SDXL / SD3) |
+| `files.t5Xxl` | - | Absolute path to separate T5-XXL text encoder (SD3) |
+| `files.llm` | - | Absolute path to Qwen3 LLM text encoder (FLUX.2 [klein]) |
+| `files.vae` | - | Absolute path to separate VAE file |
+| `config` | - | Native backend configuration object (see next section) |
| `logger` | - | Logger instance (e.g. `console`) |
-| `clipLModel` | - | Separate CLIP-L text encoder (SD3) |
-| `clipGModel` | - | Separate CLIP-G text encoder (SDXL / SD3) |
-| `t5XxlModel` | - | Separate T5-XXL text encoder (SD3) |
-| `llmModel` | - | Qwen3 LLM text encoder (FLUX.2 [klein]) |
-| `vaeModel` | - | Separate VAE file |
+| `opts` | - | Additional options (e.g. `{ stats: true }`) |

-### 3. Create the `config` object
+### 3. Configure the native backend (`args.config`)
+
+`config` is a field on the `args` object built in step 2; there is no separate constructor argument. The native backend reads it during `load()`.

```js
-const config = {
+args.config = {
  threads: 8 // CPU threads for tensor operations (Metal handles GPU automatically)
}
```
@@ -216,18 +223,18 @@ Config values are coerced to strings internally. Generation parameters (prompt,
### 4. Create a Model Instance

```js
-const model = new ImgStableDiffusion(args, config)
+const model = new ImgStableDiffusion(args)
```

-The constructor stores configuration only; no memory is allocated yet.
+The constructor takes a single object containing `files`, `config`, `logger`, and `opts`. It stores configuration only; no memory is allocated yet.

### 5. Load the Model

```js
await model.load()
```

-This creates the native `sd_ctx_t` and loads all weights into memory. It can take 10–30 seconds depending on disk speed and model size. All model files must already be present on disk at `diskPath`.
+This creates the native `sd_ctx_t` and loads all weights into memory. It can take 10–30 seconds depending on disk speed and model size. All model files must be passed as absolute paths via the `files` object.

### 6. Run Inference

@@ -360,7 +367,7 @@ await model.unload()

### Stable Diffusion 1.x / 2.x

-Pass an all-in-one checkpoint directly as `modelName`. No separate encoders needed.
+Pass an all-in-one checkpoint absolute path as `files.model`. No separate encoders needed.

---

37 changes: 36 additions & 1 deletion packages/lib-infer-diffusion/addon.js
@@ -2,6 +2,41 @@

const path = require('bare-path')

/**
 * Map a raw native event from the C++ stable-diffusion addon to a logical
 * event consumed by `ImgStableDiffusion`.
 *
 * The native binding emits events with C++-mangled names and varied
 * payload shapes. This wrapper normalizes them into one of:
 *   - `'Output'`: image bytes (`Uint8Array`) or progress JSON tick (`string`)
 *   - `'Error'`: failure
 *   - `'JobEnded'`: terminal RuntimeStats payload (object)
 *
 * Returns `{ type, data, error }` or `null` for unknown event/data shapes
 * (caller logs at debug level).
 *
 *
 * @param {string} rawEvent
 * @param {*} rawData
 * @param {*} rawError
 * @returns {{ type: string, data: *, error: * } | null}
 */
function mapAddonEvent (rawEvent, rawData, rawError) {
  if (typeof rawEvent === 'string' && rawEvent.includes('Error')) {
    return { type: 'Error', data: rawData, error: rawError }
  }

  if (rawData instanceof Uint8Array || typeof rawData === 'string') {
    return { type: 'Output', data: rawData, error: null }
  }

  if (rawData && typeof rawData === 'object') {
    return { type: 'JobEnded', data: rawData, error: null }
  }

  return null
}

/**
 * Extract pixel dimensions from a PNG or JPEG buffer without a full decode.
 *
Expand Down Expand Up @@ -151,4 +186,4 @@ class SdInterface {
}
}

module.exports = { SdInterface, readImageDimensions }
module.exports = { SdInterface, mapAddonEvent, readImageDimensions }