chore[bc]: diffusion addon interface refactor - remove BaseInference#1496
Merged
gianni-cor merged 35 commits intoApr 17, 2026
Merged
Conversation
Replace class inheritance with composable utilities from @qvac/infer-base@0.4.0:
- createJobHandler() for single-job lifecycle management
- exclusiveRunQueue() for run serialization
Constructor now takes { files: { model, clipL?, clipG?, t5Xxl?, llm?, vae? }, config, logger, opts }
instead of { diskPath, modelName, clipLModel, ... } + config.
All examples and tests updated to new constructor shape.
donriddo
added a commit
to donriddo/qvac
that referenced
this pull request
Apr 10, 2026
The embed, LLM, and diffusion addons (PRs tetherto#1493, tetherto#1494, tetherto#1496) drop BaseInference + WeightsProvider in favour of composable utilities and expose a single-arg constructor of the form `new Addon({ files, config, logger, opts })` where `files.model` is either a string (diffusion) or an array of absolute paths to all GGUF shards (embed/LLM). Per the addon-loader-abstraction proposal, shard expansion now belongs to the SDK plugin layer rather than the addon. This commit: - Adds `expandGGUFIntoShards()` in `server/utils`, mirroring the C++ `GGUFShards::expandGGUFIntoShards` regex contract. It is pure string manipulation so it works under both bun (unit tests) and bare (production), and handles POSIX and Windows separators. - Rewrites `llamacpp-embedding`, `llamacpp-completion`, and `sdcpp-generation` plugins to call the new constructor. FilesystemDL and the loader-adapter shim are dropped from these three plugins — they no longer create or pass a loader, returning `{ model, loader: undefined }` (the SD plugin already followed this pattern). The diffusion artifact keys are renamed `clipLModel`/`t5XxlModel`/etc. → `clipL`/`t5Xxl`/etc. to match the new `DiffusionFiles` shape. - Pins `@qvac/embed-llamacpp`, `@qvac/llm-llamacpp`, and `@qvac/diffusion-cpp` to `file:../...` so the SDK consumes the in-monorepo refactored addons until the addon PRs land and publish versioned releases. These pins must be bumped back to semver ranges before merging. - Adds a unit test for `expandGGUFIntoShards` covering non-sharded paths, sharded paths supplied at any index, nested directories, single-shard sharded models, relative paths, and Windows separators. `@qvac/dl-filesystem` stays in `dependencies` because OCR, parakeet, nmtcpp, and whispercpp plugins still rely on it.
jesusmb1995
requested changes
Apr 10, 2026
Bumps `@qvac/diffusion-cpp` to `0.2.0` per the addon-changelog
process — minor bump on a pre-1.0 package signals the breaking
constructor change to consumers using semver ranges. Adds the
matching `0.2.0` block to `CHANGELOG.md` documenting the new
single-object constructor with `files`, the removal of
`BaseInference`, and every behaviour change in this release.
Hardens the JS layer based on the review:
- Constructor now throws a clear `TypeError` when `files` /
`files.model` is missing, instead of crashing with an opaque
"cannot read properties of undefined" later.
- `createJobHandler({ cancel })` closure uses optional chaining so
a `response.cancel()` after `unload()` is a no-op rather than a
`TypeError`.
- `unload()` sets `this.addon = null` after `addon.unload()`, so
the existing `if (!this.addon)` guard in `_runInternal` is also
effective post-unload.
- `cancel()` re-adds the defensive `?.cancel` check.
- `isSplitLayout` now also triggers on `clipL` / `clipG`, closing
a footgun where a FLUX.1 caller passing only encoders without
`t5Xxl` would silently misroute the diffusion model into the
all-in-one `path` parameter and fail to load.
- `_addonOutputCallback` no longer pushes unknown event payloads
into the active response output stream — unknown events are
logged at debug level instead. The error log line is updated to
pass the `Error` object directly so loggers can format the full
stack.
Doc + test cleanup:
- README section 3 now describes `args.config` as a field of the
same `args` object built in section 2 (the old wording made it
sound like a separate constructor argument).
- The `api-behavior` integration test no longer calls
`binding.releaseLogger()` manually in the teardown — `unload()`
already releases the native logger via `_releaseNativeLogger`.
gianni-cor
requested changes
Apr 15, 2026
gianni-cor
left a comment
Contributor
There was a problem hiding this comment.
Requesting changes for concurrent lifecycle safety. load() should be serialized through the same queue as run() / unload() so overlapping load() calls on the same instance cannot race on shared addon state.
- load() now runs through `this._run()` so concurrent calls on the same instance serialize instead of racing past the `configLoaded` guard. Two overlapping loads could previously both allocate a native addon and clobber `this.addon`, leaking one native handle. - _load() now wraps `addon.activate()` in a try/catch that best-effort unloads the partially-initialized addon, releases the native logger, and resets `this.addon = null` before re-throwing. Matches the crash-safety pattern already in embed and LLM. A failing activate() no longer leaves a zombie native instance that the next load() would orphan. - Add `test:unit` and `test:unit:generate` scripts that run the JS unit tests under `test/unit/*.test.js` via brittle + bare. Wire `test:unit` into `test:all` and into the PR workflow's ts-checks job so `map-addon-event.test.js` runs on every PR. - `.gitignore` the generated `test/unit/all.js` brittle runner. - CHANGELOG: document both fixes under Bug Fixes.
Unit tests need the bare runtime, which is only installed globally in the integration-test workflow (via npm install -g bare). My previous commit wired test:unit into the ts-checks job, which doesn't install bare, so it would have failed with command not found in CI. Chain test:unit at the script level instead — the integration-test workflow already runs npm run test:integration, which now runs unit tests first. Matches the standalone-repo precedent (qvac-lib-dl-filesystem, qvac-lib-decoder-audio, qvac-lib-error-base, etc.) of having the test script drive both.
Reviewer flagged that test:unit invoked the `bare` CLI, but the ts-checks job did not install it. My previous commit's workaround — chaining test:unit into test:integration at the script level — would have re-run unit tests on every platform in the 7-way integration matrix. Revert both. Use the existing `tetherto/oss-actions/.github/actions/run-lint-and-unit-tests` action instead, same as `qvac-lib-infer-onnx` and `ocr-onnx`. The action installs bare globally and runs `npm run test:unit --if-present` in a single fast step.
Matches the standalone-repo precedent (qvac-lib-inference-addon-base, qvac-lib-dl-filesystem, etc.) so 'npm run test' runs both flows locally for developers.
maxim-smotrov
requested changes
Apr 16, 2026
maxim-smotrov
left a comment
Contributor
There was a problem hiding this comment.
Please fix the mermaid diagram syntax.
Mermaid's classDiagram uses { and } as class-body delimiters, so the
inline object literal in the method signatures broke the parser and
prevented the diagram from rendering on GitHub. Replace:
- constructor(args: {files, config, logger?, opts?}) →
constructor(args: ImgStableDiffusionArgs) (matches index.d.ts)
- getState() {configLoaded} →
getState() State
Reported by maxim-smotrov at architecture.md:150.
The run-lint-and-unit-tests action runs `npm run lint` and `npm run test:unit`. The step name "Run JavaScript tests" hides the lint half. Rename to "Run lint and unit tests" and update the step id accordingly.
- index.d.ts: replace the stale "not yet supported, throws at runtime"
JSDoc on `init_image` and `strength` with accurate docs covering the
FLUX.2 in-context-conditioning branch and the SD/SDEdit branch. This
PR ships img2img support end-to-end, so the type hover docs
contradicted the runtime behavior.
- docs/img2img-quickstart.md: rewrite against the refactored API.
Replace the two-arg constructor (`diskPath`, `modelName`,
`llmModel`, `vaeModel`) with the single-object `{ files, config,
logger }` shape, switch every example from `model.img2img(...)` to
`model.run({ init_image, ... })`, and correct the package name from
`@qvac/lib-infer-diffusion` to `@qvac/diffusion-cpp`.
- docs/architecture.md: bump the package header to v0.3.0.
- examples/quick-test.js: delete. It used ESM `import` syntax in a
CommonJS package and imported a non-existent `diffusionAddon` named
export; nothing referenced it.
- addon/src/model-interface/SdModel.cpp: remove the duplicate
`genParams.height = imgH;` in the `useRefImages` branch. Harmless
dead store, but easy to miss in review.
- architecture.md: Key Features list `Generation modes` now includes img2img alongside txt2img, matching the shipped runtime. - CHANGELOG.md: migration example marker changes from `BEFORE (<= 0.1.x)` to `BEFORE (<= 0.2.x)` since 0.2.x also used the old two-argument constructor. - index.d.ts: trim the init_image / strength JSDoc to verified facts. The old text said both were "not yet supported, throws at runtime", which is false on this branch (img2img ships end-to-end). Previous revision added branch-specific behavior claims; replaced with the minimal accurate description.
Both methods had one-line JSDoc on main describing what they do
("Cancel the current generation job.", "Unload the model and release
all resources."). The refactor dropped the JSDoc comments when it
rewrote the method bodies. Restore them since the purpose statement
is still accurate.
Tighten comments this PR introduced that drifted into over-explanation. Leave pre-existing comments as-is. - addon.js mapAddonEvent JSDoc: drop multi-paragraph prose; keep the one-sentence contract plus the param block. - index.js constructor: collapse the cancel-closure rationale to one line. - docs/architecture.md: bump Last Updated to 2026-04-16.
The refactor commit silently dropped the JSDoc block on _createAddon() and the 'Error during stable-diffusion model load' error log in _load(). Put them back so the refactor only changes what needs to change.
_load() wrapped only await this.addon.activate() in try/catch, but _createAddon() calls _connectNativeLogger() and then constructs SdInterface. If the SdInterface constructor throws, _nativeLoggerActive stays set and the native logger hook is never released; a retry on the same instance would reconnect on top of a stale hook. Move _createAddon() inside the try so the existing catch path runs _releaseNativeLogger() for every pre-activate failure.
The 'test' alias was only consumed by 'test:all', and neither was referenced in CI workflows or the README. 'test:all' ran test:unit twice because it called both test:unit and the 'test' alias. Remove 'test' and rewrite 'test:all' to run test:unit, test:integration, and test:cpp directly.
The 0.3.0 section used ## for Breaking Changes / Features / Bug Fixes / Pull Requests, making them siblings of the version heading instead of children; bump those to ### and the leaf subsections from ### to #### so the TOC renders correctly, matching the 0.2.x entries. Two architecture.md spots still said the exclusive run queue serialises only run()/unload() even though load() now also wraps in this._run(...); align both.
maxim-smotrov
previously approved these changes
Apr 17, 2026
Three flowchart node labels contained ( ) inside unquoted [ ] labels: RunJob[addon.runJob(paramsJson)] EncodePrompt[Encode prompt (CLIP)] InitLatents[Initialize random latents (seed)] Mermaid treats the ( as the start of a stadium-shaped node inside the rectangular node, so the diagram failed to parse. Quote each label so the parens render literally.
yuranich
previously approved these changes
Apr 17, 2026
docs/img2img-quickstart.md already notes that the FLUX.2 in-context conditioning path does not use strength and the input image is routed through ref_images instead. The d.ts on GenerationParams.strength still implied the knob applies universally; match the doc so IDE hover docs tell the same story.
Contributor
Author
|
/review |
Contributor
Tier-based Approval Status |
maxim-smotrov
approved these changes
Apr 17, 2026
dev-nid
approved these changes
Apr 17, 2026
jesusmb1995
approved these changes
Apr 17, 2026
gianni-cor
approved these changes
Apr 17, 2026
Contributor
|
/review |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
🎯 What problem does this PR solve?
BaseInferencesolely for job-management boilerplate.diskPath+ individual model-name strings, forcing callers to also pass a filesystem root.index.js, spreading bridge logic across the class.📝 How does it solve it?
BaseInferenceinheritance; composecreateJobHandler()+exclusiveRunQueue()from@qvac/infer-base@^0.4.0directly.{ files: { model, clipL?, clipG?, t5Xxl?, llm?, vae? }, config?, logger?, opts? }with absolute paths per component.assertAbsolute(...)throws aTypeErroron missing/relative paths, mapping the old two-argument shape to a clear error for callers porting old code.addon.js(mapAddonEvent): Error event name →Error,Uint8Array→Output(image bytes), string →Output(progress tick JSON), plain object →JobEnded(RuntimeStats), unknown shapes →null(caller logs at debug level).run()-before-load()guard: throwsError('Addon not initialized. Call load() first.')instead of crashing in native code.load()on an already-loaded instance is now a silent no-op (ReadyResource pattern). Callunload()first if you intend to swap weights, thenload()again.load()is serialized through the sameexclusiveRunQueueasrun()andunload(), so two concurrentload()calls on the same instance no longer race past theconfigLoadedguard and leak a native handle.addon.activate()throws during_load(), the partially initialized addon is best-effort-unloaded, the native logger is released, andthis.addonis reset tonullbefore the error is re-thrown.isSplitLayoutheuristic to also trigger when onlyclipLorclipGis supplied — closes a footgun where a FLUX.1 caller passing{ model, clipL, clipG, vae }withoutt5Xxlwould silently mis-route the diffusion model into the all-in-onepathparameter and fail to load.unload()nullsthis.addonso post-unloadcancel()/run()calls hit the guards instead of dereferencing a disposed native handle.warnlevel instead of being silently swallowed.@qvac/infer-base^0.2.2→^0.4.0; new@qvac/loggingruntime dependency.🧪 How was it tested?
npm run lint(standard) andnpm run test:dtsclean.test/unit/map-addon-event.test.js— 7 tests covering Error event precedence, Uint8Array → Output, string → Output, plain-object → JobEnded, and null returns for unknown shapes.💥 Breaking Changes
BEFORE (≤ 0.2.x):
AFTER (0.3.0):
Additional behavior changes:
load()is a silent no-op (ReadyResource pattern). Callunload()first if you intend to swap weights.clipLModel/clipGModel/t5XxlModel/llmModel/vaeModel→clipL/clipG/t5Xxl/llm/vae.diskPath+ name; addon treats file paths as authoritative.🔌 API Changes