chore[notask|skiplog]: trigger CLI release 0.2.2 #1011

Merged
simon-iribarren merged 1 commit into tetherto:release-cli-0.2.2 from lauripiisang:release-cli-0.2.2 on Mar 19, 2026
Conversation

@lauripiisang

Contributor

🎯 What problem does this PR solve?

  • The release was broken: the workflow was missing an npm install step.

📝 How does it solve it?

  • The workflow has since been fixed in main.

@lauripiisang lauripiisang requested review from a team as code owners March 19, 2026 12:44
@lauripiisang lauripiisang changed the title chore: trigger CLI release 0.2.2 chore[notask|skiplog]: trigger CLI release 0.2.2 Mar 19, 2026
@simon-iribarren

Contributor

/review

@github-actions

Contributor

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (3/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

@simon-iribarren simon-iribarren merged commit cb7d0eb into tetherto:release-cli-0.2.2 Mar 19, 2026
12 of 19 checks passed
simon-iribarren added a commit that referenced this pull request Mar 23, 2026
* chore: trigger CLI release 0.2.2 (#1011)

* doc[notask|skiplog]: add changelog for CLI v0.2.2 (#1013)

* doc[notask|skiplog]: add changelog for CLI v0.2.2

Made-with: Cursor

* fix: preserve existing changelog history

Made-with: Cursor

---------

Co-authored-by: Lauri Piisang <lauri.piisang@gmail.com>
BrunoCampana added a commit that referenced this pull request Mar 23, 2026
* fix: fix race condition in LLM example download utility (#1019)

* fix: fix race condition in LLM example download utility

The redirect handler in examples/utils.js called fs.unlink fire-and-forget
then immediately recursed into downloadModel. The recursive call could find
the empty file still on disk (existsSync → true) before unlink completed,
causing an ENOENT crash on the subsequent statSync.

Port the proven download pattern from test/integration/utils.js:
- Wait for unlink callback before recursing on redirect
- Handle 307/308 redirects (HuggingFace uses 302)
- Handle relative redirect URLs
- Use safeResolve/safeReject guards to prevent double settlement
- Add response error handler and fileStream error handler
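
The first and fourth points in the list above can be sketched roughly as follows. This is a minimal illustration and not the code from examples/utils.js: `createSettleGuards` and `removeThenRetry` are hypothetical names, and the `retry` callback stands in for the recursive downloadModel call.

```javascript
const fs = require('fs')

// Hypothetical helper: wraps resolve/reject so only the first call settles
// the promise and every later call is ignored.
function createSettleGuards (resolve, reject) {
  let settled = false
  return {
    safeResolve (value) {
      if (settled) return
      settled = true
      resolve(value)
    },
    safeReject (err) {
      if (settled) return
      settled = true
      reject(err)
    }
  }
}

// Hypothetical redirect step: remove the partial file and retry only after
// the unlink callback has fired, so the retry never sees the stale file.
function removeThenRetry (filePath, retry) {
  return new Promise((resolve, reject) => {
    const { safeResolve, safeReject } = createSettleGuards(resolve, reject)
    fs.unlink(filePath, (err) => {
      // A file that is already gone is fine; any other error is fatal.
      if (err && err.code !== 'ENOENT') return safeReject(err)
      Promise.resolve().then(retry).then(safeResolve, safeReject)
    })
  })
}
```

Calling the retry from inside the unlink callback is what removes the race: by the time the retry checks the disk, the delete has completed.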

* fix: use URL constructor for safer redirect resolution
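
As a sketch of what URL-constructor resolution looks like (the function name `resolveRedirect` and the example hosts are assumptions, not taken from the PR), the WHATWG `URL` constructor accepts a base URL and handles absolute and relative `Location` values in one call:

```javascript
// Hypothetical helper: resolve a Location header against the URL that
// produced the redirect. Absolute locations are returned unchanged.
function resolveRedirect (location, currentUrl) {
  return new URL(location, currentUrl).href
}

const current = 'https://example.com/repo/resolve/main/model.gguf'
console.log(resolveRedirect('/cdn/model.gguf', current))
console.log(resolveRedirect('https://cdn.example.org/model.gguf', current))
```

This avoids hand-rolled string concatenation, which tends to break on root-relative or protocol-relative locations.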


* fix: fix race condition in embed and diffusion download utilities

Port the proven download pattern from the LLM package (PR #1019):
- Wait for fs.unlink callback before recursing on redirect
- Add safeResolve/safeReject guards to prevent double settlement
- Handle 307/308 redirects in embed examples/utils.js
- Add fileStream and response error handlers
- Use URL constructor for safer redirect resolution
- Use close event instead of finish for write completion


---------

Co-authored-by: gianni-cor <gianfranco.cordella@tether.io>

* doc: update README - table of packages - add diffusion and diagnostics - key features - add openAI-compatible API (#1033)

* fix: fix docs build and escape MDX curly braces in errors.mdx and removed randomly created (#1051)

* doc: generate API docs for v0.8.0

* chore[notask]: remove accidentally committed file

* fix: fix docs build and escape MDX curly braces in errors.mdx and removed random

* fix: revert pre-build script

---------

Co-authored-by: Bruno Campana <7632562+BrunoCampana@users.noreply.github.com>

* Fix security issues flagged by CodeQL in TTS package (#1058)

* Updated qvac-lint-cpp to match latest version from original repo (#1064)

* fix: add native job IDs to addon-cpp callbacks (#955)

* fix: preserve addon job ownership across cancel/reuse

Propagate native job IDs through addon-cpp queued callbacks so late cancel events stay attached to the cancelled job. Remove the Parakeet stale-cancel workaround and align Whisper with the shared runtime contract.

Made-with: Cursor

* chore: scope addon-cpp job-id update to 1.1.3

Limit this branch to the shared addon-cpp runtime changes and bump the package to 1.1.3. Follow-up addon consumer updates will land in separate PRs after the registry is updated.

Made-with: Cursor

* fix: move pending job state before unlock

Copy the pending job into local state before releasing the JobRunner mutex so processing and error paths no longer read job_ without synchronization.

Made-with: Cursor

---------

Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>

* Removed overlay ports. Build from registry. (#1066)

* fix: use object config format in nativelog example (#1070)

* QVAC-13813 chore: add int8 parakeet eou and sortformer production registry entries (#1035)

* chore: Add int8 quantised models for Parakeet EOU and Sortformer

* fix: Add links for quantised parakeet models

* fix: Remove tokenizer for int8

---------

Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>
Co-authored-by: Yury Samarin <yuri.a.samarin@gmail.com>

* fix[notask]: resolve code scanning security findings in nmtcpp and ocr-onnx (#1060)

* fix[notask]: resolve code scanning security findings in nmtcpp and ocr-onnx

Fix ReDoS vulnerabilities in indic-processor URL and numeral regexes by
removing nested quantifiers. Fix ReDoS in sacremoses tokenizer protected
patterns by requiring opening quotes to eliminate ambiguous backtracking.
Fix incomplete string replacement in indic_normalize by using global
regex for pipe character substitution. Replace insecure tempfile.mktemp
with NamedTemporaryFile in ocr-onnx benchmark script.
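
The incomplete string replacement mentioned above is a common JavaScript pitfall. The sketch below assumes the original code passed a string pattern to `String.prototype.replace`; the function names and the replacement character are hypothetical, since the actual substitution in indic_normalize is not shown in this log.

```javascript
// A string pattern replaces only the first occurrence.
function normalizePipesFirstOnly (text) {
  return text.replace('|', ' ')
}

// A regex with the global flag replaces every occurrence.
function normalizePipes (text) {
  return text.replace(/\|/g, ' ')
}

console.log(normalizePipesFirstOnly('a|b|c')) // only the first pipe is replaced
console.log(normalizePipes('a|b|c')) // all pipes are replaced
```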

* fix[notask]: resolve polynomial ReDoS in numeral and other patterns

Fix _NUMERAL_PATTERN by replacing ambiguous \d+\.?\d* with
\d+(?:\.\d+)? to eliminate overlapping digit quantifiers.
Fix _OTHER_PATTERN by bounding the prefix to {0,100} to prevent
polynomial backtracking when no separator is found.

* fix[notask]: bound regex quantifiers to eliminate polynomial ReDoS

Replace unbounded \d+ with \d{1,20} and \w+ with \w{1,100} in
_NUMERAL_PATTERN and _OTHER_PATTERN to make backtracking constant-time
regardless of input length. No real-world numeral exceeds 20 digits
and no hashtag/mention exceeds 100 chars.
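
To illustrate the numeral fix described in the last two commits, here is a minimal sketch. The real _NUMERAL_PATTERN is not shown in this log, so both patterns below are stand-ins built from the fragments quoted in the commit messages, and `firstNumeral` is a hypothetical helper.

```javascript
// Stand-in for the original shape: \d+ and \d* can both claim the same
// digits when there is no dot, which gives the engine many ways to split
// a long digit run.
const ambiguousNumeral = /\d+\.?\d*/

// Stand-in for the fixed shape: the fractional part must start with a dot,
// and each digit run is capped at 20 characters.
const boundedNumeral = /\d{1,20}(?:\.\d{1,20})?/

function firstNumeral (text) {
  const match = boundedNumeral.exec(text)
  return match ? match[0] : null
}

console.log(firstNumeral('pi is 3.14'))
console.log(ambiguousNumeral.test('12345'))
```

Requiring the dot before the second digit run removes the overlap, and the upper bounds limit how much input a single match attempt can consume.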

---------

Co-authored-by: RamazTs <66473301+RamazTs@users.noreply.github.com>

* feat[whisper][notask]: add streaming VAD transcription to whisper addon (#998)

* feat: add streaming VAD transcription to whisper addon

- Add C++ StreamingProcessor with Silero VAD for speech segmentation
- StreamingProcessor runs on its own thread, buffers incoming audio,
  and uses whisper_vad_* APIs to detect speech boundaries
- RAII wrapper (VadSegmentsPtr) for automatic VAD segment cleanup
- Backpressure handling: drop oldest audio when buffer exceeds cap
- JS bindings: startStreaming, appendStreamingAudio, endStreaming
- New error codes for streaming operations (6012-6014)
- Addon state properly reset in response finally handler

Made-with: Cursor

* fix: address PR review comments for whisper streaming VAD

- Replace g_streamingProcessors map with single-processor globals
  (one active streaming job at a time per Gustavo's feedback)
- Wire streaming cleanup into cancel and destroyInstance via
  cancelWithStreaming and destroyInstanceWithStreaming wrappers
- Add StreamingProcessor::cancel() for forceful abort with
  model cancellation and thread join
- Fix stats accumulation: use WhisperModel::process(Input&) void
  overload + takeOutput() so stats accumulate across segments
  instead of resetting per-segment
- Add WhisperModel::prepareForStreaming() to reset stats and
  cancel flag once at session start
- Propagate segment processing errors via hasError_ flag and
  queue exception at stream end
- Add streaming methods to MockedBinding (startStreaming,
  appendStreamingAudio, endStreaming, error simulation)
- Add 6 unit tests covering streaming lifecycle, stats, cancel,
  destroy, error propagation, and concurrent session rejection
- Add example.streaming-vad.js demonstrating runStreaming() API
  with fs.createReadStream as audio source

Made-with: Cursor

---------

Co-authored-by: Raju <raju.sharma>

* QVAC-14357 fix(onnx): Code clean-up and fixes (#1049)

* (feature) llamacpp-llm: dynamic tools (#706)

* (improvement) llamacpp-llm: Qwen3 dynamic tools template

* (improvement) llamacpp-llm: add llm config tools flag

* (improvement) llamacpp-llm: use template based on tools param

* (improvement) llamacpp-llm: count tools token offset with tokenizer

* (improvement) llamacpp-llm: track n-past, run Qwen3 tests, fix reset

* (improvement) llamacpp-llm: save cache with respect to tools flag

* (fix) llamacpp-llm: add Qwen3ToolsDynamicTemplate.cpp to production CMakeLists

The new source file was added to the test CMakeLists but missing from the addon and cli_tool targets, causing an undefined symbol linker error on CI win64 builds.

Made-with: Cursor

* chore: retrigger CI for CMakeLists fix

Made-with: Cursor

* (fix) llamacpp-llm: fix use-after-free SIGSEGV on process exit (linux)

Reorder TextLlmContext members so threadpools are declared before llamaInit_. C++ destroys members in reverse declaration order, so llamaInit_ (which calls llama_free) now runs while threadpools are still alive, preventing use-after-free when llama_free accesses attached threadpool pointers.

Made-with: Cursor

* Revert "(fix) llamacpp-llm: fix use-after-free SIGSEGV on process exit (linux)"

This reverts commit 7d9c237.

* (fix) llamacpp-llm: robust threadpool teardown to prevent SIGSEGV on exit

The ThreadPoolDeleter was doing ggml backend registry lookups during destruction, which is fragile during process teardown when the registry may already be torn down. Additionally, threadpools attached to llama_context could be freed before the context itself, causing use-after-free. Fix: cache ggml_threadpool_free fn pointer at construction time, and add explicit destructor that detaches threadpools before freeing them.

Made-with: Cursor

* Revert "(fix) llamacpp-llm: robust threadpool teardown to prevent SIGSEGV on exit"

This reverts commit 4e66b38.

* fix(llm): reset stale state before non-cached run after prefill

When a prefill run leaves nPast_ > 0 and the next run is a non-cached single-shot, the stale KV cache and dynamic-tools bookkeeping (nPastBeforeTools_, nConversationOnlyTokens_) caused token duplication and incorrect cache trimming. Clear state eagerly when shouldResetAfterInference is true and nPast_ is non-zero.

Made-with: Cursor

* fix(llm): trim stale tool tokens in multi-turn sessions with tools_at_end

When tools_at_end is true and a session continues without explicit save between turns, old tool+response tokens remained in the KV cache. New tool tokens were appended, causing conflicting tool definitions.

Add a guard in processPrompt() that trims from nPastBeforeTools_ to nPast_ before eval when stale tool tokens are detected. Includes new dynamic-tools integration tests covering changing tools, same tools, and single-shot regression.

Made-with: Cursor

* (fix) llamacpp-llm: dynamic tools cache trim, tmp template, debugs

* fix(llm): pass toolsAtEnd flag to context constructors to fix template selection race

The toolsAtEnd flag was set via setToolsAtEnd() after context creation,
but getChatTemplateForModel() was called during construction — always
seeing toolsAtEnd=0 and selecting the wrong Qwen3 template.

Pass the flag through createContext() into TextLlmContext and
MtmdLlmContext constructors so the correct template is selected
from the start. Also restore the conditional template selection
in ChatTemplateUtils that was previously hardcoded.

* feat(llm): strip tool_call/think blocks from re-sent assistant responses

Add stripInternalBlocks() helper to testToolRemoval.js and
benchToolsPlacement.js to remove <tool_call> and <think> blocks
from assistant responses before including them in conversation
history. Prevents model from pattern-matching on old tool calls
and hallucinating removed tools.

Also extend benchToolsPlacement to 20 turns and add HTML chart.
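
A helper of the kind described in this commit could look roughly like the sketch below. The name stripInternalBlocks comes from the commit message, but the body is an assumption: the actual implementation in testToolRemoval.js is not shown here.

```javascript
// Assumed implementation: remove <think>...</think> and
// <tool_call>...</tool_call> blocks (non-greedy, across newlines), then
// trim the leftover whitespace.
function stripInternalBlocks (text) {
  return text
    .replace(/<think>[\s\S]*?<\/think>/g, '')
    .replace(/<tool_call>[\s\S]*?<\/tool_call>/g, '')
    .trim()
}

const reply = '<think>check the weather tool</think>It is sunny.<tool_call>{"name":"get_weather"}</tool_call>'
console.log(stripInternalBlocks(reply))
```

The cleaned text is what would be placed back into the conversation history, so earlier tool calls no longer appear in the prompt.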

* (fix) llamacpp-llm: use correct template in tests

* (chore) llamacpp-llm: move qwen3 cache tests to own file

* (improvement) llamacpp-llm: simplify nPastBeforeTools reset, multi-turn cache tests

* (improvement) llamacpp-llm: simplify nPastBeforeTools tracking, no trim on save

* (chore) llamacpp-llm: remove redundant getters and cleanup

* (internal) llamacpp-llm: run Qwen3 context tests

* (chore) cleanup

* (chore) fix lint errors in examples

* (chore) fix remaining lint errors in benchToolsPlacement

* (chore) fix indentation in benchToolsPlacement ternary

* (chore) llamacpp-llm: remove unused example files

* (chore) remove scratch planning docs

* (doc) llamacpp-llm: tools_at_end param description

* (chore) llamacpp-llm: changelog and version bump

* refactor(llamacpp-llm): address PR #706 review comments

Implement all 10 reviewer requests from PR #706 (jesusmb1995, gianni-cor).

| # | Reviewer | Request | Result |
|---|---------|---------|--------|
| R1 | @jesusmb1995 | Extract DynamicToolsState class | Done - new class in LlmContext.hpp with toolsAtEnd_, nConversationOnlyTokens_, nPastBeforeTools_, recordToolBoundary(), reset() |
| R2 | @jesusmb1995 | Collapse 3 virtual methods into single dynamicToolsState() accessor | Done - removed setToolsAtEnd, getNPastBeforeTools, setNPastBeforeTools virtuals; added dynamicToolsState() non-virtual accessor on base class |
| R3 | @gianni-cor | Remove redundant setToolsAtEnd() after createContext() | Done - removed the 4-line block in LlamaModel::init() |
| R4 | @gianni-cor | Add assert: nConversationOnlyTokens_ <= inputTokens.size() | Done - added in TextLlmContext::tokenizeChat |
| R5 | @gianni-cor | Reset nConversationOnlyTokens_ in TextLlmContext::resetState | Done - both contexts now call dynamicToolsState().reset() which resets both values |
| R6 | @gianni-cor | Guard tools_at_end for non-Qwen3 models | Done - architecture check after config parsing, logs warning and disables flag |
| R7 | @gianni-cor | Fix off-by-A trim error (disable add_generation_prompt) | Done - both TextLlmContext and MtmdLlmContext save/restore add_generation_prompt=false during no-tools tokenization |
| R8 | @gianni-cor | Add cold-start reset in MtmdLlmContext::tokenizeChat | Done - dynamicToolsState().reset() added at cold-start path |
| R9 | @gianni-cor | Cap firstMsgTokens_ after post-eval trim | Done - setFirstMsgTokens(getNPast()) if inflated after trim |
| R10 | @gianni-cor | Remove duplicate toolsAtEnd_ from LlamaModel | Done - runtime code in processPromptImpl queries dynamicToolsState().toolsAtEnd() instead of state_->toolsAtEnd_ |

Made-with: Cursor

* refactor(llamacpp-llm): remove toolsAtEnd_ from ReloadableState, single source of truth in DynamicToolsState

Made-with: Cursor

* fix(llamacpp-llm): use dts.reset() after post-eval trim for full state cleanup

Made-with: Cursor

* (draft) llamacpp-llm: dynamic tools cache tokens test debug

* (internal) llamacpp-llm: dynamic tools token count and cache match test

* Revert "(internal) llamacpp-llm: dynamic tools token count and cache match test"

This reverts commit 181b98a.

* Revert "(draft) llamacpp-llm: dynamic tools cache tokens test debug"

This reverts commit 27e6a5c.

* fix(llamacpp-llm): address PR review comments N3-N8, merge main

N3: Save/restore inputs.use_jinja around no-tools tokenization to
    prevent getPrompt() Jinja fallback from corrupting the flag.
N4: Remove dead Jinja template variables (ns.multi_step_tool,
    ns.last_query_index) from Qwen3ToolsDynamicTemplate.
N5: Add missing assert(conversationOnlyTokens <= totalTokens) in
    MtmdLlmContext::tokenizeChat, matching TextLlmContext.
N6: Document Qwen3-only model support in tools-at-end.md.
N7: Merge duplicate if(nPast_==0 && !isCacheLoaded) blocks in
    TextLlmContext::tokenizeChat.
N8: Remove unnecessary save/restore of inputs.tools and
    inputs.add_generation_prompt (locals not read after).

Also: merge main into feature branch, move dynamic-tools changelog
to separate 0.13.1 entry.

Made-with: Cursor

* style(llamacpp-llm): apply clang-format to all PR-touched C++ files

Made-with: Cursor

* style(llamacpp-llm): fix remaining clang-format-19 brace-init formatting

Made-with: Cursor

* chore: remove accidentally committed binary file

The file packages/ocr-onnx/big_and_clear_watermarks.png was unintentionally staged during merge conflict resolution.

Made-with: Cursor

* chore(llm): bump version to 0.14.0

Made-with: Cursor

* chore: remove working artifacts from feature branch

Made-with: Cursor

* chore: remove accidentally committed sdk model history file

Made-with: Cursor

* doc: add dynamic-tools examples to README

Made-with: Cursor

* fix(llm): reset use_jinja from params_ instead of save/restore

Made-with: Cursor

* fix(llm): reset use_jinja before second getPrompt call

Made-with: Cursor

---------

Co-authored-by: Dmitry Malishev <dmitry.malishev@tether.io>
Co-authored-by: olyasir <sirkinolya@gmail.com>
Co-authored-by: gianni <gianfranco.cordella@tether.io>

* [tetherto/qvac] fix(nmtcpp): fix critical C++ bugs, add lint-cpp, update README (#1071)

* fix(nmtcpp): fix critical C++ bugs, add lint-cpp, update README

- Fix UB: PivotTranslationModel::translateString missing return path
- Fix cancel propagation to sub-models in PivotTranslationModel
- Fix stopTranslation_ flag never reset after cancel
- Fix translateBatch ignoring cancellation flag
- Fix private inheritance of IModelCancel in TranslationModel and
  PivotTranslationModel (enables dynamic_cast from framework)
- Fix typo: "Invalid backed type" -> "Invalid backend type"
- Fix operator precedence in detectBackendType (add explicit parens)
- Add lint-cpp script to package.json
- Update README: fix Bare version mismatch, doc links, pause/resume
  claim, add pivot example, update clone URLs for monorepo, clarify
  Bergamot build flag

Made-with: Cursor

* delete Move Semantics

---------

Co-authored-by: olyasir <sirkinolya@gmail.com>

* chore[notask]: backmerge release @qvac/cli v0.2.2 (#1076)

* chore: trigger CLI release 0.2.2 (#1011)

* doc[notask|skiplog]: add changelog for CLI v0.2.2 (#1013)

* doc[notask|skiplog]: add changelog for CLI v0.2.2

Made-with: Cursor

* fix: preserve existing changelog history

Made-with: Cursor

---------

Co-authored-by: Lauri Piisang <lauri.piisang@gmail.com>

* QVAC-14188: langdetect-text-cld2 ISO 639-3 support (#1078)

feat: cld2 support for ISO 639-1/2/3 code inputs for getting language names

* fix: handle absolute companion model paths in diffusion addon (#1077)

The SDK's resolveConfig() resolves companion model names (clipL, clipG,
t5Xxl, llm, vae) to absolute disk paths. Previously, the addon always
joined these with diskPath, which would produce broken double-joined
paths when given an already-absolute path. Add a resolve() helper that
passes absolute paths through unchanged and only joins relative ones.
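
The behavior described above can be sketched as follows. The commit names the helper resolve(); the name `resolveCompanionPath`, its signature, and the example paths are assumptions made for illustration.

```javascript
const path = require('path')

// Assumed shape of the helper: absolute paths pass through unchanged,
// relative names are joined onto the model directory.
function resolveCompanionPath (diskPath, nameOrPath) {
  return path.isAbsolute(nameOrPath)
    ? nameOrPath
    : path.join(diskPath, nameOrPath)
}

console.log(resolveCompanionPath('/models/sd', 'vae.safetensors'))
console.log(resolveCompanionPath('/models/sd', '/cache/vae.safetensors'))
```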

Co-authored-by: gianni-cor <gianfranco.cordella@tether.io>

* fix: recover content gaps (#1067)

* infra[notask]: extend onnx tts mobile device farm timeouts and run q4/q4f16 matrix (#1075)

* chore: Add fp16 and q4 models in mobile integration tests

* fix: Increase timeout and run q4 and q4f16 models

---------

Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>
Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>

* fix: replace lab results test fixture image (#1063)

Update the DocTR lab results fixture to use the new realistic sample while keeping the original filename for existing test and workflow references.

Made-with: Cursor

Co-authored-by: olyasir <sirkinolya@gmail.com>

* fix: update package.json URLs to monorepo for all packages (#1088)

* fix: update package.json URLs to point to monorepo for LLM, Embed, and Diffusion addons

The repository, bugs, and homepage URLs pointed to old standalone repos
that are either private or non-existent. Update to point to the qvac
monorepo with correct directory fields for npm.

* fix: update package.json URLs to monorepo for nmtcpp, ocr-onnx, and registry-server

Same fix as the previous commit but for the remaining packages with
stale standalone repo URLs.

* fix: add repository and homepage fields to remaining JS packages

Add consistent repository, bugs, and homepage fields pointing to
the monorepo for error, dl-base, dl-filesystem, dl-hyperdrive,
infer-base, langdetect-text, and rag packages.

* fix: add monorepo metadata to remaining packages

Add repository (with directory), bugs, and homepage fields to sdk,
logging, decoder-audio, diagnostics, onnx, tts-onnx, and
langdetect-text-cld2. Fix whispercpp to include directory in
repository and package-scoped homepage.

* fix: add monorepo metadata to cli, registry-client, and registry-schema

Add homepage to cli. Add repository, bugs, and homepage to
registry-client and registry-schema sub-packages.

* feat[notask]: add download profiler for registry blob performance diagnostics (#1040)

* feat[notask]: add download profiler for registry blob performance diagnostics

Made-with: Cursor

* fix: move profiler deps from devDependencies to dependencies

Made-with: Cursor

* doc: add profile command and example to client README

Made-with: Cursor

* fix: show full peer keys in profiler output for troubleshooting

Made-with: Cursor

* fix: validate parseInt results for interval and timeout CLI flags

Made-with: Cursor
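
A validation of this kind might look like the sketch below. `parsePositiveInt` is a hypothetical name, and the rule that both flags must be positive integers is an assumption; the profiler's actual checks are not shown in this log.

```javascript
// parseInt returns NaN for input such as 'abc', and NaN would otherwise
// flow silently into timers. Reject anything that is not a positive integer.
function parsePositiveInt (raw, flagName) {
  const value = Number.parseInt(raw, 10)
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`--${flagName} must be a positive integer, got "${raw}"`)
  }
  return value
}

console.log(parsePositiveInt('5000', 'timeout'))
```

Note that `Number.parseInt('12abc', 10)` still yields 12, so a stricter check (for example, matching the raw string against `/^\d+$/` first) would be needed to reject trailing garbage.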

---------

Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>
Co-authored-by: Simon Iribarren <simon.ig13@gmail.com>

* fix: resolve dependabot alerts for registry-server transitive deps (#1093)

* fix(registry-server): PBKDF2 for passphrase-derived keys (CodeQL #9) (#1065)

* fix(registry-server): derive passphrase keys with PBKDF2

Replace single-pass SHA-256 with PBKDF2-HMAC-SHA256 (310k iterations)
for deterministic test keys; addresses CodeQL js/insufficient-password-hash.
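
For reference, a derivation with these parameters can be written with Node's built-in crypto module as sketched below. The salt value, the 32-byte key length, and the function name are assumptions; only the algorithm and the iteration count come from the commit message.

```javascript
const crypto = require('crypto')

// Assumption: a fixed salt, so the same passphrase always yields the same
// key. That suits deterministic test keys, not user passwords.
const DEMO_SALT = 'example-fixed-salt'

// PBKDF2-HMAC-SHA256 with 310,000 iterations, producing a 32-byte key.
function deriveKeyFromPassphrase (passphrase) {
  return crypto.pbkdf2Sync(passphrase, DEMO_SALT, 310000, 32, 'sha256')
}

console.log(deriveKeyFromPassphrase('test-passphrase').toString('hex'))
```

Compared with a single SHA-256 pass, the iteration count makes each guess against the passphrase far more expensive.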

* chore(registry-server): remove passphrase migration note from guide

---------

Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>

---------

Co-authored-by: Ridwan Taiwo <donriddo@gmail.com>
Co-authored-by: gianni-cor <gianfranco.cordella@tether.io>
Co-authored-by: Giacomo <119889121+GiacomoSorbiWork@users.noreply.github.com>
Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>
Co-authored-by: Juan Pablo Garibotti Arias <juan.arias@bitfinex.com>
Co-authored-by: ogad-tether <omar.gad@tether.io>
Co-authored-by: dev-nid <nidhinpd811@gmail.com>
Co-authored-by: Ishan Vohra <ishanvohra2@gmail.com>
Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>
Co-authored-by: Yury Samarin <yuri.a.samarin@gmail.com>
Co-authored-by: olyasir <sirkinolya@gmail.com>
Co-authored-by: RamazTs <66473301+RamazTs@users.noreply.github.com>
Co-authored-by: Raju Sharma <sharmaraju352@gmail.com>
Co-authored-by: iancris <17702377+iancris@users.noreply.github.com>
Co-authored-by: Mikhail Sotnikov <mialsot@gmail.com>
Co-authored-by: Dmitry Malishev <dmitry.malishev@tether.io>
Co-authored-by: alsrivas <40749307+Alok-Ranjan23@users.noreply.github.com>
Co-authored-by: Simon Iribarren <simon.ig13@gmail.com>
Co-authored-by: Lauri Piisang <lauri.piisang@gmail.com>
Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>
Proletter pushed a commit that referenced this pull request May 24, 2026
* chore: trigger CLI release 0.2.2 (#1011)

* doc[notask|skiplog]: add changelog for CLI v0.2.2 (#1013)

* doc[notask|skiplog]: add changelog for CLI v0.2.2

Made-with: Cursor

* fix: preserve existing changelog history

Made-with: Cursor

---------

Co-authored-by: Lauri Piisang <lauri.piisang@gmail.com>
Proletter added a commit that referenced this pull request May 24, 2026
* fix: fix race condition in LLM example download utility (#1019)

* fix: fix race condition in LLM example download utility

The redirect handler in examples/utils.js called fs.unlink fire-and-forget
then immediately recursed into downloadModel. The recursive call could find
the empty file still on disk (existsSync → true) before unlink completed,
causing an ENOENT crash on the subsequent statSync.

Port the proven download pattern from test/integration/utils.js:
- Wait for unlink callback before recursing on redirect
- Handle 307/308 redirects (HuggingFace uses 302)
- Handle relative redirect URLs
- Use safeResolve/safeReject guards to prevent double settlement
- Add response error handler and fileStream error handler

* fix: use URL constructor for safer redirect resolution


* fix: fix race condition in embed and diffusion download utilities

Port the proven download pattern from the LLM package (PR #1019):
- Wait for fs.unlink callback before recursing on redirect
- Add safeResolve/safeReject guards to prevent double settlement
- Handle 307/308 redirects in embed examples/utils.js
- Add fileStream and response error handlers
- Use URL constructor for safer redirect resolution
- Use close event instead of finish for write completion


---------

Co-authored-by: gianni-cor <gianfranco.cordella@tether.io>

* doc: update README - table of packages - add diffusion and diagnostics - key features - add openAI-compatible API (#1033)

* fix: fix docs build and escape MDX curly braces in errors.mdx and removed randomly created (#1051)

* doc: generate API docs for v0.8.0

* chore[notask]: remove accidentally committed file

* fix: fix docs build and escape MDX curly braces in errors.mdx and removed random

* fix: revert pre-build script

---------

Co-authored-by: Bruno Campana <7632562+BrunoCampana@users.noreply.github.com>

* Fix security issues flagged by CodeQL in TTS package (#1058)

* Updated qvac-lint-cpp to match latest version from original repo (#1064)

* fix: add native job IDs to addon-cpp callbacks (#955)

* fix: preserve addon job ownership across cancel/reuse

Propagate native job IDs through addon-cpp queued callbacks so late cancel events stay attached to the cancelled job. Remove the Parakeet stale-cancel workaround and align Whisper with the shared runtime contract.

Made-with: Cursor

* chore: scope addon-cpp job-id update to 1.1.3

Limit this branch to the shared addon-cpp runtime changes and bump the package to 1.1.3. Follow-up addon consumer updates will land in separate PRs after the registry is updated.

Made-with: Cursor

* fix: move pending job state before unlock

Copy the pending job into local state before releasing the JobRunner mutex so processing and error paths no longer read job_ without synchronization.

Made-with: Cursor

---------

Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>

* Removed overlay ports. Build from registry. (#1066)

* fix: use object config format in nativelog example (#1070)

* QVAC-13813 chore: add int8 parakeet eou and sortformer production registry entries (#1035)

* chore: Add int8 quantised models for Parakeet EOU and Sortformer

* fix: Add links for quantised parakeet models

* fix: Remove tokenizer for int8

---------

Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>
Co-authored-by: Yury Samarin <yuri.a.samarin@gmail.com>

* fix[notask]: resolve code scanning security findings in nmtcpp and ocr-onnx (#1060)

* fix[notask]: resolve code scanning security findings in nmtcpp and ocr-onnx

Fix ReDoS vulnerabilities in indic-processor URL and numeral regexes by
removing nested quantifiers. Fix ReDoS in sacremoses tokenizer protected
patterns by requiring opening quotes to eliminate ambiguous backtracking.
Fix incomplete string replacement in indic_normalize by using global
regex for pipe character substitution. Replace insecure tempfile.mktemp
with NamedTemporaryFile in ocr-onnx benchmark script.

* fix[notask]: resolve polynomial ReDoS in numeral and other patterns

Fix _NUMERAL_PATTERN by replacing ambiguous \d+\.?\d* with
\d+(?:\.\d+)? to eliminate overlapping digit quantifiers.
Fix _OTHER_PATTERN by bounding the prefix to {0,100} to prevent
polynomial backtracking when no separator is found.

* fix[notask]: bound regex quantifiers to eliminate polynomial ReDoS

Replace unbounded \d+ with \d{1,20} and \w+ with \w{1,100} in
_NUMERAL_PATTERN and _OTHER_PATTERN to make backtracking constant-time
regardless of input length. No real-world numeral exceeds 20 digits
and no hashtag/mention exceeds 100 chars.

---------

Co-authored-by: RamazTs <66473301+RamazTs@users.noreply.github.com>

* feat[whisper][notask]: add streaming VAD transcription to whisper addon (#998)

* feat: add streaming VAD transcription to whisper addon

- Add C++ StreamingProcessor with Silero VAD for speech segmentation
- StreamingProcessor runs on its own thread, buffers incoming audio,
  and uses whisper_vad_* APIs to detect speech boundaries
- RAII wrapper (VadSegmentsPtr) for automatic VAD segment cleanup
- Backpressure handling: drop oldest audio when buffer exceeds cap
- JS bindings: startStreaming, appendStreamingAudio, endStreaming
- New error codes for streaming operations (6012-6014)
- Addon state properly reset in response finally handler

Made-with: Cursor

* fix: address PR review comments for whisper streaming VAD

- Replace g_streamingProcessors map with single-processor globals
  (one active streaming job at a time per Gustavo's feedback)
- Wire streaming cleanup into cancel and destroyInstance via
  cancelWithStreaming and destroyInstanceWithStreaming wrappers
- Add StreamingProcessor::cancel() for forceful abort with
  model cancellation and thread join
- Fix stats accumulation: use WhisperModel::process(Input&) void
  overload + takeOutput() so stats accumulate across segments
  instead of resetting per-segment
- Add WhisperModel::prepareForStreaming() to reset stats and
  cancel flag once at session start
- Propagate segment processing errors via hasError_ flag and
  queue exception at stream end
- Add streaming methods to MockedBinding (startStreaming,
  appendStreamingAudio, endStreaming, error simulation)
- Add 6 unit tests covering streaming lifecycle, stats, cancel,
  destroy, error propagation, and concurrent session rejection
- Add example.streaming-vad.js demonstrating runStreaming() API
  with fs.createReadStream as audio source

Made-with: Cursor

---------

Co-authored-by: Raju <raju.sharma>

* QVAC-14357 fix(onnx): Code clean-up and fixes (#1049)

* (feature) llamacpp-llm: dynamic tools (#706)

* (improvement) llamacpp-llm: Qwen3 dynamic tools template

* (improvement) llamacpp-llm: add llm config tools flag

* (improvement) llamacpp-llm: use template based on tools param

* (improvement) llamacpp-llm: count tools token offset with tokenizer

* (improvement) llamacpp-llm: track n-past, run Qwen3 tests, fix reset

* (improvement) llamacpp-llm: save cache with respect to tools flag

* (fix) llamacpp-llm: add Qwen3ToolsDynamicTemplate.cpp to production CMakeLists

The new source file was added to the test CMakeLists but missing from the addon and cli_tool targets, causing an undefined symbol linker error on CI win64 builds.

Made-with: Cursor

* chore: retrigger CI for CMakeLists fix

Made-with: Cursor

* (fix) llamacpp-llm: fix use-after-free SIGSEGV on process exit (linux)

Reorder TextLlmContext members so threadpools are declared before llamaInit_. C++ destroys members in reverse declaration order, so llamaInit_ (which calls llama_free) now runs while threadpools are still alive, preventing use-after-free when llama_free accesses attached threadpool pointers.

Made-with: Cursor

* Revert "(fix) llamacpp-llm: fix use-after-free SIGSEGV on process exit (linux)"

This reverts commit 7d9c237.

* (fix) llamacpp-llm: robust threadpool teardown to prevent SIGSEGV on exit

The ThreadPoolDeleter was doing ggml backend registry lookups during destruction, which is fragile during process teardown when the registry may already be torn down. Additionally, threadpools attached to llama_context could be freed before the context itself, causing use-after-free. Fix: cache ggml_threadpool_free fn pointer at construction time, and add explicit destructor that detaches threadpools before freeing them.

Made-with: Cursor

* Revert "(fix) llamacpp-llm: robust threadpool teardown to prevent SIGSEGV on exit"

This reverts commit 4e66b38.

* fix(llm): reset stale state before non-cached run after prefill

When a prefill run leaves nPast_ > 0 and the next run is a non-cached single-shot, the stale KV cache and dynamic-tools bookkeeping (nPastBeforeTools_, nConversationOnlyTokens_) caused token duplication and incorrect cache trimming. Clear state eagerly when shouldResetAfterInference is true and nPast_ is non-zero.

Made-with: Cursor

* fix(llm): trim stale tool tokens in multi-turn sessions with tools_at_end

When tools_at_end is true and a session continues without explicit save between turns, old tool+response tokens remained in the KV cache. New tool tokens were appended, causing conflicting tool definitions.

Add a guard in processPrompt() that trims from nPastBeforeTools_ to nPast_ before eval when stale tool tokens are detected. Includes new dynamic-tools integration tests covering changing tools, same tools, and single-shot regression.
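
The guard can be pictured with a toy model. This is only a sketch in JavaScript, not the C++ implementation: `ToyCache`, `evalTurn` and the string "tokens" are hypothetical, and the real KV cache is managed by llama.cpp. It assumes the cache layout is conversation tokens followed by tool tokens, with `nPastBeforeTools` marking the boundary.

```javascript
// Toy model of a KV cache as a plain array of tokens (hypothetical names).
class ToyCache {
  constructor () {
    this.tokens = []
    this.nPastBeforeTools = -1 // -1 means no tool boundary recorded yet
  }

  get nPast () {
    return this.tokens.length
  }

  // Evaluate one turn: new conversation tokens first, tool definitions last.
  evalTurn (conversationTokens, toolTokens) {
    // Guard: stale tool (and response) tokens sit in [nPastBeforeTools, nPast).
    if (this.nPastBeforeTools >= 0 && this.nPastBeforeTools < this.nPast) {
      this.tokens.length = this.nPastBeforeTools
    }
    this.tokens.push(...conversationTokens)
    this.nPastBeforeTools = this.nPast // record the new tool boundary
    this.tokens.push(...toolTokens)
  }
}
```

Without the guard, the second turn would append a new tool block after the old one, which is the conflicting-definitions situation described above.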

Made-with: Cursor

* (fix) llamacpp-llm: dynamic tools cache trim, tmp template, debugs

* fix(llm): pass toolsAtEnd flag to context constructors to fix template selection race

The toolsAtEnd flag was set via setToolsAtEnd() after context creation,
but getChatTemplateForModel() was called during construction — always
seeing toolsAtEnd=0 and selecting the wrong Qwen3 template.

Pass the flag through createContext() into TextLlmContext and
MtmdLlmContext constructors so the correct template is selected
from the start. Also restore the conditional template selection
in ChatTemplateUtils that was previously hardcoded.
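
The ordering problem can be modeled in a few lines. This is a JavaScript illustration of the pattern only, not the C++ classes from this change: `ToyContext`, `pickTemplate` and the template names are hypothetical.

```javascript
// Hypothetical template selector; the names are placeholders.
function pickTemplate (toolsAtEnd) {
  return toolsAtEnd ? 'qwen3-tools-dynamic' : 'qwen3-default'
}

class ToyContext {
  // The template is chosen once, during construction.
  constructor (toolsAtEnd = false) {
    this.toolsAtEnd = toolsAtEnd
    this.template = pickTemplate(this.toolsAtEnd)
  }

  // Setting the flag afterwards does not revisit the template choice.
  setToolsAtEnd (value) {
    this.toolsAtEnd = value
  }
}
```

Constructing with `new ToyContext(true)` selects the tools template, whereas `new ToyContext()` followed by `setToolsAtEnd(true)` keeps the default one, which mirrors the bug described above.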

* feat(llm): strip tool_call/think blocks from re-sent assistant responses

Add stripInternalBlocks() helper to testToolRemoval.js and
benchToolsPlacement.js to remove <tool_call> and <think> blocks
from assistant responses before including them in conversation
history. Prevents model from pattern-matching on old tool calls
and hallucinating removed tools.

Also extend benchToolsPlacement to 20 turns and add HTML chart.
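
A minimal sketch of what a helper like stripInternalBlocks() might look like, assuming the blocks are delimited by literal `<think>...</think>` and `<tool_call>...</tool_call>` tags and are never nested. The regular expressions are illustrative and not copied from the example scripts.

```javascript
// Remove <think> and <tool_call> blocks from an assistant response
// before it is re-sent as conversation history.
function stripInternalBlocks (text) {
  return text
    .replace(/<think>[\s\S]*?<\/think>/g, '') // non-greedy, spans newlines
    .replace(/<tool_call>[\s\S]*?<\/tool_call>/g, '')
    .trim()
}
```

The non-greedy `*?` quantifier keeps each match confined to a single block, so visible text between two blocks is preserved.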

* (fix) llamacpp-llm: use correct template in tests

* (chore) llamacpp-llm: move qwen3 cache tests to own file

* (improvement) llamacpp-llm: simplify nPastBeforeTools reset, multi-turn cache tests

* (improvement) llamacpp-llm: simplify nPastBeforeTools tracking, no trim on save

* (chore) llamacpp-llm: remove redundant getters and cleanup

* (internal) llamacpp-llm: run Qwen3 context tests

* (chore) cleanup

* (chore) fix lint errors in examples

* (chore) fix remaining lint errors in benchToolsPlacement

* (chore) fix indentation in benchToolsPlacement ternary

* (chore) llamacpp-llm: remove unused example files

* (chore) remove scratch planning docs

* (doc) llamacpp-llm: tools_at_end param description

* (chore) llamacpp-llm: changelog and version bump

* refactor(llamacpp-llm): address PR #706 review comments

Implement all 10 reviewer requests from PR #706 (jesusmb1995, gianni-cor).

| # | Reviewer | Request | Result |
|---|---------|---------|--------|
| R1 | @jesusmb1995 | Extract DynamicToolsState class | Done - new class in LlmContext.hpp with toolsAtEnd_, nConversationOnlyTokens_, nPastBeforeTools_, recordToolBoundary(), reset() |
| R2 | @jesusmb1995 | Collapse 3 virtual methods into single dynamicToolsState() accessor | Done - removed setToolsAtEnd, getNPastBeforeTools, setNPastBeforeTools virtuals; added dynamicToolsState() non-virtual accessor on base class |
| R3 | @gianni-cor | Remove redundant setToolsAtEnd() after createContext() | Done - removed the 4-line block in LlamaModel::init() |
| R4 | @gianni-cor | Add assert: nConversationOnlyTokens_ <= inputTokens.size() | Done - added in TextLlmContext::tokenizeChat |
| R5 | @gianni-cor | Reset nConversationOnlyTokens_ in TextLlmContext::resetState | Done - both contexts now call dynamicToolsState().reset() which resets both values |
| R6 | @gianni-cor | Guard tools_at_end for non-Qwen3 models | Done - architecture check after config parsing, logs warning and disables flag |
| R7 | @gianni-cor | Fix off-by-A trim error (disable add_generation_prompt) | Done - both TextLlmContext and MtmdLlmContext save/restore add_generation_prompt=false during no-tools tokenization |
| R8 | @gianni-cor | Add cold-start reset in MtmdLlmContext::tokenizeChat | Done - dynamicToolsState().reset() added at cold-start path |
| R9 | @gianni-cor | Cap firstMsgTokens_ after post-eval trim | Done - setFirstMsgTokens(getNPast()) if inflated after trim |
| R10 | @gianni-cor | Remove duplicate toolsAtEnd_ from LlamaModel | Done - runtime code in processPromptImpl queries dynamicToolsState().toolsAtEnd() instead of state_->toolsAtEnd_ |

Made-with: Cursor

* refactor(llamacpp-llm): remove toolsAtEnd_ from ReloadableState, single source of truth in DynamicToolsState

Made-with: Cursor

* fix(llamacpp-llm): use dts.reset() after post-eval trim for full state cleanup

Made-with: Cursor

* (draft) llamacpp-llm: dynamic tools cache tokens test debug

* (internal) llamacpp-llm: dynamic tools token count and cache match test

* Revert "(internal) llamacpp-llm: dynamic tools token count and cache match test"

This reverts commit 181b98a.

* Revert "(draft) llamacpp-llm: dynamic tools cache tokens test debug"

This reverts commit 27e6a5c.

* fix(llamacpp-llm): address PR review comments N3-N8, merge main

N3: Save/restore inputs.use_jinja around no-tools tokenization to
    prevent getPrompt() Jinja fallback from corrupting the flag.
N4: Remove dead Jinja template variables (ns.multi_step_tool,
    ns.last_query_index) from Qwen3ToolsDynamicTemplate.
N5: Add missing assert(conversationOnlyTokens <= totalTokens) in
    MtmdLlmContext::tokenizeChat, matching TextLlmContext.
N6: Document Qwen3-only model support in tools-at-end.md.
N7: Merge duplicate if(nPast_==0 && !isCacheLoaded) blocks in
    TextLlmContext::tokenizeChat.
N8: Remove unnecessary save/restore of inputs.tools and
    inputs.add_generation_prompt (locals not read after).

Also: merge main into feature branch, move dynamic-tools changelog
to separate 0.13.1 entry.

Made-with: Cursor

* style(llamacpp-llm): apply clang-format to all PR-touched C++ files

Made-with: Cursor

* style(llamacpp-llm): fix remaining clang-format-19 brace-init formatting

Made-with: Cursor

* chore: remove accidentally committed binary file

The file packages/ocr-onnx/big_and_clear_watermarks.png was unintentionally staged during merge conflict resolution.

Made-with: Cursor

* chore(llm): bump version to 0.14.0

Made-with: Cursor

* chore: remove working artifacts from feature branch

Made-with: Cursor

* chore: remove accidentally committed sdk model history file

Made-with: Cursor

* doc: add dynamic-tools examples to README

Made-with: Cursor

* fix(llm): reset use_jinja from params_ instead of save/restore

Made-with: Cursor

* fix(llm): reset use_jinja before second getPrompt call

Made-with: Cursor

---------

Co-authored-by: Dmitry Malishev <dmitry.malishev@tether.io>
Co-authored-by: olyasir <sirkinolya@gmail.com>
Co-authored-by: gianni <gianfranco.cordella@tether.io>

* [tetherto/qvac] fix(nmtcpp): fix critical C++ bugs, add lint-cpp, update README (#1071)

* fix(nmtcpp): fix critical C++ bugs, add lint-cpp, update README

- Fix UB: PivotTranslationModel::translateString missing return path
- Fix cancel propagation to sub-models in PivotTranslationModel
- Fix stopTranslation_ flag never reset after cancel
- Fix translateBatch ignoring cancellation flag
- Fix private inheritance of IModelCancel in TranslationModel and
  PivotTranslationModel (enables dynamic_cast from framework)
- Fix typo: "Invalid backed type" -> "Invalid backend type"
- Fix operator precedence in detectBackendType (add explicit parens)
- Add lint-cpp script to package.json
- Update README: fix Bare version mismatch, doc links, pause/resume
  claim, add pivot example, update clone URLs for monorepo, clarify
  Bergamot build flag

Made-with: Cursor

* delete Move Semantics

---------

Co-authored-by: olyasir <sirkinolya@gmail.com>

* chore[notask]: backmerge release @qvac/cli v0.2.2 (#1076)

* chore: trigger CLI release 0.2.2 (#1011)

* doc[notask|skiplog]: add changelog for CLI v0.2.2 (#1013)

* doc[notask|skiplog]: add changelog for CLI v0.2.2

Made-with: Cursor

* fix: preserve existing changelog history

Made-with: Cursor

---------

Co-authored-by: Lauri Piisang <lauri.piisang@gmail.com>

* QVAC-14188: langdetect-text-cld2 ISO 639-3 support (#1078)

feat: cld2 support for ISO 639-1/2/3 code inputs for getting language names
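
As an illustration of accepting all three code families, here is a small sketch. The function name `getLanguageName` and the three-entry table are assumptions for the example, and the package's actual API and data are not shown in this log.

```javascript
// Illustrative subset only. ISO 639-2 can have two codes per language
// (bibliographic and terminologic), hence the array.
const LANGUAGES = [
  { name: 'English', iso1: 'en', iso2: ['eng'], iso3: 'eng' },
  { name: 'French', iso1: 'fr', iso2: ['fre', 'fra'], iso3: 'fra' },
  { name: 'German', iso1: 'de', iso2: ['ger', 'deu'], iso3: 'deu' }
]

// Return the language name for an ISO 639-1/2/3 code, or null if unknown.
function getLanguageName (code) {
  const c = String(code).trim().toLowerCase()
  const hit = LANGUAGES.find(
    (l) => l.iso1 === c || l.iso2.includes(c) || l.iso3 === c
  )
  return hit ? hit.name : null
}
```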

* fix: handle absolute companion model paths in diffusion addon (#1077)

The SDK's resolveConfig() resolves companion model names (clipL, clipG,
t5Xxl, llm, vae) to absolute disk paths. Previously, the addon always
joined these with diskPath, which would produce broken double-joined
paths when given an already-absolute path. Add a resolve() helper that
passes absolute paths through unchanged and only joins relative ones.
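
A minimal sketch of such a helper, assuming Node's `path` module is available (the addon may use a different path module at runtime). The name `resolveModelPath` and its parameter names are hypothetical.

```javascript
const path = require('node:path')

// Pass absolute paths through unchanged; join relative names with diskPath.
function resolveModelPath (diskPath, nameOrPath) {
  return path.isAbsolute(nameOrPath)
    ? nameOrPath
    : path.join(diskPath, nameOrPath)
}
```

With this shape, a companion model given as a bare file name still resolves under `diskPath`, while a path the SDK has already resolved is left alone instead of being joined a second time.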

Co-authored-by: gianni-cor <gianfranco.cordella@tether.io>

* fix: recover content gaps (#1067)

* infra[notask]: extend onnx tts mobile device farm timeouts and run q4/q4f16 matrix (#1075)

* chore: Add fp16 and q4 models in mobile integration tests

* fix: Increase timeout and run q4 and q4f16 models

---------

Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>
Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>

* fix: replace lab results test fixture image (#1063)

Update the DocTR lab results fixture to use the new realistic sample while keeping the original filename for existing test and workflow references.

Made-with: Cursor

Co-authored-by: olyasir <sirkinolya@gmail.com>

* fix: update package.json URLs to monorepo for all packages (#1088)

* fix: update package.json URLs to point to monorepo for LLM, Embed, and Diffusion addons

The repository, bugs, and homepage URLs pointed to old standalone repos
that are either private or non-existent. Update to point to the qvac
monorepo with correct directory fields for npm.

* fix: update package.json URLs to monorepo for nmtcpp, ocr-onnx, and registry-server

Same fix as the previous commit but for the remaining packages with
stale standalone repo URLs.

* fix: add repository and homepage fields to remaining JS packages

Add consistent repository, bugs, and homepage fields pointing to
the monorepo for error, dl-base, dl-filesystem, dl-hyperdrive,
infer-base, langdetect-text, and rag packages.

* fix: add monorepo metadata to remaining packages

Add repository (with directory), bugs, and homepage fields to sdk,
logging, decoder-audio, diagnostics, onnx, tts-onnx, and
langdetect-text-cld2. Fix whispercpp to include directory in
repository and package-scoped homepage.

* fix: add monorepo metadata to cli, registry-client, and registry-schema

Add homepage to cli. Add repository, bugs, and homepage to
registry-client and registry-schema sub-packages.

* feat[notask]: add download profiler for registry blob performance diagnostics (#1040)

* feat[notask]: add download profiler for registry blob performance diagnostics

Made-with: Cursor

* fix: move profiler deps from devDependencies to dependencies

Made-with: Cursor

* doc: add profile command and example to client README

Made-with: Cursor

* fix: show full peer keys in profiler output for troubleshooting

Made-with: Cursor

* fix: validate parseInt results for interval and timeout CLI flags

Made-with: Cursor
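
The kind of check this implies can be sketched as follows. The helper name `parsePositiveInt` and the requirement that the value be strictly positive are assumptions, and the actual CLI code is not shown here.

```javascript
// Parse a CLI flag value as a base-10 integer and reject NaN or non-positive results.
function parsePositiveInt (value, flagName) {
  const n = Number.parseInt(value, 10)
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`Invalid value for ${flagName}: ${value}`)
  }
  return n
}
```

Note that `parseInt` is lenient about trailing characters, so an input such as `10s` parses as `10`. A stricter check would be needed if that should be rejected.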

---------

Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>
Co-authored-by: Simon Iribarren <simon.ig13@gmail.com>

* fix: resolve dependabot alerts for registry-server transitive deps (#1093)

* fix(registry-server): PBKDF2 for passphrase-derived keys (CodeQL #9) (#1065)

* fix(registry-server): derive passphrase keys with PBKDF2

Replace single-pass SHA-256 with PBKDF2-HMAC-SHA256 (310k iterations)
for deterministic test keys; addresses CodeQL js/insufficient-password-hash.
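
A minimal sketch of the derivation using Node's `crypto.pbkdf2Sync`. The iteration count and hash come from the commit message. The function name, the caller-supplied salt and the 32-byte key length are assumptions for illustration.

```javascript
const crypto = require('node:crypto')

const PBKDF2_ITERATIONS = 310000 // "310k iterations" per the commit message
const KEY_LENGTH_BYTES = 32 // assumed key size

// Derive a key from a passphrase with PBKDF2-HMAC-SHA256.
// The same passphrase and salt always yield the same key.
function deriveKeyFromPassphrase (passphrase, salt) {
  return crypto.pbkdf2Sync(
    passphrase,
    salt,
    PBKDF2_ITERATIONS,
    KEY_LENGTH_BYTES,
    'sha256'
  )
}
```

Compared with a single SHA-256 pass, the iteration count makes each passphrase guess far more expensive while keeping the output deterministic for a fixed salt.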

* chore(registry-server): remove passphrase migration note from guide

---------

Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>

---------

Co-authored-by: Ridwan Taiwo <donriddo@gmail.com>
Co-authored-by: gianni-cor <gianfranco.cordella@tether.io>
Co-authored-by: Giacomo <119889121+GiacomoSorbiWork@users.noreply.github.com>
Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>
Co-authored-by: Juan Pablo Garibotti Arias <juan.arias@bitfinex.com>
Co-authored-by: ogad-tether <omar.gad@tether.io>
Co-authored-by: dev-nid <nidhinpd811@gmail.com>
Co-authored-by: Ishan Vohra <ishanvohra2@gmail.com>
Co-authored-by: Ishan Vohra <ishanvohra@Ishans-MacBook-Air.local>
Co-authored-by: Yury Samarin <yuri.a.samarin@gmail.com>
Co-authored-by: olyasir <sirkinolya@gmail.com>
Co-authored-by: RamazTs <66473301+RamazTs@users.noreply.github.com>
Co-authored-by: Raju Sharma <sharmaraju352@gmail.com>
Co-authored-by: iancris <17702377+iancris@users.noreply.github.com>
Co-authored-by: Mikhail Sotnikov <mialsot@gmail.com>
Co-authored-by: Dmitry Malishev <dmitry.malishev@tether.io>
Co-authored-by: alsrivas <40749307+Alok-Ranjan23@users.noreply.github.com>
Co-authored-by: Simon Iribarren <simon.ig13@gmail.com>
Co-authored-by: Lauri Piisang <lauri.piisang@gmail.com>
Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>