Skip to content

QVAC-13559 feat[api]: sdk "dynamic" tools mode#745

Closed
mialso wants to merge 55 commits into
mainfrom
improvement/sdk-dynamic-tools-interface-support
Closed

QVAC-13559 feat[api]: sdk "dynamic" tools mode#745
mialso wants to merge 55 commits into
mainfrom
improvement/sdk-dynamic-tools-interface-support

Conversation

@mialso

@mialso mialso commented Mar 6, 2026

Copy link
Copy Markdown
Contributor

Description

Adds support for dynamic tools mode with @qvac/llm-llamacpp addon

Changes:

  • package.json: Upgraded @qvac/llm-llamacpp to ^0.17.0
  • new toolsMode configuration inteface
    • 'static' mode (default): Tools are prepended to history (existing behavior)
    • 'dynamic' mode: Tools exist in a user prompt scope and then removed from kv-cache

    the rational behind static/dynamic naming is for the consumer (app using sdk) to understand the difference from the usage perspective, while implementation right now uses append/prepend (tools_compact addon param - how cache works). So a possible impl change won't affect public API

  • unit tests, examples

how to check it works:

bun install
bun run build
bun run ./examples/llamacpp-dynamic-tools.ts
bun run ./examples/agentic-tools.ts

addon implementation

@DmitryMalishev DmitryMalishev force-pushed the feature/llm-dynamic-tools branch from 615c9e7 to 1e7b0f7 Compare March 10, 2026 12:44
@olyasir olyasir force-pushed the feature/llm-dynamic-tools branch from 06b0ae7 to c1e85c2 Compare March 13, 2026 09:19
Base automatically changed from feature/llm-dynamic-tools to main March 21, 2026 09:09
Comment thread packages/sdk/examples/llamacpp-dynamic-tools.ts Fixed
Comment thread packages/sdk/examples/llamacpp-dynamic-tools.ts Fixed
Comment thread packages/sdk/examples/llamacpp-dynamic-tools.ts Fixed
Comment thread packages/sdk/examples/llamacpp-dynamic-tools.ts Fixed
Comment thread packages/sdk/examples/llamacpp-dynamic-tools.ts Fixed
@github-actions

github-actions Bot commented Mar 23, 2026

Copy link
Copy Markdown
Contributor

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ❌ PENDING

**Requirements:**
- 1 Team Member approval ❌ (0/1)
- 1 Team Lead OR Management approval ❌ (0/1)



---
*This comment is automatically updated when reviews change.*

@mialso mialso force-pushed the improvement/sdk-dynamic-tools-interface-support branch from 96e46e2 to 5094f21 Compare March 23, 2026 14:02
@mialso mialso marked this pull request as ready for review March 23, 2026 14:03
@mialso mialso requested review from a team as code owners March 23, 2026 14:03
@mialso mialso changed the title (improvement) sdk: dynamic "toolsMode" (improvement) sdk: "dynamic" tools mode Mar 23, 2026
olyasir
olyasir previously approved these changes Mar 23, 2026
@mialso mialso changed the title (improvement) sdk: "dynamic" tools mode QVAC-13559 feat: sdk "dynamic" tools mode Mar 23, 2026
@mialso mialso changed the title QVAC-13559 feat: sdk "dynamic" tools mode QVAC-13559 feat[api]: sdk "dynamic" tools mode Mar 23, 2026
@@ -0,0 +1,202 @@
/* eslint-disable */
// @ts-nocheck

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why these?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@opaninakuffo initially I just copied sdk/examples/llamacpp-native-tools.ts and replaced the logic with "dynamic" tools handling - should remove so to pass both checks?

@mialso mialso Mar 23, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed comments, now both lint and type checks pass ✔️

Comment thread packages/sdk/server/bare/plugins/llamacpp-completion/ops/completion-stream.ts Outdated
properties: Record<string, unknown>;
required?: string[];
};
}>,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have numerous existing tool tests. Can we modify some of those to be dynamic instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've assumed current tools tests cover available logic and if removed it would end with decreased coverage - I mean to remove a test in order to replace with a dynamic tools one it's required to "prove" it's useless, because a newly added "dynamic" tools one would go another code path 🤔
I will take a look if any test do same stuff, if possible to merge

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

improved tests:

  • extended available createToolsTest function to reuse with toolsMode param (removed prev added helper)
  • removed toolModeUnset test case as already covered by e.g. toolsSimpleFunction

Comment thread packages/sdk/examples/llamacpp-dynamic-tools.ts Outdated
@NamelsKing

Copy link
Copy Markdown
Contributor

please update bun.lock

olyasir
olyasir previously approved these changes Mar 25, 2026
const modelConfig = getModelConfig(modelId);
const systemPromptFromHistory = extractSystemPrompt(history);
const configHash = generateConfigHash(systemPromptFromHistory, tools);
const toolsModeForHash = (modelConfig as { toolsMode?: string }).toolsMode;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this the same than the toolsMode constant in the outer scope?

@mialso mialso Mar 25, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simon-iribarren yes, I but I've decided to follow current logic since there are 2 model configs right now and kvCache "branch" has it's own, like

  const modelConfig = getModelConfig(modelId);
  // <...>
  if (kvCache) {
    const modelConfig = getModelConfig(modelId);
    // at this point using toolsMode from the "second" config

hence this is just for consistency - otherwise prob we should refactor to have a single modelConfig?

mialso and others added 7 commits April 24, 2026 17:38
- Remove `CompletionDebugStats` schema and `debugStats` from
  `CompletionRun`, `statsEventSchema`, and `buildStreamResult`.
- Drop the runtime-debug-stats extraction and the
  `// @ts-expect-error test-error` workaround in
  `completion-stream.ts`. Those required an addon-side patch that
  is not in scope here.
- Delete the `agentic-tools.ts` example. It was the only consumer
  of the debug-stats fields and the source of CodeQL findings
  (ReDoS on `<think>`/`<tool_call>`, double-unescape, weak URL
  hostname check). `llamacpp-dynamic-tools.ts` remains as the
  canonical dynamic-tools example.
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

- The PR introduced `validation: "custom"` + a `validator` callback
  in tools tests, but `@tetherto/qvac-test-suite@0.6.0`'s
  `Expectation` union does not include `"custom"`. Add a local
  `ToolsExpectation` extension that augments `Expectation` with
  the custom-validator shape, used by both the test definitions
  and the executor. The `TestDefinition` cast at construction
  keeps the runtime payload compatible with the test framework.
- Two `createToolsTest` call sites (`toolsSimpleFunction`,
  `toolsMultipleFunctions`) passed `["smoke"]` as the 5th
  positional arg, which collides with the `expectation` slot.
  Add the explicit `undefined` between `toolsMode` and the
  `suites` argument so the array reaches the `suites` parameter.
- The executor's `"custom"` branch now invokes the validator
  directly and shapes a `TestResult`, instead of forwarding to
  `ValidationHelpers.validate` (which only handles the built-in
  validation kinds).

Both issues only surfaced once the `test-e2e-smoke` label was
applied — the e2e workflow's tests-qvac typecheck leg is
label-gated, so prior pushes never typechecked these files.
stream?: boolean;
};
const toolsModelId = await this.resources.ensureLoaded("tools");
const resourceDep = p.toolsMode === ToolsModeType.dynamic ? "tools-dynamic" : "tools"
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

The plugin's `transformLlmConfig` was missing the SDK-to-addon
translation for `toolsMode`. Without it, the addon receives
`tools_mode` as a CLI-style flag, which it does not recognize,
and load fails with `commonParamsParse: invalid argument:
--tools-mode`.

This translation existed in commit 2840f4d but was lost during
a later main merge that brought in the new addon constructor
shape (PR #1688). Reinstating the original mapping:

  toolsMode: "dynamic" → tools_compact: "true"
  toolsMode: "static"  → tools_compact: "false"
@github-actions

github-actions Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

QVAC E2E — android⚠️ no results

Config: suite=smoke · filter=(none) · exclude=(none)
View run · Artifacts

The test job did not produce a results artifact. Check the run for job-level failures.

@github-actions

github-actions Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

QVAC E2E — ios⚠️ no results

Config: suite=smoke · filter=(none) · exclude=(none)
View run · Artifacts

The test job did not produce a results artifact. Check the run for job-level failures.

@github-actions

github-actions Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

QVAC E2E — windows — ✅ all tests passed (88/88, 676s)

Config: suite=smoke · filter=(none) · exclude=(none)
View run · Artifacts

@github-actions

github-actions Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

QVAC E2E — linux — ❌ failed

Totals: 87/88 passed · 1 failed · 98.9% · 454s
Config: suite=smoke · filter=(none) · exclude=(none)
View run · Artifacts

Results by section

  • addon: 1/2 ❌

Failed tests

  • addon-logging-during-inference: Inference logging error: Cannot set new job: a job is already set or being processed

@github-actions

github-actions Bot commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

QVAC E2E — macos⚠️ no results

Config: suite=smoke · filter=(none) · exclude=(none)
View run · Artifacts

The test job did not produce a results artifact. Check the run for job-level failures.

Comment on lines +14 to +15
promptTokens: z.number().optional(),
generatedTokens: z.number().optional(),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generatedTokens & promptTokens arent mapped through from the addon, kindly link

Comment on lines -250 to 272
if (!canSlice && savedCount > 0) {
if (savedCount > 0 && savedCount <= history.length) {
cachedMessageCounts.delete(cachePathToUse);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old code deleted stale counts (!canSlice && savedCount > 0 ⇒ savedCount > history.length). New code deletes valid counts (savedCount > 0 && savedCount <= history.length) and leaves stale ones in place. Combined with the fact that savedCount is no longer used for slicing at all, the cachedMessageCounts map has been quietly demoted from "track how much we've cached so we can resend the delta" to "we toggle a bit and then immediately overwrite it via recordCacheSaveCount." Either:

  • the map is now load-bearing only for the missing-cache-file failure path of recordCacheSaveCount, in which case the predicate should match the old "stale" intent (savedCount > history.length), or
  • the slicing-by-savedCount behavior should be restored for the cache-hit path, or
  • the map and recordCacheSaveCount should be removed entirely if they're truly dead.

Kindly document with whichever option as the current state is now contradictory with previous.

@opaninakuffo

Copy link
Copy Markdown
Contributor

Suggestions

from Cursor:

  • packages/sdk/server/bare/plugins/llamacpp-completion/ops/completion-stream.ts:259-279 — static-mode multi-message turns now send only the last message. Previously slice(savedCount) allowed callers to push multiple messages between completions and have them all forwarded. The new branch structure is lastMessages = [lastMsg] for static mode + non-tool/non-user-with-prev-assistant cases. For the canonical 1-msg-per-turn flow this is fine, but if a consumer (or a recovery path after an error) appends [assistant, user] or [user, user] between completions in static mode, only the last one will reach the model. Worth either calling out as an explicit precondition in the docstring above the function, or restoring the slice-from-savedCount behavior for the static path.
  • packages/sdk/server/bare/plugins/llamacpp-completion/ops/completion-stream.ts:430-438toolsModeForHash is dead/duplicate. Inside the if (kvCache) branch, modelConfig shadows the outer same-named variable (both come from getModelConfig(modelId)), and toolsModeForHash re-derives the same value as the outer toolsMode. Just use toolsMode and drop the inner getModelConfig + toolsModeForHash.
  • packages/sdk/server/bare/plugins/llamacpp-completion/ops/completion-stream.ts:418-433historyWithTools is computed even when kvCache is set and unused on that path. In the kvCache branch we pass raw history into prepareMessagesForCache and let addTools re-append. Move the insertToolsIntoHistory call into the else (no-cache) branch only, so we don't pay for the array spread when it's discarded.
  • packages/sdk/server/bare/plugins/llamacpp-completion/ops/completion-stream.ts:266-279 — unnecessary as HistoryMsg casts. history is already typed HistoryMsg[], so history[history.length - 1] as HistoryMsg and (history[i] as HistoryMsg) add noise and silence real type errors if the parameter ever loosens. Drop the casts.
  • packages/sdk/schemas/tools.ts:6-12 — naming for the new const is inconsistent with neighbors. The repo's pattern for value-as-enum is uppercase (VERBOSITY in llamacpp-config.ts, MODEL_TYPES exported from schemas/index.ts) with a separate PascalCase type alongside (ModelType). ToolsModeType as a runtime value with PascalCase + Type suffix reads like a type. Suggest TOOLS_MODE = { static: "static", dynamic: "dynamic" } as const plus type ToolsMode = (typeof TOOLS_MODE)[keyof typeof TOOLS_MODE]. Public API change, but the constant is brand new in this PR so no ecosystem cost.
  • packages/sdk/server/bare/plugins/llamacpp-completion/plugin.ts:62-66tools_compact is now emitted on every llamacpp model load. Because the schema default is toolsMode: "static", transformLlmConfig will set tools_compact: "false" for every model that goes through llmConfigSchema's transform. If the addon treats absent vs. "false" differently (or if some downstream tooling doesn't expect this key), this is a quiet behavior change. Worth confirming with the addon contract and/or only emitting tools_compact when the user explicitly opted into dynamic.
  • packages/sdk/examples/llamacpp-dynamic-tools.ts — clean up before shipping as a public example:
    • runToolInvocationContTest is exported but never invoked, contains commented-out blocks, and duplicates most of runToolInvocationTest. Either delete or rewrite as a focused second example.
    • Trailing commented-out invocations (// using same kvCache for a single session, // await runToolInvocationContTest(...)) should go.
    • Mixes single and double quotes (toolsMode: 'dynamic' next to role: "system"); the rest of examples/ and the SDK use double quotes.
    • tools3 declares parameters: z.object() (no shape arg) — depending on Zod version this either errors at runtime or produces an unintended schema.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test-e2e-smoke Triggers smoke e2e test suite [Currently SDK-only] verify

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants