
test[notask]: add behavioral integration tests for tools_compact#2252

Merged
tobi-legan merged 3 commits into main from test/dynamic-tools-coverage
May 26, 2026

@tobi-legan tobi-legan commented May 25, 2026

Contributor

What problem does this PR solve?

  • The tools_compact feature had no integration tests covering tool_call output verification, concurrent run resilience, session lifecycle, evolved schemas, or enum-typed parameters
  • Only 7 basic tests existed (cache arithmetic, prompt validation, prefill)

How does it solve it?

Adds 12 new integration tests to tools-compact.test.js (7 existing + 12 new = 19 total), covering behavioral scenarios that were previously untested.

New tests:

  • Output contains tool_call block when tools are provided
  • Tool_call references correct tool after swap (no stale tool in KV cache)
  • Conversation history preserved after tool swap
  • Extended 5-turn session with mixed tool changes
  • Many tools with complex schemas (5 tools, real agent workload)
  • Session save → destroy → reload → continue with different tools
  • Cancel mid-generation then reuse with tools
  • Large tool payload near context limit (ctx_size=512)
  • Same tool name with evolved schema between turns
  • Concurrent model.run() rejects cleanly and model survives
  • Corrupted session file does not crash model
  • Tool with enum-typed parameters (mirrors SDK dynamic-tools test gap)

How was it tested?

All 19 tests pass locally with a fresh 0.21.0 native build:

# tests = 19/19 pass
# asserts = 89/89 pass
# time = 33611ms
# ok

Tested on Apple M4 Pro with Qwen3-0.6B-Q8_0.gguf model, tools_compact: 'true'.

@tobi-legan tobi-legan changed the title from "test[notask]: add comprehensive integration tests for dynamic tools feature" to "test[notask]: consolidate tools_compact integration test coverage" May 25, 2026
@tobi-legan tobi-legan marked this pull request as ready for review May 25, 2026 17:40
@tobi-legan tobi-legan requested review from a team as code owners May 25, 2026 17:40
@tobi-legan tobi-legan added the "verified" label (Authorize secrets / label-gate in PR workflows) and removed the "verify" label May 25, 2026
Add 12 new integration tests to tools-compact.test.js covering
functional tool-call behavior, session resilience, and edge cases
that were not previously tested.

New coverage:
- tool_call output verification
- correct tool referenced after swap
- conversation history preserved after swap
- extended 5-turn session with mixed tool changes
- many tools with complex schemas
- session save/reload lifecycle
- cancel mid-generation resilience
- large tool payload near context limit
- evolved schema between turns
- concurrent model.run() rejection
- corrupted session file resilience
- enum-typed parameters (mirrors SDK gap)

All 19 tests pass locally (89/89 assertions).

Co-authored-by: Cursor <cursoragent@cursor.com>
github-actions Bot commented May 25, 2026

Contributor

Mobile integration tests — @qvac/llm-llamacpp (iOS)

Result: passed

| Metric | Value |
| --- | --- |
| Devices passed | 26 |
| Devices failed | 0 |
| Test cases total | 78 |
| Test cases passed | 78 |
| Test cases failed | 0 |
| Test cases skipped | 0 |

github-actions Bot commented May 25, 2026

Contributor

Mobile integration tests — @qvac/llm-llamacpp (Android)

Result: passed

| Metric | Value |
| --- | --- |
| Devices passed | 9 |
| Devices failed | 0 |
| Test cases total | 27 |
| Test cases passed | 27 |
| Test cases failed | 0 |
| Test cases skipped | 0 |

github-actions Bot commented May 26, 2026

Contributor

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

kinsta Bot commented May 26, 2026


Preview deployments for qvac-docs-staging ⚡️

| Status | Branch preview | Commit preview |
| --- | --- | --- |
| 🔁 Deploying... | N/A | N/A |

Commit: fcfb4c68914776af582214673e8c8d3181863874

Deployment ID: 150ed612-083d-4b7b-b041-b31e9da34c3c

Static site name: qvac-docs-staging-fazwv

@tobi-legan

Contributor Author

/review
