test[notask]: consolidate tools_compact integration test coverage #1406
Closed
tobi-legan wants to merge 6 commits into
…analysis

JS integration (dynamic-tools.test.js): 7 new tests covering pitch DoD gaps:
- tool_call output verification
- conversation history across tool swap
- A→B→A round-trip
- 5-turn extended session
- many-tools payload
- session save/reload lifecycle
- cancel-mid-generation reuse

C++ unit (test_cache_management_qwen3.cpp): 1 regression test for the reviewer-flagged firstMsgTokens_ inflation bug. It uses a small context to force sliding-window discard after the tools_at_end trim.

Made-with: Cursor
Add 6 break-it tests derived purely from docs, pitch, and README; no implementation code was consulted. Each test represents a real integrator mistake or documentation gap:
- conflicting config (tools_at_end=true + tools=false)
- session disappears between turns
- system message changes mid-conversation
- stale tool_call blocks not stripped from prior response
- tool with empty name
- duplicate tool names in same prompt

Also fix the error handling in the corrupted-session-file test so that an uncaught error no longer crashes the test runner.

All 26 tests pass (109/109 assertions).

Made-with: Cursor
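Three of the integrator mistakes listed above (empty tool name, duplicate tool names, conflicting config) and the stale tool_call case can be illustrated with a small standalone sketch. This is not code from the PR or from the library under test: `findToolListProblems` and `stripToolCalls` are hypothetical helpers, the config shape (string or boolean flags named `tools_compact` and `tools`) is an assumption based on the keys mentioned in this PR, and the `<tool_call>…</tool_call>` delimiters are assumed from the Qwen-style tool-call format.

```javascript
// Hypothetical pre-flight helpers for illustration only; not part of the PR.

// Treats both boolean and string flags as valid (assumed config shape).
const isOn = (v) => v === true || v === 'true';
const isOff = (v) => v === false || v === 'false';

// Returns a list of human-readable problems found in a tool list and config.
function findToolListProblems(tools, config = {}) {
  const problems = [];
  const seen = new Set();
  for (const tool of tools) {
    const name = typeof tool.name === 'string' ? tool.name.trim() : '';
    if (name === '') {
      problems.push('tool with empty name');
    } else if (seen.has(name)) {
      problems.push(`duplicate tool name: ${name}`);
    } else {
      seen.add(name);
    }
  }
  // Conflicting config: compact tool placement requested while tools are off.
  if (isOn(config.tools_compact) && isOff(config.tools)) {
    problems.push('tools_compact enabled while tools disabled');
  }
  return problems;
}

// Removes assumed Qwen-style <tool_call> blocks from a prior assistant response
// before it is placed back into the conversation history.
function stripToolCalls(text) {
  return text.replace(/<tool_call>[\s\S]*?<\/tool_call>/g, '').trim();
}
```

A caller could run these checks before building a prompt and fail fast, rather than relying on the model to cope with a malformed tool list. Whether the library itself rejects, deduplicates, or tolerates such input is exactly what the break-it tests probe.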
- Rename [adversarial] tag to [edge-case] for clarity
- Remove redundant "alternating tools/no-tools across 5 turns" test (covered by existing interleaving + 5-turn session tests)

25 tests remain, 0 redundancy.

Made-with: Cursor
Use const for variables that are never reassigned. Made-with: Cursor
This draft PR is stale because it has been open 21 days and the author has not commented since opening. It is flagged for removal. Remove the stale label or comment on the PR or this will be closed in one day.
This draft PR was closed because it has been stalled for 22 days with no author comment since opening. You can reopen this PR later if it is still necessary.
What problem does this PR solve?
- The tools_compact feature (formerly tools_at_end) had integration tests split across two files with different API patterns.
- dynamic-tools.test.js used the deprecated tools_at_end config key and the removed role:'session' API; its tests were not actually exercising the feature.

How does it solve it?
Consolidates all tools_compact integration test coverage into a single canonical file (tools-compact.test.js), using the correct API (tools_compact config key, cacheKey run option).

Deleted:
- dynamic-tools.test.js (broken: used legacy config key + dead session API)

Added to tools-compact.test.js (12 new tests, 7 existing = 19 total):

How was it tested?
All 19 tests pass locally with a fresh 0.21.0 native build:

Tested on Apple M4 Pro with Qwen3-0.6B-Q8_0.gguf model, tools_compact: 'true'.