Different approach to determining final confidence level of prompt injection evaluation outcomes #6729

dorien-koelemeijer · 2026-01-27T07:52:50Z

Summary

Simplifies the logic for combining tool and context confidence scores. The previous approach was slightly too aggressive in tuning down false positives: it zeroed out confidence in certain cases. This refactor takes a more balanced, threshold-based approach that applies dampening/boosting rules to reduce false positives.

We'll need to do some user testing to determine whether this approach is solid or whether we need to tweak this slightly later on.
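
To illustrate the idea of threshold-based dampening/boosting, here is a minimal sketch. It is not the code from this PR: the constants `HIGH`, `LOW`, `DAMPEN`, and `BOOST`, the function signature, and the exact rules are all assumptions chosen for illustration. The point it shows is that a disagreeing context score scales the tool score down instead of zeroing it out.

```rust
// Hypothetical thresholds and factors; the values used in the PR are not shown here.
const HIGH: f32 = 0.8; // at or above this, a signal counts as "confident malicious"
const LOW: f32 = 0.3; // below this, a signal counts as "likely benign"
const DAMPEN: f32 = 0.7; // scale-down factor when context disagrees with the tool
const BOOST: f32 = 1.15; // scale-up factor when both signals agree

/// Combine the tool-level and context-level confidences into one score in [0, 1].
/// `context` is `None` when no conversation-context result is available.
/// (Signature is an assumption for this sketch.)
fn combine_confidences(tool: f32, context: Option<f32>) -> f32 {
    let tool = tool.clamp(0.0, 1.0);
    let combined = match context {
        // No context signal: pass the tool score through unchanged.
        None => tool,
        Some(ctx) => {
            let ctx = ctx.clamp(0.0, 1.0);
            if tool >= HIGH && ctx >= HIGH {
                // Both signals agree on "malicious": boost the stronger one.
                tool.max(ctx) * BOOST
            } else if tool >= HIGH && ctx < LOW {
                // Context looks benign: dampen rather than zero out.
                tool * DAMPEN
            } else {
                // Mixed or weak signals: keep the tool score as-is.
                tool
            }
        }
    };
    // Boosting can overshoot 1.0, so clamp the final value.
    combined.clamp(0.0, 1.0)
}
```

With these assumed values, a tool score of 0.9 paired with a context score of 0.1 is dampened to roughly 0.63 rather than dropped to zero, so a later threshold check can still decide either way.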

Type of Change

AI Assistance

This PR was created or reviewed with AI assistance

Testing

Quick local testing, but I can only test so much that way - will need broader user testing to determine if this is actually an improvement or if things need to be tweaked further.

Copilot

Pull request overview

This PR adjusts how the prompt-injection scanner combines the tool-level and conversation-context classifier outputs to derive a final security confidence score and logging details. The goal appears to be a more nuanced confidence-combination heuristic and richer logging while keeping the external ScanResult interface stable.

Changes:

Replace context-aware result selection (select_result_with_context_awareness) with a new numeric combination heuristic in combine_confidences, using both tool and context confidences.
Update logging in analyze_tool_call_with_context to structured fields, including per-signal confidences, presence of ML and pattern matches, and the effective malicious decision.
Build the final ScanResult from a synthetic DetailedScanResult that uses the combined confidence along with the tool’s pattern matches and ML confidence.
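
The last two changes can be sketched roughly as follows. This is a hedged illustration only: the struct fields, the `build_final_result` helper, the `threshold` parameter, and the log field names are assumptions, not the definitions in `scanner.rs`, and a plain formatted string stands in for whatever structured logging the scanner actually uses.

```rust
/// Hypothetical stand-in for the scanner's detailed result; field names are assumptions.
#[derive(Debug, Clone)]
struct DetailedScanResult {
    confidence: f32,
    pattern_matches: Vec<String>,
    ml_confidence: Option<f32>,
}

/// Hypothetical stand-in for the externally visible result.
#[derive(Debug)]
struct ScanResult {
    is_malicious: bool,
    confidence: f32,
    explanation: String,
}

/// Build the final result from a synthetic detailed result that carries the
/// combined confidence plus the tool's pattern matches and ML confidence.
/// Also returns a key=value log line with the per-signal fields.
fn build_final_result(
    tool: &DetailedScanResult,
    context_confidence: Option<f32>,
    combined: f32,
    threshold: f32,
) -> (ScanResult, String) {
    // Synthetic result: combined score, but the tool's own evidence.
    let synthetic = DetailedScanResult {
        confidence: combined,
        pattern_matches: tool.pattern_matches.clone(),
        ml_confidence: tool.ml_confidence,
    };
    let is_malicious = synthetic.confidence >= threshold;

    // Stand-in for structured log fields (assumed names).
    let log_line = format!(
        "tool_confidence={:.2} context_confidence={:?} combined_confidence={:.2} \
         has_ml={} has_patterns={} is_malicious={}",
        tool.confidence,
        context_confidence,
        synthetic.confidence,
        synthetic.ml_confidence.is_some(),
        !synthetic.pattern_matches.is_empty(),
        is_malicious
    );

    let explanation = if synthetic.pattern_matches.is_empty() {
        String::from("No pattern matches")
    } else {
        format!("Pattern matches: {}", synthetic.pattern_matches.join(", "))
    };

    let result = ScanResult {
        is_malicious,
        confidence: synthetic.confidence,
        explanation,
    };
    (result, log_line)
}
```

Keeping the combination step separate from result construction means callers still see only a `ScanResult`, which matches the stated goal of leaving the external interface stable.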

crates/goose/src/security/scanner.rs

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

michaelneale

seems good

* main: (30 commits) Different approach to determining final confidence level of prompt injection evaluation outcomes (#6729) fix: read_resource_tool deadlock causing test_compaction to hang (#6737) Upgrade error handling (#6747) Fix/filter audience 6703 local (#6773) chore: re-sync package-lock.json (#6783) upgrade electron to 39.3.0 (#6779) allow skipping providers in test_providers.sh (#6778) fix: enable custom model entry for OpenRouter provider (#6761) Remove codex skills flag support (#6775) Improve mcp test (#6671) Feat/anthropic custom headers (#6774) Fix/GitHub copilot error handling 5845 (#6771) fix(ui): respect width parameter in MCP app size-changed notifications (#6376) fix: address compilation issue in main (#6776) Upgrade GitHub Actions for Node 24 compatibility (#6699) fix(google): preserve thought signatures in streaming responses (#6708) added reduce motion support for css animations and streaming text (#6551) fix: Re-enable subagents for Gemini models (#6513) fix(google): use parametersJsonSchema for full JSON Schema support (#6555) fix: respect GOOSE_CLI_MIN_PRIORITY for shell streaming output (#6558) ...

* 'main' of github.com:block/goose: (62 commits) Swap canonical model from openrouter to models.dev (#6625) Hook thinking status (#6815) Fetch new skills hourly (#6814) copilot instructions: Update "No prerelease docs" instruction (#6795) refactor: centralize audience filtering before providers receive messages (#6728) update doc to remind contributors to activate hermit and document minimal npm and node version (#6727) nit: don't spit out compaction when in term mode as it fills up the screen (#6799) fix: correct tool support detection in Tetrate provider model fetching (#6808) Session manager fixes (#6809) fix(desktop): handle quoted paths with spaces in extension commands (#6430) fix: we can default gooseignore without writing it out (#6802) fix broken link (#6810) docs: add Beads MCP extension tutorial (#6792) feat(goose): add support for AWS_BEARER_TOKEN_BEDROCK environment variable (#6739) [docs] Add OSS Skills Marketplace (#6752) feat: make skills available in codemode (#6763) Fix: Recipe Extensions Not Loading in Desktop (#6777) Different approach to determining final confidence level of prompt injection evaluation outcomes (#6729) fix: read_resource_tool deadlock causing test_compaction to hang (#6737) Upgrade error handling (#6747) ...

…sion-session * 'main' of github.com:block/goose: (78 commits) copilot instructions: Update "No prerelease docs" instruction (#6795) refactor: centralize audience filtering before providers receive messages (#6728) update doc to remind contributors to activate hermit and document minimal npm and node version (#6727) nit: don't spit out compaction when in term mode as it fills up the screen (#6799) fix: correct tool support detection in Tetrate provider model fetching (#6808) Session manager fixes (#6809) fix(desktop): handle quoted paths with spaces in extension commands (#6430) fix: we can default gooseignore without writing it out (#6802) fix broken link (#6810) docs: add Beads MCP extension tutorial (#6792) feat(goose): add support for AWS_BEARER_TOKEN_BEDROCK environment variable (#6739) [docs] Add OSS Skills Marketplace (#6752) feat: make skills available in codemode (#6763) Fix: Recipe Extensions Not Loading in Desktop (#6777) Different approach to determining final confidence level of prompt injection evaluation outcomes (#6729) fix: read_resource_tool deadlock causing test_compaction to hang (#6737) Upgrade error handling (#6747) Fix/filter audience 6703 local (#6773) chore: re-sync package-lock.json (#6783) upgrade electron to 39.3.0 (#6779) ...

Copilot AI review requested due to automatic review settings January 27, 2026 07:52

dorien-koelemeijer changed the title ~~Different approach to determining final confidence level~~ Different approach to determining final confidence level [WIP] Jan 27, 2026

Copilot started reviewing on behalf of dorien-koelemeijer January 27, 2026 07:53

Copilot AI reviewed Jan 27, 2026

crates/goose/src/security/scanner.rs (resolved)

crates/goose/src/security/scanner.rs (resolved)

dorien-koelemeijer changed the title ~~Different approach to determining final confidence level [WIP]~~ Different approach to determining final confidence level of prompt injection evaluation outcomes Jan 27, 2026

Copilot AI review requested due to automatic review settings January 27, 2026 08:27

Copilot started reviewing on behalf of dorien-koelemeijer January 27, 2026 08:27

Different approach to determining final confidence level

95465f8

dorien-koelemeijer force-pushed the feat/improve-prompt-injection-final-confidence-determination branch from c098b4e to 95465f8 Compare January 27, 2026 08:28

Copilot AI reviewed Jan 27, 2026


michaelneale approved these changes Jan 29, 2026


dorien-koelemeijer merged commit 66e4a1a into main Jan 29, 2026
17 checks passed

dorien-koelemeijer deleted the feat/improve-prompt-injection-final-confidence-determination branch January 29, 2026 00:41

This was referenced Jan 29, 2026

chore(release): release version 1.22.0 (minor) #6812

Closed

chore(release): release version 1.22.0 (minor) #6813

Open

3 participants