docs: ml-based prompt injection detection #6627

dianed-square · 2026-01-22T03:01:34Z

Summary

This PR documents ML-based prompt injection detection.

Documentation updates:

documentation/docs/guides/security/prompt-injection-detection.md:
- Update default threshold from 0.7 to 0.8
- Integrate ML-based detection as an option
- Add "Enhanced Detection with Machine Learning" section
documentation/docs/guides/config-files.md:
- Update SECURITY_PROMPT_THRESHOLD default from 0.7 to 0.8
- Add SECURITY_PROMPT_CLASSIFIER_ENABLED, SECURITY_PROMPT_CLASSIFIER_ENDPOINT, and SECURITY_PROMPT_CLASSIFIER_TOKEN settings
documentation/docs/guides/environment-variables.md:
- Add all SECURITY_PROMPT variables
documentation/docs/guides/security/classification-api-spec.md:
- Make the topic publicly listed (it was previously unlisted) and add an info box clarifying that this is for self-hosting use cases
- Question: Is there a better location for this reference doc?
documentation/docs/guides/security/index.mdx:
- Add card for Classification API Specification
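As a rough illustration of how the settings listed above might fit together, here is a minimal Python sketch. The four variable names and the 0.8 default come from this PR; everything else is an assumption made for the example, not goose's actual behavior: the helper names `load_prompt_security_settings` and `is_flagged`, the `"true"`/`"false"` parsing, the classifier being disabled by default, and the `>=` comparison against the threshold.

```python
import os

# Default documented in this PR (previously 0.7).
DEFAULT_THRESHOLD = 0.8


def load_prompt_security_settings(env=None):
    """Collect the SECURITY_PROMPT_* settings from a mapping (hypothetical helper).

    Assumption: the boolean flag is spelled "true"/"false" and defaults to off.
    """
    env = os.environ if env is None else env
    enabled_raw = env.get("SECURITY_PROMPT_CLASSIFIER_ENABLED", "false")
    return {
        "threshold": float(env.get("SECURITY_PROMPT_THRESHOLD", DEFAULT_THRESHOLD)),
        "classifier_enabled": enabled_raw.strip().lower() == "true",
        "classifier_endpoint": env.get("SECURITY_PROMPT_CLASSIFIER_ENDPOINT"),
        "classifier_token": env.get("SECURITY_PROMPT_CLASSIFIER_TOKEN"),
    }


def is_flagged(score, settings):
    """Assumption: a score at or above the threshold counts as an injection."""
    return score >= settings["threshold"]


if __name__ == "__main__":
    settings = load_prompt_security_settings()
    print(settings["threshold"], settings["classifier_enabled"])
```

Under these assumptions, raising the default from 0.7 to 0.8 means a score of 0.75 that was previously flagged would now pass, which is the practical effect of the threshold change.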

Type of Change

AI Assistance

This PR was created or reviewed with AI assistance

Testing

None

To see the specific tasks where the Asana app for GitHub is being used, see below:
- https://app.asana.com/0/0/1212876649901503

github-actions · 2026-01-22T03:04:40Z

PR Preview Action v1.6.3
Preview removed because the pull request was closed.
2026-01-22 22:38 UTC

Copilot

Pull request overview

This PR documents the ML-based prompt injection detection feature for goose, updating the default threshold from 0.7 to 0.8 and adding configuration options for using machine learning classifiers.

Changes:

- Updated default SECURITY_PROMPT_THRESHOLD from 0.7 to 0.8 across all documentation
- Added ML-based detection section with configuration instructions for classifier endpoints
- Added new environment variables and config options for ML classifier settings (SECURITY_PROMPT_CLASSIFIER_ENABLED, SECURITY_PROMPT_CLASSIFIER_ENDPOINT, SECURITY_PROMPT_CLASSIFIER_TOKEN)
- Made Classification API Specification document publicly visible
- Added Classification API Specification card to security index

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `documentation/docs/guides/security/prompt-injection-detection.md` | Updated threshold default, added ML-based detection section with setup instructions |
| `documentation/docs/guides/security/index.mdx` | Added card for Classification API Specification |
| `documentation/docs/guides/security/classification-api-spec.md` | Changed from unlisted to publicly visible, updated references from "HuggingFace" to "Hugging Face" |
| `documentation/docs/guides/environment-variables.md` | Added all `SECURITY_PROMPT_*` variables with examples |
| `documentation/docs/guides/config-files.md` | Updated threshold default and added ML classifier config options |


* main: docs: ml-based prompt injection detection (#6627) Strip the audience for compacting (#6646) chore(release): release version 1.21.0 (minor) (#6634) add collapsable chat nav (#6649) fix: capitalize Rust in CONTRIBUTING.md (#6640) chore(deps): bump lodash from 4.17.21 to 4.17.23 in /ui/desktop (#6623) Vibe mcp apps (#6569) Add session forking capability (#5882) chore(deps): bump lodash from 4.17.21 to 4.17.23 in /documentation (#6624) fix(docs): use named import for globby v13 (#6639) PR Code Review (#6043) fix(docs): use dynamic import for globby ESM module (#6636) chore: trigger CI Document tab completion (#6635) Install goose-mcp crate dependencies (#6632) feat(goose): standardize agent-session-id for session correlation (#6626)

Signed-off-by: fbalicchia <[email protected]>

* origin/main: Fix GCP Vertex AI global endpoint support for Gemini 3 models (#6187) fix: macOS keychain infinite prompt loop (#6620) chore: reduce duplicate or unused cargo deps (#6630) feat: codex subscription support (#6600) smoke test allow pass for flaky providers (#6638) feat: Add built-in skill for goose documentation reference (#6534) Native images (#6619) docs: ml-based prompt injection detection (#6627) Strip the audience for compacting (#6646) chore(release): release version 1.21.0 (minor) (#6634) add collapsable chat nav (#6649) fix: capitalize Rust in CONTRIBUTING.md (#6640) chore(deps): bump lodash from 4.17.21 to 4.17.23 in /ui/desktop (#6623) Vibe mcp apps (#6569) Add session forking capability (#5882) chore(deps): bump lodash from 4.17.21 to 4.17.23 in /documentation (#6624) fix(docs): use named import for globby v13 (#6639) PR Code Review (#6043) fix(docs): use dynamic import for globby ESM module (#6636) # Conflicts: # Cargo.lock # crates/goose-server/src/routes/session.rs

…o dkatz/canonical-context * 'dkatz/canonical-provider' of github.com:block/goose: (27 commits) docs: add Remotion video creation tutorial (#6675) docs: export recipe and copy yaml (#6680) Test against fastmcp (#6666) docs: mid-session changes (#6672) Fix MCP elicitation deadlock and improve UX (#6650) chore: upgrade to rmcp 0.14.0 (#6674) [docs] add MCP-UI to MCP Apps blog (#6664) ACP get working dir from args.cwd (#6653) Optimise load config in UI (#6662) Fix GCP Vertex AI global endpoint support for Gemini 3 models (#6187) fix: macOS keychain infinite prompt loop (#6620) chore: reduce duplicate or unused cargo deps (#6630) feat: codex subscription support (#6600) smoke test allow pass for flaky providers (#6638) feat: Add built-in skill for goose documentation reference (#6534) Native images (#6619) docs: ml-based prompt injection detection (#6627) Strip the audience for compacting (#6646) chore(release): release version 1.21.0 (minor) (#6634) add collapsable chat nav (#6649) ...

ml-based prompt injection detection

539552e

dianed-square requested a review from a team as a code owner January 22, 2026 03:01

dianed-square requested review from Copilot and dorien-koelemeijer January 22, 2026 03:01

Copilot started reviewing on behalf of dianed-square January 22, 2026 03:02

blackgirlbytes approved these changes Jan 22, 2026


Copilot AI reviewed Jan 22, 2026


documentation/docs/guides/security/index.mdx (resolved)

documentation/docs/guides/security/classification-api-spec.md (resolved)

fix card layout

733f1db

dianed-square merged commit 4578c77 into main Jan 22, 2026
22 checks passed

dianed-square deleted the docs/ml-based-prompt-injection-detection branch January 22, 2026 22:34

fbalicchia pushed a commit to fbalicchia/goose that referenced this pull request Jan 23, 2026

docs: ml-based prompt injection detection (block#6627)

fb3ce8f

Signed-off-by: fbalicchia <[email protected]>

This was referenced Jan 27, 2026

chore(release): release version 1.22.0 (minor) #6725

Closed

chore(release): release version 1.22.0 (minor) salmanmkc/goose#1

Open

chore(release): release version 1.22.0 (minor) #6812

Closed

chore(release): release version 1.22.0 (minor) #6813

Open
