docs: ml-based prompt injection detection #6627
Pull request overview
This PR documents the ML-based prompt injection detection feature for goose, updating the default threshold from 0.7 to 0.8 and adding configuration options for using machine learning classifiers.
Changes:
- Updated default `SECURITY_PROMPT_THRESHOLD` from 0.7 to 0.8 across all documentation
- Added ML-based detection section with configuration instructions for classifier endpoints
- Added new environment variables and config options for ML classifier settings (`SECURITY_PROMPT_CLASSIFIER_ENABLED`, `SECURITY_PROMPT_CLASSIFIER_ENDPOINT`, `SECURITY_PROMPT_CLASSIFIER_TOKEN`)
- Made Classification API Specification document publicly visible
- Added Classification API Specification card to security index
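To make the effect of the threshold change concrete, here is a minimal Python sketch of how the settings listed above might be read and applied. It is not goose's implementation: the `PromptSecuritySettings` type, the `load_settings` and `is_flagged` helpers, the boolean parsing, the 0.0 to 1.0 range check, and the `>=` comparison are all assumptions made for illustration. Only the variable names and the 0.8 default come from this PR.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Default stated in this PR (previously 0.7).
DEFAULT_THRESHOLD = 0.8


@dataclass(frozen=True)
class PromptSecuritySettings:
    """Hypothetical container for the SECURITY_PROMPT_* settings."""
    threshold: float
    classifier_enabled: bool
    classifier_endpoint: Optional[str]
    classifier_token: Optional[str]


def load_settings(env: Mapping[str, str]) -> PromptSecuritySettings:
    """Build settings from an environment-like mapping (e.g. os.environ)."""
    raw = env.get("SECURITY_PROMPT_THRESHOLD")
    threshold = float(raw) if raw is not None else DEFAULT_THRESHOLD
    # Assumption: the threshold is a confidence score between 0 and 1.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold out of range: {threshold}")

    # Assumption: the flag accepts common truthy spellings.
    enabled_raw = env.get("SECURITY_PROMPT_CLASSIFIER_ENABLED", "false")
    enabled = enabled_raw.strip().lower() in {"1", "true", "yes"}

    return PromptSecuritySettings(
        threshold=threshold,
        classifier_enabled=enabled,
        classifier_endpoint=env.get("SECURITY_PROMPT_CLASSIFIER_ENDPOINT"),
        classifier_token=env.get("SECURITY_PROMPT_CLASSIFIER_TOKEN"),
    )


def is_flagged(score: float, settings: PromptSecuritySettings) -> bool:
    """Assumption: a score at or above the threshold counts as an injection."""
    return score >= settings.threshold
```

Under these assumptions, a detection score of 0.75 would have been flagged with the old 0.7 default but passes with the new 0.8 default, so raising the threshold trades some sensitivity for fewer false positives.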
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| `documentation/docs/guides/security/prompt-injection-detection.md` | Updated threshold default, added ML-based detection section with setup instructions |
| `documentation/docs/guides/security/index.mdx` | Added card for Classification API Specification |
| `documentation/docs/guides/security/classification-api-spec.md` | Changed from unlisted to publicly visible, updated references from "HuggingFace" to "Hugging Face" |
| `documentation/docs/guides/environment-variables.md` | Added all `SECURITY_PROMPT_*` variables with examples |
| `documentation/docs/guides/config-files.md` | Updated threshold default and added ML classifier config options |
Summary
This PR documents ML-based prompt injection detection.
Documentation updates:
- `documentation/docs/guides/security/prompt-injection-detection.md`:
- `documentation/docs/guides/config-files.md`:
  - `SECURITY_PROMPT_THRESHOLD` default from 0.7 to 0.8
  - `SECURITY_PROMPT_CLASSIFIER_ENABLED`, `SECURITY_PROMPT_CLASSIFIER_ENDPOINT`, and `SECURITY_PROMPT_CLASSIFIER_TOKEN` settings
- `documentation/docs/guides/environment-variables.md`:
  - `SECURITY_PROMPT` variables
- `documentation/docs/guides/security/classification-api-spec.md`:
- `documentation/docs/guides/security/index.mdx`:

Type of Change
AI Assistance
Testing
None