dianed-square commented Jan 22, 2026

Summary

This PR documents ML-based prompt injection detection.

Documentation updates:

  • documentation/docs/guides/security/prompt-injection-detection.md:
    • Update default threshold from 0.7 to 0.8
    • Integrate ML-based detection as an option
    • Add "Enhanced Detection with Machine Learning" section
  • documentation/docs/guides/config-files.md:
    • Update SECURITY_PROMPT_THRESHOLD default from 0.7 to 0.8
    • Add SECURITY_PROMPT_CLASSIFIER_ENABLED, SECURITY_PROMPT_CLASSIFIER_ENDPOINT, and SECURITY_PROMPT_CLASSIFIER_TOKEN settings
  • documentation/docs/guides/environment-variables.md:
    • Add all SECURITY_PROMPT_* variables
  • documentation/docs/guides/security/classification-api-spec.md:
    • Make the topic listed again (remove the unlisted flag) and add an info box clarifying that it is for self-hosting use cases
    • Question: Is there a better location for this reference doc?
  • documentation/docs/guides/security/index.mdx:
    • Add card for Classification API Specification
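
For illustration, the settings named above could be read roughly as follows. This is a minimal sketch, not goose's implementation: the variable names and the 0.8 default come from this PR, while the `PromptSecuritySettings` type, the `load_settings` helper, the accepted boolean spellings, the 0.0 to 1.0 range check, the classifier being off by default, and the rule that an endpoint is required when the classifier is enabled are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PromptSecuritySettings:
    """Hypothetical container for the SECURITY_PROMPT_* settings."""
    threshold: float
    classifier_enabled: bool
    classifier_endpoint: Optional[str]
    classifier_token: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> PromptSecuritySettings:
    """Read the settings from a mapping of environment variables."""
    # Default of 0.8 is the value documented in this PR.
    threshold = float(env.get("SECURITY_PROMPT_THRESHOLD", "0.8"))
    # Assumption: the threshold is a score between 0.0 and 1.0.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("SECURITY_PROMPT_THRESHOLD must be between 0.0 and 1.0")

    # Assumption: the classifier is off unless explicitly enabled,
    # and these spellings count as "enabled".
    raw_enabled = env.get("SECURITY_PROMPT_CLASSIFIER_ENABLED", "false")
    enabled = raw_enabled.strip().lower() in {"1", "true", "yes"}

    # Treat empty strings the same as unset values.
    endpoint = env.get("SECURITY_PROMPT_CLASSIFIER_ENDPOINT") or None
    token = env.get("SECURITY_PROMPT_CLASSIFIER_TOKEN") or None

    # Assumption: an enabled classifier is unusable without an endpoint.
    if enabled and endpoint is None:
        raise ValueError(
            "SECURITY_PROMPT_CLASSIFIER_ENDPOINT is required when the classifier is enabled"
        )

    return PromptSecuritySettings(threshold, enabled, endpoint, token)
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise, for example `load_settings({"SECURITY_PROMPT_THRESHOLD": "0.9"})`.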

Type of Change

  • Feature
  • Bug fix
  • Refactor / Code quality
  • Performance improvement
  • Documentation
  • Tests
  • Security fix
  • Build / Release
  • Other (specify below)

AI Assistance

  • This PR was created or reviewed with AI assistance

Testing

None


github-actions bot commented Jan 22, 2026

PR Preview Action v1.6.3
Preview removed because the pull request was closed.
2026-01-22 22:38 UTC

Copilot AI left a comment

Pull request overview

This PR documents the ML-based prompt injection detection feature for goose, updating the default threshold from 0.7 to 0.8 and adding configuration options for using machine learning classifiers.

Changes:

  • Updated default SECURITY_PROMPT_THRESHOLD from 0.7 to 0.8 across all documentation
  • Added ML-based detection section with configuration instructions for classifier endpoints
  • Added new environment variables and config options for ML classifier settings (SECURITY_PROMPT_CLASSIFIER_ENABLED, SECURITY_PROMPT_CLASSIFIER_ENDPOINT, SECURITY_PROMPT_CLASSIFIER_TOKEN)
  • Made Classification API Specification document publicly visible
  • Added Classification API Specification card to security index
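
To make the effect of the default change concrete, here is a small sketch of which confidence scores stop being flagged when the threshold moves from 0.7 to 0.8. It assumes a prompt is flagged when its score is greater than or equal to the threshold; the `is_flagged` helper and the sample scores are hypothetical and are not taken from goose.

```python
OLD_DEFAULT = 0.7  # previous documented default
NEW_DEFAULT = 0.8  # default documented by this PR


def is_flagged(score: float, threshold: float) -> bool:
    # Assumption: flag when the score meets or exceeds the threshold.
    return score >= threshold


# Made-up detection scores, for illustration only.
scores = [0.65, 0.72, 0.79, 0.80, 0.93]

# Scores that the old default flagged but the new default lets through.
only_old = [
    s for s in scores
    if is_flagged(s, OLD_DEFAULT) and not is_flagged(s, NEW_DEFAULT)
]
print(only_old)  # prints [0.72, 0.79]
```

Under this assumption, the higher default trades fewer false positives in the 0.7 to 0.8 band for less sensitivity to borderline prompts.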

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| documentation/docs/guides/security/prompt-injection-detection.md | Updated threshold default, added ML-based detection section with setup instructions |
| documentation/docs/guides/security/index.mdx | Added card for Classification API Specification |
| documentation/docs/guides/security/classification-api-spec.md | Changed from unlisted to publicly visible, updated references from "HuggingFace" to "Hugging Face" |
| documentation/docs/guides/environment-variables.md | Added all SECURITY_PROMPT_* variables with examples |
| documentation/docs/guides/config-files.md | Updated threshold default and added ML classifier config options |

dianed-square merged commit 4578c77 into main on Jan 22, 2026
22 checks passed
dianed-square deleted the docs/ml-based-prompt-injection-detection branch on January 22, 2026 22:34
lifeizhou-ap added a commit that referenced this pull request Jan 22, 2026
* main:
  docs: ml-based prompt injection detection (#6627)
  Strip the audience for compacting (#6646)
  chore(release): release version 1.21.0 (minor) (#6634)
  add collapsable chat nav (#6649)
  fix: capitalize Rust in CONTRIBUTING.md (#6640)
  chore(deps): bump lodash from 4.17.21 to 4.17.23 in /ui/desktop (#6623)
  Vibe mcp apps (#6569)
  Add session forking capability (#5882)
  chore(deps): bump lodash from 4.17.21 to 4.17.23 in /documentation (#6624)
  fix(docs): use named import for globby v13 (#6639)
  PR Code Review (#6043)
  fix(docs): use dynamic import for globby ESM module (#6636)
  chore: trigger CI
  Document tab completion (#6635)
  Install goose-mcp crate dependencies (#6632)
  feat(goose): standardize agent-session-id for session correlation (#6626)
fbalicchia pushed a commit to fbalicchia/goose that referenced this pull request Jan 23, 2026
tlongwell-block added a commit that referenced this pull request Jan 23, 2026
* origin/main:
  Fix GCP Vertex AI global endpoint support for Gemini 3 models (#6187)
  fix: macOS keychain infinite prompt loop    (#6620)
  chore: reduce duplicate or unused cargo deps (#6630)
  feat: codex subscription support (#6600)
  smoke test allow pass for flaky providers (#6638)
  feat: Add built-in skill for goose documentation reference (#6534)
  Native images (#6619)
  docs: ml-based prompt injection detection (#6627)
  Strip the audience for compacting (#6646)
  chore(release): release version 1.21.0 (minor) (#6634)
  add collapsable chat nav (#6649)
  fix: capitalize Rust in CONTRIBUTING.md (#6640)
  chore(deps): bump lodash from 4.17.21 to 4.17.23 in /ui/desktop (#6623)
  Vibe mcp apps (#6569)
  Add session forking capability (#5882)
  chore(deps): bump lodash from 4.17.21 to 4.17.23 in /documentation (#6624)
  fix(docs): use named import for globby v13 (#6639)
  PR Code Review (#6043)
  fix(docs): use dynamic import for globby ESM module (#6636)

# Conflicts:
#	Cargo.lock
#	crates/goose-server/src/routes/session.rs
katzdave added a commit that referenced this pull request Jan 26, 2026
…o dkatz/canonical-context

* 'dkatz/canonical-provider' of github.com:block/goose: (27 commits)
  docs: add Remotion video creation tutorial (#6675)
  docs: export recipe and copy yaml (#6680)
  Test against fastmcp (#6666)
  docs: mid-session changes (#6672)
  Fix MCP elicitation deadlock and improve UX (#6650)
  chore: upgrade to rmcp 0.14.0 (#6674)
  [docs] add MCP-UI to MCP Apps blog (#6664)
  ACP get working dir from args.cwd (#6653)
  Optimise load config in UI (#6662)
  Fix GCP Vertex AI global endpoint support for Gemini 3 models (#6187)
  fix: macOS keychain infinite prompt loop    (#6620)
  chore: reduce duplicate or unused cargo deps (#6630)
  feat: codex subscription support (#6600)
  smoke test allow pass for flaky providers (#6638)
  feat: Add built-in skill for goose documentation reference (#6534)
  Native images (#6619)
  docs: ml-based prompt injection detection (#6627)
  Strip the audience for compacting (#6646)
  chore(release): release version 1.21.0 (minor) (#6634)
  add collapsable chat nav (#6649)
  ...
3 participants