-
Notifications
You must be signed in to change notification settings - Fork 2.3k
[Security] Implement Security Option for Handling Tool-Call Chains #4691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Security] Implement Security Option for Handling Tool-Call Chains #4691
Conversation
DOsinga
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks great, there's a lot of debugging logs in there that presumably we don't need anymore, can you clear those up? same for the LLM style comments
| use serde_json::Value; | ||
|
|
||
| /// Result of a security scan. | ||
| #[derive(Debug, Clone)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove
crates/goose/src/security/scanner.rs
Outdated
| } | ||
|
|
||
| /// Get threshold from config | ||
| /// Get the maliciousness threshold from config, or use default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you're not happy with the current comment, rename the function (drop from_config, I don't think the caller cares where we store this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah ez
crates/goose/src/security/scanner.rs
Outdated
| } | ||
|
|
||
| /// Scan with prompt injection model (legacy method name for compatibility) | ||
| /// Scan text for prompt injection (legacy compatibility). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ugh, we should have caught this the first place. remove this comment - legacy
| disabled_secondary_tool_list: &[String], | ||
| ) -> bool { | ||
| let tool_name = tool_call.name.as_str(); | ||
| if !disabled_secondary_tool_list.iter().any(|t| t == tool_name) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd consider having the caller check this; that would leave this function with just one job, makes naming easier too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of just making it fully self-contained (IE it "just works").
I may split this into into a function though to make it cleaner/improve readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually looking at it, we re-use the tool_name var quite a lot. So I think it's really a call on if the caller should do it vs the function.
I personally like the function having the check but happy to change if you feel strongly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should keep the toolname of course. I can go either way on it, but the function is called is_secondary_tool_violation_single, not is_secondary_tool_violation_single_if_enabled
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're going through the message twice here, I think you can do it in one go like:
for msg in messages.iter().rev():
if isUsere(msg): break
if toolCallName(msg) != toolCallName: return true
return true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, we have effective_role which will return tool for tool messages
|
config.yaml makes sense as we already have it for rules for commands, but in future I wonder if people will want to build a distro with these more role-based (ie protect people from themselves) but I digress... looking good so far! |
|
LGTM - but will want to check if this will be a problem with config.yaml: change here #4651 - does this imply the config section in config.yaml is changing cc @dorien-koelemeijer ? (ie no security: section?) |
Yeah, perhaps a comment on this whether we could also configure all of this through the UI settings and keep the config as all other config (with underscores)? See #4651 @shellz-n-stuff |
|
@dorien-koelemeijer, I can make the updates on this branch after your PR gets merged? Not in a mega rush here! |
Just wondering whether you wanted me to update/rename any config items or not? |
| /// Get threshold from config | ||
| pub fn get_threshold_from_config(&self) -> f32 { | ||
| /// Get the confidence threshold from config, or use default. | ||
| pub fn get_confidence_threshold_from_config(&self) -> f32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we'd just call this get_confidence_threshold and drop the comment?
|
Just adding a comment for visibility instead of chatting over Slack - I've now updated the config for prompt injection to be as follows: |
|
do we still want this to land? |
|
I'm going to close this for now due to staleness. can always reopen when relevant again |
Pull Request Description
One of the key attacks against Goose is MCP poisoning. Using the response data from an MCP to trigger a non-expected tool-call that performs an unintended action.
Whilst we absolutely don't want to remove the utility of the agent autonomously deciding to call a chain of tools, users should be able to configure security profiles in their config to prevent certain tools being called as a "secondary tool" without user input.
This PR introduces such functionality
Implementation Detail
The general approach is to intercept tool-calls and perform a message look back to see what tools have been called since prior invocation. Specifically, the method is as follows:
Testing