Skip to content

Conversation

@shellz-n-stuff
Copy link
Contributor

@shellz-n-stuff shellz-n-stuff commented Sep 22, 2025

Pull Request Description

One of the key attacks against Goose is MCP poisoning. Using the response data from an MCP to trigger a non-expected tool-call that performs an unintended action.

Whilst we absolutely don't want to remove the utility of the agent autonomously deciding to call a chain of tools, users should be able to configure security profiles in their config to prevent certain tools being called as a "secondary tool" without user input.

This PR introduces such functionality

Implementation Detail

The general approach is to intercept tool-calls and perform a message look back to see what tools have been called since prior invocation. Specifically, the method is as follows:

  1. Check for config of tools to protect (if null skip the check)
  2. Check if the current tool-call exists in the list of protected tools (if null skip)
  3. Collect a list of tool calls between the tool we're about to invoke and the last user message (if 0 skip)
  4. If the last called tool is this tool skip
  5. If the last called tool is another tool raise a security alert.

Testing

Screenshot 2025-09-22 at 5 23 24 pm
  • Config File Setup
Screenshot 2025-09-22 at 5 25 32 pm
  • Base Case Running a Normal MCP + The developer__shell mcp - Part 1
Screenshot 2025-09-22 at 5 26 01 pm
  • Base Case Running a Normal MCP + The developer__shell mcp - Part 2
Screenshot 2025-09-22 at 5 24 50 pm
  • Secondary Case running two instances of developer shell

@shellz-n-stuff shellz-n-stuff changed the title [WIP][Security] Implement Security Option for Handling Tool-Call Chains [Security] Implement Security Option for Handling Tool-Call Chains Sep 23, 2025
@shellz-n-stuff shellz-n-stuff marked this pull request as ready for review September 23, 2025 00:29
@michaelneale michaelneale self-assigned this Sep 23, 2025
Copy link
Collaborator

@DOsinga DOsinga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks great, there's a lot of debugging logs in there that presumably we don't need anymore, can you clear those up? same for the LLM style comments

use serde_json::Value;

/// Result of a security scan.
#[derive(Debug, Clone)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

}

/// Get threshold from config
/// Get the maliciousness threshold from config, or use default.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're not happy with the current comment, rename the function (drop from_config, I don't think the caller cares where we store this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah ez

}

/// Scan with prompt injection model (legacy method name for compatibility)
/// Scan text for prompt injection (legacy compatibility).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, we should have caught this the first place. remove this comment - legacy

disabled_secondary_tool_list: &[String],
) -> bool {
let tool_name = tool_call.name.as_str();
if !disabled_secondary_tool_list.iter().any(|t| t == tool_name) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider having the caller check this; that would leave this function with just one job, makes naming easier too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of just making it fully self-contained (IE it "just works").

I may split this into into a function though to make it cleaner/improve readability

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually looking at it, we re-use the tool_name var quite a lot. So I think it's really a call on if the caller should do it vs the function.

I personally like the function having the check but happy to change if you feel strongly

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should keep the toolname of course. I can go either way on it, but the function is called is_secondary_tool_violation_single, not is_secondary_tool_violation_single_if_enabled

}
}
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're going through the message twice here, I think you can do it in one go like:

for msg in messages.iter().rev():
if isUsere(msg): break
if toolCallName(msg) != toolCallName: return true
return true

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, we have effective_role which will return tool for tool messages

@michaelneale
Copy link
Collaborator

config.yaml makes sense as we already have it for rules for commands, but in future I wonder if people will want to build a distro with these more role-based (ie protect people from themselves) but I digress... looking good so far!

@michaelneale
Copy link
Collaborator

michaelneale commented Sep 24, 2025

LGTM - but will want to check if this will be a problem with config.yaml:

change here #4651 - does this imply the config section in config.yaml is changing cc @dorien-koelemeijer ? (ie no security: section?)

@dorien-koelemeijer
Copy link
Collaborator

LGTM - but will want to check if this will be a problem with config.yaml:

change here #4651 - does this imply the config section in config.yaml is changing cc @dorien-koelemeijer ? (ie no security: section?)

Yeah, perhaps a comment on this whether we could also configure all of this through the UI settings and keep the config as all other config (with underscores)? See #4651 @shellz-n-stuff

@shellz-n-stuff
Copy link
Contributor Author

#4651

@dorien-koelemeijer, I can make the updates on this branch after your PR gets merged? Not in a mega rush here!

@michaelneale michaelneale removed their assignment Sep 24, 2025
@dorien-koelemeijer
Copy link
Collaborator

#4651

@dorien-koelemeijer, I can make the updates on this branch after your PR gets merged? Not in a mega rush here!

Just wondering whether you wanted me to update/rename any config items or not?

/// Get threshold from config
pub fn get_threshold_from_config(&self) -> f32 {
/// Get the confidence threshold from config, or use default.
pub fn get_confidence_threshold_from_config(&self) -> f32 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we'd just call this get_confidence_threshold and drop the comment?

@dorien-koelemeijer
Copy link
Collaborator

Just adding a comment for visibility instead of chatting over Slack - I've now updated the config for prompt injection to be as follows: security_prompt_enabled and security_prompt_threshold, so perhaps you could update your config items to follow the same convention? Decided for the non-nested, underscore approach since that seems to be the more conventional thing to do looking at the other config. Happy to discuss more ofc if you wanted me to update something!

@DOsinga
Copy link
Collaborator

DOsinga commented Oct 6, 2025

do we still want this to land?

@DOsinga DOsinga added the waiting label Oct 6, 2025
@DOsinga
Copy link
Collaborator

DOsinga commented Oct 18, 2025

I'm going to close this for now due to staleness. can always reopen when relevant again

@DOsinga DOsinga closed this Oct 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants