Skip to content

Conversation

@codefromthecrypt
Copy link
Collaborator

Summary

This PR makes it possible to run ACP agents and sessions in parallel, with critical state isolated to each agent instance.

Before, we were using global singletons for most state, which made it impossible to change configuration without ENV or subprocess isolation. Moreover, unlike platform, built-in extensions launched subprocesses which implied you need a binary entrypoint to run them.

This PR extracts the most critical state into structs and ensures consistent threading to call sites. It also reconfigures built-in extensions to run in-process, via duplex streams.

Legacy entrypoints retain global singletons despite ACP entrypoints being isolated. This was done to allow progress, particularly to implement goose-server with ACP. This is also due to the pending PR to replace the CLI. We don't want to do too much work for code that is going to be replaced.

The most important impact is the new goose-acp crate which contains isolated code for critical paths even though the CLI entrypoint that uses it will act the same.

Later in the tests section you will see how the combination of these changes allow more ACP integration tests than before to complete in less than a second.

Type of Change

  • Feature
  • Refactor / Code quality
  • Tests

AI Assistance

  • This PR was created or reviewed with AI assistance

Testing

  • Now that built-in are in-process like platform extensions, we don't need to build binaries to run ACP tests.
  • Since we parameterized the GooseAcpAgent to accept variables via AgentConfig, tests don't rely on ENV variables.

Above and other changes allow us to safely run tests in parallel, removing env-lock, serial and reset patterns.
This means despite having more ACP integration tests, our performance is incredibly faster.

Crate MAIN THIS PR Speedup
goose-acp 42.22s 0.77s 55x

Works cited

What this PR does is a step towards a more modular codebase which avoids global singletons.
Some projects I studied for reference were:

  • hyperswitch - AppState takes storage config
  • lemmy - LemmyContext created once, passed to handlers
  • typst - Engine requires all dependencies at construction

Related Issues

Copy link
Collaborator Author

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notes on the important things

STDIN: {"jsonrpc":"2.0","method":"notifications/initialized"}
STDERR: time=2025-12-11T17:58:47.642-05:00 level=INFO msg="session initialized"
STDIN: {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"_meta":{"progressToken":0},"name":"get_file_contents","arguments":{"owner":"block","path":"README.md","repo":"goose","sha":"ab62b863c1666232a67048b6c4e10007a2a5b83c"}}}
STDIN: {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"_meta":{"goose-session-id":"test-session-id","progressToken":0},"name":"get_file_contents","arguments":{"owner":"block","path":"README.md","repo":"goose","sha":"ab62b863c1666232a67048b6c4e10007a2a5b83c"}}}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that session is propagated via parameter in the case we always know it, the expectation file bumped for good reason.

}
}

fn build_and_get_binary_path() -> PathBuf {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that builtin extensions no longer require being launched as subprocesses (mcp xxx args), ACP doesn't need a build to run tests. This leaves only one path where we launch processes in tests, the old one for MCP replay, only used when recording. This is put back to exactly the same code from before we needed to extract a common cargo build utility.

) -> Result<Option<(String, DateTime<Utc>)>, SchedulerError>;
}

const UNAVAILABLE_MESSAGE: &str = "Scheduler not available. This tool only works in server mode.";
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before we had a single call site we noticed scheduler in use and disabled it for CLI sessions. This extracts so it isn't optional. Unless there's a good reason to have CLI sessions fail on scheduler, I recommend in this PR or the next just letting them work and removing this workaround and/or only using it in tests

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I completely follow; the reason the CLI didn't have the scheduler is that a CLI session just only runs that session so it is never going to run scheduled jobs. we could still allow the CLI to set-up scheduled jobs, but you'd need the desktop to actually trigger them if that makes sense


static SESSION_STORAGE: OnceCell<Arc<SessionStorage>> = OnceCell::const_new();
static SESSION_STORAGE: LazyLock<Arc<SessionStorage>> =
LazyLock::new(|| Arc::new(SessionStorage::new(Paths::data_dir())));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in order to propagate and also still have a singleton I looked around at about 10 rust projects and this lazy lock pattern seemed the best, as old sites still get a global singleton until/unless updated, and new can propagate through in all cases the actual connect stuff is deferred to first.

AllowOnce,
Cancel,
DenyOnce,
AlwaysDeny,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we were missing this even though our permissions system allows persistence of deny. This is needed for ACP and the motivation was testing permissions which had a bug and they now are testable without clobbering or ENV

.unwrap();
}

#[test_case(Some(PermissionOptionKind::AllowAlways), ToolCallStatus::Completed, "user:\n always_allow:\n - lookup__get_code\n ask_before: []\n never_allow: []\n"; "allow_always")]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the new test and notice no ACP test requires serial anymore, and none use ENV or clobber eachother's database or permissions files


const PERMISSION_FILE: &str = "permission.yaml";

static PERMISSION_MANAGER: LazyLock<Arc<PermissionManager>> =
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

permission manager needs to be a singleton where used, so I used exact same pattern as session storage. before depending on call site reads might come from memory which could have problems when not shared. now it is always shared and pushed down consistently

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enables parallel ACP agent sessions with isolated state by refactoring global singletons into instance-based configuration. Key changes include:

  • Introduces AgentConfig struct to encapsulate session manager, permission manager, mode, and scheduler
  • Refactors SessionManager from static methods to instance-based with lazy initialization
  • Refactors PermissionManager from mutable singleton to thread-safe Arc-wrapped instance
  • Creates new goose-acp crate for isolated ACP server implementation
  • Moves built-in MCP extensions to in-process execution via duplex streams
  • Adds session_id parameter to MCP client trait methods for proper session tracking
  • Achieves 55x speedup in ACP integration tests through parallel execution

Reviewed changes

Copilot reviewed 79 out of 88 changed files in this pull request and generated no comments.

Show a summary per file
File Description
crates/goose/src/agents/agent.rs Introduces AgentConfig, refactors Agent to use instance-based dependencies
crates/goose/src/session/session_manager.rs Converts from static methods to instance-based with lazy pool initialization
crates/goose/src/config/permission.rs Makes thread-safe with RwLock, converts to instance-based
crates/goose/src/agents/extension_manager.rs Updates context to hold SessionManager, supports in-process builtins
crates/goose/src/agents/mcp_client.rs Adds session_id parameter to call_tool and makes trait methods have default impls
crates/goose-acp/src/server.rs New isolated ACP server implementation (not shown in diff)
crates/goose/tests/* Updates tests to use isolated instances, removes serial execution
crates/goose-mcp/src/lib.rs Adds builtin extension registry for in-process spawning
crates/goose-server/src/routes/* Updates all routes to use SessionManager from AppState
crates/goose-cli/src/* Updates CLI to use instance-based managers


[dependencies]
anyhow = "1.0"
anyhow = { workspace = true }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice - this does what I think it does which makes good sense.

@michaelneale
Copy link
Collaborator

so far working really well!

@michaelneale
Copy link
Collaborator

I also like that it is half as many processes for usual workloads. I want to stamp this but need to work out how to get enough consensus but LGTM - I will try next running it bundled

@codefromthecrypt
Copy link
Collaborator Author

ps confirmed in zed the missing "reject always" works and updates my permissions file properly

Screenshot 2026-01-08 at 12 10 45 PM
$ cat ~/.config/goose/permission.yaml
user:
  always_allow: []
  ask_before: []
  never_allow:
  - kiwi__search-flight

@codefromthecrypt
Copy link
Collaborator Author

codefromthecrypt commented Jan 8, 2026

ps verified there's no bug with us about the double check vs double x on deny. I'll raise a PR with zed to fix that, right now the only icon they have for double is approve

@codefromthecrypt
Copy link
Collaborator Author

opened zed-industries/zed#46329 on zed for the unrelated UI glitch I noticed testing this

Copilot AI review requested due to automatic review settings January 8, 2026 05:47
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 79 out of 88 changed files in this pull request and generated no new comments.

@codefromthecrypt
Copy link
Collaborator Author

@angiejones @alexhancock FYI I tested in webstorm (jetbrains) and it works. One note is that current version of Jetbrains doesn't respond to permissions requests like Zed, etc. So, don't change the GOOSE_MODE default from auto until that's not the case.

Screenshot 2026-01-08 at 2 21 57 PM

@codefromthecrypt
Copy link
Collaborator Author

also checked 3rd party MCP also works in jetbrains same permissions caveat which isn't new to this PR and probably already solved, just not released in latest jetbrains.

Screenshot 2026-01-08 at 2 11 40 PM Screenshot 2026-01-08 at 2 24 53 PM

@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from 40697bb to 67df032 Compare January 9, 2026 05:33
@codefromthecrypt
Copy link
Collaborator Author

rebased no code changes since last review otherwise

@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from 67df032 to 5cd195d Compare January 9, 2026 05:46
Copilot AI review requested due to automatic review settings January 9, 2026 05:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 79 out of 88 changed files in this pull request and generated no new comments.

@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from 5cd195d to 8060238 Compare January 9, 2026 06:23
Copilot AI review requested due to automatic review settings January 14, 2026 03:01
@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from 8060238 to 9fce44a Compare January 14, 2026 03:01
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 77 out of 86 changed files in this pull request and generated no new comments.

@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from 9fce44a to 8f937b7 Compare January 14, 2026 03:07
@codefromthecrypt
Copy link
Collaborator Author

this PR takes quite a lot of effort to rebase every couple days. let me know if there is anything I can do to get this in so my "goose time" isn't dominated by rebasing.

@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from 8f937b7 to 69383fe Compare January 14, 2026 03:09
@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from 69383fe to 835aab0 Compare January 14, 2026 03:13
Copilot AI review requested due to automatic review settings January 14, 2026 03:13
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 77 out of 86 changed files in this pull request and generated no new comments.

Copy link
Collaborator

@baxen baxen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, a lot of great changes in here for the architecture to enable this

A lot of my comments i think can be followups, along with eventually getting the rest of the global state out like you mentioned (should make the desktop app multi chat less confusing too!)

&self,
name: &str,
arguments: Option<JsonObject>,
session_id: &str,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get why we need session id now in so many downstream tools (no more consistent way to lookup the process's current session). I like how it becomes _meta in the MCP spec. Can be a followup, but i'd cnsider having a "meta" arg in call_tool instead of session_id and always add session id into it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. when we get around to otel we'll need to implicitly add trace IDs, but maybe we wouldn't if we were passing meta (since we could do it manually just like we are for session ID)

pub fn new(config_dir: PathBuf) -> Self {
let permission_path = config_dir.join(PERMISSION_FILE);
let permission_map = if permission_path.exists() {
let file_contents = fs::read_to_string(&permission_path).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unwrap -> expect (old code had it)

serde_yaml::from_str(&file_contents).unwrap_or_else(|_| HashMap::new())
} else {
HashMap::new() // No config file, create an empty map
fs::create_dir_all(&config_dir).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing why we need to create dir here? not a serious concern though

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a choice to ensure the directory is there in the beginning or later when we persist. Not all code paths to this point ensure the config dir is there.. I wanted to put a TODO why but I have noticed not everyone likes to have TODOs.. if it were a TODO, I would say TODO: ensure all XDG paths are ensured at all entrypoints

}
fn create_pool(path: &Path) -> Pool<Sqlite> {
if let Some(parent) = path.parent() {
fs::create_dir_all(parent).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could use an expect here for a better error message (we had a lot of access issues with some peoples xdg dirs)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

pub spawn_server: SpawnServerFn,
}

fn spawn_and_serve<S>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is great, really nice to clean up the processes we were leaving here

goose = { path = "../goose" }
goose-mcp = { path = "../goose-mcp" }
rmcp = { workspace = true }
sacp = "10.1.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed in discord, will bring this to the official rust sdk once they finish their refactor

@codefromthecrypt
Copy link
Collaborator Author

thanks for the look @baxen I added a commit which I think got the bulk of things

@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from d534580 to 5226528 Compare January 14, 2026 11:35
Signed-off-by: Adrian Cole <[email protected]>
Copilot AI review requested due to automatic review settings January 14, 2026 11:35
@codefromthecrypt codefromthecrypt force-pushed the feat/scacp-10-library-mode branch from 5226528 to 05a43fd Compare January 14, 2026 11:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 77 out of 86 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

crates/goose-acp/src/server.rs:1

  • Avoid cloning session_manager on every call. Store a reference in GooseAcpAgent struct or use Arc::clone(&self.agent.config.session_manager) for clarity that you're cloning the Arc, not the SessionManager.
use anyhow::Result;

Comment on lines +1468 to +1472
let provider_name = provider.get_name().to_string();
let model_config = provider.get_model_config();

let mut current_provider = self.provider.lock().await;
*current_provider = Some(provider.clone());
*current_provider = Some(provider);
Copy link

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extract provider metadata before acquiring lock to minimize lock hold time. Move lines 1468-1469 after line 1472 since provider is moved into the lock.

Copilot generated this review using guidance from repository custom instructions.
let storage = Self { pool };
storage.run_migrations().await?;
Ok(storage)
async fn pool(&self) -> Result<&Pool<Sqlite>> {
Copy link

Copilot AI Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method name 'pool' is ambiguous - it returns a reference to the pool after initialization. Rename to 'get_pool' or 'ensure_initialized_pool' to clarify its purpose.

Suggested change
async fn pool(&self) -> Result<&Pool<Sqlite>> {
async fn get_pool(&self) -> Result<&Pool<Sqlite>> {

Copilot uses AI. Check for mistakes.
@DOsinga
Copy link
Collaborator

DOsinga commented Jan 14, 2026

oops this totally slipped my mind. I'll have a look too since this is kinda big?

@DOsinga
Copy link
Collaborator

DOsinga commented Jan 14, 2026

I went through this once. it's clearly an improvement and we should possibly just merge as is, but after this I think the architecture would be left in a slightly confusing state.

This is solid progress. The dependency injection via AgentConfig is the right move, and making the core components instantiable (vs global singletons) unblocks library mode. Let's merge this forward.

What Works Well

  1. AgentConfig pattern - bundling SessionManager, PermissionManager, scheduler into config for DI
  2. Agent.with_config() - enables multiple independent instances
  3. Threading session_id through methods - makes Agent more session-agnostic
  4. Test improvements - no more #[serial] or unsafe reset hacks

What's Confusing

  1. AgentManager + Agent both have SessionManager

pub struct AgentManager {
sessions: LruCache<String, Arc>, // session_id -> Agent
session_manager: Arc, // Also has session manager
}

pub struct Agent {
config: AgentConfig {
session_manager: Arc, // Agent also has it
}
}

AgentManager finds the Agent for a session_id, but Agent also needs SessionManager. This double indirection is confusing - why does AgentManager need to know about sessions at all if it's just caching agents?

  1. What goes in AgentConfig vs on Agent?

pub struct AgentConfig {
session_manager, permission_manager, scheduler, goose_mode // In config
}

pub struct Agent {
provider: SharedProvider, // NOT in config
extension_manager: Arc, // NOT in config
config: AgentConfig,
}

Why are provider and extension_manager outside AgentConfig? What's the principle for what goes where?

  1. Two competing patterns
  • goose-server: Uses AgentManager, creates 1 Agent per session
  • goose-acp: Bypasses AgentManager, uses 1 Agent for all sessions

Both work, but it's unclear when to use which pattern or what Agent's relationship to sessions is meant to be.

  1. Provider in multiple places
  • Agent.provider (mutable via update_provider)
  • Session.provider (string stored in DB)
  • GooseAcpConfig.provider (for initialization)

What's the source of truth?

In general though this does feel like a good improvement. longer term though I think we might want to deprecate the whole notion of agents as a thing. we have sessions and an extension manager. then when you want to do something you can start an agent loop with a session and your preferences (goose mode, number of turns?). it then asks the (global?) extension manager to get it the extensions it needs. for stateless mcp, it can just reuse what is already there and that should be the general case. maybe sometimes it needs to start an extra one if one is busy. but other than that, the agent loop can just be a function without any agent state. thoughts?

@baxen baxen merged commit 7d4a6bd into main Jan 14, 2026
20 checks passed
@baxen baxen deleted the feat/scacp-10-library-mode branch January 14, 2026 22:19
@codefromthecrypt
Copy link
Collaborator Author

@DOsinga I agree things aren't yet coherent and there are still some mixed responsibilities.. there are a lot of call sites to old constructors and in some cases agent == agent+session

I was told that we are killing the old CLI which is a large part of some copy/paste stuff, and after that it would be goose-server+acp only, let's do another major arch pass when that's in!

lifeizhou-ap added a commit that referenced this pull request Jan 15, 2026
* main:
  fixed 0 token in openrouter steaming (#6493)
  feat(goose-acp): enable parallel sessions with isolated agent state (#6392)
  copilot instruction to flag prelease docs (#6504)
  docs: acp mcp support (#6491)
  feat: add flatpak support for linux (#6387)
  fix(code_execution): serialize record_result output as JSON (#6495)
  perf(google): avoid accumulating thoughtSignatures across conversation history (#6462)
  fix(openai): make tool_call arguments optional and fix silent stream termination (#6309)
  fix: Improve error messages for invalid tool calls (#6483)
zanesq added a commit that referenced this pull request Jan 15, 2026
…ased

* 'main' of github.com:block/goose:
  Fix popular topics not starting chat when clicked (#6508)
  fix[desktop]: deeplink ui repeat on refresh (#6469)
  fixed test compilation on main branch (#6512)
  fix: correctly parse extension name from tool call for MCP apps (#6482)
  fixed 0 token in openrouter steaming (#6493)
  feat(goose-acp): enable parallel sessions with isolated agent state (#6392)
  copilot instruction to flag prelease docs (#6504)
  docs: acp mcp support (#6491)
  feat: add flatpak support for linux (#6387)
zanesq added a commit that referenced this pull request Jan 16, 2026
* 'main' of github.com:block/goose: (28 commits)
  chore(deps): bump aiohttp from 3.13.0 to 3.13.3 in /scripts/provider-error-proxy (#6539)
  chore(deps): bump brotli from 1.1.0 to 1.2.0 in /scripts/provider-error-proxy (#6538)
  docs: temp correction for agent directory (#6544)
  chore: upgrade rmcp (#6516)
  docs: clarify directory in /documentation readme (#6541)
  Release 1.20.0
  Standalone mcp apps (#6458)
  don't add escaping to the command field (#6519)
  Fix popular topics not starting chat when clicked (#6508)
  fix[desktop]: deeplink ui repeat on refresh (#6469)
  fixed test compilation on main branch (#6512)
  fix: correctly parse extension name from tool call for MCP apps (#6482)
  fixed 0 token in openrouter steaming (#6493)
  feat(goose-acp): enable parallel sessions with isolated agent state (#6392)
  copilot instruction to flag prelease docs (#6504)
  docs: acp mcp support (#6491)
  feat: add flatpak support for linux (#6387)
  fix(code_execution): serialize record_result output as JSON (#6495)
  perf(google): avoid accumulating thoughtSignatures across conversation history (#6462)
  fix(openai): make tool_call arguments optional and fix silent stream termination (#6309)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants