-
Notifications
You must be signed in to change notification settings - Fork 2.4k
feat(goose-acp): enable parallel sessions with isolated agent state #6392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
codefromthecrypt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
notes on the important things
| STDIN: {"jsonrpc":"2.0","method":"notifications/initialized"} | ||
| STDERR: time=2025-12-11T17:58:47.642-05:00 level=INFO msg="session initialized" | ||
| STDIN: {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"_meta":{"progressToken":0},"name":"get_file_contents","arguments":{"owner":"block","path":"README.md","repo":"goose","sha":"ab62b863c1666232a67048b6c4e10007a2a5b83c"}}} | ||
| STDIN: {"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"_meta":{"goose-session-id":"test-session-id","progressToken":0},"name":"get_file_contents","arguments":{"owner":"block","path":"README.md","repo":"goose","sha":"ab62b863c1666232a67048b6c4e10007a2a5b83c"}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that session is propagated via parameter in the case we always know it, the expectation file bumped for good reason.
| } | ||
| } | ||
|
|
||
| fn build_and_get_binary_path() -> PathBuf { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that builtin extensions no longer require being launched as subprocesses (mcp xxx args), ACP doesn't need a build to run tests. This leaves only one path where we launch processes in tests, the old one for MCP replay, only used when recording. This is put back to exactly the same code from before we needed to extract a common cargo build utility.
crates/goose/src/scheduler_trait.rs
Outdated
| ) -> Result<Option<(String, DateTime<Utc>)>, SchedulerError>; | ||
| } | ||
|
|
||
| const UNAVAILABLE_MESSAGE: &str = "Scheduler not available. This tool only works in server mode."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
before we had a single call site we noticed scheduler in use and disabled it for CLI sessions. This extracts so it isn't optional. Unless there's a good reason to have CLI sessions fail on scheduler, I recommend in this PR or the next just letting them work and removing this workaround and/or only using it in tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I completely follow; the reason the CLI didn't have the scheduler is that a CLI session just only runs that session so it is never going to run scheduled jobs. we could still allow the CLI to set-up scheduled jobs, but you'd need the desktop to actually trigger them if that makes sense
|
|
||
| static SESSION_STORAGE: OnceCell<Arc<SessionStorage>> = OnceCell::const_new(); | ||
| static SESSION_STORAGE: LazyLock<Arc<SessionStorage>> = | ||
| LazyLock::new(|| Arc::new(SessionStorage::new(Paths::data_dir()))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in order to propagate and also still have a singleton I looked around at about 10 rust projects and this lazy lock pattern seemed the best, as old sites still get a global singleton until/unless updated, and new can propagate through in all cases the actual connect stuff is deferred to first.
| AllowOnce, | ||
| Cancel, | ||
| DenyOnce, | ||
| AlwaysDeny, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we were missing this even though our permissions system allows persistence of deny. This is needed for ACP and the motivation was testing permissions which had a bug and they now are testable without clobbering or ENV
| .unwrap(); | ||
| } | ||
|
|
||
| #[test_case(Some(PermissionOptionKind::AllowAlways), ToolCallStatus::Completed, "user:\n always_allow:\n - lookup__get_code\n ask_before: []\n never_allow: []\n"; "allow_always")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the new test and notice no ACP test requires serial anymore, and none use ENV or clobber eachother's database or permissions files
|
|
||
| const PERMISSION_FILE: &str = "permission.yaml"; | ||
|
|
||
| static PERMISSION_MANAGER: LazyLock<Arc<PermissionManager>> = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
permission manager needs to be a singleton where used, so I used exact same pattern as session storage. before depending on call site reads might come from memory which could have problems when not shared. now it is always shared and pushed down consistently
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR enables parallel ACP agent sessions with isolated state by refactoring global singletons into instance-based configuration. Key changes include:
- Introduces
AgentConfigstruct to encapsulate session manager, permission manager, mode, and scheduler - Refactors
SessionManagerfrom static methods to instance-based with lazy initialization - Refactors
PermissionManagerfrom mutable singleton to thread-safe Arc-wrapped instance - Creates new
goose-acpcrate for isolated ACP server implementation - Moves built-in MCP extensions to in-process execution via duplex streams
- Adds
session_idparameter to MCP client trait methods for proper session tracking - Achieves 55x speedup in ACP integration tests through parallel execution
Reviewed changes
Copilot reviewed 79 out of 88 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| crates/goose/src/agents/agent.rs | Introduces AgentConfig, refactors Agent to use instance-based dependencies |
| crates/goose/src/session/session_manager.rs | Converts from static methods to instance-based with lazy pool initialization |
| crates/goose/src/config/permission.rs | Makes thread-safe with RwLock, converts to instance-based |
| crates/goose/src/agents/extension_manager.rs | Updates context to hold SessionManager, supports in-process builtins |
| crates/goose/src/agents/mcp_client.rs | Adds session_id parameter to call_tool and makes trait methods have default impls |
| crates/goose-acp/src/server.rs | New isolated ACP server implementation (not shown in diff) |
| crates/goose/tests/* | Updates tests to use isolated instances, removes serial execution |
| crates/goose-mcp/src/lib.rs | Adds builtin extension registry for in-process spawning |
| crates/goose-server/src/routes/* | Updates all routes to use SessionManager from AppState |
| crates/goose-cli/src/* | Updates CLI to use instance-based managers |
|
|
||
| [dependencies] | ||
| anyhow = "1.0" | ||
| anyhow = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice - this does what I think it does which makes good sense.
|
so far working really well! |
|
I also like that it is half as many processes for usual workloads. I want to stamp this but need to work out how to get enough consensus but LGTM - I will try next running it bundled |
|
ps verified there's no bug with us about the double check vs double x on deny. I'll raise a PR with zed to fix that, right now the only icon they have for double is approve |
|
opened zed-industries/zed#46329 on zed for the unrelated UI glitch I noticed testing this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 79 out of 88 changed files in this pull request and generated no new comments.
|
@angiejones @alexhancock FYI I tested in webstorm (jetbrains) and it works. One note is that current version of Jetbrains doesn't respond to permissions requests like Zed, etc. So, don't change the GOOSE_MODE default from auto until that's not the case.
|
40697bb to
67df032
Compare
|
rebased no code changes since last review otherwise |
67df032 to
5cd195d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 79 out of 88 changed files in this pull request and generated no new comments.
5cd195d to
8060238
Compare
8060238 to
9fce44a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 77 out of 86 changed files in this pull request and generated no new comments.
9fce44a to
8f937b7
Compare
|
this PR takes quite a lot of effort to rebase every couple days. let me know if there is anything I can do to get this in so my "goose time" isn't dominated by rebasing. |
8f937b7 to
69383fe
Compare
Signed-off-by: Adrian Cole <[email protected]>
69383fe to
835aab0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 77 out of 86 changed files in this pull request and generated no new comments.
baxen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, a lot of great changes in here for the architecture to enable this
A lot of my comments i think can be followups, along with eventually getting the rest of the global state out like you mentioned (should make the desktop app multi chat less confusing too!)
| &self, | ||
| name: &str, | ||
| arguments: Option<JsonObject>, | ||
| session_id: &str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get why we need session id now in so many downstream tools (no more consistent way to lookup the process's current session). I like how it becomes _meta in the MCP spec. Can be a followup, but i'd cnsider having a "meta" arg in call_tool instead of session_id and always add session id into it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point. when we get around to otel we'll need to implicitly add trace IDs, but maybe we wouldn't if we were passing meta (since we could do it manually just like we are for session ID)
| pub fn new(config_dir: PathBuf) -> Self { | ||
| let permission_path = config_dir.join(PERMISSION_FILE); | ||
| let permission_map = if permission_path.exists() { | ||
| let file_contents = fs::read_to_string(&permission_path).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unwrap -> expect (old code had it)
| serde_yaml::from_str(&file_contents).unwrap_or_else(|_| HashMap::new()) | ||
| } else { | ||
| HashMap::new() // No config file, create an empty map | ||
| fs::create_dir_all(&config_dir).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: missing why we need to create dir here? not a serious concern though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have a choice to ensure the directory is there in the beginning or later when we persist. Not all code paths to this point ensure the config dir is there.. I wanted to put a TODO why but I have noticed not everyone likes to have TODOs.. if it were a TODO, I would say TODO: ensure all XDG paths are ensured at all entrypoints
| } | ||
| fn create_pool(path: &Path) -> Pool<Sqlite> { | ||
| if let Some(parent) = path.parent() { | ||
| fs::create_dir_all(parent).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could use an expect here for a better error message (we had a lot of access issues with some peoples xdg dirs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
| pub spawn_server: SpawnServerFn, | ||
| } | ||
|
|
||
| fn spawn_and_serve<S>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is great, really nice to clean up the processes we were leaving here
| goose = { path = "../goose" } | ||
| goose-mcp = { path = "../goose-mcp" } | ||
| rmcp = { workspace = true } | ||
| sacp = "10.1.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed in discord, will bring this to the official rust sdk once they finish their refactor
|
thanks for the look @baxen I added a commit which I think got the bulk of things |
d534580 to
5226528
Compare
Signed-off-by: Adrian Cole <[email protected]>
5226528 to
05a43fd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 77 out of 86 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
crates/goose-acp/src/server.rs:1
- Avoid cloning session_manager on every call. Store a reference in GooseAcpAgent struct or use Arc::clone(&self.agent.config.session_manager) for clarity that you're cloning the Arc, not the SessionManager.
use anyhow::Result;
| let provider_name = provider.get_name().to_string(); | ||
| let model_config = provider.get_model_config(); | ||
|
|
||
| let mut current_provider = self.provider.lock().await; | ||
| *current_provider = Some(provider.clone()); | ||
| *current_provider = Some(provider); |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extract provider metadata before acquiring lock to minimize lock hold time. Move lines 1468-1469 after line 1472 since provider is moved into the lock.
| let storage = Self { pool }; | ||
| storage.run_migrations().await?; | ||
| Ok(storage) | ||
| async fn pool(&self) -> Result<&Pool<Sqlite>> { |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method name 'pool' is ambiguous - it returns a reference to the pool after initialization. Rename to 'get_pool' or 'ensure_initialized_pool' to clarify its purpose.
| async fn pool(&self) -> Result<&Pool<Sqlite>> { | |
| async fn get_pool(&self) -> Result<&Pool<Sqlite>> { |
|
oops this totally slipped my mind. I'll have a look too since this is kinda big? |
|
I went through this once. it's clearly an improvement and we should possibly just merge as is, but after this I think the architecture would be left in a slightly confusing state. This is solid progress. The dependency injection via AgentConfig is the right move, and making the core components instantiable (vs global singletons) unblocks library mode. Let's merge this forward. What Works Well
What's Confusing
pub struct AgentManager { pub struct Agent { AgentManager finds the Agent for a session_id, but Agent also needs SessionManager. This double indirection is confusing - why does AgentManager need to know about sessions at all if it's just caching agents?
pub struct AgentConfig { pub struct Agent { Why are provider and extension_manager outside AgentConfig? What's the principle for what goes where?
Both work, but it's unclear when to use which pattern or what Agent's relationship to sessions is meant to be.
What's the source of truth? In general though this does feel like a good improvement. longer term though I think we might want to deprecate the whole notion of agents as a thing. we have sessions and an extension manager. then when you want to do something you can start an agent loop with a session and your preferences (goose mode, number of turns?). it then asks the (global?) extension manager to get it the extensions it needs. for stateless mcp, it can just reuse what is already there and that should be the general case. maybe sometimes it needs to start an extra one if one is busy. but other than that, the agent loop can just be a function without any agent state. thoughts? |
|
@DOsinga I agree things aren't yet coherent and there are still some mixed responsibilities.. there are a lot of call sites to old constructors and in some cases agent == agent+session I was told that we are killing the old CLI which is a large part of some copy/paste stuff, and after that it would be goose-server+acp only, let's do another major arch pass when that's in! |
* main: fixed 0 token in openrouter steaming (#6493) feat(goose-acp): enable parallel sessions with isolated agent state (#6392) copilot instruction to flag prelease docs (#6504) docs: acp mcp support (#6491) feat: add flatpak support for linux (#6387) fix(code_execution): serialize record_result output as JSON (#6495) perf(google): avoid accumulating thoughtSignatures across conversation history (#6462) fix(openai): make tool_call arguments optional and fix silent stream termination (#6309) fix: Improve error messages for invalid tool calls (#6483)
…ased * 'main' of github.com:block/goose: Fix popular topics not starting chat when clicked (#6508) fix[desktop]: deeplink ui repeat on refresh (#6469) fixed test compilation on main branch (#6512) fix: correctly parse extension name from tool call for MCP apps (#6482) fixed 0 token in openrouter steaming (#6493) feat(goose-acp): enable parallel sessions with isolated agent state (#6392) copilot instruction to flag prelease docs (#6504) docs: acp mcp support (#6491) feat: add flatpak support for linux (#6387)
* 'main' of github.com:block/goose: (28 commits) chore(deps): bump aiohttp from 3.13.0 to 3.13.3 in /scripts/provider-error-proxy (#6539) chore(deps): bump brotli from 1.1.0 to 1.2.0 in /scripts/provider-error-proxy (#6538) docs: temp correction for agent directory (#6544) chore: upgrade rmcp (#6516) docs: clarify directory in /documentation readme (#6541) Release 1.20.0 Standalone mcp apps (#6458) don't add escaping to the command field (#6519) Fix popular topics not starting chat when clicked (#6508) fix[desktop]: deeplink ui repeat on refresh (#6469) fixed test compilation on main branch (#6512) fix: correctly parse extension name from tool call for MCP apps (#6482) fixed 0 token in openrouter steaming (#6493) feat(goose-acp): enable parallel sessions with isolated agent state (#6392) copilot instruction to flag prelease docs (#6504) docs: acp mcp support (#6491) feat: add flatpak support for linux (#6387) fix(code_execution): serialize record_result output as JSON (#6495) perf(google): avoid accumulating thoughtSignatures across conversation history (#6462) fix(openai): make tool_call arguments optional and fix silent stream termination (#6309) ...




Summary
This PR makes it possible to run ACP agents and sessions in parallel, with critical state isolated to each agent instance.
Before, we were using global singletons for most state, which made it impossible to change configuration without ENV or subprocess isolation. Moreover, unlike platform, built-in extensions launched subprocesses which implied you need a binary entrypoint to run them.
This PR extracts the most critical state into structs and ensures consistent threading to call sites. It also reconfigures built-in extensions to run in-process, via duplex streams.
Legacy entrypoints retain global singletons despite ACP entrypoints being isolated. This was done to allow progress, particularly to implement goose-server with ACP. This is also due to the pending PR to replace the CLI. We don't want to do too much work for code that is going to be replaced.
The most important impact is the new goose-acp crate which contains isolated code for critical paths even though the CLI entrypoint that uses it will act the same.
Later in the tests section you will see how the combination of these changes allow more ACP integration tests than before to complete in less than a second.
Type of Change
AI Assistance
Testing
GooseAcpAgentto accept variables viaAgentConfig, tests don't rely on ENV variables.Above and other changes allow us to safely run tests in parallel, removing env-lock, serial and reset patterns.
This means despite having more ACP integration tests, our performance is incredibly faster.
Works cited
What this PR does is a step towards a more modular codebase which avoids global singletons.
Some projects I studied for reference were:
Related Issues