Skip to content

fix(goose-server): cache TLS cert to disk to avoid slow startup on first launch#8174

Merged
lifeizhou-ap merged 1 commit into
aaif-goose:mainfrom
treebird7:fix/tls-cert-caching-on-startup
Apr 23, 2026
Merged

fix(goose-server): cache TLS cert to disk to avoid slow startup on first launch#8174
lifeizhou-ap merged 1 commit into
aaif-goose:mainfrom
treebird7:fix/tls-cert-caching-on-startup

Conversation

@treebird7
Copy link
Copy Markdown
Contributor

Summary

On fresh macOS installs, rcgen TLS key generation takes 2–8 seconds. Combined with Electron startup overhead, this regularly exceeded the 10-second health-check timeout in goosed.ts, causing the desktop app to fail on first launch while working fine on relaunches (fixes #8137).

Changes:

  • crates/goose-server/src/tls.rs: adds get_tls_cache_dir(), load_cached_tls(), and save_tls_to_cache() — caches cert + key + fingerprint to ~/.config/goose/tls/ on first generation; subsequent startups skip rcgen entirely. Cache miss silently falls back to fresh generation.
  • ui/desktop/src/goosed.ts: raises checkServerStatus timeout from 10 s to 30 s as a belt-and-suspenders guard for the unavoidable first-launch case.

No new dependencies introduced — uses only std::fs, std::path, and dependencies already present in Cargo.toml.

Test plan

  • cargo test --package goose-server tls — existing TLS tests pass
  • Fresh install on macOS (delete ~/.config/goose/tls/ to simulate): desktop app starts successfully within 30 s
  • Second launch: app starts in < 1 s (cert loaded from cache, no rcgen invocation)
  • Corrupt cache file (echo bad > ~/.config/goose/tls/server.pem): app still starts (falls back to fresh generation)

🤖 Generated with Goose — contributed by goose, using goose!

Copilot AI review requested due to automatic review settings March 28, 2026 17:48
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d8ba9ae2d3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread crates/goose-server/src/tls.rs Outdated
Comment on lines +84 to +85
let fingerprint = std::fs::read_to_string(dir.join("server.fingerprint")).ok()?;
let fingerprint = fingerprint.trim().to_string();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Recompute fingerprint from cached certificate

load_cached_tls trusts the persisted server.fingerprint string without checking it matches server.pem. If that file becomes stale/corrupted while server.pem and server.key are still valid, self_signed_config emits an incorrect fingerprint, the Electron side pins that value, and subsequent localhost TLS verification fails (startup appears broken instead of falling back). Please derive the fingerprint from the cached certificate (or validate and regenerate on mismatch) before returning TlsSetup.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves first-launch reliability for the desktop app by avoiding expensive self-signed TLS key generation on startup (especially on fresh macOS installs), and by extending the UI health-check timeout.

Changes:

  • Cache the generated TLS cert/key (and associated fingerprint) to disk and reuse it on subsequent goose-server startups.
  • Fall back to fresh TLS generation on cache miss/error without blocking startup.
  • Increase the desktop app’s backend health-check timeout from 10s to 30s.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
crates/goose-server/src/tls.rs Adds TLS cache load/save + fast-path reuse to skip rcgen on subsequent startups.
ui/desktop/src/goosed.ts Extends backend readiness polling window to reduce first-launch failures.

Comment thread crates/goose-server/src/tls.rs Outdated
Comment on lines +68 to +73
/// Returns the directory used for caching TLS files: `$HOME/.config/goose/tls/`.
/// Returns `None` if the `HOME` environment variable is not set.
fn get_tls_cache_dir() -> Option<std::path::PathBuf> {
let home = std::env::var("HOME").ok()?;
Some(std::path::PathBuf::from(home).join(".config").join("goose").join("tls"))
}
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_tls_cache_dir() hard-codes $HOME/.config/..., which doesn’t follow the codebase’s existing Paths::config_dir() convention and will put the cache in an odd location (or disable caching) on platforms where HOME isn’t the config root (notably Windows). Consider using goose::config::paths::Paths::config_dir().join("tls") (or similar) so the cache location matches the rest of goose-server and stays cross-platform.

Copilot uses AI. Check for mistakes.
Comment thread crates/goose-server/src/tls.rs Outdated
Comment on lines +114 to +124
fn save_tls_to_cache(cert_pem: &str, key_pem: &str, fingerprint: &str) {
let dir = match get_tls_cache_dir() {
Some(d) => d,
None => return,
};
if std::fs::create_dir_all(&dir).is_err() {
return;
}
let _ = std::fs::write(dir.join("server.pem"), cert_pem);
let _ = std::fs::write(dir.join("server.key"), key_pem);
let _ = std::fs::write(dir.join("server.fingerprint"), fingerprint);
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

save_tls_to_cache() writes the private key using std::fs::write, which can create server.key with default permissions (e.g. 0644 after umask), making the TLS private key potentially readable by other local users. Please write the key with restrictive permissions (e.g. 0600 on Unix, and consider tightening directory permissions as well) so another process can’t reuse the cached key+cert to impersonate goosed on the selected localhost port.

Copilot uses AI. Check for mistakes.
Comment thread crates/goose-server/src/tls.rs Outdated
Comment on lines +75 to +93
/// Attempt to load a previously cached TLS configuration from disk.
///
/// Reads `server.pem`, `server.key`, and `server.fingerprint` from the cache
/// directory. Returns `None` on any error (missing files, parse failures, …).
#[cfg(feature = "rustls-tls")]
async fn load_cached_tls() -> Option<TlsSetup> {
let dir = get_tls_cache_dir()?;
let cert_bytes = std::fs::read(dir.join("server.pem")).ok()?;
let key_bytes = std::fs::read(dir.join("server.key")).ok()?;
let fingerprint = std::fs::read_to_string(dir.join("server.fingerprint")).ok()?;
let fingerprint = fingerprint.trim().to_string();

let config =
axum_server::tls_rustls::RustlsConfig::from_pem(cert_bytes, key_bytes)
.await
.ok()?;

Some(TlsSetup { config, fingerprint })
}
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load_cached_tls() trusts server.fingerprint as the value printed to stdout for UI pinning; if that file is corrupted/mismatched while server.pem+server.key still parse, the server will start but the desktop app will pin the wrong fingerprint and fail to connect. Consider recomputing the fingerprint from the loaded certificate bytes (or at least validating the cached fingerprint matches the cert) and falling back to regeneration on mismatch.

Copilot uses AI. Check for mistakes.
treebird7 added a commit to treebird7/goose that referenced this pull request Mar 28, 2026
- Use goose::config::paths::Paths::config_dir() instead of hardcoded
  $HOME/.config/ so the cache path is cross-platform (Linux/macOS/Windows)
- Store raw DER bytes (server.der) instead of a pre-computed fingerprint
  string; fingerprint is now re-derived on load so a tampered cache file
  cannot cause a wrong fingerprint to be announced
- Write private key with mode 0600 on Unix via write_private_key() helper
  so the key is only readable by the current user
- Run cargo fmt

Addresses review comments from codex and Copilot on PR aaif-goose#8174.

Signed-off-by: freedbird <freedbird@treebird.dev>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0902dda5ca

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread crates/goose-server/src/tls.rs Outdated
Comment on lines +119 to +123
let cert_der = std::fs::read(dir.join("server.der")).ok()?;

// Derive fingerprint from the actual cert bytes rather than trusting a
// stored string.
let fingerprint = sha256_fingerprint(&cert_der);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Compute cached fingerprint from loaded PEM

load_cached_tls computes the pinned fingerprint from server.der without checking that those bytes match the certificate actually loaded from server.pem. If server.der becomes stale/corrupted while server.pem and server.key still parse, the server starts with one cert but emits a different fingerprint, and Electron’s pinning path in ui/desktop/src/main.ts (certificate checks around lines 151-175) will reject the backend cert and break local API connectivity. Recompute the fingerprint from the parsed certificate (or validate server.der against server.pem and fall back to regeneration on mismatch).

Useful? React with 👍 / 👎.

@DOsinga DOsinga added the needs_human label to set when a robot looks at a PR and can't handle it label Mar 30, 2026
Copy link
Copy Markdown
Collaborator

@DOsinga DOsinga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this — caching the TLS cert is a reasonable fix for #8137. A few things to address before this can move forward:

  1. Drop the separate server.der file. Storing DER alongside PEM creates a consistency risk (Codex flagged this too). Instead, parse the DER from the PEM bytes at load time — you already have the PEM bytes in hand from reading server.pem, so you can extract the DER from them and compute the fingerprint from that. No extra file, no sync issue.

  2. The 30s timeout bump in goosed.ts — if the caching works, the original 10s should be plenty on subsequent launches. If it doesn't work (first launch), 30s is still arbitrary. I'd drop this change from this PR and keep it focused on the caching fix. If the timeout still needs adjusting, that's a separate conversation.

  3. Trim the doc comments. Comments like "Derive fingerprint from the actual cert bytes rather than trusting a stored string" and "Write private key with restricted permissions" are restating what the code does. Keep the one on the silent-error-handling design choice, remove the rest.

  4. Minor: load_cached_tls duplication. The two cfg-gated versions of load_cached_tls are identical except for the config construction line. Consider using inline cfg attributes (like the existing self_signed_config does) instead of duplicating the entire function.

treebird7 added a commit to treebird7/goose that referenced this pull request Mar 30, 2026
- Drop server.der: extract DER from PEM bytes at load time using a
  small der_from_cert_pem() helper (base64-decode the PEM body) so
  there is no separate file that can fall out of sync with the cert
- Remove cert_der parameter from save_tls_to_cache(); only cert PEM
  and private key are stored on disk
- Consolidate the two cfg-gated load_cached_tls() functions into one
  using inline #[cfg] on the differing lines, matching the style of
  self_signed_config()
- Revert the goosed.ts timeout change — not needed if caching works,
  and a separate concern if it does not
- Trim doc comments that restated the code; keep only the silent-error
  design note on save_tls_to_cache()

Signed-off-by: freedbird <freedbird@treebird.dev>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef1cdfc2b7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread crates/goose-server/src/tls.rs Outdated
treebird7 added a commit to treebird7/goose that referenced this pull request Mar 30, 2026
- Drop server.der: extract DER from PEM bytes at load time using a
  small der_from_cert_pem() helper (base64-decode the PEM body) so
  there is no separate file that can fall out of sync with the cert
- Remove cert_der parameter from save_tls_to_cache(); only cert PEM
  and private key are stored on disk
- Consolidate the two cfg-gated load_cached_tls() functions into one
  using inline #[cfg] on the differing lines, matching the style of
  self_signed_config()
- Revert the goosed.ts timeout change — not needed if caching works,
  and a separate concern if it does not
- Trim doc comments that restated the code; keep only the silent-error
  design note on save_tls_to_cache()

Signed-off-by: treebird <treebird@treebird.dev>
@treebird7 treebird7 force-pushed the fix/tls-cert-caching-on-startup branch from ef1cdfc to 921d9ac Compare March 30, 2026 18:43
@treebird7
Copy link
Copy Markdown
Contributor Author

Also - Updated the health-check timeout from 10s → 30s in ui/desktop/src/goosed.ts.

@treebird7
Copy link
Copy Markdown
Contributor Author

Anything else you need from me? I think I addressed all issues raised, but do tell me if i missed anything. cheers

@lifeizhou-ap
Copy link
Copy Markdown
Collaborator

Hi @treebird7, thanks for the contribution.

The PR looks good overall. A few follow-ups before merge:

  • Please merge the conflicts with the main branch.
  • Please trim the doc comments where the code is already self-explanatory.
  • Please add a short note on what you tested for these changes.

After you've finished above tasks, please ping me and we’ll finalize it.

…n repeat launches

On first launch goosed generates a self-signed certificate which can take
several seconds.  Subsequent launches reuse the cached cert/key from
~/.config/goose/tls/, skipping generation entirely.

- Add `tls_cache_dir()`, `load_cached_tls()`, `save_tls_to_cache()`, and
  `write_private_key()` helpers to tls.rs
- `self_signed_config()` checks the cache before generating a new cert;
  saves to cache after generation
- Private key written with mode 0o600 on Unix
- Increase desktop startup-status poll timeout from 10 s to 30 s to cover
  first-launch cert generation on slower machines
- Integrates cleanly alongside the new `from_pem_files` / `setup_tls` API
  added in main (user-supplied certs are never cached)

Tested by:
- Running goosed for the first time and confirming the cert files appear
  in ~/.config/goose/tls/
- Restarting goosed and confirming startup completes without regenerating
  the cert (fast path logged, same fingerprint emitted)
- Running `cargo clippy --all-targets -- -D warnings` with no warnings

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Treebird <treebird@treebird.dev>
@treebird7 treebird7 force-pushed the fix/tls-cert-caching-on-startup branch from 921d9ac to 9dce4cb Compare April 22, 2026 17:26
@treebird7
Copy link
Copy Markdown
Contributor Author

Hi @lifeizhou-ap, glad to be a part of the goose project!
Hopefully all done according to your request

@lifeizhou-ap lifeizhou-ap added this pull request to the merge queue Apr 23, 2026
Merged via the queue into aaif-goose:main with commit 092a40c Apr 23, 2026
21 checks passed
lifeizhou-ap added a commit that referenced this pull request Apr 23, 2026
* main: (34 commits)
  fix(goose-server): cache TLS cert to disk to avoid slow startup on first launch (#8174)
  feat: add Exa AI-powered search tool (#8487)
  fix: preprompt would show after loading session (#8744)
  commands to acp+ migration: extensions management (#8733)
  feat: desktop notification when goose finishes a task (#8647)
  harden code review skill for async state and default-resolution bugs (#8740)
  Feature/at agent mention (#8571)
  fix: removed hardcoded dependency of goose-acp-macro (#8753)
  perf: split agent setup into staged phases to reduce startup blocking (#8746)
  Add /skills command (#8600)
  Replace deprecated Claude ACP package links (#8625)
  removed the specific code owner for documentation change (#8749)
  fix(providers): handle missing delta field in streaming chunks (#8700)
  refactor(providers): extract http_status module and rename handle_status_openai_compat (#8620)
  fix(providers/openai): accept streaming chunks with both reasoning fields (#8715)
  feat: associate threads with projects (#8745)
  upgrade goose sdk and tui to be compatible with the latest agentclientprotocol/sdk package (#8667)
  feat: extend goose2 context window ux with auto-compaction (#8721)
  improve goose2 agent management flows (#8737)
  alexhancock/tui-improvements (#8736)
  ...
@treebird7 treebird7 deleted the fix/tls-cert-caching-on-startup branch April 23, 2026 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs_human label to set when a robot looks at a PR and can't handle it

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Backend server fails to start on fresh install (desktop app) - resolves on relaunch

4 participants