
Non-fatal chown fallback for macOS bind mounts (complement to #1307) #1537

Open
ztech-gthb wants to merge 1 commit into coleam00:dev from ztech-gthb:fix/macos-virtiofs-chown-fallback

Conversation


ztech-gthb commented May 2, 2026

Summary

  • Problem: On macOS bind mounts (VirtioFS), the entrypoint's chown -Rh appuser:appuser /.archon fails because the host controls ownership and host UIDs (e.g. 501) don't map to the container's appuser (1001). The script then exits with status 1 and the container crash-loops.
  • Why it matters: Affects every macOS user running Archon via Docker Desktop with the default bind-mount setup. Currently the only workaround is to run Archon outside Docker entirely.
  • What changed: Failed chown is now treated as a warning. Container falls back to running as root, with IS_SANDBOX=1 exported so ClaudeProvider skips its UID-0 safety check (we're still inside Docker — sandboxed in the meaningful sense).
  • What did not change (scope boundary): No changes to the safe.directory loop (already fixed by #1307, "fix(docker): register safe.directory for all repos on bind-mount restart"), Linux behaviour (chown succeeds → unchanged path), Kubernetes/non-root behaviour (id != 0 → unchanged path), or the security model on Linux.
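
The three startup paths named in this summary can be written down as a small decision function. This is only a sketch for orientation: `pick_start_mode` and the strings it prints are hypothetical names, and the real entrypoint branches on the result of `chown` directly rather than taking it as an argument.

```shell
#!/bin/sh
# Hypothetical helper: maps (uid, chown outcome) to the startup path described in the PR.
# $1 = numeric UID of the entrypoint process
# $2 = "ok" or "fail" (assumed outcome of chown on /.archon)
pick_start_mode() {
  if [ "$1" -ne 0 ]; then
    # Kubernetes / --user invocations: not root, path unchanged by this PR
    echo "unprivileged"
  elif [ "$2" = "ok" ]; then
    # Linux happy path: ownership fixed, privileges dropped via gosu
    echo "gosu-appuser"
  else
    # New in this PR: warn, export IS_SANDBOX=1, keep running as root
    echo "root-fallback"
  fi
}

pick_start_mode 0 ok      # prints: gosu-appuser
pick_start_mode 0 fail    # prints: root-fallback
pick_start_mode 1001 ok   # prints: unprivileged
```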

UX Journey

Before

User on macOS                     Container start
─────────────                     ───────────────
docker compose up ──────────────▶ entrypoint: chown -Rh appuser /.archon
                                    │
                                    └─ FAILS (host UID, VirtioFS)
                                       │
                                       └─ "ERROR: Failed to fix ownership"
                                          exit 1
                                  Container crash-loops
container never starts ◀────────  user blocked

After

User on macOS                     Container start
─────────────                     ───────────────
docker compose up ──────────────▶ entrypoint: chown -Rh appuser /.archon
                                    │
                                    └─ FAILS (host UID, VirtioFS)
                                       │
                                       └─ [warning logged]
                                          [export IS_SANDBOX=1]
                                          [RUNNER=""]   ←── run as root
                                  Server starts as root
                                  ClaudeProvider sees IS_SANDBOX=1,
                                  skips UID-0 check
container running ◀──────────────  user can use Archon

Architecture Diagram

Before

docker-entrypoint.sh
        │
        ├─▶ chown /.archon (fail-fatal)
        ├─▶ RUNNER="gosu appuser"
        └─▶ exec $RUNNER bun run setup-auth

After

docker-entrypoint.sh
        │
        ├─▶ chown /.archon (try)
        │      ├─ ok  ──▶ RUNNER="gosu appuser"
        │      └─ fail ──▶ warn + export IS_SANDBOX=1 + RUNNER=""   [+]
        └─▶ exec $RUNNER bun run setup-auth

Connection inventory:

| From | To | Status | Notes |
| --- | --- | --- | --- |
| docker-entrypoint.sh | gosu appuser | unchanged (happy path) | only when chown succeeds |
| docker-entrypoint.sh | run as root + IS_SANDBOX=1 | new | only when chown fails |
| ClaudeProvider | IS_SANDBOX env | unchanged | already reads this var |
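
Both branches in the diagram end in the same `exec $RUNNER ...` line, which works because an unquoted empty variable expands to no words at all. The sketch below illustrates that shell behaviour in isolation. `run_with_runner` is a hypothetical wrapper, and `env` stands in for `gosu` so the example runs without gosu installed.

```shell
#!/bin/sh
# Hypothetical wrapper; the real script uses `exec $RUNNER ...` so the server replaces the shell.
run_with_runner() {
  # RUNNER is deliberately unquoted: an empty value disappears entirely,
  # while a value such as "gosu appuser" splits into a command plus its argument.
  $RUNNER "$@"
}

# Fallback branch: empty RUNNER, so the command runs as the current user.
RUNNER=""
run_with_runner echo "started as current user"

# Happy-path branch, with `env` as a stand-in for gosu.
RUNNER="env DEMO_PREFIX=wrapped"
run_with_runner sh -c 'echo "started via $DEMO_PREFIX runner"'
```

Quoting the variable (`"$RUNNER"`) would break the fallback branch, because the shell would then try to execute an empty command name.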

Label Snapshot

  • Risk: risk: low
  • Size: size: XS
  • Scope: core
  • Module: core:docker-entrypoint

Change Metadata

  • Change type: bug
  • Primary scope: core

Linked Issue

Validation Evidence (required)

# Locally on macOS (Docker Desktop, VirtioFS bind mount):
docker compose down
docker compose build --no-cache
docker compose up
# Container starts, "WARNING: Could not fix ownership..." appears once,
# then server reaches gh_auth.status_ok and serves on :5174.
  • Evidence: container logs show single warning then normal startup. No crash-loop. Web UI reachable. Functional integration test (chat-message → soft-sync) confirmed working in the same session.
  • Skipped commands: bun run type-check, bun run lint, bun run test — touched file is a shell script, not TS; lint-staged was a no-op for .sh.

Security Impact (required)

  • New permissions/capabilities? No. The entrypoint already ran as root before this change; when the fallback fires, the difference is that the script continues and starts the server as root instead of exiting with status 1.
  • New external network calls? No.
  • Secrets/tokens handling changed? No.
  • File system access scope changed? No. chown itself is unchanged; only the failure handling differs.
  • Security note: running as root is undesirable but not a regression — the alternative is "container does not start", which strictly speaking is more secure but also unusable. IS_SANDBOX=1 mirrors how Claude SDK itself handles known-sandboxed contexts; the container is still isolated by Docker.

Compatibility / Migration

  • Backward compatible? Yes — Linux setups where chown succeeds are unchanged. Only the previously-fatal failure path is altered.
  • Config/env changes? Yes (passive): IS_SANDBOX=1 is exported in the fallback path. Users who already set IS_SANDBOX explicitly are not overridden if chown succeeds.
  • Database migration needed? No.

Human Verification (required)

  • Verified scenarios:
  • Edge cases checked: chown-success path (Linux Docker host emulator) still goes through the appuser branch; verified by inspection of the conditional, no code path duplication.
  • What was not verified: Kubernetes / --user-flag invocation (the else branch is unchanged structurally; behaviour-equivalent to pre-PR).

Side Effects / Blast Radius (required)

  • Affected subsystems: container startup only.
  • Potential unintended effects: a Linux user who intentionally configured chown to fail (highly unusual) would now silently get root-execution instead of exit. Mitigated by the explicit WARNING: log line.
  • Guardrails: warning is logged on stderr, visible in docker compose logs. IS_SANDBOX=1 is observable in process env.
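
One way to check the "observable in process env" guardrail is to read a process's environment from `/proc/<pid>/environ`, which stores NUL-separated `NAME=VALUE` entries on Linux. The sketch below is an assumption-laden illustration: `environ_has` is a hypothetical helper, and the demo runs against a temporary file rather than a real container.

```shell
#!/bin/sh
# Hypothetical helper: succeed if NAME=VALUE appears in a NUL-separated environ file.
# $1 = path to an environ file (e.g. /proc/1/environ), $2 = NAME, $3 = VALUE
environ_has() {
  # Turn NUL separators into newlines, then require an exact whole-line match.
  tr '\0' '\n' < "$1" | grep -qx "$2=$3"
}

# Demo against a temporary file that mimics the /proc/<pid>/environ layout.
demo_env=$(mktemp)
printf 'PATH=/usr/bin\0IS_SANDBOX=1\0' > "$demo_env"
if environ_has "$demo_env" IS_SANDBOX 1; then
  echo "fallback path was taken"
fi
rm -f "$demo_env"
```

Inside the running container the call would be `environ_has /proc/1/environ IS_SANDBOX 1`, assuming the server ends up as PID 1 after the entrypoint's `exec`. A plain `echo $IS_SANDBOX` in a `docker compose exec` shell is not a reliable check, because the variable is exported by the entrypoint rather than set in the container configuration.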

Rollback Plan (required)

  • Fast rollback: revert this single commit (git revert <sha>). One-file change, no DB/state to migrate.
  • Feature flags: none.
  • Observable failure symptoms: macOS container would resume crash-looping with the original ERROR: Failed to fix ownership message — same symptom as before this PR.

Risks and Mitigations

  • Risk: A misconfigured Linux host where chown fails for an unrelated reason (filesystem error, capability missing) would now log a warning and run as root instead of failing fast.
    • Mitigation: Warning is loud and explicit. Discovery via logs is easy. The previous behaviour (exit 1) was equally hostile to debugging in that case (no remediation hint either).
  • Risk: IS_SANDBOX=1 skips a ClaudeProvider UID safety check that exists for a reason.
    • Mitigation: The check is for "user is running Claude as root on a host system" — that's not what's happening here (we're inside a container). The flag is purpose-named for exactly this pattern.

Summary by CodeRabbit

  • Bug Fixes
    • Container initialization now handles permission configuration failures gracefully, ensuring the service can start in restricted deployment environments where privilege adjustments may not be possible.

Complements coleam00#1307. On macOS Docker Desktop with VirtioFS bind mounts, host UIDs (e.g. 501) don't map to appuser (1001) and chown fails; the script then exits with status 1 and the container crash-loops. This change treats chown failure as a warning, falls back to running as root, and exports IS_SANDBOX=1 so ClaudeProvider skips its UID-0 check (we're still sandboxed by Docker).

coderabbitai Bot commented May 2, 2026

📝 Walkthrough

Walkthrough

The entrypoint script now handles chown failure gracefully by logging a warning, setting a sandbox flag, and continuing as root instead of exiting. Previously, failed chown operations triggered immediate script termination.

Changes

Root-Privilege Fallback Handling

| Layer / File(s) | Summary |
| --- | --- |
| Error Handling & Startup Flow (docker-entrypoint.sh) | Root execution path now attempts chown /.archon and only drops privileges via gosu if successful. On failure, logs a warning, sets IS_SANDBOX=1, clears RUNNER, and allows container to start as root; subsequent git, credential helper, and binary initialization steps proceed unchanged. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A rabbit hops through Docker's door,
No crash on chown anymore!
Sandboxed, graceful, running free,
Root when needed—permission spree! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately reflects the main change: a non-fatal fallback for chown failures on macOS bind mounts, complementing the related #1307 fix. |
| Description check | ✅ Passed | The description comprehensively covers the template requirements with detailed problem statement, UX journey, architecture diagrams, security analysis, validation evidence, and rollback plan. All critical sections are well-populated. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docker-entrypoint.sh`:
- Line 17: The chown invocation currently silences stderr with `2>/dev/null`,
losing the real OS error; change the `if chown -Rh appuser:appuser /.archon
2>/dev/null; then` pattern to capture stderr into a variable (e.g.
`output=$(chown -Rh appuser:appuser /.archon 2>&1)`) and use the command's exit
status to decide the branch, then include that captured `output` in the warning
message so the exact chown error (like "Operation not permitted" or "Read-only
file system") is logged.
- Around line 17-23: The script currently unconditionally exports IS_SANDBOX=1
on any chown failure, silently bypassing the root-safety guard in
ClaudeProvider; change this so IS_SANDBOX is only set when the user explicitly
opted in (e.g., check if an opt-in env var is already set like IS_SANDBOX=1 or a
new explicit flag SANDBOX_FORCE=1) instead of auto-exporting it on failure.
Update the chown block around the chown -Rh appuser:appuser /.archon check and
the RUNNER assignment so that on failure you print the warning and leave
IS_SANDBOX alone unless the opt-in env var is present; if you need to support
macOS VirtioFS convenience, document/require users to set the opt-in variable
before starting the container rather than assigning IS_SANDBOX=1 automatically.
Ensure references to RUNNER and IS_SANDBOX in docker-entrypoint.sh remain
consistent with this opt-in behavior so ClaudeProvider's getProcessUid() guard
is not silently bypassed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d93421b6-7d5c-48a9-9788-420afb96ef49

📥 Commits

Reviewing files that changed from the base of the PR and between 69b2c89 and 900270d.

📒 Files selected for processing (1)
  • docker-entrypoint.sh

Comment thread docker-entrypoint.sh
# as a warning and fall back to running as root so the container still starts
# rather than crash-looping. IS_SANDBOX=1 lets ClaudeProvider skip its UID-0
# safety check (we're still inside Docker — sandboxed in the meaningful sense).
if chown -Rh appuser:appuser /.archon 2>/dev/null; then

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

2>/dev/null discards the actual chown error, making the warning on Line 20 unactionable

When chown fails, the only information in the warning message is "bind mount with incompatible options?" — the actual OS error (Operation not permitted, Read-only file system, etc.) is thrown away. On unexpected Linux failures this makes diagnosis very difficult.

🔍 Proposed fix — include the chown error in the warning
-  if chown -Rh appuser:appuser /.archon 2>/dev/null; then
+  if chown_err=$(chown -Rh appuser:appuser /.archon 2>&1); then
     RUNNER="gosu appuser"
   else
-    echo "WARNING: Could not fix ownership of /.archon (bind mount with incompatible options?) — running as root" >&2
+    echo "WARNING: Could not fix ownership of /.archon — running as root (chown said: ${chown_err})" >&2
     export IS_SANDBOX=1
     RUNNER=""
   fi

if cmd=$(command 2>&1); then routes both stdout and stderr into the variable while still using the command's exit status to pick the branch — no change to set -e behaviour.
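
The capture-and-branch pattern described here can be tried in isolation. In the sketch below, `failing_chown` is a hypothetical stand-in for the real `chown` call, so the example runs anywhere without touching `/.archon`.

```shell
#!/bin/sh
set -e

# Stand-in for a failing chown: writes an error to stderr and returns non-zero.
failing_chown() {
  echo "Operation not permitted" >&2
  return 1
}

# The assignment takes the exit status of the command substitution, and because
# it is the condition of `if`, a failure here does not trip `set -e`.
if chown_err=$(failing_chown 2>&1); then
  outcome="ownership fixed"
else
  outcome="chown failed: ${chown_err}"
fi
echo "$outcome"   # prints: chown failed: Operation not permitted
```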


Comment thread docker-entrypoint.sh
Comment on lines +17 to 23
if chown -Rh appuser:appuser /.archon 2>/dev/null; then
RUNNER="gosu appuser"
else
echo "WARNING: Could not fix ownership of /.archon (bind mount with incompatible options?) — running as root" >&2
export IS_SANDBOX=1
RUNNER=""
fi

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

IS_SANDBOX=1 silently bypasses the UID-0 safety guard for any chown failure, not just macOS VirtioFS

The ClaudeProvider constructor (see packages/providers/src/claude/provider.ts Line 904) throws when getProcessUid() === 0 && IS_SANDBOX !== '1'. That guard exists to prevent running Claude with bypassPermissions as root. This PR unconditionally exports IS_SANDBOX=1 whenever chown fails — but inside a Linux container there is no way to distinguish a macOS VirtioFS bind mount from a Linux host with a misconfigured volume driver, an SELinux/AppArmor policy denial, or a wrong --mount type. In all those cases the safety guard is silently bypassed and the Claude binary runs as root.
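
To make the guard condition concrete, here is a shell paraphrase of it as this comment describes it. It is not the TypeScript source: `guard_refuses` is a hypothetical function that only mirrors the stated condition (UID 0 and `IS_SANDBOX` not equal to `1`).

```shell
#!/bin/sh
# Hypothetical shell paraphrase of the guard condition quoted above.
# $1 = process UID, $2 = value of IS_SANDBOX (may be empty)
# Returns 0 when the guard would refuse to start, non-zero otherwise.
guard_refuses() {
  [ "$1" -eq 0 ] && [ "$2" != "1" ]
}

# Walk the four combinations of (root or not) x (IS_SANDBOX set to 1 or not).
for combo in "0:" "0:1" "1001:" "1001:1"; do
  uid=${combo%%:*}
  sandbox=${combo#*:}
  if guard_refuses "$uid" "$sandbox"; then
    echo "uid=$uid IS_SANDBOX='$sandbox' -> refuses"
  else
    echo "uid=$uid IS_SANDBOX='$sandbox' -> starts"
  fi
done
```

Only the first combination (root without `IS_SANDBOX=1`) is refused, which is why exporting the variable automatically removes the guard's only blocking case.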

A minimal mitigation is an explicit opt-in variable so the behaviour is intentional rather than automatic:

🛡️ Proposed fix — gate root fallback on explicit opt-in
-  if chown -Rh appuser:appuser /.archon 2>/dev/null; then
+  if chown -Rh appuser:appuser /.archon 2>/tmp/_archon_chown_err; then
     RUNNER="gosu appuser"
   else
-    echo "WARNING: Could not fix ownership of /.archon (bind mount with incompatible options?) — running as root" >&2
-    export IS_SANDBOX=1
-    RUNNER=""
+    _chown_msg=$(cat /tmp/_archon_chown_err 2>/dev/null); rm -f /tmp/_archon_chown_err
+    echo "WARNING: chown /.archon failed (${_chown_msg:-unknown reason})." >&2
+    if [ "${ARCHON_ALLOW_ROOT_FALLBACK:-0}" = "1" ]; then
+      echo "WARNING: ARCHON_ALLOW_ROOT_FALLBACK=1 — continuing as root with IS_SANDBOX=1." >&2
+      export IS_SANDBOX=1
+      RUNNER=""
+    else
+      echo "ERROR: Set ARCHON_ALLOW_ROOT_FALLBACK=1 (e.g. macOS VirtioFS) to allow root fallback, or fix volume ownership." >&2
+      exit 1
+    fi
   fi

This keeps the macOS fix functional (users add one env var to their docker-compose.yml or docker run command) while preventing the safety guard from being bypassed silently on Linux hosts with inadvertent chown failures.
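
The opt-in test in the proposed fix relies on the `${VAR:-0}` default expansion, so an unset or empty variable means "deny". A minimal sketch of that check on its own follows. `root_fallback_allowed` is a hypothetical helper, and `ARCHON_ALLOW_ROOT_FALLBACK` is the variable name proposed in this comment, not an existing Archon setting.

```shell
#!/bin/sh
# Hypothetical helper around the opt-in variable proposed in this review comment.
# ${VAR:-0} substitutes 0 when the variable is unset or empty, so the default is "deny".
root_fallback_allowed() {
  [ "${ARCHON_ALLOW_ROOT_FALLBACK:-0}" = "1" ]
}

unset ARCHON_ALLOW_ROOT_FALLBACK
root_fallback_allowed || echo "unset: fallback denied"

ARCHON_ALLOW_ROOT_FALLBACK=1
root_fallback_allowed && echo "set to 1: fallback allowed"
unset ARCHON_ALLOW_ROOT_FALLBACK
```

Only the exact value `1` enables the fallback; values such as `true` or `yes` are treated as "deny". With `docker run`, a user would pass the variable via the `-e` flag.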

