fix(docker): require ARCHON_ALLOW_ROOT_FALLBACK opt-in for root-fallback when chown fails #1573

Open
ztech-gthb wants to merge 1 commit into coleam00:dev from ztech-gthb:fix/chown-opt-in-root-fallback

Conversation

ztech-gthb (Contributor) commented May 4, 2026

Summary

UX Journey

Before (current origin/dev, post-#1518)

macOS user: docker compose up -d
  entrypoint runs as root
  chown -Rh appuser:appuser /.archon → fails (VirtioFS, host UID 501)
  echo "ERROR: Failed to fix ownership..." >&2
  exit 1
  → container crash-loops, no usable stack on macOS

After (this PR)

macOS user (no env var set yet): docker compose up -d
  chown of /.archon fails, chown of /home/appuser fails
  echo "WARNING: chown failed:" + per-path stderr from chown
  echo "ERROR: refusing to run as root. On macOS VirtioFS this is expected
         — set ARCHON_ALLOW_ROOT_FALLBACK=1 to opt in. On Linux, fix
         volume ownership instead."
  exit 1
  → user reads error, adds ARCHON_ALLOW_ROOT_FALLBACK=1 to their .env or compose env, retries

macOS user (env var set): docker compose up -d
  chowns fail, captured stderr printed
  echo "WARNING: ARCHON_ALLOW_ROOT_FALLBACK=1 — continuing as root with IS_SANDBOX=1."
  export IS_SANDBOX=1
  RUNNER=""  (run as root)
  → container starts, ClaudeProvider's UID-0 guard recognizes IS_SANDBOX=1 and proceeds

Linux user with broken volume permissions (no env var):
  chown fails with concrete error (e.g. "Read-only file system" or
  "Operation not permitted", i.e. EPERM)
  → user sees the actual error message and the explicit instruction to
    fix volume ownership rather than opt-in to root-fallback. No silent
    bypass.

Architecture Diagram

Before

docker-entrypoint.sh (post-#1518):

  if [ "$(id -u)" = "0" ]; then
    chown -Rh appuser:appuser /.archon 2>/dev/null    [-]
    || (echo ERROR; exit 1)                            [-] silent stderr,
    chown -Rh appuser:appuser /home/appuser 2>/dev/null  [-] no opt-in
    || (echo ERROR; exit 1)                            [-]
    RUNNER="gosu appuser"
  fi

After

docker-entrypoint.sh (this PR):

  if [ "$(id -u)" = "0" ]; then
    chown_failed=0; chown_errors=""                    [+]
    if ! chown_err=$(chown -Rh ... /.archon 2>&1)      [~] capture stderr
      chown_failed=1                                   [+]
      chown_errors+="  /.archon: ${chown_err}\n"       [+]
    if ! chown_err=$(chown -Rh ... /home/appuser 2>&1) [~] capture stderr
      chown_failed=1                                   [+]
      chown_errors+="  /home/appuser: ${chown_err}\n"  [+]

    if chown_failed == 0:                              [+]
      RUNNER="gosu appuser"
    else:                                              [+]
      echo "WARNING: chown failed:" + chown_errors     [+] actionable
      if ARCHON_ALLOW_ROOT_FALLBACK == "1":            [+] explicit opt-in
        IS_SANDBOX=1; RUNNER=""                        [~] same as old #1537
      else:                                            [+]
        echo "ERROR: refusing to run as root..." +     [+] fail-loud,
        instructions to set the opt-in or fix perms    [+] explicit guidance
        exit 1                                         [+]
  fi
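
The "After" diagram mixes shell with pseudocode, so here is a minimal runnable sketch of the same flow. It is not the actual entrypoint: `fix_ownership` and `FAIL_PATHS` are hypothetical stand-ins for the real `chown -Rh appuser:appuser` calls so the logic can be exercised without root, `resolve_runner` returns 1 where the entrypoint would `exit 1`, and the messages are paraphrased. It uses plain concatenation instead of `+=` so it also runs under a POSIX `sh`.

```shell
set -eu

NL='
'

# Hypothetical stand-in for `chown -Rh appuser:appuser "$1"`: any path listed in
# FAIL_PATHS (an assumption of this sketch) fails the way a denied chown would.
fix_ownership() {
  case " ${FAIL_PATHS:-} " in
    *" $1 "*)
      echo "chown: changing ownership of '$1': Operation not permitted" >&2
      return 1
      ;;
  esac
  return 0
}

# Sets RUNNER (and IS_SANDBOX on opt-in). Returns 1 where the entrypoint would exit 1.
resolve_runner() {
  chown_failed=0
  chown_errors=""
  for dir in /.archon /home/appuser; do
    # The `if !` guard keeps a failing substitution from tripping `set -e`.
    if ! chown_err=$(fix_ownership "$dir" 2>&1); then
      chown_failed=1
      chown_errors="${chown_errors}  ${dir}: ${chown_err}${NL}"
    fi
  done

  if [ "$chown_failed" -eq 0 ]; then
    RUNNER="gosu appuser"
    return 0
  fi

  printf 'WARNING: chown failed:\n%s' "$chown_errors" >&2
  if [ "${ARCHON_ALLOW_ROOT_FALLBACK:-0}" = "1" ]; then
    echo "WARNING: ARCHON_ALLOW_ROOT_FALLBACK=1, continuing as root with IS_SANDBOX=1." >&2
    export IS_SANDBOX=1
    RUNNER=""
    return 0
  fi
  echo "ERROR: refusing to run as root. Set ARCHON_ALLOW_ROOT_FALLBACK=1 to opt in (macOS VirtioFS), or fix volume ownership (Linux)." >&2
  return 1
}
```

Only the exact value `1` opts in; an unset, empty, or misspelled variable falls through to the error branch.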

Connection inventory:

| From | To | Status | Notes |
| --- | --- | --- | --- |
| chown invocations | stderr capture into shell var | modified | was 2>/dev/null, now 2>&1 into chown_err |
| failure path | IS_SANDBOX=1 export | modified | gated by env-var, was unconditional in pre-#1518 #1537 |
| failure path | ARCHON_ALLOW_ROOT_FALLBACK env var | new | explicit opt-in |
| failure path | exit 1 with instructions | new | replaces silent IS_SANDBOX bypass |
| ClaudeProvider's UID-0 guard | IS_SANDBOX env | unchanged | still requires IS_SANDBOX=1 to skip the guard |
| /home/appuser chown | shared fallback block | new | previously had its own fail-loud path; now uses the same opt-in branch as /.archon |

Label Snapshot

  • Risk: risk: low
  • Size: size: S
  • Scope: docker
  • Module: docker:entrypoint

Change Metadata

  • Change type: bug
  • Primary scope: multi (config + macOS UX)

Linked Issue

Validation Evidence (required)

bash -n docker-entrypoint.sh                # syntax OK

End-to-end manual verification:

  1. Linux happy path (chown succeeds): container starts as appuser via gosu, no warnings. Behavior identical to current origin/dev.
  2. macOS without opt-in: chown fails for both paths, both errors printed in the warning, then ERROR: refusing to run as root... followed by exit 1. Container crash-loops as in current state — but with a clear instruction message for the user.
  3. macOS with ARCHON_ALLOW_ROOT_FALLBACK=1 (set via compose env): chown fails, both errors printed, then WARNING: ARCHON_ALLOW_ROOT_FALLBACK=1 — continuing as root with IS_SANDBOX=1., container starts, ClaudeProvider passes its UID-0 guard. End-to-end functional.
  4. Linux with simulated chown failure (e.g. RO mount, --read-only flag): same as macOS-without-opt-in — clear error, actionable instruction. No silent root-execution.

Security Impact (required)

Compatibility / Migration

  • Backward compatible? Yes for any user whose chown succeeds (Linux-default + named-volume setups). For users currently blocked on macOS by the post-#1518 crash-loop, this PR introduces an opt-in fix: they need to set ARCHON_ALLOW_ROOT_FALLBACK=1 once, e.g. in their docker-compose.override.yml environment: block or in .env.
  • Config/env changes? One new opt-in env var (ARCHON_ALLOW_ROOT_FALLBACK). Default unset → behavior identical to current origin/dev.
  • Database migration needed? No
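
For the `.env` route mentioned above, a small sketch of adding the opt-in exactly once. The helper name `enable_root_fallback` is hypothetical, and this assumes the project's compose setup forwards `.env` values into the container, as the description implies; if it does not, the `environment:` block in the override file is the place to set it.

```shell
set -eu

# Hypothetical helper: append the opt-in to an env file unless a value is already present.
enable_root_fallback() {
  env_file="$1"
  touch "$env_file"
  if grep -q '^ARCHON_ALLOW_ROOT_FALLBACK=' "$env_file"; then
    return 0   # already configured; leave the existing value alone
  fi
  # Terminate the last line first so the new entry does not get glued onto it.
  if [ -s "$env_file" ] && [ -n "$(tail -c 1 "$env_file")" ]; then
    echo >> "$env_file"
  fi
  echo 'ARCHON_ALLOW_ROOT_FALLBACK=1' >> "$env_file"
}
```

Usage would be `enable_root_fallback .env` followed by `docker compose up -d`.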

Human Verification (required)

Verified scenarios:

  • syntax-clean (bash -n)
  • macOS host with opt-in set: container starts, IS_SANDBOX=1 in env, gosu skipped, claude binary loads (verified with docker exec ... env | grep IS_SANDBOX)
  • macOS host without opt-in: container exits 1 with the expected error message including chown stderr from both paths
  • Both chowns (/.archon and /home/appuser) tested independently — message accumulates per-path errors

Edge cases checked:

  • chown_err capture works under set -e because the if ! cmd; then guard suppresses the failing exit-status from triggering errexit
  • printf "%s" "$chown_errors" preserves embedded newlines correctly
  • ARCHON_ALLOW_ROOT_FALLBACK is checked with a :-0 default so an unset var resolves to "0" (fail-loud) rather than triggering set -u
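
The three edge cases above can be reproduced in isolation. This is a sketch, not an excerpt from the entrypoint: `failing_chown` is a hypothetical stand-in for a chown that is denied. One assumption worth flagging: `printf "%s"` only preserves newlines that are real newline characters in the variable. If the accumulator were built with a literal `\n` inside double quotes, as the diagram's shorthand reads, `%s` would print the two characters verbatim and `%b` would be needed instead.

```shell
set -eu

# Hypothetical stand-in for a chown that is denied.
failing_chown() { echo "Operation not permitted" >&2; return 1; }

# (1) errexit: the `if !` guard keeps the failing substitution from aborting the script.
survived=0
if ! chown_err=$(failing_chown 2>&1); then
  survived=1   # reached even though `set -e` is active
fi

# (2) newlines: count the newline characters each variant actually emits.
NL='
'
real_newline="  /.archon: ${chown_err}${NL}"
literal_backslash_n="  /.archon: ${chown_err}\n"
lines_real=$(printf '%s' "$real_newline" | wc -l | tr -d ' ')
lines_literal=$(printf '%s' "$literal_backslash_n" | wc -l | tr -d ' ')
lines_with_b=$(printf '%b' "$literal_backslash_n" | wc -l | tr -d ' ')

# (3) nounset: the :-0 default makes an unset variable read as "0" instead of erroring.
unset ARCHON_ALLOW_ROOT_FALLBACK
opt_in="${ARCHON_ALLOW_ROOT_FALLBACK:-0}"
```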

What was not verified:

  • SELinux/AppArmor-denied chown on a real host — only the macOS VirtioFS case end-to-end. The shape of the error path is identical so no behavioral surprise expected, but I haven't reproduced an SELinux denial under enforcement.

Side Effects / Blast Radius (required)

Rollback Plan (required)

  • Fast rollback: revert this commit. Behavior returns to current origin/dev (fail-loud unconditionally, no opt-in). No data migration.
  • Feature flags: the env var itself is the toggle — leaving it unset on existing installs has zero behavior change.
  • Observable failure symptom: a misconfigured opt-in (e.g. typo in env-var name) would simply have no effect — user sees the unchanged exit-1 path, just like without the env var. Safe failure mode.

Risks and Mitigations

  • Risk: a Linux operator who sees the macOS-style warning and "set ARCHON_ALLOW_ROOT_FALLBACK=1" hint might be tempted to set it on a Linux host with an actual misconfiguration, masking the real issue.
    • Mitigation: the error message explicitly says "On macOS VirtioFS this is expected — set ARCHON_ALLOW_ROOT_FALLBACK=1 to opt in. On Linux, fix volume ownership instead." The Linux-specific guidance is in-line.
  • Risk: the env var name (ARCHON_ALLOW_ROOT_FALLBACK) might clash with a future CI tool expecting it to mean something else.
    • Mitigation: name is ARCHON_-prefixed, so unlikely collision with generic CI vars. Could be tightened further (ARCHON_DOCKER_ALLOW_ROOT, etc.) if reviewers prefer — happy to rename.
  • Risk: future entrypoint additions (e.g. another chown of a future persisted directory) might forget to participate in the same fallback block.
    • Mitigation: the comment block in the entrypoint explicitly explains the pattern; future contributors should extend the chown_failed/chown_errors accumulator rather than introduce a new isolated exit 1.
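
One way to make the last mitigation harder to forget is to route every ownership fix through a single helper. The sketch below is a suggestion under assumptions, not code from this PR: `try_chown` is a hypothetical name, and it takes the command as arguments so it can be exercised without root.

```shell
set -eu

NL='
'
chown_failed=0
chown_errors=""

# Hypothetical helper: run an ownership-fix command against one path and fold any
# failure into the shared accumulator instead of exiting on the spot.
# Usage: try_chown <path> <command> [args...]
try_chown() {
  target="$1"
  shift
  if ! chown_err=$("$@" "$target" 2>&1); then
    chown_failed=1
    chown_errors="${chown_errors}  ${target}: ${chown_err}${NL}"
  fi
}

# In the entrypoint this might read:
#   try_chown /.archon       chown -Rh appuser:appuser
#   try_chown /home/appuser  chown -Rh appuser:appuser
# A future persisted directory would add one more try_chown line and inherit
# the same opt-in branch, rather than introducing its own `exit 1`.
```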

Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling during container startup with detailed messages for permission issues.
    • Added helpful instructions for resolving ownership-related startup failures.
  • New Features

    • Added optional root fallback mechanism via ARCHON_ALLOW_ROOT_FALLBACK environment variable for cases where standard permission configurations fail.

coderabbitai (Bot) commented May 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1c47a10f-18cd-472e-8471-6ef64a9917b1

📥 Commits

Reviewing files that changed from the base of the PR and between 0c5d7b1 and 38f6669.

📒 Files selected for processing (1)
  • docker-entrypoint.sh

📝 Walkthrough


The docker-entrypoint.sh entrypoint script now captures detailed chown stderr, conditionally sets privilege-drop behavior, and provides an opt-in root fallback when ownership changes fail instead of failing fast.

Changes

Privilege-Drop Initialization & Error Handling

| Layer / File(s) | Summary |
| --- | --- |
| Error Capture & Variable Tracking: docker-entrypoint.sh (lines 11–30) | Introduces chown_failed and chown_errors variables to track recursive chown attempts on /.archon and /home/appuser, capturing detailed stderr for diagnostic output. |
| Conditional Privilege-Drop Assignment: docker-entrypoint.sh (lines 31–42) | Sets RUNNER="gosu appuser" on successful chown; otherwise, checks ARCHON_ALLOW_ROOT_FALLBACK environment variable to decide between root fallback (exporting IS_SANDBOX=1 and clearing RUNNER) or exiting with a helpful error message. |
| Error Messaging & Documentation: docker-entrypoint.sh (lines 43–51) | On chown failure without fallback opt-in, outputs detailed error message explaining ownership issues and how to enable ARCHON_ALLOW_ROOT_FALLBACK=1 to continue as root, then exits. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • coleam00/Archon#1272: Adds mkdir -p /.archon/workspaces and /.archon/worktrees setup before privilege drop in the same entrypoint script, complementing this PR's improved chown handling and error recovery for /.archon ownership.

Poem

🐰 A docker hare learns to chown with care,
Catching errors where they float in air,
When root's not welcome, a fallback stands tall,
IS_SANDBOX=1 hops through it all! 🎩✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: requiring an explicit opt-in environment variable for root fallback behavior when chown fails. |
| Description check | ✅ Passed | The pull request description is comprehensive and well-structured, covering all major template sections including problem statement, validation evidence, security impact, compatibility, verification, side effects, rollback plan, and risk mitigations. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
