Skip to content

test(cli): add command-coverage smoke gate#1086

Merged
buremba merged 1 commit into
mainfrom
feat/cli-smoke-coverage
May 26, 2026
Merged

test(cli): add command-coverage smoke gate#1086
buremba merged 1 commit into
mainfrom
feat/cli-smoke-coverage

Conversation

@buremba
Copy link
Copy Markdown
Member

@buremba buremba commented May 26, 2026

What

Adds scripts/cli-smoke.sh, a command-coverage smoke gate for the lobu CLI, plus the cli-smoke CI job and a make test-e2e-cli target.

sdk-e2e.sh already proves the SDK lifecycle (init → run → apply → prune → worker turn → connector → watcher → client) down one deep path. It does not prove that the rest of the command surface even runs. This gate fills that: it boots one local lobu run (embedded Postgres + the deterministic mock provider from scripts/sdk-e2e/, no API key) under an isolated $HOME, then invokes every lobu command/subcommand once and asserts each either returns its documented success marker or fails gracefully (clean message + controlled exit, no stack trace). It collects all failures and prints a summary instead of failing on the first miss, so one run tells you exactly what's broken.

Coverage

66 checks, 4 documented skips. Commands hit: version/help, init (+--list-providers, --from-org), validate (+broken-config negative), doctor (+--memory-only), telemetry on/off/status, context (list/current/add/use/rm), whoami, status, org list/current/set, token (print/--raw/create/revoke), agent (list/get/create/scaffold/update/delete/config get/config patch + negatives), call (--list + a tool call), link/unlink, apply (--only validation + --dry-run + real), chat (--json/--new/--dry-run/-C), memory (health/run/exec/seed/configure/init/org), login (--token + non-interactive device-code bail)/logout, connector (runtime-self-check + run --check).

Skipped with a logged reason: org create (opens a browser), memory browser-auth capture (Chrome + Keychain), interactive device-code login (needs a TTY), chat --user (needs a platform connection).

Notes

  • Runs under an isolated $HOME so it never touches the dev's real ~/.config/lobu contexts/credentials or ~/.lobu data.
  • local-context commands use the loopback /api/local-init bootstrap (same path sdk-e2e relies on), so no real login is needed.
  • The cli-smoke CI job mirrors sdk-e2e (node 22, ICU shim, bubblewrap, make build-packages).
  • Biome now ignores the .cli-smoke-run/.sdk-e2e-run/.managed-e2e-run scratch dirs.

Validated locally on macOS/node@22: 66 passed, 0 failed, 4 skipped (ran it three times). make review ran it independently with the same result.

Summary by CodeRabbit

  • Tests

    • Added an end-to-end CLI “smoke” test that exercises the full CLI surface (commands/subcommands, init/apply/chat/memory/identity/org/agent flows) under an isolated environment with a mock provider, verifying documented outputs and graceful failures.
  • Chores

    • Added a CI gate to run the CLI smoke job, introduced a Make target to invoke it, updated ignore/config rules to exclude smoke run artifacts.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 26, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

Adds a CI job and Make target plus a bash smoke script that boots an isolated Lobu stack with a mock provider, runs broad lobu CLI command coverage across identity, agents, memory, chat, and org flows, and reports pass/fail/skip.

Changes

CLI Smoke Test Infrastructure and Execution

Layer / File(s) Summary
Test infrastructure scaffolding and CI wiring
.github/workflows/ci.yml, Makefile, .gitignore, config/biome.config.json
New cli-smoke CI job, test-e2e-cli Makefile target, .cli-smoke-run/ ignore rule, and Biome excludes establish the smoke test as a CI gate.
Script environment initialization and helper functions
scripts/cli-smoke.sh (lines 1–143)
Bash strict-mode, isolated HOME, cleanup trap, counters, and helper/assertion functions (runlobu, expect_grep, expect_ok, expect_fail_grep, expect_exit); starts deterministic mock provider and rewrites provider registry to mock port.
CLI command test execution across functional areas
scripts/cli-smoke.sh (lines 144–375)
Scaffolds a project, boots lobu run, waits for auto-apply, and executes static validation, identity/token/org flows, agent CRUD/config, admin dispatch, link/unlink, apply dry-run/apply, chat turns, memory MCP (seed/exec/init/doctor), login/logout, and marks interactive/browser paths as skipped; prints PASS/FAIL/SKIP summary and exits nonzero on failures.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • lobu-ai/lobu#1033: Related to embedded Postgres handling used by lobu doctor checks exercised in the smoke script.

Poem

🐰 I hopped through scripts and CI this day,

Mock provider hummed, the servers out to play.
Commands I whispered, outputs checked with care,
Passes tallied up, failures few and rare.
A little rabbit cheers the CLI’s fresh sway.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 45.45% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: adding a command-coverage smoke gate for the CLI, which is the primary objective of this pull request.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering the what, coverage details, and notes sections, though it omits the template's test plan checklist.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/cli-smoke-coverage

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter
Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@buremba
Copy link
Copy Markdown
Member Author

buremba commented May 26, 2026

bug_free 91, simplicity 82, slop 0, bugs 0, 0 blockers

Typecheck, unit, and integration logs all passed. Reviewed diff, ran bash -n scripts/cli-smoke.sh, then ran full scripts/cli-smoke.sh: 66 passed, 0 failed, 4 intentional skips.

Full verdict JSON
{
  "bug_free_confidence": 91,
  "bugs": 0,
  "slop": 0,
  "simplicity": 82,
  "blockers": [],
  "change_type": "test",
  "behavior_change_risk": "low",
  "tests_adequate": true,
  "suggested_fixes": [],
  "notes": "Typecheck, unit, and integration logs all passed. Reviewed diff, ran bash -n scripts/cli-smoke.sh, then ran full scripts/cli-smoke.sh: 66 passed, 0 failed, 4 intentional skips.",
  "categories": {
    "src": 0,
    "tests": 375,
    "docs": 0,
    "config": 14,
    "deps": 0,
    "migrations": 0,
    "ci": 46,
    "generated": 0
  }
}

Local review gate — branch protection can require the pi-review commit status. See docs/REVIEW_SCHEMA.md.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/ci.yml:
- Around line 167-181: In the cli-smoke job, replace mutable action tags with
fixed commit SHAs for actions/checkout, oven-sh/setup-bun and actions/setup-node
and set actions/checkout to persist-credentials: false; specifically update the
step that references actions/checkout@v4 to include persist-credentials: false
and swap the three action references (actions/checkout, oven-sh/setup-bun,
actions/setup-node) to their corresponding commit SHAs so the workflow is pinned
and tokens aren't persisted. Ensure you make the same changes for other jobs in
ci.yml that use actions/checkout@v4 without persist-credentials and any other
mutable action tags.

In `@scripts/cli-smoke.sh`:
- Around line 102-109: The expect_fail_grep function currently only verifies
non-zero exit and presence of marker but doesn't reject crashes/stack traces;
update expect_fail_grep to also scan the captured output variable OUT for common
stack-trace signatures (e.g. case-insensitive patterns like "traceback", "stack
trace", "panic:", "Exception", "at " with file:line, etc.) after runlobu and
before passing; if any such pattern matches, call softfail with a clear message
(including the matched pattern and original "$RC") instead of pass, otherwise
proceed with the existing marker check. Ensure you update the logic around the
RC/OUT checks in expect_fail_grep and reference the variables/function names
expect_fail_grep, runlobu, RC, and OUT.
- Around line 148-150: The test uses VERSION=$(node "$LOBU_BIN" --version ...)
then immediately uses that value as a required marker in expect_grep; add a
non-empty guard for VERSION before calling expect_grep to prevent a
false-positive when the node call fails. Specifically, after the assignment to
VERSION (and before the expect_grep lines) check that VERSION is not empty
(e.g., test -n "$VERSION" or [[ -z "$VERSION" ]] and exit with a clear error
message), and only call expect_grep "lobu --version" "$VERSION" "$WT" --version
when the VERSION variable is present; keep the existing LOBU_BIN and expect_grep
usages unchanged except gated by this precheck.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 0df7d752-2539-429d-a81d-46bf05cf80c4

📥 Commits

Reviewing files that changed from the base of the PR and between 261d755 and c94e351.

📒 Files selected for processing (5)
  • .github/workflows/ci.yml
  • .gitignore
  • Makefile
  • config/biome.config.json
  • scripts/cli-smoke.sh

Comment thread .github/workflows/ci.yml
Comment on lines +167 to +181
- uses: actions/checkout@v4

- uses: ./.github/actions/setup-submodule
with:
deploy-key: ${{ secrets.OWLETTO_WEB_DEPLOY_KEY }}

- uses: oven-sh/setup-bun@v2
with:
bun-version: 1.3.5

# Node 22 — the worker uses isolated-vm (abi127 prebuild); the runner's
# default Node major would segfault the sandbox runtime.
- uses: actions/setup-node@v4
with:
node-version: "22"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify mutable action refs and missing persist-credentials in this workflow.
set -euo pipefail
file=".github/workflows/ci.yml"

echo "== uses refs not pinned to SHA =="
rg -n '^\s*-\s*uses:\s*[^@]+@v[0-9]+' "$file" || true

echo
echo "== checkout steps lacking persist-credentials: false =="
python - <<'PY'
import re, pathlib
p = pathlib.Path(".github/workflows/ci.yml")
lines = p.read_text().splitlines()
for i,l in enumerate(lines,1):
    if re.search(r'^\s*-\s*uses:\s*actions/checkout@', l):
        window = "\n".join(lines[i:i+8])
        if "persist-credentials: false" not in window:
            print(f"checkout without persist-credentials:false near line {i}")
PY

Repository: lobu-ai/lobu

Length of output: 1756


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file=".github/workflows/ci.yml"

echo "== Context around lines 160-190 =="
nl -ba "$file" | sed -n '150,200p'

echo
echo "== Context around all actions/checkout steps =="
python - <<'PY'
import pathlib, re
p = pathlib.Path(".github/workflows/ci.yml")
lines = p.read_text().splitlines()
for i,l in enumerate(lines,1):
    if re.search(r'^\s*-\s*uses:\s*actions/checkout@', l):
        start=max(1,i-3); end=min(len(lines), i+12)
        print(f"\n--- checkout near line {i} (show {start}-{end}) ---")
        for ln in range(start,end+1):
            print(f"{ln:5d}: {lines[ln-1]}")
PY

Repository: lobu-ai/lobu

Length of output: 152


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file=".github/workflows/ci.yml"

echo "== Context around lines 160-190 (cat -n) =="
cat -n "$file" | sed -n '150,200p'

echo
echo "== Exact checkout blocks (uses actions/checkout@) with nearby lines =="
python - <<'PY'
import pathlib, re
p = pathlib.Path(".github/workflows/ci.yml")
lines = p.read_text().splitlines()
for i,l in enumerate(lines,1):
    if re.search(r'^\s*-\s*uses:\s*actions/checkout@', l):
        start=max(1,i-2); end=min(len(lines), i+14)
        print(f"\n--- checkout near line {i} (show {start}-{end}) ---")
        for ln in range(start,end+1):
            print(f"{ln:5d}: {lines[ln-1]}")
PY

Repository: lobu-ai/lobu

Length of output: 11098


Pin CI actions by commit SHA and disable checkout credential persistence

In cli-smoke (steps around lines 167-181), actions/checkout@v4, oven-sh/setup-bun@v2, and actions/setup-node@v4 use mutable tags, and the actions/checkout step does not set persist-credentials: false, increasing token exposure risk. The same actions/checkout@v4 + missing persist-credentials: false pattern appears in multiple other jobs in ci.yml.

Suggested patch
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@<pinned-sha>
+        with:
+          persist-credentials: false

-      - uses: oven-sh/setup-bun@v2
+      - uses: oven-sh/setup-bun@<pinned-sha>
         with:
           bun-version: 1.3.5

-      - uses: actions/setup-node@v4
+      - uses: actions/setup-node@<pinned-sha>
         with:
           node-version: "22"
🧰 Tools
🪛 zizmor (1.25.2)

[warning] 167-167: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 167-167: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 173-173: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 179-179: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/ci.yml around lines 167 - 181, In the cli-smoke job,
replace mutable action tags with fixed commit SHAs for actions/checkout,
oven-sh/setup-bun and actions/setup-node and set actions/checkout to
persist-credentials: false; specifically update the step that references
actions/checkout@v4 to include persist-credentials: false and swap the three
action references (actions/checkout, oven-sh/setup-bun, actions/setup-node) to
their corresponding commit SHAs so the workflow is pinned and tokens aren't
persisted. Ensure you make the same changes for other jobs in ci.yml that use
actions/checkout@v4 without persist-credentials and any other mutable action
tags.

Comment thread scripts/cli-smoke.sh
Comment on lines +102 to +109
# expect_fail_grep <desc> <marker> <cwd> <args...> -> graceful failure:
# non-zero exit AND the documented error message present (no crash/stack trace).
expect_fail_grep() {
local desc="$1" marker="$2" cwd="$3"; shift 3
runlobu "$cwd" "$@"
if [ "$RC" -ne 0 ] && grep -qiF -- "$marker" "$OUT"; then pass "$desc (graceful: $marker)"
else softfail "$desc (expected non-zero exit + '$marker', got exit=$RC) [lobu $*]"; fi
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Graceful-failure assertions don’t currently enforce “no stack trace.”

Line 103 says crashes/stack traces should fail, but expect_fail_grep only checks exit code + message marker. A command can still emit a stack trace and pass this gate.

Suggested patch
 expect_fail_grep() {
   local desc="$1" marker="$2" cwd="$3"; shift 3
   runlobu "$cwd" "$@"
-  if [ "$RC" -ne 0 ] && grep -qiF -- "$marker" "$OUT"; then pass "$desc (graceful: $marker)"
+  if [ "$RC" -ne 0 ] \
+    && grep -qiF -- "$marker" "$OUT" \
+    && ! grep -qiE '(^\s*at\s+.*\(.+\)$|Traceback \(most recent call last\)|UnhandledPromiseRejection|panic:|SyntaxError:)' "$OUT"; then
+    pass "$desc (graceful: $marker)"
   else softfail "$desc (expected non-zero exit + '$marker', got exit=$RC) [lobu $*]"; fi
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# expect_fail_grep <desc> <marker> <cwd> <args...> -> graceful failure:
# non-zero exit AND the documented error message present (no crash/stack trace).
expect_fail_grep() {
local desc="$1" marker="$2" cwd="$3"; shift 3
runlobu "$cwd" "$@"
if [ "$RC" -ne 0 ] && grep -qiF -- "$marker" "$OUT"; then pass "$desc (graceful: $marker)"
else softfail "$desc (expected non-zero exit + '$marker', got exit=$RC) [lobu $*]"; fi
}
# expect_fail_grep <desc> <marker> <cwd> <args...> -> graceful failure:
# non-zero exit AND the documented error message present (no crash/stack trace).
expect_fail_grep() {
local desc="$1" marker="$2" cwd="$3"; shift 3
runlobu "$cwd" "$@"
if [ "$RC" -ne 0 ] \
&& grep -qiF -- "$marker" "$OUT" \
&& ! grep -qiE '(^\s*at\s+.*\(.+\)$|Traceback \(most recent call last\)|UnhandledPromiseRejection|panic:|SyntaxError:)' "$OUT"; then
pass "$desc (graceful: $marker)"
else softfail "$desc (expected non-zero exit + '$marker', got exit=$RC) [lobu $*]"; fi
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/cli-smoke.sh` around lines 102 - 109, The expect_fail_grep function
currently only verifies non-zero exit and presence of marker but doesn't reject
crashes/stack traces; update expect_fail_grep to also scan the captured output
variable OUT for common stack-trace signatures (e.g. case-insensitive patterns
like "traceback", "stack trace", "panic:", "Exception", "at " with file:line,
etc.) after runlobu and before passing; if any such pattern matches, call
softfail with a clear message (including the matched pattern and original "$RC")
instead of pass, otherwise proceed with the existing marker check. Ensure you
update the logic around the RC/OUT checks in expect_fail_grep and reference the
variables/function names expect_fail_grep, runlobu, RC, and OUT.

Comment thread scripts/cli-smoke.sh
Comment on lines +148 to +150
VERSION="$(node "$LOBU_BIN" --version 2>/dev/null | tr -d '[:space:]')"
expect_grep "lobu --version" "$VERSION" "$WT" --version
expect_grep "lobu --help" "CLI for deploying and managing AI agents on Lobu" "$WT" --help
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard against empty VERSION before using it as a required marker.

If Line 148 fails and VERSION is empty, Line 149 can pass trivially on any successful output. Add a non-empty precheck.

Suggested patch
 VERSION="$(node "$LOBU_BIN" --version 2>/dev/null | tr -d '[:space:]')"
+[ -n "$VERSION" ] || die "could not resolve lobu --version output"
 expect_grep "lobu --version" "$VERSION" "$WT" --version
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
VERSION="$(node "$LOBU_BIN" --version 2>/dev/null | tr -d '[:space:]')"
expect_grep "lobu --version" "$VERSION" "$WT" --version
expect_grep "lobu --help" "CLI for deploying and managing AI agents on Lobu" "$WT" --help
VERSION="$(node "$LOBU_BIN" --version 2>/dev/null | tr -d '[:space:]')"
[ -n "$VERSION" ] || { echo "could not resolve lobu --version output" >&2; exit 1; }
expect_grep "lobu --version" "$VERSION" "$WT" --version
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/cli-smoke.sh` around lines 148 - 150, The test uses VERSION=$(node
"$LOBU_BIN" --version ...) then immediately uses that value as a required marker
in expect_grep; add a non-empty guard for VERSION before calling expect_grep to
prevent a false-positive when the node call fails. Specifically, after the
assignment to VERSION (and before the expect_grep lines) check that VERSION is
not empty (e.g., test -n "$VERSION" or [[ -z "$VERSION" ]] and exit with a clear
error message), and only call expect_grep "lobu --version" "$VERSION" "$WT"
--version when the VERSION variable is present; keep the existing LOBU_BIN and
expect_grep usages unchanged except gated by this precheck.

scripts/cli-smoke.sh boots one local lobu run (embedded Postgres + the
deterministic mock provider from scripts/sdk-e2e/) under an isolated HOME and
walks every lobu command/subcommand once, asserting each runs or fails
gracefully. Complements sdk-e2e.sh (SDK lifecycle) with breadth across the whole
command surface: init/validate/doctor/telemetry, context/org/token, agent CRUD,
call, chat (json/new/dry-run/continue), apply, link, memory (health/run/exec/
seed/configure/init/org), login/logout, connector. Browser/TTY/external-PG paths
are exercised at the runs-gracefully level or skipped with a reason.

Wired as the cli-smoke CI job (mirrors sdk-e2e) and make test-e2e-cli. Biome
ignores the .cli-smoke-run/.sdk-e2e-run/.managed-e2e-run scratch dirs.
@buremba buremba force-pushed the feat/cli-smoke-coverage branch from c94e351 to 487444b Compare May 26, 2026 18:01
@buremba buremba merged commit 7f2ae5e into main May 26, 2026
21 of 22 checks passed
@buremba buremba deleted the feat/cli-smoke-coverage branch May 26, 2026 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants