test(cli): add command-coverage smoke gate#1086
Conversation
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughAdds a CI job and Make target plus a bash smoke script that boots an isolated Lobu stack with a mock provider, runs broad lobu CLI command coverage across identity, agents, memory, chat, and org flows, and reports pass/fail/skip. ChangesCLI Smoke Test Infrastructure and Execution
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
bug_free 91, simplicity 82, slop 0, bugs 0, 0 blockers Typecheck, unit, and integration logs all passed. Reviewed diff, ran bash -n scripts/cli-smoke.sh, then ran full scripts/cli-smoke.sh: 66 passed, 0 failed, 4 intentional skips. Full verdict JSON{
"bug_free_confidence": 91,
"bugs": 0,
"slop": 0,
"simplicity": 82,
"blockers": [],
"change_type": "test",
"behavior_change_risk": "low",
"tests_adequate": true,
"suggested_fixes": [],
"notes": "Typecheck, unit, and integration logs all passed. Reviewed diff, ran bash -n scripts/cli-smoke.sh, then ran full scripts/cli-smoke.sh: 66 passed, 0 failed, 4 intentional skips.",
"categories": {
"src": 0,
"tests": 375,
"docs": 0,
"config": 14,
"deps": 0,
"migrations": 0,
"ci": 46,
"generated": 0
}
}Local review gate — branch protection can require the |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 167-181: In the cli-smoke job, replace mutable action tags with
fixed commit SHAs for actions/checkout, oven-sh/setup-bun and actions/setup-node
and set actions/checkout to persist-credentials: false; specifically update the
step that references actions/checkout@v4 to include persist-credentials: false
and swap the three action references (actions/checkout, oven-sh/setup-bun,
actions/setup-node) to their corresponding commit SHAs so the workflow is pinned
and tokens aren't persisted. Ensure you make the same changes for other jobs in
ci.yml that use actions/checkout@v4 without persist-credentials and any other
mutable action tags.
In `@scripts/cli-smoke.sh`:
- Around line 102-109: The expect_fail_grep function currently only verifies
non-zero exit and presence of marker but doesn't reject crashes/stack traces;
update expect_fail_grep to also scan the captured output variable OUT for common
stack-trace signatures (e.g. case-insensitive patterns like "traceback", "stack
trace", "panic:", "Exception", "at " with file:line, etc.) after runlobu and
before passing; if any such pattern matches, call softfail with a clear message
(including the matched pattern and original "$RC") instead of pass, otherwise
proceed with the existing marker check. Ensure you update the logic around the
RC/OUT checks in expect_fail_grep and reference the variables/function names
expect_fail_grep, runlobu, RC, and OUT.
- Around line 148-150: The test uses VERSION=$(node "$LOBU_BIN" --version ...)
then immediately uses that value as a required marker in expect_grep; add a
non-empty guard for VERSION before calling expect_grep to prevent a
false-positive when the node call fails. Specifically, after the assignment to
VERSION (and before the expect_grep lines) check that VERSION is not empty
(e.g., test -n "$VERSION" or [[ -z "$VERSION" ]] and exit with a clear error
message), and only call expect_grep "lobu --version" "$VERSION" "$WT" --version
when the VERSION variable is present; keep the existing LOBU_BIN and expect_grep
usages unchanged except gated by this precheck.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 0df7d752-2539-429d-a81d-46bf05cf80c4
📒 Files selected for processing (5)
.github/workflows/ci.yml.gitignoreMakefileconfig/biome.config.jsonscripts/cli-smoke.sh
| - uses: actions/checkout@v4 | ||
|
|
||
| - uses: ./.github/actions/setup-submodule | ||
| with: | ||
| deploy-key: ${{ secrets.OWLETTO_WEB_DEPLOY_KEY }} | ||
|
|
||
| - uses: oven-sh/setup-bun@v2 | ||
| with: | ||
| bun-version: 1.3.5 | ||
|
|
||
| # Node 22 — the worker uses isolated-vm (abi127 prebuild); the runner's | ||
| # default Node major would segfault the sandbox runtime. | ||
| - uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: "22" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify mutable action refs and missing persist-credentials in this workflow.
set -euo pipefail
file=".github/workflows/ci.yml"
echo "== uses refs not pinned to SHA =="
rg -n '^\s*-\s*uses:\s*[^@]+@v[0-9]+' "$file" || true
echo
echo "== checkout steps lacking persist-credentials: false =="
python - <<'PY'
import re, pathlib
p = pathlib.Path(".github/workflows/ci.yml")
lines = p.read_text().splitlines()
for i,l in enumerate(lines,1):
if re.search(r'^\s*-\s*uses:\s*actions/checkout@', l):
window = "\n".join(lines[i:i+8])
if "persist-credentials: false" not in window:
print(f"checkout without persist-credentials:false near line {i}")
PYRepository: lobu-ai/lobu
Length of output: 1756
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file=".github/workflows/ci.yml"
echo "== Context around lines 160-190 =="
nl -ba "$file" | sed -n '150,200p'
echo
echo "== Context around all actions/checkout steps =="
python - <<'PY'
import pathlib, re
p = pathlib.Path(".github/workflows/ci.yml")
lines = p.read_text().splitlines()
for i,l in enumerate(lines,1):
if re.search(r'^\s*-\s*uses:\s*actions/checkout@', l):
start=max(1,i-3); end=min(len(lines), i+12)
print(f"\n--- checkout near line {i} (show {start}-{end}) ---")
for ln in range(start,end+1):
print(f"{ln:5d}: {lines[ln-1]}")
PYRepository: lobu-ai/lobu
Length of output: 152
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file=".github/workflows/ci.yml"
echo "== Context around lines 160-190 (cat -n) =="
cat -n "$file" | sed -n '150,200p'
echo
echo "== Exact checkout blocks (uses actions/checkout@) with nearby lines =="
python - <<'PY'
import pathlib, re
p = pathlib.Path(".github/workflows/ci.yml")
lines = p.read_text().splitlines()
for i,l in enumerate(lines,1):
if re.search(r'^\s*-\s*uses:\s*actions/checkout@', l):
start=max(1,i-2); end=min(len(lines), i+14)
print(f"\n--- checkout near line {i} (show {start}-{end}) ---")
for ln in range(start,end+1):
print(f"{ln:5d}: {lines[ln-1]}")
PYRepository: lobu-ai/lobu
Length of output: 11098
Pin CI actions by commit SHA and disable checkout credential persistence
In cli-smoke (steps around lines 167-181), actions/checkout@v4, oven-sh/setup-bun@v2, and actions/setup-node@v4 use mutable tags, and the actions/checkout step does not set persist-credentials: false, increasing token exposure risk. The same actions/checkout@v4 + missing persist-credentials: false pattern appears in multiple other jobs in ci.yml.
Suggested patch
- - uses: actions/checkout@v4
+ - uses: actions/checkout@<pinned-sha>
+ with:
+ persist-credentials: false
- - uses: oven-sh/setup-bun@v2
+ - uses: oven-sh/setup-bun@<pinned-sha>
with:
bun-version: 1.3.5
- - uses: actions/setup-node@v4
+ - uses: actions/setup-node@<pinned-sha>
with:
node-version: "22"🧰 Tools
🪛 zizmor (1.25.2)
[warning] 167-167: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
[error] 167-167: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 173-173: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
[error] 179-179: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)
(unpinned-uses)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/ci.yml around lines 167 - 181, In the cli-smoke job,
replace mutable action tags with fixed commit SHAs for actions/checkout,
oven-sh/setup-bun and actions/setup-node and set actions/checkout to
persist-credentials: false; specifically update the step that references
actions/checkout@v4 to include persist-credentials: false and swap the three
action references (actions/checkout, oven-sh/setup-bun, actions/setup-node) to
their corresponding commit SHAs so the workflow is pinned and tokens aren't
persisted. Ensure you make the same changes for other jobs in ci.yml that use
actions/checkout@v4 without persist-credentials and any other mutable action
tags.
| # expect_fail_grep <desc> <marker> <cwd> <args...> -> graceful failure: | ||
| # non-zero exit AND the documented error message present (no crash/stack trace). | ||
| expect_fail_grep() { | ||
| local desc="$1" marker="$2" cwd="$3"; shift 3 | ||
| runlobu "$cwd" "$@" | ||
| if [ "$RC" -ne 0 ] && grep -qiF -- "$marker" "$OUT"; then pass "$desc (graceful: $marker)" | ||
| else softfail "$desc (expected non-zero exit + '$marker', got exit=$RC) [lobu $*]"; fi | ||
| } |
There was a problem hiding this comment.
Graceful-failure assertions don’t currently enforce “no stack trace.”
Line 103 says crashes/stack traces should fail, but expect_fail_grep only checks exit code + message marker. A command can still emit a stack trace and pass this gate.
Suggested patch
expect_fail_grep() {
local desc="$1" marker="$2" cwd="$3"; shift 3
runlobu "$cwd" "$@"
- if [ "$RC" -ne 0 ] && grep -qiF -- "$marker" "$OUT"; then pass "$desc (graceful: $marker)"
+ if [ "$RC" -ne 0 ] \
+ && grep -qiF -- "$marker" "$OUT" \
+ && ! grep -qiE '(^\s*at\s+.*\(.+\)$|Traceback \(most recent call last\)|UnhandledPromiseRejection|panic:|SyntaxError:)' "$OUT"; then
+ pass "$desc (graceful: $marker)"
else softfail "$desc (expected non-zero exit + '$marker', got exit=$RC) [lobu $*]"; fi
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # expect_fail_grep <desc> <marker> <cwd> <args...> -> graceful failure: | |
| # non-zero exit AND the documented error message present (no crash/stack trace). | |
| expect_fail_grep() { | |
| local desc="$1" marker="$2" cwd="$3"; shift 3 | |
| runlobu "$cwd" "$@" | |
| if [ "$RC" -ne 0 ] && grep -qiF -- "$marker" "$OUT"; then pass "$desc (graceful: $marker)" | |
| else softfail "$desc (expected non-zero exit + '$marker', got exit=$RC) [lobu $*]"; fi | |
| } | |
| # expect_fail_grep <desc> <marker> <cwd> <args...> -> graceful failure: | |
| # non-zero exit AND the documented error message present (no crash/stack trace). | |
| expect_fail_grep() { | |
| local desc="$1" marker="$2" cwd="$3"; shift 3 | |
| runlobu "$cwd" "$@" | |
| if [ "$RC" -ne 0 ] \ | |
| && grep -qiF -- "$marker" "$OUT" \ | |
| && ! grep -qiE '(^\s*at\s+.*\(.+\)$|Traceback \(most recent call last\)|UnhandledPromiseRejection|panic:|SyntaxError:)' "$OUT"; then | |
| pass "$desc (graceful: $marker)" | |
| else softfail "$desc (expected non-zero exit + '$marker', got exit=$RC) [lobu $*]"; fi | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/cli-smoke.sh` around lines 102 - 109, The expect_fail_grep function
currently only verifies non-zero exit and presence of marker but doesn't reject
crashes/stack traces; update expect_fail_grep to also scan the captured output
variable OUT for common stack-trace signatures (e.g. case-insensitive patterns
like "traceback", "stack trace", "panic:", "Exception", "at " with file:line,
etc.) after runlobu and before passing; if any such pattern matches, call
softfail with a clear message (including the matched pattern and original "$RC")
instead of pass, otherwise proceed with the existing marker check. Ensure you
update the logic around the RC/OUT checks in expect_fail_grep and reference the
variables/function names expect_fail_grep, runlobu, RC, and OUT.
| VERSION="$(node "$LOBU_BIN" --version 2>/dev/null | tr -d '[:space:]')" | ||
| expect_grep "lobu --version" "$VERSION" "$WT" --version | ||
| expect_grep "lobu --help" "CLI for deploying and managing AI agents on Lobu" "$WT" --help |
There was a problem hiding this comment.
Guard against empty VERSION before using it as a required marker.
If Line 148 fails and VERSION is empty, Line 149 can pass trivially on any successful output. Add a non-empty precheck.
Suggested patch
VERSION="$(node "$LOBU_BIN" --version 2>/dev/null | tr -d '[:space:]')"
+[ -n "$VERSION" ] || die "could not resolve lobu --version output"
expect_grep "lobu --version" "$VERSION" "$WT" --version📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| VERSION="$(node "$LOBU_BIN" --version 2>/dev/null | tr -d '[:space:]')" | |
| expect_grep "lobu --version" "$VERSION" "$WT" --version | |
| expect_grep "lobu --help" "CLI for deploying and managing AI agents on Lobu" "$WT" --help | |
| VERSION="$(node "$LOBU_BIN" --version 2>/dev/null | tr -d '[:space:]')" | |
| [ -n "$VERSION" ] || { echo "could not resolve lobu --version output" >&2; exit 1; } | |
| expect_grep "lobu --version" "$VERSION" "$WT" --version |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/cli-smoke.sh` around lines 148 - 150, The test uses VERSION=$(node
"$LOBU_BIN" --version ...) then immediately uses that value as a required marker
in expect_grep; add a non-empty guard for VERSION before calling expect_grep to
prevent a false-positive when the node call fails. Specifically, after the
assignment to VERSION (and before the expect_grep lines) check that VERSION is
not empty (e.g., test -n "$VERSION" or [[ -z "$VERSION" ]] and exit with a clear
error message), and only call expect_grep "lobu --version" "$VERSION" "$WT"
--version when the VERSION variable is present; keep the existing LOBU_BIN and
expect_grep usages unchanged except gated by this precheck.
scripts/cli-smoke.sh boots one local lobu run (embedded Postgres + the deterministic mock provider from scripts/sdk-e2e/) under an isolated HOME and walks every lobu command/subcommand once, asserting each runs or fails gracefully. Complements sdk-e2e.sh (SDK lifecycle) with breadth across the whole command surface: init/validate/doctor/telemetry, context/org/token, agent CRUD, call, chat (json/new/dry-run/continue), apply, link, memory (health/run/exec/ seed/configure/init/org), login/logout, connector. Browser/TTY/external-PG paths are exercised at the runs-gracefully level or skipped with a reason. Wired as the cli-smoke CI job (mirrors sdk-e2e) and make test-e2e-cli. Biome ignores the .cli-smoke-run/.sdk-e2e-run/.managed-e2e-run scratch dirs.
c94e351 to
487444b
Compare
What
Adds
scripts/cli-smoke.sh, a command-coverage smoke gate for thelobuCLI, plus thecli-smokeCI job and amake test-e2e-clitarget.sdk-e2e.shalready proves the SDK lifecycle (init → run → apply → prune → worker turn → connector → watcher → client) down one deep path. It does not prove that the rest of the command surface even runs. This gate fills that: it boots one locallobu run(embedded Postgres + the deterministic mock provider fromscripts/sdk-e2e/, no API key) under an isolated$HOME, then invokes everylobucommand/subcommand once and asserts each either returns its documented success marker or fails gracefully (clean message + controlled exit, no stack trace). It collects all failures and prints a summary instead of failing on the first miss, so one run tells you exactly what's broken.Coverage
66 checks, 4 documented skips. Commands hit: version/help, init (+
--list-providers,--from-org), validate (+broken-config negative), doctor (+--memory-only), telemetry on/off/status, context (list/current/add/use/rm), whoami, status, org list/current/set, token (print/--raw/create/revoke), agent (list/get/create/scaffold/update/delete/config get/config patch + negatives), call (--list+ a tool call), link/unlink, apply (--onlyvalidation +--dry-run+ real), chat (--json/--new/--dry-run/-C), memory (health/run/exec/seed/configure/init/org), login (--token+ non-interactive device-code bail)/logout, connector (runtime-self-check +run --check).Skipped with a logged reason:
org create(opens a browser),memory browser-authcapture (Chrome + Keychain), interactive device-code login (needs a TTY),chat --user(needs a platform connection).Notes
$HOMEso it never touches the dev's real~/.config/lobucontexts/credentials or~/.lobudata.local-context commands use the loopback/api/local-initbootstrap (same pathsdk-e2erelies on), so no real login is needed.cli-smokeCI job mirrorssdk-e2e(node 22, ICU shim, bubblewrap,make build-packages)..cli-smoke-run/.sdk-e2e-run/.managed-e2e-runscratch dirs.Validated locally on macOS/node@22: 66 passed, 0 failed, 4 skipped (ran it three times).
make reviewran it independently with the same result.Summary by CodeRabbit
Tests
Chores