fix(apply): surface config path, auto-load .env, schema-prep for per-org agent IDs#734
Conversation
…p for per-org agent IDs Three related fixes that surfaced while debugging why `lobu apply` silently failed against the office-bot project: 1. **`Invalid lobu.toml` → `Invalid <absolute path>`**: schema and TOML-syntax errors now print the file path so users can jump straight to the offending config (`packages/cli/src/config/loader.ts`). 2. **`lobu apply` auto-loads `.env`** from the project dir before resolving `$VAR` refs in lobu.toml — same shape as `lobu dev`. Shell env still wins over .env values. Removes the "Missing required secret" cliff users hit when the toml references a key that lives only in a project-local .env (`packages/cli/src/commands/_lib/apply/apply-cmd.ts`). 3. **Phase A of moving `agents` to a per-org PK** (non-breaking): adds `organization_id text NULL` to the 5 child tables that FK on agents.id (`agent_grants`, `agent_connections`, `agent_users`, `agent_channel_bindings`, `grants`), backfills from agents, adds a parallel `UNIQUE (organization_id, id)` on agents, plus composite indexes on each child for the upcoming org-scoped query patterns. Single-column PK and FKs stay; app code is unchanged. Phases B (plumb organization_id through ~67 storage call sites) and C (drop old PK + composite FK swap) ship in separate PRs. Also drops the now-rejected `provider = "lobu-public"` key from examples/office-bot/lobu.toml and points its `[memory] org` at `lobu-team` where the project actually lives.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds nullable ChangesOrganization-scoped agent schema migration
CLI environment configuration and error reporting
Example configuration updates
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Output of `dbmate --migrations-dir db/migrations --schema-file db/schema.sql up` against a fresh pgvector/pgvector:pg16 followed by `scripts/normalize-schema.sh`. Required by the `migrations` CI job's drift check.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c0cfe52662
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ALTER TABLE agent_grants ADD COLUMN organization_id text; | ||
| ALTER TABLE agent_connections ADD COLUMN organization_id text; | ||
| ALTER TABLE agent_users ADD COLUMN organization_id text; | ||
| ALTER TABLE agent_channel_bindings ADD COLUMN organization_id text; | ||
| ALTER TABLE grants ADD COLUMN organization_id text; |
There was a problem hiding this comment.
Include scheduled_jobs in the agent org backfill
For deployments that have scheduled jobs created by an agent, this Phase A prep omits the existing scheduled_jobs.created_by_agent FK to agents(id) (db/schema.sql:4783). That leaves schedules without a backfilled agent-organization column path, so the later composite FK/PK swap cannot drop the single-column agents(id) dependency without either failing on that FK or preserving global agent-id uniqueness for scheduled agents; include scheduled_jobs in this migration and the embedded patch mirror while it already carries organization_id.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/cli/src/config/loader.ts (1)
39-49:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse an absolute
configPathso the new error text is always unambiguous.These messages depend on
configPath, butjoin(cwd, CONFIG_FILENAME)can stay relative whencwdis relative. That weakens the “full resolved path” goal.💡 Suggested patch
-import { join } from "node:path"; +import { resolve } from "node:path"; @@ - const configPath = join(cwd, CONFIG_FILENAME); + const configPath = resolve(cwd, CONFIG_FILENAME);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/cli/src/config/loader.ts` around lines 39 - 49, Error messages use the possibly-relative configPath which can be ambiguous; before parsing/validation and before returning errors in the block around lobuConfigSchema.safeParse, resolve the config path to an absolute path (e.g., use path.resolve or similar) and use that absolute path variable in the error strings and details so messages always show the full resolved file path; update references to configPath in the TOML syntax error and the validation error return to use the new absolute path variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/cli/src/commands/_lib/apply/apply-cmd.ts`:
- Around line 93-111: The loadProjectEnvFile function currently calls
parseEnvContent(raw) which can throw and abort the apply flow; wrap the
parseEnvContent call in a try/catch so malformed .env content is swallowed (or
logged at debug) and the function returns early, preserving the existing
behavior of not overriding process.env and allowing checkRequiredSecrets to run;
update the code paths around parseEnvContent/raw so any parse exception does not
propagate out of loadProjectEnvFile.
---
Outside diff comments:
In `@packages/cli/src/config/loader.ts`:
- Around line 39-49: Error messages use the possibly-relative configPath which
can be ambiguous; before parsing/validation and before returning errors in the
block around lobuConfigSchema.safeParse, resolve the config path to an absolute
path (e.g., use path.resolve or similar) and use that absolute path variable in
the error strings and details so messages always show the full resolved file
path; update references to configPath in the TOML syntax error and the
validation error return to use the new absolute path variable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 86f3df96-926b-4bef-869d-875a294f6de5
📒 Files selected for processing (5)
db/migrations/20260515120000_agents_per_org_pk.sqlexamples/office-bot/lobu.tomlpackages/cli/src/commands/_lib/apply/apply-cmd.tspackages/cli/src/config/loader.tspackages/server/src/db/embedded-schema-patches.ts
| /** | ||
| * Merge `.env` values from the project dir into `process.env` (without | ||
| * overriding values already set in the shell). Quietly noop if the file | ||
| * doesn't exist or can't be parsed — `checkRequiredSecrets` will surface a | ||
| * clear "Missing required secret" error downstream. | ||
| */ | ||
| async function loadProjectEnvFile(cwd: string): Promise<void> { | ||
| const envPath = join(cwd, ".env"); | ||
| let raw: string; | ||
| try { | ||
| raw = await readFile(envPath, "utf-8"); | ||
| } catch { | ||
| return; | ||
| } | ||
| const vars = parseEnvContent(raw); | ||
| for (const [key, value] of Object.entries(vars)) { | ||
| if (process.env[key] === undefined) process.env[key] = value; | ||
| } | ||
| } |
There was a problem hiding this comment.
Handle .env parse errors so apply keeps the intended fallback behavior.
parseEnvContent(raw) can throw, and that currently escapes before checkRequiredSecrets. A malformed .env will hard-fail lobu apply instead of gracefully falling through to the existing missing-secret reporting path.
💡 Suggested patch
async function loadProjectEnvFile(cwd: string): Promise<void> {
const envPath = join(cwd, ".env");
let raw: string;
try {
raw = await readFile(envPath, "utf-8");
} catch {
return;
}
- const vars = parseEnvContent(raw);
+ let vars: Record<string, string>;
+ try {
+ vars = parseEnvContent(raw);
+ } catch {
+ return;
+ }
for (const [key, value] of Object.entries(vars)) {
if (process.env[key] === undefined) process.env[key] = value;
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /** | |
| * Merge `.env` values from the project dir into `process.env` (without | |
| * overriding values already set in the shell). Quietly noop if the file | |
| * doesn't exist or can't be parsed — `checkRequiredSecrets` will surface a | |
| * clear "Missing required secret" error downstream. | |
| */ | |
| async function loadProjectEnvFile(cwd: string): Promise<void> { | |
| const envPath = join(cwd, ".env"); | |
| let raw: string; | |
| try { | |
| raw = await readFile(envPath, "utf-8"); | |
| } catch { | |
| return; | |
| } | |
| const vars = parseEnvContent(raw); | |
| for (const [key, value] of Object.entries(vars)) { | |
| if (process.env[key] === undefined) process.env[key] = value; | |
| } | |
| } | |
| /** | |
| * Merge `.env` values from the project dir into `process.env` (without | |
| * overriding values already set in the shell). Quietly noop if the file | |
| * doesn't exist or can't be parsed — `checkRequiredSecrets` will surface a | |
| * clear "Missing required secret" error downstream. | |
| */ | |
| async function loadProjectEnvFile(cwd: string): Promise<void> { | |
| const envPath = join(cwd, ".env"); | |
| let raw: string; | |
| try { | |
| raw = await readFile(envPath, "utf-8"); | |
| } catch { | |
| return; | |
| } | |
| let vars: Record<string, string>; | |
| try { | |
| vars = parseEnvContent(raw); | |
| } catch { | |
| return; | |
| } | |
| for (const [key, value] of Object.entries(vars)) { | |
| if (process.env[key] === undefined) process.env[key] = value; | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/cli/src/commands/_lib/apply/apply-cmd.ts` around lines 93 - 111, The
loadProjectEnvFile function currently calls parseEnvContent(raw) which can throw
and abort the apply flow; wrap the parseEnvContent call in a try/catch so
malformed .env content is swallowed (or logged at debug) and the function
returns early, preserving the existing behavior of not overriding process.env
and allowing checkRequiredSecrets to run; update the code paths around
parseEnvContent/raw so any parse exception does not propagate out of
loadProjectEnvFile.
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
…_dump elided length
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@db/schema.sql`:
- Around line 116-117: Backfill any NULL organization_id values, then add NOT
NULL and a cross-row consistency CHECK to each table/column pair that introduced
organization_id/agent_id (e.g., the columns named organization_id and agent_id
near created_at) so new writes cannot store NULL or mismatched pairs;
specifically, run an UPDATE to set organization_id for existing rows, then ALTER
TABLE to set organization_id NOT NULL and add a CHECK constraint like: agent_id
IS NULL OR organization_id = (SELECT organization_id FROM agents WHERE agents.id
= agent_id) (or equivalent using your agents table PK), and include the same
pattern for the other affected columns/locations (the other occurrences noted in
the comment) so tenant isolation is enforced going forward.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
| created_at timestamp with time zone DEFAULT now() NOT NULL, | ||
| organization_id text |
There was a problem hiding this comment.
Enforce org/agent consistency for the new org-scoped columns.
These columns are backfilled once, but new writes can still store NULL or mismatched (organization_id, agent_id) pairs. That creates tenant-isolation drift once reads rely on org-scoped filters.
Suggested migration pattern
+-- 1) Keep new writes consistent (example approach: trigger to derive org from agents)
+-- 2) Add composite FKs (initially NOT VALID to reduce rollout risk)
+ALTER TABLE public.agent_channel_bindings
+ ADD CONSTRAINT agent_channel_bindings_org_agent_fkey
+ FOREIGN KEY (organization_id, agent_id)
+ REFERENCES public.agents (organization_id, id)
+ ON DELETE CASCADE
+ NOT VALID;
+
+ALTER TABLE public.agent_connections
+ ADD CONSTRAINT agent_connections_org_agent_fkey
+ FOREIGN KEY (organization_id, agent_id)
+ REFERENCES public.agents (organization_id, id)
+ ON DELETE CASCADE
+ NOT VALID;
+
+ALTER TABLE public.agent_grants
+ ADD CONSTRAINT agent_grants_org_agent_fkey
+ FOREIGN KEY (organization_id, agent_id)
+ REFERENCES public.agents (organization_id, id)
+ ON DELETE CASCADE
+ NOT VALID;
+
+ALTER TABLE public.agent_users
+ ADD CONSTRAINT agent_users_org_agent_fkey
+ FOREIGN KEY (organization_id, agent_id)
+ REFERENCES public.agents (organization_id, id)
+ ON DELETE CASCADE
+ NOT VALID;
+
+ALTER TABLE public.grants
+ ADD CONSTRAINT grants_org_agent_fkey
+ FOREIGN KEY (organization_id, agent_id)
+ REFERENCES public.agents (organization_id, id)
+ ON DELETE CASCADE
+ NOT VALID;Also applies to: 135-135, 149-150, 193-194, 1111-1112, 2248-2249
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@db/schema.sql` around lines 116 - 117, Backfill any NULL organization_id
values, then add NOT NULL and a cross-row consistency CHECK to each table/column
pair that introduced organization_id/agent_id (e.g., the columns named
organization_id and agent_id near created_at) so new writes cannot store NULL or
mismatched pairs; specifically, run an UPDATE to set organization_id for
existing rows, then ALTER TABLE to set organization_id NOT NULL and add a CHECK
constraint like: agent_id IS NULL OR organization_id = (SELECT organization_id
FROM agents WHERE agents.id = agent_id) (or equivalent using your agents table
PK), and include the same pattern for the other affected columns/locations (the
other occurrences noted in the comment) so tenant isolation is enforced going
forward.
Today `lobu apply` validates that `[[providers]] key = "$VAR"` resolves to
a real env var, then silently throws the value away. The agent row records
`installedProviders: [{providerId}]` but the actual API key never reaches
the gateway, so `lobu chat` against a freshly-applied agent dies with:
To use <model>, an admin needs to connect <provider> on the base agent.
The architectural reason was real — `auth_profiles` requires a `userId`
because user-owned credentials (BYOK, personal OAuth) are scoped to humans.
But that left no path for the most common case: an org-wide provider key
that should be wired declaratively.
This PR closes the gap with org-shared `agent_secrets` rows under a
documented naming convention `provider:<id>:apiKey`. Three pieces:
1. **Naming convention** — `packages/server/src/lobu/stores/provider-secrets.ts`
exports `providerOrgSecretName(providerId)` so the worker and apply agree
on the row name. No schema change; `agent_secrets` is already keyed by
`(organization_id, name)` and AES-256-GCM encrypted.
2. **Worker resolution order** — `base-provider-module.ts` now checks the
org-shared row as the second tier, between per-user `auth_profiles` and
deployment-wide `process.env`. Per-user BYOK still wins; `process.env`
stays as the final fallback for local dev.
3. **`lobu apply` writes the secret** — adds `PUT /api/<org>/agents/<id>/
providers/<providerId>/api-key` (admin-PAT gated, idempotent), and the
apply step pushes one row per provider that declares a `key` after the
settings PATCH. The resolved value lives only in process memory at apply
time; it's never serialized into the desired-state snapshot.
Tested:
- `bun test packages/cli/src/commands/_lib/apply` — 114 pass (1 new test
covers `providerKeys` resolution).
- `bun run typecheck` clean.
- `make build-packages` clean.
Manual end-to-end on prod: this PR's fix unblocks the `food-ordering` agent
that was stuck on the "needs admin to connect zai" error after PR #734
deployed it via apply. Re-running `lobu apply` after merge will land the
key automatically.
…shared-DB guard (#745) Cleans up four loose ends surfaced while debugging the food-ordering agent: 1. **format-lint green again** — `packages/landing/src/components/ FeatureGraphics.tsx:357` had a trailing blank line that biome rejected. Stripped via `bunx biome format --write`. The CI `format-lint` job has been red on `main` since this slipped in; this unblocks every PR's CI. 2. **integration green again** — `packages/server/src/utils/__tests__/ mcp-install-targets.test.ts:28` expected the legacy 7-element list of first-class MCP install targets, but the source list now has 9 (`skills` and `lobu-cli` were added without updating the test). Updated the expected array. CI `integration` job goes from red to green on `main`. 3. **AGENTS.md auth-scope clarification** — adds a paragraph to the "Browser-driven verification" section calling out that the forged session cookie authenticates only the web admin REST mounted at `/` (what `lobu apply` and the SPA hit), NOT the public Agent API at `/lobu` (which expects a JWT bearer from `lobu login` or a PAT). The missing distinction caused an hour of debugging in the previous session — anyone hitting `/lobu/api/v1/agents` with a cookie now gets pointed at the right auth path. 4. **`lobu run` refuses non-loopback DATABASE_URL inherited from shell** — `packages/cli/src/commands/dev.ts` checks at startup: if `DATABASE_URL` is set in `process.env` but NOT in the project's `.env`, AND its host isn't `localhost` / `127.0.0.1` / `::1`, refuse to start. The error spells out three remediations (pin in .env, unset, or pass the new `--unsafe-shared-db` escape hatch). Closes the silent footgun where a developer's parent shell carries a tailnet/prod URL into a freshly-cloned project's `lobu run`, leading to local-dev runs racing prod workers and writing into shared data. Tested: - `bun test packages/cli/src/__tests__/dev.test.ts` — 6 pass (1 new for isSharedDatabaseUrl loopback/tailnet/prod/IPv6/garbage cases) - `bunx vitest run mcp-install-targets.test.ts` — 3 pass - `bun run typecheck` clean - `bun run format:check` clean - `make build-packages` clean Out of scope: PR #734's Phase B/C of the agents per-org PK refactor (~67 storage call sites + composite FK swap). That's a multi-day project that deserves its own branch and review pass.
…id) callers (#747) * fix(schema): drop agents_organization_id_id_key — broke ON CONFLICT (id) callers The parallel `UNIQUE (organization_id, id)` added in 20260515120000 (PR #734 schema-prep) actively breaks `ON CONFLICT (id) DO NOTHING/UPDATE` clauses on the agents table. Why: Postgres' `ON CONFLICT (X)` only suppresses violations of the unique constraint matching exactly column set X. Adding a second unique constraint that overlaps with the PK means inserts can fail on the new constraint *before* reaching the PK conflict — and `ON CONFLICT (id)` doesn't catch it. Surfaced in `__tests__/integration/.../race-mcp` where parallel inserts of `(org-a, race-mcp-0)` started throwing `agents_organization_id_id_key` duplicates instead of being silently de-duped by the existing `ON CONFLICT (id) DO NOTHING` clause. The integration job has been red on main since #734 deployed for this exact reason. The PK on `(id)` already enforces global uniqueness, which subsumes `(organization_id, id)` uniqueness — the new constraint was logically redundant. Phase C of the per-org PK migration will swap the PK directly without needing a parallel constraint as a stepping stone. Mirrored in embedded-schema-patches.ts (idempotent DROP CONSTRAINT IF EXISTS). * chore(schema): keep schema_migrations.version varchar(128) — local pg_dump elided length
Why
While debugging why
lobu applysilently failed againstexamples/office-bot, three distinct issues compounded into the same observable: nothing landed in the target org. This PR fixes the two CLI ergonomics problems that hide root cause, plus phase A of the actual schema fix.The original failure chain was:
examples/office-bot/lobu.tomlhadprovider = "lobu-public"under[agents.<id>.preview.slack]. The schema is.strict()so validation aborted before any row was written. The error message wasInvalid lobu.toml: …— no file path.Missing required secret: $Z_AI_API_KEY. The key was inexamples/office-bot/.envbutlobu applydoesn't auto-load.env(onlylobu devdoes).Agent ID already exists in another organization.agents.idis the PK alone — globally unique — so a stalefood-orderingagent in themarketorg silently blockedlobu-teamfrom creating one with the same name. The application has always treated agents as org-scoped (getOrgId()filters on every read/delete/list); the global PK is a latent footgun.Summary
Invalid lobu.toml→Invalid <absolute path>for both schema and TOML-syntax errors. Users can now click the path in their terminal.lobu applyauto-loads.envfrom cwd before resolving\$VARrefs (shell env wins). Mirrorslobu dev.organization_id text NULLto the 5 child tables that FK onagents.id(agent_grants,agent_connections,agent_users,agent_channel_bindings,grants), backfills fromagents, adds parallelUNIQUE (organization_id, id)onagents, plus composite indexes for the upcoming org-scoped query patterns. Existing PK and FKs stay. App code unchanged.Phases B + C of the per-org PK migration ship in separate PRs:
organization_idthrough ~67 storage call sites inpackages/server/srcthat currently key only onagent_id. Many lack org context in their immediate scope (agent_users.addUserAgent(platform, userId, agentId)is the worst). Bigger refactor; deserves its own review pass.Test plan
bun run typecheckcleanmake build-packagescleanbun test packages/cli/src/commands/_lib/apply— 114 pass, 0 faillobu apply --dry-run --org lobu-teamfromexamples/office-bot(no manualset -a; source .env) — picks upZ_AI_API_KEYfrom.env, plan renders correctlyprovider = "lobu-public"key, gotError: Invalid /private/tmp/lobu-test-bad/lobu.tomlSummary by CodeRabbit
New Features
Improvements
Database