Skip to content

fix(cli,server): apply takes effect without lobu run restart#993

Merged
buremba merged 1 commit into
mainfrom
feat/model-hotreload
May 21, 2026
Merged

fix(cli,server): apply takes effect without lobu run restart#993
buremba merged 1 commit into
mainfrom
feat/model-hotreload

Conversation

@buremba
Copy link
Copy Markdown
Member

@buremba buremba commented May 20, 2026

Problem

After lobu apply against a running embedded lobu run server, a new lobu chat session kept using the previously-selected provider/model until lobu run was restarted. In fact, an lobu apply-provisioned provider often never resolved at all — the worker errored with No provider specified for model "glm-4.6".

Root cause

BaseProviderModule.hasCredentials only checked per-user auth_profiles. It ignored the org-shared API key that lobu apply writes to agent_secrets (provider:<id>:apiKey). buildEnvVars already falls back to that org-shared key when injecting credentials, but hasCredentials did not.

As a result, the session-context primary-provider detection in resolveProviderConfig (gateway/index.ts) found no credentialed provider and returned defaultProvider: none. The CLI and the web UI both store model preferences as a bare model ref (e.g. glm-4.6, no provider/ prefix), so the worker's resolveModelRef("glm-4.6", { defaultProvider: "" }) threw No provider specified for model "glm-4.6". The model was effectively pinned to whatever the first successful resolution produced until the gateway restarted.

Agent settings are read fresh from Postgres per message (resolveAgentOptionsgetSettings, no caching), so once the provider resolves correctly, both the first session and any re-applied model take effect on the next session with no restart.

Fix

  • hasCredentials now also recognizes an org-shared provider key (the same source buildEnvVars already uses), so lobu apply-provisioned providers report credentials.
  • Thread the worker token's organizationId into resolveProviderConfig and its hasCredentials calls. The worker session-context endpoint runs outside the org-scoped HTTP middleware's AsyncLocalStorage, so passing the org explicitly makes the agent_secrets lookup reliable.

Reproducer (red → green)

Embedded lobu run on a clean DB, agent hotreload-test with one z-ai provider, model = "glm-4.6". Single server process throughout (worker lobu-worker-api-4c5447d606e), no restart.

RED (before fix) — first chat after apply:

[platform-helpers] Applying agent settings {"agentId":"hotreload-test","effectiveModel":"glm-4.6"}
[worker-gateway] Session context for hotreload-test: ... provider: none
[api-response-renderer] Broadcast error: No provider specified for model "glm-4.6". Use "provider/model" format or set AGENT_DEFAULT_PROVIDER.

GREEN (after fix) — chat 1 (glm-4.6), then edit lobu.tomlglm-4.5, lobu apply against the running server (incremental ~ settings hotreload-test (providerModelPreferences)), then a new chat — no lobu run restart:

# chat 1
[platform-helpers] Applying agent settings {"agentId":"hotreload-test","effectiveModel":"glm-4.6"}
[orchestrator] Selected primary provider {"agentId":"hotreload-test","primaryProviderId":"z-ai","slug":"z-ai"}
[worker] Starting OpenClaw session: provider=zai, model=glm-4.6, tools=7, customTools=28

# lobu apply  →  ~ settings hotreload-test (providerModelPreferences)   (running server, no restart)

# chat 2 (same server pid, same worker)
[platform-helpers] Applying agent settings {"agentId":"hotreload-test","effectiveModel":"glm-4.5"}
[worker] Starting OpenClaw session: provider=zai, model=glm-4.5, tools=7, customTools=28

Model switched glm-4.6glm-4.5 for the next session with no restart.

Validation

  • bunx tsc --noEmit (packages/server) — clean.
  • make build-packages + packages/cli build — green.
  • New unit test base-provider-module-credentials.test.ts (4 pass) covers: profile present, org-shared key only (the apply path, via explicit org + ALS fallback), neither present, and no-org short-circuit.
  • E2E above run end-to-end against an embedded lobu run (Node 22, z-ai / glm-4.6 → glm-4.5).

Note: the 2 pre-existing failures in api-auth-middleware.test.ts (ENCRYPTION_KEY rotation / agent-scoped token isolation) fail identically at HEAD and don't reference any code in this PR.

Summary by CodeRabbit

  • New Features
    • Organization-shared provider API keys now serve as a fallback credential option when individual user credentials are unavailable, enabling more flexible credential management.
    • Provider configuration selection and credential eligibility are now organization-aware, allowing better context-specific handling across shared environments.

Review Change Stack

A freshly-applied provider/model now takes effect for the next chat
session against a running embedded server, with no restart required.

Root cause: BaseProviderModule.hasCredentials only checked per-user
auth_profiles, ignoring the org-shared API key that `lobu apply` writes
to agent_secrets (provider:<id>:apiKey). buildEnvVars already falls back
to that org-shared key, but hasCredentials did not — so the session-
context primary-provider detection (resolveProviderConfig) found no
credentialed provider and returned defaultProvider: none. With a bare
model ref like "glm-4.6" (no provider/ prefix, the format the CLI and UI
both store) the worker's resolveModelRef then threw "No provider
specified for model", and the model never updated until lobu run was
restarted.

Fix:
- hasCredentials now also recognizes an org-shared provider key (same
  source buildEnvVars already uses), so apply-provisioned providers
  report credentials.
- Thread the worker token's organizationId into resolveProviderConfig
  and its hasCredentials calls so the org-shared lookup resolves
  reliably on the worker session-context path (which runs outside the
  org-scoped HTTP middleware's AsyncLocalStorage).

Settings are read fresh from Postgres per message (no caching), so once
the provider resolves, both the first session and any re-applied model
take effect on the next session without a restart.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 20, 2026

📝 Walkthrough

Walkthrough

This PR extends provider credential resolution with organization-scoped fallback: when a user has no per-profile API key configured, hasCredentials() now checks for shared org-level provider keys. The gateway passes organization context through credential availability checks during provider selection and proxy credential placeholder construction.

Changes

Organization-Aware Provider Credentials

Layer / File(s) Summary
Credential fallback logic in BaseProviderModule
packages/server/src/gateway/auth/base-provider-module.ts
hasCredentials() now checks per-user profiles first, then falls back to org-scoped shared provider keys via readOrgSharedProviderKey when no profile exists.
Test suite for org-shared credential fallback
packages/server/src/gateway/auth/__tests__/base-provider-module-credentials.test.ts
Comprehensive test suite with mocked org-context, encrypted provider keys, and four test cases: profile exists (true), org-shared key exists with explicit/implicit org ID (true), neither exists (false), no org context (false).
Gateway integration with org-aware credential checks
packages/server/src/gateway/gateway/index.ts
handleSessionContextRequest() passes worker token and organization ID to resolveProviderConfig(), which now uses organization-aware credential availability checks during default provider selection and credential placeholder construction.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • lobu-ai/lobu#740: Implements end-to-end org-shared provider key functionality in BaseProviderModule, which directly overlaps with the core credential fallback logic in this PR.
  • lobu-ai/lobu#944: Ensures organization ID is propagated through gateway session/queue flows, providing the org-context plumbing that enables organization-aware credential checks in this PR.
  • lobu-ai/lobu#746: Updates BaseProviderModule to make org-shared key lookup agent-ID-based, which this PR leverages for org-aware credential fallback logic.

Poem

🐰 A rabbit hops through gateway halls,
Sharing keys across org walls,
When profiles fail, the org key shines,
Credentials align with org design!
─ Bun tests bloom, the fallback sprawls! 🌙

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description is comprehensive and well-structured, covering Problem, Root cause, Fix, Reproducer, and Validation. However, the required Test plan section with checkboxes is entirely missing. Add the Test plan section with checkboxes indicating which validation steps were completed (bun run check:fix, bun run typecheck, make build-packages, bot behavior tests).
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(cli,server): apply takes effect without lobu run restart' directly addresses the main change: enabling lobu apply to take effect without restarting the server. It accurately reflects the primary fix.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/model-hotreload

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter
Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
packages/server/src/gateway/auth/__tests__/base-provider-module-credentials.test.ts (2)

25-27: ⚡ Quick win

Make the DB mock validate the org-scoped lookup inputs.

This stub returns orgSharedSecretRows for every tagged-template call, so the suite still goes green if readOrgSharedProviderKey() uses the wrong organization_id/secret name or accidentally ignores context.organizationId and falls back to ALS. Making the mock inspect the interpolated values would turn these into real contract tests for the new fallback path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@packages/server/src/gateway/auth/__tests__/base-provider-module-credentials.test.ts`
around lines 25 - 27, The DB mock currently always returns orgSharedSecretRows
regardless of the tagged-template inputs; update the mock.module replacement for
"../../../db/client.js" so getDb returns a function that inspects the
tagged-template call arguments (the template strings and interpolated values)
used by readOrgSharedProviderKey and validates the organization_id and secret
name (and that context.organizationId is being used rather than ALS) against
expected values, and throw or return an empty result when they don't match so
tests fail when the wrong org/secret or fallback is used.

29-30: ⚡ Quick win

Move the dynamic import into an allow-listed test hook.

Top-level await import(...) is outside the repo's permitted test pattern. Loading this module in beforeAll/beforeEach after mock.module(...) keeps the mock ordering and matches the guideline.

As per coding guidelines, Use static import by default; restrict await import(...) (dynamic imports) to the allow-listed locations only and In test files, await import(...) inside beforeAll, beforeEach, or test() blocks is allowed (load after vi.mock(...)) following the vitest pattern.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@packages/server/src/gateway/auth/__tests__/base-provider-module-credentials.test.ts`
around lines 29 - 30, The dynamic top-level await import of ApiKeyProviderModule
should be moved into an allow-listed test hook (e.g., beforeAll or beforeEach)
so mocks are registered first; update the test to call await
import("../api-key-provider-module.js") inside beforeAll/ beforeEach after any
vi.mock(...) or mock.module(...) calls and assign the imported
ApiKeyProviderModule to the same variable used by the tests so the module graph
picks up mocks correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In
`@packages/server/src/gateway/auth/__tests__/base-provider-module-credentials.test.ts`:
- Around line 25-27: The DB mock currently always returns orgSharedSecretRows
regardless of the tagged-template inputs; update the mock.module replacement for
"../../../db/client.js" so getDb returns a function that inspects the
tagged-template call arguments (the template strings and interpolated values)
used by readOrgSharedProviderKey and validates the organization_id and secret
name (and that context.organizationId is being used rather than ALS) against
expected values, and throw or return an empty result when they don't match so
tests fail when the wrong org/secret or fallback is used.
- Around line 29-30: The dynamic top-level await import of ApiKeyProviderModule
should be moved into an allow-listed test hook (e.g., beforeAll or beforeEach)
so mocks are registered first; update the test to call await
import("../api-key-provider-module.js") inside beforeAll/ beforeEach after any
vi.mock(...) or mock.module(...) calls and assign the imported
ApiKeyProviderModule to the same variable used by the tests so the module graph
picks up mocks correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ed1350b8-008d-4f29-a5b9-cede6671c0a3

📥 Commits

Reviewing files that changed from the base of the PR and between 86b3f17 and ce94f0f.

📒 Files selected for processing (3)
  • packages/server/src/gateway/auth/__tests__/base-provider-module-credentials.test.ts
  • packages/server/src/gateway/auth/base-provider-module.ts
  • packages/server/src/gateway/gateway/index.ts

@buremba
Copy link
Copy Markdown
Member Author

buremba commented May 20, 2026

bug_free 84, simplicity 88, slop 0, bugs 0, 0 blockers

Typecheck and unit suites passed. [env] Integration suite failed before exercising diff because DATABASE_URL pointed at database "postgres" and the harness refused destructive setup. Ran new targeted test packages/server/src/gateway/auth/tests/base-provider-module-credentials.test.ts: 4 pass. Exploratory server boot failed because process env lacked DATABASE_URL.

Full verdict JSON
{
  "bug_free_confidence": 84,
  "bugs": 0,
  "slop": 0,
  "simplicity": 88,
  "blockers": [],
  "change_type": "fix",
  "behavior_change_risk": "medium",
  "tests_adequate": true,
  "suggested_fixes": [],
  "notes": "Typecheck and unit suites passed. [env] Integration suite failed before exercising diff because DATABASE_URL pointed at database \"postgres\" and the harness refused destructive setup. Ran new targeted test packages/server/src/gateway/auth/__tests__/base-provider-module-credentials.test.ts: 4 pass. Exploratory server boot failed because process env lacked DATABASE_URL.",
  "categories": {
    "src": 23,
    "tests": 101,
    "docs": 0,
    "config": 0,
    "deps": 0,
    "migrations": 0,
    "ci": 0,
    "generated": 0
  }
}

Local review gate — branch protection can require the pi-review commit status. See docs/REVIEW_SCHEMA.md.

@buremba buremba merged commit 3012104 into main May 21, 2026
19 of 20 checks passed
@buremba buremba deleted the feat/model-hotreload branch May 21, 2026 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants