fix(gateway): Owletto MCP sync, Slack markdown/threading, tool-approval lifecycle, deployment coalescing by buremba · Pull Request #210 · lobu-ai/lobu

buremba · 2026-04-18T02:59:16Z

Bundle of follow-up hardening on top of #207 (declared vs user auth split). Each theme is independent and fixes a concrete user-facing bug; kept as one PR because they share the same test surface and landed together during hardening.

Theme A — Owletto MCP global server sync

Files: services/core-services.ts, auth/mcp/config-service.ts

Mirror resolved MEMORY_URL env var into McpConfigService as a global owletto streamable-http server at startup AND on lobu.toml reload. Without this, /mcp/owletto proxy lookups via getHttpServer("owletto") return undefined and the worker-side plugin fails with "MCP server 'owletto' not found".
New upsertGlobalServer() — replace-on-write (vs existing registerGlobalServers() which is register-once).
Per-agent MCP lookup: getSettings → getEffectiveSettings so MCP servers declared in lobu.toml (via the declared-agent registry from #207, "refactor(gateway): split declared agent config from user-scoped auth state") actually resolve.
Intentionally omit oauth: {} on the owletto entry: worker-side owletto_login owns the device-code flow; adding oauth would trigger a parallel gateway OAuth/PKCE discovery and surface two competing login links.
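
To make the register-once vs replace-on-write distinction concrete, here is a minimal sketch. The class name GlobalServerRegistry, the HttpServerConfig shape, and the syncOwletto helper are illustrative assumptions, not the actual McpConfigService code; the "skip ids that already exist" reading of register-once is also an assumption.

```typescript
// Hypothetical config shape; the real McpConfigService types are not shown in this PR.
interface HttpServerConfig {
  type: "streamable-http";
  url: string;
}

class GlobalServerRegistry {
  private readonly servers = new Map<string, HttpServerConfig>();

  // Register-once (assumed semantics): an id that is already present keeps its config.
  registerGlobalServers(entries: Record<string, HttpServerConfig>): void {
    for (const [id, config] of Object.entries(entries)) {
      if (!this.servers.has(id)) this.servers.set(id, config);
    }
  }

  // Replace-on-write: the latest config always wins, so a reload can change the URL.
  upsertGlobalServer(id: string, config: HttpServerConfig): void {
    this.servers.set(id, config);
  }

  getHttpServer(id: string): HttpServerConfig | undefined {
    return this.servers.get(id);
  }
}

// Hypothetical helper, assumed to run at startup and again on lobu.toml reload.
function syncOwletto(registry: GlobalServerRegistry, memoryUrl: string | undefined): void {
  if (!memoryUrl) return;
  // No `oauth` key on purpose: the worker-side login owns the auth flow.
  registry.upsertGlobalServer("owletto", { type: "streamable-http", url: memoryUrl });
}
```

With register-once alone, a changed MEMORY_URL after reload would keep pointing at the stale URL; the upsert path is what lets the reload take effect.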

Theme B — Tool-approval lifecycle fixes

Files: auth/mcp/proxy.ts, cli/gateway.ts, connections/interaction-bridge.ts

Bump PENDING_TOOL_TTL 5 min → 24 h (proxy + in-memory card timer in interaction-bridge). Tool-approval cards sit in Slack for hours; 5 min silently expired real clicks.
cli/gateway.ts approveToolCall: switch GET-then-DEL to atomic GETDEL so a double-POST to /api/v1/agents/approve can't double-execute the tool. Matches the interaction-bridge path.
Capture channelId / conversationId / teamId / connectionId on the pending-tool Redis record so approval results route to the conversation the tool call originated from, not the thread of the clicked card (which can differ on Slack).
Expired-card UX: when GETDEL returns null but the in-memory approval card is still tracked, edit the card and post a "this approval expired" notice instead of silently no-op'ing. Real retries still no-op silently (no tracked card).
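
The GET-then-DEL race can be illustrated with a small in-memory stand-in for Redis. FakeRedis, approveGetThenDel, and approveGetDel are hypothetical names for this sketch only; the real approveToolCall talks to an actual Redis client and does more than this.

```typescript
// Minimal in-memory stand-in for the Redis commands involved (illustrative only).
class FakeRedis {
  private readonly data = new Map<string, string>();

  async set(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }
  async del(key: string): Promise<number> {
    return this.data.delete(key) ? 1 : 0;
  }
  // GETDEL reads and deletes in one step, so only one caller can ever see the value.
  async getdel(key: string): Promise<string | null> {
    const value = this.data.get(key) ?? null;
    this.data.delete(key);
    return value;
  }
}

type RunTool = (pending: string) => Promise<void>;

// Racy variant: two concurrent callers can both pass the GET before either DEL runs.
async function approveGetThenDel(redis: FakeRedis, key: string, run: RunTool): Promise<boolean> {
  const pending = await redis.get(key);
  if (pending === null) return false;
  await redis.del(key);
  await run(pending);
  return true;
}

// Atomic variant: whoever receives the value owns the execution; everyone else gets null.
async function approveGetDel(redis: FakeRedis, key: string, run: RunTool): Promise<boolean> {
  const pending = await redis.getdel(key);
  if (pending === null) return false;
  await run(pending);
  return true;
}
```

In this sketch a null result from the atomic variant covers both a duplicate POST and an expired record; telling those apart is what the tracked in-memory card is used for in the expired-card item above.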

Theme C — Slack markdown rendering

Files: connections/chat-response-bridge.ts (+229 LOC)

Skip the SDK streaming path for Slack; buffer deltas, then at completion post a single chunked chat.postMessage with markdown_text. Fixes (a) mid-line splits from the Slack streaming API's fixed-size chunking and (b) < / > / & HTML-escaping that the text field inflicts on code snippets.
Split on paragraph boundaries (\n\n), then line boundaries, then hard-slice, with an 11,000-char per-post margin under Slack's 12k limit.
Pre-decode HTML entities (worker may have produced them upstream from MCP tool results) and strip empty markdown links [text]() (some tools emit these as unlinked citation footnotes).
Silent-drop fix: resolve the SDK fallback target during delta setup so a missing Slack adapter at completion still delivers via target.post() instead of silently dropping.
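
The paragraph → line → hard-slice order can be sketched as below. chunkMarkdown and SLACK_POST_LIMIT are hypothetical names; this is one way to implement the described splitting under the stated 11,000-char margin, not the code in chat-response-bridge.ts.

```typescript
// Per-post margin taken from the description above.
const SLACK_POST_LIMIT = 11_000;

// Split `text` into pieces no longer than `limit`, preferring paragraph breaks,
// then line breaks, and hard-slicing only when a single line is too long.
function chunkMarkdown(text: string, limit: number = SLACK_POST_LIMIT): string[] {
  const chunks: string[] = [];
  let current = "";

  const flush = (): void => {
    if (current.length > 0) chunks.push(current);
    current = "";
  };

  // Append `piece` (already <= limit) to the current chunk, or start a new chunk on overflow.
  const append = (piece: string, separator: string): void => {
    const candidate = current.length > 0 ? current + separator + piece : piece;
    if (candidate.length <= limit) {
      current = candidate;
    } else {
      flush();
      current = piece;
    }
  };

  for (const paragraph of text.split("\n\n")) {
    if (paragraph.length <= limit) {
      append(paragraph, "\n\n");
      continue;
    }
    // Paragraph too long for one post: fall back to line boundaries.
    flush();
    for (const line of paragraph.split("\n")) {
      if (line.length <= limit) {
        append(line, "\n");
        continue;
      }
      // Single line too long: hard-slice it into limit-sized pieces.
      flush();
      for (let i = 0; i < line.length; i += limit) {
        chunks.push(line.slice(i, i + limit));
      }
    }
    flush();
  }
  flush();
  return chunks;
}
```

Each returned chunk would then go out as its own chat.postMessage call with markdown_text; the separator at a chunk boundary is dropped rather than carried into the next post.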

Theme D — Slack session behavior

Files: connections/{slack-instruction-provider.ts (new), chat-instance-manager.ts, message-handler-bridge.ts, conversation-state-store.ts}, gateway-main.ts

New SlackInstructionProvider contributes Slack bot identity (@username / U0123ABC) through the InstructionProvider interface so the model knows its own Slack handle and doesn't treat its own ID as a stranger. Auto-registered for Slack platforms; gateway-main.ts now eagerly registers providers for platforms added after start().
chat-instance-manager: mirror adapter.botUserId onto connection.metadata alongside botUsername so mention-strip and the instruction provider can find it.
message-handler-bridge: strip <@Uxxx> and bare @Uxxx bot-mention tokens Slack delivers raw. Collapse to canonical thread.id for DM thread replies (previously fell back to channelId, so replies to bot messages inside a DM thread landed in the main DM pane).
Thread-history backfill: on first mention in a group thread, fetch up to 50 prior messages via adapter.fetchMessages (Slack → conversations.replies) and append to conversation history, so mid-thread mentions have context. Atomic one-shot-per-thread guard via new ConversationStateStore.claimThreadBackfill / releaseThreadBackfill (Redis SETNX + 24h TTL, release on failure so transient errors can retry).
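
For the mention-strip item above, a minimal sketch of removing the bot's own mention tokens. stripBotMention is a hypothetical helper name and the exact token grammar handled in message-handler-bridge may differ; this version only covers the <@Uxxx> and bare @Uxxx forms named in the description.

```typescript
// Remove mentions of the bot itself; mentions of other users are left untouched.
function stripBotMention(text: string, botUserId: string): string {
  // Escape the id defensively before embedding it in a RegExp.
  const id = botUserId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Matches `<@U123>` and bare `@U123` (the bare form must not be followed by more id characters).
  const pattern = new RegExp(`<@${id}>|@${id}(?![A-Za-z0-9])`, "g");
  return text
    .replace(pattern, "")
    .replace(/[ \t]{2,}/g, " ") // collapse the gap left behind, but keep newlines
    .trim();
}
```

The botUserId argument corresponds to the value mirrored onto connection.metadata by chat-instance-manager, which is why that mirroring is a prerequisite for the strip.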

Theme E — Deployment ensure coalescing

Files: orchestration/base-deployment-manager.ts, orchestration/impl/{docker,embedded,k8s}-deployment.ts

BaseDeploymentManager: rename abstract createDeployment → spawnDeployment (protected) and add public ensureDeployment wrapper with an in-flight Promise map. Concurrent calls for the same deploymentName within one gateway process share a single spawn; cross-process concurrency is handled by the orchestrator's atomic name uniqueness (Docker 409, K8s 409, embedded workers.has check).
docker-deployment, k8s-deployment: treat 409 AlreadyExists as benign success: inspect/start the existing container for Docker, no-op for K8s.
embedded-deployment: short-circuit on workers.has(deploymentName).
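
The in-flight Promise map can be sketched as follows. CoalescingDeploymentManager and CountingManager are hypothetical stand-ins; the real BaseDeploymentManager has a richer signature, and this sketch only covers same-process coalescing, not the orchestrator-level 409 handling.

```typescript
abstract class CoalescingDeploymentManager {
  // Deployment name -> spawn currently in progress in this process.
  private readonly inFlight = new Map<string, Promise<void>>();

  // Orchestrator-specific spawn, implemented by subclasses.
  protected abstract spawnDeployment(deploymentName: string): Promise<void>;

  // Public entry point: concurrent callers for the same name share one spawn.
  ensureDeployment(deploymentName: string): Promise<void> {
    const existing = this.inFlight.get(deploymentName);
    if (existing) return existing;

    const spawn = this.spawnDeployment(deploymentName).finally(() => {
      // Clear the entry on success and on failure so a later call can retry.
      this.inFlight.delete(deploymentName);
    });
    this.inFlight.set(deploymentName, spawn);
    return spawn;
  }
}

// Toy subclass that counts how many spawns actually happen.
class CountingManager extends CoalescingDeploymentManager {
  spawns = 0;

  protected async spawnDeployment(_deploymentName: string): Promise<void> {
    this.spawns += 1;
    await new Promise((resolve) => setTimeout(resolve, 10));
  }
}
```

Once the shared promise settles, the map entry is gone, so a later ensureDeployment call reaches spawnDeployment again; in the PR that second call is where the 409 / workers.has handling turns the duplicate into a no-op.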

Smaller items (not a theme, bundled for proximity)

config/index.ts loadEnvFile: pass override: true to dotenv so .env wins over stale shell exports. Matches the single-source-of-truth contract introduced by #209, "chore(dev): simplify compose DEPLOYMENT_MODE and opt-in OpenAI for careops" (compose no longer re-exports DEPLOYMENT_MODE). Production (NODE_ENV=production) skips loadEnvFile entirely, so prod deploys are unaffected.
config/index.ts isPluginInstalled: walk up parent directories for the node_modules/<pkg> folder, falling back from require.resolve(). Fixes detection of ESM-only packages whose exports map omits a require condition (e.g. @lobu/owletto-openclaw).
routes/public/connections.ts: bump webhook-processing log debug → info so Slack/Telegram inbound traffic is visible in prod without flipping LOG_LEVEL.
worker/src/openclaw/worker.ts (-62 LOC): drop the proactive owletto_login call at worker startup. Now that the gateway registers the owletto MCP server correctly (Theme A), the model can call owletto_login lazily when a memory tool first returns an auth error — more natural than pushing an unsolicited login link at every session start.
worker/src/gateway/sse-client.ts: signature updates matching the gateway-side changes above.
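
For the isPluginInstalled item above, a minimal sketch of the parent-directory walk only (the require.resolve() attempt that precedes it is omitted). findInParentNodeModules and its injectable exists parameter are assumptions made for illustration and testability, not the actual function in config/index.ts.

```typescript
import { existsSync } from "node:fs";
import { dirname, join, resolve } from "node:path";

// Walk from `startDir` up to the filesystem root looking for node_modules/<pkg>.
// `exists` defaults to a real filesystem check and can be swapped out in tests.
function findInParentNodeModules(
  pkg: string,
  startDir: string,
  exists: (path: string) => boolean = existsSync,
): string | undefined {
  let dir = resolve(startDir);
  for (;;) {
    const candidate = join(dir, "node_modules", pkg);
    if (exists(candidate)) return candidate;
    const parent = dirname(dir);
    // dirname() of the root is the root itself, which ends the walk.
    if (parent === dir) return undefined;
    dir = parent;
  }
}
```

Because this only checks that the folder exists, it does not depend on the package's exports map, which is the case require.resolve() misses for ESM-only packages.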

Tests (+~1,000 LOC)

Expanded coverage:

base-deployment-grants: ensureDeployment coalescing, spawnDeployment rename.
chat-response-bridge: Slack markdown buffer-and-post path.
docker-deployment: 409 handling.
embedded-deployment: workers.has short-circuit.
interaction-bridge-action-handlers: expired-card UX.
mcp-config-service: upsertGlobalServer semantics.
message-handler-bridge: backfill, mention strip, DM-thread conversationId.

Test plan

make build-packages passes (typecheck clean locally).
bun test packages/gateway/src green.
make dev with a careops agent configured via lobu.toml — agent loads, Owletto MCP calls reach memory (no "MCP server 'owletto' not found").
Slack end-to-end: <@Uxxx> mention → bot responds in-thread with markdown rendering (no HTML entities visible for code blocks), replies to bot DM messages stay in the DM thread, tool-approval card expired after 24h+ shows the expiry notice.
Confirm single Owletto login link surfaces on first memory call (not two competing device-code + PKCE flows).
Concurrent-request stress: fire N parallel messages for the same thread — only one worker deployment is spawned.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b0c72efe80


chatgpt-codex-connector · 2026-04-18T03:07:40Z

+    // No per-message user equality check: deployment names are derived from
+    // the canonical conversation key (`platform:channelId:conversationId`)
+    // *without* userId, so a thread is intentionally shared by every user
+    // who posts in it. The SSE queue (`thread_message_<deployment>`) is the
+    // routing boundary — only messages for this deployment reach this
+    // worker. The per-message `data.userId` is the actual sender of *this*
+    // turn and is propagated downstream via `payloadToWorkerConfig`
+    // (response addressing, MCP namespacing, audit). `this.userId` is the
+    // user that originally spawned the deployment, kept for audit only.


Reinstate user isolation before queuing thread messages

Dropping the per-message userId check here allows any participant in a shared thread to execute turns inside a worker that is still authenticated with the deployment-scoped WORKER_TOKEN created for the original user. In that scenario, tool approvals/execution and MCP auth context are attributed to the wrong user, which can expose one user’s grants or integrations to another user in the same thread. The new comment says data.userId is enough for namespacing, but gateway auth for MCP routes is still derived from the worker token, not this payload field.


chatgpt-codex-connector · 2026-04-18T03:07:40Z

+        stream = {
+          iterator: new AsyncPushIterator<string>(),
+          streamPromise: Promise.resolve(),
+          buffer: payload.delta,
+          streamFailed: true, // Force completion to use the post-buffer path
+          wasFullyReplaced: !!payload.isFullReplacement,
+          target: null,
+        };


Preserve a Slack post fallback target for completion errors

In the Slack-specific delta path, stream.target is always initialized to null, but completion tries to fall back to stream.target.post(...) when postSlackMarkdown is unavailable or throws. That makes the fallback path unreachable and can silently drop Slack responses whenever the adapter client is missing/uninitialized or markdown posting fails. Populate target (or resolve one in completion) so the documented fallback actually works.



buremba · 2026-04-18T03:14:05Z

Addressed both Codex comments in the latest push (75cb4c8):

P1 (sse-client.ts:511 — per-user isolation): Fixed. You're right — generateDeploymentName in the identity overload hashes only platform:channelId:conversationId (see base-deployment-manager.ts:94-101), so every user in a shared thread is routed to the same SSE queue, and the worker's baked-in WORKER_TOKEN encodes only the original spawning user's id. Without the equality guard, a second user's message would execute with the first user's token and their OAuth-scoped integrations.

Restored the data.userId !== this.userId drop with an updated comment that explains the isolation contract (WORKER_TOKEN is the enforcement point, deployment-name hashing is not). Long-term fix is per-message JWT minting so shared-thread workers can act on behalf of the actual sender — leaving that for a follow-up.

P2 (chat-response-bridge.ts:363 — Slack fallback target): Fixed. Same fix I flagged during my self-review in the previous round: now resolving the SDK target via this.resolveTarget(...) during delta setup on the Slack path, so when postSlackMarkdown returns false (missing adapter client) or throws, the else if (stream.target) fallback has a real non-null target and delivers via target.post(cleaned) instead of silently dropping.


buremba · 2026-04-18T03:19:55Z

Correction on the P1 (sse-client.ts): I've reverted the userId equality check in 834d665.

Channels/threads are intentionally one-worker-per-platform:channelId:conversationId (see generateDeploymentName), shared across all posting users — that's the group-bot design. The check I added would have silently dropped every non-spawning user's message in a channel, breaking mention/subscribed flows. DMs are single-participant so the check would have been dead there anyway.

The underlying concern — WORKER_TOKEN is scoped to the original spawning user's id in a worker shared across many users — is real but a design tradeoff, not something to fix with a filter at the worker boundary. Proper fix is per-message JWT minting; that's out of scope for this PR and is now called out in the comment above the (removed) check.


chatgpt-codex-connector Bot reviewed Apr 18, 2026


buremba force-pushed the refactor/auth-split-continuation branch from b0c72ef to 83a780a Compare April 18, 2026 03:10

buremba changed the title ~~refactor(gateway): follow-ups to declared vs user auth split~~ fix(gateway): Owletto MCP sync, Slack markdown/threading, tool-approval lifecycle, deployment coalescing Apr 18, 2026

buremba force-pushed the refactor/auth-split-continuation branch from 83a780a to 75cb4c8 Compare April 18, 2026 03:13

fix(worker): remove userId filter — channel bots are shared-worker-pe…

834d665

…r-channel by design

buremba merged commit 92ce0eb into main Apr 18, 2026
13 of 14 checks passed

buremba deleted the refactor/auth-split-continuation branch April 18, 2026 03:30

buremba mentioned this pull request Apr 18, 2026

chore(main): release lobu 3.6.0 #204

Merged

buremba mentioned this pull request May 21, 2026

chore: bump packages/owletto pointer to bf904a7 #997

Merged

buremba added a commit that referenced this pull request May 21, 2026

chore: bump packages/owletto pointer to bf904a7 (#997)

3ec1937

Picks up: deploy: add Postgres PITR recovery runbook + seal summaries-db-app (#210) Before: 83c9a09 After: bf904a7

buremba merged 2 commits into main from refactor/auth-split-continuation
