Skip to content

fix(api): pass connected set as array literal in sync-presence#4483

Merged
saddlepaddle merged 1 commit into
mainfrom
fix-sync-presence-sql
May 13, 2026
Merged

fix(api): pass connected set as array literal in sync-presence#4483
saddlepaddle merged 1 commit into
mainfrom
fix-sync-presence-sql

Conversation

@saddlepaddle
Copy link
Copy Markdown
Collaborator

@saddlepaddle saddlepaddle commented May 13, 2026

Summary

The reconciler from #4476 has been 502-ing every QStash tick since merge. Verified via Vercel logs — the actual error:

```
Error: Failed query: ... ANY(($1, $2, $3, ..., $57)::text[])
```

Drizzle's `sql` template was expanding the JS `connected: string[]` as N positional placeholders, producing a Postgres row-constructor cast to text[] rather than an array. That's not valid SQL syntax for membership testing.

Fix

Build a single Postgres array literal (`{a,b,c}`) from the connected set and pass it as one text parameter. The routing keys are `${uuid}:${32-char-hex}` — they contain only `[0-9a-f-]` plus a single `:`, no characters that would conflict with the unquoted array literal format, so no per-element quoting needed.

Test plan

  • After merge + Vercel deploy, watch the next QStash tick in vercel logs — should return 200 with `{connected, flippedOn, flippedOff}` counts instead of 502.
  • Confirm a known stuck-online host (the sandbox on satya-sprite, killed earlier) flips to `is_online = false` after one reconciler run.

Notes

  • Caught this while running the original E2E test: killed the staging relay machine, waited for the directory TTL to expire, expected the reconciler to flip my row to offline — instead it 502-ed every minute. The `try/catch` from the hardening commit (a8c83b6 in the merged PR) is what made it diagnosable rather than a generic 500.
  • The local repro confirmed shape of the error too (`db.execute` throws `NeonDbError` with the same expanded SQL).

Summary by cubic

Fix the sync-presence job by passing the connected set as a single Postgres array literal ({a,b,c}) instead of letting drizzle expand the JS array into N placeholders that became a row-cast and caused 502s. Membership checks now work, so hosts flip online/offline as expected.

Written for commit 4f515d2. Summary will update on new commits.

Summary by CodeRabbit

  • Refactor
    • Optimized internal database query construction for presence synchronization endpoints. No changes to user-facing functionality, behavior, or response formats.

Review Change Stack

The reconciler was passing the JS string array straight into the
drizzle sql template, which expanded it as N positional params:
`ANY(($1, $2, ..., $57)::text[])`. That's a row-cast, not an array
construction, so every QStash tick was 502ing with "Reconcile write
failed" and no rows ever flipped. Verified in prod via vercel logs
showing the exact expanded SQL.

Fix: build a Postgres array literal (`{a,b,c}`) and pass it as a
single text parameter. Routing keys are `${uuid}:${32-char-hex}`
which never contain literal `,`, `{`, `}`, `"`, or `\`, so the
unquoted form is safe.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e900b3e7-5391-4aef-96eb-1ad32d89427c

📥 Commits

Reviewing files that changed from the base of the PR and between 9853d80 and 4f515d2.

📒 Files selected for processing (1)
  • apps/api/src/app/api/hosts/jobs/sync-presence/route.ts

📝 Walkthrough

Walkthrough

The sync-presence endpoint now constructs connected relay keys from Redis into a Postgres array-literal string ({a,b,c}) before passing them to the SQL reconciler. The ANY(...) clause in the presence update query is updated to reference this pre-built literal with an explicit ::text[] cast instead of passing the raw JavaScript array.

Changes

Presence Sync Array Literal

Layer / File(s) Summary
Postgres array literal construction and SQL parameter update
apps/api/src/app/api/hosts/jobs/sync-presence/route.ts
Connected relay keys from Redis are formatted into a single Postgres array-literal string and stored in connectedArrayLiteral. The reconcile SQL ANY(...) clause is then updated to use ${connectedArrayLiteral}::text[] instead of ${connected}::text[] when determining expected online state.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

  • superset-sh/superset#4476: Both PRs modify the same sync-presence/route.ts file to change how the reconciler builds the Postgres ANY(...) input from the connected relay key set when computing and updating expected/is_online state.

Poem

🐰 Array literals bound in braces bright,
Postgres strings made pure and right,
Redis keys dance through the relay's song,
Presence syncs now steady and strong!
~CodeRabbit

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately describes the main fix: passing the connected set as an array literal instead of expanding it into placeholders.
Description check ✅ Passed The description provides comprehensive context, including the specific error, root cause analysis, and the fix approach, though it lacks formal PR template sections.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix-sync-presence-sql

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@capy-ai
Copy link
Copy Markdown

capy-ai Bot commented May 13, 2026

Capy auto-review is paused for this organization because the monthly auto-review limit has been reached. Increase the limit or turn it off in billing settings to resume automatic reviews.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 13, 2026

Greptile Summary

This PR fixes the reconciler's 502 errors by replacing a broken Drizzle array interpolation with a single Postgres array-literal parameter. The root cause was that sql\${connected}`(whereconnectedis a JSstring[]) expanded into ($1,$2,...,$N)::text[]` — a row constructor cast, not an array — which Postgres rejects.

  • connectedArrayLiteral = \{${connected.join(",")}}`is built after the empty-set guard and passed as one$1binding, so Postgres correctly parses it via$1::text[]`.
  • The fix is not SQL-injectable because Drizzle still parameterizes the literal string; the implicit risk is that any Redis key containing {, }, or , would cause a Postgres cast error and a 502, with no early warning before the query runs.

Confidence Score: 4/5

The fix correctly resolves the production 502 and is safe to merge; the only open question is whether Redis key format is enforced upstream.

The change is small and targeted, directly addressing a confirmed production breakage. The array-literal approach is a standard Postgres workaround for parameterized array inputs and behaves correctly under Drizzle's binding model. The one gap is that there is no in-code assertion that Redis values match the uuid:hex format before they are embedded in the literal — a format drift in directory.ts would produce a quiet 502 rather than a clear validation error. That is a hardening concern, not a blocker.

The single changed file, apps/api/src/app/api/hosts/jobs/sync-presence/route.ts, deserves a second look around line 79 where the array literal is constructed without validating element format.

Important Files Changed

Filename Overview
apps/api/src/app/api/hosts/jobs/sync-presence/route.ts Fixes the Drizzle array-expansion bug by building a Postgres array literal and passing it as a single parameter; the fix is correct and not SQL-injectable, though it adds an implicit dependency on Redis key format that is not validated in code.

Sequence Diagram

sequenceDiagram
    participant QStash
    participant Route as sync-presence route
    participant Redis
    participant Postgres as Postgres (Neon)

    QStash->>Route: POST (signed)
    Route->>Route: verify QStash signature
    Route->>Redis: ZRANGEBYSCORE relay:tunnel-ttl [now, +inf]
    Redis-->>Route: connected string[]

    alt "connected.length === 0"
        Route-->>QStash: "200 { skipped: true }"
    else "connected.length > 0"
        Route->>Route: "build connectedArrayLiteral = "{a,b,c}""
        Note over Route,Postgres: Before fix: drizzle expanded array into ($1,$2,...)::text[] (row-cast, invalid). After fix: passes literal as single $1
        Route->>Postgres: UPDATE v2_hosts ... ANY($1::text[])
        Postgres-->>Route: RETURNING rows with new is_online
        Route-->>QStash: "200 { connected, flippedOn, flippedOff }"
    end
Loading
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
apps/api/src/app/api/hosts/jobs/sync-presence/route.ts:79
**Unvalidated Redis values flow into Postgres array literal**

The fix is not a SQL-injection risk — Drizzle parameterizes `${connectedArrayLiteral}` as a single `$1` binding — but it creates a silent failure path: if any value returned by `redis.zrange` ever contains `{`, `}`, `,`, or whitespace (e.g., after a key-format change in `directory.ts` or due to Redis data corruption), Postgres will reject the `$1::text[]` cast and the reconciler will 502. Because the guard at line 63 only checks length, not shape, a single malformed key silences the whole run. Adding a format check before building the literal would make that failure mode explicit and diagnosable.

Reviews (1): Last reviewed commit: "fix(api): pass connected set as array li..." | Re-trigger Greptile

// rather than letting drizzle expand the JS array into N placeholders
// (`($1, $2, ...)::text[]` is a row-cast, not an array). Routing keys are
// `${uuid}:${32-char-hex}` so the unquoted `{a,b,c}` literal is safe.
const connectedArrayLiteral = `{${connected.join(",")}}`;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unvalidated Redis values flow into Postgres array literal

The fix is not a SQL-injection risk — Drizzle parameterizes ${connectedArrayLiteral} as a single $1 binding — but it creates a silent failure path: if any value returned by redis.zrange ever contains {, }, ,, or whitespace (e.g., after a key-format change in directory.ts or due to Redis data corruption), Postgres will reject the $1::text[] cast and the reconciler will 502. Because the guard at line 63 only checks length, not shape, a single malformed key silences the whole run. Adding a format check before building the literal would make that failure mode explicit and diagnosable.

Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/api/src/app/api/hosts/jobs/sync-presence/route.ts
Line: 79

Comment:
**Unvalidated Redis values flow into Postgres array literal**

The fix is not a SQL-injection risk — Drizzle parameterizes `${connectedArrayLiteral}` as a single `$1` binding — but it creates a silent failure path: if any value returned by `redis.zrange` ever contains `{`, `}`, `,`, or whitespace (e.g., after a key-format change in `directory.ts` or due to Redis data corruption), Postgres will reject the `$1::text[]` cast and the reconciler will 502. Because the guard at line 63 only checks length, not shape, a single malformed key silences the whole run. Adding a format check before building the literal would make that failure mode explicit and diagnosable.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 1 file

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 13, 2026

🧹 Preview Cleanup Complete

The following preview resources have been cleaned up:

  • ✅ Neon database branch

Thank you for your contribution! 🎉

@saddlepaddle saddlepaddle merged commit a2e7f6f into main May 13, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant