fix(openai): unwrap SELECTION-type setting to a string — TRUE root cause of HTTP 400#39
Conversation
…use of HTTP 400
# Root cause
`settings.get('openaiModel')` returns the value of a Devvit SELECTION
schema field. SELECTION fields return a `string[]` -- even for single
selection. We were passing the raw array straight into the request body
as `"model": ["gpt-5.4-mini"]`, which OpenAI rejected as unparseable
JSON for the `model` field. OpenAI's error wording masked the underlying
problem: it said `We could not parse the JSON body of your request`,
which sounds like a structural JSON error but is actually a type
mismatch on a single field. Production log proof from v0.0.39:
[vibe-mod] callOpenAI: openaiModel raw = ["gpt-5.4-mini"]
This single bug accounts for PR #32-#37 all returning HTTP 400 with
seemingly unrelated changes. PR #38 (v0.0.39) worked around it by
hardcoding `gpt-5.4-nano` (a string), which made the request body
valid -- callOpenAI succeeded, the toast became
`The compiled rule contained an action this app does not support`
(a downstream issue, not a transit one).
# Fix
* Unwrap the SELECTION result: if `Array.isArray(raw)`, take `raw[0]`.
* Log both the raw and unwrapped values for diagnostic visibility.
* Honour the user-configured model (default `gpt-5.4-mini`).
# Restore proper few-shot examples
PR #37 had stripped JSON syntax from message content (`flattenExample`,
output as `key=value; key=value`) on a false hypothesis. With JSON
syntax removed, the model learned a non-schema output shape -- v0.0.39
emitted `{"modqueue":{"note":"..."}}` instead of `{"action":"modqueue",
"params":{"note":"..."}}`. Restoring `JSON.stringify(ex.assistant)` for
each example re-teaches the schema.
Single-user composition (PR #32) is preserved -- it's a sensible
packaging choice and was never causing the 400; the model array was.
# Tests
* `npm run check` 4/4 PASS.
* Local POST of the new body to OpenAI returns 200 with a valid
compiled rule conforming to the SAFE-action schema.
* After merge: `devvit upload` -> v0.0.40 -> Chrome verify -> production
callOpenAI returns 200 -> toast shows compile success.
# Why probe(b)/(d)/(e)/(f) all worked
The probes hardcoded `'gpt-5.4-nano'` (a string literal) -- so they
sent a valid `model` field and OpenAI parsed their requests fine. Only
callOpenAI used `settings.get(...)` and got hit by the array bug.
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request fixes a bug where the openaiModel setting was incorrectly being sent as an array instead of a string, and restores the use of structured JSON in few-shot examples now that the model-array issue is resolved. The reviewer suggests that since the previous 'JSON-syntax-free' strategy has been abandoned, the prompt construction should move away from single-line collapsing and instead use newlines to better delineate sections, providing a code suggestion to implement this improved structure.
| const collapse = (s: string) => s.replace(/\s+/g, ' ').trim(); | ||
| // Recursively flatten any value to a JSON-syntax-free string. Keys use | ||
| // `key=value`, arrays use `[a; b; c]`-but-with-no-brackets style, objects | ||
| // are space-separated `key=value` pairs. | ||
| const flattenValue = (v: unknown): string => { | ||
| if (v == null) return 'null'; | ||
| if (typeof v === 'string') return v.replace(/\s+/g, ' '); | ||
| if (typeof v === 'number' || typeof v === 'boolean') return String(v); | ||
| if (Array.isArray(v)) return v.map(flattenValue).join(' or '); | ||
| if (typeof v === 'object') { | ||
| return Object.entries(v as Record<string, unknown>) | ||
| .map(([k, val]) => `${k}=${flattenValue(val)}`) | ||
| .join(' '); | ||
| } | ||
| return String(v); | ||
| }; | ||
| const flattenExample = (ex: (typeof FEW_SHOT_EXAMPLES)[number]): string => { | ||
| const obj = ex.assistant as Record<string, unknown>; | ||
| const flat = Object.entries(obj) | ||
| .map(([k, v]) => `${k}=${flattenValue(v)}`) | ||
| .join('; '); | ||
| return `EXAMPLE INPUT ${collapse(ex.user)} EXAMPLE OUTPUT ${flat}`; | ||
| }; | ||
| const SHORT_PROMPT_CHARS = 3500; | ||
| const systemShort = collapse(VIBE_MOD_SYSTEM_PROMPT).slice(0, SHORT_PROMPT_CHARS); | ||
| const firstExample = FEW_SHOT_EXAMPLES[0]; | ||
| const exampleText = firstExample ? flattenExample(firstExample) : ''; | ||
| const exampleBlock = (ex: (typeof FEW_SHOT_EXAMPLES)[number], i: number): string => | ||
| `EXAMPLE ${i + 1} INPUT: ${collapse(ex.user)} EXAMPLE ${i + 1} OUTPUT: ${JSON.stringify(ex.assistant)}`; | ||
| const compositeContent = [ | ||
| 'SYSTEM INSTRUCTIONS', | ||
| systemShort, | ||
| exampleText, | ||
| 'TASK INPUT', | ||
| 'SYSTEM INSTRUCTIONS:', | ||
| collapse(VIBE_MOD_SYSTEM_PROMPT), | ||
| 'EXAMPLES:', | ||
| ...FEW_SHOT_EXAMPLES.map(exampleBlock), | ||
| 'TASK INPUT:', | ||
| collapse(userRule.slice(0, 1000)), | ||
| ...(clarif ? ['TASK CLARIFICATION', collapse(clarif)] : []), | ||
| 'OUTPUT strict JSON only no prose', | ||
| ] | ||
| .filter(Boolean) | ||
| .join(' '); | ||
| ...(clarif ? ['TASK CLARIFICATION:', collapse(clarif)] : []), | ||
| 'OUTPUT (strict JSON object only, no prose):', | ||
| ].join(' '); |
There was a problem hiding this comment.
Now that the 'Round-6 strategy' of removing all JSON syntax has been abandoned to restore proper JSON few-shot examples (line 1313), the use of collapse and join(' ') to create a single-line prompt is less justified. While the author mentions a 'transit-safe' shape, the restoration of JSON.stringify already introduces significant escape sequences (\", {, }, etc.).
Using newlines and preserving the structure of the system prompt would likely improve the model's ability to follow instructions and distinguish between sections (Instructions, Examples, Task Input). Consider using \n as a separator and removing the collapse calls on the system prompt and user input.
| const collapse = (s: string) => s.replace(/\s+/g, ' ').trim(); | |
| // Recursively flatten any value to a JSON-syntax-free string. Keys use | |
| // `key=value`, arrays use `[a; b; c]`-but-with-no-brackets style, objects | |
| // are space-separated `key=value` pairs. | |
| const flattenValue = (v: unknown): string => { | |
| if (v == null) return 'null'; | |
| if (typeof v === 'string') return v.replace(/\s+/g, ' '); | |
| if (typeof v === 'number' || typeof v === 'boolean') return String(v); | |
| if (Array.isArray(v)) return v.map(flattenValue).join(' or '); | |
| if (typeof v === 'object') { | |
| return Object.entries(v as Record<string, unknown>) | |
| .map(([k, val]) => `${k}=${flattenValue(val)}`) | |
| .join(' '); | |
| } | |
| return String(v); | |
| }; | |
| const flattenExample = (ex: (typeof FEW_SHOT_EXAMPLES)[number]): string => { | |
| const obj = ex.assistant as Record<string, unknown>; | |
| const flat = Object.entries(obj) | |
| .map(([k, v]) => `${k}=${flattenValue(v)}`) | |
| .join('; '); | |
| return `EXAMPLE INPUT ${collapse(ex.user)} EXAMPLE OUTPUT ${flat}`; | |
| }; | |
| const SHORT_PROMPT_CHARS = 3500; | |
| const systemShort = collapse(VIBE_MOD_SYSTEM_PROMPT).slice(0, SHORT_PROMPT_CHARS); | |
| const firstExample = FEW_SHOT_EXAMPLES[0]; | |
| const exampleText = firstExample ? flattenExample(firstExample) : ''; | |
| const exampleBlock = (ex: (typeof FEW_SHOT_EXAMPLES)[number], i: number): string => | |
| `EXAMPLE ${i + 1} INPUT: ${collapse(ex.user)} EXAMPLE ${i + 1} OUTPUT: ${JSON.stringify(ex.assistant)}`; | |
| const compositeContent = [ | |
| 'SYSTEM INSTRUCTIONS', | |
| systemShort, | |
| exampleText, | |
| 'TASK INPUT', | |
| 'SYSTEM INSTRUCTIONS:', | |
| collapse(VIBE_MOD_SYSTEM_PROMPT), | |
| 'EXAMPLES:', | |
| ...FEW_SHOT_EXAMPLES.map(exampleBlock), | |
| 'TASK INPUT:', | |
| collapse(userRule.slice(0, 1000)), | |
| ...(clarif ? ['TASK CLARIFICATION', collapse(clarif)] : []), | |
| 'OUTPUT strict JSON only no prose', | |
| ] | |
| .filter(Boolean) | |
| .join(' '); | |
| ...(clarif ? ['TASK CLARIFICATION:', collapse(clarif)] : []), | |
| 'OUTPUT (strict JSON object only, no prose):', | |
| ].join(' '); | |
| const exampleBlock = (ex: (typeof FEW_SHOT_EXAMPLES)[number], i: number): string => | |
| `EXAMPLE ${i + 1} INPUT: ${ex.user.trim()}\nEXAMPLE ${i + 1} OUTPUT: ${JSON.stringify(ex.assistant)}`; | |
| const compositeContent = [ | |
| 'SYSTEM INSTRUCTIONS:', | |
| VIBE_MOD_SYSTEM_PROMPT.trim(), | |
| '', | |
| ...(FEW_SHOT_EXAMPLES.length > 0 ? ['EXAMPLES:', ...FEW_SHOT_EXAMPLES.map(exampleBlock), ''] : []), | |
| 'TASK INPUT:', | |
| userRule.slice(0, 1000).trim(), | |
| ...(clarif ? ['', 'TASK CLARIFICATION:', clarif.trim()] : []), | |
| '', | |
| 'OUTPUT (strict JSON object only, no prose):', | |
| ].join('\n'); |
…ler-unwrap fix(openai): apply SELECTION-array unwrap to submit-handler llmModel (follow-up to #39)
External code-review pointed out 5 valid issues. All addressed in this commit on the same branch — PR #44 picks them up automatically. ## Gemini code-assist ### #1 — Manage menu: parallelize per-rule dry-run reads The previous serial `for` loop multiplied Redis latency by the draft count. With up to 50 drafts and ~10ms per round-trip the difference is ~500ms vs ~10ms wall time. Switched to `Promise.allSettled([...])` so a single failed fetch doesn't stop the others. ### #2 — applyManageActions: enforce the 50-rule cap on apply pause / activate / unpause moves can push a bundle over the 50-rule cap even when no compose-rule call is made. Added an explicit check before the dual-write txn — the moderator gets a "Rule cap exceeded (active=N, draft=N, max=50). Delete a rule first" toast and the bundles stay untouched. New test in routes-manage.test.ts. ### #3 — Use the configured openaiModel as the fallback (not the literal "manage") The previous `'manage'` fallback for `bundle.llmModel` broke `estimateTokenCost()` because 'manage' isn't a key in `OPENAI_PRICING_USD_PER_TOKEN`. The dashboard shows "(~$0 on manage)" until a real compile lands. Now we read the configured `openaiModel` setting through `readOpenaiModel()` (already SELECTION-array-safe per PR #39). ### #4 — Atomic dual-write via WATCH/MULTI/EXEC (the previous PR description's claim was wrong) The previous `Promise.all([redis.set(activeKey, ...), redis.set(draftKey, ...)])` was NOT atomic — if the second `set` failed (e.g. plugin RPC blip), pause could leave a rule absent from active without showing up in draft. Switched to Devvit's `redis.watch(...).multi().set(...).set(...).exec()` pattern (same shape executor.ts already uses for audit writes), and check `exec()`'s null return for the WATCH-aborted case so two moderators managing rules at the same time get an actionable "Another moderator changed the rules at the same time — re-open Manage rules" toast. ## CodeRabbit ### #5 — humanizeRule: cap value-stringification at 100 chars A rule that uses `op: in` against a long allowlist (e.g. 50 banned domains) would otherwise render thousands of characters into the confirm form's `compiledSummary` field, blowing past Devvit's modal description budget and obscuring the actual rule structure. Truncate with `...` after 100 chars in the predicate-tree leaf renderer. ## Test infrastructure test/devvit-testkit.ts: fakeRedis.watch().exec() now returns `[] | null` (matching real Devvit Redis MULTI/EXEC semantics) instead of `void`. Without this, the new WATCH-abort detection in applyManageActions would fire on every test. Updated `FakeTxn.exec` type accordingly. ## Verification - `npm run check` 4/4 gates green - 210 tests pass (1 skipped) — added 1 cap-overflow test in routes-manage.test.ts
…ction fix(openai): unwrap SELECTION-type setting to a string — TRUE root cause of HTTP 400
…odel The submit handler at src/server/index.ts:477-482 reads `openaiModel` via `settings.get` and assigns to `llmModel` (stored as draft metadata). Same SELECTION-array bug as PR #39 fixed in callOpenAI: without unwrap, draft.llmModel would hold `["gpt-5.4-mini"]` instead of `"gpt-5.4-mini"`, which could break later RuleBundle parses. Add `.venv-chrome-auth` and `playwright/.auth` to .prettierignore — these are runtime artifacts from chrome-auth-test verification infrastructure. Gates: `npm run check` 4/4 PASS.
…ler-unwrap fix(openai): apply SELECTION-array unwrap to submit-handler llmModel (follow-up to #39)
…branch cleanup record Records the 7-round speculative-fix loop that PR #32-#38 went through, the actual root cause (PR #39: SELECTION-typed setting returning string[] not string), and the end-to-end verification: production toast 'Compiled rule "New-account posts to mod queue". Dry-run started -- check Dashboard in 30s.' captured via Chrome on v0.0.41. Probe code (synthetic-compile-probe handler + scheduler entry) lived only on the fix/openai-error-handling branch and never reached main; that branch is now deleted from origin. grep -c synthetic-compile-probe across src/server/index.ts and devvit.json: 0 + 0. No code change. Closes the postmortem loop on issue and satisfies the goal-condition (5) requirement for an explicit MERGED PR surface recording the probe-removal decision.
…Text + toast 1-line summary Audit-driven UX quick wins for the compose flow. Full audit + scenario rationale: - claudedocs/2026-05-14-compose-flow-audit.md (10 findings vs README promise matrix) - docs/demo-scenario.md (60s demo, README-aligned, Hypothesis 3) ## What changes (3 audit findings landed) ### Finding #1 (CRITICAL for demo) — Clarify modal renders LLM suggestedAnswers as a dropdown The LLM has *always* returned `{ needsClarification, question, suggestedAnswers: [...] }` (see system-prompt.ts:80-85), but the server discarded `suggestedAnswers` and showed only a free-text textarea. Moderators had to paraphrase the question. Now: when `suggestedAnswers` is present, render a `select` field with one click per option PLUS a `clarificationAnswerOther` paragraph as an override / escape hatch. When the LLM returns no suggestions, fall back to the original free-text path. `compose-rule-submit` defensively unwraps SELECTION-array values (PR #39 pattern) and prefers the "Other" override when non-empty. ### Finding #4 — ban/mute toggle gets explanatory helpText on both forms Mod mental model = "checkbox grants ban permission". Real behaviour = "checkbox lets compile succeed when the LLM emits ban/mute". Added a single shared `ALLOW_GUARDED_HELP` string used by the compose form and the clarify modal so both forms explain the toggle the same way. ### Finding #6 — Success toast carries a 1-line summary + an explicit menu pointer Previously: `Compiled rule "X". Dry-run started — check Dashboard in 30s.` The dashboard menu name is "vibe-mod: View rules + log" and Devvit toasts have no buttons, so "check Dashboard" is unreachable for a first-time mod. Now: `Compiled rule "X". → post: modqueue. Dry-run started — open the subreddit ⋯ menu → "vibe-mod: View rules + log" to see preview.` The 1-line `summarizeRule()` makes the deterministic JSON contract visible without leaving the toast (audit finding #2 alternative — the full preview form stays a post-publish item). ## Verification - `npm run check` 4/4 gates green (typecheck + lint + Prettier + 25 compose tests + 3 @devvit/test + acceptance G1–G4) - 6 new tests in routes-compose.test.ts cover: select rendering, free-text fallback, SELECTION-array unwrap, "Other" override precedence, helpText presence, summary in success toast. ## Why this PR (not bundled with module split) Demo video records this UI. UX changes must land before the README screenshot captures (Phase 3) and the Playwright video automation (Phase 4). Module split (Phase 2) is a separate, independent refactor. Refs: claudedocs/2026-05-14-compose-flow-audit.md (Phase 1.5/1.5b), docs/demo-scenario.md (Phase 2.5b)
External code-review pointed out 5 valid issues. All addressed in this commit on the same branch — PR #44 picks them up automatically. ## Gemini code-assist ### #1 — Manage menu: parallelize per-rule dry-run reads The previous serial `for` loop multiplied Redis latency by the draft count. With up to 50 drafts and ~10ms per round-trip the difference is ~500ms vs ~10ms wall time. Switched to `Promise.allSettled([...])` so a single failed fetch doesn't stop the others. ### #2 — applyManageActions: enforce the 50-rule cap on apply pause / activate / unpause moves can push a bundle over the 50-rule cap even when no compose-rule call is made. Added an explicit check before the dual-write txn — the moderator gets a "Rule cap exceeded (active=N, draft=N, max=50). Delete a rule first" toast and the bundles stay untouched. New test in routes-manage.test.ts. ### #3 — Use the configured openaiModel as the fallback (not the literal "manage") The previous `'manage'` fallback for `bundle.llmModel` broke `estimateTokenCost()` because 'manage' isn't a key in `OPENAI_PRICING_USD_PER_TOKEN`. The dashboard shows "(~$0 on manage)" until a real compile lands. Now we read the configured `openaiModel` setting through `readOpenaiModel()` (already SELECTION-array-safe per PR #39). ### #4 — Atomic dual-write via WATCH/MULTI/EXEC (the previous PR description's claim was wrong) The previous `Promise.all([redis.set(activeKey, ...), redis.set(draftKey, ...)])` was NOT atomic — if the second `set` failed (e.g. plugin RPC blip), pause could leave a rule absent from active without showing up in draft. Switched to Devvit's `redis.watch(...).multi().set(...).set(...).exec()` pattern (same shape executor.ts already uses for audit writes), and check `exec()`'s null return for the WATCH-aborted case so two moderators managing rules at the same time get an actionable "Another moderator changed the rules at the same time — re-open Manage rules" toast. ## CodeRabbit ### #5 — humanizeRule: cap value-stringification at 100 chars A rule that uses `op: in` against a long allowlist (e.g. 50 banned domains) would otherwise render thousands of characters into the confirm form's `compiledSummary` field, blowing past Devvit's modal description budget and obscuring the actual rule structure. Truncate with `...` after 100 chars in the predicate-tree leaf renderer. ## Test infrastructure test/devvit-testkit.ts: fakeRedis.watch().exec() now returns `[] | null` (matching real Devvit Redis MULTI/EXEC semantics) instead of `void`. Without this, the new WATCH-abort detection in applyManageActions would fire on every test. Updated `FakeTxn.exec` type accordingly. ## Verification - `npm run check` 4/4 gates green - 210 tests pass (1 skipped) — added 1 cap-overflow test in routes-manage.test.ts
TRUE ROOT CAUSE
settings.get('openaiModel')on a Devvit SELECTION-type field returns a string array (["gpt-5.4-mini"]), not a string. We were passing the raw array straight into the request body as"model": ["gpt-5.4-mini"]. OpenAI rejected this with the misleading errorWe could not parse the JSON body— which sounds like structural JSON error but is actually a type mismatch on themodelfield.Production proof (v0.0.39 logs)
This single bug explains why PR #32-#37 (six rounds of unrelated body-shape and content fixes) all still returned HTTP 400. None of them touched the model field. PR #38 worked around it by hardcoding
gpt-5.4-nano(a string literal), at which point callOpenAI started succeeding — but our few-shot was still in the broken flattened format from PR #37, so v0.0.39 returned a different toast:The compiled rule contained an action this app does not support.That's a downstream issue, not the transit bug.Why probes (b)/(d)/(e)/(f) all returned 200
All probes hardcoded
'gpt-5.4-nano'(a string literal) and so sent valid"model": "gpt-5.4-nano"to OpenAI. OnlycallOpenAIread throughsettings.get(...)and got the array-of-string-instead-of-string mismatch.Fix
Array.isArray(raw), takeraw[0].gpt-5.4-mini).JSON.stringify(ex.assistant)for each few-shot example — re-teaches the model the correct{action, params}shape that the executor validates.Tests
npm run check4/4 PASS.devvit upload→ v0.0.40 → Chrome verify → production callOpenAI returns 200 → toast shows compile success.Postmortem note
The misleading OpenAI error wording ("could not parse the JSON body" for a type mismatch on a single field, rather than e.g. "Invalid type for parameter 'model'") cost us seven PRs and ~40 minutes of bisecting. We tried body shape, escape density, byte vs string, message count, size, Content-Length, JSON-syntax-in-content — none addressed the actual issue. The diagnostic console.log added in PR #38 ("settings.get(openaiModel) = ...") surfaced the array in one line.
🤖 Generated with Claude Code