feat(openai): tune compile call (reasoning_effort: none, verbosity: low), default gpt-5.4-mini by ComBba · Pull Request #11 · Two-Weeks-Team/vibe-mod

ComBba · 2026-05-12T10:55:10Z

What

vibe-mod's one LLM call (callOpenAI: NL rule → strict JSON, or a clarification) is mechanical translation, not reasoning — so configure it that way:

| param | value | why |
| --- | --- | --- |
| `reasoning_effort` | `'none'` | No hidden reasoning, so the call is fast (~1.2–1.4s) and the token budget isn't eaten by reasoning. ⚠️ `'none'` is the gpt-5.4-family value; gpt-5.0/5.1 used `'minimal'` (5.4 rejects it: "Supported values are: 'none','low','medium','high','xhigh'") and gpt-5-mini wants `'minimal'`. Fine because the model options are restricted to the 5.4 family. |
| `verbosity` | `'low'` | terse JSON, no commentary |
| `max_completion_tokens` | `600` (was 700) | a compiled rule + a clarification fit comfortably; observed worst case ~150 out tokens |
| ~~`max_tokens`~~ / ~~`temperature`~~ | - | not supported on gpt-5.x; use `max_completion_tokens`, default temperature only |
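
The family-specific `reasoning_effort` values described in the table can be sketched as a small lookup. This is only an illustration: the helper name `lowestReasoningEffort` and the matching logic are assumptions, not code from this PR, and it covers only the models the PR reports on.

```typescript
// Hypothetical helper (not part of this PR): pick the lowest reasoning_effort
// value that a model is reported to accept, based on the notes in this PR.
type LowestEffort = 'none' | 'minimal';

function lowestReasoningEffort(model: string): LowestEffort | undefined {
  // gpt-5.4 family: 'none' is accepted and 'minimal' is rejected.
  if (model === 'gpt-5.4' || model.startsWith('gpt-5.4-')) return 'none';
  // gpt-5-mini passed the smoketest only with 'minimal'.
  if (model === 'gpt-5-mini') return 'minimal';
  // Unknown model: send nothing and let the API default apply.
  return undefined;
}
```

Returning `undefined` for unknown models keeps the caller from sending a value the API might reject with HTTP 400.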

Model choice — measured (real key, production config)

| model | pass | median | max | avg out tok | notes |
| --- | --- | --- | --- | --- | --- |
| gpt-5.4-mini | 7/7 | ~1.2s | ~1.8s | ~98 | recommended / new default, fastest |
| gpt-5.4-nano | 7/7 | ~1.5s | ~1.7s | ~113 | cheapest |
| gpt-5.4 (full) | 7/7 | ~2.1s | ~4.2s | ~112 | slower, more cautious on ambiguous rules, no quality gain for this task |
| gpt-5-mini | 7/7 (only with `reasoning_effort: minimal`) | ~1.9s | ~4.2s | - | slower + bumpy; not in the options |
| gpt-5-nano / gpt-4.1-mini / gpt-4.1-nano | n/a | - | - | - | 403 (not available to this project) |
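
For readers reproducing the median and max columns, one way to summarise per-call latencies is sketched below. The helper name `summariseLatencies` and the sample values are made up for illustration; the smoketest's actual aggregation code is not shown in this PR.

```typescript
// Hypothetical helper: reduce per-call latencies (in ms) to median and max.
interface LatencySummary {
  medianMs: number;
  maxMs: number;
}

function summariseLatencies(samplesMs: number[]): LatencySummary {
  if (samplesMs.length === 0) throw new Error('no latency samples');
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Odd count: middle element. Even count: mean of the two middle elements.
  const medianMs =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { medianMs, maxMs: sorted[sorted.length - 1] };
}

// Illustrative samples only, not measurements from this PR.
const summary = summariseLatencies([900, 1500, 1000, 1200, 1100]);
```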

The devvit.json openaiModel options are trimmed to the three gpt-5.4 models, with the measured numbers in their labels; the default switches gpt-5.4-nano → gpt-5.4-mini, and the index.ts fallback is updated to match.

smoketest

Default request config now mirrors callOpenAI (reasoning_effort/verbosity/max_completion_tokens), with REASONING_EFFORT/VERBOSITY/MAX_COMPLETION_TOKENS env overrides for experiments (earlier commits added per-call latency + the OPENAI_MODELS=a,b,c comparison table). Tightened one test case to an explicit threshold ("more than 90% of the letters are uppercase") so it doesn't penalise the cautious model for asking — now 7/7 on all three.
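
A minimal sketch of how such env overrides might be resolved is shown below. The function names `resolveConfig` and `parseModels`, and the choice to fall back silently on invalid values, are assumptions for illustration; only the variable names and default values come from this PR.

```typescript
// Hypothetical sketch (not the actual smoketest code): resolve the request
// config from env overrides, falling back to the values callOpenAI uses.
type Env = Record<string, string | undefined>;

interface RequestConfig {
  reasoning_effort: string;
  verbosity: string;
  max_completion_tokens: number;
}

// Defaults mirror the production values set in this PR.
const DEFAULTS: RequestConfig = {
  reasoning_effort: 'none',
  verbosity: 'low',
  max_completion_tokens: 600,
};

function resolveConfig(env: Env): RequestConfig {
  const parsedMax = Number.parseInt(env['MAX_COMPLETION_TOKENS'] ?? '', 10);
  return {
    reasoning_effort: env['REASONING_EFFORT'] || DEFAULTS.reasoning_effort,
    verbosity: env['VERBOSITY'] || DEFAULTS.verbosity,
    // Ignore missing, non-numeric, or non-positive overrides.
    max_completion_tokens:
      Number.isFinite(parsedMax) && parsedMax > 0
        ? parsedMax
        : DEFAULTS.max_completion_tokens,
  };
}

// Split OPENAI_MODELS=a,b,c into a list; use a single fallback model if unset.
function parseModels(env: Env, fallback: string): string[] {
  const models = (env['OPENAI_MODELS'] ?? '')
    .split(',')
    .map((m) => m.trim())
    .filter((m) => m.length > 0);
  return models.length > 0 ? models : [fallback];
}
```

In a Node script these would typically be called with `process.env`; taking the environment as a parameter keeps the functions easy to test.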

On data sharing: the free-daily-usage tier is the OpenAI "share API inputs/outputs" program (the help article you linked). For vibe-mod that's benign — it sends only the moderator's natural-language rule + the system prompt, never Reddit content (hard-lock #6). The shared data is the rule text + the compiled JSON.

Verify

tsc --noEmit / eslint --max-warnings 0 / prettier --check clean · vitest 151 passed (1 skipped) · acceptance 4/4 · doctor 0 hard · CI green
OPENAI_MODELS=gpt-5.4-mini,gpt-5.4-nano,gpt-5.4 npm run openai:smoketest → 7/7 × 3

🤖 Generated with Claude Code

…none, verbosity: low; default gpt-5.4-mini

The single LLM call vibe-mod makes (callOpenAI: NL rule → strict JSON, or a clarification) is mechanical translation, not reasoning. Configure it as such:

- reasoning_effort: 'none'. No hidden reasoning; fast (~1.2–1.4s) and keeps the token budget from being eaten by reasoning. NB: 'none' is the gpt-5.4 family's value; the gpt-5.0/5.1-era 'minimal' is rejected by 5.4 ("Supported values are: 'none','low','medium','high','xhigh'"), and gpt-5-mini wants 'minimal', so this is 5.4-family-specific, which is fine since the model options are restricted to that family.
- verbosity: 'low'. Terse JSON, no commentary.
- max_completion_tokens: 600 (down from 700). A compiled rule + a clarification fit comfortably; observed worst case ~150 out tokens.
- still no `temperature` (gpt-5.x only accepts the default).

devvit.json openaiModel options trimmed/relabelled to the three viable picks with measured numbers, default switched gpt-5.4-nano → gpt-5.4-mini:

  gpt-5.4-mini  7/7  median ~1.2s  max ~1.8s  ← recommended, fastest
  gpt-5.4-nano  7/7  median ~1.5s  max ~1.7s  ← cheapest
  gpt-5.4       7/7  median ~2.1s  max ~4.2s  ← full; slower, more cautious on ambiguous rules, no quality gain

(index.ts fallback model also updated nano → mini.)

smoketest: default request config now mirrors callOpenAI (reasoning_effort/verbosity/max_completion_tokens), with REASONING_EFFORT/VERBOSITY/MAX_COMPLETION_TOKENS env overrides for experiments; added per-call latency and the OPENAI_MODELS=a,b,c comparison table in earlier commits. Tightened one test case to an explicit threshold ("more than 90% of the letters are uppercase") so it doesn't penalise the more-cautious model for asking; now 7/7 on all three.

tsc/lint/format/tests(152)/acceptance(4/4) all green; smoke test 7/7 × 3 models.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector · 2026-05-12T10:55:15Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

coderabbitai · 2026-05-12T10:55:17Z

Warning

Rate limit exceeded

@ComBba has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 51 minutes and 47 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 3f5fa4d and 301bae3.

📒 Files selected for processing (3)

devvit.json
scripts/openai-smoketest.ts
src/server/index.ts

gemini-code-assist

Code Review

This pull request updates the default OpenAI model to gpt-5.4-mini and introduces specific tuning parameters—reasoning_effort, verbosity, and a reduced max_completion_tokens—to optimize for speed and cost. The smoketest script was also updated to allow experimentation via environment variables. Feedback was provided regarding the hardcoding of these new parameters in the production API call, as it may cause errors for older models or non-standard API implementations; a conditional approach is recommended to ensure backward compatibility and API stability.

gemini-code-assist · 2026-05-12T10:56:38Z

    body: JSON.stringify({
-      model,
+      model, // gpt-5.4-mini (default) / gpt-5.4-nano / gpt-5.4 — see devvit.json openaiModel
      response_format: { type: 'json_object' },
      messages,
-      // Newer OpenAI models (gpt-5.x family) require max_completion_tokens (not max_tokens)
-      // and only accept the default temperature, so we don't send `temperature`. Determinism
-      // is carried by response_format: json_object + the strict prompt + few-shot examples.
-      max_completion_tokens: 700,
+      // Tuned for what this call is: a mechanical NL → strict-JSON translation.
+      //   reasoning_effort: 'none'  — no hidden reasoning needed; keeps it fast and stops the
+      //                               token budget being eaten by reasoning (gpt-5.4 family value;
+      //                               older models call this 'minimal'). Measured ~1.1–1.4s.
+      //   verbosity: 'low'          — terse JSON, no commentary.
+      //   max_completion_tokens     — a compiled rule + a clarification fit well under 600.
+      //   (no `temperature` — the gpt-5.x family only accepts the default; max_tokens isn't
+      //    supported on these models, use max_completion_tokens.)
+      reasoning_effort: 'none',
+      verbosity: 'low',
+      max_completion_tokens: 600,
    }),


Hardcoding reasoning_effort and verbosity in the request body can lead to API errors (HTTP 400) if the model stored in the user's settings does not support these parameters. This is a significant risk for existing installations where an older model (like gpt-4o-mini or gpt-5-mini) might still be configured in the subreddit settings.

Additionally, verbosity is not a standard parameter in the public OpenAI Chat Completions API. While reasoning_effort is a valid parameter for reasoning models (like the o1/o3 family), verbosity appears to be non-standard and may cause errors if the API does not recognize it.

It is safer to construct the request body conditionally, ensuring these parameters are only sent to models known to support them (the gpt-5.4 family in this context), similar to the defensive implementation in the smoketest script.

    body: JSON.stringify({
      model, // gpt-5.4-mini (default) / gpt-5.4-nano / gpt-5.4 — see devvit.json openaiModel
      response_format: { type: 'json_object' },
      messages,
      // Tuned for what this call is: a mechanical NL → strict-JSON translation.
      //   reasoning_effort: 'none'  — no hidden reasoning needed; keeps it fast and stops the
      //                               token budget being eaten by reasoning (gpt-5.4 family value;
      //                               older models call this 'minimal'). Measured ~1.1–1.4s.
      //   verbosity: 'low'          — terse JSON, no commentary.
      //   max_completion_tokens     — a compiled rule + a clarification fit well under 600.
      //   (no `temperature` — the gpt-5.x family only accepts the default; max_tokens isn't
      //    supported on these models, use max_completion_tokens.)
      max_completion_tokens: 600,
      ...(model.startsWith('gpt-5.4') ? { reasoning_effort: 'none', verbosity: 'low' } : {}),
    }),
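
The conditional construction recommended in this review can also be pulled into a standalone function so that it can be unit-tested without a network call. The sketch below is an illustration under assumptions: the names `buildCompileRequestBody` and `ChatMessage` are hypothetical and are not taken from the repository.

```typescript
// Hypothetical standalone version of the reviewer's suggestion.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function buildCompileRequestBody(
  model: string,
  messages: ChatMessage[],
): Record<string, unknown> {
  // Per this PR, only the gpt-5.4 family is known to accept these two values,
  // so they are omitted for any other configured model.
  const tuning = model.startsWith('gpt-5.4')
    ? { reasoning_effort: 'none', verbosity: 'low' }
    : {};
  return {
    model,
    response_format: { type: 'json_object' },
    messages,
    max_completion_tokens: 600,
    ...tuning,
  };
}
```

An existing installation that still has an older model configured would then send only `model`, `response_format`, `messages`, and `max_completion_tokens`.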

ComBba merged commit 23c8b1b into main May 12, 2026
2 checks passed

ComBba deleted the feat/openai-reasoning-config branch May 12, 2026 10:56

ComBba mentioned this pull request May 12, 2026

docs(HANDOFF): refresh OpenAI model/cost/state (gpt-5.4-mini default, reasoning_effort: none) #12

Merged

ComBba added a commit that referenced this pull request May 15, 2026

Merge pull request #11 from Two-Weeks-Team/feat/openai-reasoning-config

9b34988

feat(openai): tune compile call (reasoning_effort: none, verbosity: low), default gpt-5.4-mini
