fix: address external review feedback (title-caps fact, testkit ZSet fidelity, mock-leak, doctor semver, replay seeding)#6
Conversation
…t / codex) - starter rule r_shouting_title evaluated the *body* uppercase ratio, not the title — add a `content.title.upperCaseRatio` fact (rule-schema FactPaths + fact-bag computation; system-prompt picks it up automatically) and point the rule at it. New test covers "shouty body, calm title => no match". (codex P2) - devvit-testkit: zAdd no longer allows duplicate ZSet members (real ZADD updates the existing member's score); zRange honours `reverse` with `by: 'score'` and the branch structure is simplified. (gemini) - devvit-doctor: node engine comparison now covers major.minor.patch. (gemini) - system-prompt.test: clarification examples must have a non-empty `suggestedAnswers`, not just an array. (coderabbit) - test/setup beforeEach: clearAllMocks() doesn't reset mock *implementations* — mockReset() getPostById/getCommentById and re-establish every reddit double's default impl so a per-test mockResolvedValue/mockImplementation can't leak. (coderabbit) - replay: fixtures gain `redisHashes` / `redisZsets` (was string keys only) and the runner installs a generic Reddit "thing" stub so rollback/remove/lock/flair flows complete; add fixtures/undo-action.json (rollback round-trip replay). 152 tests pass (1 skipped); tsc/lint/format clean; acceptance 4/4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (14)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request implements several improvements and bug fixes, notably introducing a content.title.upperCaseRatio fact to correctly identify shouty titles independently of the post body. Other changes include enhancing the Redis testkit's zAdd and zRange behavior, refining Node.js version checks in devvit-doctor, and improving test mock isolation by resetting implementations in test/setup.ts. Feedback focuses on the ASCII-only limitation of the uppercase ratio calculation which may impact internationalization, a bug in handling negative indices in the rank-based zRange emulation, and potential state leakage in tests when using shared mock objects for Reddit entities.
| function upperCaseRatioOf(s: string): number { | ||
| const letters = s.replace(/[^A-Za-z]/g, ''); | ||
| return letters.length === 0 ? 0 : (letters.match(/[A-Z]/g)?.length ?? 0) / letters.length; | ||
| } |
There was a problem hiding this comment.
The upperCaseRatioOf function is strictly limited to ASCII letters (A-Z, a-z). This means it will return a ratio of 0 for content written in non-Latin scripts (e.g., Cyrillic, Greek, or CJK), even if the text is in an uppercase equivalent. While this suffices for the current starter rules, it limits the tool's effectiveness for international subreddits. Consider using a Unicode-aware approach (e.g., checking Unicode categories) if global support is intended.
| const end = stop < 0 ? arr.length : stop + 1; | ||
| return arr.slice(start, end); |
There was a problem hiding this comment.
The current logic for handling negative stop indices in the rank-based zRange does not correctly emulate Redis behavior for values other than -1. In Redis, stop is an inclusive index where -1 is the last element and -2 is the second to last. The current implementation treats all negative values as the end of the array, which would make ZRANGE key 0 -2 return the same result as ZRANGE key 0 -1.
| const end = stop < 0 ? arr.length : stop + 1; | |
| return arr.slice(start, end); | |
| const end = stop < 0 ? arr.length + stop + 1 : stop + 1; | |
| return arr.slice(start, end); |
| fakeReddit.getPostById.mockResolvedValue(stubThing); | ||
| fakeReddit.getCommentById.mockResolvedValue(stubThing); |
There was a problem hiding this comment.
Using mockResolvedValue(stubThing) causes all calls to getPostById and getCommentById to return the exact same object instance. If a single replay request interacts with multiple posts or comments, state changes (like removed or flair) on one will leak to all others. It is safer to use mockImplementation to return a fresh clone for each call.
| fakeReddit.getPostById.mockResolvedValue(stubThing); | |
| fakeReddit.getCommentById.mockResolvedValue(stubThing); | |
| fakeReddit.getPostById.mockImplementation(async () => ({ ...stubThing })); | |
| fakeReddit.getCommentById.mockImplementation(async () => ({ ...stubThing })); |
fix: address external review feedback (title-caps fact, testkit ZSet fidelity, mock-leak, doctor semver, replay seeding)
…Text + toast 1-line summary Audit-driven UX quick wins for the compose flow. Full audit + scenario rationale: - claudedocs/2026-05-14-compose-flow-audit.md (10 findings vs README promise matrix) - docs/demo-scenario.md (60s demo, README-aligned, Hypothesis 3) ## What changes (3 audit findings landed) ### Finding #1 (CRITICAL for demo) — Clarify modal renders LLM suggestedAnswers as a dropdown The LLM has *always* returned `{ needsClarification, question, suggestedAnswers: [...] }` (see system-prompt.ts:80-85), but the server discarded `suggestedAnswers` and showed only a free-text textarea. Moderators had to paraphrase the question. Now: when `suggestedAnswers` is present, render a `select` field with one click per option PLUS a `clarificationAnswerOther` paragraph as an override / escape hatch. When the LLM returns no suggestions, fall back to the original free-text path. `compose-rule-submit` defensively unwraps SELECTION-array values (PR #39 pattern) and prefers the "Other" override when non-empty. ### Finding #4 — ban/mute toggle gets explanatory helpText on both forms Mod mental model = "checkbox grants ban permission". Real behaviour = "checkbox lets compile succeed when the LLM emits ban/mute". Added a single shared `ALLOW_GUARDED_HELP` string used by the compose form and the clarify modal so both forms explain the toggle the same way. ### Finding #6 — Success toast carries a 1-line summary + an explicit menu pointer Previously: `Compiled rule "X". Dry-run started — check Dashboard in 30s.` The dashboard menu name is "vibe-mod: View rules + log" and Devvit toasts have no buttons, so "check Dashboard" is unreachable for a first-time mod. Now: `Compiled rule "X". → post: modqueue. Dry-run started — open the subreddit ⋯ menu → "vibe-mod: View rules + log" to see preview.` The 1-line `summarizeRule()` makes the deterministic JSON contract visible without leaving the toast (audit finding #2 alternative — the full preview form stays a post-publish item). ## Verification - `npm run check` 4/4 gates green (typecheck + lint + Prettier + 25 compose tests + 3 @devvit/test + acceptance G1–G4) - 6 new tests in routes-compose.test.ts cover: select rendering, free-text fallback, SELECTION-array unwrap, "Other" override precedence, helpText presence, summary in success toast. ## Why this PR (not bundled with module split) Demo video records this UI. UX changes must land before the README screenshot captures (Phase 3) and the Playwright video automation (Phase 4). Module split (Phase 2) is a separate, independent refactor. Refs: claudedocs/2026-05-14-compose-flow-audit.md (Phase 1.5/1.5b), docs/demo-scenario.md (Phase 2.5b)
Summary
Addresses the still-valid items from the external reviewers on PR #1 / PR #2 (gemini-code-assist, CodeRabbit, Codex).
Fixes
r_shouting_titlestarter rule was checking the wrong thing (Codex P2). It usedcontent.upperCaseRatio, whichbuildPostFactBagderives from the post body — so an ALL-CAPS title on a link post (empty body) never matched, despite the rule's name/sourceNL. Added a new factcontent.title.upperCaseRatio(rule-schemaFactPaths;fact-bagcomputes it from the title via a sharedupperCaseRatioOf; the system prompt picks it up automatically) and pointed the rule at it. New test: "shouty body, calm title ⇒ no match".devvit-testkitZSet fidelity (gemini):zAddno longer keeps duplicate members — real RedisZADDupdates the existing member's score;zRangenow honoursreversewithby: 'score'and the branch structure is simplified.devvit-doctorversion check (gemini): node-engine comparison now covers major.minor.patch, not just major.minor.system-prompt.test(CodeRabbit): clarification examples must have a non-emptysuggestedAnswers, not merely an array.test/setup.tsbeforeEach(CodeRabbit):vi.clearAllMocks()only wipes call history, not implementations — a per-testmockResolvedValue/mockImplementationongetPostById/getCommentByIdwould leak into the next test. NowmockReset()those two and explicitly re-establish every Reddit double's default implementation each run.replayharness: fixtures gainredisHashes/redisZsets(previously only string keys could be seeded), and the runner installs a generic Reddit "thing" stub so rollback/remove/lock/flair flows complete in a replay. Newfixtures/undo-action.jsondemonstrates a rollback round-trip replay.Not changed (with reason)
Verify
npx tsc --noEmit— cleannpx vitest run— 151 passed, 1 skipped (replay-runner)npx eslint . --max-warnings 0/npx prettier --check .— cleannpm run acceptance— 4/4npm run replay fixtures/undo-action.json— shows the rollback succeeding🤖 Generated with Claude Code