Skip to content

test+fix: vitest/route tests, acceptance gates, starter rules, @devvit SDK alignment, CI#1

Merged
ComBba merged 3 commits into
mainfrom
feat/tests-acceptance-starter-rules
May 12, 2026
Merged

test+fix: vitest/route tests, acceptance gates, starter rules, @devvit SDK alignment, CI#1
ComBba merged 3 commits into
mainfrom
feat/tests-acceptance-starter-rules

Conversation

@ComBba
Copy link
Copy Markdown
Contributor

@ComBba ComBba commented May 12, 2026

Summary

Brings vibe-mod from "no tests, doesn't typecheck" to "148 tests passing, tsc --noEmit clean, CI green" — everything that can be done before the manual Devvit-wizard step.

What's in here

Testing layer (was 0 tests → 148)

  • test/setup.ts — vitest global setup: in-memory Redis double; Devvit SDK doubles whose shapes track @devvit/web@0.12.x (getCurrentSubreddit() → {id,name}, getModerators() → Listing, karma lookup → {fromComments,fromPosts}, reddit.report(thing,opts), modMail.createModNotification); an OpenAI fetch double.
  • Unit (6 files): rule-schema, evaluator, executor, fact-bag, system-prompt, starter-rules.
  • Feature-segmented route call-tests (6 files) — drive the real Hono app via app.fetch() with Devvit + OpenAI mocked: routes-compose, routes-dashboard, routes-undo, routes-triggers, routes-scheduler, routes-settings.

@devvit/web SDK alignment (npx tsc --noEmit: ~45 errors → 0)

  • src/server/devvit-helpers.tsgetCurrentSubreddit{Name,Ref}, asT1/asT3 branded-id casts.
  • reddit.getCurrentSubredditName()reddit.getCurrentSubreddit().name; getPostById/getCommentById take t3_/t1_ ids; thing.report()reddit.report(thing,{reason}); getModerators()/Listing.all(); getUserKarmaFromCurrentSubreddit(name){fromComments,fromPosts}; modMail.createmodMail.createModNotification(...); no redis.zCount → count a last-hour score window via zRange; no TaskRequest/TaskResponse in @devvit/web/shared → local TaskBody/TaskAck.

Bug fixes (found while writing tests)

  • Audit/rollback/breaker key scopingexecutor.ts wrote audit, rollback:…, circuit:open un-scoped while the Dashboard, Undo and rate-limit-breaker handlers read ${sub}:audit, ${sub}:audit:${id}, ${sub}:circuit:open. Dashboard/Undo therefore never found real audit entries and the per-sub breaker checked the wrong key. All audit/rollback/breaker keys are now sub-scoped consistently (circuit:beta_freeze stays global by design).
  • Orphaned form — removed devvit.json's activateRuleForm → /internal/form/activate-rule (no such route; activation goes through dashboardForm). Caught by acceptance G1.

Starter rules

  • src/shared/starter-rules.ts — 5 conservative seed rules (SAFE actions only, shadow:true). onAppInstall seeds them as a draft bundle; the mod activates from the Dashboard.

Tooling

  • scripts/acceptance.tsnpm run acceptance: G1 app wiring · G2 compiler↔schema · G3 rollback/dry-run/log · G4 hardening (starter rules + tsc --noEmit + full test suite). 4/4.
  • .github/workflows/ci.yml — push/PR: install → tsc --noEmitnpm testnpm run acceptance.

Not in scope (needs the Devvit wizard / npm run dev)

  • Live playtest gates (Compose menu renders, real OpenAI compile round-trips, undo round-trip) — see the MANUAL section in npm run acceptance output.
  • .devvit-app-id (wizard) + devvit build.
  • ToS/Privacy HTML export + gallery push.

Verify

  • npx tsc --noEmit — clean · npx vitest run — 148 passed (12 files) · npm run acceptance — 4/4 · CI green.

🤖 Generated with Claude Code

- test/setup.ts: in-memory Redis + Devvit SDK mocks for vitest
- 6 test files (95 tests): rule-schema, evaluator, executor, fact-bag,
  system-prompt, starter-rules — all passing
- src/shared/starter-rules.ts: 5 conservative seed rules (SAFE actions only,
  shadow:true); wired into onAppInstall trigger as a draft bundle
- scripts/acceptance.ts: `npm run acceptance` — Day1-Day4 exit gates encoded
  as config<->code consistency + schema + unit-suite checks (4/4 passing)
- devvit.json: remove orphaned `activateRuleForm` form entry (no route in
  index.ts; activation flows through dashboardForm) — caught by acceptance G1
- HANDOFF.md: update artifact inventory

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

Walkthrough

PR은 vibe-mod Devvit 모드 도구에 Day 1 수락 게이트 검증 인프라와 5개 기본 규칙 시드 번들을 추가합니다. 새로운 acceptance.ts 스크립트가 4개 게이트 검증(설정↔코드 일관성, OpenAI 정렬, 운영 배선, 강화)을 자동으로 실행하고, 테스트 이중화와 종합 테스트 스위트(evaluator, executor, fact-bag, rule-schema, starter-rules, system-prompt)를 포함하며, 앱 설치 시 기본 규칙을 초안으로 분배합니다.

Changes

Day 1 Acceptance Gates and Starter Rules

Layer / File(s) Summary
Test infrastructure with fake Redis and Devvit SDK doubles
test/setup.ts
Web Crypto 검증, 인메모리 FakeRedis/FakeTxn 인터페이스 및 구현, fakeRedis/fakeReddit/fakeSettings/fakeScheduler 내보내기, Vitest 모의 함수 등록, 각 테스트 전 상태 초기화.
Rule schema and system prompt validation tests
src/shared/rule-schema.test.ts, src/shared/system-prompt.test.ts
동작 동사(SAFE_ACTIONS, GUARDED_ACTIONS) 분리성 및 합집합, FactPaths 중복 검사, Rule.parse/RuleBundle.parse 스키마 검증, checkTreeDepth 깊이 제한, 시스템 프롬프트의 모든 사실/동작 포함, FEW_SHOT_EXAMPLES 일관성 및 품질.
Starter rules feature: seeding logic and validation
src/shared/starter-rules.ts, src/shared/starter-rules.test.ts
STARTER_RULE_SEEDS 5개 규칙 정의(보수적 조건, SAFE_ACTIONS만), seedStarterRules(now) 함수 및 RuleBundleType 생성, 결정론적 출력, STARTER_RULE_IDS 내보내기, 번들 무결성/타임스탬프/플래그 검증, 신규 계정/낮은 카르마/모두 대문자/URL 단축기 시나리오 매칭 테스트.
Core server logic test suites: evaluator, executor, fact-bag
src/server/evaluator.test.ts, src/server/executor.test.ts, src/server/fact-bag.test.ts
evaluatePredicate 연산자(==, !=, <, >, in, contains, matches) 및 composite(all, any, not) 의미론, selectMatchingRules 필터링/순서 보존, executeActions 단락(그림자/건조/회로차단/속도 제한)/적용/오류 결과, rollbackAction 토큰 소비/감사, fact-bag 폐쇄 키 집합/캐싱 동작.
Server bootstrap: seed starter rules on install
src/server/index.ts
seedStarterRules import 추가, onAppInstall에서 active 번들을 비운 후, rules:draft 키 없을 때만 기본 규칙 시드로 초안 분배, 기존 초안 덮어쓰기 방지.
4-gate acceptance check framework
scripts/acceptance.ts
Day 1–Day 4 종료 조건 자동 검증: Gate 1 설정↔코드 일관성(권한/메뉴/폼/경로), Gate 2 OpenAI 정렬(프롬프트 사실/동작, json_object), Gate 3 운영 배선(롤백/스케줄러/대시보드), Gate 4 강화(기본 규칙, 시드 호출, 문서, vitest). tsc --noEmit 정보성. 실패 시 수동 플레이테스트 요구사항 출력 및 exit(1).
Config cleanup and project documentation updates
devvit.json, HANDOFF.md
devvit.json에서 orphaned activateRuleForm 제거. HANDOFF.md에 starter-rules.ts, acceptance.ts, test/setup.ts, 테스트 커버리지, Day 1 완료 및 남은 플레이테스트/tsc 검증 항목 추가.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 다섯 규칙이 새로 심어져,
그림자 속에서 조용히 자라,
테스트는 모두 완벽하게 깔맞춤,
수락 게이트가 울리니 *ding*,
Day 1 임무 완료! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed 제목은 PR의 주요 변경사항(vitest 테스트, acceptance 게이트, starter rules, SDK 정렬, CI)을 명확하게 요약하며 구체적이고 간결합니다.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/tests-acceptance-starter-rules

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive testing suite and an automated acceptance script to verify the application's state against defined 'exit gates'. Key additions include unit tests for the evaluator, executor, and fact-bag components, along with a global Vitest setup that mocks the Devvit SDK and Redis. Additionally, the PR implements a starter rule seeding mechanism triggered on app installation. Feedback was provided regarding the Redis mock's zAdd implementation, which currently allows duplicate members, potentially leading to inconsistent behavior in future tests.

Comment thread test/setup.ts
Comment on lines +78 to +82
zAdd: async (k: string, entry: { member: string; score: number }) => {
const arr = zsets.get(k) ?? [];
arr.push(entry);
zsets.set(k, arr);
},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current zAdd implementation in the Redis mock allows duplicate members in the set, which deviates from standard Redis behavior where ZADD updates the score of an existing member. While this doesn't affect the current tests (which use unique IDs), it could lead to incorrect results in future tests that attempt to update a member's score or rely on member uniqueness.

Suggested change
zAdd: async (k: string, entry: { member: string; score: number }) => {
const arr = zsets.get(k) ?? [];
arr.push(entry);
zsets.set(k, arr);
},
zAdd: async (k: string, entry: { member: string; score: number }) => {
const arr = (zsets.get(k) ?? []).filter((e) => e.member !== entry.member);
arr.push(entry);
zsets.set(k, arr);
},

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1deaa1fba7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

on: ['onPostSubmit'],
when: {
all: [
{ fact: 'content.upperCaseRatio', op: 'gt', value: 0.7 },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Match all-caps titles using title content, not body ratio

Update this predicate to evaluate the title itself instead of content.upperCaseRatio. In buildPostFactBag() (src/server/fact-bag.ts), content.upperCaseRatio is derived from post.selftext (body), so this starter rule misses the common case of an ALL-CAPS title on a link post or a post with empty/normal body text. That makes the seeded “shouting title” rule materially less effective than its name and sourceNL promise.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/server/index.ts (1)

527-538: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

재설치 시 ACTIVE 번들을 무조건 초기화해 운영 규칙이 사라질 수 있습니다.

Line 537에서 항상 :rules:active를 빈 번들로 덮어써서, 설치 트리거가 다시 실행되면 기존 활성 규칙이 즉시 비활성화됩니다. draft는 보호하면서 active는 보호하지 않아 데이터 손실 위험이 큽니다.

🔧 제안 수정
   const emptyActive: RuleBundleType = {
     schemaVersion: '1.0.0',
     bundleVersion: 1,
     compiledAt: Date.now(),
     llmModel: 'seed',
     llmTokensIn: 0,
     llmTokensOut: 0,
     rules: [],
   };
-  await redis.set(`${subredditName}:rules:active`, JSON.stringify(emptyActive));
+  const activeKey = `${subredditName}:rules:active`;
+  if (!(await redis.get(activeKey))) {
+    await redis.set(activeKey, JSON.stringify(emptyActive));
+  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/server/index.ts` around lines 527 - 538, The current code always
overwrites the active bundle by calling
redis.set(`${subredditName}:rules:active`, JSON.stringify(emptyActive)), which
erases existing runtime rules on reinstall; change this to preserve an existing
active bundle by checking for the key first (e.g., use redis.exists/GET or a
SETNX/SET with NX option) and only write the emptyActive when no active bundle
exists, referring to the symbols emptyActive, redis.set (replace with SETNX or
conditional set), and the `${subredditName}:rules:active` key so that draft
protection remains but active bundles are not unintentionally cleared.
🧹 Nitpick comments (3)
scripts/acceptance.ts (3)

101-101: ⚡ Quick win

Action verb 부분 문자열 매칭이 false positive를 유발할 수 있습니다.

VIBE_MOD_SYSTEM_PROMPT.includes(a)는 action verb가 다른 단어의 일부로 나타나도 통과합니다. 예를 들어, "lock""unlock"에 포함되어도 통과합니다.

♻️ 단어 경계를 사용한 개선 제안
-        for (const a of [...SAFE_ACTIONS, ...GUARDED_ACTIONS]) assert(VIBE_MOD_SYSTEM_PROMPT.includes(a), `system prompt is missing action verb ${a}`);
+        for (const a of [...SAFE_ACTIONS, ...GUARDED_ACTIONS]) {
+          const pattern = new RegExp(`\\b${a.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\b`);
+          assert(pattern.test(VIBE_MOD_SYSTEM_PROMPT), `system prompt is missing action verb ${a}`);
+        }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/acceptance.ts` at line 101, The current check uses
VIBE_MOD_SYSTEM_PROMPT.includes(a) which can false-positive match substrings
(e.g., "lock" in "unlock"); change the assertion to verify whole-word matches by
either testing a word-boundary regex for each action (e.g., new
RegExp(`\\b${escapeRegex(a)}\\b`) against VIBE_MOD_SYSTEM_PROMPT) or by
tokenizing VIBE_MOD_SYSTEM_PROMPT with a non-word split (e.g., split on /\W+/)
and checking the resulting array contains the action; update the loop that
iterates SAFE_ACTIONS and GUARDED_ACTIONS to use that exact-word matching and
ensure you escape special regex characters when building the RegExp (symbols:
SAFE_ACTIONS, GUARDED_ACTIONS, VIBE_MOD_SYSTEM_PROMPT).

95-95: ⚡ Quick win

Fact path 부분 문자열 매칭이 false positive를 유발할 수 있습니다.

VIBE_MOD_SYSTEM_PROMPT.includes(f)는 fact path가 다른 단어의 일부로 나타나도 통과합니다. 예를 들어, fact path가 "age"이고 프롬프트에 "message"만 있어도 통과합니다.

♻️ 단어 경계를 사용한 개선 제안
-        for (const f of FactPaths) assert(VIBE_MOD_SYSTEM_PROMPT.includes(f), `system prompt is missing fact path ${f}`);
+        for (const f of FactPaths) {
+          const pattern = new RegExp(`\\b${f.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')}\\b`);
+          assert(pattern.test(VIBE_MOD_SYSTEM_PROMPT), `system prompt is missing fact path ${f}`);
+        }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/acceptance.ts` at line 95, 현재 for-loop check using
VIBE_MOD_SYSTEM_PROMPT.includes(f) causes false positives because substring
matches (e.g., "age" in "message"); update the check to assert whole-word
matches instead: for each entry in FactPaths validate presence in
VIBE_MOD_SYSTEM_PROMPT using a word-boundary match (e.g., construct a regex with
\b around the fact path) or split the prompt into tokens and test exact
equality; update the assertion message to remain `system prompt is missing fact
path ${f}` and keep the loop over FactPaths and the VIBE_MOD_SYSTEM_PROMPT
identifier.

176-176: ⚡ Quick win

onAppInstall 체크가 주석 내 참조도 통과시킬 수 있습니다.

indexTs.includes('seedStarterRules(')는 주석이나 문자열 리터럴에 포함된 경우도 통과합니다. 실제로 함수가 호출되는지 확인하려면 더 정확한 검증이 필요합니다.

♻️ AST 기반 검증 제안

실제 함수 호출을 확인하려면:

-      run: () => assert(indexTs.includes('seedStarterRules('), 'on-app-install does not call seedStarterRules()'),
+      run: () => {
+        const callPattern = /(?:^|[^/\*])\s*seedStarterRules\s*\(/m;
+        assert(callPattern.test(indexTs), 'on-app-install does not call seedStarterRules()');
+      },

또는 주석을 제외한 검증 스크립트 추가를 고려하세요.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/acceptance.ts` at line 176, The current check uses
indexTs.includes('seedStarterRules(') which can be fooled by comments or string
literals; instead parse indexTs into an AST and verify there is an actual
CallExpression invoking seedStarterRules (look for a CallExpression with callee
Identifier named "seedStarterRules" or equivalent MemberExpression) before
asserting in the run function; update the validation in the run closure to parse
indexTs (e.g., with TypeScript or Babel parser) and search AST nodes for a real
function call to seedStarterRules rather than a plain substring match.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/shared/system-prompt.test.ts`:
- Around line 64-69: The test for clarification examples currently only asserts
Array.isArray(ex.assistant.suggestedAnswers) which allows an empty array; update
the assertion in the test that iterates FEW_SHOT_EXAMPLES (the block using
FEW_SHOT_EXAMPLES and ex.assistant) to also assert that suggestedAnswers has at
least one entry (e.g., check ex.assistant.suggestedAnswers.length > 0) while
keeping the existing non-empty question check; ensure you reference
ex.assistant.suggestedAnswers (and handle undefined defensively if needed) so
the test enforces both a question and one or more suggested answers.

In `@test/setup.ts`:
- Around line 159-170: The beforeEach currently calls vi.clearAllMocks(), which
only clears call history but preserves mock implementations and can leak
behavior between tests; replace (or add) a call to vi.resetAllMocks() so
implementations are reset, and then re-establish the default mocks
(fakeReddit.getCurrentSubredditName, fakeReddit.getCurrentUser,
fakeReddit.getUserByUsername, fakeReddit.getUserKarmaFromCurrentSubreddit,
fakeReddit.getModerators, fakeSettings.get) after the reset; ensure you still
clear fakeRedis.store/hashes/zsets and reapply those fakeReddit/fakeSettings
mockResolvedValue assignments inside beforeEach so each test starts with a
clean, consistent mock state.

---

Outside diff comments:
In `@src/server/index.ts`:
- Around line 527-538: The current code always overwrites the active bundle by
calling redis.set(`${subredditName}:rules:active`, JSON.stringify(emptyActive)),
which erases existing runtime rules on reinstall; change this to preserve an
existing active bundle by checking for the key first (e.g., use redis.exists/GET
or a SETNX/SET with NX option) and only write the emptyActive when no active
bundle exists, referring to the symbols emptyActive, redis.set (replace with
SETNX or conditional set), and the `${subredditName}:rules:active` key so that
draft protection remains but active bundles are not unintentionally cleared.

---

Nitpick comments:
In `@scripts/acceptance.ts`:
- Line 101: The current check uses VIBE_MOD_SYSTEM_PROMPT.includes(a) which can
false-positive match substrings (e.g., "lock" in "unlock"); change the assertion
to verify whole-word matches by either testing a word-boundary regex for each
action (e.g., new RegExp(`\\b${escapeRegex(a)}\\b`) against
VIBE_MOD_SYSTEM_PROMPT) or by tokenizing VIBE_MOD_SYSTEM_PROMPT with a non-word
split (e.g., split on /\W+/) and checking the resulting array contains the
action; update the loop that iterates SAFE_ACTIONS and GUARDED_ACTIONS to use
that exact-word matching and ensure you escape special regex characters when
building the RegExp (symbols: SAFE_ACTIONS, GUARDED_ACTIONS,
VIBE_MOD_SYSTEM_PROMPT).
- Line 95: 현재 for-loop check using VIBE_MOD_SYSTEM_PROMPT.includes(f) causes
false positives because substring matches (e.g., "age" in "message"); update the
check to assert whole-word matches instead: for each entry in FactPaths validate
presence in VIBE_MOD_SYSTEM_PROMPT using a word-boundary match (e.g., construct
a regex with \b around the fact path) or split the prompt into tokens and test
exact equality; update the assertion message to remain `system prompt is missing
fact path ${f}` and keep the loop over FactPaths and the VIBE_MOD_SYSTEM_PROMPT
identifier.
- Line 176: The current check uses indexTs.includes('seedStarterRules(') which
can be fooled by comments or string literals; instead parse indexTs into an AST
and verify there is an actual CallExpression invoking seedStarterRules (look for
a CallExpression with callee Identifier named "seedStarterRules" or equivalent
MemberExpression) before asserting in the run function; update the validation in
the run closure to parse indexTs (e.g., with TypeScript or Babel parser) and
search AST nodes for a real function call to seedStarterRules rather than a
plain substring match.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 608bf06e-46c2-45bb-b45d-eb956af19946

📥 Commits

Reviewing files that changed from the base of the PR and between f9ce6fc and 1deaa1f.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (12)
  • HANDOFF.md
  • devvit.json
  • scripts/acceptance.ts
  • src/server/evaluator.test.ts
  • src/server/executor.test.ts
  • src/server/fact-bag.test.ts
  • src/server/index.ts
  • src/shared/rule-schema.test.ts
  • src/shared/starter-rules.test.ts
  • src/shared/starter-rules.ts
  • src/shared/system-prompt.test.ts
  • test/setup.ts
💤 Files with no reviewable changes (1)
  • devvit.json

Comment on lines +64 to +69
it('clarification examples carry a question and suggested answers', () => {
for (const ex of FEW_SHOT_EXAMPLES) {
if (!('needsClarification' in ex.assistant)) continue;
expect((ex.assistant.question ?? '').length).toBeGreaterThan(0);
expect(Array.isArray(ex.assistant.suggestedAnswers)).toBe(true);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

clarification 예시에서 빈 suggestedAnswers가 통과됩니다.

Line 68은 배열 여부만 검증해서 []도 통과합니다. 테스트 의도(질문 + 제안 답변 제공)를 보장하려면 최소 1개 이상도 함께 확인해야 합니다.

🔧 제안 수정
   it('clarification examples carry a question and suggested answers', () => {
     for (const ex of FEW_SHOT_EXAMPLES) {
       if (!('needsClarification' in ex.assistant)) continue;
       expect((ex.assistant.question ?? '').length).toBeGreaterThan(0);
       expect(Array.isArray(ex.assistant.suggestedAnswers)).toBe(true);
+      expect((ex.assistant.suggestedAnswers ?? []).length).toBeGreaterThan(0);
     }
   });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
it('clarification examples carry a question and suggested answers', () => {
for (const ex of FEW_SHOT_EXAMPLES) {
if (!('needsClarification' in ex.assistant)) continue;
expect((ex.assistant.question ?? '').length).toBeGreaterThan(0);
expect(Array.isArray(ex.assistant.suggestedAnswers)).toBe(true);
}
it('clarification examples carry a question and suggested answers', () => {
for (const ex of FEW_SHOT_EXAMPLES) {
if (!('needsClarification' in ex.assistant)) continue;
expect((ex.assistant.question ?? '').length).toBeGreaterThan(0);
expect(Array.isArray(ex.assistant.suggestedAnswers)).toBe(true);
expect((ex.assistant.suggestedAnswers ?? []).length).toBeGreaterThan(0);
}
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/shared/system-prompt.test.ts` around lines 64 - 69, The test for
clarification examples currently only asserts
Array.isArray(ex.assistant.suggestedAnswers) which allows an empty array; update
the assertion in the test that iterates FEW_SHOT_EXAMPLES (the block using
FEW_SHOT_EXAMPLES and ex.assistant) to also assert that suggestedAnswers has at
least one entry (e.g., check ex.assistant.suggestedAnswers.length > 0) while
keeping the existing non-empty question check; ensure you reference
ex.assistant.suggestedAnswers (and handle undefined defensively if needed) so
the test enforces both a question and one or more suggested answers.

Comment thread test/setup.ts
Comment on lines +159 to +170
beforeEach(() => {
fakeRedis.store.clear();
fakeRedis.hashes.clear();
fakeRedis.zsets.clear();
vi.clearAllMocks();
fakeReddit.getCurrentSubredditName.mockResolvedValue('testsub');
fakeReddit.getCurrentUser.mockResolvedValue({ id: 't2_caller', username: 'caller' });
fakeReddit.getUserByUsername.mockResolvedValue(null);
fakeReddit.getUserKarmaFromCurrentSubreddit.mockResolvedValue(0);
fakeReddit.getModerators.mockResolvedValue([]);
fakeSettings.get.mockResolvedValue(undefined);
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

beforeEach 격리 의도와 다르게 mock 구현이 누수될 수 있습니다.

Line 159에서 vi.clearAllMocks()만 사용하면 호출 이력만 초기화되고 구현은 남습니다. 테스트에서 특정 mock 구현을 바꾼 경우 다음 케이스로 전파될 수 있어 flaky 원인이 됩니다.

🔧 제안 수정
 beforeEach(() => {
   fakeRedis.store.clear();
   fakeRedis.hashes.clear();
   fakeRedis.zsets.clear();
-  vi.clearAllMocks();
+  vi.resetAllMocks();
   fakeReddit.getCurrentSubredditName.mockResolvedValue('testsub');
   fakeReddit.getCurrentUser.mockResolvedValue({ id: 't2_caller', username: 'caller' });
   fakeReddit.getUserByUsername.mockResolvedValue(null);
   fakeReddit.getUserKarmaFromCurrentSubreddit.mockResolvedValue(0);
   fakeReddit.getModerators.mockResolvedValue([]);
   fakeSettings.get.mockResolvedValue(undefined);
 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
beforeEach(() => {
fakeRedis.store.clear();
fakeRedis.hashes.clear();
fakeRedis.zsets.clear();
vi.clearAllMocks();
fakeReddit.getCurrentSubredditName.mockResolvedValue('testsub');
fakeReddit.getCurrentUser.mockResolvedValue({ id: 't2_caller', username: 'caller' });
fakeReddit.getUserByUsername.mockResolvedValue(null);
fakeReddit.getUserKarmaFromCurrentSubreddit.mockResolvedValue(0);
fakeReddit.getModerators.mockResolvedValue([]);
fakeSettings.get.mockResolvedValue(undefined);
});
beforeEach(() => {
fakeRedis.store.clear();
fakeRedis.hashes.clear();
fakeRedis.zsets.clear();
vi.resetAllMocks();
fakeReddit.getCurrentSubredditName.mockResolvedValue('testsub');
fakeReddit.getCurrentUser.mockResolvedValue({ id: 't2_caller', username: 'caller' });
fakeReddit.getUserByUsername.mockResolvedValue(null);
fakeReddit.getUserKarmaFromCurrentSubreddit.mockResolvedValue(0);
fakeReddit.getModerators.mockResolvedValue([]);
fakeSettings.get.mockResolvedValue(undefined);
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/setup.ts` around lines 159 - 170, The beforeEach currently calls
vi.clearAllMocks(), which only clears call history but preserves mock
implementations and can leak behavior between tests; replace (or add) a call to
vi.resetAllMocks() so implementations are reset, and then re-establish the
default mocks (fakeReddit.getCurrentSubredditName, fakeReddit.getCurrentUser,
fakeReddit.getUserByUsername, fakeReddit.getUserKarmaFromCurrentSubreddit,
fakeReddit.getModerators, fakeSettings.get) after the reset; ensure you still
clear fakeRedis.store/hashes/zsets and reapply those fakeReddit/fakeSettings
mockResolvedValue assignments inside beforeEach so each test starts with a
clean, consistent mock state.

SDK adaptation (npx tsc --noEmit now passes — was ~45 errors):
- src/server/devvit-helpers.ts: getCurrentSubreddit{Name,Ref}, asT1/asT3 casts
- reddit.getCurrentSubredditName() -> reddit.getCurrentSubreddit().name
- getPostById/getCommentById take t3_/t1_ branded ids
- thing.report() -> reddit.report(thing, { reason })
- getModerators()/Listing -> .all()
- getUserKarmaFromCurrentSubreddit(name) -> { fromComments, fromPosts }
- modMail.create -> modMail.createModNotification({ subject, bodyMarkdown, subredditId })
- redis has no zCount -> count a last-hour score window via zRange
- @devvit/web/shared has no TaskRequest/TaskResponse -> local TaskBody/TaskAck

Bug fix: executor wrote audit/rollback/breaker keys un-scoped while the
Dashboard, Undo and circuit-breaker handlers read `${sub}:audit` etc — so
Dashboard/Undo never found real audit entries, and the per-sub breaker was
checked against the wrong key. All audit/rollback/breaker keys are now
sub-scoped consistently.

Tests: 148 passing (was 95). Added 6 feature-segmented route test files that
drive the real Hono app via app.fetch() with Devvit + OpenAI mocked:
routes-compose, routes-dashboard, routes-undo, routes-triggers,
routes-scheduler, routes-settings. test/setup.ts gains an in-memory Redis,
Listing/Reddit doubles matching the real SDK, and an OpenAI fetch double.

CI: .github/workflows/ci.yml runs npm ci -> tsc --noEmit -> npm test ->
npm run acceptance on push/PR. Acceptance G4 now includes the typecheck gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

`npm ci` errors with EBADPLATFORM on esbuild's per-platform optional deps
(@esbuild/aix-ppc64 etc.) against the committed lockfile on linux runners.
`npm install` honours an in-sync lockfile and skips non-matching optional
packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@ComBba ComBba changed the title test: vitest suite + acceptance gates + 5 starter rules test+fix: vitest/route tests, acceptance gates, starter rules, @devvit SDK alignment, CI May 12, 2026
@ComBba ComBba merged commit 385475f into main May 12, 2026
1 check passed
@ComBba ComBba deleted the feat/tests-acceptance-starter-rules branch May 12, 2026 07:34
ComBba added a commit that referenced this pull request May 14, 2026
feat(compose): clarify select + ban/mute helpText + toast summary (audit findings #1, #4, #6)
ComBba pushed a commit that referenced this pull request May 14, 2026
External code-review pointed out 5 valid issues. All addressed in this commit
on the same branch — PR #44 picks them up automatically.

## Gemini code-assist

### #1 — Manage menu: parallelize per-rule dry-run reads
The previous serial `for` loop multiplied Redis latency by the draft count.
With up to 50 drafts and ~10ms per round-trip the difference is ~500ms vs
~10ms wall time. Switched to `Promise.allSettled([...])` so a single failed
fetch doesn't stop the others.

### #2 — applyManageActions: enforce the 50-rule cap on apply
pause / activate / unpause moves can push a bundle over the 50-rule cap
even when no compose-rule call is made. Added an explicit check before
the dual-write txn — the moderator gets a "Rule cap exceeded (active=N,
draft=N, max=50). Delete a rule first" toast and the bundles stay
untouched. New test in routes-manage.test.ts.

### #3 — Use the configured openaiModel as the fallback (not the literal "manage")
The previous `'manage'` fallback for `bundle.llmModel` broke
`estimateTokenCost()` because 'manage' isn't a key in
`OPENAI_PRICING_USD_PER_TOKEN`. The dashboard shows "(~$0 on manage)" until
a real compile lands. Now we read the configured `openaiModel` setting
through `readOpenaiModel()` (already SELECTION-array-safe per PR #39).

### #4 — Atomic dual-write via WATCH/MULTI/EXEC (the previous PR description's claim was wrong)
The previous `Promise.all([redis.set(activeKey, ...), redis.set(draftKey, ...)])`
was NOT atomic — if the second `set` failed (e.g. plugin RPC blip), pause
could leave a rule absent from active without showing up in draft.
Switched to Devvit's `redis.watch(...).multi().set(...).set(...).exec()`
pattern (same shape executor.ts already uses for audit writes), and check
`exec()`'s null return for the WATCH-aborted case so two moderators
managing rules at the same time get an actionable "Another moderator
changed the rules at the same time — re-open Manage rules" toast.

## CodeRabbit

### #5 — humanizeRule: cap value-stringification at 100 chars
A rule that uses `op: in` against a long allowlist (e.g. 50 banned domains)
would otherwise render thousands of characters into the confirm form's
`compiledSummary` field, blowing past Devvit's modal description budget
and obscuring the actual rule structure. Truncate with `...` after 100
chars in the predicate-tree leaf renderer.

## Test infrastructure
test/devvit-testkit.ts: fakeRedis.watch().exec() now returns `[] | null`
(matching real Devvit Redis MULTI/EXEC semantics) instead of `void`.
Without this, the new WATCH-abort detection in applyManageActions would
fire on every test. Updated `FakeTxn.exec` type accordingly.

## Verification
- `npm run check` 4/4 gates green
- 210 tests pass (1 skipped) — added 1 cap-overflow test in routes-manage.test.ts
ComBba pushed a commit that referenced this pull request May 14, 2026
5 review issues from Gemini code-assist on PR #45 (refactor/split-server-routes).

## HIGH severity

### #1 Non-ASCII regex made unambiguous
helpers/openai.ts: the `replace(/[<U+0080>-<U+FFFF>]/g, ...)` literal contained
the U+0080 control character invisibly, which made review tools render it
as `/[-<box>]/` and flag it as "matches only `-`". Behaviour was always
correct, but the explicit `/[�-ï¿¿]/g` form removes any doubt.

### #2 WATCH ordering in applyManageActions
routes/manage.ts: the previous shape did GET → modify → WATCH → MULTI →
EXEC, so the optimistic lock didn't actually cover the data we were
about to mutate. Restructured into a CAS retry loop:
  1. WATCH first
  2. GET (under WATCH — uses redis.get because Devvit's `txn.get` queues
     into the transaction and returns a chainable, not the value)
  3. modify in memory
  4. MULTI + SET + EXEC (null exec → another mod's call landed first
     between WATCH and now → loop and retry up to MAX_CAS_ATTEMPTS = 3).
After 3 contended attempts the moderator gets a clear "another mod
changed the rules — re-open and try again" toast.

### #3 Sequential audit-entry hGetAll in undo handler
routes/undo.ts: 100 sequential hGetAll calls could blow the menu
handler's deadline. Switched to `Promise.allSettled([...])` and scan
the resolved array in order so the "most recent first" break-on-found
behaviour is preserved.

## MEDIUM severity

### #4 Sequential audit-entry hGetAll in dashboard
routes/dashboard.ts: same parallelization for the 20-entry recent-actions
fetch and the per-draft-rule dry-run fetch.

### #5 Outdated comment on /internal/trigger/on-app-install
routes/triggers.ts: the comment claimed the submit triggers seed
in-band, but they actually fail-safe by returning ok when no bundle
exists. Updated to describe the real flow: starter rules are seeded
by the deferred /internal/scheduler/seed-on-install task, registered
in devvit.json's scheduler.tasks block.

## Verification
- `npm run check` 4/4 gates green: typecheck + lint + Prettier + 211 tests
  (1 skipped) + 3 @devvit/test cases + acceptance G1-G4
- All PR #45 tests still pass with no test-side modifications
ComBba pushed a commit that referenced this pull request May 14, 2026
5 review issues from Gemini code-assist + CodeRabbit on PR #49.

## Major

### CodeRabbit #4 — pending consumption now atomic (WATCH/MULTI/DEL/EXEC)
Two concurrent submits with the same `pendingId` (back-button + re-submit,
double-click on Save, multi-tab moderator) could both read the entry
before either deleted it, doubling the bundle write + dry-run schedule.
Replaced the plain GET-then-DEL with a WATCH/MULTI/DEL/EXEC round-trip.
Whichever caller commits first wins; the loser sees `exec() == null` and
gets an actionable "Another submission consumed this confirmation" toast.

Updated FakeTxn to match (added `del()` method on the txn object — the
production Devvit Redis client supports it, the fake didn't).

### CodeRabbit #5 — daily compile counter now bumps at compile time
Quota was previously incremented inside `persistRuleAndStartDryRun()`,
which only runs on Save. A moderator could repeatedly hit Compile +
Cancel and burn through OpenAI tokens without the per-day counter ever
moving. Moved the increment to immediately after a successful OpenAI
return in compose-rule-submit. The token cost is real either way, so
the quota now reflects it either way.

## Medium

### Gemini #1 — dashboard cancelLabel no longer says "Don't show intro again"
Devvit's Cancel button does NOT trigger the form submit handler, so
clicking the misnamed button could not actually persist
`dismissOnboarding` — it just closed the modal. Reverted the label to
"Cancel"; the dismissOnboarding boolean toggle + Close (acceptLabel) is
the real opt-out path that submits the form values.

## Minor

### CodeRabbit #2 — verify script tolerates non-int SLOW_MO
`int(os.environ.get("SLOW_MO", "0"))` crashed if a user passed e.g.
`SLOW_MO=fast`. Added `_safe_int` helper that warns + falls back to the
default rather than ending the recording session before it starts.

### CodeRabbit #3 — atomic SET + TTL via the `expiration` option
Previous shape was `redis.set(key, value)` then `redis.expire(key, ttl)`,
which could leak a TTL-less pending key if expire failed after set
succeeded. Switched to `redis.set(key, value, { expiration: new Date(...) })`
which is the same single round-trip the rollback writer in executor.ts
already uses.

## Verification
- `npm run check` 4/4 gates green (211 unit/integration tests)
- All compose tests still pass with the new pending-id flow
- Race condition smoke: a second compose-confirm-submit call against the
  same pendingId now returns the "another submission consumed" toast
  instead of double-persisting (previously: double bundle write).
ComBba added a commit that referenced this pull request May 15, 2026
…er-rules

test+fix: vitest/route tests, acceptance gates, starter rules, @devvit SDK alignment, CI
ComBba pushed a commit that referenced this pull request May 15, 2026
…Text + toast 1-line summary

Audit-driven UX quick wins for the compose flow. Full audit + scenario rationale:
- claudedocs/2026-05-14-compose-flow-audit.md (10 findings vs README promise matrix)
- docs/demo-scenario.md (60s demo, README-aligned, Hypothesis 3)

## What changes (3 audit findings landed)

### Finding #1 (CRITICAL for demo) — Clarify modal renders LLM suggestedAnswers as a dropdown
The LLM has *always* returned `{ needsClarification, question, suggestedAnswers: [...] }`
(see system-prompt.ts:80-85), but the server discarded `suggestedAnswers` and
showed only a free-text textarea. Moderators had to paraphrase the question.

Now: when `suggestedAnswers` is present, render a `select` field with one click
per option PLUS a `clarificationAnswerOther` paragraph as an override / escape
hatch. When the LLM returns no suggestions, fall back to the original
free-text path. `compose-rule-submit` defensively unwraps SELECTION-array
values (PR #39 pattern) and prefers the "Other" override when non-empty.

### Finding #4 — ban/mute toggle gets explanatory helpText on both forms
Mod mental model = "checkbox grants ban permission". Real behaviour =
"checkbox lets compile succeed when the LLM emits ban/mute". Added a single
shared `ALLOW_GUARDED_HELP` string used by the compose form and the clarify
modal so both forms explain the toggle the same way.

### Finding #6 — Success toast carries a 1-line summary + an explicit menu pointer
Previously: `Compiled rule "X". Dry-run started — check Dashboard in 30s.`
The dashboard menu name is "vibe-mod: View rules + log" and Devvit toasts have
no buttons, so "check Dashboard" is unreachable for a first-time mod.

Now: `Compiled rule "X". → post: modqueue. Dry-run started — open the
subreddit ⋯ menu → "vibe-mod: View rules + log" to see preview.` The
1-line `summarizeRule()` makes the deterministic JSON contract visible
without leaving the toast (audit finding #2 alternative — the full preview
form stays a post-publish item).

## Verification
- `npm run check` 4/4 gates green (typecheck + lint + Prettier + 25 compose tests + 3 @devvit/test + acceptance G1–G4)
- 6 new tests in routes-compose.test.ts cover: select rendering, free-text fallback, SELECTION-array unwrap, "Other" override precedence, helpText presence, summary in success toast.

## Why this PR (not bundled with module split)
Demo video records this UI. UX changes must land before the README screenshot
captures (Phase 3) and the Playwright video automation (Phase 4). Module
split (Phase 2) is a separate, independent refactor.

Refs: claudedocs/2026-05-14-compose-flow-audit.md (Phase 1.5/1.5b),
      docs/demo-scenario.md (Phase 2.5b)
ComBba added a commit that referenced this pull request May 15, 2026
feat(compose): clarify select + ban/mute helpText + toast summary (audit findings #1, #4, #6)
ComBba pushed a commit that referenced this pull request May 15, 2026
External code-review pointed out 5 valid issues. All addressed in this commit
on the same branch — PR #44 picks them up automatically.

## Gemini code-assist

### #1 — Manage menu: parallelize per-rule dry-run reads
The previous serial `for` loop multiplied Redis latency by the draft count.
With up to 50 drafts and ~10ms per round-trip the difference is ~500ms vs
~10ms wall time. Switched to `Promise.allSettled([...])` so a single failed
fetch doesn't stop the others.

### #2 — applyManageActions: enforce the 50-rule cap on apply
pause / activate / unpause moves can push a bundle over the 50-rule cap
even when no compose-rule call is made. Added an explicit check before
the dual-write txn — the moderator gets a "Rule cap exceeded (active=N,
draft=N, max=50). Delete a rule first" toast and the bundles stay
untouched. New test in routes-manage.test.ts.

### #3 — Use the configured openaiModel as the fallback (not the literal "manage")
The previous `'manage'` fallback for `bundle.llmModel` broke
`estimateTokenCost()` because 'manage' isn't a key in
`OPENAI_PRICING_USD_PER_TOKEN`. The dashboard shows "(~$0 on manage)" until
a real compile lands. Now we read the configured `openaiModel` setting
through `readOpenaiModel()` (already SELECTION-array-safe per PR #39).

### #4 — Atomic dual-write via WATCH/MULTI/EXEC (the previous PR description's claim was wrong)
The previous `Promise.all([redis.set(activeKey, ...), redis.set(draftKey, ...)])`
was NOT atomic — if the second `set` failed (e.g. plugin RPC blip), pause
could leave a rule absent from active without showing up in draft.
Switched to Devvit's `redis.watch(...).multi().set(...).set(...).exec()`
pattern (same shape executor.ts already uses for audit writes), and check
`exec()`'s null return for the WATCH-aborted case so two moderators
managing rules at the same time get an actionable "Another moderator
changed the rules at the same time — re-open Manage rules" toast.

## CodeRabbit

### #5 — humanizeRule: cap value-stringification at 100 chars
A rule that uses `op: in` against a long allowlist (e.g. 50 banned domains)
would otherwise render thousands of characters into the confirm form's
`compiledSummary` field, blowing past Devvit's modal description budget
and obscuring the actual rule structure. Truncate with `...` after 100
chars in the predicate-tree leaf renderer.

## Test infrastructure
test/devvit-testkit.ts: fakeRedis.watch().exec() now returns `[] | null`
(matching real Devvit Redis MULTI/EXEC semantics) instead of `void`.
Without this, the new WATCH-abort detection in applyManageActions would
fire on every test. Updated `FakeTxn.exec` type accordingly.

## Verification
- `npm run check` 4/4 gates green
- 210 tests pass (1 skipped) — added 1 cap-overflow test in routes-manage.test.ts
ComBba pushed a commit that referenced this pull request May 15, 2026
5 review issues from Gemini code-assist on PR #45 (refactor/split-server-routes).

## HIGH severity

### #1 Non-ASCII regex made unambiguous
helpers/openai.ts: the `replace(/[<U+0080>-<U+FFFF>]/g, ...)` literal contained
the U+0080 control character invisibly, which made review tools render it
as `/[-<box>]/` and flag it as "matches only `-`". Behaviour was always
correct, but the explicit `/[�-ï¿¿]/g` form removes any doubt.

### #2 WATCH ordering in applyManageActions
routes/manage.ts: the previous shape did GET → modify → WATCH → MULTI →
EXEC, so the optimistic lock didn't actually cover the data we were
about to mutate. Restructured into a CAS retry loop:
  1. WATCH first
  2. GET (under WATCH — uses redis.get because Devvit's `txn.get` queues
     into the transaction and returns a chainable, not the value)
  3. modify in memory
  4. MULTI + SET + EXEC (null exec → another mod's call landed first
     between WATCH and now → loop and retry up to MAX_CAS_ATTEMPTS = 3).
After 3 contended attempts the moderator gets a clear "another mod
changed the rules — re-open and try again" toast.

### #3 Sequential audit-entry hGetAll in undo handler
routes/undo.ts: 100 sequential hGetAll calls could blow the menu
handler's deadline. Switched to `Promise.allSettled([...])` and scan
the resolved array in order so the "most recent first" break-on-found
behaviour is preserved.

## MEDIUM severity

### #4 Sequential audit-entry hGetAll in dashboard
routes/dashboard.ts: same parallelization for the 20-entry recent-actions
fetch and the per-draft-rule dry-run fetch.

### #5 Outdated comment on /internal/trigger/on-app-install
routes/triggers.ts: the comment claimed the submit triggers seed
in-band, but they actually fail-safe by returning ok when no bundle
exists. Updated to describe the real flow: starter rules are seeded
by the deferred /internal/scheduler/seed-on-install task, registered
in devvit.json's scheduler.tasks block.

## Verification
- `npm run check` 4/4 gates green: typecheck + lint + Prettier + 211 tests
  (1 skipped) + 3 @devvit/test cases + acceptance G1-G4
- All PR #45 tests still pass with no test-side modifications
ComBba pushed a commit that referenced this pull request May 15, 2026
5 review issues from Gemini code-assist + CodeRabbit on PR #49.

## Major

### CodeRabbit #4 — pending consumption now atomic (WATCH/MULTI/DEL/EXEC)
Two concurrent submits with the same `pendingId` (back-button + re-submit,
double-click on Save, multi-tab moderator) could both read the entry
before either deleted it, doubling the bundle write + dry-run schedule.
Replaced the plain GET-then-DEL with a WATCH/MULTI/DEL/EXEC round-trip.
Whichever caller commits first wins; the loser sees `exec() == null` and
gets an actionable "Another submission consumed this confirmation" toast.

Updated FakeTxn to match (added `del()` method on the txn object — the
production Devvit Redis client supports it, the fake didn't).

### CodeRabbit #5 — daily compile counter now bumps at compile time
Quota was previously incremented inside `persistRuleAndStartDryRun()`,
which only runs on Save. A moderator could repeatedly hit Compile +
Cancel and burn through OpenAI tokens without the per-day counter ever
moving. Moved the increment to immediately after a successful OpenAI
return in compose-rule-submit. The token cost is real either way, so
the quota now reflects it either way.

## Medium

### Gemini #1 — dashboard cancelLabel no longer says "Don't show intro again"
Devvit's Cancel button does NOT trigger the form submit handler, so
clicking the misnamed button could not actually persist
`dismissOnboarding` — it just closed the modal. Reverted the label to
"Cancel"; the dismissOnboarding boolean toggle + Close (acceptLabel) is
the real opt-out path that submits the form values.

## Minor

### CodeRabbit #2 — verify script tolerates non-int SLOW_MO
`int(os.environ.get("SLOW_MO", "0"))` crashed if a user passed e.g.
`SLOW_MO=fast`. Added `_safe_int` helper that warns + falls back to the
default rather than ending the recording session before it starts.

### CodeRabbit #3 — atomic SET + TTL via the `expiration` option
Previous shape was `redis.set(key, value)` then `redis.expire(key, ttl)`,
which could leak a TTL-less pending key if expire failed after set
succeeded. Switched to `redis.set(key, value, { expiration: new Date(...) })`
which is the same single round-trip the rollback writer in executor.ts
already uses.

## Verification
- `npm run check` 4/4 gates green (211 unit/integration tests)
- All compose tests still pass with the new pending-id flow
- Race condition smoke: a second compose-confirm-submit call against the
  same pendingId now returns the "another submission consumed" toast
  instead of double-persisting (previously: double bundle write).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant