Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@

- Keep things in one function unless composable or reusable
- Avoid `try`/`catch` where possible
- Avoid using the `any` type
- Never use `any` or `unknown` types. Always prefer specific, strong types — define interfaces, use generics, use branded types, use discriminated unions, or use Zod-inferred types. When you encounter `any` or `unknown` in existing code, replace it with the actual type the value holds. For catch blocks use `catch (e)` and narrow with `e instanceof Error`. For Zod schemas use specific Zod types (`z.string()`, `z.number()`, `z.record(z.string(), z.string())`, etc.) — never `z.any()` or `z.unknown()`. The only exceptions are: (1) type-erased storage in event emitter patterns where heterogeneous callbacks coexist, and (2) SDK/library boundary casts where the external API has an incompatible type — document both with a comment explaining why.
- Prefer single word variable names where possible
- Use Bun APIs when possible, like `Bun.file()`
- Rely on type inference when possible; avoid explicit type annotations or interfaces unless necessary for exports or clarity
Expand Down Expand Up @@ -156,7 +156,7 @@ Use `context_history` to navigate the edit DAG:

## Pre-existing Failures and Bugs

**IMPORTANT:** Pre-existing failures, bugs, and issues MUST be fixed too — always. Do not ignore typecheck errors, lint warnings, unused variables, broken imports, or failing tests just because they existed before your changes. If you encounter a pre-existing issue during your work, fix it as part of your changes. This applies to all types of issues: type errors, dead code, incorrect logic, missing exports, stale references, etc.
**IMPORTANT:** Pre-existing failures, bugs, and issues MUST be fixed too — always, no exceptions. Do not ignore typecheck errors, lint warnings, unused variables, broken imports, or failing tests just because they existed before your changes. If you encounter a pre-existing issue during your work, fix it as part of your changes. This applies to all types of issues: type errors, dead code, incorrect logic, missing exports, stale references, runtime errors, circular imports, etc. If a pre-existing issue is too complex to fix in a single session, document it in `BUGS.md` with full details so it can be tracked and fixed later — but never silently skip it.

## Git Workflow

Expand Down
143 changes: 73 additions & 70 deletions BUGS.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,90 +4,93 @@ All bugs tracked here. Do not create per-package bug files.

---

## Open Design Issues
## Open (0)

| Issue | Location | Impact |
| --- | --- | --- |
| No CAS garbage collection | `cas/index.ts` | Unbounded `cas_object` growth |
| Objective extraction never updates | `session/objective.ts` | Stale auto-extracted objectives |
| `Session.remove()` leaks EditGraph rows | `session/index.ts:665-689` | Orphan `edit_graph_node`/`edit_graph_head` rows |
| `CAS.store()` overwrites session_id on hash collision | `cas/index.ts:53-61` | Cross-session content deletion |
_No open bugs._

---

## Manual TUI Testing (2026-03-20)
## Deferred (1)

Tested on branch `effect/complete-effectification` (27 commits ahead of `dev`) using tmux-based test harness and manual interaction. Model: GLM-5 Z.AI Coding.

| Flow | Result | Notes |
| --- | --- | --- |
| Home screen | ✅ Pass | Logo, prompt, tips, status bar render correctly |
| Command palette (Ctrl+P) | ✅ Pass | Opens, shows Session/Skills/Open editor/Switch session |
| Agent cycling (Tab) | ✅ Pass | Cycles through Build/Plan/Docs agents |
| Message submission | ✅ Pass | Enter submits, streaming dots visible, response renders with token/cost metadata |
| Cost dialog (/cost) | ✅ Pass | Shows Sess/☼-ly/☽-ly rows with cache hit %, esc dismisses |
| Status bar | ✅ Pass | Shows branch, agent, model, hints |

No new bugs found during manual testing.
| # | Issue | Sev | Location | Notes |
| --- | ------------------------------ | --- | ----------------- | --------------------------------------------------------------------------- |
| B51 | ID generator counter not atomic | Low | `id/id.ts:25-27` | Fine single-threaded; documented with comment. Fix if worker threads added. |

---

## False Positives / Intentional

| Issue | Resolution |
| --- | --- |
| Fork-based ephemeral: message IDs point to deleted session | **Intentional** — ephemeral results serialized immediately, never dereferenced later |
| #32 Skill template returns Promise not string | **By design** — type is `Promise<string> | string`, all consumers `await`, `hints: []` for skill commands |
## Fixed (51)

| # | Issue | Sev | Fix |
| --- | ------------------------------------------------------ | ---- | --------------------------------------------------- |
| B1 | `reset()`/`checkout()` JSON.parse plain text CAS | Crit | `JSON.stringify(part)` in `CAS.store()` |
| B2 | `withTimeout` timer leak on rejection | Crit | `.finally()` cleanup |
| B3 | `unhide`/`annotate` missing transaction | Crit | `Database.transaction()` wrapper |
| B4 | `SideThread.list` total ignores status filter | High | `countWhere` applies filter |
| B5 | Unsafe cast to `MessageV2.User` in classifier | High | `.find()` + optional chaining |
| B6 | `sweep()` no transaction wrapping | High | `Database.transaction()` wrapper |
| B7 | `filterEdited` breaks message alternation | High | Synthetic placeholder preserves structure |
| B8 | N+1 queries in `getLog`/`buildPathToRoot` | High | `loadAllNodes()` single query |
| B9 | Missing plugin hooks in `unhide`/`annotate` | High | Added `pluginGuard` + `pluginNotify` |
| B10 | CAS `onConflictDoNothing` loses metadata | Med | `onConflictDoUpdate` |
| B11 | `AsyncQueue` no termination | Med | `close()` with CLOSED sentinel |
| B12 | Template variable bash-style syntax | Med | `$ARGUMENTS` substitution |
| B13 | `work()` drops `undefined` items | Med | Check `pending.length` not `item` |
| B14 | Modular bias in random ID generation | Med | Rejection sampling (`MAX_UNBIASED = 248`) |
| B15 | Ephemeral commands crash (schema validation) | Crit | Replaced sweep with `filterEphemeral()` |
| B16 | Ephemeral sweep leaks content into LLM turn | High | Filter upstream via `filterEphemeral()` |
| B17 | `context_edit` accepts invalid `afterTurns` | Med | `.int().min(1)` validation |
| B18 | Unused imports in message-v2.ts | Low | Removed |
| B19 | Unused constant `MAX_EDITS_PER_TURN` | Low | Removed |
| B20 | `focus-rewrite-history` missing tool perms | High | Added `classifier_threads`/`distill_threads` |
| B21 | Fork-based ephemeral: leaked sessions on error | High | try/finally around `Session.remove()` |
| B22 | Fork-based ephemeral: `Command.Event.Executed` skipped | Med | Added `Bus.publish()` in ephemeral path |
| B23 | `Session.remove()` doesn't clean up CAS | High | Added `CAS.deleteBySession()` |
| B24 | Duplicate skill name warns but doesn't skip | Low | Added warning log |
| B25 | Circuit breaker `lastFailure` after throw | High | Move `lastFailure = now` before threshold check |
| B26 | Circuit breaker never resets on success | Med | Added `recordSuccess()`, called on pass |
| B27 | Circuit breaker inverted `open` semantics | Low | Renamed `open` → `healthy` |
| B28 | Default cooldown (1s) effectively zero | Med | Increased to 30000ms |
| B29 | `scope`/`files`/`criteria` params unused | Med | Removed from schema |
| B30 | `command.split(" ")` breaks quoted args | Low | Changed to `["bash", "-c", command]` |
| B31 | Verify config shallow merge loses nested | High | `mergeDeep()` from remeda |
| B32 | Evaluator has no code context | Crit | Build change summary from parent session messages |
| B33 | `tools: {}` blocks agent tools | Crit | Removed `tools: {}` from prompt calls |
| B34 | `parseEvaluation` brittle parsing | Med | Extract `<evaluation>` block first, NaN guard |
| B35 | Child sessions never cleaned up | Low | try/finally with `Session.remove()` |
| B36 | Skill template returns Promise | High | By design — added comment, no code change |
| B37 | `Skill.get()` re-parses every call | Low | Module-level content cache, cleared on state reload |
| B38 | Scripts: argument injection | Med | Insert `--` separator before user args |
| B39 | Scripts: tool ID collision | Low | Changed separator to `::` (`script::skill/name`) |
| B40 | Evaluator agent has bash access | Med | Removed `bash: "allow"` from evaluator permissions |
| B41 | No CAS garbage collection | High | `runGC()` + `deleteOrphans()` implemented |
| B42 | `Session.remove()` leaks EditGraph rows | High | `EditGraph.deleteBySession()` added |
| B43 | `CAS.store()` overwrites session_id on collision | High | Changed to `onConflictDoNothing()` |
| B44 | Lock starvation in read/write lock | High | Writer prioritization with reader batch wakeup |
| B45 | `EditGraph.commit()` race condition | Med | `getHead()` moved inside `Database.use()` tx |
| B46 | FileWatcher subscription timeout cleanup | Low | `.then(s => s.unsubscribe()).catch()` on timeout |
| B47 | Objective extraction never updates | Med | Removed cache check in `extract()`, always re-evaluates |
| B48 | `Session.remove()` swallows errors | Med | Removed outer try-catch, errors now propagate |
| B49 | `context-edit mark()` not in transaction | Med | Wrapped `updatePart()` in `Database.transaction()` |
| B50 | `AsyncQueue.push()` after close silent | Low | Throws `Error("Cannot push to a closed queue")` |
| B52 | `Bus.publish` doesn't catch subscriber errors | Low | try-catch per subscriber + `Promise.allSettled()` |

---

## Previously Fixed (40 bugs)

| # | Issue | Sev | Fix |
| --- | --- | --- | --- |
| 1 | `reset()`/`checkout()` JSON.parse plain text CAS | Crit | `JSON.stringify(part)` in `CAS.store()` |
| 2 | `withTimeout` timer leak on rejection | Crit | `.finally()` cleanup |
| 3 | `unhide`/`annotate` missing transaction | Crit | `Database.transaction()` wrapper |
| 4 | `SideThread.list` total ignores status filter | High | `countWhere` applies filter |
| 5 | Unsafe cast to `MessageV2.User` in classifier | High | `.find()` + optional chaining |
| 6 | `sweep()` no transaction wrapping | High | `Database.transaction()` wrapper |
| 7 | `filterEdited` breaks message alternation | High | Synthetic placeholder preserves structure |
| 8 | N+1 queries in `getLog`/`buildPathToRoot` | High | `loadAllNodes()` single query |
| 9 | Missing plugin hooks in `unhide`/`annotate` | High | Added `pluginGuard` + `pluginNotify` |
| 10 | CAS `onConflictDoNothing` loses metadata | Med | `onConflictDoUpdate` |
| 11 | `AsyncQueue` no termination | Med | `close()` with CLOSED sentinel |
| 12 | Template variable bash-style syntax | Med | `$ARGUMENTS` substitution |
| 13 | `work()` drops `undefined` items | Med | Check `pending.length` not `item` |
| 14 | Modular bias in random ID generation | Med | Rejection sampling (`MAX_UNBIASED = 248`) |
| 15 | Ephemeral commands crash (schema validation) | Crit | Replaced sweep with `filterEphemeral()` |
| 16 | Ephemeral sweep leaks content into LLM turn | High | Filter upstream via `filterEphemeral()` |
| 17 | `context_edit` accepts invalid `afterTurns` | Med | `.int().min(1)` validation |
| 18 | Unused imports in message-v2.ts | Low | Removed |
| 19 | Unused constant `MAX_EDITS_PER_TURN` | Low | Removed |
| 20 | `focus-rewrite-history` missing tool perms | High | Added `classifier_threads`/`distill_threads` |
| — | Fork-based ephemeral: leaked sessions on error | High | try/finally around `Session.remove()` |
| — | Fork-based ephemeral: `Command.Event.Executed` skipped | Med | Added `Bus.publish()` in ephemeral path |
| — | `Session.remove()` doesn't clean up CAS | High | Added `CAS.deleteBySession()` |
| — | Duplicate skill name warns but doesn't skip | Low | Added warning log |
| 21 | Circuit breaker `lastFailure` after throw | High | Move `lastFailure = now` before threshold check |
| 22 | Circuit breaker never resets on success | Med | Added `recordSuccess()`, called on pass |
| 23 | Circuit breaker inverted `open` semantics | Low | Renamed `open` → `healthy` |
| 24 | Default cooldown (1s) effectively zero | Med | Increased to 30000ms |
| 25 | `scope`/`files`/`criteria` params unused | Med | Removed from schema |
| 26 | `command.split(" ")` breaks quoted args | Low | Changed to `["bash", "-c", command]` |
| 27 | Verify config shallow merge loses nested | High | `mergeDeep()` from remeda |
| 28 | Evaluator has no code context | Crit | Build change summary from parent session messages |
| 29 | `tools: {}` blocks agent tools | Crit | Removed `tools: {}` from prompt calls |
| 30 | `parseEvaluation` brittle parsing | Med | Extract `<evaluation>` block first, NaN guard |
| 31 | Child sessions never cleaned up | Low | try/finally with `Session.remove()` |
| 32 | Skill template returns Promise | High | By design — added comment, no code change |
| 33 | `Skill.get()` re-parses every call | Low | Module-level content cache, cleared on state reload |
| 34 | Scripts: argument injection | Med | Insert `--` separator before user args |
| 35 | Scripts: tool ID collision | Low | Changed separator to `::` (`script::skill/name`) |
| 36 | Evaluator agent has bash access | Med | Removed `bash: "allow"` from evaluator permissions |
## False Positives / Intentional (6)

| Issue | Resolution |
| ---------------------------------------------------------- | --------------------------------------------------------------------------------- |
| Fork-based ephemeral: message IDs point to deleted session | **Intentional** — ephemeral results serialized immediately, never dereferenced |
| Skill template returns Promise not string | **By design** — type is `Promise<string> \| string`, all consumers `await` |
| Provider state map key inconsistency | **False positive** — all usages consistently key by `InstanceALS.directory` |
| Config state map same issue | **False positive** — same consistent keying as provider |
| Bus subscription cleanup gap | **False positive** — unsubscribe + BusService finalizer both clean up properly |
| `CAS.deleteBySession()` race with store | **False positive** — deletion is idempotent, no transaction needed |

---

## Notes

**TUI Testing:** Playwright not feasible (OpenTUI+SolidJS). Use `testRender()` from `@opentui/solid` for unit tests. tmux-based integration harness at `test/cli/tui/tmux-tui-test.ts` for E2E flows.

**TUI Manual Test (2026-03-20):** All 6 flows passed (home screen, command palette, agent cycling, message submission, cost dialog, status bar). No bugs found.
1 change: 1 addition & 0 deletions CLAUDE.md
Loading
Loading