feat: mission control + flight recorder operator cockpit #2044
Persistence: append-only FlightRecorderFrame repo (sqlite+postgres, new yoyo revision), wired into both backends + fake. Engine: pluggable recorder sink records per-turn frames after each agent run (off the hot path, fail-soft). Services: FlightRecorderService (frame-authoritative get/seek), CockpitService (live activity + stuck/runaway heuristics), SafeDefaultSteeringDirective (hint/redirect via INFO_REQUEST interrupt). Adds InterventionKind enum, cockpit settings namespace + events module.
Adds /cockpit controller (live snapshot, flight-recorder frames + seek, pause/kill/hint/redirect interventions) registered in BASE_CONTROLLERS and 503-gated via AppState. Wires CockpitService/FlightRecorderService/SteeringDirective at boot via _try_wire_cockpit. Appends CHANNEL_COCKPIT, AppState cockpit seams, ENFORCED ghost-manifest lines, and regenerated TS DTOs. CockpitService takes thresholds at call time (controller resolves settings) to stay free of a wire-time resolver dependency. Controller + service unit tests.
Adds /mission-control route + sidebar nav. Live tab: KPI row, agent activity rows with pause/kill/hint/redirect interventions, stuck/runaway flags, REST snapshot polling + WS liveness on the cockpit/tasks/agents/budget channels. Flight Recorder tab: new Timeline scrubber primitive (+ stories), transport controls + speed, per-turn frame detail. Adds cockpit endpoints, Zustand store (mutation pattern), useMissionControlData hook, MSW handlers, WS cockpit channel, and AgentActivity.execution_id (regenerated DTOs). Timeline + page + store tests.
Adds meta/mcp/domains/cockpit.py + handlers/cockpit.py: read tools (get_live_activity, get_flight_recorder_frames, seek_flight_recorder) and admin intervention tools (pause/kill/hint/redirect) that call require_admin_guardrails() and route through the same cockpit/flight-recorder/steering/task services as the REST controller. Registered in the domain + handler aggregators; tool-count plan bumped 216 -> 223.
allow_inf_nan=False on cockpit controller models; build_frames split under the 50-line limit; drop unused FlightRecorder playingRef; add AgentEngine auto-recording integration test proving run() records replayable frames through the real hook (closes the sim-harness validation gap).
test_channels expects CHANNEL_COCKPIT; protocol-compliance fake backend gains flight_recorder_frames; MCP hint/redirect handlers call require_admin_guardrails as their lexically-first call (gate compliance).
RUNTIME_PREFIXES excludes persistence/, so the flight-recorder repos (reached through the persistence backend like every other repo) cannot and should not be manifest-ENFORCED. The four genuine runtime services/builders (CockpitService, FlightRecorderService, build_steering_directive, build_flight_recorder_sink) remain enforced.
Adds cockpit to test_events expected discovery + VALID_SETTINGS_NAMESPACES; introduces _sum_costs helper carrying the currency-aggregation lint-allow marker in both cockpit and flight_recording services.
Persistence: avoid mutating row dict before validation in sqlite/postgres flight_recorder_repo. Service: type _MAX_SEEK_FRAMES as Final[int]. Web: extract shared statusBgClass into @/utils/status-color; add aria-label on AgentRow status dot; tie Hint InputField to the agent via aria-describedby; import FlightRecorderFrame for FrameDetail prop; memoize ordered + timelineFrames; tighten TAB_OPTIONS typing; seed richer MSW defaults; reset mission-control store in afterEach in both web tests.
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Summary of Changes (Gemini Code Assist)
This pull request introduces a comprehensive 'Mission Control' cockpit for the SynthOrg runtime. It lets operators monitor live agent activity, detect stuck or runaway agents, perform interventions (pause, kill, hint, redirect), and replay completed agent runs turn-by-turn using a new flight recorder system.
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (6)
Walkthrough
Adds cockpit live snapshot and interventions, integrates a flight recorder capturing agent turns, persists frames in Postgres/SQLite, and provides a replay/seek service. App state and startup wire cockpit/recorder/steering; REST controller and MCP tools expose snapshot, frames, seek, and interventions. Adds observability events and cockpit settings. AgentEngine records frames via a pluggable sink. Web app introduces a Mission Control page (Live and Flight Recorder tabs), a Timeline component, Zustand store and hook, routes/nav, and MSW mocks. Extensive unit/e2e tests cover repositories, services, engine recording, controller, steering, MCP surface, and UI.
Code Review
This pull request introduces the Mission Control Cockpit, providing live agent activity monitoring, flight-recorder replay, and operator intervention capabilities. The implementation includes backend services, persistence for Postgres and SQLite, REST and MCP interfaces, and a React-based frontend. Review feedback highlights performance concerns regarding N+1 queries and inefficient individual inserts, as well as logic errors in turn-to-message correlation for resumed tasks and incomplete prefix reconstruction in the flight-recorder seek functionality.
    for status in _ACTIVE_STATUSES:
        tasks, _ = await self._task_engine.list_tasks(status=status)
        activities.extend(
            [
                await self._build_activity(task, stuck_cutoff, runaway_pct)
                for task in tasks
            ]
        )
This loop introduces an N+1 query pattern because _build_activity performs a repository query for every active task. With many active tasks, this will significantly degrade the performance of the mission control dashboard. Additionally, the cost calculation in _build_activity is incorrect for tasks with more than 1000 turns: since the frame query is limited to 1000 records and the repository returns them newest-first, the resulting sum will only reflect the cost of the most recent 1000 turns. Consider adding an optimized aggregation method to the repository (e.g., get_stats_by_task) that returns the total cost, turn count, and latest timestamp in a single query using SQL SUM and MAX functions.
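The single-query aggregation suggested above can be sketched with the standard-library `sqlite3` module. This is a minimal illustration, not the repository's implementation: the table layout, the column names (`task_id`, `cost`, `recorded_at`) and the `get_stats_by_task` signature are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: the column names are assumptions for this sketch only.
SCHEMA = """
CREATE TABLE flight_recorder_frames (
    task_id TEXT NOT NULL,
    turn_index INTEGER NOT NULL,
    cost REAL NOT NULL,
    recorded_at TEXT NOT NULL
)
"""

# One grouped query replaces one frame query per active task.
STATS_SQL = """
SELECT task_id,
       COUNT(*) AS turn_count,
       SUM(cost) AS total_cost,
       MAX(recorded_at) AS latest_at
FROM flight_recorder_frames
GROUP BY task_id
"""


def get_stats_by_task(
    conn: sqlite3.Connection,
) -> dict[str, tuple[int, float, str]]:
    """Return {task_id: (turn_count, total_cost, latest_at)} in one round trip."""
    rows = conn.execute(STATS_SQL).fetchall()
    return {task_id: (count, total, latest) for task_id, count, total, latest in rows}


def demo_stats() -> dict[str, tuple[int, float, str]]:
    """Populate an in-memory database and aggregate it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO flight_recorder_frames VALUES (?, ?, ?, ?)",
        [
            ("task-a", 1, 0.5, "2026-05-22T10:00:00+00:00"),
            ("task-a", 2, 0.25, "2026-05-22T10:01:00+00:00"),
            ("task-b", 1, 1.0, "2026-05-22T09:00:00+00:00"),
        ],
    )
    return get_stats_by_task(conn)
```

Because the sum is computed by the database, it is not affected by any row limit applied to frame listings.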
    response=(
        assistant_messages[index].content
        if index < len(assistant_messages)
        else None
    ),
Correlating turns with assistant messages using the loop index is incorrect for resumed tasks. execution_result.turns only contains turns from the current run, while execution_result.context.conversation contains the full history. If a task is resumed at turn 5, index 0 will incorrectly pick the first assistant message in the history instead of the 5th. Using turn.turn_number - 1 as the index into assistant_messages would correctly target the message corresponding to the turn, assuming a 1:1 mapping.
Suggested change:
-    response=(
-        assistant_messages[index].content
-        if index < len(assistant_messages)
-        else None
-    ),
+    response=(
+        assistant_messages[turn.turn_number - 1].content
+        if (turn.turn_number - 1) < len(assistant_messages)
+        else None
+    ),
    frames = await self._repository.query(
        FlightRecorderFrameFilterSpec(
            execution_id=NotBlankStr(execution_id),
            turn_index_min=1,
            turn_index_max=turn_index,
        ),
        limit=_MAX_SEEK_FRAMES,
There is a logic error in the seek method when an execution's turn count exceeds _MAX_SEEK_FRAMES. Because the repository query method returns frames newest-first (descending turn index), the limit will cause the query to return the latest 1000 frames within the requested range (turns 1 to turn_index). For a run at turn 1500, this returns turns 501 to 1500. The resulting ReplaySeekView will be missing the beginning of the run (turns 1 to 500), which contradicts the docstring and the expected behavior of a scrubber reconstruction. To fix this, the repository protocol should be extended to support sort order, or the service should handle prefix reconstruction more robustly.
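The truncation described above can be reproduced with a small model of a newest-first query. This is only a sketch: turns are modelled as plain integers, `query_newest_first` and `seek_window` are hypothetical helpers, and the `truncated` flag assumes turn indices are contiguous and start at 1.

```python
def query_newest_first(turns: list[int], lo: int, hi: int, limit: int) -> list[int]:
    """Model of a repository query that returns matching turns newest-first."""
    matching = [t for t in turns if lo <= t <= hi]
    return sorted(matching, reverse=True)[:limit]


def seek_window(
    turns: list[int], turn_index: int, max_frames: int
) -> tuple[list[int], bool]:
    """Return the frames for turns 1..turn_index plus a truncation flag.

    The flag is set when the oldest returned turn is not turn 1, which
    means the limit cut off the beginning of the run.
    """
    frames = sorted(query_newest_first(turns, 1, turn_index, max_frames))
    truncated = bool(frames) and frames[0] > 1
    return frames, truncated
```

For a 1500-turn run and a cap of 1000, the window starts at turn 501, so a caller that checks the flag can tell the prefix is incomplete instead of treating the result as a full reconstruction.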
    recorded = 0
    for frame in frames:
        try:
            await self._repository.append(frame)
            recorded += 1
        except Exception as exc:
            logger.warning(
                FLIGHT_RECORDER_RECORD_FAILED,
                execution_id=frame.execution_id,
                turn_index=frame.turn_index,
                error_type=type(exc).__name__,
                error=safe_error_description(exc),
            )
The record_frames method is inefficient as it performs individual inserts in a loop. In both the Postgres and SQLite repository implementations, each call to append opens a new connection and transaction. For agent runs with many turns, this adds significant overhead to the finalization phase. It is recommended to add a batch insertion method (e.g., append_many) to the FlightRecorderFrameRepository protocol and use it here to persist all frames in a single transaction.
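A batch insert of the kind recommended above might look as follows with the standard-library `sqlite3` module. This is a sketch, not the repository's code: the `frames` table, its columns and the `append_many` signature are assumptions for illustration.

```python
import sqlite3
from collections.abc import Sequence

# Hypothetical frame row: (execution_id, turn_index, payload).
FrameRow = tuple[str, int, str]


def make_frames_db() -> sqlite3.Connection:
    """Create an in-memory database with an assumed frames table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE frames ("
        "execution_id TEXT NOT NULL, "
        "turn_index INTEGER NOT NULL, "
        "payload TEXT NOT NULL)"
    )
    return conn


def append_many(conn: sqlite3.Connection, frames: Sequence[FrameRow]) -> int:
    """Insert all frames in one transaction and return how many were written."""
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "INSERT INTO frames (execution_id, turn_index, payload) VALUES (?, ?, ?)",
            frames,
        )
    return len(frames)


def count_frames(conn: sqlite3.Connection) -> int:
    """Count the stored frames."""
    return conn.execute("SELECT COUNT(*) FROM frames").fetchone()[0]
```

One trade-off is worth noting: a single transaction fails or succeeds as a whole, so the per-frame fail-soft behaviour of the loop above would have to be reconsidered if batching is adopted.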
Actionable comments posted: 19
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/api/controllers/cockpit.py`:
- Around line 110-112: Add input validation at the controller boundary in
cockpit.py for the parameters limit, offset, and turn_index: ensure limit is a
positive integer and capped (use DEFAULT_PAGE_SIZE as the max), offset is >= 0,
and turn_index (when present) is >= 0 (and within any domain-specific max if
applicable). If a value is out of bounds, return a 4xx immediately (e.g., raise
fastapi.HTTPException(status_code=400) or return an ApiResponse error) instead
of passing invalid values into repository queries; add these guards in the
controller function that returns ApiResponse[FlightRecorderFramesResponse]
(validate the parameters named limit, offset, turn_index) and apply the same
checks to the similar controller block around lines 127-133.
- Around line 222-227: The steering text is operator-provided and must be
wrapped as untrusted content before being passed to SteeringDirective.steer;
import and call wrap_untrusted from engine.prompt_safety on data.text and pass
the wrapped value in details (e.g., details={"text": wrap_untrusted(data.text)})
when invoking app_state.steering_directive.steer to satisfy SEC-1 requirements.
In `@src/synthorg/api/state.py`:
- Around line 1592-1600: The log always emits transition="attached" in
set_cockpit_services even when no-op or replacement occurs; update
set_cockpit_services to compute the correct transition by comparing existing
self._cockpit_service (and related self._flight_recorder_service /
self._steering_directive) to the incoming
cockpit_service/flight_recorder_service/steering_directive: use "noop" when
nothing changed, "attached" when there was no prior service, and "replaced" when
an existing service is being swapped, then pass that transition value into the
logger.info(API_APP_STARTUP, service="cockpit_services", transition=transition)
call.
In `@src/synthorg/engine/agent_engine.py`:
- Around line 681-689: Wrap the frame construction and recording block in a
defensive try/except inside the _execute method so that build_frames(...) and
await self._flight_recorder_sink.record_frames(frames) cannot turn a successful
run into a failure; specifically, call build_frames and record_frames inside a
try, re-raise MemoryError and RecursionError unchanged, and for any other
Exception catch it, log the exception (using the existing logger in this class)
and return/exit the method early to preserve the existing fail-soft post-run
behavior.
In `@src/synthorg/engine/cockpit/service.py`:
- Around line 129-136: The loop is awaiting _build_activity sequentially for
each task, causing latency; refactor to fan-out/fan-in using asyncio.TaskGroup:
for each status from _ACTIVE_STATUSES call
self._task_engine.list_tasks(status=status) as before but then create an
asyncio.TaskGroup, spawn a task for each task that awaits
self._build_activity(task, stuck_cutoff, runaway_pct) (referencing
_build_activity and _task_engine) and collect results (gather successful
activities), then extend activities with the completed results; ensure you
handle task exceptions per the project's policy (cancel/propagate or log) inside
the TaskGroup and preserve original ordering if required.
In `@src/synthorg/engine/flight_recording/service.py`:
- Around line 88-114: The seek() implementation in service.py quietly truncates
results when turn_index > _MAX_SEEK_FRAMES by limiting the repository query to
_MAX_SEEK_FRAMES and still returning a normal ReplaySeekView, so change seek()
to detect when the requested turn_index exceeds the cap (compare turn_index to
_MAX_SEEK_FRAMES before/after the query) and fail fast or mark truncation:
either raise a clear exception (e.g., ValueError/CustomException) indicating the
seek limit was exceeded, or add a truncation flag on the ReplaySeekView (e.g.,
truncated=True) and populate it so callers can detect partial reconstruction;
update all usages of ReplaySeekView accordingly and ensure logging
(FLIGHT_RECORDER_SEEK) records the truncation or the error.
In `@src/synthorg/engine/flight_recording/sink.py`:
- Around line 75-82: In PersistenceFlightRecorderSink.record_frames,
special-case and re-raise system errors before the broad except Exception block:
add an early except (MemoryError, RecursionError) as sys_exc: raise sys_exc to
ensure these are not swallowed, then proceed with the existing except Exception
as exc: logger.warning(...) handling; reference the method name record_frames
and the existing logger.warning/FLIGHT_RECORDER_RECORD_FAILED call to locate
where to insert the re-raise.
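The fail-soft pattern with system errors re-raised, as requested above, can be sketched like this. The `append` callable and the use of the standard `logging` module are assumptions for illustration; the real sink logs a structured event instead. The parenthesised `except` form is used so that the sketch also runs on interpreters older than Python 3.14.

```python
import asyncio
import logging
from collections.abc import Awaitable, Callable, Iterable

logger = logging.getLogger("flight_recorder_sketch")


async def record_frames(
    append: Callable[[object], Awaitable[None]], frames: Iterable[object]
) -> int:
    """Persist frames one by one, skipping failures but never hiding system errors."""
    recorded = 0
    for frame in frames:
        try:
            await append(frame)
            recorded += 1
        except (MemoryError, RecursionError):
            # System-level failures must propagate unchanged.
            raise
        except Exception as exc:
            # Any other failure is logged and the remaining frames still run.
            logger.warning("frame record failed: %s", type(exc).__name__)
    return recorded


async def flaky_append(frame: object) -> None:
    """Stand-in repository call that fails for one specific frame."""
    if frame == 2:
        raise ValueError("boom")


async def exhausted_append(frame: object) -> None:
    """Stand-in repository call that simulates memory exhaustion."""
    raise MemoryError
```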
In `@src/synthorg/engine/intervention/steering.py`:
- Around line 106-121: The operator-provided text is being stored directly into
the Interrupt.question (and similar fields) without the SEC-1 wrapper; import
and call wrap_untrusted from engine.prompt_safety on the raw text before
creating the Interrupt in steer/steering.py (the block that constructs Interrupt
with id, type, session_id, agent_id, question, context_snippet). Replace usages
of the raw text (e.g., the value passed to question and any other untrusted
fields like context_snippet) with the wrapped result (then feed that wrapped
string into NotBlankStr) so all untrusted content is SEC-1 wrapped prior to
queueing.
In `@src/synthorg/persistence/flight_recorder_protocol.py`:
- Around line 139-140: Update purge_before to require a timezone-aware datetime
and normalize it to UTC before using: import normalize_utc (and/or parse_iso_utc
if you decide to accept strings) from persistence._shared, change the
contract/docstring of purge_before to state the threshold must be timezone-aware
(ISO UTC), and call normalize_utc(threshold) at the start of the method (or
parse_iso_utc for string inputs) so naive datetimes are rejected/normalized
consistently before any backend purge logic; keep the return type int and raise
a clear ValueError if a naive datetime is passed.
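The timezone contract described above can be illustrated with a stand-alone helper. `require_aware_utc` is a hypothetical stand-in written for this sketch. It is not the project's `normalize_utc`, whose exact behaviour is not shown here.

```python
from datetime import datetime, timedelta, timezone


def require_aware_utc(threshold: datetime) -> datetime:
    """Reject naive datetimes and convert aware ones to UTC."""
    if threshold.tzinfo is None or threshold.utcoffset() is None:
        raise ValueError("threshold must be a timezone-aware datetime")
    return threshold.astimezone(timezone.utc)


def is_naive_rejected(value: datetime) -> bool:
    """Helper for tests: True when the value is refused as naive."""
    try:
        require_aware_utc(value)
    except ValueError:
        return True
    return False


# 12:00 at UTC+2 is 10:00 UTC.
PLUS_TWO = timezone(timedelta(hours=2))
SAMPLE = datetime(2026, 5, 22, 12, 0, tzinfo=PLUS_TWO)
```

Calling such a helper first in `purge_before` would make every backend compare against the same instant regardless of the caller's local offset.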
In `@src/synthorg/persistence/sqlite/revisions/20260522000002_flight_recorder.sql`:
- Around line 20-21: The index idx_frf_execution_turn on flight_recorder_frames
is non-unique, allowing duplicate (execution_id, turn_index) rows; change this
index to a UNIQUE index on columns execution_id and turn_index in the migration
that creates idx_frf_execution_turn (so seek/replay is deterministic) and mirror
the same UNIQUE constraint/update in schema.sql so drift checks remain green.
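The effect of making the index unique can be checked with an in-memory SQLite database. The table below is reduced to the two indexed columns, which is an assumption for brevity, and `try_insert` is a hypothetical helper.

```python
import sqlite3


def make_unique_db() -> sqlite3.Connection:
    """Create a reduced frames table with a UNIQUE (execution_id, turn_index) index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flight_recorder_frames ("
        "execution_id TEXT NOT NULL, "
        "turn_index INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX idx_frf_execution_turn "
        "ON flight_recorder_frames (execution_id, turn_index)"
    )
    return conn


def try_insert(conn: sqlite3.Connection, execution_id: str, turn_index: int) -> bool:
    """Insert one frame key; return False when the unique index rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO flight_recorder_frames VALUES (?, ?)",
                (execution_id, turn_index),
            )
    except sqlite3.IntegrityError:
        return False
    return True
```

With the unique index in place, a second frame for the same execution and turn is refused, so a seek can never encounter two candidates for one position.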
In `@tests/unit/engine/intervention/test_steering.py`:
- Around line 36-48: The test test_redirect_also_queues_interrupt currently only
asserts outcome.applied and outcome.artifact_id but doesn't verify persistence;
update the test to read from the InterruptStore (the same InterruptStore
instance created at the top) after calling SafeDefaultSteeringDirective.steer
and assert that a redirect interrupt exists for execution_id "exec-1" (e.g.,
verify store contains an interrupt with kind InterventionKind.REDIRECT and
matching execution_id/agent_id/details), ensuring the directive actually wrote
the interrupt to the store.
In `@tests/unit/persistence/test_protocol.py`:
- Around line 438-453: TestProtocolCompliance is missing an assertion that the
new flight_recorder_frames accessor conforms to the flight-recorder repository
protocol: add a protocol-conformance check similar to the other repository
accessors by creating or using _FakeFlightRecorderRepository and asserting that
the subject.flight_recorder_frames implements the same protocol used elsewhere
in TestProtocolCompliance (mirror assertions for other accessors so drift is
detected); reference _FakeFlightRecorderRepository, flight_recorder_frames, and
TestProtocolCompliance to locate and add this check.
In `@web/src/api/endpoints/cockpit.ts`:
- Around line 20-27: getFlightRecorderFrames currently accepts offset-based
params ({ limit, offset }) which must be replaced with opaque cursor pagination
and return/consume PaginationMeta; update the function signature
getFlightRecorderFrames(executionId: string, params?: { cursor?: string; limit?:
number }) and ensure the API call sends { params } with cursor and limit instead
of offset, and that the returned FlightRecorderFramesResponse (and any
ApiResponse wrapper handling) uses PaginationMeta for paging metadata so the
client/store flow gets the opaque cursor and next/previous info consistently.
In `@web/src/components/ui/timeline.tsx`:
- Around line 38-52: The keyboard handler onKeyDown can call onSeek with indexes
that may be out-of-range if currentIndex is stale after frames shrink; update
every key branch in onKeyDown (ArrowRight, ArrowLeft, Home, End) to clamp the
computed target into the valid range [0, lastIndex] before calling onSeek (e.g.,
compute target then set target = Math.max(0, Math.min(lastIndex, target))). This
change should be applied inside the onKeyDown function to ensure all key paths
use the same clamping logic and avoid invalid seeks.
In `@web/src/hooks/useMissionControlData.ts`:
- Around line 58-66: The WS handlers in the bindings created inside useMemo call
fetchSnapshot() on every event which can cause overlapping requests; change the
handler (the object built for COCKPIT_CHANNELS) to coalesce bursts by tracking a
shared timer/ref (e.g., pendingFetchTimerRef or lastWsUpdateAtRef) and
scheduling a single delayed call to
useMissionControlStore.getState().fetchSnapshot() within a short debounce window
(e.g., 50–200ms), clearing any previous timer when new events arrive so only one
fetch is triggered per burst; keep the lastWsUpdateAtRef update but ensure
fetchSnapshot is invoked from the debounced/queued function rather than directly
in the handler.
In `@web/src/router/index.tsx`:
- Line 170: Replace the hardcoded route string in the routes registration with
the route constant to prevent drift: in the routes array entry that currently
uses path: 'mission-control' (component: MissionControlPage), change it to use
path: ROUTES.MISSION_CONTROL so the route registration derives from the shared
ROUTES constant used elsewhere.
In `@web/src/stores/mission-control.ts`:
- Around line 78-85: The fetchFrames handler leaves previous frames in state if
loading a new execution fails; update fetchFrames so when initiating a load you
also clear stale frames (e.g., set frames to null or an empty array) along with
framesLoading, framesError, and framesExecutionId, and ensure on error you do
not retain old frames (keep frames cleared) while on success you set frames =
response.frames; refer to the fetchFrames function and the state keys frames,
framesLoading, framesError, and framesExecutionId.
In `@web/src/utils/constants.ts`:
- Line 188: The new display label "cockpit" was added but the NAMESPACE_ORDER
constant still omits 'cockpit', preventing it from being surfaced; update the
NAMESPACE_ORDER array to include 'cockpit' in the desired ordering so it appears
in Settings (locate the NAMESPACE_ORDER symbol in web/src/utils/constants.ts and
insert 'cockpit' at the appropriate position consistent with other namespace
names and the file's ordering contract).
In `@web/src/utils/status-color.ts`:
- Around line 8-23: The STATUS_BG map and statusBgClass currently use plain
string types allowing typos; replace them to use the generated TaskStatus union:
change STATUS_BG's type to Partial<Record<TaskStatus, string>> (or
Record<TaskStatus, string> if you can authoritatively cover every status) and
change statusBgClass to accept status: TaskStatus (i.e., export function
statusBgClass(status: TaskStatus): string). Keep the fallback lookup
(STATUS_BG[status] ?? 'bg-text-secondary') unchanged; import or reference the
generated TaskStatus type where STATUS_BG and statusBgClass are declared.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 82a8a440-145f-4c20-a08b-aecb3d30f9cc
📒 Files selected for processing (77)
- scripts/_ghost_wiring_manifest.txt
- src/synthorg/api/app.py
- src/synthorg/api/channels.py
- src/synthorg/api/controllers/__init__.py
- src/synthorg/api/controllers/cockpit.py
- src/synthorg/api/state.py
- src/synthorg/core/enums.py
- src/synthorg/engine/agent_engine.py
- src/synthorg/engine/cockpit/__init__.py
- src/synthorg/engine/cockpit/service.py
- src/synthorg/engine/flight_recording/__init__.py
- src/synthorg/engine/flight_recording/service.py
- src/synthorg/engine/flight_recording/sink.py
- src/synthorg/engine/intervention/__init__.py
- src/synthorg/engine/intervention/steering.py
- src/synthorg/meta/mcp/domains/__init__.py
- src/synthorg/meta/mcp/domains/cockpit.py
- src/synthorg/meta/mcp/handlers/__init__.py
- src/synthorg/meta/mcp/handlers/cockpit.py
- src/synthorg/observability/events/cockpit.py
- src/synthorg/observability/events/persistence.py
- src/synthorg/observability/prometheus_labels.py
- src/synthorg/persistence/flight_recorder_protocol.py
- src/synthorg/persistence/postgres/backend.py
- src/synthorg/persistence/postgres/flight_recorder_repo.py
- src/synthorg/persistence/postgres/revisions/20260522000002_flight_recorder.sql
- src/synthorg/persistence/postgres/schema.sql
- src/synthorg/persistence/protocol.py
- src/synthorg/persistence/sqlite/__init__.py
- src/synthorg/persistence/sqlite/_backend_accessors.py
- src/synthorg/persistence/sqlite/backend.py
- src/synthorg/persistence/sqlite/flight_recorder_repo.py
- src/synthorg/persistence/sqlite/revisions/20260522000002_flight_recorder.sql
- src/synthorg/persistence/sqlite/schema.sql
- src/synthorg/settings/definitions/__init__.py
- src/synthorg/settings/definitions/cockpit.py
- src/synthorg/settings/enums.py
- src/synthorg/workers/runtime_builder.py
- tests/conformance/persistence/test_flight_recorder_repository.py
- tests/e2e/test_cockpit_mission_control_e2e.py
- tests/unit/api/controllers/test_cockpit.py
- tests/unit/api/fakes.py
- tests/unit/api/fakes_backend.py
- tests/unit/api/test_channels.py
- tests/unit/engine/cockpit/test_cockpit_service.py
- tests/unit/engine/flight_recording/test_engine_recording.py
- tests/unit/engine/flight_recording/test_flight_recorder_service.py
- tests/unit/engine/flight_recording/test_sink.py
- tests/unit/engine/intervention/test_steering.py
- tests/unit/meta/mcp/test_all_handlers_wired.py
- tests/unit/observability/audit_chain/test_audit_chain.py
- tests/unit/observability/test_events.py
- tests/unit/persistence/test_protocol.py
- web/src/__tests__/components/ui/timeline.test.tsx
- web/src/__tests__/pages/MissionControlPage.test.tsx
- web/src/__tests__/stores/mission-control.test.ts
- web/src/api/endpoints/cockpit.ts
- web/src/api/types/dtos.gen.ts
- web/src/api/types/enum-values.gen.ts
- web/src/api/types/openapi.gen.ts
- web/src/api/types/websocket.ts
- web/src/components/layout/Sidebar.tsx
- web/src/components/ui/timeline.stories.tsx
- web/src/components/ui/timeline.tsx
- web/src/hooks/useMissionControlData.ts
- web/src/mocks/handlers/cockpit.ts
- web/src/mocks/handlers/index.ts
- web/src/pages/MissionControlPage.tsx
- web/src/pages/mission-control/FlightRecorder.tsx
- web/src/pages/mission-control/LiveCockpit.tsx
- web/src/pages/settings/utils.ts
- web/src/router/index.tsx
- web/src/router/route-titles.ts
- web/src/router/routes.ts
- web/src/stores/mission-control.ts
- web/src/utils/constants.ts
- web/src/utils/status-color.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Deploy Preview
- GitHub Check: Build Backend
- GitHub Check: Build Web Assets (melange)
- GitHub Check: CodSpeed Web benchmarks
- GitHub Check: CodSpeed Python benchmarks
- GitHub Check: Test Unit
- GitHub Check: Dashboard Test
- GitHub Check: Test Integration
- GitHub Check: Test Conformance (SQLite)
- GitHub Check: Test E2E
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (21)
web/src/**/*.{js,jsx,ts,tsx,mts}
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/**/*.{js,jsx,ts,tsx,mts}`: Always use `createLogger` from `@/lib/logger`; never bare `console.warn`/`console.error`/`console.debug` in application code. Variable name must always be `log`. Only `logger.ts` itself may use bare console methods. Use `log.debug()` (DEV-only, stripped in production), `log.warn()`, `log.error()`.
Pass dynamic/untrusted values as separate args to logger calls (not interpolated into the message string) so they go through `sanitizeArg`
Attacker-controlled fields inside structured objects must be wrapped in `sanitizeForLog()` before embedding in log calls
Error-code constants (MANDATORY): import `ErrorCode` and `ErrorCategory` from `@/api/types/errors` (re-exported from the generated `web/src/api/types/error-codes.gen.ts`). Discriminate on `ErrorCode.<NAME>`, never on raw integer literals.
Use `@eslint-react/web-api-no-leaked-fetch` to detect `fetch()` in effects without `AbortController` cleanup
Files:
- web/src/utils/status-color.ts
- web/src/utils/constants.ts
- web/src/router/index.tsx
- web/src/__tests__/stores/mission-control.test.ts
- web/src/api/endpoints/cockpit.ts
- web/src/__tests__/components/ui/timeline.test.tsx
- web/src/pages/MissionControlPage.tsx
- web/src/components/ui/timeline.tsx
- web/src/router/route-titles.ts
- web/src/api/types/enum-values.gen.ts
- web/src/__tests__/pages/MissionControlPage.test.tsx
- web/src/components/ui/timeline.stories.tsx
- web/src/api/types/websocket.ts
- web/src/pages/settings/utils.ts
- web/src/components/layout/Sidebar.tsx
- web/src/router/routes.ts
- web/src/mocks/handlers/index.ts
- web/src/mocks/handlers/cockpit.ts
- web/src/hooks/useMissionControlData.ts
- web/src/pages/mission-control/LiveCockpit.tsx
- web/src/pages/mission-control/FlightRecorder.tsx
- web/src/stores/mission-control.ts
- web/src/api/types/dtos.gen.ts
- web/src/api/types/openapi.gen.ts
web/src/{components,utils}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
NEVER write `getXIcon(value): LucideIcon` factories called inside JSX bodies. Export a `<XIcon value={...} />` wrapper that does the lookup via `createElement` inside the wrapper body. Wrapper components live in their own file, not alongside utility exports.
Files:
- web/src/utils/status-color.ts
- web/src/utils/constants.ts
- web/src/components/ui/timeline.tsx
- web/src/components/ui/timeline.stories.tsx
- web/src/components/layout/Sidebar.tsx
web/src/**/*.{ts,tsx,mts}
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/**/*.{ts,tsx,mts}`: Use `@typescript-eslint/no-floating-promises` to forbid unawaited promises so async work cannot survive the test that scheduled it and trip the active-handle gate
Use `@typescript-eslint/no-misused-promises` (with `checksVoidReturn: { attributes: false }`) to forbid passing async functions where the callsite ignores the returned promise. React 19 `async` event handlers stay allowed via the `attributes: false` exemption.
Files:
- web/src/utils/status-color.ts
- web/src/utils/constants.ts
- web/src/router/index.tsx
- web/src/__tests__/stores/mission-control.test.ts
- web/src/api/endpoints/cockpit.ts
- web/src/__tests__/components/ui/timeline.test.tsx
- web/src/pages/MissionControlPage.tsx
- web/src/components/ui/timeline.tsx
- web/src/router/route-titles.ts
- web/src/api/types/enum-values.gen.ts
- web/src/__tests__/pages/MissionControlPage.test.tsx
- web/src/components/ui/timeline.stories.tsx
- web/src/api/types/websocket.ts
- web/src/pages/settings/utils.ts
- web/src/components/layout/Sidebar.tsx
- web/src/router/routes.ts
- web/src/mocks/handlers/index.ts
- web/src/mocks/handlers/cockpit.ts
- web/src/hooks/useMissionControlData.ts
- web/src/pages/mission-control/LiveCockpit.tsx
- web/src/pages/mission-control/FlightRecorder.tsx
- web/src/stores/mission-control.ts
- web/src/api/types/dtos.gen.ts
- web/src/api/types/openapi.gen.ts
**/*.{py,ts,tsx,jsx,md}
📄 CodeRabbit inference engine (CLAUDE.md)
No region/currency/locale privileged; use metric units; British English per docs/reference/regional-defaults.md
Files:
- web/src/utils/status-color.ts
- src/synthorg/engine/intervention/__init__.py
- tests/unit/engine/flight_recording/test_engine_recording.py
- web/src/utils/constants.ts
- web/src/router/index.tsx
- web/src/__tests__/stores/mission-control.test.ts
- tests/unit/observability/test_events.py
- web/src/api/endpoints/cockpit.ts
- web/src/__tests__/components/ui/timeline.test.tsx
- src/synthorg/engine/flight_recording/__init__.py
- web/src/pages/MissionControlPage.tsx
- src/synthorg/engine/intervention/steering.py
- web/src/components/ui/timeline.tsx
- src/synthorg/persistence/sqlite/__init__.py
- web/src/router/route-titles.ts
- src/synthorg/persistence/sqlite/backend.py
- web/src/api/types/enum-values.gen.ts
- web/src/__tests__/pages/MissionControlPage.test.tsx
- web/src/components/ui/timeline.stories.tsx
- web/src/api/types/websocket.ts
- src/synthorg/observability/prometheus_labels.py
- src/synthorg/engine/cockpit/__init__.py
- src/synthorg/api/controllers/__init__.py
- web/src/pages/settings/utils.ts
- src/synthorg/api/channels.py
- web/src/components/layout/Sidebar.tsx
- web/src/router/routes.ts
- tests/unit/observability/audit_chain/test_audit_chain.py
- src/synthorg/settings/enums.py
- web/src/mocks/handlers/index.ts
- src/synthorg/meta/mcp/domains/__init__.py
- src/synthorg/api/app.py
- tests/unit/api/test_channels.py
- web/src/mocks/handlers/cockpit.ts
- web/src/hooks/useMissionControlData.ts
- src/synthorg/observability/events/persistence.py
- src/synthorg/core/enums.py
- src/synthorg/meta/mcp/handlers/__init__.py
- src/synthorg/meta/mcp/domains/cockpit.py
- tests/unit/persistence/test_protocol.py
- tests/unit/meta/mcp/test_all_handlers_wired.py
- tests/unit/engine/intervention/test_steering.py
- web/src/pages/mission-control/LiveCockpit.tsx
- src/synthorg/settings/definitions/__init__.py
- tests/e2e/test_cockpit_mission_control_e2e.py
- src/synthorg/observability/events/cockpit.py
- tests/conformance/persistence/test_flight_recorder_repository.py
- src/synthorg/engine/flight_recording/service.py
- src/synthorg/api/state.py
- src/synthorg/persistence/postgres/backend.py
- src/synthorg/persistence/sqlite/flight_recorder_repo.py
- src/synthorg/engine/flight_recording/sink.py
- tests/unit/api/controllers/test_cockpit.py
- src/synthorg/persistence/protocol.py
- src/synthorg/engine/cockpit/service.py
- src/synthorg/engine/agent_engine.py
- tests/unit/engine/cockpit/test_cockpit_service.py
- tests/unit/api/fakes_backend.py
- src/synthorg/persistence/sqlite/_backend_accessors.py
- src/synthorg/workers/runtime_builder.py
- src/synthorg/persistence/flight_recorder_protocol.py
- src/synthorg/settings/definitions/cockpit.py
- tests/unit/api/fakes.py
- tests/unit/engine/flight_recording/test_sink.py
- src/synthorg/persistence/postgres/flight_recorder_repo.py
- web/src/pages/mission-control/FlightRecorder.tsx
- tests/unit/engine/flight_recording/test_flight_recorder_service.py
- web/src/stores/mission-control.ts
- src/synthorg/api/controllers/cockpit.py
- src/synthorg/meta/mcp/handlers/cockpit.py
- web/src/api/types/dtos.gen.ts
- web/src/api/types/openapi.gen.ts
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/**/*.py: Only `src/synthorg/persistence/` may import sqlite/psycopg or emit raw SQL; new repository protocols inherit from generic categories in `persistence/_generics.py`; bespoke methods permitted only under ADR-0001 D7
Configuration Precedence: DB > env > code default via `SettingsService`/`ConfigResolver` (Cat-1) or env > code default (Cat-2, `read_only_post_init`); Cat-3 bootstrap secrets pure env; YAML is ingestion format only, not precedence tier; no `os.environ.get` outside startup
No hardcoded numeric values; numerics live in `settings/definitions/`; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (`NAME: int|float|Final`); enforced by `scripts/check_no_magic_numbers.py`
Comments document WHY only; no reviewer citations, issue back-refs, or migration framing; enforced by `check_no_review_origin_in_code.py` + `check_no_migration_framing.py`
No `from __future__ import annotations` (Python 3.14 has PEP 649); use PEP 758 except: `except A, B:` no parens unless binding
Type hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Errors follow `<Domain><Condition>Error` pattern from `DomainError`; never inherit `Exception`/`RuntimeError` directly; enforced by `check_domain_error_hierarchy.py`
Pydantic v2 frozen + `extra="forbid"` on every frozen model project-wide; gate `check_frozen_model_extra_forbid.py`; `@computed_field` auto-exempt; per-line `# lint-allow: frozen-extra-forbid -- <reason>` for `extra="allow"`/`"ignore"` boundaries; use `@computed_field` for derived; use `NotBlankStr` for identifiers
Args models at every system boundary; `parse_typed()` for every external dict ingestion; enforced by `check_boundary_typed.py`
Immutability: use `model_copy(update=...)` or `copy.deepcopy()`; deepcopy at system boundaries
Async: use `asyncio.TaskGroup` for fan-out/fan-in; helpers catch `Exception` (re-raise `MemoryError`/`RecursionError...
Files:
src/synthorg/engine/intervention/__init__.py
src/synthorg/engine/flight_recording/__init__.py
src/synthorg/engine/intervention/steering.py
src/synthorg/persistence/sqlite/__init__.py
src/synthorg/persistence/sqlite/backend.py
src/synthorg/observability/prometheus_labels.py
src/synthorg/engine/cockpit/__init__.py
src/synthorg/api/controllers/__init__.py
src/synthorg/api/channels.py
src/synthorg/settings/enums.py
src/synthorg/meta/mcp/domains/__init__.py
src/synthorg/api/app.py
src/synthorg/observability/events/persistence.py
src/synthorg/core/enums.py
src/synthorg/meta/mcp/handlers/__init__.py
src/synthorg/meta/mcp/domains/cockpit.py
src/synthorg/settings/definitions/__init__.py
src/synthorg/observability/events/cockpit.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/api/state.py
src/synthorg/persistence/postgres/backend.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/persistence/protocol.py
src/synthorg/engine/cockpit/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/persistence/sqlite/_backend_accessors.py
src/synthorg/workers/runtime_builder.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/settings/definitions/cockpit.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/meta/mcp/handlers/cockpit.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/engine/intervention/__init__.py
src/synthorg/engine/flight_recording/__init__.py
src/synthorg/engine/intervention/steering.py
src/synthorg/persistence/sqlite/__init__.py
src/synthorg/persistence/sqlite/backend.py
src/synthorg/observability/prometheus_labels.py
src/synthorg/engine/cockpit/__init__.py
src/synthorg/api/controllers/__init__.py
src/synthorg/api/channels.py
src/synthorg/settings/enums.py
src/synthorg/meta/mcp/domains/__init__.py
src/synthorg/api/app.py
src/synthorg/observability/events/persistence.py
src/synthorg/core/enums.py
src/synthorg/meta/mcp/handlers/__init__.py
src/synthorg/meta/mcp/domains/cockpit.py
src/synthorg/settings/definitions/__init__.py
src/synthorg/observability/events/cockpit.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/api/state.py
src/synthorg/persistence/postgres/backend.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/persistence/protocol.py
src/synthorg/engine/cockpit/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/persistence/sqlite/_backend_accessors.py
src/synthorg/workers/runtime_builder.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/settings/definitions/cockpit.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/meta/mcp/handlers/cockpit.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
tests/**/*.py: Timeout/slow failures = source-code regression; never edit `tests/baselines/unit_timing.json` or any `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`; both families PreToolUse-blocked; per-invocation bypass requires explicit approval (`ALLOW_BASELINE_GROWTH=1 git commit`)
Test markers: `@pytest.mark.{unit,integration,e2e,slow}`; async auto; timeout 30s global; coverage 80% min
xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts`; Windows unit tests use `WindowsSelectorEventLoopPolicy`; subprocess tests override back
Test doubles: `FakeClock` for Clock seam, `mock_of[T](**overrides)` for typed-boundary substitutions, `SimpleNamespace` for attribute-bags; bare `MagicMock` at typed boundary blocked by `scripts/check_mock_spec.py`
Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add `@example(...)`); never skip/xfail flaky tests; fix fundamentally
Files:
tests/unit/engine/flight_recording/test_engine_recording.py
tests/unit/observability/test_events.py
tests/unit/observability/audit_chain/test_audit_chain.py
tests/unit/api/test_channels.py
tests/unit/persistence/test_protocol.py
tests/unit/meta/mcp/test_all_handlers_wired.py
tests/unit/engine/intervention/test_steering.py
tests/e2e/test_cockpit_mission_control_e2e.py
tests/conformance/persistence/test_flight_recorder_repository.py
tests/unit/api/controllers/test_cockpit.py
tests/unit/engine/cockpit/test_cockpit_service.py
tests/unit/api/fakes_backend.py
tests/unit/api/fakes.py
tests/unit/engine/flight_recording/test_sink.py
tests/unit/engine/flight_recording/test_flight_recorder_service.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.
Files:
tests/unit/engine/flight_recording/test_engine_recording.py
tests/unit/observability/test_events.py
tests/unit/observability/audit_chain/test_audit_chain.py
tests/unit/api/test_channels.py
tests/unit/persistence/test_protocol.py
tests/unit/meta/mcp/test_all_handlers_wired.py
tests/unit/engine/intervention/test_steering.py
tests/e2e/test_cockpit_mission_control_e2e.py
tests/conformance/persistence/test_flight_recorder_repository.py
tests/unit/api/controllers/test_cockpit.py
tests/unit/engine/cockpit/test_cockpit_service.py
tests/unit/api/fakes_backend.py
tests/unit/api/fakes.py
tests/unit/engine/flight_recording/test_sink.py
tests/unit/engine/flight_recording/test_flight_recorder_service.py
web/src/utils/constants.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
WS wire protocol (MANDATORY): the client-server contract lives in `web/src/utils/constants.ts` (`WS_PROTOCOL_VERSION`, `WS_MAX_MESSAGE_SIZE`, `WS_HEARTBEAT_INTERVAL_MS`, `WS_PONG_TIMEOUT_MS`, `LOG_SANITIZE_MAX_LENGTH`) and MUST stay in lockstep with `src/synthorg/api/ws_models.py` / `src/synthorg/api/controllers/ws.py`. Bump the protocol version on both sides together for breaking payload changes.
Files:
web/src/utils/constants.ts
web/src/**/*.{jsx,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{jsx,tsx}: Use `@eslint-react/no-leaked-conditional-rendering` to catch the `{count && <Foo />}` bug where `0` renders verbatim. For `ReactNode | undefined` props use `{value != null && value !== false && <jsx>}`; for compound truthiness use `Boolean(...)`.
Use `@eslint-react/globals` to restrict `window` / `document` / `localStorage` / etc. inside render. Hoist offenders into a `useCallback` event handler, a `useEffect`, or a `useSyncExternalStore`-backed hook.
Reuse `web/src/components/ui/` design tokens in Web Dashboard Design System; detail in `web/CLAUDE.md`
Files:
web/src/router/index.tsx
web/src/__tests__/components/ui/timeline.test.tsx
web/src/pages/MissionControlPage.tsx
web/src/components/ui/timeline.tsx
web/src/__tests__/pages/MissionControlPage.test.tsx
web/src/components/ui/timeline.stories.tsx
web/src/components/layout/Sidebar.tsx
web/src/pages/mission-control/LiveCockpit.tsx
web/src/pages/mission-control/FlightRecorder.tsx
web/src/{stores,**/*.test.{ts,tsx}}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Active-handle gate (MANDATORY): every unit test runs under `web/test-infra/active-handle-tracker.ts`, which fails any test that leaks an event-loop-holding resource. A new store that schedules timers / attaches listeners MUST expose a teardown hook and register it in the global `afterEach`; otherwise the gate fails the first test that triggers the schedule.
Files:
web/src/__tests__/stores/mission-control.test.ts
web/src/__tests__/components/ui/timeline.test.tsx
web/src/__tests__/pages/MissionControlPage.test.tsx
web/src/{api/endpoints,stores}/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Cursor pagination (MANDATORY): list endpoints must use opaque cursor-based paging via `PaginationMeta`. Stores must keep `nextCursor` + `hasMore` in state (not offset arithmetic) and early-return when `!hasMore || !nextCursor`. Display counts must come from `data.length`.
Files:
web/src/api/endpoints/cockpit.ts
web/src/stores/mission-control.ts
web/src/api/endpoints/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Health / readiness endpoints (MANDATORY): `getLiveness()` is always 200 while the process is alive; `getReadiness()` is 200 healthy / 503 unavailable (binary `'ok' | 'unavailable'` outcome, no tri-state). Any new caller must handle the 503 path explicitly.
Files:
web/src/api/endpoints/cockpit.ts
web/src/components/ui/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/components/ui/**/*.{ts,tsx}: ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones. NEVER hardcode hex colors, font-family declarations, pixel spacing, Motion transition durations, BCP 47 locale literals (`'en-US'`), or currency symbols / codes; use design tokens, `@/lib/motion` presets, the helpers in `@/utils/format`, and `DEFAULT_CURRENCY` from `@/utils/currencies`.
Every new shared component lives in `web/src/components/ui/` with a sibling `.stories.tsx` covering all states (default, hover, loading, error, empty, disabled where applicable)
Component Props interface name must be `<ComponentName>Props` and must be exported from the same file (e.g. `AgentCardProps` in `agent-card.tsx`). This makes the contract greppable and lets callers extend the props without re-typing the shape.
Base UI primitives must compose Portal + Backdrop + Popup explicitly, use the `render` prop for polymorphism, and rely on animation state attributes (`data-[open]`, `data-[closed]`) rather than the older `data-[state=open]` form.
Base UI primitives are imported directly from `@base-ui/react/<subpath>` and use the native `render` prop for polymorphism; the local `<Slot>` helper is reserved for `<Button asChild>` only
Files:
web/src/components/ui/timeline.tsx
web/src/components/ui/timeline.stories.tsx
web/src/{components,hooks}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
NEVER read `window.innerWidth` / `window.innerHeight` directly in a render body or `useMemo`; use `useViewportSize()` from `@/hooks/useViewportSize` instead
Files:
web/src/components/ui/timeline.tsx
web/src/components/ui/timeline.stories.tsx
web/src/components/layout/Sidebar.tsx
web/src/hooks/useMissionControlData.ts
src/synthorg/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/synthorg/persistence/**/*.py: Repository CRUD: `save(entity)`, `get(id)`, `delete(id) -> bool`, `list_items(...)`, `query(...)` returning tuples
Datetime in persistence: use `parse_iso_utc` / `format_iso_utc` from `persistence._shared` (reject naive); use `normalize_utc` for already-typed
Files:
src/synthorg/persistence/sqlite/__init__.py
src/synthorg/persistence/sqlite/backend.py
src/synthorg/persistence/postgres/backend.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
src/synthorg/persistence/protocol.py
src/synthorg/persistence/sqlite/_backend_accessors.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
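The "reject naive" datetime rule above can be illustrated with a minimal standard-library sketch. The function names below are hypothetical stand-ins and are not the project's `parse_iso_utc` / `format_iso_utc` helpers, whose actual signatures and error types are not shown in this conversation; `ValueError` is likewise an assumption made for brevity.

```python
from datetime import datetime, timezone


def parse_iso_utc_sketch(value: str) -> datetime:
    """Parse an ISO 8601 string, reject naive values, normalise to UTC.

    Hypothetical illustration only; not the project's helper.
    """
    parsed = datetime.fromisoformat(value)
    # A datetime is naive when it carries no usable UTC offset.
    if parsed.tzinfo is None or parsed.utcoffset() is None:
        msg = f"naive datetime rejected: {value!r}"
        raise ValueError(msg)
    return parsed.astimezone(timezone.utc)


def format_iso_utc_sketch(value: datetime) -> str:
    """Serialise an aware datetime as a UTC ISO 8601 string."""
    if value.tzinfo is None or value.utcoffset() is None:
        msg = "naive datetime rejected"
        raise ValueError(msg)
    return value.astimezone(timezone.utc).isoformat()
```

Rejecting naive values at the boundary means a stored timestamp can never be silently interpreted in the host's local timezone.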
web/src/api/types/**/*.gen.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Generated DTO types (MANDATORY): NEVER hand-edit `web/src/api/types/*.gen.ts`. Regenerate with `uv run python scripts/generate_dto_types_ts.py`. Import DTOs via the barrel (`import type { AgentConfig } from '@/api/types'`).
Files:
web/src/api/types/enum-values.gen.ts
web/src/api/types/dtos.gen.ts
web/src/api/types/openapi.gen.ts
web/src/**/*.stories.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
Storybook 10 is ESM-only; essentials are built into core, but `@storybook/addon-docs` is now separate; imports moved to `storybook/test` and `storybook/actions`
Files:
web/src/components/ui/timeline.stories.tsx
web/src/mocks/handlers/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/mocks/handlers/**/*.ts: MSW handlers (MANDATORY): `web/src/mocks/handlers/` must mirror `web/src/api/endpoints/*.ts` 1:1 with a default happy-path handler for every exported endpoint. Use `onUnhandledRequest: 'error'` in test setup; tests override per-case via `server.use(...)`, never `vi.mock('@/api/endpoints/*')`.
Use typed envelope helpers (`successFor`, `paginatedFor`, `voidSuccess`) to keep MSW handlers in lockstep with endpoint return types
Files:
web/src/mocks/handlers/index.ts
web/src/mocks/handlers/cockpit.ts
src/synthorg/meta/mcp/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
MCP: Define `ToolHandler` + `args_model`; call `require_admin_guardrails()` on admin tools; route through service layers per mcp-handler-contract.md
Files:
src/synthorg/meta/mcp/domains/__init__.py
src/synthorg/meta/mcp/handlers/__init__.py
src/synthorg/meta/mcp/domains/cockpit.py
src/synthorg/meta/mcp/handlers/cockpit.py
tests/conformance/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Dual-backend conformance: `tests/conformance/persistence/` consumes `backend` fixture (SQLite + Postgres); enforced by `check_dual_backend_test_parity.py`
Files:
tests/conformance/persistence/test_flight_recorder_repository.py
web/src/stores/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/stores/**/*.ts: List reads (`fetch*`) must set `error: string | null` on the store instead of toasting
Test teardown (MANDATORY): any new store that schedules timers or attaches event listeners must expose an equivalent cleanup hook and register it in the global `afterEach`. The global `afterEach` in `web/src/test-setup.tsx` already calls `useToastStore.getState().dismissAll()`, `cancelPendingPersist()`, and `useThemeStore.getState().teardown()`.
Files:
web/src/stores/mission-control.ts
🧠 Learnings (5)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
src/synthorg/engine/intervention/__init__.py
tests/unit/engine/flight_recording/test_engine_recording.py
tests/unit/observability/test_events.py
src/synthorg/engine/flight_recording/__init__.py
src/synthorg/engine/intervention/steering.py
src/synthorg/persistence/sqlite/__init__.py
src/synthorg/persistence/sqlite/backend.py
src/synthorg/observability/prometheus_labels.py
src/synthorg/engine/cockpit/__init__.py
src/synthorg/api/controllers/__init__.py
src/synthorg/api/channels.py
tests/unit/observability/audit_chain/test_audit_chain.py
src/synthorg/settings/enums.py
src/synthorg/meta/mcp/domains/__init__.py
src/synthorg/api/app.py
tests/unit/api/test_channels.py
src/synthorg/observability/events/persistence.py
src/synthorg/core/enums.py
src/synthorg/meta/mcp/handlers/__init__.py
src/synthorg/meta/mcp/domains/cockpit.py
tests/unit/persistence/test_protocol.py
tests/unit/meta/mcp/test_all_handlers_wired.py
tests/unit/engine/intervention/test_steering.py
src/synthorg/settings/definitions/__init__.py
tests/e2e/test_cockpit_mission_control_e2e.py
src/synthorg/observability/events/cockpit.py
tests/conformance/persistence/test_flight_recorder_repository.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/api/state.py
src/synthorg/persistence/postgres/backend.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
src/synthorg/engine/flight_recording/sink.py
tests/unit/api/controllers/test_cockpit.py
src/synthorg/persistence/protocol.py
src/synthorg/engine/cockpit/service.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/cockpit/test_cockpit_service.py
tests/unit/api/fakes_backend.py
src/synthorg/persistence/sqlite/_backend_accessors.py
src/synthorg/workers/runtime_builder.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/settings/definitions/cockpit.py
tests/unit/api/fakes.py
tests/unit/engine/flight_recording/test_sink.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
tests/unit/engine/flight_recording/test_flight_recorder_service.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/meta/mcp/handlers/cockpit.py
📚 Learning: 2026-05-21T22:55:20.496Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2035
File: src/synthorg/meta/toolsmith/models.py:114-114
Timestamp: 2026-05-21T22:55:20.496Z
Learning: In this repo’s “magic number” review standard, the existing gate in `scripts/check_no_magic_numbers.py` intentionally does NOT flag numeric literals used as raw call-site arguments. So, do not flag numeric literals passed as keyword arguments to Pydantic `Field()` (e.g., `Field(ge=0, le=100)` / `Field(ge=1, le=50)`)—this is an established idiom. Only treat numeric literals as “magic numbers” when they occur in the locations the gate checks (module-level assignments and function/method parameter defaults).
Applied to files:
src/synthorg/engine/intervention/__init__.py
tests/unit/engine/flight_recording/test_engine_recording.py
tests/unit/observability/test_events.py
src/synthorg/engine/flight_recording/__init__.py
src/synthorg/engine/intervention/steering.py
src/synthorg/persistence/sqlite/__init__.py
src/synthorg/persistence/sqlite/backend.py
src/synthorg/observability/prometheus_labels.py
src/synthorg/engine/cockpit/__init__.py
src/synthorg/api/controllers/__init__.py
src/synthorg/api/channels.py
tests/unit/observability/audit_chain/test_audit_chain.py
src/synthorg/settings/enums.py
src/synthorg/meta/mcp/domains/__init__.py
src/synthorg/api/app.py
tests/unit/api/test_channels.py
src/synthorg/observability/events/persistence.py
src/synthorg/core/enums.py
src/synthorg/meta/mcp/handlers/__init__.py
src/synthorg/meta/mcp/domains/cockpit.py
tests/unit/persistence/test_protocol.py
tests/unit/meta/mcp/test_all_handlers_wired.py
tests/unit/engine/intervention/test_steering.py
src/synthorg/settings/definitions/__init__.py
tests/e2e/test_cockpit_mission_control_e2e.py
src/synthorg/observability/events/cockpit.py
tests/conformance/persistence/test_flight_recorder_repository.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/api/state.py
src/synthorg/persistence/postgres/backend.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
src/synthorg/engine/flight_recording/sink.py
tests/unit/api/controllers/test_cockpit.py
src/synthorg/persistence/protocol.py
src/synthorg/engine/cockpit/service.py
src/synthorg/engine/agent_engine.py
tests/unit/engine/cockpit/test_cockpit_service.py
tests/unit/api/fakes_backend.py
src/synthorg/persistence/sqlite/_backend_accessors.py
src/synthorg/workers/runtime_builder.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/settings/definitions/cockpit.py
tests/unit/api/fakes.py
tests/unit/engine/flight_recording/test_sink.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
tests/unit/engine/flight_recording/test_flight_recorder_service.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/meta/mcp/handlers/cockpit.py
📚 Learning: 2026-05-21T22:55:09.289Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2035
File: src/synthorg/meta/toolsmith/config.py:29-30
Timestamp: 2026-05-21T22:55:09.289Z
Learning: For this repo’s Pydantic configuration idiom, do not treat numeric literals passed directly as arguments to `pydantic.Field(...)` as “magic numbers” during review. This includes call-site usages like `Field(default=0.2, ge=0.0, le=1.0)` (e.g., in config models such as `ToolAuthoringConfig`, `ToolValidationConfig`, `ToolsmithConfig`). Do not request extracting those `Field(...)` numeric arguments into named constants, since the repo’s `scripts/check_no_magic_numbers.py` intentionally excludes call-site `Field(...)` numerics and relies on `Field(...)` as the canonical way to express these constraints/defaults.
Applied to files:
src/synthorg/engine/intervention/__init__.py
src/synthorg/engine/flight_recording/__init__.py
src/synthorg/engine/intervention/steering.py
src/synthorg/persistence/sqlite/__init__.py
src/synthorg/persistence/sqlite/backend.py
src/synthorg/observability/prometheus_labels.py
src/synthorg/engine/cockpit/__init__.py
src/synthorg/api/controllers/__init__.py
src/synthorg/api/channels.py
src/synthorg/settings/enums.py
src/synthorg/meta/mcp/domains/__init__.py
src/synthorg/api/app.py
src/synthorg/observability/events/persistence.py
src/synthorg/core/enums.py
src/synthorg/meta/mcp/handlers/__init__.py
src/synthorg/meta/mcp/domains/cockpit.py
src/synthorg/settings/definitions/__init__.py
src/synthorg/observability/events/cockpit.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/api/state.py
src/synthorg/persistence/postgres/backend.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/persistence/protocol.py
src/synthorg/engine/cockpit/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/persistence/sqlite/_backend_accessors.py
src/synthorg/workers/runtime_builder.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/settings/definitions/cockpit.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/meta/mcp/handlers/cockpit.py
📚 Learning: 2026-05-17T11:45:11.839Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1952
File: src/synthorg/settings/definitions/api.py:594-638
Timestamp: 2026-05-17T11:45:11.839Z
Learning: In SynthOrg (Aureliolo/synthorg) pre-alpha, apply the strict no-backward-compat policy: any setting-key rename must be fully completed in the same change/PR with all repo callers updated, and you should not keep legacy aliases or compatibility fallbacks. When reviewing, do not flag a setting-key rename as a breaking upgrade hazard if the rename is repo-wide and fully implemented within the same PR.
Applied to files:
src/synthorg/settings/enums.py
src/synthorg/settings/definitions/__init__.py
src/synthorg/settings/definitions/cockpit.py
📚 Learning: 2026-05-17T11:45:11.839Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1952
File: src/synthorg/settings/definitions/api.py:594-638
Timestamp: 2026-05-17T11:45:11.839Z
Learning: In this repository, SynthOrg is pre-alpha and uses a strict no-backward-compat policy for setting-key renames. When reviewing code under src/synthorg/settings, do NOT flag a setting-key rename as an “upgrade-safety” issue if the rename is complete/atomic in the same PR: all callers/usages of the old key are updated simultaneously, and the PR does not keep any legacy aliases, compatibility fallbacks, or migration/rollback paths for the old key.
Applied to files:
src/synthorg/settings/enums.py
src/synthorg/settings/definitions/__init__.py
src/synthorg/settings/definitions/cockpit.py
🪛 OpenGrep (1.21.0)
src/synthorg/persistence/sqlite/flight_recorder_repo.py
[ERROR] 68-76: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.
(coderabbit.sql-injection.python-fstring-execute)
src/synthorg/persistence/postgres/flight_recorder_repo.py
[ERROR] 61-70: SQL query built via f-string passed to execute()/executemany(). Use parameterized queries with placeholders instead.
(coderabbit.sql-injection.python-fstring-execute)
🔇 Additional comments (61)
scripts/_ghost_wiring_manifest.txt (1)
106-109: LGTM!
src/synthorg/core/enums.py (1)
1000-1013: LGTM!
src/synthorg/persistence/flight_recorder_protocol.py (1)
1-137: LGTM!
src/synthorg/persistence/protocol.py (1)
75-77: LGTM!
Also applies to: 404-407
src/synthorg/settings/enums.py (1)
38-38: LGTM!
src/synthorg/settings/definitions/__init__.py (1)
13-13: LGTM!
Also applies to: 42-42
src/synthorg/settings/definitions/cockpit.py (1)
1-155: LGTM!
src/synthorg/observability/prometheus_labels.py (1)
289-289: LGTM!
src/synthorg/engine/intervention/__init__.py (1)
1-15: LGTM!
src/synthorg/observability/events/cockpit.py (1)
1-21: LGTM!
src/synthorg/observability/events/persistence.py (1)
62-75: LGTM!
src/synthorg/persistence/postgres/backend.py (1)
74-76: LGTM!
Also applies to: 231-233, 338-338, 407-407, 479-479, 695-701
src/synthorg/persistence/postgres/flight_recorder_repo.py (1)
1-200: LGTM!
src/synthorg/persistence/postgres/revisions/20260522000002_flight_recorder.sql (1)
1-25: LGTM!
src/synthorg/persistence/postgres/schema.sql (1)
376-400: LGTM!
src/synthorg/persistence/sqlite/__init__.py (1)
13-15: LGTM!
Also applies to: 30-30
src/synthorg/persistence/sqlite/_backend_accessors.py (1)
63-65: LGTM!
Also applies to: 153-153, 300-307
src/synthorg/persistence/sqlite/backend.py (1)
73-75: LGTM!
Also applies to: 243-243, 318-318, 537-540
src/synthorg/persistence/sqlite/flight_recorder_repo.py (1)
1-210: LGTM!
tests/conformance/persistence/test_flight_recorder_repository.py (1)
1-163: LGTM!
tests/e2e/test_cockpit_mission_control_e2e.py (1)
1-110: LGTM!
tests/unit/api/controllers/test_cockpit.py (1)
1-161: LGTM!
tests/unit/api/fakes.py (1)
35-38: LGTM!
Also applies to: 517-562
tests/unit/api/fakes_backend.py (1)
37-37: LGTM!
Also applies to: 641-641, 807-810
tests/unit/api/test_channels.py (1)
13-13: LGTM!
Also applies to: 80-80
tests/unit/engine/cockpit/test_cockpit_service.py (1)
1-135: LGTM!
tests/unit/engine/flight_recording/test_engine_recording.py (1)
1-92: LGTM!
tests/unit/engine/flight_recording/test_flight_recorder_service.py (1)
1-66: LGTM!
tests/unit/engine/flight_recording/test_sink.py (1)
1-212: LGTM!
tests/unit/engine/intervention/test_steering.py (1)
1-35: LGTM!
Also applies to: 49-83
tests/unit/meta/mcp/test_all_handlers_wired.py (1)
214-221: LGTM!
tests/unit/observability/audit_chain/test_audit_chain.py (1)
457-457: LGTM!
Also applies to: 470-480
tests/unit/observability/test_events.py (1)
355-357: LGTM!
web/src/api/endpoints/cockpit.ts (1)
13-17: LGTM!
Also applies to: 31-84
web/src/api/types/dtos.gen.ts (1)
18-18: LGTM!
Also applies to: 65-65, 70-70, 90-90, 106-106, 262-263, 284-284, 288-288, 373-373, 418-418, 469-470
web/src/api/types/enum-values.gen.ts (1)
387-394: LGTM!
Also applies to: 659-659
web/src/api/types/openapi.gen.ts (1)
1275-17030: LGTM!
web/src/api/types/websocket.ts (1)
14-22: LGTM!
web/src/pages/MissionControlPage.tsx (1)
15-45: LGTM!
web/src/pages/mission-control/FlightRecorder.tsx (1)
28-155: LGTM!
Also applies to: 157-197
web/src/pages/mission-control/LiveCockpit.tsx (1)
31-116: LGTM!
Also applies to: 118-160
web/src/pages/settings/utils.ts (1)
18-45: LGTM!
web/src/mocks/handlers/cockpit.ts (1)
16-106: LGTM!
web/src/router/route-titles.ts (1)
48-48: LGTM!
web/src/router/routes.ts (1)
12-12: LGTM!
web/src/components/layout/Sidebar.tsx (1)
19-19: LGTM!
Also applies to: 229-229
web/src/components/ui/timeline.stories.tsx (1)
1-43: LGTM!
web/src/mocks/handlers/index.ts (1)
73-73: LGTM!
Also applies to: 127-127, 178-178
web/src/__tests__/components/ui/timeline.test.tsx (1)
1-45: LGTM!
web/src/__tests__/pages/MissionControlPage.test.tsx (1)
1-86: LGTM!
web/src/__tests__/stores/mission-control.test.ts (1)
1-55: LGTM!
src/synthorg/engine/cockpit/__init__.py (1)
3-13: LGTM!
src/synthorg/api/app.py (1)
300-353: LGTM!
Also applies to: 1302-1304
src/synthorg/api/channels.py (1)
39-39: LGTM!
Also applies to: 64-64
src/synthorg/api/controllers/__init__.py (1)
27-27: LGTM!
Also applies to: 172-172
src/synthorg/meta/mcp/domains/__init__.py (1)
11-11: LGTM!
Also applies to: 42-42
src/synthorg/meta/mcp/domains/cockpit.py (1)
1-89: LGTM!
src/synthorg/meta/mcp/handlers/__init__.py (1)
14-14: LGTM!
Also applies to: 49-49
src/synthorg/meta/mcp/handlers/cockpit.py (1)
1-264: LGTM!
src/synthorg/engine/flight_recording/__init__.py (1)
1-24: LGTM!
src/synthorg/workers/runtime_builder.py (1)
34-37: LGTM!
Also applies to: 94-97, 396-417, 429-430, 469-470, 730-740
CI fixes (2):
- tests/e2e/test_cockpit_mission_control_e2e.py: construct real Task instead of mock_of[Task] (Pydantic v2 fields are not autospec attrs).
- tests/integration/mcp/test_tool_surface.py: bump tool count assertion from 219 to 226 to reflect new cockpit MCP tools.

Reviewer feedback (4 gemini, 19 coderabbit):

Persistence (protocol + sqlite + postgres + fakes + conformance):
- Add FlightRecorderFrameAggregate + get_aggregate() bespoke method (ADR-0001 D7) so cockpit activity and seek can sum cost / max turn in one SQL pass instead of N+1 across active tasks.
- Add append_many() bespoke method so flight-recorder finalisation lands as one batched transaction instead of N round-trips.
- Make idx_frf_execution_turn UNIQUE (sqlite + postgres revisions and schema.sql) so seek/replay is deterministic.
- purge_before now takes AwareDatetime (was plain datetime), rejecting naive values at the contract boundary.

Engine:
- engine/cockpit/service: _build_activity now consumes get_aggregate; fan-out across tasks via asyncio.TaskGroup keeps snapshot latency bounded by the slowest aggregate query, not the sum.
- engine/flight_recording/sink: PersistenceFlightRecorderSink uses append_many; re-raises MemoryError/RecursionError before generic Exception catch; build_frames correlates assistant message by turn.turn_number - 1 (was loop index), fixing resumed-run drift.
- engine/flight_recording/service: ReplaySeekView gains a truncated flag; cumulative_cost now comes from the unbounded SQL aggregate so it stays accurate when the windowed frames are capped.
- engine/agent_engine: guard build_frames + record_frames in try/except so a construction fault cannot turn a successful run into a failure.
- engine/intervention/steering: wrap operator text via wrap_untrusted (SEC-1) before persisting the steering interrupt.
API:
- api/controllers/cockpit: /frames switches to opaque cursor pagination (PaginatedResponse FlightRecorderFrame); turn_index path param now validated ge=1 at the Litestar boundary; /interventions/hint and /interventions/redirect wrap operator text via wrap_untrusted (SEC-1).
- api/state: set_cockpit_services emits accurate transition value (noop / attached / replaced) instead of always attached.

Web (regenerated DTO + manual updates):
- web/src/api/endpoints/cockpit.ts: getFlightRecorderFrames returns PaginatedResult, takes PaginationParams.
- web/src/stores/mission-control.ts: fetchFrames clears stale state synchronously; nextCursor + hasMore tracked; fetchMoreFrames appends.
- web/src/components/ui/timeline.tsx: clamp keyboard seek targets to [0, lastIndex] on every key path so a stale currentIndex cannot produce an out-of-range seek.
- web/src/hooks/useMissionControlData.ts: coalesce burst-y WS-triggered snapshot refreshes via a single in-flight ref.
- web/src/router/index.tsx: derive mission-control path from ROUTES.MISSION_CONTROL constant.
- web/src/utils/constants.ts: add cockpit to NAMESPACE_ORDER so the namespace surfaces in Settings.
- web/src/utils/status-color.ts: type STATUS_BG against generated TaskStatus union for compile-time safety on status drift.

Tests:
- tests/unit/api/fakes.py: FakeFlightRecorderFrameRepository mirrors the new protocol methods + UNIQUE-turn invariant.
- tests/unit/persistence/test_protocol.py: _FakeFlightRecorderRepository mirrors the new protocol; conformance assertion added.
- tests/conformance/persistence/test_flight_recorder_repository.py: dual-backend coverage for append_many, get_aggregate, unique-turn.
- tests/unit/api/controllers/test_cockpit.py: frames response shape updated for PaginatedResponse.
- tests/unit/engine/intervention/test_steering.py: assert redirect interrupt is persisted; hint test accepts wrap_untrusted envelope.
@coderabbitai review
✅ Actions performed
Review triggered.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/engine/flight_recording/service.py`:
- Around line 127-129: The current code computes truncated using the requested
turn_index which can be misleading; change the logic in the block using
self._repository.get_aggregate(filter_spec) to derive truncated from the
aggregate's actual data (e.g., use aggregate.max_turn_index or similar field
returned by get_aggregate) rather than turn_index or only _MAX_SEEK_FRAMES, so
set truncated = aggregate.max_turn_index > _MAX_SEEK_FRAMES (or compare
aggregate.max_turn_index to the requested window) and leave cumulative =
aggregate.total_cost as-is.
In `@tests/unit/api/fakes.py`:
- Around line 591-604: The current selection of latest frame uses
candidates_sorted = sorted(candidates, key=lambda f: (f.turn_index,
f.timestamp), reverse=True) which prioritizes turn_index over actual time,
causing latest_timestamp/latest_execution_id in FlightRecorderFrameAggregate to
sometimes be older; change the selection to pick the frame with the greatest
timestamp (tie-break by turn_index) instead — e.g. compute latest =
max(candidates, key=lambda f: (f.timestamp, f.turn_index)) and then use
latest.timestamp and latest.execution_id when building
FlightRecorderFrameAggregate, keeping total_cost and max_turn_index calculations
unchanged.
In `@web/src/stores/mission-control.ts`:
- Around line 97-103: The async fetch in fetchFrames/fetchMoreFrames can apply
stale results after executionId changes; to fix, record a per-request identifier
(e.g., capture executionId or an incrementing framesRequestId from the store) at
the start of the async call and compare it right before calling set(...) so you
only mutate frames when the captured id matches the current store value; apply
the same guard around the getFlightRecorderFrames response handling in both
fetchFrames and fetchMoreFrames (use the symbols fetchFrames, fetchMoreFrames,
getFlightRecorderFrames, executionId, framesNextCursor/framesHasMore to locate
the spots).
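The first two findings above (deriving `truncated` from the recorded maximum turn, and picking the latest frame by timestamp with `turn_index` as tie-break) can be sketched as follows. This is a minimal illustration under stated assumptions: the `Frame` dataclass, the helper names, and the value of `MAX_SEEK_FRAMES` are hypothetical, since the real `_MAX_SEEK_FRAMES` value and model fields are not shown in this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed cap for illustration; the real _MAX_SEEK_FRAMES value is not shown.
MAX_SEEK_FRAMES = 1000


@dataclass(frozen=True)
class Frame:
    """Hypothetical minimal frame with only the fields the findings mention."""

    execution_id: str
    turn_index: int
    timestamp: datetime
    cost: float


def latest_frame(candidates: list[Frame]) -> Frame:
    """Greatest timestamp wins; turn_index only breaks ties."""
    return max(candidates, key=lambda f: (f.timestamp, f.turn_index))


def is_truncated(max_turn_index: int) -> bool:
    """Derive truncation from the recorded maximum, not the requested turn."""
    return max_turn_index > MAX_SEEK_FRAMES


base = datetime(2026, 5, 22, tzinfo=timezone.utc)
frames = [
    Frame("exec-a", 6, base, 0.02),
    # A lower turn written later is still the most recent activity.
    Frame("exec-a", 5, base + timedelta(seconds=30), 0.01),
]
latest = latest_frame(frames)
max_turn = max(f.turn_index for f in frames)
```

With these two frames, sorting by `(turn_index, timestamp)` would have reported turn 6 as latest, while the timestamp-first key reports turn 5. A seek request for a very large turn does not flip `is_truncated` here, because only the recorded `max_turn` is compared against the cap.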
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9628f0a6-0393-47e0-b95c-38a2bcc99c61
📒 Files selected for processing (31)
src/synthorg/api/controllers/cockpit.py
src/synthorg/api/state.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/cockpit/service.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/engine/intervention/steering.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
src/synthorg/persistence/postgres/revisions/20260522000002_flight_recorder.sql
src/synthorg/persistence/postgres/schema.sql
src/synthorg/persistence/sqlite/flight_recorder_repo.py
src/synthorg/persistence/sqlite/revisions/20260522000002_flight_recorder.sql
src/synthorg/persistence/sqlite/schema.sql
tests/conformance/persistence/test_flight_recorder_repository.py
tests/e2e/test_cockpit_mission_control_e2e.py
tests/integration/mcp/test_tool_surface.py
tests/unit/api/controllers/test_cockpit.py
tests/unit/api/fakes.py
tests/unit/engine/intervention/test_steering.py
tests/unit/persistence/test_protocol.py
web/src/api/endpoints/cockpit.ts
web/src/api/types/dtos.gen.ts
web/src/api/types/openapi.gen.ts
web/src/components/ui/timeline.tsx
web/src/hooks/useMissionControlData.ts
web/src/mocks/handlers/cockpit.ts
web/src/router/index.tsx
web/src/stores/mission-control.ts
web/src/utils/constants.ts
web/src/utils/status-color.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test Integration
- GitHub Check: Test Unit
🧰 Additional context used
📓 Path-based instructions (18)
web/src/**/*.{js,jsx,ts,tsx,mts}
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/**/*.{js,jsx,ts,tsx,mts}`: Always use `createLogger` from `@/lib/logger`; never bare `console.warn`/`console.error`/`console.debug` in application code. Variable name must always be `log`. Only `logger.ts` itself may use bare console methods. Use `log.debug()` (DEV-only, stripped in production), `log.warn()`, `log.error()`.
Pass dynamic/untrusted values as separate args to logger calls (not interpolated into the message string) so they go through `sanitizeArg`
Attacker-controlled fields inside structured objects must be wrapped in `sanitizeForLog()` before embedding in log calls
Error-code constants (MANDATORY): import `ErrorCode` and `ErrorCategory` from `@/api/types/errors` (re-exported from the generated `web/src/api/types/error-codes.gen.ts`). Discriminate on `ErrorCode.<NAME>`, never on raw integer literals.
Use `@eslint-react/web-api-no-leaked-fetch` to detect `fetch()` in effects without `AbortController` cleanup
Files:
web/src/router/index.tsx
web/src/utils/constants.ts
web/src/mocks/handlers/cockpit.ts
web/src/utils/status-color.ts
web/src/api/endpoints/cockpit.ts
web/src/components/ui/timeline.tsx
web/src/hooks/useMissionControlData.ts
web/src/api/types/dtos.gen.ts
web/src/stores/mission-control.ts
web/src/api/types/openapi.gen.ts
web/src/**/*.{jsx,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/**/*.{jsx,tsx}`: Use `@eslint-react/no-leaked-conditional-rendering` to catch the `{count && <Foo />}` bug where `0` renders verbatim. For `ReactNode | undefined` props use `{value != null && value !== false && <jsx>}`; for compound truthiness use `Boolean(...)`.
Use `@eslint-react/globals` to restrict `window`/`document`/`localStorage`/ etc. inside render. Hoist offenders into a `useCallback` event handler, a `useEffect`, or a `useSyncExternalStore`-backed hook.
Reuse `web/src/components/ui/` design tokens in Web Dashboard Design System; detail in `web/CLAUDE.md`
Files:
web/src/router/index.tsx
web/src/components/ui/timeline.tsx
web/src/**/*.{ts,tsx,mts}
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/**/*.{ts,tsx,mts}`: Use `@typescript-eslint/no-floating-promises` to forbid unawaited promises so async work cannot survive the test that scheduled it and trip the active-handle gate
Use `@typescript-eslint/no-misused-promises` (with `checksVoidReturn: { attributes: false }`) to forbid passing async functions where the callsite ignores the returned promise. React 19 `async` event handlers stay allowed via the `attributes: false` exemption.
Files:
web/src/router/index.tsx
web/src/utils/constants.ts
web/src/mocks/handlers/cockpit.ts
web/src/utils/status-color.ts
web/src/api/endpoints/cockpit.ts
web/src/components/ui/timeline.tsx
web/src/hooks/useMissionControlData.ts
web/src/api/types/dtos.gen.ts
web/src/stores/mission-control.ts
web/src/api/types/openapi.gen.ts
**/*.{py,ts,tsx,jsx,md}
📄 CodeRabbit inference engine (CLAUDE.md)
No region/currency/locale privileged; use metric units; British English per docs/reference/regional-defaults.md
Files:
web/src/router/index.tsx
web/src/utils/constants.ts
web/src/mocks/handlers/cockpit.ts
web/src/utils/status-color.ts
src/synthorg/engine/intervention/steering.py
web/src/api/endpoints/cockpit.ts
tests/unit/persistence/test_protocol.py
web/src/components/ui/timeline.tsx
tests/integration/mcp/test_tool_surface.py
tests/unit/api/fakes.py
tests/unit/engine/intervention/test_steering.py
tests/e2e/test_cockpit_mission_control_e2e.py
web/src/hooks/useMissionControlData.ts
src/synthorg/api/state.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/engine/cockpit/service.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
web/src/api/types/dtos.gen.ts
tests/unit/api/controllers/test_cockpit.py
tests/conformance/persistence/test_flight_recorder_repository.py
web/src/stores/mission-control.ts
src/synthorg/api/controllers/cockpit.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
web/src/api/types/openapi.gen.ts
web/src/utils/constants.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
WS wire protocol (MANDATORY): the client-server contract lives in `web/src/utils/constants.ts` (`WS_PROTOCOL_VERSION`, `WS_MAX_MESSAGE_SIZE`, `WS_HEARTBEAT_INTERVAL_MS`, `WS_PONG_TIMEOUT_MS`, `LOG_SANITIZE_MAX_LENGTH`) and MUST stay in lockstep with `src/synthorg/api/ws_models.py` / `src/synthorg/api/controllers/ws.py`. Bump the protocol version on both sides together for breaking payload changes.
Files:
web/src/utils/constants.ts
web/src/{components,utils}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
NEVER write `getXIcon(value): LucideIcon` factories called inside JSX bodies. Export a `<XIcon value={...} />` wrapper that does the lookup via `createElement` inside the wrapper body. Wrapper components live in their own file, not alongside utility exports.
Files:
web/src/utils/constants.ts
web/src/utils/status-color.ts
web/src/components/ui/timeline.tsx
web/src/mocks/handlers/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/mocks/handlers/**/*.ts`: MSW handlers (MANDATORY): `web/src/mocks/handlers/` must mirror `web/src/api/endpoints/*.ts` 1:1 with a default happy-path handler for every exported endpoint. Use `onUnhandledRequest: 'error'` in test setup; tests override per-case via `server.use(...)`, never `vi.mock('@/api/endpoints/*')`.
Use typed envelope helpers (`successFor`, `paginatedFor`, `voidSuccess`) to keep MSW handlers in lockstep with endpoint return types
Files:
web/src/mocks/handlers/cockpit.ts
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`src/synthorg/**/*.py`: Only `src/synthorg/persistence/` may import sqlite/psycopg or emit raw SQL; new repository protocols inherit from generic categories in `persistence/_generics.py`; bespoke methods permitted only under ADR-0001 D7
Configuration Precedence: DB > env > code default via `SettingsService`/`ConfigResolver` (Cat-1) or env > code default (Cat-2, `read_only_post_init`); Cat-3 bootstrap secrets pure env; YAML is ingestion format only, not precedence tier; no `os.environ.get` outside startup
No hardcoded numeric values; numerics live in `settings/definitions/`; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (`NAME: int|float|Final`); enforced by `scripts/check_no_magic_numbers.py`
Comments document WHY only; no reviewer citations, issue back-refs, or migration framing; enforced by `check_no_review_origin_in_code.py` + `check_no_migration_framing.py`
No `from __future__ import annotations` (Python 3.14 has PEP 649); use PEP 758 except: `except A, B:` no parens unless binding
Type hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Errors follow `<Domain><Condition>Error` pattern from `DomainError`; never inherit `Exception`/`RuntimeError` directly; enforced by `check_domain_error_hierarchy.py`
Pydantic v2 frozen + `extra="forbid"` on every frozen model project-wide; gate `check_frozen_model_extra_forbid.py`; `@computed_field` auto-exempt; per-line `# lint-allow: frozen-extra-forbid -- <reason>` for `extra="allow"`/`"ignore"` boundaries; use `@computed_field` for derived; use `NotBlankStr` for identifiers
Args models at every system boundary; `parse_typed()` for every external dict ingestion; enforced by `check_boundary_typed.py`
Immutability: use `model_copy(update=...)` or `copy.deepcopy()`; deepcopy at system boundaries
Async: use `asyncio.TaskGroup` for fan-out/fan-in; helpers catch `Exception` (re-raise `MemoryError`/`RecursionError`...
Files:
src/synthorg/engine/intervention/steering.py
src/synthorg/api/state.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/engine/cockpit/service.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/engine/intervention/steering.py
src/synthorg/api/state.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/engine/cockpit/service.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
web/src/{api/endpoints,stores}/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Cursor pagination (MANDATORY): list endpoints must use opaque cursor-based paging via `PaginationMeta`. Stores must keep `nextCursor` + `hasMore` in state (not offset arithmetic) and early-return when `!hasMore || !nextCursor`. Display counts must come from `data.length`.
Files:
web/src/api/endpoints/cockpit.ts
web/src/stores/mission-control.ts
web/src/api/endpoints/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Health / readiness endpoints (MANDATORY): `getLiveness()` is always 200 while the process is alive; `getReadiness()` is 200 healthy / 503 unavailable (binary `'ok' | 'unavailable'` outcome, no tri-state). Any new caller must handle the 503 path explicitly.
Files:
web/src/api/endpoints/cockpit.ts
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`tests/**/*.py`: Timeout/slow failures = source-code regression; never edit `tests/baselines/unit_timing.json` or any `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`; both families PreToolUse-blocked; per-invocation bypass requires explicit approval (`ALLOW_BASELINE_GROWTH=1 git commit`)
Test markers: `@pytest.mark.{unit,integration,e2e,slow}`; async auto; timeout 30s global; coverage 80% min
xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts`; Windows unit tests use `WindowsSelectorEventLoopPolicy`; subprocess tests override back
Test doubles: `FakeClock` for Clock seam, `mock_of[T](**overrides)` for typed-boundary substitutions, `SimpleNamespace` for attribute-bags; bare `MagicMock` at typed boundary blocked by `scripts/check_mock_spec.py`
Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add `@example(...)`); never skip/xfail flaky tests; fix fundamentally
Files:
tests/unit/persistence/test_protocol.py
tests/integration/mcp/test_tool_surface.py
tests/unit/api/fakes.py
tests/unit/engine/intervention/test_steering.py
tests/e2e/test_cockpit_mission_control_e2e.py
tests/unit/api/controllers/test_cockpit.py
tests/conformance/persistence/test_flight_recorder_repository.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.
Files:
tests/unit/persistence/test_protocol.py
tests/integration/mcp/test_tool_surface.py
tests/unit/api/fakes.py
tests/unit/engine/intervention/test_steering.py
tests/e2e/test_cockpit_mission_control_e2e.py
tests/unit/api/controllers/test_cockpit.py
tests/conformance/persistence/test_flight_recorder_repository.py
web/src/components/ui/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/components/ui/**/*.{ts,tsx}`: ALWAYS reuse existing components from `web/src/components/ui/` before creating new ones. NEVER hardcode hex colors, font-family declarations, pixel spacing, Motion transition durations, BCP 47 locale literals (`'en-US'`), or currency symbols / codes; use design tokens, `@/lib/motion` presets, the helpers in `@/utils/format`, and `DEFAULT_CURRENCY` from `@/utils/currencies`.
Every new shared component lives in `web/src/components/ui/` with a sibling `.stories.tsx` covering all states (default, hover, loading, error, empty, disabled where applicable)
Component Props interface name must be `<ComponentName>Props` and must be exported from the same file (e.g. `AgentCardProps` in `agent-card.tsx`). This makes the contract greppable and lets callers extend the props without re-typing the shape.
Base UI primitives must compose Portal + Backdrop + Popup explicitly, use the `render` prop for polymorphism, and rely on animation state attributes (`data-[open]`, `data-[closed]`) rather than the older `data-[state=open]` form.
Base UI primitives are imported directly from `@base-ui/react/<subpath>` and use the native `render` prop for polymorphism; the local `<Slot>` helper is reserved for `<Button asChild>` only
Files:
web/src/components/ui/timeline.tsx
web/src/{components,hooks}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (web/CLAUDE.md)
NEVER read `window.innerWidth`/`window.innerHeight` directly in a render body or `useMemo`; use `useViewportSize()` from `@/hooks/useViewportSize` instead
Files:
web/src/components/ui/timeline.tsx
web/src/hooks/useMissionControlData.ts
src/synthorg/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
`src/synthorg/persistence/**/*.py`: Repository CRUD: `save(entity)`, `get(id)`, `delete(id) -> bool`, `list_items(...)`, `query(...)` returning tuples
Datetime in persistence: use `parse_iso_utc`/`format_iso_utc` from `persistence._shared` (reject naive); use `normalize_utc` for already-typed
Files:
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
web/src/api/types/**/*.gen.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Generated DTO types (MANDATORY): NEVER hand-edit `web/src/api/types/*.gen.ts`. Regenerate with `uv run python scripts/generate_dto_types_ts.py`. Import DTOs via the barrel (`import type { AgentConfig } from '@/api/types'`).
Files:
web/src/api/types/dtos.gen.ts
web/src/api/types/openapi.gen.ts
tests/conformance/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Dual-backend conformance: `tests/conformance/persistence/` consumes `backend` fixture (SQLite + Postgres); enforced by `check_dual_backend_test_parity.py`
Files:
tests/conformance/persistence/test_flight_recorder_repository.py
web/src/stores/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
`web/src/stores/**/*.ts`: List reads (`fetch*`) must set `error: string | null` on the store instead of toasting
Test teardown (MANDATORY): any new store that schedules timers or attaches event listeners must expose an equivalent cleanup hook and register it in the global `afterEach`. The global `afterEach` in `web/src/test-setup.tsx` already calls `useToastStore.getState().dismissAll()`, `cancelPendingPersist()`, and `useThemeStore.getState().teardown()`.
Files:
web/src/stores/mission-control.ts
🧠 Learnings (3)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
src/synthorg/engine/intervention/steering.py
tests/unit/persistence/test_protocol.py
tests/integration/mcp/test_tool_surface.py
tests/unit/api/fakes.py
tests/unit/engine/intervention/test_steering.py
tests/e2e/test_cockpit_mission_control_e2e.py
src/synthorg/api/state.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/engine/cockpit/service.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
tests/unit/api/controllers/test_cockpit.py
tests/conformance/persistence/test_flight_recorder_repository.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
📚 Learning: 2026-05-21T22:55:20.496Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2035
File: src/synthorg/meta/toolsmith/models.py:114-114
Timestamp: 2026-05-21T22:55:20.496Z
Learning: In this repo’s “magic number” review standard, the existing gate in `scripts/check_no_magic_numbers.py` intentionally does NOT flag numeric literals used as raw call-site arguments. So, do not flag numeric literals passed as keyword arguments to Pydantic `Field()` (e.g., `Field(ge=0, le=100)` / `Field(ge=1, le=50)`)—this is an established idiom. Only treat numeric literals as “magic numbers” when they occur in the locations the gate checks (module-level assignments and function/method parameter defaults).
Applied to files:
src/synthorg/engine/intervention/steering.py
tests/unit/persistence/test_protocol.py
tests/integration/mcp/test_tool_surface.py
tests/unit/api/fakes.py
tests/unit/engine/intervention/test_steering.py
tests/e2e/test_cockpit_mission_control_e2e.py
src/synthorg/api/state.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/engine/cockpit/service.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
tests/unit/api/controllers/test_cockpit.py
tests/conformance/persistence/test_flight_recorder_repository.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
📚 Learning: 2026-05-21T22:55:09.289Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 2035
File: src/synthorg/meta/toolsmith/config.py:29-30
Timestamp: 2026-05-21T22:55:09.289Z
Learning: For this repo’s Pydantic configuration idiom, do not treat numeric literals passed directly as arguments to `pydantic.Field(...)` as “magic numbers” during review. This includes call-site usages like `Field(default=0.2, ge=0.0, le=1.0)` (e.g., in config models such as `ToolAuthoringConfig`, `ToolValidationConfig`, `ToolsmithConfig`). Do not request extracting those `Field(...)` numeric arguments into named constants, since the repo’s `scripts/check_no_magic_numbers.py` intentionally excludes call-site `Field(...)` numerics and relies on `Field(...)` as the canonical way to express these constraints/defaults.
Applied to files:
src/synthorg/engine/intervention/steering.py
src/synthorg/api/state.py
src/synthorg/persistence/flight_recorder_protocol.py
src/synthorg/engine/flight_recording/service.py
src/synthorg/engine/agent_engine.py
src/synthorg/engine/flight_recording/sink.py
src/synthorg/engine/cockpit/service.py
src/synthorg/persistence/postgres/flight_recorder_repo.py
src/synthorg/api/controllers/cockpit.py
src/synthorg/persistence/sqlite/flight_recorder_repo.py
🔇 Additional comments (33)
web/src/router/index.tsx (1)
170-170: LGTM!
src/synthorg/persistence/sqlite/revisions/20260522000002_flight_recorder.sql (1)
20-24: LGTM!
web/src/utils/constants.ts (1)
144-144: LGTM!
src/synthorg/persistence/postgres/schema.sql (1)
395-396: LGTM!
web/src/mocks/handlers/cockpit.ts (1)
14-14: LGTM!
Also applies to: 61-70, 84-84
web/src/utils/status-color.ts (1)
8-20: LGTM!
Also applies to: 33-35
src/synthorg/engine/intervention/steering.py (1)
24-24: LGTM!
Also applies to: 114-118, 126-126
web/src/api/endpoints/cockpit.ts (1)
2-2: LGTM!
Also applies to: 9-16, 24-41
src/synthorg/persistence/sqlite/schema.sql (1)
381-385: LGTM!
tests/unit/persistence/test_protocol.py (1)
26-26: LGTM!
Also applies to: 443-445, 455-461, 1471-1482
src/synthorg/persistence/postgres/revisions/20260522000002_flight_recorder.sql (1)
20-24: LGTM!
web/src/components/ui/timeline.tsx (1)
40-49: LGTM!
tests/integration/mcp/test_tool_surface.py (1)
398-402: LGTM!
tests/unit/engine/intervention/test_steering.py (1)
34-40: LGTM!
Also applies to: 54-58
tests/e2e/test_cockpit_mission_control_e2e.py (1)
17-18: LGTM!
Also applies to: 68-82
web/src/hooks/useMissionControlData.ts (1)
42-60: LGTM!
Also applies to: 79-79, 82-82
src/synthorg/api/state.py (1)
1590-1612: LGTM!
Also applies to: 1619-1619
src/synthorg/persistence/flight_recorder_protocol.py (1)
25-26: LGTM!
Also applies to: 113-153, 164-170, 173-175, 179-188, 200-214, 216-223
src/synthorg/engine/agent_engine.py (1)
36-36: LGTM!
Also applies to: 41-41, 675-706
src/synthorg/engine/flight_recording/sink.py (1)
9-9: LGTM!
Also applies to: 13-17, 69-69, 73-103, 158-164, 183-183, 192-207
src/synthorg/engine/cockpit/service.py (1)
9-9: LGTM!
Also applies to: 119-122, 131-141, 170-178, 180-186
src/synthorg/persistence/postgres/flight_recorder_repo.py (1)
13-13: LGTM!
Also applies to: 16-16, 29-29, 44-50, 67-67, 88-126, 138-138, 143-143, 149-149, 161-207, 209-220, 233-262
web/src/api/types/dtos.gen.ts (1)
333-333: LGTM!
tests/unit/api/controllers/test_cockpit.py (1)
84-90: LGTM!
tests/conformance/persistence/test_flight_recorder_repository.py (1)
164-246: LGTM!
src/synthorg/api/controllers/cockpit.py (1)
110-141: LGTM!
Also applies to: 148-148, 252-252
src/synthorg/persistence/sqlite/flight_recorder_repo.py (1)
95-133: LGTM!
Also applies to: 165-211, 213-224, 239-261
web/src/api/types/openapi.gen.ts (6)
10122-10136: LGTM!
11407-11411: LGTM!
11417-11428: LGTM!
16826-16830: LGTM!
16846-16846: LGTM!
16864-16865: LGTM!
CodeRabbit re-reviewed the round-1 push (review 4348377208, 3 actionable comments on HEAD a6f8b64). All three valid.

Engine:
- engine/flight_recording/service.seek: ``truncated`` now derives from ``aggregate.max_turn_index > _MAX_SEEK_FRAMES`` (the actual recorded max turn) instead of the operator-supplied ``turn_index``. A scrubber that seeks to turn 2000 in a run that only recorded 50 frames is NOT truncated, even though 2000 > _MAX_SEEK_FRAMES.

Persistence (sqlite + postgres + protocol + fake parity):
- ``FlightRecorderFrameAggregate.latest_timestamp`` / ``latest_execution_id`` now come from the row ordered by ``(timestamp DESC, turn_index DESC)`` instead of ``(turn_index DESC, timestamp DESC)``. Ordering by timestamp first matches the semantic meaning of "latest activity": a frame at turn 5 written 30s after turn 6 (clock skew, resumed-run interleaving) is the *more recent* activity even if turn 6 has a higher index. Applied to both ``SELECT timestamp / execution_id ORDER BY ...`` subqueries in sqlite + postgres, the in-memory fake's selection, and the protocol docstring.

Web:
- web/src/stores/mission-control.ts: ``fetchFrames`` and ``fetchMoreFrames`` now capture the requesting executionId in a local before awaiting and re-check ``get().framesExecutionId`` before every ``set(...)`` call. A user who loads execution A then quickly loads execution B will never see B's timeline polluted with a stale A page that lands after the switch.
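The stale-response guard described for the web store can be sketched as a Python analogue. The actual store is TypeScript, so this is only an illustration of the capture-then-recheck pattern: `FrameStore`, `fake_get_frames`, and the delays are hypothetical and stand in for the store slice and the `getFlightRecorderFrames` call.

```python
import asyncio


async def fake_get_frames(execution_id: str, delay: float) -> list[str]:
    """Hypothetical stand-in for the endpoint call; delay simulates latency."""
    await asyncio.sleep(delay)
    return [f"{execution_id}:turn-{n}" for n in (1, 2)]


class FrameStore:
    """Hypothetical Python analogue of the frames slice of the web store."""

    def __init__(self) -> None:
        self.frames_execution_id: str | None = None
        self.frames: list[str] = []

    async def fetch_frames(self, execution_id: str, delay: float) -> None:
        # Mark this execution as current and clear stale state synchronously.
        self.frames_execution_id = execution_id
        self.frames = []
        requested = execution_id  # captured before awaiting
        page = await fake_get_frames(requested, delay)
        # Drop the page if the user switched executions while it was in flight.
        if self.frames_execution_id != requested:
            return
        self.frames = page


async def demo() -> list[str]:
    store = FrameStore()
    slow_a = asyncio.create_task(store.fetch_frames("A", 0.05))
    await asyncio.sleep(0)  # let A start before the user switches to B
    await store.fetch_frames("B", 0.01)
    await slow_a  # A's response lands after the switch and is discarded
    return store.frames


result = asyncio.run(demo())
```

In this sketch the slow response for execution A arrives after B has already been loaded, and the re-check before the final assignment keeps B's frames intact. Comparing against the current execution id avoids a separate request counter, at the cost of not distinguishing two overlapping requests for the same execution.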
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2044      +/-   ##
==========================================
+ Coverage   85.03%   85.04%   +0.01%
==========================================
  Files        2193     2208      +15
  Lines      127351   128126     +775
  Branches    10579    10613      +34
==========================================
+ Hits       108288   108967     +679
- Misses      16397    16486      +89
- Partials     2666     2673       +7
☔ View full report in Codecov by Sentry.
main merged #2044 (mission-control + flight-recorder) which added the cockpit MCP domain, cockpit settings namespace, and its own generated types. Rebasing this charter PR on top required: - Keeping both charter and cockpit imports in the MCP domain / handler aggregators and the SettingNamespace enum. - Bumping the pinned tool-count assertions to 231 (219 baseline + 7 cockpit + 5 charter) in test_tool_surface and test_all_handlers_wired. - Regenerating web/src/api/types/{dtos,enum-values,openapi}.gen.ts so both the cockpit and charter DTO additions are reflected (the generator is the single source of truth; the rebase conflict markers inside the .gen.ts files were resolved by accepting cockpit then regenerating to produce the union).
<!-- HIGHLIGHTS_START -->
## Highlights

> _AI-generated summary (model: `openai/gpt-4.1-mini` via GitHub Models). Commit-based changelog below._

### What you'll notice
- New brownfield codebase intake mode supports merger and acquisition scenarios.
- Added deep CEO interview feature to improve project charter creation.
- Introduced mission control and flight recorder operator cockpit for better operational oversight.
- Research mode added for enhanced exploratory work.
- Runtime services now log safety-spine state at boot for clearer diagnostics.

### What's new
- Research mode feature enables deeper data exploration.
- CEO interview integration helps shape project charters.
- Mission control and flight recorder cockpit introduced for operational tracking.

### Under the hood
- Improved codebase modularity with module-size gates and lint tightening.
- Added `__init__.py` to 21 test directories for better test discovery.
- Promoted six transitive dependencies to direct dependencies for clarity.
- Split codespell ignore list into vocabulary and source renames.
- Decomposed oversized web utilities, hooks, and libraries for maintainability.
- Enhanced CI with Lychee link checker integration and retry logic for cosign signing.
- Sharded unit and integration tests and added Postgres service container in CI.
- Updated infrastructure and web dependencies; maintained lock files.
<!-- HIGHLIGHTS_END -->

:robot: I have created a release *beep* *boop*

---

## [0.8.8](v0.8.7...v0.8.8) (2026-05-24)

### Features
* brownfield codebase intake (merger/acquisition entry mode) ([#2042](#2042)) ([e287621](e287621)), closes [#1975](#1975)
* deep CEO interview to project charter ([#2045](#2045)) ([904f2fb](904f2fb))
* mission control + flight recorder operator cockpit ([#2044](#2044)) ([1c2660b](1c2660b))
* research mode ([#2041](#2041)) ([f81a5ac](f81a5ac)), closes [#1989](#1989)
* surface safety-spine state in runtime-services boot log (closes [#2096](#2096)) ([#2097](#2097)) ([f187b31](f187b31))

### Refactoring
* add `__init__.py` to 21 leaf test directories (INP001) ([#2081](#2081)) ([2592118](2592118)), closes [#2064](#2064)
* codebase modularity (1/4) - module-size gates + lint tightening + tools ([#2078](#2078)) ([556fbd9](556fbd9)), closes [#2047](#2047) [#2040](#2040)
* promote 6 transitive deps to direct deps ([#2083](#2083)) ([adedc6a](adedc6a))
* split codespell ignore-words-list into vocab + source renames ([#2085](#2085)) ([917d98a](917d98a)), closes [#2074](#2074)
* **web:** PR A foundation, decompose oversized utils/hooks/lib ([#2092](#2092)) ([#2098](#2098)) ([aedbba5](aedbba5))

### CI/CD
* exclude slsa.dev from lychee (transient timeout on canonical badge) ([#2090](#2090)) ([346c51d](346c51d))
* fix paths-filter shallow-clone race and scorecard allowlist ([#2089](#2089)) ([7cd7ce8](7cd7ce8))
* refresh .test_durations.{unit,integration} ([#2087](#2087)) ([ddf2d86](ddf2d86))
* retry cosign sign on transient GHCR/Rekor failures ([#2100](#2100)) ([da9422a](da9422a))
* shard test-unit + test-integration, sysmon coverage, Postgres service container ([#2080](#2080)) ([0768787](0768787))
* wire Lychee link-checker (workflow + installer + pre-push hook) ([#2084](#2084)) ([1c0694a](1c0694a))

### Maintenance
* Lock file maintenance ([#2086](#2086)) ([a78810a](a78810a))
* Update Infrastructure dependencies ([#2055](#2055)) ([041ad8b](041ad8b))
* Update Web dependencies ([#2054](#2054)) ([4d57b9a](4d57b9a))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: synthorg-repo-bot[bot] <279117679+synthorg-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Closes #1981.
Summary
Operator cockpit for the SynthOrg runtime: watch the company work live, identify a stuck or runaway agent and intervene, and replay any completed run step by step.
What this adds
Persistence (dual-backend, conformance-tested)
- `FlightRecorderFrame` append-only frame store: SQLite + Postgres repos, yoyo revisions, canonical `schema.sql` for both backends, `FakeFlightRecorderFrameRepository` mirrored into the api test fixture.

Engine
- `FlightRecorderSink` (default-on, fail-soft, off the per-turn hot path) wired into `AgentEngine._run_and_finalize`: every agent run records redacted per-turn frames.
- `CockpitService` aggregates live activity (who/what + cost + turn count) and flags stuck / runaway agents from operator-tunable thresholds.
- `FlightRecorderService` serves the scrubber timeline and a `seek(turn N)` ascending-prefix reconstruction with cumulative cost, frame-authoritative.
- `SteeringDirective` protocol with a `SafeDefaultSteeringDirective` that posts an `INFO_REQUEST` interrupt for hint / redirect (visible queued artefact, never a silent no-op).
- `InterventionKind` enum in `core/enums.py`; `cockpit` settings namespace with 8 knobs; `observability/events/cockpit.py` event constants.

API
- `/cockpit` Litestar controller: snapshot, flight-recorder frames + seek, four guarded interventions (pause / kill / hint / redirect), 503-gated via AppState.
- Services wired at boot via `_try_wire_cockpit` after persistence connects; ENFORCED ghost-wiring manifest lines for `CockpitService`, `FlightRecorderService`, `build_steering_directive`, `build_flight_recorder_sink`.
- `CHANNEL_COCKPIT` appended to the WS allowlist.

MCP
- `cockpit` domain: 3 read tools (live activity, frames, seek) and 4 admin tools (pause / kill / hint / redirect) that call `require_admin_guardrails` as their lexically-first call. Tool-count plan bumped 216 -> 223.

Web dashboard
- `/mission-control` route (sidebar nav, `Radio` icon).
- Live tab: REST snapshot polling + WS liveness on the `cockpit | tasks | agents | budget` channels.
- Flight Recorder tab: `Timeline` scrubber primitive (+ stories), transport (prev / next / play / pause / speed), per-turn frame detail.
- `mission-control` store (mutation pattern: try / catch + toast + sentinel; callers do not wrap), `useMissionControlData` hook (polling + WS + cleanup), `api/endpoints/cockpit.ts`, MSW handlers, regenerated DTOs.
- Shared `statusBgClass` util so Timeline + LiveCockpit + FlightRecorder all map status → colour the same way.

Test plan
- `tests/unit/engine/{cockpit,flight_recording,intervention}/`, `tests/unit/api/controllers/test_cockpit.py`, web Timeline / page / store tests.
- `tests/unit/engine/flight_recording/test_engine_recording.py` proves `AgentEngine.run()` auto-records replayable frames through the real hook.
- `tests/conformance/persistence/test_flight_recorder_repository.py` over SQLite + Postgres.
- `tests/e2e/test_cockpit_mission_control_e2e.py` exercises detect → intervene → replay against the real services.

All pre-push gates green locally (ruff, mypy on affected, pytest unit on affected, ESLint, ghost-wiring, currency-aggregation, MCP guardrails, frozen-model `extra="forbid"`, typed-boundary, dto-types-ts-in-sync, schema-drift, schema.sql vs revisions, no-review-origin, dead-API-endpoint, convention-gate inventory).
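For reviewers who want a mental model of the detect and replay halves of that flow, here is a minimal sketch. It is not the code in this PR: the `Frame` fields, the `flag_agent` helper, and both threshold names are hypothetical, and the real stuck / runaway heuristics may weigh other signals. It only illustrates two ideas stated above: thresholds are passed in at call time, and `seek(turn N)` rebuilds an ascending prefix of frames with cumulative cost.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for one recorded turn (not the real model)."""

    turn: int
    cost: Decimal


def seek(frames: list[Frame], turn: int) -> tuple[list[Frame], Decimal]:
    """Return frames with turn <= `turn`, ascending, plus their summed cost."""
    prefix = sorted((f for f in frames if f.turn <= turn), key=lambda f: f.turn)
    total = sum((f.cost for f in prefix), Decimal("0"))
    return prefix, total


def flag_agent(
    seconds_since_progress: float,
    turn_count: int,
    *,
    stuck_after_seconds: float,  # assumed knob name, supplied by the caller
    runaway_turns: int,  # assumed knob name, supplied by the caller
) -> list[str]:
    """Flag an agent as stuck and/or runaway from caller-supplied thresholds."""
    flags: list[str] = []
    if seconds_since_progress >= stuck_after_seconds:
        flags.append("stuck")
    if turn_count >= runaway_turns:
        flags.append("runaway")
    return flags
```

Keeping the thresholds as arguments rather than reading settings inside the helper mirrors the commit note that `CockpitService` takes thresholds at call time.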
Review coverage
Six core pre-PR review agents ran on the diff: code-reviewer, conventions-enforcer, frontend-reviewer, persistence-reviewer, test-quality-reviewer, issue-resolution-verifier. 13 valid findings addressed (immutability via fresh dict construction in repo deserialization, accessible status indicator + Hint context, named frame type, `_MAX_SEEK_FRAMES: Final[int]`, `useMemo` for scrubber projections, shared status-colour util, store reset in test teardown, richer MSW defaults, `as const satisfies` typing). One code-reviewer finding (`except A, B:` claimed as Py2) dismissed as false positive: PEP 758 in 3.14 allows it and the conventions-enforcer confirmed the project standard. Issue-resolution-verifier: PASS, all #1981 criteria RESOLVED.

Also fixed one unrelated pre-existing test-isolation bug surfaced by the full-suite gate (`tests/unit/observability/audit_chain/test_audit_chain.py`: mocked executor `submit` was dropping its coroutine arg, leaking an un-awaited coroutine into pytest's unraisable-exception capture and tainting downstream tests on the same xdist worker).
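To illustrate the class of bug, the sketch below contrasts a fake `submit` that drops its coroutine argument with one that closes it. This is not the patch applied to `test_audit_chain.py`; `write_entry`, `dropping_submit`, `closing_submit`, and `count_never_awaited` are hypothetical names, and closing the coroutine is just one common way to avoid the "never awaited" warning.

```python
import gc
import inspect
import warnings
from unittest.mock import MagicMock


async def write_entry() -> None:
    """Hypothetical coroutine standing in for work handed to the executor."""


def dropping_submit(coro, *args, **kwargs):
    """Fake submit that ignores its argument; the coroutine is never awaited."""
    return MagicMock(name="future")


def closing_submit(coro, *args, **kwargs):
    """Fake submit that closes a coroutine argument instead of dropping it."""
    if inspect.iscoroutine(coro):
        coro.close()  # a closed coroutine no longer warns when collected
    return MagicMock(name="future")


def count_never_awaited(submit) -> int:
    """Count 'never awaited' RuntimeWarnings raised while calling `submit`."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        submit(write_entry())
        gc.collect()  # make sure the dropped coroutine is finalized here
    return sum("never awaited" in str(w.message) for w in caught)
```

Comparing `count_never_awaited(dropping_submit)` with `count_never_awaited(closing_submit)` shows the difference. If the fake is a bare `MagicMock`, the recorded call arguments can keep the coroutine alive until the mock itself is collected, which is one way such a warning can surface in a later test on the same worker.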