Merged
3 changes: 3 additions & 0 deletions .gitignore
@@ -93,3 +93,6 @@ superset-dev-data/
.serena/
test-conflict-repo/
.amp/*

# Claude Code session lock (runtime artifact)
.claude/scheduled_tasks.lock
327 changes: 327 additions & 0 deletions apps/desktop/docs/V2_LAUNCH_CONTEXT.md

Large diffs are not rendered by default.

202 changes: 202 additions & 0 deletions apps/desktop/docs/V2_LAUNCH_CONTEXT_GAPS.md
@@ -0,0 +1,202 @@
# V2 Launch Context — Body-Fetching Gaps

Companion to `V2_LAUNCH_CONTEXT.md`. Tracks remaining work to make
linked issues / PRs / tasks useful to the agent.

## Current state (2026-04-15)

Claude receives titles only — no bodies:

```
<user prompt>

# <task title>

# <issue title>

# PR #<n> — <pr title>
Branch `<branch>` is checked out in this workspace — commits you make continue this PR.

# Attached files
...
- .superset/attachments/<file>
```

Bodies are empty because the stub fetchers in `buildResolveCtxFromPending`
return empty strings. The pipeline otherwise works end-to-end.

## Design decisions (locked)

1. **Inline in prompt.** Bodies go directly into the prompt via
`{{issues}}` / `{{prs}}` / `{{tasks}}` template variables. No file
writes for linked context. Only user-uploaded attachments write to
`.superset/attachments/`.
2. **PR checkout is real.** The fork-from-PR flow checks out the PR's
head branch, and the prompt tells the agent so.
3. **No body truncation** (or very high cap, e.g. 200 KB/source). Modern
context windows are large. Don't cap aggressively.
4. **No sanitization.** Prompt goes into a heredoc with a random
delimiter (no shell injection). Agent reads raw text, no HTML parser
downstream. V1's entity escaping was unnecessary.
5. **Attachments framing.** The `{{attachments}}` block includes a short
header cueing the agent to read the files. Just paths; agent handles
the rest.
6. **Issue/PR comments.** Deferred to phase 2; tracked under Deferred.
7. **Per-agent framing.** Don't over-engineer. Give the path; agent
figures it out.
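
Decision 4 relies on a quoted heredoc with a delimiter that cannot collide with the prompt text. A minimal sketch of that idea follows; `buildHeredocCommand` and the `PROMPT_` prefix are hypothetical names for illustration, not the project's actual helper.

```ts
// Hypothetical helper: wraps a prompt in a quoted heredoc so the shell
// passes it through verbatim. Not the project's actual implementation.
function buildHeredocCommand(command: string, prompt: string): string {
  const lines = prompt.split("\n");
  let delimiter: string;
  do {
    // Random suffix; regenerate on the unlikely chance that the prompt
    // already contains a line equal to the delimiter.
    delimiter = `PROMPT_${Math.random().toString(36).slice(2, 10).toUpperCase()}`;
  } while (lines.includes(delimiter));

  // Quoting the delimiter ('...') disables $var, `cmd`, and backslash
  // expansion inside the heredoc body.
  return `${command} "$(cat <<'${delimiter}'\n${prompt}\n${delimiter}\n)"`;
}
```

Because the delimiter is quoted, text such as `$HOME` or backticks in an issue body reaches the agent as literal characters, which is why no escaping pass is needed.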

## Work plan

### 1. Host-service `getIssueContent`

Add to `workspaceCreation` router (same GitHub auth path as
`searchGitHubIssues`):

```ts
getIssueContent: protectedProcedure
  // Issue numbers are positive integers; reject anything else up front.
  .input(z.object({ projectId: z.string(), issueNumber: z.number().int().positive() }))
  .query(async ({ ctx, input }) => {
    const repo = await resolveGithubRepo(ctx, input.projectId);
    const octokit = await ctx.github();
    const { data } = await octokit.issues.get({
      owner: repo.owner,
      repo: repo.name,
      issue_number: input.issueNumber,
    });
    return {
      number: data.number,
      title: data.title,
      // GitHub returns null for an empty body; normalize to "".
      body: data.body ?? "",
      url: data.html_url,
      state: data.state,
      author: data.user?.login ?? null,
    };
  }),
```

### 2. Host-service `getPullRequestContent`

Same router, wraps `octokit.pulls.get`:

```ts
getPullRequestContent: protectedProcedure
  // PR numbers are positive integers; reject anything else up front.
  .input(z.object({ projectId: z.string(), prNumber: z.number().int().positive() }))
  .query(async ({ ctx, input }) => {
    const repo = await resolveGithubRepo(ctx, input.projectId);
    const octokit = await ctx.github();
    const { data } = await octokit.pulls.get({
      owner: repo.owner,
      repo: repo.name,
      pull_number: input.prNumber,
    });
    return {
      number: data.number,
      title: data.title,
      // GitHub returns null for an empty body; normalize to "".
      body: data.body ?? "",
      url: data.html_url,
      state: data.state,
      // Head branch is what the fork-from-PR flow checks out.
      branch: data.head.ref,
      baseBranch: data.base.ref,
      author: data.user?.login ?? null,
    };
  }),
```

### 3. Internal-task body source

Find the API for task details. V1 uses Electron IPC; V2 has
collections in the task view (live-query from cloud). Options:

- `apiTrpcClient.tasks.get.query({ id })` if such a procedure exists.
- Read from the existing `collections.tasks` live-query data (already
in renderer memory from the task view).
- Host-service proxies the Superset API.

Need to inspect the task view's data source to find the right shape.
The pending row already has `{ id, slug, title }` from the picker;
the missing field is `description` (and potentially
`acceptanceCriteria`, `comments`, `labels`).

### 4. Swap stubs in `buildResolveCtxFromPending`

`apps/desktop/src/renderer/routes/_authenticated/_dashboard/pending/$pendingId/buildForkAgentLaunch.ts`

Replace the three fake fetchers in `buildResolveCtxFromPending` with
real calls to host-service via `getHostServiceClientByUrl(hostUrl)`:

```ts
fetchIssue: async (url) => {
  // The pending row carries the issue number picked in the modal.
  const match = pending.linkedIssues.find((i) => i.url === url);
  // `notFound` is assumed to build the error for an unresolved link.
  if (!match?.number) throw notFound(url);
  const data = await client.workspaceCreation.getIssueContent.query({
    projectId: pending.projectId,
    issueNumber: match.number,
  });
  return {
    number: data.number,
    url: data.url,
    title: data.title,
    body: data.body,
    slug: match.slug,
  };
},
```

Same pattern for PR (using `match.prNumber`) and task (using task API).
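
For illustration, the PR variant might look like the sketch below. Every type here is an assumption made so the example is self-contained: `LinkedPr`, `PendingRow`, `pending.linkedPrs`, `HostClient`, and `makeFetchPullRequest` are hypothetical names, and only `getPullRequestContent` and `match.prNumber` come from this plan.

```ts
// All types below are illustrative assumptions, not the real definitions.
interface LinkedPr { url: string; prNumber?: number }
interface PendingRow { projectId: string; linkedPrs: LinkedPr[] }
interface PrContent {
  number: number;
  title: string;
  body: string;
  url: string;
  branch: string;
}
interface HostClient {
  workspaceCreation: {
    getPullRequestContent: {
      query(input: { projectId: string; prNumber: number }): Promise<PrContent>;
    };
  };
}

// Builds the PR fetcher from an injected client so it can be stubbed in tests.
function makeFetchPullRequest(client: HostClient, pending: PendingRow) {
  return async (url: string): Promise<PrContent> => {
    const prNumber = pending.linkedPrs.find((pr) => pr.url === url)?.prNumber;
    // Compare against undefined rather than relying on truthiness.
    if (prNumber === undefined) {
      throw new Error(`Linked PR not found in pending row: ${url}`);
    }
    const data = await client.workspaceCreation.getPullRequestContent.query({
      projectId: pending.projectId,
      prNumber,
    });
    return {
      number: data.number,
      url: data.url,
      title: data.title,
      body: data.body,
      branch: data.branch,
    };
  };
}
```

Injecting the client also covers step 5 in spirit: whatever is threaded through the launch inputs, the fetchers only need an object with this shape.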

### 5. Pass `hostUrl` to `buildForkAgentLaunch`

Currently the function doesn't have the host-service client. Thread
`hostUrl` (or the client itself) through `BuildForkAgentLaunchInputs`
so the resolvers can make real calls.

## Target prompt (after fixes)

```
<user prompt>

# Task TASK-42 — Refactor auth middleware

Split session-token storage from request handling so we can encrypt
at rest. Keep the public API shape stable.

Acceptance criteria:
- Sessions encrypted at rest
- No public-API shape change
- Migration for existing sessions

# Issue #123 — Auth middleware stores tokens in plaintext

Legal flagged this. Sessions written to disk without encryption. We
need to move to an encrypted KV before the compliance deadline.

The token-issuance path sets kid=k_primary but the active signing
key rotated to k_2026q1 last quarter. Decrypt falls back to
legacy plaintext which is the compliance violation...

# PR #200 — Rewrite auth middleware

Branch `fix/auth-encryption` is checked out in this workspace —
commits you make continue this PR.

Replaces plaintext token storage with encrypted KV. Migrates
existing sessions on first request...

# Attached files

The user attached these files alongside the prompt. They've been
written into the worktree at `.superset/attachments/`. Read them
to understand the request.

- .superset/attachments/trace.log
- .superset/attachments/notes.md
```
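
The target layout above can be sketched as a small composition function. This is a simplified illustration that skips the `{{...}}` template layer; `Section`, `PromptParts`, `renderSection`, and `composePrompt` are hypothetical names, and the shortened attachments header is an assumption.

```ts
// Hypothetical shapes for illustration only.
interface Section { heading: string; body: string }
interface PromptParts {
  userPrompt: string;
  tasks: Section[];
  issues: Section[];
  prs: Section[];
  attachments: string[]; // worktree-relative paths
}

// A section with an empty body degrades to a title-only block.
function renderSection({ heading, body }: Section): string {
  const trimmed = body.trim();
  return trimmed ? `# ${heading}\n\n${trimmed}` : `# ${heading}`;
}

// Order: user prompt, tasks, issues, PRs, then the attachment list.
function composePrompt(parts: PromptParts): string {
  const blocks: string[] = [];
  const userPrompt = parts.userPrompt.trim();
  if (userPrompt) blocks.push(userPrompt);
  for (const section of [...parts.tasks, ...parts.issues, ...parts.prs]) {
    blocks.push(renderSection(section));
  }
  if (parts.attachments.length > 0) {
    const list = parts.attachments.map((path) => `- ${path}`).join("\n");
    blocks.push(`# Attached files\n\n${list}`);
  }
  return blocks.join("\n\n");
}
```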

## Sequence

1. `getIssueContent` host-service procedure + stub swap → issue bodies flow.
2. `getPullRequestContent` procedure + stub swap → PR bodies + branch.
3. Task body source (scope the API first).
4. Thread `hostUrl` into `buildForkAgentLaunch` inputs.

## Deferred

- Issue/PR comments (phase 2).
- Body truncation (revisit if agents hit context limits in practice).
- Attach-as-file mode (not needed; inline works).
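
If truncation is revisited, a per-source cap could be as small as the sketch below. It is not part of the current plan: `capBody` and `utf8Length` are hypothetical names, the `[truncated]` marker is an assumption, and 200 KB is read here as 200 × 1024 bytes.

```ts
// Number of UTF-8 bytes needed to encode one code point.
function utf8Length(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Hypothetical cap: keeps whole code points until the byte budget is spent.
function capBody(body: string, maxBytes: number = 200 * 1024): string {
  let used = 0;
  let out = "";
  // for...of iterates by code point, so surrogate pairs are never split.
  for (const ch of body) {
    const size = utf8Length(ch.codePointAt(0) as number);
    if (used + size > maxBytes) {
      return out + "\n\n[truncated]";
    }
    used += size;
    out += ch;
  }
  return out;
}
```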
144 changes: 144 additions & 0 deletions apps/desktop/docs/V2_LAUNCH_TEST_PLAN.md
@@ -0,0 +1,144 @@
# V2 Launch Dispatch — Test Plan

Checklist for verifying the V2 workspace launch pipeline end-to-end.
Pair with `V2_LAUNCH_CONTEXT.md` for architectural background and
`v2-launch-test-artifacts/` for copy-pasteable sample data.

## Setup

1. `bun dev` at the repo root.
2. Ensure your active org has V2 cloud enabled (or you're testing a V2
project).
3. Settings → Agents: confirm **Claude** is enabled. For chat-agent
tests, enable **Superset Chat**.
4. (Optional) Open devtools console and filter by `[v2-launch]` to trace
dispatch. `collections` is exposed globally for pending-row inspection:
   ```js
   collections.pendingWorkspaces.toArray()
   ```

## A. Happy-path — terminal agent (Claude)

- [ ] **A1. Prompt only** — "add a README explaining this repo." Claude pane
opens. The command includes the prompt. No errors.
- [ ] **A2. Prompt + text attachment** — drag `v2-launch-test-artifacts/trace.log`
into the modal. After launch: verify `.superset/attachments/trace.log`
exists in the worktree (terminal: `ls .superset/attachments/`).
Claude prompt contains `![trace.log](.superset/attachments/trace.log)`.
- [ ] **A3. Prompt + image** — drag `v2-launch-test-artifacts/sample.png`.
Same as A2 with the image.
- [ ] **A4. Duplicate filename** — drag `trace.log` twice. Both files exist;
second is named `trace_1.log`. Prompt references both.
- [ ] **A5. Prompt + linked GitHub issue** — paste a real issue URL from the
picker. Claude prompt contains `# <issue title>`.
- [ ] **A6. Prompt + linked PR** — paste a PR URL. Prompt contains
`# <PR title>` and `` Branch: `<branch>` ``.
- [ ] **A7. Prompt + internal task** — link a task from the picker.
Prompt contains `# Task <id> — <title>`.
- [ ] **A8. Multi-source** — prompt + task + 2 issues + PR + 2 attachments.
All appear in the prompt, ordered:
user-prompt → tasks → issues → prs → attachment list.
- [ ] **A9. Rich-editor multimodal** — if the editor supports inline image
drops, drop an image between two text chunks. Image ref sits inline,
not appended at the end.
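
The naming that A4 expects (`trace.log`, then `trace_1.log`) can be sketched as follows. `uniqueFilename` is a hypothetical helper written for this checklist, and continuing with `_2`, `_3`, and so on is an assumption; the real attachment writer may differ.

```ts
// Hypothetical helper: returns a name not yet in `taken` and records it.
function uniqueFilename(name: string, taken: Set<string>): string {
  if (!taken.has(name)) {
    taken.add(name);
    return name;
  }
  const dot = name.lastIndexOf(".");
  // Treat a leading dot (".env") as part of the stem, not an extension.
  const stem = dot > 0 ? name.slice(0, dot) : name;
  const ext = dot > 0 ? name.slice(dot) : "";
  let n = 1;
  let candidate = `${stem}_${n}${ext}`;
  while (taken.has(candidate)) {
    n += 1;
    candidate = `${stem}_${n}${ext}`;
  }
  taken.add(candidate);
  return candidate;
}
```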

## B. Happy-path — chat agent (Superset Chat)

Disable Claude (or set Superset Chat as preferred via order in settings).

- [ ] **B1. Prompt only** — chat pane opens; first user message = prompt;
agent response streams.
- [ ] **B2. Prompt + attachment** — first message carries the file
(visible in the message bubble).
- [ ] **B3. Prompt + linked issue** — first message contains the issue
title block.
- [ ] **B4. Retry on send failure** — block network before submit, wait
for V2 chat retry loop, unblock. Message eventually sends.
`pending.chatLaunch` only clears after success.

## C. Pending-row lifecycle

- [ ] **C1. Field clears after consume** — devtools console after launch:
  ```js
  collections.pendingWorkspaces.toArray()
    .find(r => r.workspaceId === '<WS-ID>')
  ```
`terminalLaunch` / `chatLaunch` are `null`.
- [ ] **C2. No re-fire on revisit** — navigate out and back to the
workspace. No duplicate pane.
- [ ] **C3. Crash-safe** — submit, quit app before workspace opens.
Reopen app, navigate to `/v2-workspace/<ID>`. Pane still opens.
Pending row cleared after.
- [ ] **C4. Concurrent creates** — submit two workspaces in rapid
succession (different projects). Both pending rows dispatch
independently; no cross-contamination.
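
The lifecycle these checks exercise is "clear the field only after the pane opened". A minimal sketch under assumed shapes: `PendingRow`, `ConsumeResult`, and `consumeTerminalLaunch` are hypothetical, and the real consume hook may persist the cleared field differently.

```ts
// Hypothetical shapes; the real pending row and launch payload may differ.
interface PendingRow<L> {
  workspaceId: string;
  terminalLaunch: L | null;
}

type ConsumeResult = "skipped" | "failed" | "consumed";

async function consumeTerminalLaunch<L>(
  row: PendingRow<L>,
  openPane: (launch: L) => Promise<void>,
): Promise<ConsumeResult> {
  const launch = row.terminalLaunch;
  // Already consumed (or never stashed): revisiting must not re-fire.
  if (launch === null) return "skipped";
  try {
    await openPane(launch);
  } catch {
    // Leave the field set so a later visit can retry.
    return "failed";
  }
  // Clear only after the pane opened successfully.
  row.terminalLaunch = null;
  return "consumed";
}
```

Under this shape, C2 maps to the `"skipped"` branch, and C3 follows from the field staying set until a pane has actually opened.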

## D. Failure paths

- [ ] **D1. create fails** — kill host-service, submit. Pending page shows
"failed" with retry. No launch stashed. Retry after restart works.
- [ ] **D2. Attachment write fails** — manually `chmod` the worktree
read-only, submit with attachments. Dispatch logs warning; pane
still opens; files missing (expected degradation).
- [ ] **D3. `ensureSession` fails** — stop host-service after create but
before navigation. Consume hook logs warning. `terminalLaunch`
stays set. Restart host-service, refresh. Consume re-fires.
- [ ] **D4. Agent disabled mid-flow** — enable agent, start submit, disable
before create completes. Pending page finishes. No pane opens.
Pending row `terminalLaunch` stays null.
- [ ] **D5. No enabled agents** — disable all agents in settings. Submit.
Workspace creates. No pane opens. Expected.

## E. Source-mapping edge cases

- [ ] **E1. Empty prompt, attachments only** — submit with only a file,
no text. Terminal opens with the no-prompt command
(`claude --dangerously-skip-permissions`).
- [ ] **E2. Whitespace-only prompt** — `" \n "`. Treated as empty.
- [ ] **E3. Multiple linked issues** — 2+ github issues. Both render in
order.
- [ ] **E4. Task + issue together** — `taskSlug` = task's slug (task
wins). Both bodies render.
- [ ] **E5. Duplicate issue URL** — link same issue twice. Deduped.
- [ ] **E6. PR only** — no prompt, no issues, just a linked PR. Launch
succeeds; prompt = PR block.
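
E2 and E5 reduce to two small normalizations. The sketch below shows one way to express them; `isEmptyPrompt` and `dedupeByUrl` are hypothetical names, and keeping the first occurrence of a duplicate is an assumption.

```ts
// Whitespace-only prompts count as empty (E2).
function isEmptyPrompt(prompt: string): boolean {
  return prompt.trim().length === 0;
}

// Drops later entries that repeat an earlier URL, preserving order (E5).
function dedupeByUrl<T extends { url: string }>(items: T[]): T[] {
  const seen = new Set<string>();
  return items.filter((item) => {
    if (seen.has(item.url)) return false;
    seen.add(item.url);
    return true;
  });
}
```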

## F. Custom / non-default agents

- [ ] **F1. Codex (terminal)** — disable Claude, enable Codex. Submit.
Codex pane runs prompt.
- [ ] **F2. Custom terminal agent** — create one in settings with command
`echo` (simple test). Submit. Pane runs `echo <prompt>`.
- [ ] **F3. Custom `contextPromptTemplateUser`** — settings → Claude →
override user template to `"PREFIX {{userPrompt}} SUFFIX"`. Submit.
Command contains `PREFIX <prompt> SUFFIX`.
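
The substitution F3 checks can be sketched as a single `replace` pass. `renderTemplate` is a hypothetical name, and leaving unknown placeholders untouched is an assumption; the real template engine may behave differently.

```ts
// Replaces {{name}} placeholders with values from `vars`.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined;
    // Unknown placeholders are left as-is rather than silently dropped.
    return value !== undefined ? value : placeholder;
  });
}
```

Using a replacer function rather than a replacement string means sequences like `$&` inside a user prompt are inserted literally instead of being interpreted by `String.prototype.replace`.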

## G. Cross-pane behavior

- [ ] **G1. Setup script + agent** — project has a setup script, submit.
Setup script pane **and** agent pane both appear as separate panes.
(This was the V1-bus bug that triggered the rewrite. If the agent
appears but the setup script merges into the same pane, that is a regression.)
- [ ] **G2. Presets + agent** — configure a default preset that
auto-applies. Submit. Preset terminals + agent terminal all
coexist.
- [ ] **G3. Chat + terminal presets** — chat agent + preset terminals.
Both appear.

## H. V1 regression

- [ ] **H1. V1 workspace creation still works** — create via the V1
modal (old workspace view, not V2 dashboard). V1 dispatch via
`WorkspaceInitEffects` + `useTabsStore` unchanged. Agent runs
as before.

## Priority

If time-limited, run these first:

- A1, A2 — minimum happy path terminal
- A8 — multi-source terminal
- B1 — minimum happy path chat
- C1 — field clears
- G1 — setup-script regression
- H1 — V1 regression
2 changes: 1 addition & 1 deletion apps/desktop/package.json
@@ -163,6 +163,7 @@
"date-fns": "^4.1.0",
"default-shell": "^2.2.0",
"detect-libc": "2.0.4",
"dexie": "^4.4.2",
"diff": "^7.0.0",
"dnd-core": "^16.0.1",
"dockerfile-ast": "0.7.1",
@@ -186,7 +187,6 @@
"highlight.js": "^11.11.1",
"html-to-image": "^1.11.13",
"http-proxy": "^1.18.1",
"idb": "^8.0.3",
"idb-keyval": "^6.2.2",
"jose": "^6.1.3",
"js-yaml": "^4.1.1",