Conversation
📝 WalkthroughWalkthroughImplements one-way GitHub Issues → Tasks sync: stores GitHub installation in integrationConnections, adds initial bulk issue sync, webhook-driven incremental sync, mapping utilities to translate issues to tasks, TRPC procedures to view/update sync config and trigger sync, and documentation. Changes
Sequence Diagram(s)sequenceDiagram
actor Client
participant Server
participant GitHub
participant DB
Client->>Server: triggerIssueSync(organizationId)
Server->>DB: Query integrationConnections (provider=github, org)
DB-->>Server: installation + GithubConfig
Server->>GitHub: List repositories for installation
GitHub-->>Server: repo list
loop per repository
Server->>GitHub: Fetch open issues (exclude PRs)
GitHub-->>Server: issues
Server->>DB: resolveTaskStatusIds(organizationId)
DB-->>Server: unstartedId, completedId
loop per issue
Server->>DB: query members by email (assignee)
DB-->>Server: assigneeId?
Server->>Server: mapGithubIssueToTask(issue,...)
Server->>DB: upsert task (onConflictDoUpdate)
DB-->>Server: upsert result
end
end
Server-->>Client: { success: true }
sequenceDiagram
actor GitHub as GitHub Webhook
participant Server
participant DB
GitHub->>Server: POST /webhook (issue event)
Server->>DB: find integrationConnections by repo->installation
DB-->>Server: installation + GithubConfig
alt syncIssues enabled
Server->>Server: processGithubIssueEvent(issue, action)
Server->>DB: resolveTaskStatusIds(or update/delete)
DB-->>Server: statusIds
Server->>DB: upsert or soft-delete task
DB-->>Server: result
else syncIssues disabled
Server-->>GitHub: 200 (skipped)
end
Server-->>GitHub: 200 OK
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 🧹 Recent nitpick comments
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Add one-way sync from GitHub issues to Superset tasks, following the established Linear integration pattern. Issues are mapped directly into the tasks table using externalProvider='github'. - Add issue webhook handlers (opened, edited, closed, reopened, assigned, unassigned, labeled, unlabeled, deleted) - Add issue sync to initial-sync job with PR filtering - Create shared map-issue-to-task utility for reuse - Add GithubConfig type with syncIssues toggle - Upsert integrationConnections row on GitHub App install - Add getConfig, updateConfig, triggerIssueSync tRPC procedures - Add design doc at docs/github-issues-sync.md
3af4b66 to
0193780
Compare
🚀 Preview Deployment🔗 Preview Links
Preview updates automatically with new commits |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@apps/api/src/app/api/github/lib/map-issue-to-task.ts`:
- Around line 52-67: resolveAssigneeId currently only checks assignee.email
(which GitHub webhooks don't provide) so it effectively always returns null;
update it to either (A) try to match on assignee.login by querying
users.githubLogin (use users.githubLogin in the query if that column exists) and
return the matched user's id, or (B) if the githubLogin column isn't present
yet, add a clear TODO comment inside resolveAssigneeId stating this is a
placeholder and that matching should use assignee.login against
users.githubLogin when the DB schema is updated; reference the resolveAssigneeId
function and the users table (users.githubLogin) when making the change and
ensure the function still returns Promise<string | null>.
- Around line 25-46: The resolveTaskStatusIds function currently uses
statuses.find() after querying taskStatuses without an ORDER BY, which yields
non-deterministic results when multiple rows share the same type; update the DB
query that builds statuses (reference taskStatuses and the const statuses) to
include an explicit ordering (e.g., order by taskStatuses.position or another
stable column) so that the subsequent lookups for unstartedStatus and
completedStatus via statuses.find(...) always pick the intended row
deterministically.
In `@docs/github-issues-sync.md`:
- Around line 15-27: Update the two fenced code blocks in
docs/github-issues-sync.md (the flow diagram blocks shown as "GitHub Issue Event
... Upsert into `tasks` table" and the "User installs GitHub App / triggers
manual sync" block) to include a language specifier by changing the opening
triple backticks to ```text so markdownlint stops flagging them; ensure both
fenced blocks use ```text exactly.
🧹 Nitpick comments (5)
packages/trpc/src/router/integration/github/github.ts (2)
302-342: DRY:triggerIssueSynclargely duplicatestriggerSync.Both procedures share the same pattern: look up installation → build sync URL/body → dev-fetch or QStash publish. Consider extracting a shared helper like
dispatchSyncJob({ installationId, organizationId })to avoid maintaining this logic in two places.♻️ Suggested helper extraction
+async function dispatchSyncJob({ + installationDbId, + organizationId, +}: { + installationDbId: string; + organizationId: string; +}) { + const syncUrl = `${env.NEXT_PUBLIC_API_URL}/api/github/jobs/initial-sync`; + const syncBody = { installationDbId, organizationId }; + + if (env.NODE_ENV === "development") { + fetch(syncUrl, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify(syncBody), + }).catch((error) => { + console.error("[github/dispatchSyncJob] Dev sync failed:", error); + }); + } else { + await qstash.publishJSON({ + url: syncUrl, + body: syncBody, + retries: 3, + }); + } +}
247-263: Type assertion on DB config — consider tolerant parsing.Line 260 casts
connection?.configtoGithubConfig | nullwithout validation. While theprovider="github"filter makes this safe in practice, the coding guidelines call for handling untrusted shapes from external/DB data. A lightweight check (e.g.,typeof config?.syncIssues === 'boolean') or a small Zod schema would add resilience if the JSON column ever contains unexpected data.This same
as GithubConfigpattern appears inwebhooks.ts(line 450) andinitial-sync/route.ts(line 248) — a single shared parser would cover all three call sites.apps/api/src/app/api/github/callback/route.ts (1)
128-155: Preserving config on re-install is a nice touch.The
onConflictDoUpdateintentionally omitsconfigso a reinstall doesn't reset the user'ssyncIssuespreference — good behavior.One minor note:
externalOrgIdis set toaccountLogin(line 136), which can change if the GitHub user/org renames. Consider using a stable numeric ID (e.g.,String(installation.account?.id)) forexternalOrgIdwhile keepingaccountLoginforexternalOrgName.apps/api/src/app/api/github/webhook/webhooks.ts (1)
410-453: Three sequential DB queries per webhook event.Each issue webhook fires repo → installation → integrationConnections lookups sequentially. At moderate webhook volume this is fine, but for high-traffic orgs this could become an I/O bottleneck. Consider collapsing into a single join query in a future optimization pass.
apps/api/src/app/api/github/jobs/initial-sync/route.ts (1)
238-249: Late destructure:organizationIdalready available from line 74.
organizationIdis re-destructured fromparsed.dataon line 239, but it was already available since line 74 whereinstallationDbIdwas extracted from the same object. Consider destructuring both together for clarity:♻️ Proposed fix
- const { installationDbId } = parsed.data; + const { installationDbId, organizationId } = parsed.data;Then remove line 239.
| export async function resolveTaskStatusIds({ | ||
| organizationId, | ||
| }: { | ||
| organizationId: string; | ||
| }): Promise<ResolvedStatusIds | null> { | ||
| const statuses = await db | ||
| .select({ id: taskStatuses.id, type: taskStatuses.type }) | ||
| .from(taskStatuses) | ||
| .where(eq(taskStatuses.organizationId, organizationId)); | ||
|
|
||
| const unstartedStatus = statuses.find((s) => s.type === "unstarted"); | ||
| const completedStatus = statuses.find((s) => s.type === "completed"); | ||
|
|
||
| if (!unstartedStatus || !completedStatus) { | ||
| return null; | ||
| } | ||
|
|
||
| return { | ||
| unstartedStatusId: unstartedStatus.id, | ||
| completedStatusId: completedStatus.id, | ||
| }; | ||
| } |
There was a problem hiding this comment.
Non-deterministic status selection when multiple statuses share a type.
statuses.find() returns the first matching element, but the DB query (line 30-33) has no ORDER BY, so the result depends on the DB's internal row ordering. If an org has multiple statuses with type='unstarted' (which the schema allows), the chosen status could vary between calls.
Add an explicit order (e.g., by position) for predictable results:
♻️ Proposed fix
const statuses = await db
.select({ id: taskStatuses.id, type: taskStatuses.type })
.from(taskStatuses)
- .where(eq(taskStatuses.organizationId, organizationId));
+ .where(eq(taskStatuses.organizationId, organizationId))
+ .orderBy(taskStatuses.position);🤖 Prompt for AI Agents
In `@apps/api/src/app/api/github/lib/map-issue-to-task.ts` around lines 25 - 46,
The resolveTaskStatusIds function currently uses statuses.find() after querying
taskStatuses without an ORDER BY, which yields non-deterministic results when
multiple rows share the same type; update the DB query that builds statuses
(reference taskStatuses and the const statuses) to include an explicit ordering
(e.g., order by taskStatuses.position or another stable column) so that the
subsequent lookups for unstartedStatus and completedStatus via
statuses.find(...) always pick the intended row deterministically.
| async function resolveAssigneeId({ | ||
| assignee, | ||
| }: { | ||
| assignee?: GithubIssue["assignee"]; | ||
| }): Promise<string | null> { | ||
| if (!assignee?.email) { | ||
| return null; | ||
| } | ||
|
|
||
| const matchedUser = await db.query.users.findFirst({ | ||
| where: eq(users.email, assignee.email), | ||
| columns: { id: true }, | ||
| }); | ||
|
|
||
| return matchedUser?.id ?? null; | ||
| } |
There was a problem hiding this comment.
resolveAssigneeId will always return null in practice.
This function requires assignee.email to be present, but:
- GitHub's webhook payloads for issue events don't include the assignee's email.
- The webhook handler in
webhooks.ts(line 468) only passes{ login }, neveremail. - The initial-sync route bypasses this function entirely (hardcodes
assigneeId: null).
The design doc correctly flags "Add githubLogin column to users table" as a future consideration, so this appears intentional. However, the function gives a false impression that assignee matching is functional. Consider either:
- Adding a TODO comment explaining why this is a placeholder, or
- Matching on
loginagainst a hypotheticalgithubLoginfield now and skipping the DB call when the field doesn't exist yet.
🤖 Prompt for AI Agents
In `@apps/api/src/app/api/github/lib/map-issue-to-task.ts` around lines 52 - 67,
resolveAssigneeId currently only checks assignee.email (which GitHub webhooks
don't provide) so it effectively always returns null; update it to either (A)
try to match on assignee.login by querying users.githubLogin (use
users.githubLogin in the query if that column exists) and return the matched
user's id, or (B) if the githubLogin column isn't present yet, add a clear TODO
comment inside resolveAssigneeId stating this is a placeholder and that matching
should use assignee.login against users.githubLogin when the DB schema is
updated; reference the resolveAssigneeId function and the users table
(users.githubLogin) when making the change and ensure the function still returns
Promise<string | null>.
| ``` | ||
| GitHub Issue Event | ||
| ↓ | ||
| GitHub Webhook (POST /api/github/webhook) | ||
| ↓ | ||
| Signature verification + idempotent event storage (webhookEvents) | ||
| ↓ | ||
| webhooks.on("issues.*") handler | ||
| ↓ | ||
| processGithubIssueEvent() | ||
| ↓ | ||
| Upsert into `tasks` table (externalProvider='github') | ||
| ``` |
There was a problem hiding this comment.
Add language specifier to fenced code blocks.
The markdownlint tool flags these blocks (lines 15 and 31) for missing language. Since they're flow diagrams, use ```text to satisfy the linter.
🔧 Proposed fix
-```
+```text
GitHub Issue Event-```
+```text
User installs GitHub App / triggers manual syncAlso applies to: 31-43
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)
[warning] 15-15: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In `@docs/github-issues-sync.md` around lines 15 - 27, Update the two fenced code
blocks in docs/github-issues-sync.md (the flow diagram blocks shown as "GitHub
Issue Event ... Upsert into `tasks` table" and the "User installs GitHub App /
triggers manual sync" block) to include a language specifier by changing the
opening triple backticks to ```text so markdownlint stops flagging them; ensure
both fenced blocks use ```text exactly.
Summary
taskstable usingexternalProvider='github'— no separate table neededopen→ unstarted,closed→ completed)syncIssuesconfig flag (defaults totrue) on the integration connectionChanges
New files:
apps/api/src/app/api/github/lib/map-issue-to-task.ts— shared utility for mapping GitHub issues to task fields, used by both webhooks and initial syncdocs/github-issues-sync.md— design doc covering architecture, data flow, field mapping, and future considerationsWebhook handlers (
apps/api/src/app/api/github/webhook/webhooks.ts):issues.opened,edited,closed,reopened,assigned,unassigned,labeled,unlabeled,deletedsyncIssuesconfig before processingInitial sync (
apps/api/src/app/api/github/jobs/initial-sync/route.ts):GitHub callback (
apps/api/src/app/api/github/callback/route.ts):integrationConnectionsrow withprovider='github'on install to enable config systemSchema (
packages/db/src/schema/types.ts):GithubConfigtype withsyncIssues?: booleantoIntegrationConfiguniontRPC procedures (
packages/trpc/src/router/integration/github/github.ts):getConfig— returns current GitHub integration configupdateConfig— setssyncIssuesflagtriggerIssueSync— queues initial sync job for manual re-syncTest Plan
triggerIssueSync→ verify existing open issues are importedpull_requestkey) → skipped during initial syncsyncIssues: falseviaupdateConfig→ verify webhooks skip issue processingbun run typecheckpassesbun run lint:fixpassesSummary by CodeRabbit
New Features
Documentation