feat(api): sync GitHub issues into tasks #1295

Open
Kitenite wants to merge 2 commits into main from kitenite/github-into-tasks

Collaborator

@Kitenite Kitenite commented Feb 8, 2026

Summary

  • Add one-way sync from GitHub issues → Superset tasks, following the established Linear integration pattern
  • Issues map directly into the tasks table using externalProvider='github' — no separate table needed
  • Status mapping uses the org's existing task statuses by type (open → unstarted, closed → completed)
  • Gated by a syncIssues config flag (defaults to true) on the integration connection
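As a rough illustration of the status mapping summarized above, here is a minimal sketch. The `IssueLike`, `StatusIds`, and `TaskFields` shapes and the `toTaskFields` helper are hypothetical names made up for this example; the PR's actual mapping lives in map-issue-to-task.ts and may differ in fields and naming.

```typescript
// Hypothetical, simplified shapes; the real types in the PR may differ.
interface IssueLike {
	id: number;
	title: string;
	body: string | null;
	state: "open" | "closed";
}

interface StatusIds {
	unstartedStatusId: string;
	completedStatusId: string;
}

interface TaskFields {
	title: string;
	description: string | null;
	statusId: string;
	externalProvider: "github";
	externalId: string;
}

// Map an issue to task fields: open -> unstarted, closed -> completed.
function toTaskFields(issue: IssueLike, statusIds: StatusIds): TaskFields {
	return {
		title: issue.title,
		description: issue.body,
		statusId:
			issue.state === "closed"
				? statusIds.completedStatusId
				: statusIds.unstartedStatusId,
		externalProvider: "github",
		externalId: String(issue.id),
	};
}
```

Because the mapping is a pure function, the same code path could serve both the webhook handler and the initial sync, which is the reuse the summary describes.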

Changes

New files:

  • apps/api/src/app/api/github/lib/map-issue-to-task.ts — shared utility for mapping GitHub issues to task fields, used by both webhooks and initial sync
  • docs/github-issues-sync.md — design doc covering architecture, data flow, field mapping, and future considerations

Webhook handlers (apps/api/src/app/api/github/webhook/webhooks.ts):

  • Handle issues.opened, edited, closed, reopened, assigned, unassigned, labeled, unlabeled, deleted
  • Resolve org context via repo → installation chain
  • Check syncIssues config before processing

Initial sync (apps/api/src/app/api/github/jobs/initial-sync/route.ts):

  • Fetch open issues per repo after existing PR sync
  • Filter out pull requests (GitHub issues API includes PRs)
  • Batch upsert into tasks with conflict handling
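The pull-request filter can be sketched as follows. GitHub's issues listing returns pull requests as well, marked by a `pull_request` key. The `IssueOrPr` shape and the `excludePullRequests` name are assumptions for illustration, not code from this PR.

```typescript
// Minimal shape: pull requests returned by the issues listing carry a
// `pull_request` key; all other fields are omitted here.
interface IssueOrPr {
	number: number;
	title: string;
	pull_request?: unknown;
}

// Keep only real issues by dropping every item that has a pull_request key.
function excludePullRequests<T extends IssueOrPr>(items: T[]): T[] {
	return items.filter((item) => item.pull_request === undefined);
}
```

For example, passing a page containing one issue and one pull request yields a single-element array holding only the issue.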

GitHub callback (apps/api/src/app/api/github/callback/route.ts):

  • Upsert integrationConnections row with provider='github' on install to enable config system

Schema (packages/db/src/schema/types.ts):

  • Add GithubConfig type with syncIssues?: boolean to IntegrationConfig union
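A minimal sketch of how the flag's default could be read, assuming a missing config or missing flag means "enabled" as the summary states. `isIssueSyncEnabled` is a hypothetical helper name, not something taken from the PR.

```typescript
// Mirrors the GithubConfig shape named in the PR description.
type GithubConfig = { syncIssues?: boolean };

// Hypothetical helper: a missing config or a missing flag counts as enabled,
// so only an explicit `false` turns issue sync off.
function isIssueSyncEnabled(config: GithubConfig | null | undefined): boolean {
	return config?.syncIssues ?? true;
}
```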

tRPC procedures (packages/trpc/src/router/integration/github/github.ts):

  • getConfig — returns current GitHub integration config
  • updateConfig — sets syncIssues flag
  • triggerIssueSync — queues initial sync job for manual re-sync

Test Plan

  • Create a GitHub issue on a connected repo → verify task is created
  • Close the issue → verify task status changes to completed
  • Reopen → verify status changes back to unstarted
  • Edit issue title/body → verify task updates
  • Assign/unassign → verify assignee updates
  • Label/unlabel → verify labels update
  • Delete issue → verify task is soft-deleted
  • Run triggerIssueSync → verify existing open issues are imported
  • Issue that's a PR (has pull_request key) → skipped during initial sync
  • Duplicate webhook delivery → no duplicate tasks (idempotent)
  • Set syncIssues: false via updateConfig → verify webhooks skip issue processing
  • bun run typecheck passes
  • bun run lint:fix passes
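The duplicate-delivery item in the test plan relies on upserts being keyed by the external identity of the issue. The sketch below models that with an in-memory `Map`. The composite key (`externalProvider` plus `externalId`) and the `upsertTask` name are assumptions; the PR text does not state the actual conflict target.

```typescript
interface ExternalTask {
	externalProvider: "github";
	externalId: string;
	title: string;
}

// In-memory stand-in for a database upsert: the composite key decides
// whether a delivery inserts a new task or overwrites the existing one.
function upsertTask(store: Map<string, ExternalTask>, task: ExternalTask): void {
	store.set(`${task.externalProvider}:${task.externalId}`, task);
}
```

Replaying the same delivery leaves the store with a single entry, which is the property the test plan item checks.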

Summary by CodeRabbit

  • New Features

    • GitHub Issues now sync to Tasks (initial bulk sync and real-time via webhooks).
    • Per-organization toggle to enable/disable issue syncing and an on-demand trigger to run initial issue sync.
    • Issue events handled: opened, edited, closed, reopened, assigned/unassigned, labeled/unlabeled, deleted; mapped into Tasks with idempotent upserts.
  • Documentation

    • Added detailed guide covering sync flow, field mappings, configuration, and behavior.

Contributor

coderabbitai Bot commented Feb 8, 2026

📝 Walkthrough

Implements one-way GitHub Issues → Tasks sync: stores the GitHub installation in integrationConnections, adds an initial bulk issue sync and webhook-driven incremental sync, adds mapping utilities that translate issues to tasks, adds tRPC procedures to view and update the sync config and to trigger a sync, and adds documentation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| DB Types & Exports (`packages/db/src/schema/types.ts`) | Added GithubConfig type and extended IntegrationConfig union to include GitHub. |
| OAuth callback (`apps/api/src/app/api/github/callback/route.ts`) | Upserts installation credentials/config into integrationConnections (provider=github) with conflict-update of token, org IDs, connectedByUserId, and updatedAt. |
| Initial sync job (`apps/api/src/app/api/github/jobs/initial-sync/route.ts`) | Adds config-driven issue sync: resolves org status IDs, fetches open issues (non-PR), maps via mapGithubIssueToTask, and upserts tasks gated by syncIssues and available statuses. |
| Issue→Task mapping library (`apps/api/src/app/api/github/lib/map-issue-to-task.ts`) | New module exporting resolveTaskStatusIds, mapGithubIssueToTask, and processGithubIssueEvent; resolves assignees by email, maps fields, handles deletes (soft-delete), and upserts tasks. |
| Webhook handlers (`apps/api/src/app/api/github/webhook/webhooks.ts`) | New issue event handler for opened/edited/closed/reopened/assigned/unassigned/labeled/unlabeled/deleted: verifies installation, checks integrationConnections.config.syncIssues, and delegates to processGithubIssueEvent. |
| tRPC router (`packages/trpc/src/router/integration/github/github.ts`) | Added procedures: getConfig, updateConfig (org-admin), and triggerIssueSync (manual/queued sync trigger). |
| Docs (`docs/github-issues-sync.md`) | New documentation describing end-to-end flow, field mappings, webhook events handled, config shape (GithubConfig with syncIssues), and idempotency/upsert behavior. |

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Server
    participant GitHub
    participant DB

    Client->>Server: triggerIssueSync(organizationId)
    Server->>DB: Query integrationConnections (provider=github, org)
    DB-->>Server: installation + GithubConfig
    Server->>GitHub: List repositories for installation
    GitHub-->>Server: repo list

    loop per repository
        Server->>GitHub: Fetch open issues (exclude PRs)
        GitHub-->>Server: issues
        Server->>DB: resolveTaskStatusIds(organizationId)
        DB-->>Server: unstartedId, completedId
        loop per issue
            Server->>DB: query members by email (assignee)
            DB-->>Server: assigneeId?
            Server->>Server: mapGithubIssueToTask(issue,...)
            Server->>DB: upsert task (onConflictDoUpdate)
            DB-->>Server: upsert result
        end
    end

    Server-->>Client: { success: true }

sequenceDiagram
    actor GitHub as GitHub Webhook
    participant Server
    participant DB

    GitHub->>Server: POST /webhook (issue event)
    Server->>DB: find integrationConnections by repo->installation
    DB-->>Server: installation + GithubConfig
    alt syncIssues enabled
        Server->>Server: processGithubIssueEvent(issue, action)
        Server->>DB: resolveTaskStatusIds(or update/delete)
        DB-->>Server: statusIds
        Server->>DB: upsert or soft-delete task
        DB-->>Server: result
    else syncIssues disabled
        Server-->>GitHub: 200 (skipped)
    end
    Server-->>GitHub: 200 OK

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hop-hop, I mapped each issue with care,

From repo to task I carried them there,
Webhooks knock, initial sync hums,
Upserts settle, no dupes — here comes,
A carrot-coded sync, all tidy and fair! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat(api): sync GitHub issues into tasks' clearly and concisely describes the main feature being added: synchronizing GitHub issues into the tasks system. |
| Description check | ✅ Passed | The pull request description is comprehensive and well-structured, covering summary, changes, and test plan, though it lacks explicit linking of related issues and explicit type of change designation. |


No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
packages/trpc/src/router/integration/github/github.ts (1)

302-342: triggerIssueSync is nearly identical to triggerSync — extract shared helper.

Lines 302-342 duplicate the sync-triggering logic from triggerSync (lines 58-99): same URL, same payload shape, same dev/prod branching. The only difference is the console log message.

Also note that both procedures hit the same /api/github/jobs/initial-sync endpoint, which syncs both PRs and issues. The name triggerIssueSync implies it only syncs issues, which is misleading — consider renaming or documenting this.

♻️ Suggested refactor: extract a shared helper
+const triggerInitialSync = async (params: {
+	installationDbId: string;
+	organizationId: string;
+	logPrefix: string;
+}) => {
+	const syncUrl = `${env.NEXT_PUBLIC_API_URL}/api/github/jobs/initial-sync`;
+	const syncBody = {
+		installationDbId: params.installationDbId,
+		organizationId: params.organizationId,
+	};
+
+	if (env.NODE_ENV === "development") {
+		fetch(syncUrl, {
+			method: "POST",
+			headers: { "Content-Type": "application/json" },
+			body: JSON.stringify(syncBody),
+		}).catch((error) => {
+			console.error(`[${params.logPrefix}] Dev sync failed:`, error);
+		});
+	} else {
+		await qstash.publishJSON({
+			url: syncUrl,
+			body: syncBody,
+			retries: 3,
+		});
+	}
+};

Then both procedures call triggerInitialSync(...).

apps/api/src/app/api/github/jobs/initial-sync/route.ts (2)

278-323: Sequential per-issue upserts may be slow for repos with many issues.

Each issue is upserted individually inside nested for loops (per-repo, then per-issue). For repositories with hundreds of open issues, this creates many sequential DB round-trips. Consider batching upserts or at minimum parallelizing with bounded concurrency (e.g., Promise.all with chunks).

This isn't blocking for an initial implementation, but worth keeping in mind for repos with large issue counts.
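One way the chunked `Promise.all` idea could look is sketched below. `chunk` and `processInChunks` are hypothetical helpers, and `worker` stands in for whatever performs the per-issue upsert; none of these names come from the PR.

```typescript
// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
	if (size < 1) throw new Error("size must be at least 1");
	const chunks: T[][] = [];
	for (let i = 0; i < items.length; i += size) {
		chunks.push(items.slice(i, i + size));
	}
	return chunks;
}

// Run `worker` over all items, at most `size` at a time. Chunks run one
// after another; items inside a chunk run in parallel via Promise.all.
async function processInChunks<T>(
	items: T[],
	size: number,
	worker: (item: T) => Promise<void>,
): Promise<void> {
	for (const batch of chunk(items, size)) {
		await Promise.all(batch.map((item) => worker(item)));
	}
}
```

This bounds the number of in-flight database calls to `size` while still cutting total round-trip time compared with a fully sequential loop. A single multi-row upsert per chunk would reduce round-trips further, if the query builder in use supports it.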


292-299: Defensive email extraction from assignee is good, but the value is unused.

The email is extracted from issue.assignee (lines 296-298) and passed to mapGithubIssueToTask, but assigneeId is hardcoded to null on line 305. The mapping function receives the assignee data but the initial sync never resolves it to an org member. This is fine given the "future consideration" for GitHub login matching, but the email extraction work is currently a no-op.



Add one-way sync from GitHub issues to Superset tasks, following the
established Linear integration pattern. Issues are mapped directly into
the tasks table using externalProvider='github'.

- Add issue webhook handlers (opened, edited, closed, reopened, assigned,
  unassigned, labeled, unlabeled, deleted)
- Add issue sync to initial-sync job with PR filtering
- Create shared map-issue-to-task utility for reuse
- Add GithubConfig type with syncIssues toggle
- Upsert integrationConnections row on GitHub App install
- Add getConfig, updateConfig, triggerIssueSync tRPC procedures
- Add design doc at docs/github-issues-sync.md
@Kitenite Kitenite force-pushed the kitenite/github-into-tasks branch from 3af4b66 to 0193780 Compare February 8, 2026 07:45
Contributor

github-actions Bot commented Feb 8, 2026

🚀 Preview Deployment

🔗 Preview Links

Service Status Link
Neon Database (Neon) View Branch
Fly.io Electric (Fly.io) View App
Vercel API (Vercel) Open Preview
Vercel Web (Vercel) Open Preview
Vercel Marketing (Vercel) Open Preview
Vercel Admin (Vercel) Open Preview
Vercel Docs (Vercel) Failed to deploy

Preview updates automatically with new commits

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@apps/api/src/app/api/github/lib/map-issue-to-task.ts`:
- Around line 52-67: resolveAssigneeId currently only checks assignee.email
(which GitHub webhooks don't provide) so it effectively always returns null;
update it to either (A) try to match on assignee.login by querying
users.githubLogin (use users.githubLogin in the query if that column exists) and
return the matched user's id, or (B) if the githubLogin column isn't present
yet, add a clear TODO comment inside resolveAssigneeId stating this is a
placeholder and that matching should use assignee.login against
users.githubLogin when the DB schema is updated; reference the resolveAssigneeId
function and the users table (users.githubLogin) when making the change and
ensure the function still returns Promise<string | null>.
- Around line 25-46: The resolveTaskStatusIds function currently uses
statuses.find() after querying taskStatuses without an ORDER BY, which yields
non-deterministic results when multiple rows share the same type; update the DB
query that builds statuses (reference taskStatuses and the const statuses) to
include an explicit ordering (e.g., order by taskStatuses.position or another
stable column) so that the subsequent lookups for unstartedStatus and
completedStatus via statuses.find(...) always pick the intended row
deterministically.

In `@docs/github-issues-sync.md`:
- Around line 15-27: Update the two fenced code blocks in
docs/github-issues-sync.md (the flow diagram blocks shown as "GitHub Issue Event
... Upsert into `tasks` table" and the "User installs GitHub App / triggers
manual sync" block) to include a language specifier by changing the opening
triple backticks to ```text so markdownlint stops flagging them; ensure both
fenced blocks use ```text exactly.
🧹 Nitpick comments (5)
packages/trpc/src/router/integration/github/github.ts (2)

302-342: DRY: triggerIssueSync largely duplicates triggerSync.

Both procedures share the same pattern: look up installation → build sync URL/body → dev-fetch or QStash publish. Consider extracting a shared helper like dispatchSyncJob({ installationId, organizationId }) to avoid maintaining this logic in two places.

♻️ Suggested helper extraction
+async function dispatchSyncJob({
+	installationDbId,
+	organizationId,
+}: {
+	installationDbId: string;
+	organizationId: string;
+}) {
+	const syncUrl = `${env.NEXT_PUBLIC_API_URL}/api/github/jobs/initial-sync`;
+	const syncBody = { installationDbId, organizationId };
+
+	if (env.NODE_ENV === "development") {
+		fetch(syncUrl, {
+			method: "POST",
+			headers: { "Content-Type": "application/json" },
+			body: JSON.stringify(syncBody),
+		}).catch((error) => {
+			console.error("[github/dispatchSyncJob] Dev sync failed:", error);
+		});
+	} else {
+		await qstash.publishJSON({
+			url: syncUrl,
+			body: syncBody,
+			retries: 3,
+		});
+	}
+}

247-263: Type assertion on DB config — consider tolerant parsing.

Line 260 casts connection?.config to GithubConfig | null without validation. While the provider="github" filter makes this safe in practice, the coding guidelines call for handling untrusted shapes from external/DB data. A lightweight check (e.g., typeof config?.syncIssues === 'boolean') or a small Zod schema would add resilience if the JSON column ever contains unexpected data.

The same `as GithubConfig` cast appears in webhooks.ts (line 450) and initial-sync/route.ts (line 248); a single shared parser would cover all three call sites.
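A lightweight shared parser along these lines might look like the sketch below. It uses plain type guards rather than Zod to stay dependency-free. `parseGithubConfig` is a hypothetical name, and where such a helper would live in the repo is left open.

```typescript
type GithubConfig = { syncIssues?: boolean };

// Hypothetical shared parser: accept unknown JSON and keep only fields
// whose runtime type matches; anything else is dropped rather than trusted.
function parseGithubConfig(raw: unknown): GithubConfig {
	if (typeof raw !== "object" || raw === null) {
		return {};
	}
	const syncIssues = (raw as Record<string, unknown>).syncIssues;
	return typeof syncIssues === "boolean" ? { syncIssues } : {};
}
```

Returning an empty object for malformed input means callers fall back to whatever default they apply for a missing flag, instead of acting on a value of the wrong type.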

apps/api/src/app/api/github/callback/route.ts (1)

128-155: Preserving config on re-install is a nice touch.

The onConflictDoUpdate intentionally omits config so a reinstall doesn't reset the user's syncIssues preference — good behavior.

One minor note: externalOrgId is set to accountLogin (line 136), which can change if the GitHub user/org renames. Consider using a stable numeric ID (e.g., String(installation.account?.id)) for externalOrgId while keeping accountLogin for externalOrgName.

apps/api/src/app/api/github/webhook/webhooks.ts (1)

410-453: Three sequential DB queries per webhook event.

Each issue webhook fires repo → installation → integrationConnections lookups sequentially. At moderate webhook volume this is fine, but for high-traffic orgs this could become an I/O bottleneck. Consider collapsing into a single join query in a future optimization pass.

apps/api/src/app/api/github/jobs/initial-sync/route.ts (1)

238-249: Late destructure: organizationId already available from line 74.

organizationId is re-destructured from parsed.data on line 239, but it was already available since line 74 where installationDbId was extracted from the same object. Consider destructuring both together for clarity:

♻️ Proposed fix
-	const { installationDbId } = parsed.data;
+	const { installationDbId, organizationId } = parsed.data;

Then remove line 239.

Comment on lines +25 to +46
export async function resolveTaskStatusIds({
	organizationId,
}: {
	organizationId: string;
}): Promise<ResolvedStatusIds | null> {
	const statuses = await db
		.select({ id: taskStatuses.id, type: taskStatuses.type })
		.from(taskStatuses)
		.where(eq(taskStatuses.organizationId, organizationId));

	const unstartedStatus = statuses.find((s) => s.type === "unstarted");
	const completedStatus = statuses.find((s) => s.type === "completed");

	if (!unstartedStatus || !completedStatus) {
		return null;
	}

	return {
		unstartedStatusId: unstartedStatus.id,
		completedStatusId: completedStatus.id,
	};
}

⚠️ Potential issue | 🟡 Minor

Non-deterministic status selection when multiple statuses share a type.

statuses.find() returns the first matching element, but the DB query (lines 30-33) has no ORDER BY, so the result depends on the DB's internal row ordering. If an org has multiple statuses with type='unstarted' (which the schema allows), the chosen status could vary between calls.

Add an explicit order (e.g., by position) for predictable results:

♻️ Proposed fix
 const statuses = await db
 	.select({ id: taskStatuses.id, type: taskStatuses.type })
 	.from(taskStatuses)
-	.where(eq(taskStatuses.organizationId, organizationId));
+	.where(eq(taskStatuses.organizationId, organizationId))
+	.orderBy(taskStatuses.position);
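To see why a fixed order matters, here is an in-memory sketch of the same idea. The `StatusRow` shape and `pickStatusId` helper are hypothetical, and `position` is assumed to be a sortable number; the id tie-breaker is an extra added for this example, not part of the proposed fix.

```typescript
// Hypothetical row shape; `position` is assumed to be a sortable number.
interface StatusRow {
	id: string;
	type: string;
	position: number;
}

// Sort a copy by position (ties broken by id) so that find() always sees
// rows in the same order, then pick the first row of the requested type.
function pickStatusId(rows: StatusRow[], type: string): string | null {
	const ordered = [...rows].sort(
		(a, b) => a.position - b.position || a.id.localeCompare(b.id),
	);
	return ordered.find((row) => row.type === type)?.id ?? null;
}
```

With two `unstarted` rows, the function returns the same id regardless of the order in which the rows arrive, which is the property the `ORDER BY` gives the real query.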

Comment on lines +52 to +67
async function resolveAssigneeId({
	assignee,
}: {
	assignee?: GithubIssue["assignee"];
}): Promise<string | null> {
	if (!assignee?.email) {
		return null;
	}

	const matchedUser = await db.query.users.findFirst({
		where: eq(users.email, assignee.email),
		columns: { id: true },
	});

	return matchedUser?.id ?? null;
}

⚠️ Potential issue | 🟡 Minor

resolveAssigneeId will always return null in practice.

This function requires assignee.email to be present, but:

  1. GitHub's webhook payloads for issue events don't include the assignee's email.
  2. The webhook handler in webhooks.ts (line 468) only passes { login }, never email.
  3. The initial-sync route bypasses this function entirely (hardcodes assigneeId: null).

The design doc correctly flags "Add githubLogin column to users table" as a future consideration, so this appears intentional. However, the function gives a false impression that assignee matching is functional. Consider either:

  • Adding a TODO comment explaining why this is a placeholder, or
  • Matching on login against a hypothetical githubLogin field now and skipping the DB call when the field doesn't exist yet.
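A rough sketch of the login-based option follows, written against an in-memory user list so it stays self-contained. The `githubLogin` field does not exist yet according to this review, so it is purely an assumption here, as are the `AssigneeLike` and `UserRow` shapes and the `matchAssigneeToUser` name.

```typescript
// Hypothetical shapes: `githubLogin` is assumed for illustration only.
interface AssigneeLike {
	login: string;
	email?: string | null;
}

interface UserRow {
	id: string;
	email: string;
	githubLogin?: string | null;
}

// Prefer a login match (GitHub logins are case-insensitive), then fall
// back to email when the payload happens to include one.
function matchAssigneeToUser(
	assignee: AssigneeLike | null | undefined,
	users: UserRow[],
): string | null {
	if (!assignee) return null;
	const login = assignee.login.toLowerCase();
	const byLogin = users.find((u) => u.githubLogin?.toLowerCase() === login);
	if (byLogin) return byLogin.id;
	const email = assignee.email?.toLowerCase();
	if (!email) return null;
	return users.find((u) => u.email.toLowerCase() === email)?.id ?? null;
}
```

In a real implementation the lookup would be a database query rather than an array scan; the ordering of the two checks is the part this sketch is meant to show.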

Comment on lines +15 to +27
```
GitHub Issue Event
GitHub Webhook (POST /api/github/webhook)
Signature verification + idempotent event storage (webhookEvents)
webhooks.on("issues.*") handler
processGithubIssueEvent()
Upsert into `tasks` table (externalProvider='github')
```

⚠️ Potential issue | 🟡 Minor

Add language specifier to fenced code blocks.

The markdownlint tool flags these blocks (lines 15 and 31) for missing language. Since they're flow diagrams, use ```text to satisfy the linter.

🔧 Proposed fix
-```
+```text
 GitHub Issue Event
-```
+```text
 User installs GitHub App / triggers manual sync

Also applies to: 31-43

🧰 Tools
🪛 markdownlint-cli2 (0.20.0)

[warning] 15-15: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

