
feat: add external_id to order key #4379

Merged
chronark merged 18 commits into main from 11-20-feat_add_external_id_to_order_key
Nov 25, 2025

Conversation

@chronark
Collaborator

@chronark chronark commented Nov 20, 2025

What does this PR do?

The v3 tables just add external_id to the order key.

I will first

  • migrate the tables in clickhouse
  • migrate the users
  • and then we merge and deploy the apis and dashboard

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Chore (refactoring code, technical debt, workflow improvements)
  • Enhancement (small improvements)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How should this be tested?

  • Verify that the migration runs successfully in a test environment
  • Check that data flows correctly from raw verification events through the aggregation chain
  • Confirm that queries against the new tables return expected results
  • Validate that the TTL policies are correctly applied

Checklist

Required

  • Filled out the "How to test" section in this PR
  • Read Contributing Guide
  • Self-reviewed my own code
  • Commented on my code in hard-to-understand areas
  • Ran pnpm build
  • Ran pnpm fmt
  • Ran make fmt on /go directory
  • Checked for warnings, there are none
  • Removed all console.logs
  • Merged the latest changes from main onto my branch with git pull origin main
  • My changes don't cause any responsiveness issues

@vercel

vercel bot commented Nov 20, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments

  • dashboard: Ignored (Preview, updated Nov 25, 2025 4:39pm UTC)
  • engineering: Ignored (Preview, updated Nov 25, 2025 4:39pm UTC)

@changeset-bot

changeset-bot bot commented Nov 20, 2025

⚠️ No Changeset found

Latest commit: 9268a89

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@coderabbitai
Contributor

coderabbitai bot commented Nov 20, 2025

📝 Walkthrough

Walkthrough

Adds v3 ClickHouse aggregation tables (minute/hour/day/month) and their materialized views, extends raw retention TTL, updates app and test references to v3, and introduces migration scripts and DDL to populate and backfill the v3 tables.

Changes

Cohort / File(s) Summary
ClickHouse Schema — New v3 Tables & MVs
go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql, go/pkg/clickhouse/schema/003_key_verifications_per_hour_v3.sql, go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql, go/pkg/clickhouse/schema/005_key_verifications_per_month_v3.sql
Added AggregatingMergeTree tables key_verifications_per_*_v3 with columns (time, workspace_id, key_space_id, identity_id, external_id, key_id, outcome, tags, count, spent_credits, latency_avg, latency_p75, latency_p99), bloom-filter indexes on identity_id, key_id, tags, TTLs, and paired materialized views to aggregate from upstream tables.
ClickHouse Schema — TTL change
go/pkg/clickhouse/schema/001_key_verifications_raw_v2.sql, go/pkg/clickhouse/migrations/20251125075943.sql, go/pkg/clickhouse/migrations/20251125163818.sql
Modified TTL for key_verifications_raw_v2 to 90 days; added migration DDL creating v3 tables/MVs and a later migration adjusting key_verifications_per_day_v3 TTL to 365 days.
ClickHouse Views — Upstream source updates
go/pkg/clickhouse/schema/017_active_workspaces_per_month_v2.sql, go/pkg/clickhouse/schema/019_billable_verifications_per_month_v2.sql
Updated materialized views to select from key_verifications_per_month_v3 instead of v2 (recreated via migration files).
Application — Table reference updates (Go & API)
apps/api/src/routes/v1_analytics_getVerifications.ts, go/apps/api/routes/v2_analytics_get_verifications/handler.go, go/cmd/create-clickhouse-user/main.go, go/pkg/clickhouse/user.go
Replaced v2 table names with v3 variants in table aliases, allowedTables and related lists; no other logic changes.
Tests — Go and internal TS tests updated
go/pkg/clickhouse/key_verifications_test.go, internal/clickhouse/src/verification_outcomes_propagate_correctly.test.ts
Updated tests to reference v3 aggregation tables; query structure and assertions unchanged.
TypeScript — Interval/table mappings
apps/engineering/content/docs/architecture/services/analytics.mdx, internal/clickhouse/src/keys/active_keys.ts, internal/clickhouse/src/verifications.ts
Updated documentation and runtime interval/table mappings from v2 → v3 for minute/hour/day/month intervals.
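The v2 → v3 interval/table bump described above amounts to updating a lookup table. A minimal sketch of that mapping follows; the actual shape and names inside internal/clickhouse/src/verifications.ts may differ, but the v3 table names match the schema files in this PR:

```typescript
// Hypothetical interval -> ClickHouse table lookup after the v2 -> v3 bump.
// Table names are taken from the new v3 schema files; the function and type
// names here are illustrative only.
type Interval = "minute" | "hour" | "day" | "month";

const intervalToTable: Record<Interval, string> = {
  minute: "default.key_verifications_per_minute_v3",
  hour: "default.key_verifications_per_hour_v3",
  day: "default.key_verifications_per_day_v3",
  month: "default.key_verifications_per_month_v3",
};

function tableForInterval(interval: Interval): string {
  return intervalToTable[interval];
}
```

Centralizing the mapping like this is why the application-layer diff is small: only the table-name values change, not the query logic.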
Migration / Backfill tooling
tools/migrate/ch_logs.ts, tools/migrate/ch_copy.ts
ch_logs.ts: refactored to windowed, batched processing, adds MySQL lookup, identity caching and UPDATEs to set external_id. ch_copy.ts: new script copying data v2→v3 with per-key concurrency control and adaptive throttling.
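The per-key concurrency control mentioned for ch_copy.ts can be sketched with a small counting semaphore. This is a hypothetical illustration of the pattern, not the script's actual implementation; the class, method names, and backoff policy are all assumptions:

```typescript
// Minimal counting semaphore: caps how many copy tasks run concurrently.
// Illustrative only; not taken from tools/migrate/ch_copy.ts.
class Semaphore {
  private queue: Array<() => void> = [];
  private active = 0;

  constructor(private readonly limit: number) {}

  async acquire(): Promise<void> {
    if (this.active < this.limit) {
      this.active++;
      return;
    }
    // Wait until a running task releases its slot.
    await new Promise<void>((resolve) => this.queue.push(resolve));
    this.active++;
  }

  release(): void {
    this.active--;
    const next = this.queue.shift();
    if (next) next(); // wake exactly one waiter
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

A caller would wrap each per-key copy in something like `sem.run(() => copyKey(keyId))`; adaptive throttling could then adjust the delay between windows based on observed insert latency.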

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Raw as key_verifications_raw_v2
    participant MinMV as key_verifications_per_minute_mv_v3
    participant Min as key_verifications_per_minute_v3
    participant HourMV as key_verifications_per_hour_mv_v3
    participant Hour as key_verifications_per_hour_v3
    participant DayMV as key_verifications_per_day_mv_v3
    participant Day as key_verifications_per_day_v3
    participant MonthMV as key_verifications_per_month_mv_v3
    participant Month as key_verifications_per_month_v3
    participant App as Application

    Raw->>MinMV: materialize (count, spent_credits, avgState, quantilesTDigestState) to minute
    MinMV->>Min: write minute aggregates
    Min->>HourMV: merge-aggregate (avgMergeState, quantilesTDigestMergeState) to hour
    HourMV->>Hour: write hour aggregates
    Hour->>DayMV: merge-aggregate to day
    DayMV->>Day: write day aggregates
    Day->>MonthMV: merge-aggregate to month
    MonthMV->>Month: write month aggregates

    rect rgb(230,245,230)
    Note over App,Month: Application and tests now query `*_v3` tables via aliases
    App->>Min: query v3 tables
    App->>Hour: query v3 tables
    App->>Day: query v3 tables
    App->>Month: query v3 tables
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Files affected: schema DDL, migrations, multiple app layers, tests, and two non-trivial migration scripts.
  • Areas needing focused review:
    • go/pkg/clickhouse/migrations/20251125075943.sql and 20251125163818.sql — verify DDL, partitioning, ORDER BY, index definitions, and TTL expressions.
    • tools/migrate/ch_logs.ts — identity lookup/cache correctness, batching/update semantics, error handling.
    • tools/migrate/ch_copy.ts — concurrency/semaphore correctness, windowing and insert semantics, backoff behavior.
    • Cross-check all application and test references to ensure no remaining v2 table references.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 40.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Title check (✅ Passed): the title 'feat: add external_id to order key' clearly and specifically describes the main change, adding external_id to the order key for v3 tables, matching the PR's core objective.
  • Description check (✅ Passed): the description is well-structured, follows the template, includes purpose, testing strategy, type of change, and a completed checklist demonstrating adherence to contribution standards.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch 11-20-feat_add_external_id_to_order_key

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e68bc6a and 9268a89.

⛔ Files ignored due to path filters (1)
  • go/pkg/clickhouse/migrations/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (2)
  • go/pkg/clickhouse/migrations/20251125163818.sql (1 hunks)
  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4379
File: apps/engineering/content/docs/architecture/services/analytics.mdx:66-72
Timestamp: 2025-11-21T09:03:42.171Z
Learning: In the unkeyed/unkey repository, when updating ClickHouse analytics table versions, public-facing table aliases (e.g., `key_verifications_per_month_v1`) should only be bumped to newer internal table versions (v2 → v3) if the change affects user-visible behavior or query results. Internal schema changes that don't impact end users (like adding fields to the order key) don't require updating the alias mapping.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3958
File: internal/clickhouse/src/requests.ts:7-8
Timestamp: 2025-09-12T13:25:41.849Z
Learning: In the ClickHouse v2 table migration, the location field was renamed from `colo` to `region` across all tables including `default.api_requests_raw_v2`. The zod insert schemas must be updated accordingly to avoid runtime insert failures.
📚 Learning: 2025-11-21T09:03:42.171Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4379
File: apps/engineering/content/docs/architecture/services/analytics.mdx:66-72
Timestamp: 2025-11-21T09:03:42.171Z
Learning: In the unkeyed/unkey repository, when updating ClickHouse analytics table versions, public-facing table aliases (e.g., `key_verifications_per_month_v1`) should only be bumped to newer internal table versions (v2 → v3) if the change affects user-visible behavior or query results. Internal schema changes that don't impact end users (like adding fields to the order key) don't require updating the alias mapping.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
  • go/pkg/clickhouse/migrations/20251125163818.sql
📚 Learning: 2025-04-22T14:43:11.724Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql:1-13
Timestamp: 2025-04-22T14:43:11.724Z
Learning: In the unkey project, the SQL files in clickhouse/schema/databases represent the current production schema and shouldn't be modified directly in PRs. Schema changes require dedicated migration scripts.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
📚 Learning: 2025-04-22T14:40:51.459Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
📚 Learning: 2025-10-30T15:10:52.743Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Test Packages / Test
🔇 Additional comments (3)
go/pkg/clickhouse/migrations/20251125163818.sql (1)

1-1: TTL migration aligns with corrected schema TTL.

The 365-day TTL is correctly set, addressing the prior discussion. Note that this matches the TTL already defined in the schema file—this approach is consistent with how migrations apply state changes to ClickHouse tables.

go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql (2)

1-32: Table schema correctly adds external_id across all structures.

The new external_id column is properly integrated: defined (line 6), included in the ORDER BY tuple (line 25), and will be populated by the materialized view. Schema and engine choice (AggregatingMergeTree) are appropriate for daily time-series aggregation.


34-60: Materialized view aggregation logic is sound.

The view correctly:

  • Groups by source columns, then transforms time to daily boundary in SELECT (standard ClickHouse aggregation pattern)
  • Uses state merge functions (avgMergeState, quantilesTDigestMergeState) to combine hour-level aggregates
  • Includes external_id in both SELECT (line 39) and GROUP BY (line 56) for proper bucketing

The aggregation chain from hour → day is consistent and should preserve external_id data through the pipeline.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@chronark
Collaborator Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Nov 20, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@chronark chronark marked this pull request as ready for review November 21, 2025 07:34
@github-actions
Contributor

github-actions bot commented Nov 21, 2025

Thank you for following the naming conventions for pull request titles! 🙏

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (10)
tools/migrate/ch_logs.ts (3)

92-105: Consider batching identity resolution for better performance.

The current implementation resolves missing external_id values one-by-one within the row processing loop. For large datasets with many missing external IDs, this results in O(n) database queries, which can significantly slow down the migration.

Consider batching the identity lookups:

// Collect all missing identity_ids first
const missingIdentityIds = rows
  .filter(row => row.external_id === "" && !identityCache.has(row.identity_id))
  .map(row => row.identity_id);

// Batch query all missing identities
if (missingIdentityIds.length > 0) {
  const identities = await db.query.identities.findMany({
    where: (table, { inArray }) => inArray(table.id, missingIdentityIds),
  });
  
  // Populate cache
  for (const identity of identities) {
    identityCache.set(identity.id, identity.externalId);
  }
}

// Now process rows using the cache
for (const row of rows) {
  if (row.external_id === "") {
    const externalId = identityCache.get(row.identity_id);
    if (!externalId) {
      console.error("identity not found", row.identity_id);
      continue;
    }
    row.external_id = externalId;
  }
  inserts.push(row);
}

40-41: Consider making time range configurable.

The start and end timestamps are hardcoded, which reduces the script's flexibility and reusability. Consider making these configurable via environment variables or command-line arguments.

Example using environment variables:

-  const start = 1724930749353;
-  const end = 1738212041696;
+  const start = Number(process.env.MIGRATION_START_TIME || "1724930749353");
+  const end = Number(process.env.MIGRATION_END_TIME || "1738212041696");

Or using a command-line argument library like yargs or commander for better usability.


69-70: Consider parameterized query approach.

While the direct interpolation of t and t + dt is safe here (since they're numbers), using parameterized queries is generally a better practice for maintainability and consistency.

Note: Since ClickHouse query API may not support standard parameterization, and the current approach is safe for numeric values, this is a minor style suggestion. The current implementation is acceptable given that t is a controlled numeric value.
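The [t, t + dt) windowing these queries iterate over can also be isolated as a pure helper, which keeps the bounds non-overlapping and easy to unit-test. This is a hypothetical sketch; ch_logs.ts may structure its windowing loop differently:

```typescript
// Split [start, end) into half-open windows of at most `dt` milliseconds.
// Purely illustrative; not the actual ch_logs.ts implementation.
function windows(start: number, end: number, dt: number): Array<[number, number]> {
  if (dt <= 0) throw new Error("dt must be positive");
  const out: Array<[number, number]> = [];
  for (let t = start; t < end; t += dt) {
    // Clamp the final window so it never extends past `end`.
    out.push([t, Math.min(t + dt, end)]);
  }
  return out;
}
```

Each [from, to) pair then feeds a query of the form `WHERE time >= {from} AND time < {to}`, so no row is processed twice across windows.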

go/pkg/clickhouse/key_verifications_test.go (2)

245-260: Avoid range-variable capture in parallel subtests

In the parallel subtests you call t.Parallel() inside t.Run while closing over the loop variable table (and in some cases keyID/identityID). On Go versions before the new range semantics, this can cause all subtests to use the last loop value, making assertions flaky or misleading. The newer external_id tests already use tbl := table / id := identityID to avoid this; mirroring that pattern here would make these older blocks safer and more consistent.

Example pattern within each for _, table := range ...:

for _, table := range []string{...} {
    tbl := table
    t.Run(tbl, func(t *testing.T) {
        t.Parallel()
        // use tbl instead of table
    })
}

Likewise, introduce k := keyID or similar when keyID is also captured in a parallel subtest.

Also applies to: 269-279, 290-300, 314-323


328-471: New external_id coverage is thorough and matches the data model

The added subtests around external_id (per-identity storage, filtering, outcome breakdowns, identity–external_id consistency, and credits spent per external_id) align with how the data is generated and provide strong end-to-end coverage across all v3 aggregation tables. The use of id := identityID / tbl := table inside parallel subtests correctly avoids range-capture pitfalls as well.

One minor follow-up you might consider later: if a billable_verifications_per_month_v3 table is introduced, mirroring the existing billing per workspace assertion against that table as well would keep billing coverage aligned with the rest of the v3 rollout.

go/pkg/clickhouse/user.go (1)

173-186: Default allowed tables correctly extended to include v3 variants

Adding the four default.key_verifications_per_*_v3 tables to DefaultAllowedTables ensures new ClickHouse users can access the new analytics surfaces without losing access to existing v2 tables. Once v2 is fully deprecated, you may want a follow-up pass to remove the v2 entries and tighten the permissions.

go/cmd/create-clickhouse-user/main.go (2)

37-41: Update description to reflect v3 analytics tables

The help text still says “grants SELECT on all v2 analytics tables”, but you now also grant on the v3 key verification tables. I’d reword this block to something like “v2/v3 analytics tables” or just “analytics tables (key verifications, ratelimits, API requests)” to avoid it going stale again.


67-79: v3 tables correctly added to allowlist; consider rollout ordering

Including the v3 key verification tables in allowedTables looks correct and keeps the CLI in sync with the new schema. Just ensure migrations that create these v3 tables are applied before running create-clickhouse-user in environments where they don’t yet exist, or GRANTs inside ConfigureUser could fail.

go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql (2)

1-31: Day‑level v3 schema looks consistent; optional index on external_id

The v3 day table schema (including external_id in both columns and ORDER BY) and the 356‑day TTL look consistent with the rest of the v3 pipeline. If you expect frequent filters on external_id specifically (beyond workspace_id / time / key_space_id), you might consider adding a bloom filter index for it as well, mirroring identity_id/key_id. Otherwise, this definition looks good.


33-59: Align MV SELECT column order with target table to avoid implicit mapping issues

The aggregation logic (summing counts/credits and merging latency states) looks correct, but the SELECT list currently emits time last while the target table has time as its first column. To avoid relying on ClickHouse’s implicit column mapping and any type/order surprises, I’d explicitly match the table’s column order in the MV:

-CREATE MATERIALIZED VIEW key_verifications_per_day_mv_v3 TO key_verifications_per_day_v3 AS
-SELECT
-  workspace_id,
-  key_space_id,
-  identity_id,
-  external_id,
-  key_id,
-  outcome,
-  tags,
-  sum(count) as count,
-  sum(spent_credits) as spent_credits,
-  avgMergeState(latency_avg) as latency_avg,
-  quantilesTDigestMergeState(0.75)(latency_p75) as latency_p75,
-  quantilesTDigestMergeState(0.99)(latency_p99) as latency_p99,
-  toDate(toStartOfDay(time)) AS time
-FROM
-  key_verifications_per_hour_v3
-GROUP BY
-  workspace_id,
-  time,
-  key_space_id,
-  identity_id,
-  external_id,
-  key_id,
-  outcome,
-  tags
-;
+CREATE MATERIALIZED VIEW key_verifications_per_day_mv_v3
+TO key_verifications_per_day_v3 AS
+SELECT
+  toDate(toStartOfDay(time)) AS time,
+  workspace_id,
+  key_space_id,
+  identity_id,
+  external_id,
+  key_id,
+  outcome,
+  tags,
+  sum(count) AS count,
+  sum(spent_credits) AS spent_credits,
+  avgMergeState(latency_avg) AS latency_avg,
+  quantilesTDigestMergeState(0.75)(latency_p75) AS latency_p75,
+  quantilesTDigestMergeState(0.99)(latency_p99) AS latency_p99
+FROM
+  key_verifications_per_hour_v3
+GROUP BY
+  time,
+  workspace_id,
+  key_space_id,
+  identity_id,
+  external_id,
+  key_id,
+  outcome,
+  tags
+;

This keeps the MV unambiguous and aligned with the table schema.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d05f220 and 85f1a16.

⛔ Files ignored due to path filters (1)
  • go/pkg/clickhouse/migrations/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (15)
  • apps/api/src/routes/v1_analytics_getVerifications.ts (1 hunks)
  • apps/engineering/content/docs/architecture/services/analytics.mdx (1 hunks)
  • go/apps/api/routes/v2_analytics_get_verifications/handler.go (1 hunks)
  • go/cmd/create-clickhouse-user/main.go (1 hunks)
  • go/pkg/clickhouse/key_verifications_test.go (14 hunks)
  • go/pkg/clickhouse/migrations/20251120214146.sql (1 hunks)
  • go/pkg/clickhouse/schema/001_key_verifications_raw_v2.sql (1 hunks)
  • go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql (1 hunks)
  • go/pkg/clickhouse/schema/003_key_verifications_per_hour_v3.sql (1 hunks)
  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql (1 hunks)
  • go/pkg/clickhouse/user.go (1 hunks)
  • internal/clickhouse/src/keys/active_keys.ts (1 hunks)
  • internal/clickhouse/src/verification_outcomes_propagate_correctly.test.ts (2 hunks)
  • internal/clickhouse/src/verifications.ts (1 hunks)
  • tools/migrate/ch_logs.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • go/pkg/clickhouse/schema/003_key_verifications_per_hour_v3.sql
  • go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3958
File: internal/clickhouse/src/requests.ts:7-8
Timestamp: 2025-09-12T13:25:41.849Z
Learning: In the ClickHouse v2 table migration, the location field was renamed from `colo` to `region` across all tables including `default.api_requests_raw_v2`. The zod insert schemas must be updated accordingly to avoid runtime insert failures.
📚 Learning: 2025-09-12T13:25:41.849Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3958
File: internal/clickhouse/src/requests.ts:7-8
Timestamp: 2025-09-12T13:25:41.849Z
Learning: In the ClickHouse v2 table migration, the location field was renamed from `colo` to `region` across all tables including `default.api_requests_raw_v2`. The zod insert schemas must be updated accordingly to avoid runtime insert failures.

Applied to files:

  • internal/clickhouse/src/verifications.ts
📚 Learning: 2025-09-25T09:18:38.763Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.

Applied to files:

  • internal/clickhouse/src/verifications.ts
📚 Learning: 2025-04-22T14:40:51.459Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.

Applied to files:

  • go/pkg/clickhouse/migrations/20251120214146.sql
  • go/pkg/clickhouse/schema/001_key_verifications_raw_v2.sql
  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
📚 Learning: 2025-04-22T14:43:11.724Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql:1-13
Timestamp: 2025-04-22T14:43:11.724Z
Learning: In the unkey project, the SQL files in clickhouse/schema/databases represent the current production schema and shouldn't be modified directly in PRs. Schema changes require dedicated migration scripts.

Applied to files:

  • go/pkg/clickhouse/migrations/20251120214146.sql
  • go/pkg/clickhouse/schema/001_key_verifications_raw_v2.sql
  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
📚 Learning: 2025-07-17T14:24:20.403Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3631
File: go/pkg/db/bulk_keyring_insert.sql.go:23-25
Timestamp: 2025-07-17T14:24:20.403Z
Learning: In go/pkg/db/bulk_keyring_insert.sql.go and similar bulk insert generated files, hardcoded zero values for fields like size_approx and size_last_updated_at are intentional and reflect the original SQL query structure, not missing parameters.

Applied to files:

  • go/pkg/clickhouse/schema/001_key_verifications_raw_v2.sql
📚 Learning: 2025-08-14T16:25:48.167Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3785
File: go/apps/api/routes/v2_keys_reroll_key/401_test.go:52-61
Timestamp: 2025-08-14T16:25:48.167Z
Learning: User Flo4604 requested creation of a GitHub issue to track converting all test files to use table-driven test patterns as a broader codebase improvement, following the suggestion made during review of go/apps/api/routes/v2_keys_reroll_key/401_test.go.

Applied to files:

  • go/pkg/clickhouse/key_verifications_test.go
📚 Learning: 2025-10-30T15:10:52.743Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
  • tools/migrate/ch_logs.ts
📚 Learning: 2025-01-31T13:50:45.004Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 2825
File: apps/dashboard/app/(app)/logs/components/table/logs-table.tsx:100-118
Timestamp: 2025-01-31T13:50:45.004Z
Learning: Sensitive data in logs is masked at the API layer (apps/api/src/pkg/middleware/metrics.ts) before being written to Clickhouse. This includes redacting API keys, plaintext values, and authorization headers using regex replacements.

Applied to files:

  • tools/migrate/ch_logs.ts
🧬 Code graph analysis (1)
tools/migrate/ch_logs.ts (1)
apps/dashboard/lib/db.ts (1)
  • db (5-26)
🪛 GitHub Actions: autofix.ci
tools/migrate/ch_logs.ts

[error] 35-35: lint/correctness/noUnusedVariables: 'insert' is defined but never used. Unused variable. Suggested fix: rename to '_insert' or apply unsafe fix. Some errors were emitted while applying fixes.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: PR title
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Test Dashboard / Test Dashboard
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test Packages / Test
  • GitHub Check: Build / Build
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (8)
go/pkg/clickhouse/key_verifications_test.go (1)

121-133: v3 aggregation table wiring in tests looks consistent

All updated loops now target the v3 aggregation tables (per-minute/hour/day/month), matching the rest of the PR (handlers, TS clients, migrations). Query shapes and expectations remain unchanged, so coverage is preserved while pointing at the new storage.

Also applies to: 135-154, 167-180, 194-205, 222-233, 245-260, 269-279, 290-300, 314-323

go/pkg/clickhouse/schema/001_key_verifications_raw_v2.sql (1)

44-45: TTL change to 90 days on raw_v2 looks fine; ensure migration is applied

Bumping the raw_v2 TTL from 1 month to INTERVAL 90 DAY is consistent with extending raw retention and doesn’t affect ordering or indexes. Given these schema files mirror production, just make sure the corresponding ClickHouse migration has already adjusted the live table TTL to match this definition.

internal/clickhouse/src/verification_outcomes_propagate_correctly.test.ts (1)

111-115: Swapping per-day table to v3 keeps the outcome propagation test aligned

Pointing both the OPTIMIZE TABLE and direct aggregation query at default.key_verifications_per_day_v3 matches the rest of the v3 rollout while preserving the test’s intent of validating raw→aggregated outcome propagation.

Also applies to: 118-128

go/apps/api/routes/v2_analytics_get_verifications/handler.go (1)

31-45: v2 analytics handler now correctly targets v3 aggregation tables

Updating tableAliases and allowedTables to the default.key_verifications_per_*_v3 tables keeps the SQL parser’s aliasing and allowlist in sync with the new schema, while still permitting access to default.key_verifications_raw_v2. This matches the ClickHouse user defaults and the TS clients’ v3 table usage.

internal/clickhouse/src/verifications.ts (1)

261-341: Timeseries intervals now consume v3 tables consistently

All entries in INTERVALS have been updated to use the default.key_verifications_per_*_v3 tables, so every granularity (minute → quarter) reads from the new aggregation layer while reusing the existing query and fill logic. This keeps the TS client aligned with the Go handlers and migrations that now target v3.

apps/api/src/routes/v1_analytics_getVerifications.ts (1)

223-245: Deprecated v1 analytics route correctly upgraded to v3 tables

The tables mapping now targets default.key_verifications_per_{hour,day,month}_v3, so v1 consumers read from the same aggregation layer as v2 and the internal TS clients. The existing fill and date handling logic can remain unchanged as long as the v3 schemas preserve the hour/day/month time column types used previously.

internal/clickhouse/src/keys/active_keys.ts (1)

62-142: Active-keys timeseries now reads from v3 verification aggregates

Updating ACTIVE_KEYS_INTERVALS to point at default.key_verifications_per_*_v3 ensures the active-keys charts are driven by the new aggregation tables, matching the rest of the analytics stack without changing query semantics.

go/pkg/clickhouse/migrations/20251120214146.sql (1)

1-4: Confirm intended retention split between day vs hour/minute/raw TTLs

The ALTERs look syntactically fine and consistent (day_v3 at 356 days, hour/minute_v3 and raw_v2 at 90 days). Please just double‑check this retention profile is what you want for v3 (especially reducing/aligning raw_v2 to 90d) before running in prod, as it’s hard to undo once TTL starts deleting data.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85f1a16 and 544e15b.

⛔ Files ignored due to path filters (1)
  • go/pkg/clickhouse/migrations/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • go/pkg/clickhouse/migrations/20251121081837.sql (1 hunks)
  • go/pkg/clickhouse/schema/017_active_workspaces_per_month_v2.sql (1 hunks)
  • go/pkg/clickhouse/schema/019_billable_verifications_per_month_v2.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3958
File: internal/clickhouse/src/requests.ts:7-8
Timestamp: 2025-09-12T13:25:41.849Z
Learning: In the ClickHouse v2 table migration, the location field was renamed from `colo` to `region` across all tables including `default.api_requests_raw_v2`. The zod insert schemas must be updated accordingly to avoid runtime insert failures.
📚 Learning: 2025-09-25T09:18:38.763Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.

Applied to files:

  • go/pkg/clickhouse/migrations/20251121081837.sql
  • go/pkg/clickhouse/schema/019_billable_verifications_per_month_v2.sql
  • go/pkg/clickhouse/schema/017_active_workspaces_per_month_v2.sql
📚 Learning: 2025-04-22T14:40:51.459Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.

Applied to files:

  • go/pkg/clickhouse/migrations/20251121081837.sql
  • go/pkg/clickhouse/schema/019_billable_verifications_per_month_v2.sql
  • go/pkg/clickhouse/schema/017_active_workspaces_per_month_v2.sql
📚 Learning: 2025-04-22T14:43:11.724Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql:1-13
Timestamp: 2025-04-22T14:43:11.724Z
Learning: In the unkey project, the SQL files in clickhouse/schema/databases represent the current production schema and shouldn't be modified directly in PRs. Schema changes require dedicated migration scripts.

Applied to files:

  • go/pkg/clickhouse/schema/017_active_workspaces_per_month_v2.sql
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Dashboard / Test Dashboard
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Build / Build
  • GitHub Check: Test Packages / Test
  • GitHub Check: Test Go API Local / Test
🔇 Additional comments (3)
go/pkg/clickhouse/schema/019_billable_verifications_per_month_v2.sql (1)

18-18: Verify source table key_verifications_per_month_v3 exists before this schema is applied.

This schema file documents the target state. The materialized view now sources from key_verifications_per_month_v3 instead of key_verifications_per_month_v2. Ensure the upstream v3 table is created and populated before this schema definition is applied in the target environment.

go/pkg/clickhouse/schema/017_active_workspaces_per_month_v2.sql (1)

13-13: Confirm key_verifications_per_month_v3 source table schema and timeline.

This schema documents the target state where the keys-based materialized view sources from key_verifications_per_month_v3. Verify that v3 table structure (schema, columns) is compatible with the expected aggregation (selecting workspace_id and toDate(time)). Note that the ratelimits-based view (line 15) correctly continues sourcing from ratelimits_per_month_v2.

go/pkg/clickhouse/migrations/20251121081837.sql (1)

1-8: Verify migration sequence and data continuity during cutover.

This migration drops and recreates materialized views to switch their data source from v2 to v3 tables. Ensure the following:

  1. Pre-migration check: Verify that key_verifications_per_month_v3 exists and is populated before this migration executes.
  2. Cutover timing: Confirm there are no data gaps between when the old views stop ingesting (DROP) and new views start ingesting (CREATE). Depending on ClickHouse transaction handling, a gap could cause missing aggregations.
  3. Backfill strategy: If v3 data has a different historical extent than v2, clarify whether backfill or recomputation of aggregations is needed.
  4. Migration order: This migration should execute after all v3 base tables and intermediate aggregations are ready (per PR objectives step 1).

Aligns with PR objective to migrate tables before deploying APIs.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
tools/migrate/ch_logs.ts (1)

67-138: Add connection cleanup.

The past review requested proper cleanup of database connections in a finally block, but this is still missing. The MySQL connection (conn) and ClickHouse clients should be closed when the script completes or fails.

Wrap the main processing loop in try-finally:

try {
  for (const table of tables) {
    // ... existing processing loop
  }
} finally {
  await conn.end();
  await rawCH.close();
  // Note: Check if ClickHouse wrapper client (ch) needs cleanup
}
🧹 Nitpick comments (2)
tools/migrate/ch_logs.ts (2)

49-51: Validate required MySQL environment variables.

The script validates CLICKHOUSE_URL but not the MySQL connection variables (DATABASE_USERNAME, DATABASE_PASSWORD, DATABASE_HOST). Missing variables will cause cryptic connection errors.

Add validation before line 49:

if (!process.env.DATABASE_USERNAME || !process.env.DATABASE_PASSWORD || !process.env.DATABASE_HOST) {
  throw new Error("DATABASE_USERNAME, DATABASE_PASSWORD, and DATABASE_HOST must be set");
}

123-123: Extract concurrency tuning parameters to constants.

The concurrency adjustment uses magic numbers (500, 10, 10, 2) that would be clearer as named constants.

Define constants at the top of the file:

const MAX_CONCURRENCY = 500;
const MIN_CONCURRENCY = 10;
const CONCURRENCY_INCREMENT_FACTOR = 10;
const CONCURRENCY_DECREASE_FACTOR = 2;

Then update the adjustments:

concurrency = Math.min(MAX_CONCURRENCY, concurrency + CONCURRENCY_INCREMENT_FACTOR / concurrency);
// ...
concurrency = Math.max(MIN_CONCURRENCY, concurrency / CONCURRENCY_DECREASE_FACTOR);

Also applies to: 127-127

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 544e15b and f2a2028.

📒 Files selected for processing (1)
  • tools/migrate/ch_logs.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4379
File: apps/engineering/content/docs/architecture/services/analytics.mdx:66-72
Timestamp: 2025-11-21T09:03:42.171Z
Learning: In the unkeyed/unkey repository, when updating ClickHouse analytics table versions, public-facing table aliases (e.g., `key_verifications_per_month_v1`) should only be bumped to newer internal table versions (v2 → v3) if the change affects user-visible behavior or query results. Internal schema changes that don't impact end users (like adding fields to the order key) don't require updating the alias mapping.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3958
File: internal/clickhouse/src/requests.ts:7-8
Timestamp: 2025-09-12T13:25:41.849Z
Learning: In the ClickHouse v2 table migration, the location field was renamed from `colo` to `region` across all tables including `default.api_requests_raw_v2`. The zod insert schemas must be updated accordingly to avoid runtime insert failures.
📚 Learning: 2025-04-22T14:40:51.459Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.

Applied to files:

  • tools/migrate/ch_logs.ts
📚 Learning: 2025-10-21T09:45:47.560Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 4107
File: go/pkg/clickhouse/key_verifications_test.go:20-20
Timestamp: 2025-10-21T09:45:47.560Z
Learning: In ClickHouse tests (e.g., go/pkg/clickhouse/key_verifications_test.go), parallel execution with t.Parallel() is safe when tests use workspaceID-based isolation. Each test generates a unique workspaceID (via uid.New(uid.WorkspacePrefix)), and all database operations are scoped to that workspaceID, providing logical isolation even when tests share a single ClickHouse instance.

Applied to files:

  • tools/migrate/ch_logs.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Test Packages / Test
  • GitHub Check: Build / Build
🔇 Additional comments (1)
tools/migrate/ch_logs.ts (1)

7-29: Update script to target v3 tables instead of v2.

V3 tables already exist in the codebase (created in migration 20251120154236.sql) with the external_id field included, yet this migration script targets only the v2 aggregate tables. This script is the only place still referencing key_verifications_per_minute_v2, per_hour_v2, per_day_v2, and per_month_v2.

Since the PR objective is adding external_id support to v3 tables, update the table names in lines 9, 15, 20, 25 from v2 to v3 to align with the current migration target.

⛔ Skipped due to learnings
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4379
File: apps/engineering/content/docs/architecture/services/analytics.mdx:66-72
Timestamp: 2025-11-21T09:03:42.171Z
Learning: In the unkeyed/unkey repository, when updating ClickHouse analytics table versions, public-facing table aliases (e.g., `key_verifications_per_month_v1`) should only be bumped to newer internal table versions (v2 → v3) if the change affects user-visible behavior or query results. Internal schema changes that don't impact end users (like adding fields to the order key) don't require updating the alias mapping.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql:1-13
Timestamp: 2025-04-22T14:43:11.724Z
Learning: In the unkey project, the SQL files in clickhouse/schema/databases represent the current production schema and shouldn't be modified directly in PRs. Schema changes require dedicated migration scripts.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3958
File: internal/clickhouse/src/requests.ts:7-8
Timestamp: 2025-09-12T13:25:41.849Z
Learning: In the ClickHouse v2 table migration, the location field was renamed from `colo` to `region` across all tables including `default.api_requests_raw_v2`. The zod insert schemas must be updated accordingly to avoid runtime insert failures.

@vercel vercel bot temporarily deployed to Preview – engineering November 25, 2025 16:37 Inactive
@vercel vercel bot temporarily deployed to Preview – dashboard November 25, 2025 16:38 Inactive
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (4)
go/pkg/clickhouse/migrations/20251125075943.sql (1)

21-21: Fix TTL retention period to 365 days.

The TTL uses 356 days instead of 365 days for a year, which creates an incorrect retention window. This same issue appears in the schema file and should be corrected for consistency.

Apply this diff:

-PRIMARY KEY (`workspace_id`, `time`, `key_space_id`, `identity_id`, `external_id`, `key_id`, `outcome`, `tags`) ORDER BY (`workspace_id`, `time`, `key_space_id`, `identity_id`, `external_id`, `key_id`, `outcome`, `tags`) PARTITION BY (toStartOfMonth(time)) TTL time + toIntervalDay(356) SETTINGS index_granularity = 8192;
+PRIMARY KEY (`workspace_id`, `time`, `key_space_id`, `identity_id`, `external_id`, `key_id`, `outcome`, `tags`) ORDER BY (`workspace_id`, `time`, `key_space_id`, `identity_id`, `external_id`, `key_id`, `outcome`, `tags`) PARTITION BY (toStartOfMonth(time)) TTL time + toIntervalDay(365) SETTINGS index_granularity = 8192;
go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql (1)

31-31: Fix TTL retention period to 365 days.

The TTL uses 356 days instead of 365 days for a year. This should be corrected to reflect an accurate yearly retention period.

Apply this diff:

-TTL time + INTERVAL 356 DAY DELETE
+TTL time + INTERVAL 365 DAY DELETE
tools/migrate/ch_logs.ts (2)

27-27: Fix retention calculation to use 365 days per year.

The retention calculation uses 356 days instead of 365 days per year, so the resulting window falls roughly 36 days short of the intended 4 years.

Apply this diff:

-    retention: 4 * 356 * 24 * 60 * 60 * 1000,
+    retention: 4 * 365 * 24 * 60 * 60 * 1000,
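As a sanity check, the corrected constant works out as follows (a minimal arithmetic sketch; plain milliseconds with no leap-day handling, not the script's actual code):

```typescript
// 4 years of retention expressed in milliseconds, using 365 days/year.
const DAY_MS = 24 * 60 * 60 * 1000; // 86_400_000
const retention = 4 * 365 * DAY_MS;
console.log(retention); // 126144000000
```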

219-229: Critical: SQL injection vulnerability in UPDATE query.

The UPDATE query uses string interpolation to insert externalId, workspace_id, key_space_id, and identity_id directly into the SQL statement. If any of these values contain special characters (e.g., single quotes in "O'Brien"), it could break the query or enable SQL injection attacks.

Use parameterized queries if supported by the ClickHouse client:

-  await rawCH.exec({
-    query: `
-    UPDATE ${table}
-    SET external_id = '${externalId}'
-    WHERE
-    workspace_id = '${row.workspace_id}'
-    AND key_space_id = '${row.key_space_id}'
-    AND identity_id = '${row.identity_id}'
-    AND ( external_id = '' OR external_id = 'undefined' )
-    `,
-  });
+  await rawCH.exec({
+    query: `
+    UPDATE ${table}
+    SET external_id = {externalId:String}
+    WHERE
+    workspace_id = {workspaceId:String}
+    AND key_space_id = {keySpaceId:String}
+    AND identity_id = {identityId:String}
+    AND ( external_id = '' OR external_id = 'undefined' )
+    `,
+    query_params: {
+      externalId: externalId,
+      workspaceId: row.workspace_id,
+      keySpaceId: row.key_space_id,
+      identityId: row.identity_id,
+    },
+  });

If parameterized queries are not available, at minimum escape single quotes:

const escape = (str: string) => str.replace(/'/g, "\\'");
await rawCH.exec({
  query: `
  UPDATE ${table}
  SET external_id = '${escape(externalId)}'
  WHERE
  workspace_id = '${escape(row.workspace_id)}'
  AND key_space_id = '${escape(row.key_space_id)}'
  AND identity_id = '${escape(row.identity_id)}'
  AND ( external_id = '' OR external_id = 'undefined' )
  `,
});
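For illustration, the fallback helper above behaves like this (a quick standalone check of the quoting, not the script's actual code; prefer query_params when the client supports them):

```typescript
// Escape single quotes so interpolated values cannot terminate the
// surrounding SQL string literal.
const escape = (str: string) => str.replace(/'/g, "\\'");

console.log(escape("O'Brien")); // prints: O\'Brien
console.log(escape("no quotes")); // unchanged: no quotes
```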
🧹 Nitpick comments (4)
tools/migrate/ch_copy.ts (4)

63-77: Avoid logging full JSON result sets for every window

console.log(json); will dump the full response payload for each time window, which can explode logs and slow the script for large ranges or many keys.

Consider either removing this log or replacing it with a summarized variant (e.g., json.data.length and maybe a few sample IDs), or gating the full dump behind a debug flag:

-    console.log(json);
+    if (process.env.DEBUG_CH_COPY === "1") {
+      console.log(json);
+    }

83-127: Concurrency control looks solid; minor optional cleanups

The semaphore + Promise.race pattern with AIMD‑style concurrency control is well thought out and should behave nicely under load. A couple of optional polish ideas:

  • Extract the concurrency tuning logic into small helpers for readability (e.g., onSuccess() / onError()).
  • Consider rounding concurrency when comparing against semaphore.size to avoid subtle float vs int comparisons over time (even though it’s unlikely to cause real issues).

No blockers here; just nits if you touch this again.
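For reference, the additive-increase/multiplicative-decrease shape described above can be sketched like this (illustrative only; the 500/10/2 tuning values come from the script, while the helper names are hypothetical):

```typescript
const MAX_CONCURRENCY = 500;
const MIN_CONCURRENCY = 10;

let concurrency = MIN_CONCURRENCY;

// Additive increase on success: grows more slowly as concurrency rises.
function onSuccess(): void {
  concurrency = Math.min(MAX_CONCURRENCY, concurrency + 10 / concurrency);
}

// Multiplicative decrease on error: back off quickly.
function onError(): void {
  concurrency = Math.max(MIN_CONCURRENCY, concurrency / 2);
}

onSuccess(); // 10 + 10/10 = 11
onError(); // max(10, 11/2) = 10
console.log(Math.round(concurrency)); // 10
```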


131-131: Ensure your runtime supports top‑level await for this script

The script ends with a top‑level await Promise.all(...). That’s fine in ESM/modern TS setups (e.g., module set appropriately and using tsx/Node with top‑level await support), but will fail in older CommonJS or misconfigured builds.

Just double‑check this is executed in an environment where top‑level await is allowed (e.g., via tsx tools/migrate/ch_copy.ts or similar).
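A minimal fallback sketch for environments without top-level await (e.g. CommonJS output); the task list here is a stand-in for the script's real per-table promises:

```typescript
// Wrap the final Promise.all in an explicit async entrypoint instead
// of relying on top-level await.
async function main(): Promise<number[]> {
  const tasks = [Promise.resolve(1), Promise.resolve(2)];
  return Promise.all(tasks);
}

main()
  .then((results) => console.log(results.length)) // 2
  .catch((err) => console.error(err));
```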


89-113: Use query_params option (not params) with @clickhouse/client-web v1.11.1

The suggestion to use named query parameters is sound—the current string interpolation of keyId should be replaced with a typed parameter. However, the proposed diff uses params, but @clickhouse/client-web v1.11.1 expects query_params as the option key.

           .query({
             query: `
             INSERT INTO ${v3}
             SELECT
               time,
               workspace_id,
               key_space_id,
               identity_id,
               external_id,
               key_id,
               outcome,
               tags,
               count,
               spent_credits,
               latency_avg,
               latency_p75,
               latency_p99
             FROM ${v2}
             WHERE time >= fromUnixTimestamp64Milli(${t})
             AND time < fromUnixTimestamp64Milli(${t + dt})
-            AND key_id = '${keyId}'
-          `,
-          })
+            AND key_id = {keyId:String}
+          `,
+            query_params: { keyId },
+          })
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2a2028 and e68bc6a.

⛔ Files ignored due to path filters (1)
  • go/pkg/clickhouse/migrations/atlas.sum is excluded by !**/*.sum
📒 Files selected for processing (8)
  • apps/engineering/content/docs/architecture/services/analytics.mdx (1 hunks)
  • go/pkg/clickhouse/migrations/20251125075943.sql (1 hunks)
  • go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql (1 hunks)
  • go/pkg/clickhouse/schema/003_key_verifications_per_hour_v3.sql (1 hunks)
  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql (1 hunks)
  • go/pkg/clickhouse/schema/005_key_verifications_per_month_v3.sql (1 hunks)
  • tools/migrate/ch_copy.ts (1 hunks)
  • tools/migrate/ch_logs.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • go/pkg/clickhouse/schema/005_key_verifications_per_month_v3.sql
  • apps/engineering/content/docs/architecture/services/analytics.mdx
  • go/pkg/clickhouse/schema/003_key_verifications_per_hour_v3.sql
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4379
File: apps/engineering/content/docs/architecture/services/analytics.mdx:66-72
Timestamp: 2025-11-21T09:03:42.171Z
Learning: In the unkeyed/unkey repository, when updating ClickHouse analytics table versions, public-facing table aliases (e.g., `key_verifications_per_month_v1`) should only be bumped to newer internal table versions (v2 → v3) if the change affects user-visible behavior or query results. Internal schema changes that don't impact end users (like adding fields to the order key) don't require updating the alias mapping.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3958
File: internal/clickhouse/src/requests.ts:7-8
Timestamp: 2025-09-12T13:25:41.849Z
Learning: In the ClickHouse v2 table migration, the location field was renamed from `colo` to `region` across all tables including `default.api_requests_raw_v2`. The zod insert schemas must be updated accordingly to avoid runtime insert failures.
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql:1-13
Timestamp: 2025-04-22T14:43:11.724Z
Learning: In the unkey project, the SQL files in clickhouse/schema/databases represent the current production schema and shouldn't be modified directly in PRs. Schema changes require dedicated migration scripts.
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.
📚 Learning: 2025-11-21T09:03:42.171Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4379
File: apps/engineering/content/docs/architecture/services/analytics.mdx:66-72
Timestamp: 2025-11-21T09:03:42.171Z
Learning: In the unkeyed/unkey repository, when updating ClickHouse analytics table versions, public-facing table aliases (e.g., `key_verifications_per_month_v1`) should only be bumped to newer internal table versions (v2 → v3) if the change affects user-visible behavior or query results. Internal schema changes that don't impact end users (like adding fields to the order key) don't require updating the alias mapping.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
  • go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql
  • tools/migrate/ch_copy.ts
  • tools/migrate/ch_logs.ts
  • go/pkg/clickhouse/migrations/20251125075943.sql
📚 Learning: 2025-04-22T14:40:51.459Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
  • go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql
  • tools/migrate/ch_copy.ts
  • tools/migrate/ch_logs.ts
  • go/pkg/clickhouse/migrations/20251125075943.sql
📚 Learning: 2025-04-22T14:43:11.724Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3161
File: go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql:1-13
Timestamp: 2025-04-22T14:43:11.724Z
Learning: In the unkey project, the SQL files in clickhouse/schema/databases represent the current production schema and shouldn't be modified directly in PRs. Schema changes require dedicated migration scripts.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
  • go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql
  • go/pkg/clickhouse/migrations/20251125075943.sql
📚 Learning: 2025-10-30T15:10:52.743Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.

Applied to files:

  • go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql
  • go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql
📚 Learning: 2025-09-25T09:18:38.763Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4016
File: go/pkg/clickhouse/migrations/20250925091254.sql:10-10
Timestamp: 2025-09-25T09:18:38.763Z
Learning: In ClickHouse v2 table migrations, the region field should not be carried over from v1 because v1 uses Cloudflare region identifiers while v2 uses AWS region identifiers. These are incompatible data formats, so setting region to empty string in materialized view migrations is the correct approach.

Applied to files:

  • tools/migrate/ch_copy.ts
  • tools/migrate/ch_logs.ts
  • go/pkg/clickhouse/migrations/20251125075943.sql
📚 Learning: 2025-10-21T09:45:47.560Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 4107
File: go/pkg/clickhouse/key_verifications_test.go:20-20
Timestamp: 2025-10-21T09:45:47.560Z
Learning: In ClickHouse tests (e.g., go/pkg/clickhouse/key_verifications_test.go), parallel execution with t.Parallel() is safe when tests use workspaceID-based isolation. Each test generates a unique workspaceID (via uid.New(uid.WorkspacePrefix)), and all database operations are scoped to that workspaceID, providing logical isolation even when tests share a single ClickHouse instance.

Applied to files:

  • tools/migrate/ch_copy.ts
  • tools/migrate/ch_logs.ts
📚 Learning: 2025-02-27T14:35:15.251Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 2918
File: apps/dashboard/lib/trpc/routers/api/overview-api-search.ts:0-0
Timestamp: 2025-02-27T14:35:15.251Z
Learning: When using Drizzle ORM with LIKE queries, use the sql`` tagged template syntax (e.g., sql`${table.name} LIKE ${`%${input.query}%`}`) instead of direct string interpolation to prevent SQL injection attacks.

Applied to files:

  • tools/migrate/ch_logs.ts
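The safety of the tagged-template form comes from parameter binding: interpolated values never become part of the SQL text. A minimal sketch of the mechanism, using a hypothetical `sqlFragment` tag in place of Drizzle's sql`` (the real tag produces a dialect-aware query object, not this simplified shape):

```typescript
// Hypothetical stand-in for Drizzle's sql`` tag: interpolated values are
// collected as bound parameters, never spliced into the query text.
function sqlFragment(
  strings: TemplateStringsArray,
  ...values: unknown[]
): { text: string; params: unknown[] } {
  // Each interpolation slot becomes a "?" placeholder; the value travels
  // separately, so attacker-controlled input cannot alter the SQL syntax.
  const text = strings.reduce(
    (acc, part, i) => acc + part + (i < values.length ? "?" : ""),
    "",
  );
  return { text, params: values };
}

const userInput = "%'; DROP TABLE apis; --";
const query = sqlFragment`name LIKE ${`%${userInput}%`}`;

console.log(query.text);   // name LIKE ?
console.log(query.params); // the raw pattern, sent out-of-band as a parameter
```

Direct string interpolation would have spliced the `'; DROP TABLE` payload into the query text; here it can only ever match (or fail to match) as a LIKE pattern.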
📚 Learning: 2025-09-12T13:25:41.849Z
Learnt from: chronark
Repo: unkeyed/unkey PR: 3958
File: internal/clickhouse/src/requests.ts:7-8
Timestamp: 2025-09-12T13:25:41.849Z
Learning: In the ClickHouse v2 table migration, the location field was renamed from `colo` to `region` across all tables including `default.api_requests_raw_v2`. The zod insert schemas must be updated accordingly to avoid runtime insert failures.

Applied to files:

  • go/pkg/clickhouse/migrations/20251125075943.sql
🧬 Code graph analysis (2)
tools/migrate/ch_copy.ts (1)
internal/clickhouse/src/testutil.ts (1)
  • start (27-50)
tools/migrate/ch_logs.ts (1)
apps/dashboard/lib/db.ts (1)
  • db (5-26)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Test Dashboard / Test Dashboard
  • GitHub Check: Test Packages / Test
  • GitHub Check: Test Go API Local / Test
  • GitHub Check: Test API / API Test Local
  • GitHub Check: Build / Build
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (11)
go/pkg/clickhouse/migrations/20251125075943.sql (4)

1-1: LGTM: TTL modification on raw table.

The 90-day TTL on the raw table aligns with the retention period of the minute and hour v3 tables.


23-81: LGTM: v3 table schemas are consistent.

All v3 tables (hour, minute, month) consistently include external_id in the order key after identity_id, with appropriate bloom filter indexes and TTL settings.


83-89: LGTM: v2 materialized views correctly redirected to v3 sources.

The v2 materialized views are properly updated to consume data from the new v3 tables while maintaining their v2 target tables. This approach preserves backward compatibility while leveraging the improved v3 schema.


91-97: LGTM: v3 materialized views form correct aggregation chain.

The materialized views properly cascade from raw_v2 through minute, hour, day, to month v3 tables, with all aggregation functions (avgMergeState, quantilesTDigestMergeState) correctly applied and external_id included in the GROUP BY clauses.

go/pkg/clickhouse/schema/004_key_verifications_per_day_v3.sql (1)

34-60: LGTM: Materialized view correctly aggregates hour-level data.

The view properly aggregates from key_verifications_per_hour_v3 with appropriate merge state functions and includes external_id in the grouping dimensions.

go/pkg/clickhouse/schema/002_key_verifications_per_minute_v3.sql (2)

1-32: LGTM: Minute-level table schema is correct.

The table structure properly includes external_id in the order key, uses appropriate DateTime granularity, and has a reasonable 90-day TTL for minute-level data.


34-60: LGTM: Materialized view correctly aggregates raw events.

The view properly uses avgState and quantilesTDigestState (not Merge variants) when aggregating from the raw table, and correctly includes external_id from the source data.

tools/migrate/ch_logs.ts (4)

31-96: LGTM: Robust setup with cache handling.

The script includes proper environment validation, persistent caching to reduce database lookups, and signal handlers for graceful shutdown with cache persistence.


106-189: LGTM: Well-designed batch processing with adaptive concurrency.

The windowed processing approach with adaptive concurrency control (increases on success, decreases on failure) is a robust pattern for managing load on both ClickHouse and MySQL during migration.
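The pattern can be sketched in a few lines; `processWindow`, the concurrency bounds, and the halving step are illustrative assumptions, not the script's exact tuning:

```typescript
// Sketch of windowed processing with adaptive concurrency: the pool widens
// after a fully successful batch and shrinks after any failure. A real
// migration script would also cap retries per window to avoid spinning
// forever on a permanently failing range.
async function processWindows<T>(
  windows: T[],
  processWindow: (w: T) => Promise<void>,
  minConcurrency = 1,
  maxConcurrency = 16,
): Promise<void> {
  let concurrency = minConcurrency;
  let pending = [...windows];

  while (pending.length > 0) {
    const batch = pending.slice(0, concurrency);
    const results = await Promise.allSettled(batch.map(processWindow));

    // Re-queue only the windows that failed, ahead of the untouched ones.
    const failed = batch.filter((_, i) => results[i].status === "rejected");
    pending = [...failed, ...pending.slice(batch.length)];

    if (failed.length === 0) {
      concurrency = Math.min(maxConcurrency, concurrency + 1); // ease up
    } else {
      concurrency = Math.max(minConcurrency, Math.floor(concurrency / 2)); // back off
    }
  }
}
```

Because only failed windows are re-queued, a transient ClickHouse or MySQL error retries just that range at reduced parallelism instead of restarting the whole copy.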


194-218: LGTM: Identity lookup with effective caching.

The caching strategy effectively reduces database load by storing both found and deleted (null) identities, preventing repeated lookups for missing records.
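A minimal sketch of that negative-caching idea; `Identity` and `fetchIdentity` are hypothetical stand-ins for the script's actual types and MySQL lookup:

```typescript
// Sketch of an identity cache that stores misses as null ("negative
// caching"), so deleted identities are not looked up on every row.
type Identity = { id: string; externalId: string };

const identityCache = new Map<string, Identity | null>();

async function getIdentity(
  keyId: string,
  fetchIdentity: (keyId: string) => Promise<Identity | null>,
): Promise<Identity | null> {
  // has() (not get()) distinguishes "cached as missing" from "never looked up".
  if (identityCache.has(keyId)) {
    return identityCache.get(keyId) ?? null;
  }
  const identity = await fetchIdentity(keyId); // null when the row was deleted
  identityCache.set(keyId, identity);
  return identity;
}
```

Without the `has()` check, a stored `null` would be indistinguishable from a cache miss and every row referencing a deleted identity would hit MySQL again.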


232-232: LGTM: Clean exit after migration completes.

The explicit exit ensures the process terminates cleanly after all tables are processed and the cache is saved.

@perkinsjr (Member) left a comment

I don't see anything that would be an issue


@chronark merged commit a33b61d into main on Nov 25, 2025
39 of 50 checks passed
@chronark deleted the 11-20-feat_add_external_id_to_order_key branch on November 25, 2025 at 17:11
