feat: track credits and latency in clickhouse#3854
Conversation
📝 Walkthrough

Removes Goose-based ClickHouse migrations from CI, Makefiles, tooling, and tests. Introduces a custom ClickHouse image and an init script that loads SQL schemas at container startup. Replaces the old migration files under internal/clickhouse with new schema files under go/pkg/clickhouse/schema. Adds v2 tables and materialized views for verifications and ratelimits.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Dev as Developer
    participant DC as docker-compose
    participant CH as ClickHouse (custom image)
    participant Init as init-clickhouse.sh
    participant FS as /opt/clickhouse-schemas
    Dev->>DC: docker compose up clickhouse
    DC->>CH: Build from deployment/Dockerfile.clickhouse
    CH->>Init: Run entrypoint init scripts on first start
    Init->>FS: Iterate database/* directories
    loop For each *.sql
        Init->>CH: clickhouse-client -q "<sql>"
        CH-->>Init: OK/ERR
    end
    Init-->>Dev: Initialization complete
```
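The init loop above relies on the numeric prefixes of the schema directories and files to apply SQL in a stable order. A minimal Go sketch of that ordering logic (the example paths are illustrative, not the repo's actual file list):

```go
package main

import (
	"fmt"
	"sort"
)

// orderSchemaFiles returns schema paths in apply order. Because directories
// and files carry zero-padded numeric prefixes (001_, 002_, ...), plain
// lexicographic sorting matches the intended execution order.
func orderSchemaFiles(paths []string) []string {
	out := append([]string(nil), paths...)
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical example paths under /opt/clickhouse-schemas.
	files := []string{
		"002_ratelimits/002_raw_ratelimits_v2.sql",
		"001_verifications/003_key_verifications_per_minute_v2.sql",
		"001_verifications/002_raw_key_verifications_v2.sql",
	}
	for _, f := range orderSchemaFiles(files) {
		fmt.Println(f)
	}
}
```

The real script pipes each file to `clickhouse-client -q`; the sketch only shows why the numeric prefixes make iteration order deterministic.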
```mermaid
sequenceDiagram
    autonumber
    actor CI as CI Workflow
    participant UT as Unit Tests
    participant API as API Local Tests
    CI->>UT: Setup Node / Build
    CI--x UT: Install goose (removed)
    CI->>UT: Run tests
    CI->>API: Build / Load MySQL schema
    CI--x API: Install goose (removed)
    CI--x API: goose up ClickHouse (removed)
    CI->>API: Run tests / dump & upload logs
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
The latest updates on your projects. Learn more about Vercel for GitHub.
1 Skipped Deployment
Thank you for following the naming conventions for pull request titles! 🙏
Graphite Automations

"Notify author when CI fails" took an action on this PR • (08/26/25): 1 teammate was notified based on Andreas Thomas's automation.
perkinsjr left a comment
I don't see anything insane that would cause issues.
Actionable comments posted: 62
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
go/pkg/db/retry.go (2)
31-42: Optional: compute backoff instead of predefining a slice (simpler, same semantics). This removes the allocation and keeps behavior identical (50ms, 100ms, 200ms). Consider also adding jitter in the future to avoid synchronized retries, but that would require an extra import and is not necessary for this PR.
```diff
-	retry.Backoff(func(n int) time.Duration {
-		// Predefined backoff delays: 50ms, 100ms, 200ms
-		delays := []time.Duration{
-			DefaultBackoff,     // 50ms for attempt 1
-			DefaultBackoff * 2, // 100ms for attempt 2
-			DefaultBackoff * 4, // 200ms for attempt 3
-		}
-		if n <= 0 || n > len(delays) {
-			return DefaultBackoff // fallback to base delay
-		}
-		return delays[n-1]
-	}),
+	retry.Backoff(func(n int) time.Duration {
+		if n < 1 {
+			n = 1
+		}
+		// 50ms, 100ms, 200ms
+		return DefaultBackoff << (n - 1)
+	}),
```
43-56: Do not retry on context cancellation or deadline exceeded. Currently, canceled or timed-out requests are still retried, adding latency and wasting capacity. Skip retries for these cases.
```diff
 retry.ShouldRetry(func(err error) bool {
+	// Do not retry if the caller canceled or the context deadline elapsed.
+	if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
+		return false
+	}
 	// Don't retry if resource is not found - this is a valid response
 	if IsNotFound(err) {
 		return false
 	}
```

Add the necessary imports outside this hunk:
```go
import (
	"context"
	"errors"
	"time"

	"github.com/unkeyed/unkey/go/pkg/retry"
)
```

go/pkg/retry/retry_test.go (1)
136-187: Optional: add boundary/limit cases to strengthen ShouldRetry coverage. Nice table-driven tests. Consider adding:
- Attempts cap interaction with ShouldRetry (e.g., Attempts(2) + always-retry policy).
- Zero-attempts behavior under ShouldRetry (mirror the “invalid attempts” case from TestRetry).
Apply this diff to extend the table with two focused cases:

```diff
 @@
 	tests := []struct {
 		name          string
 		shouldRetry   func(error) bool
 		errorSequence []error
 		expectedCalls int
 		expectedError error
 		expectedSleep time.Duration
 	}{
+		// Attempts cap should stop even if ShouldRetry would continue
+		{
+			name:        "attempts cap limits retries despite retryable errors",
+			shouldRetry: func(err error) bool { return true },
+			// two failures, then would succeed, but we will cap attempts to 2 below
+			errorSequence: []error{retryableError, retryableError, nil},
+			// we will set Attempts(2) for this case in the test body
+			expectedCalls: 2,
+			expectedError: retryableError,
+			expectedSleep: 100 * time.Millisecond,
+		},
+		// Zero attempts should short-circuit even with ShouldRetry policy
+		{
+			name:          "zero attempts short-circuits even with ShouldRetry",
+			shouldRetry:   func(err error) bool { return true },
+			errorSequence: []error{retryableError},
+			// we will set Attempts(0) for this case in the test body
+			expectedCalls: 0,
+			expectedError: errors.New("invalid attempts"), // will be replaced below
+			expectedSleep: 0,
+		},
 		{
 			name:          "should retry all errors by default",
 			shouldRetry:   nil, // default behavior
 			errorSequence: []error{retryableError, retryableError, nil},
 			expectedCalls: 3,
 			expectedError: nil,
 			expectedSleep: 300 * time.Millisecond, // 100ms + 200ms
 		},
 @@
 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
 			var retrier *retry
-			if tt.shouldRetry != nil {
-				retrier = New(ShouldRetry(tt.shouldRetry))
-			} else {
-				retrier = New()
-			}
+			// Per-test overrides for the two new boundary cases
+			switch tt.name {
+			case "attempts cap limits retries despite retryable errors":
+				retrier = New(ShouldRetry(tt.shouldRetry), Attempts(2))
+			case "zero attempts short-circuits even with ShouldRetry":
+				retrier = New(ShouldRetry(tt.shouldRetry), Attempts(0))
+				// err value asserted below; keep a stable sentinel
+				tt.expectedError = retrier.validate() // assume Do() returns validate() error
+			default:
+				if tt.shouldRetry != nil {
+					retrier = New(ShouldRetry(tt.shouldRetry))
+				} else {
+					retrier = New()
+				}
+			}
```

Note: If `retry.validate()` isn’t available, replace the sentinel with `require.Error(t, err)` and drop the equality check for that case. I can adjust to your actual constructor/validation behavior if you point me to the implementation.

go/pkg/clickhouse/schema/databases/001_verifications/004_key_verifications_per_minute_mv_v2.sql (1)
5-9: Verify target table schema aligns with GROUP BY keys and column list. You project identity_id, key_id, and tags into the sink. Confirm verifications.key_verifications_per_minute_v2 defines these columns and includes them appropriately in ORDER BY. High-cardinality keys (identity_id, key_id) at per-minute granularity can explode row counts; confirm this matches dashboard needs.
Also applies to: 18-25
deployment/docker-compose.yaml (2)
114-138: Healthcheck should validate schema readiness, not just server liveness.
`SELECT 1` returns OK even if the init SQL hasn’t run yet, so dependents can start before tables exist. Probe for a known table instead (e.g., `ratelimits.raw_ratelimits_v2`):

```diff
-        "--query",
-        "SELECT 1",
+        "--query",
+        "SELECT count() FROM system.tables WHERE database = 'ratelimits' AND name = 'raw_ratelimits_v2'",
```

This better gates service readiness for apiv2/agent.
290-295: Unused volume “clickhouse-keeper”.

No service mounts `clickhouse-keeper`; dead volumes cause confusion.

```diff
 volumes:
   mysql:
   clickhouse:
-  clickhouse-keeper:
   s3:
   metald-aio-data:
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (88)
- .github/workflows/job_test_api_local.yaml (0 hunks)
- .github/workflows/job_test_unit.yaml (0 hunks)
- Makefile (0 hunks)
- apps/engineering/content/docs/contributing/testing.mdx (0 hunks)
- deployment/Dockerfile.clickhouse (1 hunks)
- deployment/docker-compose.yaml (4 hunks)
- deployment/init-clickhouse.sh (1 hunks)
- go/Makefile (0 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/003_key_verifications_per_minute_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/004_key_verifications_per_minute_mv_v2.sql (3 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/005_key_verifications_per_hour_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/006_key_verifications_per_hour_mv_v2.sql (2 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/007_key_verifications_per_day_v4.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/008_key_verifications_per_day_mv_v4.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/009_key_verifications_per_month_v4.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/010_key_verifications_per_month_mv_v4.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/011_temp_sync_v1_to_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/001_verifications/MIGRATION_v1_to_v2.md (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/002_raw_ratelimits_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/003_ratelimits_per_minute_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/004_ratelimits_per_minute_mv_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/005_ratelimits_per_hour_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_hour_mv_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/007_ratelimits_per_day_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/008_ratelimits_per_day_mv_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/009_ratelimits_per_month_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/010_ratelimits_per_month_mv_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/011_ratelimits_last_used_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/012_ratelimits_last_used_mv_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/013_temp_sync_v1_to_v2.sql (1 hunks)
- go/pkg/clickhouse/schema/databases/002_ratelimits/MIGRATION_v1_to_v2.md (1 hunks)
- go/pkg/db/retry.go (1 hunks)
- go/pkg/retry/retry_test.go (5 hunks)
- internal/clickhouse/Dockerfile (0 hunks)
- internal/clickhouse/package.json (1 hunks)
- internal/clickhouse/schema/000_README.md (0 hunks)
- internal/clickhouse/schema/001_create_databases.sql (0 hunks)
- internal/clickhouse/schema/002_create_metrics_raw_api_requests_v1.sql (0 hunks)
- internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql (0 hunks)
- internal/clickhouse/schema/004_create_verifications_key_verifications_per_hour_v1.sql (0 hunks)
- internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql (0 hunks)
- internal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql (0 hunks)
- internal/clickhouse/schema/007_create_verifications_key_verifications_per_hour_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/008_create_verifications_key_verifications_per_day_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/009_create_verifications_key_verifications_per_month_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql (0 hunks)
- internal/clickhouse/schema/011_create_telemetry_raw_sdks_v1.sql (0 hunks)
- internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql (0 hunks)
- internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql (0 hunks)
- internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql (0 hunks)
- internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql (0 hunks)
- internal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql (0 hunks)
- internal/clickhouse/schema/018_create_ratelimits_ratelimits_per_minute_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/019_create_ratelimits_ratelimits_per_hour_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/020_create_ratelimits_ratelimits_per_day_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/021_create_ratelimits_ratelimits_per_month_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql (0 hunks)
- internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/024_create_ratelimits_last_used_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/025_create_billable_verifications_v2.sql (0 hunks)
- internal/clickhouse/schema/026_create_billable_ratelimits_v1.sql (0 hunks)
- internal/clickhouse/schema/027_add_tags_to_verifications.raw_key_verifications_v1.sql (0 hunks)
- internal/clickhouse/schema/028_create_verifications.key_verifications_per_hour_v2.sql (0 hunks)
- internal/clickhouse/schema/030_create_verifications.key_verifications_per_day_v2.sql (0 hunks)
- internal/clickhouse/schema/031_create_verifications.key_verifications_per_day_mv_v2.sql (0 hunks)
- internal/clickhouse/schema/032_create_verifications.key_verifications_per_month_v2.sql (0 hunks)
- internal/clickhouse/schema/033_create_verifications.key_verifications_per_month_mv_v2.sql (0 hunks)
- internal/clickhouse/schema/034_billing_read_from_verifications.key_verifications_per_month_v2.sql (0 hunks)
- internal/clickhouse/schema/035_business_update_active_workspaces_keys_per_month_mv_v1_read_from_verifications.key_verifications_per_month_v2.sql (0 hunks)
- internal/clickhouse/schema/036_create_verifications.key_verifications_per_hour_v3.sql (0 hunks)
- internal/clickhouse/schema/037_create_verifications.key_verifications_per_hour_mv_v3.sql (0 hunks)
- internal/clickhouse/schema/038_create_verifications.key_verifications_per_day_v3.sql (0 hunks)
- internal/clickhouse/schema/039_create_verifications.key_verifications_per_day_mv_v3.sql (0 hunks)
- internal/clickhouse/schema/040_create_verifications.key_verifications_per_month_v3.sql (0 hunks)
- internal/clickhouse/schema/041_create_verifications.key_verifications_per_month_mv_v3.sql (0 hunks)
- internal/clickhouse/schema/042_create_api_requests_per_hour_v1.sql (0 hunks)
- internal/clickhouse/schema/043_create_api_requests_per_hour_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/044_create_api_requests_per_minute_v1.sql (0 hunks)
- internal/clickhouse/schema/045_create_api_requests_per_minute_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/046_create_api_requests_per_day_v1.sql (0 hunks)
- internal/clickhouse/schema/047_create_api_requests_per_day_mv_v1.sql (0 hunks)
- internal/clickhouse/schema/048_raw_ratelimits_metrics_indexes_v1.sql (0 hunks)
- internal/clickhouse/schema/049_raw_api_metrics_ratelimit_indexes_v1.sql (0 hunks)
- internal/clickhouse/schema/050_create_verifications.key_verifications_per_minute_v1.sql (0 hunks)
- internal/clickhouse/src/testutil.ts (2 hunks)
- tools/local/src/main.ts (1 hunks)
💤 Files with no reviewable changes (56)
- apps/engineering/content/docs/contributing/testing.mdx
- go/Makefile
- internal/clickhouse/schema/022_create_business_active_workspaces_per_month_v1.sql
- internal/clickhouse/schema/043_create_api_requests_per_hour_mv_v1.sql
- internal/clickhouse/Dockerfile
- internal/clickhouse/schema/026_create_billable_ratelimits_v1.sql
- internal/clickhouse/schema/000_README.md
- internal/clickhouse/schema/048_raw_ratelimits_metrics_indexes_v1.sql
- .github/workflows/job_test_unit.yaml
- internal/clickhouse/schema/021_create_ratelimits_ratelimits_per_month_mv_v1.sql
- internal/clickhouse/schema/049_raw_api_metrics_ratelimit_indexes_v1.sql
- internal/clickhouse/schema/001_create_databases.sql
- internal/clickhouse/schema/023_create_business_active_workspaces_per_month_mv_v1.sql
- internal/clickhouse/schema/045_create_api_requests_per_minute_mv_v1.sql
- internal/clickhouse/schema/017_create_ratelimits_ratelimits_per_month_v1.sql
- internal/clickhouse/schema/011_create_telemetry_raw_sdks_v1.sql
- internal/clickhouse/schema/030_create_verifications.key_verifications_per_day_v2.sql
- internal/clickhouse/schema/047_create_api_requests_per_day_mv_v1.sql
- internal/clickhouse/schema/032_create_verifications.key_verifications_per_month_v2.sql
- internal/clickhouse/schema/024_create_ratelimits_last_used_mv_v1.sql
- internal/clickhouse/schema/040_create_verifications.key_verifications_per_month_v3.sql
- internal/clickhouse/schema/038_create_verifications.key_verifications_per_day_v3.sql
- internal/clickhouse/schema/027_add_tags_to_verifications.raw_key_verifications_v1.sql
- internal/clickhouse/schema/041_create_verifications.key_verifications_per_month_mv_v3.sql
- internal/clickhouse/schema/042_create_api_requests_per_hour_v1.sql
- internal/clickhouse/schema/019_create_ratelimits_ratelimits_per_hour_mv_v1.sql
- internal/clickhouse/schema/039_create_verifications.key_verifications_per_day_mv_v3.sql
- internal/clickhouse/schema/034_billing_read_from_verifications.key_verifications_per_month_v2.sql
- internal/clickhouse/schema/004_create_verifications_key_verifications_per_hour_v1.sql
- internal/clickhouse/schema/046_create_api_requests_per_day_v1.sql
- internal/clickhouse/schema/005_create_verifications_key_verifications_per_day_v1.sql
- internal/clickhouse/schema/016_create_ratelimits_ratelimits_per_day_v1.sql
- internal/clickhouse/schema/018_create_ratelimits_ratelimits_per_minute_mv_v1.sql
- internal/clickhouse/schema/012_create_billing_billable_verifications_per_month_v1.sql
- internal/clickhouse/schema/003_create_verifications_raw_key_verifications_v1.sql
- internal/clickhouse/schema/044_create_api_requests_per_minute_v1.sql
- internal/clickhouse/schema/014_create_ratelimits_ratelimits_per_minute_v1.sql
- internal/clickhouse/schema/050_create_verifications.key_verifications_per_minute_v1.sql
- internal/clickhouse/schema/031_create_verifications.key_verifications_per_day_mv_v2.sql
- internal/clickhouse/schema/009_create_verifications_key_verifications_per_month_mv_v1.sql
- internal/clickhouse/schema/010_create_ratelimits_raw_ratelimits_table.sql
- internal/clickhouse/schema/028_create_verifications.key_verifications_per_hour_v2.sql
- .github/workflows/job_test_api_local.yaml
- internal/clickhouse/schema/036_create_verifications.key_verifications_per_hour_v3.sql
- Makefile
- internal/clickhouse/schema/033_create_verifications.key_verifications_per_month_mv_v2.sql
- internal/clickhouse/schema/008_create_verifications_key_verifications_per_day_mv_v1.sql
- internal/clickhouse/schema/015_create_ratelimits_ratelimits_per_hour_v1.sql
- internal/clickhouse/schema/035_business_update_active_workspaces_keys_per_month_mv_v1_read_from_verifications.key_verifications_per_month_v2.sql
- internal/clickhouse/schema/025_create_billable_verifications_v2.sql
- internal/clickhouse/schema/037_create_verifications.key_verifications_per_hour_mv_v3.sql
- internal/clickhouse/schema/020_create_ratelimits_ratelimits_per_day_mv_v1.sql
- internal/clickhouse/schema/006_create_verifications_key_verifications_per_month_v1.sql
- internal/clickhouse/schema/007_create_verifications_key_verifications_per_hour_mv_v1.sql
- internal/clickhouse/schema/002_create_metrics_raw_api_requests_v1.sql
- internal/clickhouse/schema/013_create_billing_billable_verifications_per_month_mv_v1.sql
🧰 Additional context used
📓 Path-based instructions (5)
**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.go: Follow comprehensive documentation guidelines for Go code as described in go/GO_DOCUMENTATION_GUIDELINES.md
Every public function/type in Go code must be documented
Prefer interfaces for testability in Go code
Use AIDEV-* comments for complex/important code in Go services
Files:
go/pkg/db/retry.go, go/pkg/retry/retry_test.go
**/*.{env,js,ts,go}
📄 CodeRabbit inference engine (CLAUDE.md)
All environment variables must follow the format: UNKEY_<SERVICE_NAME>_VARNAME
Files:
go/pkg/db/retry.go, tools/local/src/main.ts, go/pkg/retry/retry_test.go, internal/clickhouse/src/testutil.ts
**/*.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{js,jsx,ts,tsx}: Use Biome for formatting and linting in TypeScript/JavaScript projects
Prefer named exports over default exports in TypeScript/JavaScript, except for Next.js pages
Files:
tools/local/src/main.ts, internal/clickhouse/src/testutil.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Follow strict TypeScript configuration
Use Zod for runtime validation in TypeScript projects
Files:
tools/local/src/main.ts, internal/clickhouse/src/testutil.ts
**/*_test.go
📄 CodeRabbit inference engine (CLAUDE.md)
**/*_test.go: Use table-driven tests in Go
Organize Go integration tests with real dependencies
Organize Go tests by HTTP status codes
Files:
go/pkg/retry/retry_test.go
🧠 Learnings (2)
📓 Common learnings
Learnt from: chronark
PR: unkeyed/unkey#3161
File: go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql:31-33
Timestamp: 2025-04-22T14:40:51.459Z
Learning: The ClickHouse table schemas in the codebase mirror the production environment and cannot be modified directly in PRs without careful migration planning.
Learnt from: chronark
PR: unkeyed/unkey#3161
File: go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql:1-13
Timestamp: 2025-04-22T14:43:11.724Z
Learning: In the unkey project, the SQL files in clickhouse/schema/databases represent the current production schema and shouldn't be modified directly in PRs. Schema changes require dedicated migration scripts.
📚 Learning: 2025-04-22T14:43:11.724Z
Learnt from: chronark
PR: unkeyed/unkey#3161
File: go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql:1-13
Timestamp: 2025-04-22T14:43:11.724Z
Learning: In the unkey project, the SQL files in clickhouse/schema/databases represent the current production schema and shouldn't be modified directly in PRs. Schema changes require dedicated migration scripts.
Applied to files:
go/pkg/clickhouse/schema/databases/001_verifications/003_key_verifications_per_minute_v2.sql
🧬 Code graph analysis (1)
tools/local/src/main.ts (2)
tools/local/src/db.ts (1)
- prepareDatabase (13-81)

tools/local/src/cmd/api.ts (1)
- bootstrapApi (9-52)
🪛 LanguageTool
go/pkg/clickhouse/schema/databases/002_ratelimits/MIGRATION_v1_to_v2.md
[grammar] ~1-~1: Use correct spacing
Context: ...Migration Guide: Raw Ratelimits v1 to v2 This guide explains how to migrate from ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~3-~3: Use correct spacing
Context: ... raw_ratelimits_v2 with zero downtime. ## What's Changed in v2 - ✅ **Latency trac...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~5-~5: Use correct spacing
Context: ... zero downtime. ## What's Changed in v2 - ✅ Latency tracking: New latency co...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~7-~7: There might be a mistake here.
Context: ...cy tracking**: New latency column for ratelimit check performance - ✅ Partitioning:...
(QB_NEW_EN_OTHER)
[grammar] ~7-~7: There might be a mistake here.
Context: ...` column for ratelimit check performance - ✅ Partitioning: Monthly partitions f...
(QB_NEW_EN)
[grammar] ~8-~8: There might be a mistake here.
Context: ...ter performance and lifecycle management - ✅ TTL: Automatic data cleanup after ...
(QB_NEW_EN)
[grammar] ~9-~9: There might be a mistake here.
Context: ...*: Automatic data cleanup after 3 months - ✅ Deduplication: Protection against ...
(QB_NEW_EN)
[grammar] ~10-~10: There might be a mistake here.
Context: ...against duplicate inserts during retries - ✅ Better ordering: Optimized ORDER B...
(QB_NEW_EN)
[grammar] ~11-~11: There might be a mistake here.
Context: ...timized ORDER BY for time-series queries - ✅ Latency aggregations: avg, p75, p9...
(QB_NEW_EN)
[grammar] ~12-~12: Insert the missing word
Context: ...s - ✅ Latency aggregations: avg, p75, p99 latency metrics in all aggregation ...
(QB_NEW_EN_OTHER_ERROR_IDS_32)
[grammar] ~12-~12: Use correct spacing
Context: ...atency metrics in all aggregation tables ## Migration Steps ### Phase 1: Deploy New...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~14-~14: Use correct spacing
Context: ...l aggregation tables ## Migration Steps ### Phase 1: Deploy New Schema (Zero Downtim...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~16-~16: Use correct spacing
Context: ...ase 1: Deploy New Schema (Zero Downtime) 1. Deploy the new schema files - This cre...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[typographical] ~18-~18: To join two clauses or set off examples, consider using an em dash.
Context: ...ntime) 1. Deploy the new schema files - This creates v2 tables and temporary sync: ...
(QB_NEW_EN_DASH_RULE_EM)
[grammar] ~18-~18: Use correct spacing
Context: ...is creates v2 tables and temporary sync: bash # The auto-migration will create: # - raw_ratelimits_v2 (new table) # - All new aggregation tables (minute/hour/day/month v2) # - temp_sync_v1_to_v2 (materialized view for live sync) 2. Verify deployment - Check that all new...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[typographical] ~26-~26: To join two clauses or set off examples, consider using an em dash.
Context: ...ive sync) 2. **Verify deployment** - Check that all new tables exist: sql ...
(QB_NEW_EN_DASH_RULE_EM)
[grammar] ~26-~26: Use correct spacing
Context: ...ent** - Check that all new tables exist: sql SHOW TABLES FROM ratelimits LIKE '%v2%'; ### Phase 2: Backfill Historical Data **
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~31-~31: Use correct spacing
Context: ...` ### Phase 2: Backfill Historical Data
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~33-~33: Use correct spacing
Context: ...storical data while new writes continue. Run these queries manually to backfi...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~35-~35: Use correct spacing
Context: ...* to backfill historical data in chunks: #### Step 1: Find your data range ```sql SELE...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~37-~37: There might be a mistake here.
Context: ...unks: #### Step 1: Find your data range sql SELECT toDateTime(min(time)/1000) as earliest_data, toDateTime(max(time)/1000) as latest_data, count(*) as total_rows FROM ratelimits.raw_ratelimits_v1; #### Step 2: Backfill month by month Replace ...
(QB_NEW_EN_OTHER)
[grammar] ~46-~46: There might be a mistake here.
Context: ...`` #### Step 2: Backfill month by month Replace the date ranges based on your ac...
(QB_NEW_EN_OTHER)
[grammar] ~47-~47: Use correct spacing
Context: ...s based on your actual data from Step 1: sql -- January 2024 (example - adjust dates) INSERT INTO ratelimits.raw_ratelimits_v2 SELECT request_id, time, workspace_id, namespace_id, identifier, passed, 0.0 as latency -- v1 doesn't have this column, default to 0.0 FROM ratelimits.raw_ratelimits_v1 WHERE time >= toUnixTimestamp64Milli(toDateTime('2024-01-01 00:00:00')) AND time < toUnixTimestamp64Milli(toDateTime('2024-02-01 00:00:00')); -- February 2024 INSERT INTO ratelimits.raw_ratelimits_v2 SELECT request_id, time, workspace_id, namespace_id, identifier, passed, 0.0 as latency FROM ratelimits.raw_ratelimits_v1 WHERE time >= toUnixTimestamp64Milli(toDateTime('2024-02-01 00:00:00')) AND time < toUnixTimestamp64Milli(toDateTime('2024-03-01 00:00:00')); -- Continue for each month until you reach current data... #### Step 3: Monitor progress ```sql -- Check...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~81-~81: Use correct spacing
Context: ...ta... #### Step 3: Monitor progresssql -- Check backfill progress SELECT 'v1' as version, toDateTime(min(time)/1000) as earliest, toDateTime(max(time)/1000) as latest, count() as rows FROM ratelimits.raw_ratelimits_v1 UNION ALL SELECT 'v2' as version, toDateTime(min(time)/1000) as earliest, toDateTime(max(time)/1000) as latest, count() as rows FROM ratelimits.raw_ratelimits_v2 ORDER BY version; ``` ### Phase 3: Switch Application to v2 1. **...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~100-~100: Use correct spacing
Context: ... ### Phase 3: Switch Application to v2 1. **Update your application** to write tor...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~102-~102: Use the right verb tense
Context: ...write to raw_ratelimits_v2 instead of v1 2. Include the new field: - latency (Float6...
(QB_NEW_EN_OTHER_ERROR_IDS_13)
[grammar] ~103-~103: Use correct spacing
Context: ...ad of v1 2. Include the new field: - latency (Float64) - ratelimit check latency in m...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~104-~104: Use correct spacing
Context: ...cy in milliseconds with sub-ms precision ### Phase 4: Cleanup (After Confirming v2 Wo...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~106-~106: Use correct spacing
Context: ...e 4: Cleanup (After Confirming v2 Works) 1. Drop the temporary sync view: ```sq...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~108-~108: Use correct spacing
Context: ...s) 1. Drop the temporary sync view: sql DROP VIEW ratelimits.temp_sync_v1_to_v2; 2. Optional: Keep v1 as backup or drop it...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~113-~113: Use correct spacing
Context: ...ptional: Keep v1 as backup** or drop it: sql -- To drop v1 (only after confirming v2 works perfectly) DROP TABLE ratelimits.raw_ratelimits_v1; ## Rollback Plan If you need to rollback: ...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~119-~119: Use correct spacing
🪛 Checkov (3.2.334)
deployment/Dockerfile.clickhouse
[LOW] 1-8: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
[LOW] 1-8: Ensure that a user for the container has been created
(CKV_DOCKER_3)
🔇 Additional comments (22)
go/pkg/db/retry.go (1)
35-35: LGTM: formatting-only change.

Whitespace-only tweak on the 100ms entry. No functional impact.
go/pkg/retry/retry_test.go (3)
149-155: LGTM: formatting-only change; expectations remain consistent with default backoff.

The alignment edits are cosmetic. The test semantics (retry-all default with 100ms, 200ms sleeps) are preserved and still validate the intended behavior.

196-201: LGTM: deterministic timing via injected sleeper.

Overriding the internal sleeper to accumulate durations keeps tests fast and eliminates flakiness. Good pattern for backoff testing.

212-215: LGTM: strict equality on accumulated sleep is safe here.

Since no real time elapses, exact equality on durations is appropriate. No tolerance window needed.
deployment/init-clickhouse.sh (1)
6-13: Manual verification required: glob behavior and sorting.

The automated test did not yield the expected directory and file listings, so please confirm manually that:

- `shopt -s nullglob` is in effect before any glob operations, ensuring globs with no matches expand to empty rather than literal patterns.
- Database directories under `/opt/clickhouse-schemas/` are iterated in natural (numeric) sort order (e.g. `001_…`, `002_…`, `010_…`).
- SQL files within each database directory are likewise processed via an explicit `sort -V`, guaranteeing consistent numbering order.

Once verified, update or retain the proposed refactor accordingly.
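A minimal dry-run of the iteration the points above describe. The schema root here is a temporary stand-in for `/opt/clickhouse-schemas`, and the `clickhouse-client` call is replaced by string accumulation so the ordering can be checked without a server:

```shell
# Dry-run of the init loop: nullglob + sort -V give deterministic,
# numerically ordered execution. No ClickHouse server is involved.
shopt -s nullglob
root=$(mktemp -d)
mkdir -p "$root/001_verifications" "$root/002_ratelimits" "$root/010_extra"
touch "$root/001_verifications/002_raw.sql" "$root/001_verifications/001_database.sql"
touch "$root/002_ratelimits/001_database.sql"
touch "$root/010_extra/001_database.sql"

order=""
for db in $(printf '%s\n' "$root"/*/ | sort -V); do
  for f in $(printf '%s\n' "$db"*.sql | sort -V); do
    # The real script would run: clickhouse-client -q "$(cat "$f")"
    order="${order}$(basename "$db")/$(basename "$f") "
  done
done
echo "$order"
rm -rf "$root"
```

With `sort -V`, `010_extra` correctly comes after `002_ratelimits` instead of between `001_…` and `002_…` as plain lexical sorting of padded names would also give for larger numbers.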
go/pkg/clickhouse/schema/databases/001_verifications/010_key_verifications_per_month_mv_v4.sql (1)
4-17: Guard against column-order coupling between MV and target table.

Without an explicit column list, INSERT … SELECT semantics apply: mapping is by position. Any future reorder in the destination will silently corrupt data.

Please confirm the destination table verifications.key_verifications_per_month_v4 has columns in exactly this order: workspace_id, key_space_id, identity_id, key_id, outcome, count, spent_credits, latency_avg, latency_p75, latency_p99, time, tags.

If not, adjust the SELECT order to match, or (preferred) define a view-on-view pattern that inserts with an explicit column list. I can generate a script to parse the DDLs and validate order across all MV/targets if helpful.
Also applies to: 18-26
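As a sketch of such a validation script — the DDL strings below are simplified stand-ins rather than the real schema files, and the parsing is deliberately naive (single-line DDL, no nested parentheses):

```shell
# Compare a target table's column order against an MV's SELECT list.
# Inputs are inline samples; a real check would read the .sql files.
table_ddl='CREATE TABLE t (workspace_id String, count Int64, time DateTime)'
mv_select='SELECT workspace_id, count, time FROM src'

# Table columns: text inside the parentheses, first word of each entry.
table_cols=$(echo "$table_ddl" | sed 's/.*(\(.*\)).*/\1/' \
  | tr ',' '\n' | awk '{printf "%s%s", sep, $1; sep=","}')

# MV columns: everything between SELECT and FROM, whitespace stripped.
mv_cols=$(echo "$mv_select" | sed 's/.*SELECT \(.*\) FROM.*/\1/' | tr -d ' ')

if [ "$table_cols" = "$mv_cols" ]; then
  echo "column order OK"
else
  echo "column order MISMATCH: table=$table_cols mv=$mv_cols"
fi
```

A production version would need a real SQL tokenizer to cope with multi-line DDL, codecs, and nested types, but the comparison idea is the same.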
go/pkg/clickhouse/schema/databases/001_verifications/008_key_verifications_per_day_mv_v4.sql (1)
4-17: Verify MV → table column order alignment.

Same risk as the monthly MV. Please confirm the destination table column order exactly matches the SELECT order, or update accordingly.
If you want, I can generate a quick repo scan to compare MV SELECT order with table DDLs for all v4 per-day/month/hour artifacts.
Also applies to: 18-26
go/pkg/clickhouse/schema/databases/001_verifications/006_key_verifications_per_hour_mv_v2.sql (1)
1-3: MV column order must match target table exactly.

Because TO … uses positional mapping, any column order mismatch will silently corrupt data. Please verify the destination verifications.key_verifications_per_hour_v2 order matches this SELECT.
I can provide a quick script to parse both DDLs and compare column sequences if you want an automated check.
Also applies to: 4-17, 18-26
tools/local/src/main.ts (1)
66-73: Seed flow should also wait for ClickHouse (if it touches CH) or explicitly note it doesn't.

If seed interacts with ClickHouse (directly or via services), add waitForClickHouse(); otherwise document that seed only targets PlanetScale.

```diff
- await startContainers(["planetscale", "clickhouse", "agent"]);
+ await startContainers(["planetscale", "clickhouse", "agent"]);
+ await waitForClickHouse();
```

deployment/Dockerfile.clickhouse (1)
6-8: The above diff looks good, but I need to confirm exactly how your init-clickhouse.sh script references credentials to ensure the ENV vars line up. Could you please paste the first ~50 lines of deployment/init-clickhouse.sh? You can do that with:

sed -n '1,50p' deployment/init-clickhouse.sh

Once I see which env vars it uses (e.g. CLICKHOUSE_ADMIN_USER, CLICKHOUSE_ADMIN_PASSWORD, or others), I can verify that your compose file matches and finalize the hardened Dockerfile snippet.

go/pkg/clickhouse/schema/databases/002_ratelimits/008_ratelimits_per_day_mv_v2.sql (1)
13-19: Confirmed – keep the millisecond conversion.

We've verified that in ratelimits.raw_ratelimits_v2, the time column is declared as Int64 (unix milliseconds) (see 002_raw_ratelimits_v2.sql:6). Therefore using fromUnixTimestamp64Milli(time) before toStartOfDay is correct and no change is needed.

• File: go/pkg/clickhouse/schema/databases/002_ratelimits/008_ratelimits_per_day_mv_v2.sql
• Lines: 13–19

go/pkg/clickhouse/schema/databases/002_ratelimits/010_ratelimits_per_month_mv_v2.sql (1)
13-14: Ignore DateTime conversion suggestion for per-month MV.

The time column in ratelimits.raw_ratelimits_v2 is defined as Int64 holding Unix milliseconds and not as a DateTime/DateTime64; thus, wrapping it with fromUnixTimestamp64Milli(time) before toStartOfMonth() is required:

- In go/pkg/clickhouse/schema/databases/002_ratelimits/002_raw_ratelimits_v2.sql:6, `time Int64 CODEC(Delta, LZ4)` indicates an integer epoch-milli value.

You can leave the existing expression `toStartOfMonth(fromUnixTimestamp64Milli(time)) AS time` as-is.
Likely an incorrect or invalid review comment.
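The fromUnixTimestamp64Milli conversions discussed in these MV comments boil down to integer arithmetic on an epoch-milliseconds value. A small sketch with an arbitrary sample value (hour truncation shown, since month truncation additionally needs calendar logic):

```shell
# What toStartOfHour(fromUnixTimestamp64Milli(time)) computes, in plain
# integer math on epoch milliseconds.
ms=1706791845123
secs=$((ms / 1000))                  # fromUnixTimestamp64Milli: drop sub-second part
hour_start=$((secs - secs % 3600))   # toStartOfHour: truncate to the hour boundary
echo "$secs -> $hour_start"
```

Calling toStartOfHour directly on the Int64 column would instead interpret the millisecond value as seconds, which is why the explicit conversion matters.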
go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_hour_mv_v2.sql (1)
13-13: No change needed: raw.time is Int64 (milliseconds), so the conversion is correct.

The raw_ratelimits_v2.time column is defined as Int64 representing Unix milliseconds, so using fromUnixTimestamp64Milli(time) (and thus toStartOfHour(...)) is required rather than calling toStartOfHour(time) directly. Thanks for suggesting the check—this conversion is indeed necessary here.

go/pkg/clickhouse/schema/databases/002_ratelimits/002_raw_ratelimits_v2.sql (3)
22-22: Validate ORDER BY against your dominant query patterns.

Current key is (workspace_id, time, namespace_id, identifier). If dashboards mostly filter a single workspace and narrow by time across all namespaces/identifiers, this is good. If they typically slice by (workspace_id, namespace_id, identifier) first and then time, (workspace_id, namespace_id, identifier, time) may improve locality.

Share expected WHERE/GROUP BY patterns so we can pick the optimal sort key. I can run a few CH "EXPLAIN PIPELINE" examples to confirm.

13-16: Type choices look good; ensure writers send correct types.

passed Bool and latency Float64 are appropriate. Confirm writers actually send Bool/UInt8 for passed and millisecond precision for latency to avoid implicit casts. If any producers send strings like "true"/"false", we should normalize at ingestion.

6-6: Efficient time compression codec.

Int64 CODEC(Delta, LZ4) on millisecond epoch is a solid default for time columns.

go/pkg/clickhouse/schema/databases/001_verifications/MIGRATION_v1_to_v2.md (1)
21-24: "v2/v4" aggregations mention is confusing.

You list "minute/hour/day/month v2/v4". Are there v4 tables in this PR for verifications? If not, drop "/v4" to avoid operators looking for non-existent tables.
Once confirmed, I’ll update the wording here.
go/pkg/clickhouse/schema/databases/002_ratelimits/004_ratelimits_per_minute_mv_v2.sql (2)
13-19: GROUP BY alias is fine; double-check destination schema compatibility.

Grouping by the time alias is supported. Ensure ratelimits.ratelimits_per_minute_v2 matches column order/types: (workspace_id String, namespace_id String, identifier String, passed UInt64, total UInt64, latency_avg Float64, latency_p75 Float64, latency_p99 Float64, time DateTime) or DateTime64(3) if you prefer ms.

Please run DESCRIBE TABLE ratelimits.ratelimits_per_minute_v2 and confirm types; I'll adjust if needed.

10-12: Good choice of TDigest for p75/p99.

TDigest handles skewed latency distributions well. No changes needed.
go/pkg/clickhouse/schema/databases/002_ratelimits/MIGRATION_v1_to_v2.md (1)
9-10: Retention wording vs DDL mismatch.

The doc says "3 months" TTL; the raw table uses INTERVAL 100 DAY. Align doc and schema.

Let me know the intended retention and I'll update either the doc or the DDL.
deployment/docker-compose.yaml (2)
82-82: Verify image tag "redis:8.0" exists for your environment.

Some registries don't yet publish redis:8.0. If unavailable, pin to a known-good tag (e.g., redis:7.2 or 7.4). Confirm the tag resolves in your CI runners and dev machines.

53-56: Compose "deploy.replicas" is ignored outside Swarm.

If you run plain docker-compose, deploy: is ignored. If you intend 3 replicas locally, use `docker compose up --scale apiv2=3` or switch to Swarm/Stack. Confirm target runtime (Compose vs Swarm) so we can document the right scaling approach.
```ts
await startContainers(["planetscale", "clickhouse", "agent"]);

const resources = await prepareDatabase();
!skipEnv && (await bootstrapApi(resources));
break;
```
🧹 Nitpick (assertive)
Same readiness concern for API flow
Ensure ClickHouse (and schema init) is ready before bootstrapping the API.
Apply the same waitForClickHouse() call here. Also confirm the API reads CLICKHOUSE_URL from env; current local bootstrap uses CLICKHOUSE_URL (not prefixed UNKEY_), which may violate our env naming guideline for ts/js files. If that guideline applies here, consider UNKEY_CLICKHOUSE_URL.
🤖 Prompt for AI Agents
In tools/local/src/main.ts around lines 59–63, add a waitForClickHouse() call
before calling bootstrapApi to ensure ClickHouse and its schema are fully ready
(and verify prepareDatabase completes schema init before returning); also verify
the API expects the environment variable name—if our TS/JS env-naming guideline
requires UNKEY_ prefixes, change the local bootstrap to set UNKEY_CLICKHOUSE_URL
(and update any local reads/use of CLICKHOUSE_URL to UNKEY_CLICKHOUSE_URL) so
the API reads the correct prefixed variable; ensure the wait call and env-name
change are propagated where prepareDatabase/bootstrapApi interact.
go/pkg/clickhouse/schema/databases/001_verifications/003_key_verifications_per_minute_v2.sql
```sql
    time DateTime,
    workspace_id String,
    key_space_id String,
    identity_id String,
    key_id String,
    outcome LowCardinality(String),
    tags Array(String),
    count Int64,
    spent_credits Int64,
    latency_avg Float64,
    latency_p75 Float64,
    latency_p99 Float64
)
ENGINE = SummingMergeTree()
ORDER BY (workspace_id, time, key_space_id, identity_id, key_id, tags, outcome)
```
🛠️ Refactor suggestion
Use AggregatingMergeTree with stateful metrics; avoid summing finalized averages/quantiles.
Same issue as ratelimits: SummingMergeTree will corrupt latency_* in merges. Also switch counters to unsigned and consider a normalized key for tags.
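A quick arithmetic illustration of the merge problem, with made-up numbers:

```shell
# Two parts: one row with latency 10, and 100 rows averaging 2.
# Summing finalized averages (what SummingMergeTree does on merge)
# gives 12; the true combined average is ~2.08. Integer math, x100.
part1_sum=10;  part1_n=1
part2_sum=200; part2_n=100
summed=$(( part1_sum / part1_n + part2_sum / part2_n ))                   # 10 + 2 = 12
true_avg_x100=$(( (part1_sum + part2_sum) * 100 / (part1_n + part2_n) ))  # 207, i.e. 2.07
echo "summed=$summed true_avg_x100=$true_avg_x100"
```

Storing aggregation states instead of finalized values lets the merge combine the underlying sums and counts, which is exactly what the proposed DDL below does.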
Proposed DDL:
```diff
 CREATE TABLE IF NOT EXISTS verifications.key_verifications_per_minute_v2
 (
   time DateTime,
   workspace_id String,
   key_space_id String,
   identity_id String,
   key_id String,
   outcome LowCardinality(String),
-  tags Array(String),
-  count Int64,
-  spent_credits Int64,
-  latency_avg Float64,
-  latency_p75 Float64,
-  latency_p99 Float64
+  tags Array(String),
+  tags_hash UInt64 MATERIALIZED cityHash64(arrayStringConcat(arraySort(tags))),
+  count SimpleAggregateFunction(sum, UInt64),
+  spent_credits SimpleAggregateFunction(sum, UInt64),
+  latency_avg_state AggregateFunction(avg, Float64),
+  latency_p75_state AggregateFunction(quantileTDigest(0.75), Float64),
+  latency_p99_state AggregateFunction(quantileTDigest(0.99), Float64)
 )
-ENGINE = SummingMergeTree()
-ORDER BY (workspace_id, time, key_space_id, identity_id, key_id, tags, outcome)
+ENGINE = AggregatingMergeTree()
+PARTITION BY toYYYYMM(time)
+ORDER BY (workspace_id, time, key_space_id, identity_id, key_id, tags_hash, outcome)
 ;
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```sql
CREATE TABLE IF NOT EXISTS verifications.key_verifications_per_minute_v2
(
    time DateTime,
    workspace_id String,
    key_space_id String,
    identity_id String,
    key_id String,
    outcome LowCardinality(String),
    tags Array(String),
    tags_hash UInt64 MATERIALIZED cityHash64(arrayStringConcat(arraySort(tags))),
    count SimpleAggregateFunction(sum, UInt64),
    spent_credits SimpleAggregateFunction(sum, UInt64),
    latency_avg_state AggregateFunction(avg, Float64),
    latency_p75_state AggregateFunction(quantileTDigest(0.75), Float64),
    latency_p99_state AggregateFunction(quantileTDigest(0.99), Float64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(time)
ORDER BY (workspace_id, time, key_space_id, identity_id, key_id, tags_hash, outcome)
;
```
🤖 Prompt for AI Agents
In
go/pkg/clickhouse/schema/databases/001_verifications/003_key_verifications_per_minute_v2.sql
around lines 3-17, replace the SummingMergeTree with an AggregatingMergeTree and
change types to use stateful aggregates and unsigned counters: convert count and
spent_credits to UInt64, replace latency_avg/latency_p75/latency_p99 with
appropriate AggregateFunction states (e.g., AggregateFunction(avgState, Float64)
and AggregateFunction(quantilesState, Float64) or quantileState variants), stop
storing finalized averages/quantiles as Float64, and adjust the ORDER BY to
avoid putting raw tags there (either remove tags from ORDER BY or use a
normalized tag key) — ensure the DDL stores aggregation states so merges compute
metrics correctly and consider normalizing tags (or moving to a tag mapping
table) before including them in the primary key.
```sql
CREATE MATERIALIZED VIEW IF NOT EXISTS verifications.temp_sync_v1_to_v2
TO verifications.raw_key_verifications_v2
AS
SELECT
    request_id,
    time,
    workspace_id,
    key_space_id,
    identity_id,
    key_id,
    region,
    outcome,
    tags,
    0 as spent_credits, -- v1 doesn't have this column, default to 0
    0.0 as latency      -- v1 doesn't have this column, default to 0.0
FROM verifications.raw_key_verifications_v1;
```
🧹 Nitpick (assertive)
Same migration guardrails as ratelimits; avoid biasing new latency/credits.
- Consider a time fence to avoid duplicating historical backfills during the live-sync window.
- Emitting 0 for spent_credits and 0.0 for latency systematically skews aggregates. Prefer making these Nullable in v2 raw and emitting NULLs from the temp MV, or track a separate “observed” flag to exclude unknowns from averages/quantiles.
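A small numeric illustration of the bias, with made-up values:

```shell
# Ten observed rows averaging 5 ms; ninety backfilled rows defaulted to 0.
# Averaging over everything drags the result down to 0.5; skipping
# unknowns (as avg() does with NULLs) keeps the honest 5. Scaled x10.
real_sum=50; real_n=10; backfill_n=90
biased_x10=$(( real_sum * 10 / (real_n + backfill_n) ))  # 5, i.e. 0.5
honest=$(( real_sum / real_n ))                          # 5
echo "biased_x10=$biased_x10 honest=$honest"
```

The same skew applies to the TDigest quantiles: a mass of synthetic zeros pulls p75/p99 toward zero until real latency data dominates.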
Example fence:
```diff
 FROM verifications.raw_key_verifications_v1;
+-- If backfilling, constrain the live-sync window:
+-- WHERE time >= toDateTime64('2025-08-26 00:00:00', 3, 'UTC')
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In
go/pkg/clickhouse/schema/databases/001_verifications/011_temp_sync_v1_to_v2.sql
around lines 5-20, the temp MV currently emits 0 and 0.0 for
spent_credits/latency and has no time fence which risks skewing aggregates and
duplicating historical backfills; change the approach so the v2 raw columns are
nullable (or add an explicit observed boolean) and have the MV emit NULLs (or
observed=false) for these unknown fields, and add a time fence condition (e.g.,
WHERE time >= <live_sync_start_ts>) to avoid replaying historical rows during
the live-sync window; ensure downstream schema is updated to accept NULLs (or
the observed flag) and adjust any aggregate queries to ignore NULLs or filter on
observed.
- ✅ **Partitioning**: Monthly partitions for better performance and lifecycle management
- ✅ **TTL**: Automatic data cleanup after 3 months
- ✅ **Deduplication**: Protection against duplicate inserts during retries
- ✅ **Sub-millisecond latency**: Float64 for precise latency measurements
- ✅ **Credit tracking**: New `spent_credits` column
- ✅ **Better ordering**: Optimized ORDER BY for time-series queries
🧹 Nitpick (assertive)
Clarify retention: doc says “3 months” but tables often use “100 DAY”.
Keep the doc precise and consistent with the DDL (use “3 months” only if the table uses INTERVAL 3 MONTH). This avoids misunderstandings in compliance/retention expectations.
I can submit a doc-only patch once you confirm the intended TTL.
🤖 Prompt for AI Agents
In go/pkg/clickhouse/schema/databases/001_verifications/MIGRATION_v1_to_v2.md
around lines 7 to 12, the bullet saying “TTL: Automatic data cleanup after 3
months” is inconsistent with the DDL which uses a 100 DAY TTL; update the doc to
match the exact TTL used by the table (e.g., replace “3 months” with “100 DAY”
or change the DDL to use INTERVAL 3 MONTH), and be explicit by quoting the exact
INTERVAL expression used in the table definition so retention language is
precise and unambiguous.
```sql
    workspace_id String,
    namespace_id String,
    identifier String,
```
🧹 Nitpick (assertive)
Optional storage optimization: LowCardinality for strings and TTL.
For consistency and storage reduction:
```diff
-  workspace_id String,
-  namespace_id String,
-  identifier String,
+  workspace_id LowCardinality(String),
+  namespace_id LowCardinality(String),
+  identifier LowCardinality(String),
```
Add PARTITION BY and optional TTL as needed by retention policy.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```sql
    -- change these three columns to LowCardinality(String) for storage efficiency
    workspace_id LowCardinality(String),
    namespace_id LowCardinality(String),
    identifier LowCardinality(String),
```
```sql
SELECT
    workspace_id,
    namespace_id,
    identifier,
    countIf(passed > 0) as passed,
    count(*) as total,
    avg(latency) as latency_avg,
    quantileTDigest(0.75)(latency) as latency_p75,
    quantileTDigest(0.99)(latency) as latency_p99,
    toStartOfMonth(fromUnixTimestamp64Milli(time)) AS time
FROM ratelimits.raw_ratelimits_v2
```
🛠️ Refactor suggestion
Emit states for month MV as well for correctness.
Keep MV outputs consistent across all grains to prevent merge-time skew.
```diff
 SELECT
   workspace_id,
   namespace_id,
   identifier,
-  countIf(passed > 0) as passed,
-  count(*) as total,
-  avg(latency) as latency_avg,
-  quantileTDigest(0.75)(latency) as latency_p75,
-  quantileTDigest(0.99)(latency) as latency_p99,
+  sum(toUInt64(passed > 0)) AS passed,
+  count() AS total,
+  avgState(latency) AS latency_avg_state,
+  quantileTDigestState(0.75)(latency) AS latency_p75_state,
+  quantileTDigestState(0.99)(latency) AS latency_p99_state,
   toStartOfMonth(fromUnixTimestamp64Milli(time)) AS time
 FROM ratelimits.raw_ratelimits_v2
```
📝 Committable suggestion
```sql
SELECT
    workspace_id,
    namespace_id,
    identifier,
    sum(toUInt64(passed > 0)) AS passed,
    count() AS total,
    avgState(latency) AS latency_avg_state,
    quantileTDigestState(0.75)(latency) AS latency_p75_state,
    quantileTDigestState(0.99)(latency) AS latency_p99_state,
    toStartOfMonth(fromUnixTimestamp64Milli(time)) AS time
FROM ratelimits.raw_ratelimits_v2
```
🤖 Prompt for AI Agents
In
go/pkg/clickhouse/schema/databases/002_ratelimits/010_ratelimits_per_month_mv_v2.sql
around lines 4–14, the materialized view for monthly grain only emits aggregated
metrics and is missing the corresponding "state" output columns that other grain
MVs include; add the same state columns to the SELECT (e.g., passed_state,
total_state, latency_avg_state, latency_p75_state, latency_p99_state — actual
names matching the other MVs) and compute/aggregate them the same way those MVs
do, keep the SELECT aggregations and the toStartOfMonth(time) time bucketing
unchanged, and ensure the GROUP BY includes time, workspace_id, namespace_id and
identifier so the monthly MV outputs are consistent with other grains to avoid
merge-time skew.
```sql
CREATE MATERIALIZED VIEW IF NOT EXISTS ratelimits.temp_sync_v1_to_v2
TO ratelimits.raw_ratelimits_v2
AS
SELECT
    request_id,
    time,
    workspace_id,
    namespace_id,
    identifier,
    passed,
    0.0 as latency -- v1 doesn't have this column, default to 0.0
FROM ratelimits.raw_ratelimits_v1;
```
💡 Verification agent
🧩 Analysis chain
Guard against double-writes/backfills; consider Nullable latency.
- If you backfill v2 separately while this MV is active, any concurrent backfill into v1 will also flow into v2, risking duplication. Plan sequencing or add a time fence.
- Encoding unknown latency as 0.0 biases averages/quantiles. Prefer Nullable(Float64) in v2 raw and emit NULL here, or add a separate boolean “latency_known”.
If you adopt a time fence, one pragmatic approach (replace the timestamp at deploy):
```diff
 FROM ratelimits.raw_ratelimits_v1
+WHERE time >= toDateTime64('2025-08-26 00:00:00', 3, 'UTC')
 ;
```
And/or ensure v2 backfill reads from the authoritative source and v1 backfill is paused while the MV is active.
🏁 Script executed:
```bash
#!/bin/bash
# List all SQL migration files in the ratelimits directory
echo "=== Migration files ==="
ls go/pkg/clickhouse/schema/databases/002_ratelimits
# Find where the raw_ratelimits_v2 table is created
echo -e "\n=== CREATE TABLE raw_ratelimits_v2 context ==="
rg -nP "CREATE TABLE.*raw_ratelimits_v2" -C5 go/pkg/clickhouse/schema/databases/002_ratelimits
# Show the column definitions for raw_ratelimits_v2, focusing on the latency column
echo -e "\n=== Column definitions in raw_ratelimits_v2 ==="
rg -nP "raw_ratelimits_v2" -C5 go/pkg/clickhouse/schema/databases/002_ratelimits | rg -P "latency"
```
Length of output: 3849
🏁 Script executed:
```bash
#!/bin/bash
# Inspect raw_ratelimits_v2 schema for latency and time types
echo "=== raw_ratelimits_v2 CREATE TABLE ==="
sed -n '1,50p' go/pkg/clickhouse/schema/databases/002_ratelimits/002_raw_ratelimits_v2.sql
echo -e "\n=== raw_ratelimits_v2 latency & time columns ==="
grep -nE 'latency|time' go/pkg/clickhouse/schema/databases/002_ratelimits/002_raw_ratelimits_v2.sql
# Inspect raw_ratelimits_v1 schema for time type
echo -e "\n=== raw_ratelimits_v1 CREATE TABLE ==="
sed -n '1,50p' go/pkg/clickhouse/schema/databases/002_ratelimits/001_raw_ratelimits_v1.sql
echo -e "\n=== raw_ratelimits_v1 time column ==="
grep -nE 'time' go/pkg/clickhouse/schema/databases/002_ratelimits/001_raw_ratelimits_v1.sql
```
Length of output: 2559
🏁 Script executed:
```bash
#!/bin/bash
# Display the migration guide for v1 to v2 to inspect backfill instructions
sed -n '1,200p' go/pkg/clickhouse/schema/databases/002_ratelimits/MIGRATION_v1_to_v2.md
# Search specifically for any backfill sequencing guidance
grep -nC2 'backfill' go/pkg/clickhouse/schema/databases/002_ratelimits/MIGRATION_v1_to_v2.md
```
Length of output: 5474
Update MV to prevent backfill duplicates and support nullable latency
Please strengthen the temporary sync view in go/pkg/clickhouse/schema/databases/002_ratelimits/013_temp_sync_v1_to_v2.sql to avoid duplicate records and avoid biasing latency metrics:
• Add a time-fence filter to the MV's SELECT so it only ingests new rows after deployment. For example:
```sql
CREATE MATERIALIZED VIEW IF NOT EXISTS ratelimits.temp_sync_v1_to_v2
TO ratelimits.raw_ratelimits_v2 AS
SELECT
    request_id,
    time,
    workspace_id,
    namespace_id,
    identifier,
    passed,
    NULL AS latency -- switch to NULLable latency
FROM ratelimits.raw_ratelimits_v1
WHERE time >= toUnixTimestamp64Milli(toDateTime('2025-08-26 00:00:00'))
;
```
• Change `raw_ratelimits_v2.latency` to `Nullable(Float64)`, e.g.
```sql
ALTER TABLE ratelimits.raw_ratelimits_v2
MODIFY COLUMN latency Nullable(Float64);
```
and have the MV emit NULL for unknown latency instead of 0.0.
• (Optional) Alternatively, if you want to keep latency non-nullable, add a boolean latency_known flag in v2 to distinguish backfilled rows.
These updates will (1) prevent overlapping backfill batches from being re-streamed by the MV and (2) avoid skewing average/quantile calculations with default zeros.
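If the non-nullable route is preferred, here is a minimal sketch of the flag approach; the `latency_known` column name is illustrative and not part of the existing schema:

```sql
-- Hypothetical sketch: keep latency non-nullable but mark rows whose latency is unknown.
ALTER TABLE ratelimits.raw_ratelimits_v2
    ADD COLUMN IF NOT EXISTS latency_known Bool DEFAULT true;

-- The temp sync MV would then tag v1-sourced rows explicitly, along the lines of:
--   SELECT ..., 0.0 AS latency, false AS latency_known
--   FROM ratelimits.raw_ratelimits_v1;
```

Downstream aggregations can then exclude unknown values, e.g. `avgIf(latency, latency_known)`, instead of averaging in default zeros.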
🤖 Prompt for AI Agents
In go/pkg/clickhouse/schema/databases/002_ratelimits/013_temp_sync_v1_to_v2.sql
around lines 5-16, the materialized view currently emits 0.0 for latency and
will re-ingest historical rows causing duplicate/backfill overlap; update the MV
SELECT to only ingest rows after deployment by adding a WHERE time >=
toUnixTimestamp64Milli(toDateTime('<DEPLOY_TS>')) filter, emit NULL for latency
instead of 0.0, and ensure the target column is nullable by changing
raw_ratelimits_v2.latency to Nullable(Float64) via an ALTER TABLE MODIFY COLUMN
statement (or alternatively add a latency_known boolean if you must keep latency
non-nullable).
### Phase 2: Backfill Historical Data

**⚠️ Important**: The temporary materialized view (`temp_sync_v1_to_v2`) is now active and syncing all NEW data from v1 to v2. You can safely backfill historical data while new writes continue.

Run these queries **manually** to backfill historical data in chunks:
🛠️ Refactor suggestion
Add a high-water-mark and grace window to the backfill plan.
Same rationale as verifications: define HWM at start, backfill < HWM, then after cutover run a small final catch-up (e.g., +5 minutes) before dropping the temp sync MV.
I can add the snippet to the doc.
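For reference, one possible shape of that snippet (a sketch only; the HWM value and chunk bounds are placeholders to be filled in at migration time):

```sql
-- 1. Record the high-water mark (HWM) when the temp sync MV goes live,
--    e.g. the max time already flowing into v2 at backfill start:
SELECT max(time) AS hwm FROM ratelimits.raw_ratelimits_v2;

-- 2. Backfill historical chunks strictly below the HWM (repeat per chunk):
INSERT INTO ratelimits.raw_ratelimits_v2
SELECT request_id, time, workspace_id, namespace_id, identifier, passed, 0.0 AS latency
FROM ratelimits.raw_ratelimits_v1
WHERE time >= <chunk_start> AND time < least(<chunk_end>, <hwm>);

-- 3. After cutover, run one final catch-up pass covering a small grace window
--    (HWM to HWM + 5 minutes) to close any gap, then drop the temp sync MV:
-- DROP VIEW ratelimits.temp_sync_v1_to_v2;
```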
🤖 Prompt for AI Agents
In go/pkg/clickhouse/schema/databases/002_ratelimits/MIGRATION_v1_to_v2.md
around lines 31-36, the backfill instructions lack a high-water-mark (HWM) and a
grace window; update the plan to compute and record an HWM timestamp at the
start, restrict chunked backfills to rows with timestamps strictly less than the
HWM, and after cutover run a final catch-up pass that backfills data up to HWM +
5 minutes (grace window) before dropping the temporary materialized view; add a
short example SQL snippet showing how to set the HWM, use it in WHERE clauses
for chunked backfills, and run the final catch-up with the +5 minute grace
window.
## New Aggregation Metrics Available

With v2, all aggregation tables now include:
- `latency_avg`: Average ratelimit check latency
- `latency_p75`: 75th percentile latency
- `latency_p99`: 99th percentile latency

Perfect for monitoring ratelimit performance and identifying bottlenecks!
🧹 Nitpick (assertive)
Call out minute bucketing explicitly for dashboards.
Since v2 exposes latency_avg/p75/p99 per-minute/hour/day/month, add an example query that reads one bucket to help dashboard authors and validate time bucketing.
I can add a “Sample queries” section with a per-minute and per-hour example.
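For example, something along these lines (the table names here are assumptions based on the v2 naming pattern and should be checked against the actual schema):

```sql
-- Per-minute latency for one identifier over the last hour:
SELECT time, passed, total, latency_avg, latency_p75, latency_p99
FROM ratelimits.ratelimits_per_minute_v2
WHERE workspace_id = 'ws_example'
  AND namespace_id = 'ns_example'
  AND identifier = 'user_example'
  AND time >= now() - INTERVAL 1 HOUR
ORDER BY time ASC;

-- Per-hour rollup for a whole namespace over the last day:
SELECT time, sum(total) AS total, avg(latency_avg) AS latency_avg
FROM ratelimits.ratelimits_per_hour_v2
WHERE workspace_id = 'ws_example'
  AND namespace_id = 'ns_example'
  AND time >= now() - INTERVAL 1 DAY
GROUP BY time
ORDER BY time ASC;
```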

What does this PR do?
The main goal is to add credit tracking and latencies, because we need both in the dashboard.
This PR adds new raw tables for verifications and ratelimits to include the required columns and optimises the table settings.
This only adds tables; no code reads or writes to them yet. However, there is a materialized view that populates the new table in real time.
I have already migrated our preview cluster and want to let it bake for a few days before migrating prod.
It also replaces the Goose migration system for ClickHouse with a Docker-based initialization approach. Instead of running migrations with Goose, the ClickHouse schema is now initialized directly during container startup using SQL files mounted into the container.
Key changes:
@perkinsjr: for your PR, we need to merge this one first and then revert the schema changes in yours; then we should be good to go. I can do that if you want, let me know.
Type of change
How should this be tested?
Run `make up` and verify ClickHouse initializes correctly.

Checklist
Required
- `pnpm build`
- `pnpm fmt`
- `console.logs`
- `git pull origin main`

Summary by CodeRabbit
New Features
Dev Experience
Tests
Documentation
Chores