fix: do not send 4x as many rows to clickhouse (#3161)
Conversation
We now run clickhouse in integration tests and assert the correct rows are inserted.
Caution: Review failed. The pull request is closed.

📝 Walkthrough

This change introduces comprehensive ClickHouse integration across the Go codebase, including infrastructure, schema, test harness, and API layers. It adds embedded SQL migrations for ClickHouse; creates new databases and tables for verifications, ratelimits, metrics, billing, and business analytics; and implements materialized views for efficient aggregation. The test harness and container orchestration now support ClickHouse alongside MySQL, with updated configuration and environment handling. New generic query utilities and direct ClickHouse connection accessors are provided. Integration and route tests are refactored to validate ClickHouse persistence, and batch processing logic is improved for concurrency and encapsulation.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Test as Test Harness
    participant MySQL as MySQL Container
    participant CH as ClickHouse Container
    participant API as API Cluster
    Test->>CH: Start ClickHouse container & run embedded migrations
    Note right of CH: Embedded SQL files executed
    Test->>CH: Initialize ClickHouse client
    Test->>MySQL: Start MySQL container
    Test->>MySQL: Initialize MySQL client
    Test->>API: Start API cluster with MySQL & ClickHouse DSNs
    API->>CH: Persist and query ratelimit, metrics, etc.
    API->>MySQL: Persist and query relational data
    Test->>CH: Query ClickHouse for test assertions
```
Possibly related PRs
Suggested reviewers
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (1.64.8)
Error: you are using a configuration file for golangci-lint v2 with golangci-lint v1; please use golangci-lint v2.

📜 Recent review details
Configuration used: CodeRabbit UI
⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
niceeee. we have a proper namespacing/folder structure for schemas :praise-the-unkey:
Actionable comments posted: 6
🔭 Outside diff range comments (1)
go/pkg/clickhouse/schema/databases/002_ratelimits/010_ratelimits_last_used_v1.sql (1)
1-12: ⚠️ Potential issue — Critical: table column type mismatch for AggregatingMergeTree

The `AggregatingMergeTree` engine requires the column storing an aggregate state to use a matching state type. The materialized view inserts `maxSimpleState(time)`, whose result type is `SimpleAggregateFunction(max, Int64)`; defining `time` as a plain `Int64` will fail when the materialized view tries to INSERT a state.

Please update the `time` column to the correct state type. For example:

```diff
 CREATE TABLE IF NOT EXISTS ratelimits.ratelimits_last_used_v1 (
-  time Int64,
+  time SimpleAggregateFunction(max, Int64),
   workspace_id String,
   namespace_id String,
   identifier String
 )
 ENGINE = AggregatingMergeTree()
 ORDER BY (workspace_id, namespace_id, time, identifier);
```
🧹 Nitpick comments (40)
go/pkg/clickhouse/schema/databases/005_business/001_database.sql (1)
1-2: Unify SQL formatting for consistency. Here the semicolon is placed on its own line, but other database migrations (e.g., `verifications`, `ratelimits`, `metrics`) include the semicolon on the same line. To maintain consistency, consider combining into a single statement:

```sql
CREATE DATABASE IF NOT EXISTS business;
```

go/pkg/clickhouse/noop.go (1)
15-16: Remove duplicate interface assertion. The interface assertion `var _ Bufferer = (*noop)(nil)` appears twice consecutively on lines 15 and 16.

```diff
 var _ Bufferer = (*noop)(nil)
-var _ Bufferer = (*noop)(nil)
```

go/pkg/clickhouse/schema/databases/005_business/002_active_workspaces_per_month_v1.sql (1)
1-8: Consider optimizing the table structure for query patterns. The table definition looks good for basic monthly workspace tracking. However, consider the following optimizations:

- No explicit PRIMARY KEY is defined (it defaults to the ORDER BY key). If you frequently query by `workspace_id`, consider adding a secondary index or changing the sorting key.
- There's no TTL (time-to-live) specified. If this data doesn't need to be retained indefinitely, adding a TTL could help manage storage.
- Consider specifying compression options if this table will store significant data.

```diff
 CREATE TABLE IF NOT EXISTS business.active_workspaces_per_month_v1 (
   time Date,
   workspace_id String
 )
 ENGINE = MergeTree()
 ORDER BY (time)
+PRIMARY KEY (time)
+SETTINGS index_granularity = 8192
 ;
```

go/pkg/clickhouse/client.go (2)
66-68: Be cautious with connection URL logging. Adding connection-information logging is good for debugging, but be aware that the URL might contain sensitive information such as credentials. Consider masking sensitive parts of the URL in production environments.

```diff
-config.Logger.Info("initializing clickhouse client",
-	"url", config.URL,
-)
+// Parse the URL to mask credentials if present
+parsedURL, err := url.Parse(config.URL)
+if err == nil && parsedURL.User != nil {
+	parsedURL.User = url.UserPassword(parsedURL.User.Username(), "******")
+}
+config.Logger.Info("initializing clickhouse client",
+	"url", parsedURL.String(),
+)
```
254-256: Exposing the raw connection increases flexibility but requires caution. Adding the `Conn()` method provides flexibility for direct database access, which is useful for custom queries and testing. However, it also allows bypassing the client's abstractions and safeguards. Consider documenting usage guidelines in the method comment to ensure proper use:

```diff
+// Conn returns the underlying ClickHouse connection.
+// Use with caution as it bypasses the client's batch processing and other safeguards.
+// Primarily intended for direct queries and testing purposes.
 func (c *clickhouse) Conn() ch.Conn {
 	return c.conn
 }
```

go/pkg/clickhouse/schema/databases/005_business/003_active_workspaces_ratelimits_per_month_mv_v1.sql (1)
1-8: Well-structured materialized view for workspace analytics

This materialized view effectively aggregates data from the ratelimits table into the business analytics table. The implementation follows good practices for maintaining derived data automatically.

One minor style suggestion: consider removing the blank line at the beginning of the file for consistency with other SQL files.

```diff
-
 CREATE MATERIALIZED VIEW IF NOT EXISTS business.active_workspaces_ratelimits_per_month_mv_v1
 TO business.active_workspaces_per_month_v1
 AS
 SELECT workspace_id, toDate(time) as time
 FROM ratelimits.ratelimits_per_month_v1
 ;
```

go/pkg/clickhouse/schema/databases/001_verifications/016_key_verifications_per_hour_mv_v3.sql (1)
1-23: Verify SQL dialect support for alias in GROUP BY, or use the full expression

ClickHouse may not allow grouping by the `time` alias; it's safer to use the full `toStartOfHour(fromUnixTimestamp64Milli(time))` expression. Consider this optional refactor to avoid runtime errors:

```diff
 GROUP BY
   workspace_id,
   key_space_id,
   identity_id,
   key_id,
   outcome,
-  time,
+  toStartOfHour(fromUnixTimestamp64Milli(time)),
   tags
 ;
```

Please confirm your ClickHouse version supports alias-based grouping or apply the above change.
go/pkg/clickhouse/schema/databases/004_billing/005_billalble_verifications_v2_mv.sql (1)
2-15: Use explicit expressions instead of aliases in GROUP BY for compatibility

As above, grouping on the `year` and `month` aliases may not be accepted in all ClickHouse versions. To ensure portability, reference the functions directly:

```diff
 GROUP BY
   workspace_id,
-  year,
-  month
+  toYear(time),
+  toMonth(time)
 ;
```

This guarantees the view compiles across ClickHouse releases.
go/pkg/clickhouse/schema/databases/002_ratelimits/004_ratelimits_per_hour_v1.sql (1)
go/pkg/clickhouse/schema/databases/002_ratelimits/004_ratelimits_per_hour_v1.sql (1)
1-14: Consider adding a PARTITION BY clause for better data lifecycle management
Currently the table is ordered by `(workspace_id, namespace_id, time, identifier)` but lacks partitioning. Introducing `PARTITION BY toYYYYMMDD(time)` (or monthly) can greatly improve merge performance and enable TTL-based cleanup.

```diff
 ENGINE = SummingMergeTree()
+PARTITION BY toYYYYMMDD(time)
 ORDER BY (workspace_id, namespace_id, time, identifier)
```

Partitioning by day (or month) will help manage retention and speed up queries against date ranges.
go/pkg/clickhouse/schema/databases/001_verifications/014_key_verifications_per_month_mv_v2.sql (1)
go/pkg/clickhouse/schema/databases/001_verifications/014_key_verifications_per_month_mv_v2.sql (1)
1-22: Verify column mappings and consider aliasing for clarity. This MV writes into `verifications.key_verifications_per_month_v2`. Please confirm that the SELECT output column order and names align exactly with the target table's schema (e.g., `(workspace_id, key_space_id, identity_id, key_id, outcome, count, time, tags)`). Additionally, consider renaming the `count` alias to a more descriptive name (e.g., `verification_count`) to avoid shadowing the SQL function and improve readability.
go/pkg/clickhouse/schema/databases/004_billing/003_billable_verifications_per_month_mv_v1.sql (1)
1-15: Consider renaming the `count` alias for clarity. The view currently uses `count(*) AS count`, which shadows the built-in SQL function and can be confusing downstream. It's recommended to use a descriptive alias such as `billable_count` or `verification_count` to clearly convey its intent in the billing schema.
go/pkg/clickhouse/schema/databases/002_ratelimits/008_ratelimits_per_month_v1.sql (1)
1-14: Consider adding partitioning to the SummingMergeTree
Right now the table is unpartitioned, which could lead to a large number of partitions as it grows. For monthly aggregates, adding `PARTITION BY toYYYYMM(time)` (or equivalent) can improve query performance and data lifecycle management.
go/pkg/clickhouse/schema/databases/002_ratelimits/009_ratelimits_per_month_mv_v1.sql (1)
1-18: Optional: Partitioning recommendation for materialized view target
Although this is a view, ensure that the target table `ratelimits_per_month_v1` (defined in 008_ratelimits_per_month_v1.sql) uses partitioning by month to optimize storage and queries.
go/pkg/clickhouse/schema/databases/002_ratelimits/001_raw_ratelimits_v1.sql (1)
3-4: Consider using DateTime instead of Int64 for time. Storing time as Int64 unix milliseconds works, but consider using ClickHouse's native DateTime/DateTime64 type. This would enable more readable queries and direct use of ClickHouse's time functions without conversion.

```diff
-  -- unix milli
-  time Int64,
+  time DateTime64(3), -- millisecond precision
```

go/pkg/clickhouse/schema/databases/001_verifications/006_key_verifications_per_day_mv_v1.sql (3)
4-11: Avoid shadowing the input column and improve clarity. The alias `time` here shadows the original `time` field from `raw_key_verifications_v1`. Consider renaming the alias to something like `day_start` or `date_bucket` to clearly signal that this is the bucketed timestamp.

10-11: Rename the aggregated count column for clarity. Using `count` as both the function and the output column name can be confusing. Renaming the alias to `verification_count` (or similar) will make the schema more self-documenting.

1-20: Consider backfilling existing data with POPULATE. By default, ClickHouse materialized views only capture new inserts. If you need this view to aggregate historical rows already in `raw_key_verifications_v1`, you should use `CREATE MATERIALIZED VIEW … POPULATE` or run a one-off backfill.
go/pkg/clickhouse/schema/databases/003_metrics/007_api_requests_per_day_v1.sql (3)
1-11: Add PARTITION BY to manage large partitions. Currently this table will use the primary key as the partitioning key, leading to unbounded partitions over time. For daily aggregates, consider `PARTITION BY toYYYYMM(time)` to keep monthly partitions manageable.

3-4: Evaluate LowCardinality on high-repetition columns. Columns like `workspace_id` and `host` may have low unique value counts. Marking them as `LowCardinality(String)` can reduce memory usage and speed up queries:

```diff
- workspace_id String,
- host String,
+ workspace_id LowCardinality(String),
+ host LowCardinality(String),
```

1-20: Implement TTL to enforce a retention policy. To automatically purge stale metrics, add a TTL clause such as `TTL time + INTERVAL 90 DAY`. This helps control disk usage for rolling-window analytics.
go/pkg/clickhouse/schema/databases/001_verifications/013_key_verifications_per_month_v2.sql (3)
1-11: Define PARTITION BY for monthly aggregates. Without an explicit partition key, all data lands in a single partition. For a monthly summary table, add `PARTITION BY toYYYYMM(time)`. This keeps partitions aligned with calendar boundaries.

1-10: Optimize the `tags` column with LowCardinality. If individual tag values repeat frequently, using `Array(LowCardinality(String))` can significantly reduce index size:

```diff
- tags Array(String),
+ tags Array(LowCardinality(String)),
```

1-14: Add TTL for long-term data cleanup. Even monthly aggregates can grow unbounded. Consider `TTL time + INTERVAL 1 YEAR` to automatically remove data older than your retention period.
go/pkg/clickhouse/schema/databases/001_verifications/017_key_verifications_per_day_v3.sql (3)
1-11: Introduce PARTITION BY for the daily summary. To avoid a single ever-growing partition, partition by month (or day): `PARTITION BY toYYYYMM(time)`. This optimizes merges and query performance.

1-10: Use LowCardinality on `tags` array elements. If tag values are drawn from a small set, switch to:

```diff
- tags Array(String),
+ tags Array(LowCardinality(String)),
```

to shrink index size.

1-14: Set a TTL for daily aggregates. Hourly and daily summaries may only be needed short-term. Add, for example, `TTL time + INTERVAL 30 DAY` to auto-purge stale buckets.
go/pkg/clickhouse/schema/databases/001_verifications/015_key_verifications_per_hour_v3.sql (3)
1-11: Partition hourly data to control part size. Define a partition key, for example `PARTITION BY toYYYYMM(time)`, so that each month's hourly buckets live in their own partitions.

1-10: Apply LowCardinality to the `tags` array if appropriate. For sparse tag sets, use:

```diff
- tags Array(String),
+ tags Array(LowCardinality(String)),
```

to reduce the memory footprint.

1-14: Add TTL for short-term hourly summaries. Consider `TTL time + INTERVAL 7 DAY` to automatically expire hourly aggregates after one week.
go/pkg/clickhouse/schema/databases/000_README.md (1)
18-20: Minor formatting inconsistency in the backtick usage. There's a slight formatting inconsistency in the backtick usage around the prefix examples.

```diff
- - `raw_`: Input data tables
- - `tmp_{yourname}_`: Temporary tables for experiments, add your name, so it's easy to identify ownership.
+ - `raw_` - Input data tables
+ - `tmp_{yourname}_` - Temporary tables for experiments, add your name, so it's easy to identify ownership.
```
🪛 LanguageTool
[uncategorized] ~18-~18: Loose punctuation mark.
Context: ...tion][version]### Prefixes -raw: Input data tables -tmp_{yourname}_`: ...(UNLIKELY_OPENING_PUNCTUATION)
go/apps/api/integration/multi_node_ratelimiting/run.go (3)
202-217: Fix step numbering in comments. This section is labeled as "Step 6" but should be "Step 7" since there's already a Step 6 above.

```diff
-// Step 6: Verify Clickhouse Metrics Data
+// Step 7: Verify Clickhouse Metrics Data
```
208-208: Consider using a named query parameter for `workspace_id`. This line uses string formatting for the workspace ID, which is inconsistent with the parameterized approach used in the previous query. Consider using named parameters here as well for consistency and to avoid potential SQL injection issues.

```diff
-row := h.CH.Conn().QueryRow(ctx, fmt.Sprintf(`SELECT count(*) as total_requests, count(DISTINCT request_id) as unique_requests FROM metrics.raw_api_requests_v1 WHERE workspace_id = '%s';`, h.Resources().UserWorkspace.ID))
+row := h.CH.Conn().QueryRow(ctx, `SELECT count(*) as total_requests, count(DISTINCT request_id) as unique_requests FROM metrics.raw_api_requests_v1 WHERE workspace_id = {workspace_id:String};`,
+	map[string]string{
+		"workspace_id": h.Resources().UserWorkspace.ID,
+	})
```
214-214: Boolean expression is already direct equality checks.

```go
return metricsCount == uint64(totalRequests) && uniqueCount == uint64(totalRequests) // nolint:gosec
```

No actual change needed here; the expression already compares the values directly and is fine as written.
go/pkg/testutil/containers/clickhouse.go (3)
17-43: Docs are out-of-sync with the actual API

The comment block still promises that the function returns a `driver.Conn` plus one DSN, but the signature now returns two DSNs and no connection. This can mislead future callers and reviewers. Please revise the godoc or reinstate the original return value for consistency.
101-122: Retry loop could leak if `Ping` hangs longer than the context

Minor: the context inside `Ping` has a 10 s timeout, while the outer `Retry` has none. If ClickHouse never comes up, the goroutine will keep spinning until the test timeout kicks in. Consider adding a max-retry duration or wrapping the entire `Retry` with a parent context to avoid very long hangs.
132-179: SQL splitting by `;` is brittle

`runClickHouseMigrations` naively splits on `;`, which breaks on:

- Embedded semicolons in string literals / comments
- PL/SQL-style multiline statements (future-proofing)

A safer option is to use `clickhouse.Conn.Exec` with `client.MultiStatement` enabled or a SQL parser. If you keep the current approach, guard against empty statements and trim comments.
go/pkg/clickhouse/schema/databases/003_metrics/002_raw_api_requests_v1.sql (2)
4-4: Prefer `DateTime64(3)` over raw `Int64` for timestamps

Storing unix milliseconds as `Int64` makes time-based functions, partition pruning, and human querying harder. `DateTime64(3)` keeps the same precision, is automatically understood by ClickHouse's date functions, and compresses better.
29-30: Use the `IPv6` data type for `ip_address`

ClickHouse has a dedicated `IPv6` type (which also stores IPv4) providing better validation and sorting. Storing it as a plain `String` loses this benefit and wastes space.
go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql (2)
6-6: Timestamp column type

Same comment as for the metrics table: switch `time Int64` to `time DateTime64(3)` for native support of ClickHouse date functions.
21-22: Consider `Enum8` for `outcome`

The `outcome` column has a fixed, known set. `Enum8('VALID' = 1, 'RATE_LIMITED' = 2, 'EXPIRED' = 3, 'DISABLED' = 4)` is faster, smaller, and self-documenting compared to `LowCardinality(String)`.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go/go.sum is excluded by !**/*.sum
📒 Files selected for processing (68)
go/apps/api/integration/harness.go (4 hunks)
go/apps/api/integration/multi_node_ratelimiting/run.go (3 hunks)
go/apps/api/routes/v2_ratelimit_limit/200_test.go (9 hunks)
go/cmd/api/main.go (2 hunks)
go/go.mod (5 hunks)
go/pkg/batch/process.go (2 hunks)
go/pkg/clickhouse/client.go (4 hunks)
go/pkg/clickhouse/interface.go (2 hunks)
go/pkg/clickhouse/noop.go (2 hunks)
go/pkg/clickhouse/schema/databases/000_README.md (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/001_database.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/003_key_verifications_per_hour_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/004_key_verifications_per_hour_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/005_key_verifications_per_day_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/006_key_verifications_per_day_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/007_key_verifications_per_month_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/008_key_verifications_per_month_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/009_key_verifications_per_hour_v2.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/010_key_verifications_per_hour_mv_v2.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/011_key_verifications_per_day_v2.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/012_key_verifications_per_day_mv_v2.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/013_key_verifications_per_month_v2.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/014_key_verifications_per_month_mv_v2.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/015_key_verifications_per_hour_v3.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/016_key_verifications_per_hour_mv_v3.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/017_key_verifications_per_day_v3.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/018_key_verifications_per_day_mv_v3.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/019_key_verifications_per_month_v3.sql (1 hunk)
go/pkg/clickhouse/schema/databases/001_verifications/020_key_verifications_per_month_mv_v3.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/000_database.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/001_raw_ratelimits_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/002_ratelimits_per_minute_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/003_ratelimits_per_minute_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/004_ratelimits_per_hour_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/005_ratelimits_per_hour_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/007_ratelimits_per_day_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/008_ratelimits_per_month_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/009_ratelimits_per_month_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/010_ratelimits_last_used_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/002_ratelimits/011_ratelimits_last_used_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/003_metrics/001_database.sql (1 hunk)
go/pkg/clickhouse/schema/databases/003_metrics/002_raw_api_requests_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/003_metrics/003_api_requests_per_hour_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/003_metrics/004_api_requests_per_hour_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/003_metrics/005_api_requests_per_minute_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/003_metrics/006_api_requests_per_minute_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/003_metrics/007_api_requests_per_day_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/003_metrics/008_api_requests_per_day_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/004_billing/001_database.sql (1 hunk)
go/pkg/clickhouse/schema/databases/004_billing/002_billable_verifications_per_month_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/004_billing/003_billable_verifications_per_month_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/004_billing/004_billable_verifications_v2.sql (1 hunk)
go/pkg/clickhouse/schema/databases/004_billing/005_billalble_verifications_v2_mv.sql (1 hunk)
go/pkg/clickhouse/schema/databases/004_billing/006_billable_ratelimits_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/004_billing/006_billable_ratelimits_v1_mv.sql (1 hunk)
go/pkg/clickhouse/schema/databases/005_business/001_database.sql (1 hunk)
go/pkg/clickhouse/schema/databases/005_business/002_active_workspaces_per_month_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/005_business/003_active_workspaces_keys_per_month_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/databases/005_business/003_active_workspaces_ratelimits_per_month_mv_v1.sql (1 hunk)
go/pkg/clickhouse/schema/embed.go (1 hunk)
go/pkg/clickhouse/select.go (1 hunk)
go/pkg/codes/constants_gen.go (1 hunk)
go/pkg/testutil/containers/api.go (3 hunks)
go/pkg/testutil/containers/clickhouse.go (5 hunks)
go/pkg/testutil/http.go (2 hunks)
go/pkg/testutil/seed/seed.go (0 hunks)
💤 Files with no reviewable changes (1)
- go/pkg/testutil/seed/seed.go
🧰 Additional context used
🧬 Code Graph Analysis (2)
go/pkg/clickhouse/client.go (1)
go/pkg/otel/logging/interface.go (1)
Logger (11-116)
go/pkg/testutil/containers/clickhouse.go (2)
go/pkg/testutil/containers/containers.go (1)
Containers (13-17)
go/pkg/clickhouse/schema/embed.go (1)
Migrations (10-10)
🪛 LanguageTool
go/pkg/clickhouse/schema/databases/000_README.md
[uncategorized] ~18: Loose punctuation mark.
Context: "...tion][version] ### Prefixes - `raw_`: Input data tables - `tmp_{yourname}_`: ..."
(UNLIKELY_OPENING_PUNCTUATION)
🔇 Additional comments (67)
go/pkg/codes/constants_gen.go (1)
2-2: Generated timestamp update is expected. This file is auto-generated by generate.go; updating the `Generated at` timestamp reflects a fresh regeneration. No manual edits are required.
go/pkg/clickhouse/schema/databases/001_verifications/001_database.sql (1)
1-1: Good: Idempotent database creation. The `CREATE DATABASE IF NOT EXISTS verifications;` statement correctly ensures the `verifications` database is only created once. No changes needed.
go/pkg/clickhouse/schema/databases/002_ratelimits/000_database.sql (1)

1-1: Good: Idempotent database creation. The `CREATE DATABASE IF NOT EXISTS ratelimits;` statement is correctly formed for idempotent schema setup. No adjustments required.
go/pkg/clickhouse/schema/databases/003_metrics/001_database.sql (1)

1-1: Good: Idempotent database creation. The `CREATE DATABASE IF NOT EXISTS metrics;` statement is properly defined, ensuring idempotent application of the migration.
go/pkg/clickhouse/schema/databases/004_billing/001_database.sql (1)

1-2: Looks good, proper usage of IF NOT EXISTS. The SQL statement for creating the billing database correctly uses the `IF NOT EXISTS` clause to ensure idempotency, which is essential for migrations that may be run multiple times.
go/pkg/clickhouse/noop.go (2)
6-6: Looks good, proper import for the ClickHouse driver. Adding the ClickHouse Go driver import supports the new interface method.

43-45: Implementation aligns with interface requirements. The `Conn()` method correctly implements the interface by returning `nil`, which is appropriate for a no-op implementation where the actual connection isn't needed.
go/cmd/api/main.go (2)

128-128: Good cleanup removing a redundant type conversion. Removing the unnecessary `int()` type conversion improves code clarity since `cmd.Int()` already returns an `int`.

144-144: Good cleanup removing a redundant type conversion. Removing the unnecessary `int()` type conversion improves code clarity since `cmd.Int()` already returns an `int`.
go/pkg/clickhouse/interface.go (2)

6-6: Looks good, proper import for the ClickHouse driver. Adding the ClickHouse Go driver import supports the new interface method.

31-33: Interface extension enables direct connection access. Adding the `Conn()` method to the Querier interface provides direct access to the underlying ClickHouse connection, enabling more fine-grained control and the ability to verify exactly how many rows are being inserted, which aligns well with the PR objective of fixing the "4x rows" issue.
go/pkg/clickhouse/schema/embed.go (1)
1-10: Well-structured implementation of file embedding for SQL migrations. This file uses Go's `embed` package to include SQL migration files directly in the compiled binary, which is a good practice for several reasons:
- Ensures migrations are always available alongside the application
- Simplifies deployment by eliminating the need for separate files
- Makes testing more reliable and portable
The implementation follows Go's standard patterns for file embedding and includes clear documentation.
go/pkg/clickhouse/client.go (1)
145-145: Good performance optimization with increased concurrency. Doubling the number of consumers from 4 to 8 for the ratelimits batch processor should improve throughput. This change aligns well with the PR objective of addressing the issue of sending too many rows to ClickHouse.
go/pkg/testutil/http.go (2)
go/pkg/testutil/http.go (2)
57-57: Improved naming clarity for database connection string.Renaming from generic
dsnto more specificmysqlDSNmakes the code more self-documenting and distinguishes it from the ClickHouse DSN added later.
91-99:Details
✅ Verification successful
Well-structured integration of real ClickHouse client.
The code correctly initializes a real ClickHouse connection instead of a mock, which directly addresses the PR objective of fixing the issue with incorrect row counts sent to ClickHouse. The proper error handling ensures test failures are clear and immediate if the ClickHouse client cannot be created.
Let's verify this test harness is being used in integration tests that check row counts:
🏁 Script executed:
#!/bin/bash # Find integration tests that might be asserting on ClickHouse row counts rg -A 5 -B 5 "clickhouse|ClickHouse" --type go go/apps/api/integration/ rg -A 5 -B 5 "assert.*row.*count" --type go go/apps/api/integration/Length of output: 6453
Integration tests leverage the real ClickHouse client for aggregated counts:

- Found in go/apps/api/integration/harness.go: the test harness calls `containerMgr.RunClickHouse()` and uses `clickhouse.New(...)`, exactly the code under review.
- In go/apps/api/integration/multi_node_ratelimiting/run.go, tests execute `clickhouse.Select[...]` to retrieve aggregated counts (`total_requests`, `success_count`, `failure_count`), validating row counts indirectly.

These usages confirm that the real ClickHouse initialization in go/pkg/testutil/http.go (lines 91–99) is exercised by integration tests and addresses the original issue. No further changes needed.
go/pkg/clickhouse/schema/databases/004_billing/006_billable_ratelimits_v1.sql (1)
1-10: Well-designed table schema for billable ratelimits aggregation. This table effectively supports monthly aggregation of ratelimit usage per workspace:
- The SummingMergeTree engine is an excellent choice as it automatically sums the count column during merges
- Ordering by workspace_id, year, month enables efficient querying for billing reports
- The schema is consistent with other billing tables, maintaining a clear pattern
This schema directly supports the PR objective by providing a structured way to track and verify the number of rows inserted for ratelimits.
go/pkg/clickhouse/schema/databases/004_billing/004_billable_verifications_v2.sql (1)
1-10: Consistent schema design for billable verifications v2. This table follows the same design principles as the ratelimits table:
- Uses SummingMergeTree for efficient aggregation
- Maintains the same column structure and ordering
- The v2 suffix indicates this is an iteration on an earlier schema version
This schema, along with the test harness changes, will help validate correct insertion counts for verifications.
go/pkg/clickhouse/schema/databases/004_billing/002_billable_verifications_per_month_v1.sql (1)
1-10: Details — ✅ Verification successful

Original billable verifications schema maintained for compatibility.

This v1 table has identical structure to the v2 version, suggesting it's kept for backward compatibility or migration purposes. Maintaining both versions is a good practice during schema evolution to ensure continuous operation.

The presence of both v1 and v2 tables raises a question:

🏁 Script executed:

```shell
#!/bin/bash
# Check if there are any materialized views or other objects that still reference the v1 table
rg -A 10 "billable_verifications_per_month_v1" go/pkg/clickhouse/schema/
```

Length of output: 2491
v1 table usage verified
- The materialized view at go/pkg/clickhouse/schema/databases/004_billing/003_billable_verifications_per_month_mv_v1.sql writes into `billing.billable_verifications_per_month_v1`, confirming it's still in use and must be retained for compatibility.

All set to approve these changes.
go/pkg/clickhouse/schema/databases/002_ratelimits/011_ratelimits_last_used_mv_v1.sql (1)
go/pkg/clickhouse/schema/databases/002_ratelimits/011_ratelimits_last_used_mv_v1.sql (1)
4-17: ratelimits_last_used_mv_v1: Materialized view definition looks solid
The use of maxSimpleState(time) with grouping on (workspace_id, namespace_id, identifier) correctly captures the last-used timestamp per identifier. No issues detected.

go/pkg/clickhouse/schema/databases/002_ratelimits/003_ratelimits_per_minute_mv_v1.sql (1)

1-17: Materialized view definition looks correct

The ratelimits_per_minute_mv_v1 view accurately aggregates by minute, computes passed and total counts, and writes to ratelimits.ratelimits_per_minute_v1. No issues spotted in the SELECT, grouping, or MV syntax.

go/pkg/clickhouse/schema/databases/001_verifications/018_key_verifications_per_day_mv_v3.sql (1)
1-22: Approve materialized view for daily aggregations
The key_verifications_per_day_mv_v3 view correctly selects and groups by all required dimensions, including tags, and applies toStartOfDay on the millisecond timestamp. This aligns with the v3 schema evolution for daily verifications.

go/pkg/clickhouse/schema/databases/001_verifications/020_key_verifications_per_month_mv_v3.sql (1)

1-23: Approve materialized view for monthly aggregations

The key_verifications_per_month_mv_v3 view correctly mirrors the daily version, using toStartOfMonth and including tags. It follows the established v3 pattern for monthly summarization.

go/pkg/clickhouse/schema/databases/002_ratelimits/009_ratelimits_per_month_mv_v1.sql (1)

1-18: Approve monthly rate-limit materialized view

The ratelimits_per_month_mv_v1 view properly uses countIf(passed > 0) and count(*), grouping by the month-start timestamp. It aligns with the existing per-minute, per-hour, and per-day views.

go/pkg/clickhouse/schema/databases/003_metrics/008_api_requests_per_day_mv_v1.sql (1)

1-18: Daily aggregation view for API requests looks correct

This materialized view properly buckets metrics.raw_api_requests_v1 into daily counts by workspace_id, path, response_status, host, method, and toStartOfDay(...). The SQL follows established conventions and should integrate seamlessly.

go/pkg/clickhouse/schema/databases/001_verifications/008_key_verifications_per_month_mv_v1.sql (1)
2-21: Monthly aggregation view for key verifications is accurate
The view groups raw_key_verifications_v1 by all key dimensions and toStartOfMonth(...) for a monthly summary. The syntax and naming align with the other verifications migrations.

go/pkg/clickhouse/schema/databases/001_verifications/010_key_verifications_per_hour_mv_v2.sql (1)

1-22: Hourly aggregation view for key verifications (v2) with tags is correct

This view extends the hourly aggregation to include the tags column, grouping on the same dimensions plus tags. The use of toStartOfHour(...) matches project standards and the grouping is complete.

go/pkg/clickhouse/schema/databases/004_billing/006_billable_ratelimits_v1_mv.sql (1)

2-15: Monthly billing view for rate limits is well-defined

The view filters passed > 0 and sums passed per workspace_id, year, and month using toYear(time) and toMonth(time). This matches billing requirements and is consistent with the other aggregate views.

go/pkg/clickhouse/schema/databases/003_metrics/006_api_requests_per_minute_mv_v1.sql (1)

1-18: Per-minute aggregation view for API requests is solid

This view correctly groups raw API requests into one-minute buckets via toStartOfMinute(fromUnixTimestamp64Milli(time)) and dimensions. The pattern mirrors the existing per-hour and per-day views.

go/pkg/clickhouse/schema/databases/003_metrics/004_api_requests_per_hour_mv_v1.sql (1)

1-18: Materialized view for hourly API requests is correctly defined.

The aggregation logic, grouping keys, and timestamp truncation align with the intended design.

go/pkg/clickhouse/schema/databases/002_ratelimits/002_ratelimits_per_minute_v1.sql (1)
1-14: Per-minute ratelimits table schema looks good.
Use of SummingMergeTree and the specified ORDER BY columns is appropriate for incremental counters.

go/pkg/clickhouse/schema/databases/002_ratelimits/005_ratelimits_per_hour_mv_v1.sql (1)

1-18: Hourly ratelimits materialized view is properly constructed.

The countIf for passed requests, the total count, and the hourly timestamp truncation match the requirements.

go/pkg/clickhouse/schema/databases/001_verifications/012_key_verifications_per_day_mv_v2.sql (1)

1-23: Daily key verifications materialized view is accurate.

Including tags in the grouping and using toStartOfDay on the timestamp is correct for v2.

go/pkg/clickhouse/schema/databases/001_verifications/004_key_verifications_per_hour_mv_v1.sql (1)

1-21: Hourly key verifications materialized view implementation is sound.

The grouping on the truncated hour and the count aggregation align with the raw data structure.

go/pkg/clickhouse/schema/databases/001_verifications/003_key_verifications_per_hour_v1.sql (1)

1-13: Well-designed hourly aggregation table with appropriate engine choice

This ClickHouse table is well-structured for hourly verification aggregations. The SummingMergeTree engine is ideal for this use case, as it will automatically sum the count column during merges. The column ordering in the PRIMARY KEY supports efficient filtering by workspace and key space first, followed by time-based access patterns.

The use of LowCardinality for the outcome field is a good optimization for a column that likely has few distinct values.
go/pkg/clickhouse/schema/databases/001_verifications/007_key_verifications_per_month_v1.sql (1)
1-13: Consistent table design following the established pattern

This monthly aggregation table maintains the same well-designed structure as the hourly table, creating a consistent pattern across different time granularities. The identical column structure and SummingMergeTree engine configuration promote code maintainability and predictable query patterns.

The versioned naming convention (_v1) also indicates good planning for future schema evolution.

go/pkg/clickhouse/schema/databases/002_ratelimits/001_raw_ratelimits_v1.sql (2)

1-14: Well-structured raw events table with appropriate indexing

This table design for raw rate limit events uses appropriate indexing strategies, with both the primary ORDER BY and secondary indices supporting common query patterns. The MergeTree engine is a good choice for raw event storage.

16-19: Good use of secondary indices

The secondary indices on (workspace_id, time) and request_id will help optimize query patterns beyond what the primary key ordering provides.
go/pkg/batch/process.go (3)
105-107: Improved thread safety by localizing batch state

Moving the batch slice from a struct field to a local variable within the process method is an excellent refactoring. It ensures each consumer goroutine has its own independent batch state, eliminating potential race conditions.

109-116: Updated flushAndReset to use the localized batch

The flushAndReset function has been correctly updated to work with the localized batch variable.

119-134: Updated main processing loop

The main processing loop has been properly updated to use the localized batch variable, maintaining the same batching logic while improving thread safety.
go/pkg/clickhouse/schema/databases/001_verifications/006_key_verifications_per_day_mv_v1.sql (1)
1-3: Validate MV target table existence and schema alignment

Ensure that the destination table verifications.key_verifications_per_day_v1 already exists and that its column order and types exactly match the SELECT list (workspace_id, key_space_id, identity_id, key_id, outcome, count, time). A mismatch will cause the materialized view to fail at runtime.

go/pkg/clickhouse/schema/databases/001_verifications/019_key_verifications_per_month_v3.sql (1)
1-14: Well-structured aggregation table for monthly key verifications

This is a well-designed table with appropriate use of the SummingMergeTree engine, which will automatically sum the count column during merges. The ORDER BY clause includes all dimensions needed for efficient querying of verification metrics by workspace, key space, identity, key, and time, with tags and outcome at the end.

The inclusion of tags as an Array(String) adds important flexibility for multi-dimensional analysis.

go/pkg/clickhouse/schema/databases/003_metrics/003_api_requests_per_hour_v1.sql (1)
1-20: Well-designed hourly API metrics table with appropriate indexing

The table structure effectively supports hourly API request analytics with:
- Good use of LowCardinality for the method column to optimize storage
- Clear comments explaining the method column format
- Appropriate use of SummingMergeTree for aggregation
- Logical column ordering in the ORDER BY clause that prioritizes workspace_id and time
This will enable efficient querying of API usage patterns across multiple dimensions.
go/pkg/clickhouse/schema/databases/003_metrics/005_api_requests_per_minute_v1.sql (1)
1-20: Properly structured minute-level API metrics table

This table mirrors the hourly metrics table design but at minute-level granularity, which is appropriate for more detailed analytics. The table structure and indexing strategy are consistent with the hourly table, maintaining a good balance between query performance and storage efficiency.

go/pkg/clickhouse/schema/databases/001_verifications/005_key_verifications_per_day_v1.sql (1)

1-13: Appropriate daily verification metrics table design

This v1 table design provides a good structure for day-level verification metrics. The SummingMergeTree engine is appropriate for this use case, and the ORDER BY clause efficiently organizes data for common query patterns focusing on workspace, key space, and time.

Note that this v1 table doesn't include the tags array that exists in the v3 monthly table, which is expected for a v1 schema version.

go/apps/api/integration/harness.go (6)

8-8: Proper import of the ClickHouse package

Adding the ClickHouse import is necessary for the integration test harness to use ClickHouse functionality.

27-27: Good addition of ClickHouse client to the Harness struct

Adding the CH field of type clickhouse.ClickHouse to the Harness struct allows tests to interact with ClickHouse directly.

47-56: Proper ClickHouse container setup with error handling

The code correctly:
- Starts a ClickHouse container with migrations
- Creates a real ClickHouse client with the host DSN
- Uses a no-op logger appropriate for tests
- Checks for errors during client creation

This ensures ClickHouse is properly configured for integration tests.

57-57: Appropriate MySQL setup after ClickHouse initialization

Starting MySQL after ClickHouse is a logical sequence, ensuring both databases are available before the test harness is fully initialized.

73-75: Proper field initialization in the Harness struct

The MySQL DSN and ClickHouse client are correctly assigned to the Harness struct fields, making them available for tests.

80-84: Well-structured API configuration with both database connections

The API configuration now properly includes both MySQL and ClickHouse DSNs, allowing the API to connect to both databases during tests. This change addresses the PR objective of enabling verification that the correct number of rows are sent to ClickHouse.
go/pkg/clickhouse/schema/databases/001_verifications/011_key_verifications_per_day_v2.sql (1)
1-14: Table schema looks well-designed for daily verification aggregations.

The table follows good ClickHouse design practices:
- Uses the SummingMergeTree engine, which is appropriate for aggregation tables
- Includes LowCardinality(String) for outcome, which optimizes for repeated string values
- Properly orders data by all dimensions to support efficient querying patterns
- Includes tags as an Array type for flexible categorization

go/pkg/clickhouse/schema/databases/000_README.md (1)
1-72: Comprehensive naming convention documentation.

This README provides clear and structured guidance for maintaining consistency in the ClickHouse schema. The conventions are well-defined and the examples effectively illustrate each pattern.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~18-~18: Loose punctuation mark.
Context: "...tion][version] ### Prefixes - `raw`: Input data tables - `tmp_{yourname}_`: ..." (UNLIKELY_OPENING_PUNCTUATION)
go/pkg/clickhouse/schema/databases/001_verifications/009_key_verifications_per_hour_v2.sql (1)
1-14: Table schema looks well-designed for hourly verification aggregations.

The table structure maintains consistency with other time-based aggregation tables while providing hourly granularity:
- Uses an identical column structure and types as the daily aggregation table
- Implements the same composite ordering key
- Appropriately uses SummingMergeTree for aggregation

go/apps/api/integration/multi_node_ratelimiting/run.go (3)
14-14: LGTM: Added ClickHouse import.

Added an import for the ClickHouse package to support the new verification steps.

134-137: LGTM: Improved variable naming.

Renamed the variable from err to callErr to avoid shadowing the earlier err variable.

166-201: LGTM: Added ClickHouse verification for rate limit data.

This Step 6 adds thorough verification of rate limit data in ClickHouse:
- Uses a well-structured aggregatedCounts type for query results
- Implements polling with require.Eventually to handle async writes
- Verifies total requests, successes and failures match expected counts
go/pkg/clickhouse/select.go (2)
9-61: Great documentation! Extensive and clear explanations.

The documentation for this function is excellent. It covers:
- Purpose and functionality
- Parameter substitution and SQL injection prevention
- Requirements for struct tags
- Context usage for timeouts
- Error handling expectations
- Example usage with real code
- Memory considerations for large result sets
- Thread-safety information
This level of detail makes the function self-documenting and assists future developers in using it correctly.
62-74: Clean implementation with proper error handling

The implementation is concise and follows the principle of minimal code for maximum functionality. The error handling is appropriate, wrapping and forwarding any errors from the underlying driver.
go/apps/api/routes/v2_ratelimit_limit/200_test.go (3)
13-14: Good addition of ClickHouse imports

The addition of ClickHouse-related imports aligns with the PR objective of integrating ClickHouse for data verification in tests.

65-109: Excellent implementation of ClickHouse verification test

This new test case effectively validates that rate limit events are properly persisted in ClickHouse. The test:
- Creates a namespace with a unique identifier
- Sends a rate limit request
- Queries ClickHouse using the new Select generic function
- Verifies the event data was correctly stored

The use of require.Eventually is appropriate for handling potential asynchronous data flushing to ClickHouse.
325-338: Good refactoring with helper function

Extracting the namespace creation logic into a helper function improves code maintainability and reduces duplication across test cases. This is a positive change that follows the DRY principle.
go/pkg/testutil/containers/api.go (3)
22-26: Good struct design for API configuration

Creating a structured configuration type is a better approach than passing multiple parameters. This design:
- Makes the API more maintainable
- Allows for future expansion without changing function signatures
- Makes the code more readable by grouping related parameters
28-28: Clean function signature refactoring

Updating the function signature to use the new config struct maintains backward compatibility while improving the API design.

56-75: Proper ClickHouse integration

The addition of the ClickHouse DSN to the environment variables ensures that the API containers can connect to ClickHouse. This integration is implemented correctly.
go/go.mod (2)
16-16: Appropriate addition of goose for database migrations

Adding github.com/pressly/goose/v3 is appropriate for managing database migrations, which aligns with the introduction of ClickHouse schema migrations.

22-22: ❓ Verification inconclusive

Dependency version update

Updating github.com/urfave/cli/v3 to v3.2.0 is likely necessary to support other changes in the codebase. Make sure to test for any breaking changes in the CLI functionality.

🏁 Script executed:

```shell
#!/bin/bash
# Check for any breaking changes in urfave/cli v3.2.0
gh api repos/urfave/cli/releases/tags/v3.2.0 --jq '.body' | grep -i "break"
```

Length of output: 73

Action Required: Verify CLI v3.2.0 for breaking changes

An automated scan of the GitHub release notes for "break" returned no matches, but keyword searches can miss subtler API changes. Please manually review the CHANGELOG or full release notes and exercise your CLI commands to ensure nothing has regressed.

• File: go/go.mod (line 22)
github.com/urfave/cli/v3 v3.2.0
Review threads on:
- go/pkg/clickhouse/schema/databases/005_business/003_active_workspaces_keys_per_month_mv_v1.sql
- go/pkg/clickhouse/schema/databases/002_ratelimits/007_ratelimits_per_day_mv_v1.sql
- go/pkg/clickhouse/schema/databases/002_ratelimits/006_ratelimits_per_day_v1.sql
- go/pkg/clickhouse/schema/databases/003_metrics/002_raw_api_requests_v1.sql
- go/pkg/clickhouse/schema/databases/001_verifications/002_raw_key_verifications_v1.sql
Thank you for following the naming conventions for pull request titles! 🙏
Actionable comments posted: 1
🧹 Nitpick comments (2)
go/pkg/testutil/containers/clickhouse.go (2)
142-177: Enhanced migration process using embedded filesystem

The migration process has been improved by using the embedded filesystem from the schema package. Some suggestions:
- Consider adding a comment that explicitly states migrations are applied in lexicographic order (as guaranteed by fs.WalkDir).
- It might be worth adding logging for each migration being applied.

```diff
 func runClickHouseMigrations(conn driver.Conn) error {
+	// Migrations are applied in lexicographic order as guaranteed by fs.WalkDir
 	return fs.WalkDir(schema.Migrations, ".", func(path string, d fs.DirEntry, err error) error {
 		if err != nil {
 			return err
 		}
 		if d.IsDir() {
 			return nil
 		}
+		// Apply migration from file
+		fmt.Printf("Applying ClickHouse migration: %s\n", path)
 		f, err := schema.Migrations.Open(path)
 		if err != nil {
 			return err
 		}
 		defer f.Close()
```
170-173: Verify SQL statement execution

Your code appends a semicolon to each SQL statement before execution. While this works correctly, it could result in double semicolons if the original statement already includes one. Consider checking whether the statement already ends with a semicolon before adding another.

```diff
-	err = conn.Exec(context.Background(), fmt.Sprintf("%s;", query))
+	// Add semicolon if not already present
+	if !strings.HasSuffix(query, ";") {
+		query = query + ";"
+	}
+	err = conn.Exec(context.Background(), query)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
go/pkg/testutil/containers/clickhouse.go (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
go/pkg/testutil/containers/clickhouse.go (2)
go/pkg/testutil/containers/containers.go (1)
Containers (13-17)

go/pkg/clickhouse/schema/embed.go (1)

Migrations (10-10)
🔇 Additional comments (2)
go/pkg/testutil/containers/clickhouse.go (2)
74-78: Well done fixing the migration execution order issue!

The code now correctly runs migrations before closing the connection, addressing the critical issue identified in the previous review. Good addition of execution timing information for diagnostics.

Also applies to: 127-128

86-87: Nice improvement with Docker network integration

The container is now explicitly connected to the Docker network, and two DSN strings are provided: one for host access and one for Docker network access. This facilitates inter-container communication in the test environment.
Also applies to: 96-97
What does this PR do?
We now run clickhouse in integration tests and assert the correct rows are inserted.
As an experiment, this also restructures the way we define clickhouse tables and how migrations would run.
We're trying this out with test containers first; it has no impact on production yet.
Type of change
How should this be tested?
Checklist
Required
- pnpm build
- pnpm fmt
- Removed console.logs
- git pull origin main

Appreciated
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Documentation
Dependency Updates
Refactor