Skip to content

feat(ch): use latest clickhouse exporter schema#104

Merged
Vunovati merged 5 commits into
mainfrom
feat/use-latest-collector
Apr 27, 2026
Merged

feat(ch): use latest clickhouse exporter schema#104
Vunovati merged 5 commits into
mainfrom
feat/use-latest-collector

Conversation

@Vunovati
Copy link
Copy Markdown
Collaborator

@Vunovati Vunovati commented Mar 20, 2026

Summary by CodeRabbit

  • New Features

    • Logs now include an optional EventName field — stored, returned in queries, and filterable.
  • Documentation

    • Migration guide added/updated with instructions and post-migration cleanup steps for EventName.
  • Chores

    • OpenTelemetry Collector reference bumped to v0.150.1; ClickHouse test image updated; DB schemas (ClickHouse & SQLite) updated to support EventName.
  • Tests

    • New unit, integration, and end-to-end tests covering ingestion, migration idempotency, querying, and filtering for EventName.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 20, 2026

📝 Walkthrough

Walkthrough

Adds an optional EventName field across ClickHouse and SQLite schemas, queries, filters, tests, and migrations; introduces migration and E2E tests that validate ingestion before/after migration; bumps OpenTelemetry Collector test/example references from 0.136.0 → 0.150.1 and retains the old version constant for compatibility testing.

Changes

Cohort / File(s) Summary
Version & Documentation
examples/clickhouse-observability-backend/README.md, examples/clickhouse-observability-backend/docker-compose.yml, packages/clickhouse-datasource/src/test/constants.ts
Bumps OTEL collector image/version to 0.150.1, adds OTEL_COLLECTOR_VERSION_OLD = "0.136.0", and updates ClickHouse image tag to 25.6-alpine.
ClickHouse Migration & Docs
packages/clickhouse-datasource/migrations/001_add_event_name_to_otel_logs.sql, packages/clickhouse-datasource/migrations/README.md
Adds migration to ALTER TABLE ... ADD COLUMN IF NOT EXISTS EventName String CODEC(ZSTD(1)); README documents execution, idempotency, and post-migration cleanup steps.
ClickHouse Schemas & Queries
packages/clickhouse-datasource/src/ch-row-schemas.ts, packages/clickhouse-datasource/src/query-logs.ts
Log row schema accepts optional EventName; query builder projects EventName and applies equality filter when filter.eventName provided.
ClickHouse Tests
packages/clickhouse-datasource/src/ch-row-schemas.test.ts, packages/clickhouse-datasource/src/datasource.test.ts, packages/clickhouse-datasource/src/e2e.integration.test.ts, packages/clickhouse-datasource/src/migration.integration.test.ts
Fixtures updated/added and new integration tests validate ingestion before/after migration, idempotency, read compatibility, and filtering by eventName; seeding updated to include an EventName value for one row.
SQLite Schema & Implementation
packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.sql, packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.ts, packages/sqlite-datasource/src/db-types.ts, packages/sqlite-datasource/src/db-datasource.ts
Adds EventName to SQLite DDL (non-null default '') and types; includes EventName in selects, filters, and mapping; adds toOptionalString() helper to map empty strings → undefined.
Core Schemas
packages/core/src/data-filters-zod.ts, packages/core/src/denormalized-signals-zod.ts
Adds optional eventName to logsDataFilterSchema and optional EventName to otelLogsSchema (affects inferred types).
Test Utilities / Payloads
packages/clickhouse-datasource/src/test/otel-payloads.ts
Adds TEST_EVENT_NAME, TEST_EVENT_SERVICE_NAME, and createLogsPayloadWithEventName() to generate OTLP payloads containing eventName.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant OldCollector as Old OTEL Collector (v0.136.0)
    participant NewCollector as New OTEL Collector (v0.150.1)
    participant ClickHouse as ClickHouse DB
    participant Migrator as Migration Runner
    participant Datasource as Read Datasource

    OldCollector->>ClickHouse: send OTLP logs (no eventName)
    ClickHouse-->>Datasource: stored rows (no EventName column)
    Migrator->>ClickHouse: run migration SQL (add EventName column)
    ClickHouse-->>Migrator: migration applied (idempotent)
    NewCollector->>ClickHouse: send OTLP logs (with eventName)
    ClickHouse-->>Datasource: rows stored (new rows include EventName)
    Datasource->>ClickHouse: query with optional eventName filter
    ClickHouse-->>Datasource: return rows (EventName present or undefined)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐇 I hopped through rows and added a name,
From ClickHouse halls to SQLite's frame,
Old collectors bubbled, new ones sang true,
Migration marched on and tests passed through,
I nibble on filters — event names, hooray!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 29.41% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: updating to use the latest ClickHouse exporter schema (v0.150.1), which is reflected across version updates, database migrations, and schema extensions throughout the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/use-latest-collector

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
packages/clickhouse-datasource/src/ch-row-schemas.test.ts (1)

83-83: Add a direct assertion for EventName transform in this unit test.

Line 83 updates the fixture, but this test block doesn’t verify the parsed EventName value. Adding one assertion will lock in the expected "" -> undefined behavior here too.

✅ Suggested assertion
   expect(result.Body).toBe("something failed");
+  expect(result.EventName).toBeUndefined();
   expect(result.ResourceSchemaUrl).toBeUndefined();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/clickhouse-datasource/src/ch-row-schemas.test.ts` at line 83, Add a
direct assertion in the test that uses the fixture with EventName: "" to verify
the transform produces undefined; after the row is parsed (the variable commonly
named result or parsed in this test block), add an assertion that
result.EventName is undefined (e.g., expect(parsed.EventName).toBeUndefined())
to lock the "" -> undefined behavior for the EventName transform.
packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.sql (1)

34-35: Consider indexing EventName for new log filters.

With EventName added at Line 35 and filter support introduced, an index can prevent full scans on larger SQLite datasets.

⚡ Suggested index
 CREATE INDEX IF NOT EXISTS idx_otel_logs_service_time ON otel_logs (ServiceName, TimestampTime);
 CREATE INDEX IF NOT EXISTS idx_otel_logs_trace_id ON otel_logs (TraceId) WHERE TraceId != '';
 CREATE INDEX IF NOT EXISTS idx_otel_logs_severity ON otel_logs (SeverityNumber);
+CREATE INDEX IF NOT EXISTS idx_otel_logs_event_name ON otel_logs (EventName);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.sql` around lines 34
- 35, Add a non-unique index on the EventName column to speed up event-based
filters: after the table definition that declares LogAttributes and EventName,
create an index (e.g., idx_logs_eventname or similar) on the EventName column so
queries filtering by EventName avoid full-table scans; reference the existing
column name "EventName" when adding the index and ensure the index creation runs
as part of the same migration/DDL that creates the table.
packages/clickhouse-datasource/src/test/otel-payloads.ts (1)

156-160: Reuse the shared resource fixture here.

This inline resource.attributes copy drops deployment.environment, so the EventName path stops exercising the same resource mapping as the rest of this file. Composing from resourceAttributes and overriding only the service-specific keys will keep this helper representative and avoid fixture drift.

♻️ Suggested refactor
         resource: {
           attributes: [
-            otelAttr("service.name", TEST_EVENT_SERVICE_NAME),
-            otelAttr("service.version", "2.0.0"),
+            ...resourceAttributes.filter(
+              ({ key }) => key !== "service.name" && key !== "service.version"
+            ),
+            otelAttr("service.name", TEST_EVENT_SERVICE_NAME),
+            otelAttr("service.version", "2.0.0"),
           ],
         },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/clickhouse-datasource/src/test/otel-payloads.ts` around lines 156 -
160, The inline resource.attributes block in the test helper should reuse the
shared fixture resourceAttributes instead of redefining the array; update the
resource object in otel-payloads.ts to compose from the existing
resourceAttributes and then override only service.name and service.version (use
TEST_EVENT_SERVICE_NAME and the "2.0.0" value), so that deployment.environment
and other shared attributes from resourceAttributes are preserved; locate the
place that currently constructs resource: { attributes: [
otelAttr("service.name", TEST_EVENT_SERVICE_NAME), otelAttr("service.version",
"2.0.0"), ] } and replace it with a composition that spreads or concatenates
resourceAttributes plus the two service-specific otelAttr entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/clickhouse-datasource/src/e2e.integration.test.ts`:
- Around line 320-330: The test currently loops over result.data and asserts
EventName is undefined but can vacuously pass when result.data is empty; update
the it block to first assert that result.data is non-empty (e.g.,
expect(result.data.length).toBeGreaterThan(0) or
expect(result.data).not.toHaveLength(0)) after calling ds.getLogs so the
subsequent loop over result.data meaningfully validates EventName for returned
logs (refer to ds.getLogs, TEST_SERVICE_NAME, requestContext(), and
result.data).

In `@packages/clickhouse-datasource/src/migration.integration.test.ts`:
- Around line 67-75: The filter that builds `statements` incorrectly treats
multi-line chunks as non-comment when inner lines have leading whitespace;
update the filter predicate (the callback after `sql.split(";").map((s) =>
s.trim())`) so each line in the chunk is trimmed before checking for
comment-only, i.e. change the `every((l) => l.startsWith("--") || l.trim() ===
"")` check to trim `l` first (e.g. `every((l) => l.trim().startsWith("--") ||
l.trim() === "")`) so formatted migrations like `-- first\n  -- second` are
recognized as comments and filtered out.
- Around line 93-102: The waitForData helper should tolerate transient errors
thrown by the provided check function (e.g., countLogs) instead of letting the
first exception abort the wait; update waitForData to call await check() inside
a try/catch and on error swallow the exception (optionally log) and continue
polling until the timeout expires, returning only when check() resolves true or
throwing the timeout error otherwise.

In `@packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.sql`:
- Line 2: The header comment "-- Migrated from ClickHouse OTEL exporter schema
v0.148.0" and the later source reference that points to "...v136.sql" are
inconsistent; update the source reference to match the header version (or vice
versa) so both reference the same schema version—e.g., change the "...v136.sql"
reference to "...v0.148.0.sql" (or change the header to v0.136) ensuring the two
strings are identical; look for the exact comment line and the source reference
in sqlite-opentelemetry-ddl.sql to make the change.

---

Nitpick comments:
In `@packages/clickhouse-datasource/src/ch-row-schemas.test.ts`:
- Line 83: Add a direct assertion in the test that uses the fixture with
EventName: "" to verify the transform produces undefined; after the row is
parsed (the variable commonly named result or parsed in this test block), add an
assertion that result.EventName is undefined (e.g.,
expect(parsed.EventName).toBeUndefined()) to lock the "" -> undefined behavior
for the EventName transform.

In `@packages/clickhouse-datasource/src/test/otel-payloads.ts`:
- Around line 156-160: The inline resource.attributes block in the test helper
should reuse the shared fixture resourceAttributes instead of redefining the
array; update the resource object in otel-payloads.ts to compose from the
existing resourceAttributes and then override only service.name and
service.version (use TEST_EVENT_SERVICE_NAME and the "2.0.0" value), so that
deployment.environment and other shared attributes from resourceAttributes are
preserved; locate the place that currently constructs resource: { attributes: [
otelAttr("service.name", TEST_EVENT_SERVICE_NAME), otelAttr("service.version",
"2.0.0"), ] } and replace it with a composition that spreads or concatenates
resourceAttributes plus the two service-specific otelAttr entries.

In `@packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.sql`:
- Around line 34-35: Add a non-unique index on the EventName column to speed up
event-based filters: after the table definition that declares LogAttributes and
EventName, create an index (e.g., idx_logs_eventname or similar) on the
EventName column so queries filtering by EventName avoid full-table scans;
reference the existing column name "EventName" when adding the index and ensure
the index creation runs as part of the same migration/DDL that creates the
table.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b5810cd1-8e32-4725-8dda-d7da53a285b9

📥 Commits

Reviewing files that changed from the base of the PR and between 7e911c3 and 6b7f6ae.

⛔ Files ignored due to path filters (1)
  • packages/ui/src/lib/__snapshots__/generate-prompt-instructions.test.ts.snap is excluded by !**/*.snap
📒 Files selected for processing (18)
  • examples/clickhouse-observability-backend/README.md
  • examples/clickhouse-observability-backend/docker-compose.yml
  • packages/clickhouse-datasource/migrations/001_add_event_name_to_otel_logs.sql
  • packages/clickhouse-datasource/migrations/README.md
  • packages/clickhouse-datasource/src/ch-row-schemas.test.ts
  • packages/clickhouse-datasource/src/ch-row-schemas.ts
  • packages/clickhouse-datasource/src/datasource.test.ts
  • packages/clickhouse-datasource/src/e2e.integration.test.ts
  • packages/clickhouse-datasource/src/migration.integration.test.ts
  • packages/clickhouse-datasource/src/query-logs.ts
  • packages/clickhouse-datasource/src/test/constants.ts
  • packages/clickhouse-datasource/src/test/otel-payloads.ts
  • packages/core/src/data-filters-zod.ts
  • packages/core/src/denormalized-signals-zod.ts
  • packages/sqlite-datasource/src/db-datasource.ts
  • packages/sqlite-datasource/src/db-types.ts
  • packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.sql
  • packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.ts

Comment thread packages/clickhouse-datasource/src/e2e.integration.test.ts
Comment thread packages/clickhouse-datasource/src/migration.integration.test.ts
Comment thread packages/clickhouse-datasource/src/migration.integration.test.ts
@@ -1,5 +1,5 @@
-- OpenTelemetry SQLite Schema
-- Migrated from ClickHouse OTEL exporter schema v0.136.0
-- Migrated from ClickHouse OTEL exporter schema v0.148.0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix version/source comment mismatch.

Line 2 says v0.148.0, but the source reference later still points to ...v136.sql. Please align both comments to the same schema version.

🛠️ Suggested edit
--- Source: /k8s/migrations/clickhouse/002-otel-schema-v136.sql
+++ Source: /k8s/migrations/clickhouse/002-otel-schema-v148.sql
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.sql` at line 2, The
header comment "-- Migrated from ClickHouse OTEL exporter schema v0.148.0" and
the later source reference that points to "...v136.sql" are inconsistent; update
the source reference to match the header version (or vice versa) so both
reference the same schema version—e.g., change the "...v136.sql" reference to
"...v0.148.0.sql" (or change the header to v0.136) ensuring the two strings are
identical; look for the exact comment line and the source reference in
sqlite-opentelemetry-ddl.sql to make the change.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/clickhouse-datasource/src/migration.integration.test.ts (1)

172-224: Clean up the temporary collector config directory.

mkdtemp() allocates a fresh directory under the OS temp root, but this suite never removes it. Repeated runs will keep leaving migration-otel-* behind.

♻️ Suggested cleanup
   let clickhouseContainer: StartedTestContainer;
   let adminClient: ClickHouseClient;
   let ds: ClickHouseReadDatasource;
+  let tmpDir: string | undefined;
   let configPath: string;
   let chBaseUrl: string;
@@
-    const tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), "migration-otel-"));
+    tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), "migration-otel-"));
     configPath = path.join(tmpDir, "config.yaml");
     await fs.writeFile(configPath, createOtelCollectorConfig(), "utf-8");
@@
   afterAll(async () => {
     await ds?.close();
     await adminClient?.close();
     if (clickhouseContainer) await clickhouseContainer.stop();
     if (network) await network.stop();
+    if (tmpDir) await fs.rm(tmpDir, { recursive: true, force: true });
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/clickhouse-datasource/src/migration.integration.test.ts` around
lines 172 - 224, The test creates a temporary directory via fs.mkdtemp (tmpDir)
and writes configPath into it but never removes it; modify the teardown in
afterAll to delete the temp file and directory (remove configPath and then
tmpDir) using fs.rm/fs.rmdir (with recursive:true or force:true as appropriate)
after closing ds and adminClient and stopping containers, ensuring any errors
are handled or ignored so cleanup always runs; reference tmpDir, configPath,
beforeAll, afterAll, fs.mkdtemp, and fs.writeFile when locating the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/clickhouse-datasource/src/migration.integration.test.ts`:
- Around line 250-251: The test currently waits for any row by calling
waitForData(() => countLogs(adminClient) > 0) which races the exporter; change
it to poll for the full Phase 1 baseline by computing the expected count (e.g.,
derive from createLogsPayload() or a constant baselineCount) and use
waitForData(async () => (await countLogs(adminClient)) === expectedCount) or >=
expectedCount; update the call site where waitForData and countLogs are used in
the migration.integration.test so the test only proceeds once the expected
baseline rows are present.

---

Nitpick comments:
In `@packages/clickhouse-datasource/src/migration.integration.test.ts`:
- Around line 172-224: The test creates a temporary directory via fs.mkdtemp
(tmpDir) and writes configPath into it but never removes it; modify the teardown
in afterAll to delete the temp file and directory (remove configPath and then
tmpDir) using fs.rm/fs.rmdir (with recursive:true or force:true as appropriate)
after closing ds and adminClient and stopping containers, ensuring any errors
are handled or ignored so cleanup always runs; reference tmpDir, configPath,
beforeAll, afterAll, fs.mkdtemp, and fs.writeFile when locating the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 374eff5f-14d5-4f55-9415-076675e5fcfc

📥 Commits

Reviewing files that changed from the base of the PR and between 6b7f6ae and 787e927.

📒 Files selected for processing (2)
  • packages/clickhouse-datasource/src/e2e.integration.test.ts
  • packages/clickhouse-datasource/src/migration.integration.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/clickhouse-datasource/src/e2e.integration.test.ts

Comment on lines +250 to +251
// Wait for ingestion using raw count
await waitForData(async () => (await countLogs(adminClient)) > 0);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Wait for the full Phase 1 baseline here.

This returns as soon as the first row is visible, but Line 260 asserts both rows from createLogsPayload(). Poll for the expected baseline count so this does not race the exporter flush.

🐛 Suggested fix
-      await waitForData(async () => (await countLogs(adminClient)) > 0);
+      await waitForData(async () => (await countLogs(adminClient)) >= 2);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Wait for ingestion using raw count
await waitForData(async () => (await countLogs(adminClient)) > 0);
// Wait for ingestion using raw count
await waitForData(async () => (await countLogs(adminClient)) >= 2);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/clickhouse-datasource/src/migration.integration.test.ts` around
lines 250 - 251, The test currently waits for any row by calling waitForData(()
=> countLogs(adminClient) > 0) which races the exporter; change it to poll for
the full Phase 1 baseline by computing the expected count (e.g., derive from
createLogsPayload() or a constant baselineCount) and use waitForData(async () =>
(await countLogs(adminClient)) === expectedCount) or >= expectedCount; update
the call site where waitForData and countLogs are used in the
migration.integration.test so the test only proceeds once the expected baseline
rows are present.

…ctor

# Conflicts:
#	packages/clickhouse-datasource/src/test/constants.ts
#	packages/ui/src/lib/__snapshots__/generate-prompt-instructions.test.ts.snap
#	pnpm-lock.yaml
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.ts (1)

13-13: ⚠️ Potential issue | 🟡 Minor

Update the stale source reference.

The header says the schema is migrated from v0.150.1, but the Source: comment still points to 002-otel-schema-v136.sql. Either rename/repoint the source file or update this comment to keep the provenance accurate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.ts` at line 13,
Update the stale source comment in sqlite-opentelemetry-ddl.ts so the provenance
matches the header stating the schema is migrated from v0.150.1: locate the line
containing the SQL source comment string "-- Source:
/k8s/migrations/clickhouse/002-otel-schema-v136.sql" and either change the
path/filename to the correct migration file for v0.150.1 or remove/replace the
comment with the correct source reference; ensure the header and the Source
comment consistently reference the same migration version.
🧹 Nitpick comments (3)
packages/clickhouse-datasource/src/migration.integration.test.ts (3)

158-165: Prefer parameterized query over string interpolation.

CH_DATABASE is a constant today so this is not exploitable, but using query_params keeps the helper safe if the value ever becomes dynamic and is consistent with how query-logs.ts builds parameterized statements.

♻️ Suggested fix
   const result = await client.query({
-    query: `SELECT name FROM system.columns WHERE database = '${CH_DATABASE}' AND table = 'otel_logs' AND name = 'EventName'`,
+    query:
+      "SELECT name FROM system.columns WHERE database = {db:String} AND table = 'otel_logs' AND name = 'EventName'",
+    query_params: { db: CH_DATABASE },
     format: "JSONEachRow",
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/clickhouse-datasource/src/migration.integration.test.ts` around
lines 158 - 165, The helper hasEventNameColumn currently interpolates
CH_DATABASE into the SQL string; change it to a parameterized query like in
query-logs.ts by using the client's query_params option and a named placeholder
for the database (e.g., {db}) in the SQL, passing CH_DATABASE as the parameter
value, so the query becomes parameterized and safe if CH_DATABASE ever becomes
dynamic while keeping the rest of the function (query call and result.json
usage) unchanged.

211-224: Temp directory created in beforeAll is never cleaned up.

fs.mkdtemp(...) at Line 212 leaves a directory behind under the OS temp path on every run. Consider tracking it and removing it in afterAll for hygiene on developer machines and CI runners.

♻️ Suggested fix
   let configPath: string;
+  let tmpDir: string;
   let chBaseUrl: string;
@@
-    const tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), "migration-otel-"));
+    tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), "migration-otel-"));
     configPath = path.join(tmpDir, "config.yaml");
@@
   afterAll(async () => {
     await ds?.close();
     await adminClient?.close();
     if (clickhouseContainer) await clickhouseContainer.stop();
     if (network) await network.stop();
+    if (tmpDir) await fs.rm(tmpDir, { recursive: true, force: true });
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/clickhouse-datasource/src/migration.integration.test.ts` around
lines 211 - 224, The temp directory created with fs.mkdtemp (tmpDir) in the
beforeAll setup is never removed; update the afterAll teardown to remove that
directory (and the config file) to avoid accumulating temp files: reference the
tmpDir and configPath variables and call the appropriate async removal (e.g.,
fs.rm or fs.rmdir with recursive=true or force options) inside afterAll after
closing ds/adminClient and stopping containers; ensure removal is awaited and
guarded (check tmpDir exists) so tests still tear down cleanly.

67-77: SQL splitter is brittle for future migrations.

Splitting on ; works for the current single-statement migration but will break the moment a migration contains a semicolon inside a string literal, comment, or function body (e.g. a DEFAULT 'a;b' or a CTE). It’s fine for now, but worth noting since the helper is presented as a generic migration runner — please prefer a real SQL splitter (or one statement per file convention) before adding more complex DDL.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/clickhouse-datasource/src/migration.integration.test.ts` around
lines 67 - 77, The current SQL splitting logic that builds the statements array
by splitting on ";" (variable statements) is brittle and will break on
semicolons inside strings/comments; replace this with a robust approach—either
integrate a SQL-aware splitter/parser (e.g., a library that tokenizes SQL and
respects string literals/comments) to produce individual statements before
executing them, or change the migration runner to require one-statement-per-file
and load/execute files one-by-one; update the code paths that reference
statements to use the new splitter/parser API or the per-file loading logic (the
variable/statements construction and the migration execution loop) so semicolons
inside literals or comments no longer break migrations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.ts`:
- Line 13: Update the stale source comment in sqlite-opentelemetry-ddl.ts so the
provenance matches the header stating the schema is migrated from v0.150.1:
locate the line containing the SQL source comment string "-- Source:
/k8s/migrations/clickhouse/002-otel-schema-v136.sql" and either change the
path/filename to the correct migration file for v0.150.1 or remove/replace the
comment with the correct source reference; ensure the header and the Source
comment consistently reference the same migration version.

---

Nitpick comments:
In `@packages/clickhouse-datasource/src/migration.integration.test.ts`:
- Around line 158-165: The helper hasEventNameColumn currently interpolates
CH_DATABASE into the SQL string; change it to a parameterized query like in
query-logs.ts by using the client's query_params option and a named placeholder
for the database (e.g., {db}) in the SQL, passing CH_DATABASE as the parameter
value, so the query becomes parameterized and safe if CH_DATABASE ever becomes
dynamic while keeping the rest of the function (query call and result.json
usage) unchanged.
- Around line 211-224: The temp directory created with fs.mkdtemp (tmpDir) in
the beforeAll setup is never removed; update the afterAll teardown to remove
that directory (and the config file) to avoid accumulating temp files: reference
the tmpDir and configPath variables and call the appropriate async removal
(e.g., fs.rm or fs.rmdir with recursive=true or force options) inside afterAll
after closing ds/adminClient and stopping containers; ensure removal is awaited
and guarded (check tmpDir exists) so tests still tear down cleanly.
- Around line 67-77: The current SQL splitting logic that builds the statements
array by splitting on ";" (variable statements) is brittle and will break on
semicolons inside strings/comments; replace this with a robust approach—either
integrate a SQL-aware splitter/parser (e.g., a library that tokenizes SQL and
respects string literals/comments) to produce individual statements before
executing them, or change the migration runner to require one-statement-per-file
and load/execute files one-by-one; update the code paths that reference
statements to use the new splitter/parser API or the per-file loading logic (the
variable/statements construction and the migration execution loop) so semicolons
inside literals or comments no longer break migrations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c95eb907-026e-4da3-8e6c-ea43c9ec3607

📥 Commits

Reviewing files that changed from the base of the PR and between 48145d4 and e819676.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (7)
  • examples/clickhouse-observability-backend/README.md
  • examples/clickhouse-observability-backend/docker-compose.yml
  • packages/clickhouse-datasource/src/e2e.integration.test.ts
  • packages/clickhouse-datasource/src/migration.integration.test.ts
  • packages/clickhouse-datasource/src/test/constants.ts
  • packages/core/src/data-filters-zod.ts
  • packages/sqlite-datasource/src/sqlite-opentelemetry-ddl.ts
✅ Files skipped from review due to trivial changes (2)
  • examples/clickhouse-observability-backend/README.md
  • examples/clickhouse-observability-backend/docker-compose.yml
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/clickhouse-datasource/src/e2e.integration.test.ts
  • packages/clickhouse-datasource/src/test/constants.ts

@Vunovati Vunovati merged commit f1fe701 into main Apr 27, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant