Feat/clickhouse read datasource#55
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR extends API route handlers to propagate request context through datasource method calls and introduces a new ClickHouse datasource package with query builders, row schema parsing, and comprehensive testing for traces, logs, and metrics querying. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant APIRoute as API Route Handler
participant Datasource as ReadTelemetryDatasource
participant QueryBuilder as Query Builder
participant ClickHouse as ClickHouse Server
participant RowParser as Row Parser
Client->>APIRoute: HTTP Request<br/>(with auth context)
APIRoute->>APIRoute: Extract requestContext<br/>from req
APIRoute->>Datasource: getTraces/getLogs/getMetrics<br/>({...filter, requestContext})
Datasource->>Datasource: Validate requestContext
Datasource->>QueryBuilder: buildTracesQuery/etc<br/>(filter with context)
QueryBuilder->>QueryBuilder: Apply multi-tenant<br/>filters via requestContext
QueryBuilder->>ClickHouse: Execute parameterized query
ClickHouse->>ClickHouse: Execute with context-specific<br/>database/credentials
ClickHouse-->>RowParser: Stream raw rows
RowParser->>RowParser: Parse & coerce types<br/>via schema
RowParser-->>Datasource: Typed data rows
Datasource->>Datasource: Apply pagination<br/>(cursor/limit)
Datasource-->>APIRoute: {data, nextCursor}
APIRoute-->>Client: JSON Response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (10)
packages/clickhouse-datasource/src/query-traces.ts (2)
76-120: Consider extracting the repeated attribute-filter loop into a helper.The four attribute blocks (span, resource, events, links) share identical logic — iterate entries, validate key, push condition, store param — differing only in the column expression and param prefix. A small helper would eliminate the duplication and reduce the chance of a copy-paste divergence.
♻️ Example helper extraction
+type AttrFilterOpts = { + column: string; // e.g. "SpanAttributes" or template for arrayExists + paramPrefix: string; // e.g. "spanAttrVal" + isArray?: boolean; // use arrayExists wrapper +}; + +function addAttributeFilters( + attrs: Record<string, string>, + opts: AttrFilterOpts, + conditions: string[], + params: Record<string, unknown>, +): void { + let i = 0; + for (const [key, value] of Object.entries(attrs)) { + validateAttributeKey(key); + const paramName = `${opts.paramPrefix}${i}`; + if (opts.isArray) { + conditions.push( + `arrayExists(x -> x['${key}'] = {${paramName}:String}, \`${opts.column}\`)` + ); + } else { + conditions.push(`${opts.column}['${key}'] = {${paramName}:String}`); + } + params[paramName] = value; + i++; + } +}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/query-traces.ts` around lines 76 - 120, Extract the repeated attribute-filter loop into a small helper (e.g., addAttributeFilters) that accepts the attributes object, the column expression string (like "SpanAttributes", "ResourceAttributes", "`Events.Attributes`", "`Links.Attributes`") and a param prefix ("spanAttrVal", "resAttrVal", "evtAttrVal", "lnkAttrVal"); inside the helper iterate Object.entries(attributes), call validateAttributeKey(key), push the constructed condition into the existing conditions array and add the param into the params object using the prefixed incremental index; then replace the four duplicated blocks by calling this helper with filter.spanAttributes/filter.resourceAttributes/filter.eventsAttributes/filter.linksAttributes respectively and the appropriate column expression and prefix.
122-143: Cursor parsing assumes a specific format — consider validating the timestamp portion.If the timestamp portion of the cursor (before the first
:) is not a valid numeric string,nanosToDateTime64→BigInt()will throw a rawSyntaxError. A targeted validation or try/catch with a domain-specific error message would improve debuggability for callers.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/query-traces.ts` around lines 122 - 143, The cursor parsing block that slices filter.cursor into cursorTs and cursorSpanId and then calls nanosToDateTime64(cursorTs) can throw a raw SyntaxError if cursorTs is not a valid numeric string; update the code in the cursor-handling branch (where filter.cursor, cursorTs, cursorSpanId, params.cursorTs, and nanosToDateTime64 are used) to validate cursorTs (e.g., ensure it matches /^\d+$/ or Number.isFinite/BigInt-safe check) or wrap nanosToDateTime64(cursorTs) in a try/catch and throw a descriptive Error like "Invalid cursor timestamp: expected integer nanoseconds" so callers get a domain-specific message instead of a raw SyntaxError.packages/api/src/index.ts (1)
33-37: Consider narrowingrequestContextbeyondunknown.Using
unknownforces every consumer to assert/narrow independently and provides no compile-time guidance on what the context actually contains. If there's a known shape (e.g., auth info, tenant ID), defining a minimal interface would improve type safety across route handlers and datasource calls.If the shape is genuinely open-ended, this is fine for now — but a
TODOcomment documenting the intent to tighten it later would help.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/api/src/index.ts` around lines 33 - 37, The declaration merges into FastifyRequest currently types requestContext as unknown; replace unknown with a narrow interface (e.g., define an exported RequestContext interface containing fields you expect such as userId?: string, tenantId?: string, auth?: { roles: string[] } or other minimal known properties) and use that interface in the declare module block for FastifyRequest.requestContext; if the shape is intentionally open-ended, add a TODO comment above the declaration noting the intent to tighten RequestContext later and export a generic RequestContext type so callers can extend it safely.packages/collector/package.json (1)
41-41: Fastify 5.7.4 exists and is the current latest version. The version is valid and consistently pinned across all packages (api, app, collector).Bumping the peer dependency floor from
^5.0.0to^5.7.4is intentional and synchronized across the monorepo. However, the specific reason for this minimum version is not documented in the codebase. If this version was chosen to address a particular bug fix or feature requirement, consider adding a brief comment in the package.json or a changelog entry to help future maintainers understand the constraint.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/collector/package.json` at line 41, The package.json dependency pin for "fastify" was intentionally raised to ^5.7.4 across the monorepo but the reason isn't documented; add a short note explaining why this minimum (e.g., bug fix, security patch, or required feature) was chosen—either as a comment in packages/collector/package.json near the "fastify" entry or as a changelog/README line in the collector package; reference the "fastify" dependency and the package.json file so future maintainers can trace that the floor was raised to ^5.7.4 for that specific reason.packages/clickhouse-datasource/src/query-metrics.ts (1)
4-11:ATTRIBUTE_KEY_PATTERNandvalidateAttributeKeyare duplicated across query builders.The same regex and validation function exist in
query-logs.ts(and likelyquery-traces.ts). Extract to a shared module to avoid drift.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/query-metrics.ts` around lines 4 - 11, Duplicate ATTRIBUTE_KEY_PATTERN and validateAttributeKey should be extracted into a single shared utility module: create a new module (e.g., attribute-utils or validation utilities) exporting ATTRIBUTE_KEY_PATTERN and validateAttributeKey, replace the local definitions in query-metrics.ts, query-logs.ts (and query-traces.ts) with imports from that shared module, and update any references to still use the same symbol names so behavior remains unchanged; ensure tests/consumers import the new module and remove the duplicated code blocks from each query-* file.packages/clickhouse-datasource/src/row-mappers.ts (1)
107-112:MetricTypeis redefined locally — consider importing from@kopai/core.
datasource.MetricTypeis already exported frompackages/core/src/telemetry-datasource.tswith the same union. Importing it avoids drift if the set of metric types changes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/row-mappers.ts` around lines 107 - 112, Replace the locally redefined MetricType union with the shared type exported from `@kopai/core`: remove the local "MetricType" declaration and add an import for the exported MetricType (previously referenced as datasource.MetricType) and update any usages to reference that imported symbol so the file uses the single source of truth for metric types (ensure import name matches the export and update any type annotations that used the local MetricType).packages/clickhouse-datasource/src/attributes.ts (1)
8-16: Minor:Number()coerces hex and scientific notation strings.
Number("0x1A")→26,Number("1e308")→1e308are both finite and will be coerced to numbers. If OTel attribute values can contain hex-prefixed or scientific-notation strings that should remain as strings, you may want to tighten the numeric check (e.g., use a regex orparseFloat+ strict comparison).Likely not a problem for typical OTel data, so flagging as optional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/attributes.ts` around lines 8 - 16, coerceAttributeValue currently uses Number(value) which will coerce hex and scientific-notation strings into numbers; update the function to only coerce when the string strictly matches a simple decimal/integer pattern. Replace the Number-based check in coerceAttributeValue with a stricter numeric test (e.g., a regex like /^-?\d+(\.\d+)?$/ or parseFloat plus strict string equality) so strings like "0x1A" or "1e308" remain strings; keep existing handling for "true"/"false" and empty string behavior in AttributeValue.packages/clickhouse-datasource/src/types.ts (1)
14-37: Well-structured assertion guard.The two-phase validation (structural then type check) is clean. Consider whether empty-string values for
databaseorusernameshould be rejected, since they'd likely cause ClickHouse connection failures at runtime. Currently onlypasswordhas a legitimate use case for empty strings (as shown in tests).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/types.ts` around lines 14 - 37, Update assertClickHouseRequestContext to reject empty-string values for database and username while still allowing an empty password: inside the function (assertClickHouseRequestContext) after confirming types, add checks that obj.database.trim().length > 0 and obj.username.trim().length > 0 and throw the same Error("requestContext must provide { database, username, password }") if either is empty; keep password validation as-is to permit empty passwords per tests.packages/clickhouse-datasource/src/datasource.test.ts (1)
78-241: MissingfinallyfordbClientcleanup inbeforeAll.If any
CREATE TABLEorseed*call throws,dbClient(line 82) is never closed. While the container teardown inafterAllwill ultimately clean up, wrapping the setup in try/finally is cleaner.🔧 Suggested fix
const dbClient = createClient({ url: baseUrl, database: TEST_DATABASE, username: "default", password: "", }); - // Create traces table - await dbClient.command({ - ... - await seedMultiAttrMetric(dbClient); - - await dbClient.close(); + try { + // Create traces table + await dbClient.command({ + ... + await seedMultiAttrMetric(dbClient); + } finally { + await dbClient.close(); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/datasource.test.ts` around lines 78 - 241, The setup in beforeAll opens dbClient and runs multiple CREATE TABLE and seed* calls (seedTraces, seedLogs, seedMetrics, seedTruncationMetric, seedMultiAttrMetric) but doesn't guarantee dbClient.close() on error; wrap the body that uses dbClient in a try/finally (after creating dbClient) so that in the finally block you always await dbClient.close(); ensure you keep the existing adminClient.command/database creation outside or unchanged and only move the cleanup of dbClient into the finally to guarantee closure on failures.packages/clickhouse-datasource/src/datasource.ts (1)
65-99: Creating a new ClickHouse client per method call is costly.Every call to
getTraces,getLogs,getMetrics, anddiscoverMetricsinstantiates a new@clickhouse/clientand tears it down infinally. This forfeits HTTP connection pooling and adds connection-establishment overhead on every API request.Consider a lightweight LRU cache keyed by
(database, username)that reuses clients for the same tenant, with a TTL-based eviction to limit open connections. This would significantly reduce latency under load while still supporting per-tenant credentials.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/datasource.ts` around lines 65 - 99, The code currently creates a new ClickHouse client per call (see createClient usage in getTraces and the assertClickHouseRequestContext flow) which prevents HTTP connection reuse; implement a small LRU cache keyed by `${database}:${username}` that returns an existing client or creates one if absent, with a configurable TTL and max size, and ensure clients are closed when evicted; update getTraces, getLogs, getMetrics, and discoverMetrics to obtain the client from this cache instead of calling createClient directly, and remove the per-call client.close() in finally (instead ensure client.close() is invoked on eviction or service shutdown).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/clickhouse-datasource/src/datasource.ts`:
- Around line 189-201: discoverMetrics currently calls
assertClickHouseRequestContext(ctx) which will throw when options or
options.requestContext is undefined (contradicting the interface that marks
requestContext optional); fix by either (A) making the parameter require a
requestContext in the discoverMetrics signature or (B) adding an explicit guard
at the top of discoverMetrics that checks if ctx is undefined and throws a
clear, actionable error (e.g., "discoverMetrics requires a ClickHouse
requestContext"), then only call assertClickHouseRequestContext(ctx) after that
guard so code and contract are consistent; reference discoverMetrics and
assertClickHouseRequestContext when making the change.
In `@packages/clickhouse-datasource/src/query-metrics.ts`:
- Around line 226-239: The query uses arrayJoin(mapKeys(...)) twice which relies
on unsafe CSE; update the template strings pushed into attrUnions to compute the
expanded key once and reuse it: perform a single ARRAY JOIN (or a single
arrayJoin(...) AS attr_key in the SELECT) over mapKeys(Attributes) /
mapKeys(ResourceAttributes) and then reference the map value using
Attributes[attr_key] / ResourceAttributes[attr_key]; change the two blocks that
build the strings in the attrUnions array (the templates that currently call
arrayJoin(mapKeys(Attributes)) and arrayJoin(mapKeys(ResourceAttributes))) to
use this single-array-join pattern so the key is computed once and reused.
In `@packages/clickhouse-datasource/src/query-traces.ts`:
- Line 19: The current builder sets const limit = filter.limit ?? 100 and later
uses limit + 1 which can overflow ClickHouse UInt32; clamp and sanitize the
incoming limit in query-traces.ts by defining a MAX_LIMIT (e.g. a safe cap like
100_000 or UInt32_MAX - 1) and replace the assignment with a capped value (e.g.
Math.max(0, Math.floor(filter.limit ?? 100)) then Math.min(thatValue,
MAX_LIMIT)) so limit+1 cannot exceed the UInt32 range; apply the same
clamping/sanitization logic to the other occurrences referenced (around the
logic at lines 175-177) so all uses of limit are safe.
In `@packages/clickhouse-datasource/src/row-mappers.ts`:
- Around line 272-280: The toNumber function currently converts an empty string
to 0 because Number("") === 0; update the string branch in toNumber to treat
empty or all-whitespace strings as missing by checking if value.trim() === ""
(or length === 0 after trim) and returning undefined before calling
Number(value), then proceed with Number() and isNaN check as before; target the
toNumber function to apply this fix.
- Around line 282-286: toNumberArray currently maps elements with Number(...)
which silently yields NaN for non-numeric inputs; update to use the existing
toNumber helper for each element (call toNumber inside toNumberArray) and filter
out undefined values so the result contains only valid numbers (e.g., map to
toNumber and then filter((n): n is number => n !== undefined)). Ensure the
function still returns number[] | undefined (return undefined for non-array or
empty input) and reference toNumberArray and toNumber when making the change.
---
Duplicate comments:
In `@packages/api/package.json`:
- Line 37: Update the fastify dependency spec to match the change made in the
collector package: replace the current caret version string "^5.7.4" under the
"fastify" entry in package.json with the same pinned/version-range used in
packages/collector (e.g., remove the caret and pin to "5.7.4" or use the exact
range the collector uses) so both packages use the identical fastify version
constraint.
In `@packages/clickhouse-datasource/src/query-logs.ts`:
- Around line 4-11: The ATTRIBUTE_KEY_PATTERN constant and validateAttributeKey
function are duplicated; extract them into a shared utility module (e.g., create
and export ATTRIBUTE_KEY_PATTERN and validateAttributeKey from a new util file)
and replace the local definitions in query-logs.ts and query-metrics.ts with
imports of those exported symbols; update imports in both files to use the
shared validateAttributeKey/ATTRIBUTE_KEY_PATTERN and ensure the new util is
exported from the package index if needed.
---
Nitpick comments:
In `@packages/api/src/index.ts`:
- Around line 33-37: The declaration merges into FastifyRequest currently types
requestContext as unknown; replace unknown with a narrow interface (e.g., define
an exported RequestContext interface containing fields you expect such as
userId?: string, tenantId?: string, auth?: { roles: string[] } or other minimal
known properties) and use that interface in the declare module block for
FastifyRequest.requestContext; if the shape is intentionally open-ended, add a
TODO comment above the declaration noting the intent to tighten RequestContext
later and export a generic RequestContext type so callers can extend it safely.
In `@packages/clickhouse-datasource/src/attributes.ts`:
- Around line 8-16: coerceAttributeValue currently uses Number(value) which will
coerce hex and scientific-notation strings into numbers; update the function to
only coerce when the string strictly matches a simple decimal/integer pattern.
Replace the Number-based check in coerceAttributeValue with a stricter numeric
test (e.g., a regex like /^-?\d+(\.\d+)?$/ or parseFloat plus strict string
equality) so strings like "0x1A" or "1e308" remain strings; keep existing
handling for "true"/"false" and empty string behavior in AttributeValue.
In `@packages/clickhouse-datasource/src/datasource.test.ts`:
- Around line 78-241: The setup in beforeAll opens dbClient and runs multiple
CREATE TABLE and seed* calls (seedTraces, seedLogs, seedMetrics,
seedTruncationMetric, seedMultiAttrMetric) but doesn't guarantee
dbClient.close() on error; wrap the body that uses dbClient in a try/finally
(after creating dbClient) so that in the finally block you always await
dbClient.close(); ensure you keep the existing adminClient.command/database
creation outside or unchanged and only move the cleanup of dbClient into the
finally to guarantee closure on failures.
In `@packages/clickhouse-datasource/src/datasource.ts`:
- Around line 65-99: The code currently creates a new ClickHouse client per call
(see createClient usage in getTraces and the assertClickHouseRequestContext
flow) which prevents HTTP connection reuse; implement a small LRU cache keyed by
`${database}:${username}` that returns an existing client or creates one if
absent, with a configurable TTL and max size, and ensure clients are closed when
evicted; update getTraces, getLogs, getMetrics, and discoverMetrics to obtain
the client from this cache instead of calling createClient directly, and remove
the per-call client.close() in finally (instead ensure client.close() is invoked
on eviction or service shutdown).
In `@packages/clickhouse-datasource/src/query-metrics.ts`:
- Around line 4-11: Duplicate ATTRIBUTE_KEY_PATTERN and validateAttributeKey
should be extracted into a single shared utility module: create a new module
(e.g., attribute-utils or validation utilities) exporting ATTRIBUTE_KEY_PATTERN
and validateAttributeKey, replace the local definitions in query-metrics.ts,
query-logs.ts (and query-traces.ts) with imports from that shared module, and
update any references to still use the same symbol names so behavior remains
unchanged; ensure tests/consumers import the new module and remove the
duplicated code blocks from each query-* file.
In `@packages/clickhouse-datasource/src/query-traces.ts`:
- Around line 76-120: Extract the repeated attribute-filter loop into a small
helper (e.g., addAttributeFilters) that accepts the attributes object, the
column expression string (like "SpanAttributes", "ResourceAttributes",
"`Events.Attributes`", "`Links.Attributes`") and a param prefix ("spanAttrVal",
"resAttrVal", "evtAttrVal", "lnkAttrVal"); inside the helper iterate
Object.entries(attributes), call validateAttributeKey(key), push the constructed
condition into the existing conditions array and add the param into the params
object using the prefixed incremental index; then replace the four duplicated
blocks by calling this helper with
filter.spanAttributes/filter.resourceAttributes/filter.eventsAttributes/filter.linksAttributes
respectively and the appropriate column expression and prefix.
- Around line 122-143: The cursor parsing block that slices filter.cursor into
cursorTs and cursorSpanId and then calls nanosToDateTime64(cursorTs) can throw a
raw SyntaxError if cursorTs is not a valid numeric string; update the code in
the cursor-handling branch (where filter.cursor, cursorTs, cursorSpanId,
params.cursorTs, and nanosToDateTime64 are used) to validate cursorTs (e.g.,
ensure it matches /^\d+$/ or Number.isFinite/BigInt-safe check) or wrap
nanosToDateTime64(cursorTs) in a try/catch and throw a descriptive Error like
"Invalid cursor timestamp: expected integer nanoseconds" so callers get a
domain-specific message instead of a raw SyntaxError.
In `@packages/clickhouse-datasource/src/row-mappers.ts`:
- Around line 107-112: Replace the locally redefined MetricType union with the
shared type exported from `@kopai/core`: remove the local "MetricType" declaration
and add an import for the exported MetricType (previously referenced as
datasource.MetricType) and update any usages to reference that imported symbol
so the file uses the single source of truth for metric types (ensure import name
matches the export and update any type annotations that used the local
MetricType).
In `@packages/clickhouse-datasource/src/types.ts`:
- Around line 14-37: Update assertClickHouseRequestContext to reject
empty-string values for database and username while still allowing an empty
password: inside the function (assertClickHouseRequestContext) after confirming
types, add checks that obj.database.trim().length > 0 and
obj.username.trim().length > 0 and throw the same Error("requestContext must
provide { database, username, password }") if either is empty; keep password
validation as-is to permit empty passwords per tests.
In `@packages/collector/package.json`:
- Line 41: The package.json dependency pin for "fastify" was intentionally
raised to ^5.7.4 across the monorepo but the reason isn't documented; add a
short note explaining why this minimum (e.g., bug fix, security patch, or
required feature) was chosen—either as a comment in
packages/collector/package.json near the "fastify" entry or as a
changelog/README line in the collector package; reference the "fastify"
dependency and the package.json file so future maintainers can trace that the
floor was raised to ^5.7.4 for that specific reason.
| } { | ||
| const conditions: string[] = []; | ||
| const params: Record<string, unknown> = {}; | ||
| const limit = filter.limit ?? 100; |
There was a problem hiding this comment.
limit + 1 can overflow UInt32 if an uncapped value is passed.
The query declares {limit:UInt32} (max 4,294,967,295). If the caller passes a very large filter.limit, limit + 1 may exceed UInt32 range at the ClickHouse level. Consider clamping or capping the limit to a reasonable maximum in this builder, rather than relying entirely on upstream validation.
🛡️ Proposed clamp
- const limit = filter.limit ?? 100;
+ const MAX_LIMIT = 10_000;
+ const limit = Math.min(filter.limit ?? 100, MAX_LIMIT);Also applies to: 175-177
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/clickhouse-datasource/src/query-traces.ts` at line 19, The current
builder sets const limit = filter.limit ?? 100 and later uses limit + 1 which
can overflow ClickHouse UInt32; clamp and sanitize the incoming limit in
query-traces.ts by defining a MAX_LIMIT (e.g. a safe cap like 100_000 or
UInt32_MAX - 1) and replace the assignment with a capped value (e.g. Math.max(0,
Math.floor(filter.limit ?? 100)) then Math.min(thatValue, MAX_LIMIT)) so limit+1
cannot exceed the UInt32 range; apply the same clamping/sanitization logic to
the other occurrences referenced (around the logic at lines 175-177) so all uses
of limit are safe.
There was a problem hiding this comment.
🧹 Nitpick comments (3)
packages/clickhouse-datasource/src/datasource.ts (2)
97-107: Cursor is computed from unmapped (raw) rows — ensure consistency with query builders.The cursor reads
TimestampandSpanIdfrom raw ClickHouse rows (DateTime64 strings), which is correct sincedateTime64ToNanosexpects the DateTime64 format. However, the castas TracesCursorRow | undefinedis an unchecked assertion onRecord<string, unknown>.A minor defensive check: if ClickHouse ever returns an unexpected format or the column is renamed, this would silently produce a malformed cursor. Low risk given the tightly coupled query builder, but worth noting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/datasource.ts` around lines 97 - 107, The code computes nextCursor from raw rows using an unchecked cast to TracesCursorRow; add a defensive runtime check instead of the cast: verify the selected lastRow exists and that lastRow.Timestamp is a string (or the expected DateTime64 shape) and lastRow.SpanId is present (and of the expected type) before calling dateTime64ToNanos, and only compute nextCursor when those checks pass—otherwise set nextCursor to null (adjust references in rows, lastRow, nextCursor, dateTime64ToNanos, and TracesCursorRow accordingly).
227-229: Redundant guard afterMap.set.Lines 228-229 re-fetch and null-check
attrMap.get(key)immediately after the block that guarantees the key exists (lines 219-226). Theif (!entry) continueis dead code.♻️ Minor cleanup
- // Safe: we just ensured the key exists above - const entry = attrMap.get(key); - if (!entry) continue; + // eslint-disable-next-line `@typescript-eslint/no-non-null-assertion` -- key was just set above + const entry = attrMap.get(key)!;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/datasource.ts` around lines 227 - 229, The guard `if (!entry) continue` is redundant because you just ensured the key exists by calling `attrMap.set(key, { ... })`; remove the dead check and subsequent continue to simplify the loop. Locate the loop that uses `attrMap`, `key`, and the `entry` variable (the block where `attrMap.set(key, ...)` is executed) and delete the re-fetch/null-check lines (`const entry = attrMap.get(key); if (!entry) continue`), then use the already-known `entry` from the set or reassign once without a null check if needed.packages/clickhouse-datasource/src/datasource.test.ts (1)
43-43:import.meta.urlpathname may have issues on Windows.
new URL(".", import.meta.url).pathnameproduces a POSIX-style path. On Windows, this results in/C:/...which can breakpath.join. This is typically fine for CI (Linux/macOS) and Dockerized environments, but worth noting if anyone runs tests locally on Windows.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/datasource.test.ts` at line 43, The test uses new URL(".", import.meta.url).pathname which yields POSIX-style paths and can produce a leading slash on Windows; replace that with a proper file URL-to-path conversion using Node's fileURLToPath to get a platform-correct path (e.g., derive dirname by calling fileURLToPath on import.meta.url or new URL(".", import.meta.url) and/or use path.dirname), updating the variable named dirname in datasource.test.ts so path.join behaves correctly on Windows.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@packages/clickhouse-datasource/src/datasource.ts`:
- Around line 181-186: discoverMetrics declares options? optional but
immediately calls assertClickHouseRequestContext(ctx) which throws when ctx is
undefined; fix by making the contract consistent: either update the
discoverMetrics signature to require options with requestContext (remove the
optional flag) so callers must pass it, or keep it optional but add an explicit
guard that throws a clear, typed error before calling
assertClickHouseRequestContext; locate the discoverMetrics function and the
assertClickHouseRequestContext(ctx) call to implement the chosen fix and mirror
the same change in any sibling methods that use the same pattern.
---
Nitpick comments:
In `@packages/clickhouse-datasource/src/datasource.test.ts`:
- Line 43: The test uses new URL(".", import.meta.url).pathname which yields
POSIX-style paths and can produce a leading slash on Windows; replace that with
a proper file URL-to-path conversion using Node's fileURLToPath to get a
platform-correct path (e.g., derive dirname by calling fileURLToPath on
import.meta.url or new URL(".", import.meta.url) and/or use path.dirname),
updating the variable named dirname in datasource.test.ts so path.join behaves
correctly on Windows.
In `@packages/clickhouse-datasource/src/datasource.ts`:
- Around line 97-107: The code computes nextCursor from raw rows using an
unchecked cast to TracesCursorRow; add a defensive runtime check instead of the
cast: verify the selected lastRow exists and that lastRow.Timestamp is a string
(or the expected DateTime64 shape) and lastRow.SpanId is present (and of the
expected type) before calling dateTime64ToNanos, and only compute nextCursor
when those checks pass—otherwise set nextCursor to null (adjust references in
rows, lastRow, nextCursor, dateTime64ToNanos, and TracesCursorRow accordingly).
- Around line 227-229: The guard `if (!entry) continue` is redundant because you
just ensured the key exists by calling `attrMap.set(key, { ... })`; remove the
dead check and subsequent continue to simplify the loop. Locate the loop that
uses `attrMap`, `key`, and the `entry` variable (the block where
`attrMap.set(key, ...)` is executed) and delete the re-fetch/null-check lines
(`const entry = attrMap.get(key); if (!entry) continue`), then use the
already-known `entry` from the set or reassign once without a null check if
needed.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (8)
packages/clickhouse-datasource/src/row-mappers.ts (3)
262-265: Redundantvalue === ""check inemptyToUndefined.
!valuealready evaluates totruefor empty string, so the|| value === ""clause is unreachable. Harmless, but slightly misleading.♻️ Simplify
function emptyToUndefined(value: string | undefined): string | undefined { - if (!value || value === "") return undefined; + if (!value) return undefined; return value; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/row-mappers.ts` around lines 262 - 265, The function emptyToUndefined has a redundant check (value === "") because the preceding !value already covers empty string; simplify by replacing the conditional with if (!value) return undefined so empty string and undefined both map to undefined, keeping the function name emptyToUndefined and its signature (emptyToUndefined(value: string | undefined): string | undefined) intact.
31-32: Unsafeascasts onraw: unknownskip all runtime validation.All three mapper functions cast
rawwithas ClickHouse*Row(Lines 32, 86, 170) without any runtime check. If the ClickHouse response shape changes or a query returns unexpected columns, this will produce silent data corruption rather than a clear error.This is a common pattern in data-access layers, so flagging as optional — but consider at minimum validating the presence of critical fields (e.g.,
TraceId,Timestamp,TimeUnix) to fail fast on schema mismatches.Also applies to: 85-86, 166-170
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/row-mappers.ts` around lines 31 - 32, The mappers (mapTracesRow, mapMetricsRow, mapLogsRow) are unsafely casting raw: unknown to ClickHouse*Row types and need runtime validation to fail fast on schema mismatches; add a small type-guard routine or inline checks that assert required fields (e.g., TraceId, Timestamp/TimeUnix for traces; the analogous critical columns for metrics/logs) exist and have the expected primitive types before mapping, and throw a descriptive error if validation fails instead of blindly using "as ClickHouseTracesRow" (update the casts in mapTracesRow, mapMetricsRow, mapLogsRow to call the guard or perform checks and then proceed).
107-112:MetricTypeduplicates the union from@kopai/core.This type is identical to
datasource.MetricTypeexported from@kopai/core(seepackages/core/src/telemetry-datasource.tslines 66–71). Import it instead to avoid drift.♻️ Import MetricType from core
-import type { denormalizedSignals } from "@kopai/core"; +import type { denormalizedSignals, datasource } from "@kopai/core"; import { dateTime64ToNanos } from "./timestamp.js"; import { coerceAttributes, coerceAttributesArray } from "./attributes.js";Then replace the local type:
-type MetricType = - | "Gauge" - | "Sum" - | "Histogram" - | "ExponentialHistogram" - | "Summary"; +type MetricType = datasource.MetricType;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/row-mappers.ts` around lines 107 - 112, Remove the duplicated local MetricType union and import the canonical type from `@kopai/core` instead; replace usages of the local MetricType with the imported type (e.g. datasource.MetricType) by adding an import from '@kopai/core' and updating any references in this file (look for the local MetricType declaration and places that annotate with MetricType).packages/clickhouse-datasource/src/query-metrics.test.ts (1)
1-23: Test coverage only addressesbuildDiscoverMetricsQueries;buildMetricsQueryis untested.
buildMetricsQueryhas non-trivial logic — attribute filtering with validation, cursor parsing, per-type column selection, and time-range conditions. Consider adding tests for:
- Each metric type producing the correct table and columns
- Attribute filter validation (valid and invalid keys)
- Cursor parsing (valid format, invalid format throwing)
- Time range filter conditions
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/query-metrics.test.ts` around lines 1 - 23, Tests currently only cover buildDiscoverMetricsQueries; add unit tests for buildMetricsQuery to cover its non-trivial logic: write tests that call buildMetricsQuery with different metric types to assert correct table and selected columns, tests for attribute filter validation using valid and invalid attribute keys to ensure validation behavior, tests for cursor parsing both with a valid cursor string and invalid formats to verify correct parsing and thrown errors, and tests for time-range conditions to confirm generated WHERE clauses include the expected time filters; use the existing test file pattern (vitest) and reference the buildMetricsQuery function when creating these cases.packages/clickhouse-datasource/src/row-mappers.test.ts (1)
1-42: Good coverage for helper functions.The tests cover the important edge cases (empty string, null, undefined, non-numeric strings, passthrough). Consider adding a test for
toNumberArraywhen all elements are non-numeric (should returnundefinedper the implementation), e.g.:it("returns undefined when all values are non-numeric", () => { expect(toNumberArray(["abc", "def"])).toBeUndefined(); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/row-mappers.test.ts` around lines 1 - 42, Add a test to row-mappers.test.ts that asserts toNumberArray returns undefined when every element is non-numeric: create an it block (e.g., "returns undefined when all values are non-numeric") that calls toNumberArray(["abc", "def"]) and expects the result toBeUndefined(); place it alongside the existing toNumberArray tests so it covers the all-non-numeric edge case for the toNumberArray function.packages/clickhouse-datasource/src/query-metrics.ts (2)
190-251:metricTypesarray duplicatesTABLE_MAP— consider deriving one from the other.Lines 194–203 define a local
metricTypesarray with the same type-to-table mapping already captured inTABLE_MAP(Lines 13–19). If a new metric type or table name is added, both must be updated in lockstep.♻️ Derive metricTypes from TABLE_MAP
- const metricTypes: Array<{ type: string; table: string }> = [ - { type: "Gauge", table: "otel_metrics_gauge" }, - { type: "Sum", table: "otel_metrics_sum" }, - { type: "Histogram", table: "otel_metrics_histogram" }, - { - type: "ExponentialHistogram", - table: "otel_metrics_exponential_histogram", - }, - { type: "Summary", table: "otel_metrics_summary" }, - ]; + const metricTypes = Object.entries(TABLE_MAP).map(([type, table]) => ({ + type, + table, + }));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/query-metrics.ts` around lines 190 - 251, The function buildDiscoverMetricsQueries currently defines a local metricTypes array that duplicates the global TABLE_MAP; replace the hard-coded metricTypes with a derived value built from TABLE_MAP (e.g., iterate Object.entries or similar) so the type-to-table mapping is sourced from TABLE_MAP; remove the duplicate literal metricTypes and update all references in buildDiscoverMetricsQueries to use the derived array so future changes only require updating TABLE_MAP.
80-185:limitis off-by-one relative to the comment/intent forfilter.limit.Line 85 sets
limit = filter.limit ?? 100, then Line 182 setsparams.limit = limit + 1. The query fetcheslimit + 1rows and the datasource caller (datasource.ts Line 93/132/167) checksrows.length > limitto detect a next page. This works correctly as a keyset pagination pattern.However, if a caller explicitly passes
filter.limit = 0, the query willLIMIT 1and return 1 row (treated as "has more"), which is likely unintended. Consider guarding against non-positive limits.💡 Defensive limit guard
- const limit = filter.limit ?? 100; + const limit = Math.max(filter.limit ?? 100, 1);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/query-metrics.ts` around lines 80 - 185, buildMetricsQuery currently sets params.limit = limit + 1 which breaks when filter.limit is 0; normalize the requested limit by reading filter.limit into limit, then if limit <= 0 set params.limit = 0 (and keep limit = 0 so the LIMIT placeholder becomes 0), otherwise keep the existing behavior and set params.limit = limit + 1; update the variable handling around limit (in buildMetricsQuery) and the final params.limit assignment to implement this guard so an explicit non-positive filter.limit yields no rows rather than an unintended extra row.packages/clickhouse-datasource/src/datasource.ts (1)
214-239: Attribute map key uses:separator — collision-safe assuming metric names don't contain:.Line 215:
const key = \${row.MetricName}:${row.MetricType}`. If a metric name contains:, keys could collide (e.g.,"foo:bar" + "Gauge"vs"foo" + "bar:Gauge"). OTel metric naming conventions typically use.` separators, so this is unlikely in practice but worth a note.💡 Use a separator unlikely in metric names
- const key = `${row.MetricName}:${row.MetricType}`; + const key = `${row.MetricName}\0${row.MetricType}`;Apply consistently to both line 215 and line 243.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/datasource.ts` around lines 214 - 239, The composite key built for attrMap using `${row.MetricName}:${row.MetricType}` is collision-prone; update the key generation in the attrRows loop (where `key` is defined) and any other place that builds the same key (searched by occurrences of `${row.MetricName}:${row.MetricType}`) to use a separator unlikely to appear in metric names (e.g., a control character like `\u001F` or a JSON-encoded tuple) or replace the Map with a two-level Map (attrMap.get(MetricName)?.get(MetricType)). Ensure the same approach is used for both the set and the subsequent get operations so `attrMap.set(key, ...)` and `attrMap.get(key)` remain consistent; reference symbols: attrRows, attrMap, key, row.MetricName, row.MetricType.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/clickhouse-datasource/src/datasource.ts`:
- Around line 96-104: The cursor-construction logic must be made defensive in
getLogs and getMetrics to match getTraces: when computing nextCursor from
lastRow, first verify lastRow exists and that the timestamp field (Timestamp for
logs, TimeUnix for metrics) and any id field used (SpanId or equivalent) are
present and typeof === "string" before calling dateTime64ToNanos; if these
checks fail return null instead of calling dateTime64ToNanos on invalid types.
Update the cursor logic in getLogs and getMetrics to follow the same typeof
checks used in getTraces (use LogsCursorRow and MetricsCursorRow types as hints)
so runtime errors are avoided and behavior is consistent across
getTraces/getLogs/getMetrics.
In `@packages/clickhouse-datasource/src/row-mappers.ts`:
- Around line 198-259: The current code in row-mappers.ts uses repeated ifs on
metricType and falls through to return a Summary-shaped object for any
unrecognized metricType; change this to perform an explicit exhaustiveness
check: after handling "Gauge", "Sum", "Histogram", and "ExponentialHistogram" do
not silently return the Summary shape for unknown values of metricType—instead
throw or assert (e.g., throw new Error or use a utility assertUnreachable)
including the invalid metricType value so bugs surface; keep the Summary return
only for the intentional "Summary" case (or add a final if (metricType ===
"Summary") branch) and replace the current fallthrough return with the explicit
guard referencing metricType and the surrounding mapper function.
---
Duplicate comments:
In `@packages/clickhouse-datasource/src/datasource.ts`:
- Around line 178-183: The function discoverMetrics declares
options.requestContext as optional but immediately calls
assertClickHouseRequestContext(ctx) which throws when ctx is undefined; update
the signature of discoverMetrics to require requestContext (change parameter to
options: { requestContext: unknown } without the optional ?) so the API matches
the implementation, and update any call sites accordingly; ensure the method
still extracts const ctx = options.requestContext and calls
assertClickHouseRequestContext(ctx) as before.
In `@packages/clickhouse-datasource/src/query-metrics.ts`:
- Around line 226-239: The ARRAY JOIN refactor in attrUnions is correct and
resolves the previous double-arrayJoin cartesian product issue; no code change
needed—keep the two template strings that use "ARRAY JOIN mapKeys(Attributes) AS
attr_key" with "Attributes[attr_key]" and "ARRAY JOIN
mapKeys(ResourceAttributes) AS attr_key" with "ResourceAttributes[attr_key]"
inside the attrUnions array as implemented (references: attrUnions, Attributes,
ResourceAttributes, attr_key).
In `@packages/clickhouse-datasource/src/row-mappers.ts`:
- Around line 272-291: The previous concerns about empty-string handling and
filtering were addressed correctly: keep the implementations of toNumber and
toNumberArray as-is (toNumber returns undefined for null/empty
string/non-numeric values and toNumberArray maps using toNumber then filters out
undefined to return number[] | undefined); no further changes required to the
functions toNumber or toNumberArray.
---
Nitpick comments:
In `@packages/clickhouse-datasource/src/datasource.ts`:
- Around line 214-239: The composite key built for attrMap using
`${row.MetricName}:${row.MetricType}` is collision-prone; update the key
generation in the attrRows loop (where `key` is defined) and any other place
that builds the same key (searched by occurrences of
`${row.MetricName}:${row.MetricType}`) to use a separator unlikely to appear in
metric names (e.g., a control character like `\u001F` or a JSON-encoded tuple)
or replace the Map with a two-level Map
(attrMap.get(MetricName)?.get(MetricType)). Ensure the same approach is used for
both the set and the subsequent get operations so `attrMap.set(key, ...)` and
`attrMap.get(key)` remain consistent; reference symbols: attrRows, attrMap, key,
row.MetricName, row.MetricType.
In `@packages/clickhouse-datasource/src/query-metrics.test.ts`:
- Around line 1-23: Tests currently only cover buildDiscoverMetricsQueries; add
unit tests for buildMetricsQuery to cover its non-trivial logic: write tests
that call buildMetricsQuery with different metric types to assert correct table
and selected columns, tests for attribute filter validation using valid and
invalid attribute keys to ensure validation behavior, tests for cursor parsing
both with a valid cursor string and invalid formats to verify correct parsing
and thrown errors, and tests for time-range conditions to confirm generated
WHERE clauses include the expected time filters; use the existing test file
pattern (vitest) and reference the buildMetricsQuery function when creating
these cases.
In `@packages/clickhouse-datasource/src/query-metrics.ts`:
- Around line 190-251: The function buildDiscoverMetricsQueries currently
defines a local metricTypes array that duplicates the global TABLE_MAP; replace
the hard-coded metricTypes with a derived value built from TABLE_MAP (e.g.,
iterate Object.entries or similar) so the type-to-table mapping is sourced from
TABLE_MAP; remove the duplicate literal metricTypes and update all references in
buildDiscoverMetricsQueries to use the derived array so future changes only
require updating TABLE_MAP.
- Around line 80-185: buildMetricsQuery currently sets params.limit = limit + 1
which breaks when filter.limit is 0; normalize the requested limit by reading
filter.limit into limit, then if limit <= 0 set params.limit = 0 (and keep limit
= 0 so the LIMIT placeholder becomes 0), otherwise keep the existing behavior
and set params.limit = limit + 1; update the variable handling around limit (in
buildMetricsQuery) and the final params.limit assignment to implement this guard
so an explicit non-positive filter.limit yields no rows rather than an
unintended extra row.
In `@packages/clickhouse-datasource/src/row-mappers.test.ts`:
- Around line 1-42: Add a test to row-mappers.test.ts that asserts toNumberArray
returns undefined when every element is non-numeric: create an it block (e.g.,
"returns undefined when all values are non-numeric") that calls
toNumberArray(["abc", "def"]) and expects the result toBeUndefined(); place it
alongside the existing toNumberArray tests so it covers the all-non-numeric edge
case for the toNumberArray function.
In `@packages/clickhouse-datasource/src/row-mappers.ts`:
- Around line 262-265: The function emptyToUndefined has a redundant check
(value === "") because the preceding !value already covers empty string;
simplify by replacing the conditional with if (!value) return undefined so empty
string and undefined both map to undefined, keeping the function name
emptyToUndefined and its signature (emptyToUndefined(value: string | undefined):
string | undefined) intact.
- Around line 31-32: The mappers (mapTracesRow, mapMetricsRow, mapLogsRow) are
unsafely casting raw: unknown to ClickHouse*Row types and need runtime
validation to fail fast on schema mismatches; add a small type-guard routine or
inline checks that assert required fields (e.g., TraceId, Timestamp/TimeUnix for
traces; the analogous critical columns for metrics/logs) exist and have the
expected primitive types before mapping, and throw a descriptive error if
validation fails instead of blindly using "as ClickHouseTracesRow" (update the
casts in mapTracesRow, mapMetricsRow, mapLogsRow to call the guard or perform
checks and then proceed).
- Around line 107-112: Remove the duplicated local MetricType union and import
the canonical type from `@kopai/core` instead; replace usages of the local
MetricType with the imported type (e.g. datasource.MetricType) by adding an
import from '@kopai/core' and updating any references in this file (look for the
local MetricType declaration and places that annotate with MetricType).
| if (metricType === "Gauge") { | ||
| return { | ||
| ...base, | ||
| MetricType: "Gauge" as const, | ||
| Value: row.Value as number, | ||
| Flags: toNumber(row.Flags), | ||
| }; | ||
| } | ||
|
|
||
| if (metricType === "Sum") { | ||
| return { | ||
| ...base, | ||
| MetricType: "Sum" as const, | ||
| Value: row.Value as number, | ||
| Flags: toNumber(row.Flags), | ||
| AggTemporality: emptyToUndefined(row.AggTemporality), | ||
| IsMonotonic: toNumber(row.IsMonotonic), | ||
| }; | ||
| } | ||
|
|
||
| if (metricType === "Histogram") { | ||
| return { | ||
| ...base, | ||
| MetricType: "Histogram" as const, | ||
| Count: toNumber(row.Count), | ||
| Sum: row.Sum, | ||
| Min: row.Min, | ||
| Max: row.Max, | ||
| BucketCounts: toNumberArray(row.BucketCounts), | ||
| ExplicitBounds: row.ExplicitBounds, | ||
| AggTemporality: emptyToUndefined(row.AggTemporality), | ||
| }; | ||
| } | ||
|
|
||
| if (metricType === "ExponentialHistogram") { | ||
| return { | ||
| ...base, | ||
| MetricType: "ExponentialHistogram" as const, | ||
| Count: toNumber(row.Count), | ||
| Sum: row.Sum, | ||
| Min: row.Min, | ||
| Max: row.Max, | ||
| Scale: toNumber(row.Scale), | ||
| ZeroCount: toNumber(row.ZeroCount), | ||
| ZeroThreshold: row.ZeroThreshold, | ||
| PositiveOffset: toNumber(row.PositiveOffset), | ||
| PositiveBucketCounts: toNumberArray(row.PositiveBucketCounts), | ||
| NegativeOffset: toNumber(row.NegativeOffset), | ||
| NegativeBucketCounts: toNumberArray(row.NegativeBucketCounts), | ||
| AggTemporality: emptyToUndefined(row.AggTemporality), | ||
| }; | ||
| } | ||
|
|
||
| // Summary | ||
| return { | ||
| ...base, | ||
| MetricType: "Summary" as const, | ||
| Count: toNumber(row.Count), | ||
| Sum: row.Sum, | ||
| "ValueAtQuantiles.Quantile": row["ValueAtQuantiles.Quantile"], | ||
| "ValueAtQuantiles.Value": row["ValueAtQuantiles.Value"], | ||
| }; |
There was a problem hiding this comment.
Fallthrough to Summary for invalid metricType can silently produce wrong data.
If an unrecognized metricType is passed, the function falls through all if checks and returns a Summary-shaped row (Lines 252–259). This masks bugs at runtime. An explicit exhaustiveness check would surface invalid types early.
🐛 Proposed fix: add exhaustiveness guard
+ if (metricType === "Summary") {
// Summary
return {
...base,
MetricType: "Summary" as const,
Count: toNumber(row.Count),
Sum: row.Sum,
"ValueAtQuantiles.Quantile": row["ValueAtQuantiles.Quantile"],
"ValueAtQuantiles.Value": row["ValueAtQuantiles.Value"],
};
+ }
+
+ // Exhaustive check — will fail at compile time if a MetricType variant is unhandled
+ const _exhaustive: never = metricType;
+ throw new Error(`Unknown metric type: ${_exhaustive}`);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/clickhouse-datasource/src/row-mappers.ts` around lines 198 - 259,
The current code in row-mappers.ts uses repeated ifs on metricType and falls
through to return a Summary-shaped object for any unrecognized metricType;
change this to perform an explicit exhaustiveness check: after handling "Gauge",
"Sum", "Histogram", and "ExponentialHistogram" do not silently return the
Summary shape for unknown values of metricType—instead throw or assert (e.g.,
throw new Error or use a utility assertUnreachable) including the invalid
metricType value so bugs surface; keep the Summary return only for the
intentional "Summary" case (or add a final if (metricType === "Summary") branch)
and replace the current fallthrough return with the explicit guard referencing
metricType and the surrounding mapper function.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (5)
packages/clickhouse-datasource/package.json (1)
37-37: Consider bumping the@clickhouse/clientfloor to the tested version.The latest published version of
@clickhouse/clientis1.17.0. The^1.14.0constraint will resolve to it automatically, but explicitly setting the floor to the version you actually developed and tested against (^1.17.0) makes the minimum requirement explicit and prevents installs from unexpectedly picking up any future1.xrelease that might introduce regressions.📦 Suggested bump
- "@clickhouse/client": "^1.14.0", + "@clickhouse/client": "^1.17.0",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/package.json` at line 37, Update the package.json dependency floor for `@clickhouse/client` from "^1.14.0" to the tested minimum "^1.17.0" so installs explicitly require the version you developed against; edit the dependency entry for "@clickhouse/client" in packages/clickhouse-datasource/package.json to use "^1.17.0".packages/clickhouse-datasource/src/query-metrics.ts (1)
236-242: Consider adding a LIMIT or safeguard to discovery queries for large deployments.Neither
namesQuerynorattributesQueryhas aLIMITclause. In deployments with a high cardinality of metric names × attribute keys, these queries could return very large result sets and put significant load on ClickHouse. A reasonable upper bound (e.g.,LIMIT 10000) or a configurable limit parameter would add a safety net.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/query-metrics.ts` around lines 236 - 242, The discovery queries attributesQuery and namesQuery currently lack any result-size safeguards; modify the code that constructs these SQL strings (look for the constants namesQuery and attributesQuery in query-metrics.ts) to append a configurable LIMIT clause (e.g., default to 10000) or apply a hard cap, and expose the limit as a parameter or config value so callers can override it; ensure the LIMIT is added after ORDER BY in attributesQuery and similarly applied to namesQuery to prevent unbounded result sets.packages/clickhouse-datasource/src/ch-row-schemas.test.ts (1)
113-142: Consider adding a test for theMetricTypedefault behavior.The
chGaugeRowSchemasetsMetricType: z.literal("Gauge").default("Gauge"), meaning the field should default to"Gauge"when absent from the input. The test data on line 131 doesn't includeMetricType, but there's no assertion verifyingresult.MetricType === "Gauge". A quick assertion would confirm the default works as intended.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/ch-row-schemas.test.ts` around lines 113 - 142, Add an assertion to the existing gauge row test to verify the defaulted MetricType is applied: after calling parseChRow(chGaugeRowSchema, validRow) check that result.MetricType === "Gauge". Locate the test using validRow and the parseChRow call in the "chGaugeRowSchema" describe block and add the expectation alongside the other expects (TimeUnix, Value, Flags, MetricDescription, MetricUnit).packages/clickhouse-datasource/src/ch-row-schemas.ts (2)
38-46:toNumberArraysilently drops non-numeric elements.If some elements in the array are non-numeric strings (e.g., corrupt data), they're silently filtered out rather than causing a parse error. This is a reasonable choice for resilient parsing, but worth being aware of — a row with
BucketCounts: ["10", "??", "30"]would yield[10, 30]instead of failing. If strictness is preferred, consider throwing intoNumberfor invalid non-empty strings.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/ch-row-schemas.ts` around lines 38 - 46, toNumberArray currently filters out non-numeric elements silently by relying on toNumber returning undefined; change it to fail fast on invalid non-empty values: iterate the input in toNumberArray, call toNumber for each element, and if toNumber returns undefined for a non-empty input (e.g., a non-numeric string like "??") throw a descriptive error (include the offending value and index) instead of filtering; alternatively, implement the same strict behavior by updating toNumber to throw on invalid non-empty strings and update callers accordingly (refer to toNumberArray and toNumber to locate the logic).
25-35:toNumberallowsInfinitythrough — verify this is acceptable.
Number("Infinity")producesInfinity, andisNaN(Infinity)isfalse, sotoNumber("Infinity")returnsInfinity. Given that ClickHouse UInt64/Int64 values shouldn't produce this, it's unlikely to be hit in practice, but if you want strict finite-number coercion, consider adding a!isFinite(n)guard.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/clickhouse-datasource/src/ch-row-schemas.ts` around lines 25 - 35, The toNumber function currently returns Infinity for inputs like "Infinity" because Number("Infinity") is not NaN; update toNumber to enforce finite numeric results by checking isFinite on the parsed value (e.g. after const n = Number(value) add a guard that returns undefined if !isFinite(n)). Locate the toNumber export and add the finite-number guard so only finite numbers are returned; keep all other existing behaviors (null/empty string/number passthrough) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/clickhouse-datasource/src/ch-row-schemas.ts`:
- Around line 161-164: Change the schema for "Exemplars.Value" to use the
existing chNumberArray helper (the same one used by
BucketCounts/PositiveBucketCounts/NegativeBucketCounts) so it accepts both
string and number elements and runs toNumberArray transformation; similarly
audit ExplicitBounds (currently z.array(z.number())) and replace it with
chNumberArray if ClickHouse can serialize bounds as strings to keep parsing
consistent. Update the reference to the chNumberArray symbol in
ch-row-schemas.ts and ensure no other logic assumes raw number arrays.
In `@packages/clickhouse-datasource/src/query-metrics.ts`:
- Around line 154-169: The cursor timestamp extracted from filter.cursor
(variable cursorTs) isn't validated before calling nanosToDateTime64, which can
surface raw BigInt/SyntaxError for empty or non-numeric values; update the
cursor parsing block to validate that cursorTs is a non-empty numeric string
(e.g. /^[0-9]+$/) and throw a clear Error (e.g. "Invalid cursor timestamp:
expected numeric nanoseconds") if it fails, then only call nanosToDateTime64 and
set params.cursorTs when validation passes; keep the existing logic that pushes
the TimeUnix condition based on sortOrder and uses params.cursorTs.
- Around line 218-233: In buildDiscoverMetricsQueries, add discovery for
ScopeAttributes to match buildMetricsQuery: inside the loop over metricTypes
where attrUnions is built, insert a third union that mirrors the two existing
blocks but uses ScopeAttributes and sets source = 'scope_attr' (e.g., `SELECT
MetricName, '${type}' AS MetricType, 'scope_attr' AS source, attr_key,
ScopeAttributes[attr_key] AS attr_value FROM ${table} ARRAY JOIN
mapKeys(ScopeAttributes) AS attr_key WHERE notEmpty(ScopeAttributes)`) so
buildDiscoverMetricsQueries (and the attrUnions array) returns keys for
Attributes, ResourceAttributes, and ScopeAttributes consistently with
buildMetricsQuery.
---
Nitpick comments:
In `@packages/clickhouse-datasource/package.json`:
- Line 37: Update the package.json dependency floor for `@clickhouse/client` from
"^1.14.0" to the tested minimum "^1.17.0" so installs explicitly require the
version you developed against; edit the dependency entry for
"@clickhouse/client" in packages/clickhouse-datasource/package.json to use
"^1.17.0".
In `@packages/clickhouse-datasource/src/ch-row-schemas.test.ts`:
- Around line 113-142: Add an assertion to the existing gauge row test to verify
the defaulted MetricType is applied: after calling parseChRow(chGaugeRowSchema,
validRow) check that result.MetricType === "Gauge". Locate the test using
validRow and the parseChRow call in the "chGaugeRowSchema" describe block and
add the expectation alongside the other expects (TimeUnix, Value, Flags,
MetricDescription, MetricUnit).
In `@packages/clickhouse-datasource/src/ch-row-schemas.ts`:
- Around line 38-46: toNumberArray currently filters out non-numeric elements
silently by relying on toNumber returning undefined; change it to fail fast on
invalid non-empty values: iterate the input in toNumberArray, call toNumber for
each element, and if toNumber returns undefined for a non-empty input (e.g., a
non-numeric string like "??") throw a descriptive error (include the offending
value and index) instead of filtering; alternatively, implement the same strict
behavior by updating toNumber to throw on invalid non-empty strings and update
callers accordingly (refer to toNumberArray and toNumber to locate the logic).
- Around line 25-35: The toNumber function currently returns Infinity for inputs
like "Infinity" because Number("Infinity") is not NaN; update toNumber to
enforce finite numeric results by checking isFinite on the parsed value (e.g.
after const n = Number(value) add a guard that returns undefined if
!isFinite(n)). Locate the toNumber export and add the finite-number guard so
only finite numbers are returned; keep all other existing behaviors (null/empty
string/number passthrough) unchanged.
In `@packages/clickhouse-datasource/src/query-metrics.ts`:
- Around line 236-242: The discovery queries attributesQuery and namesQuery
currently lack any result-size safeguards; modify the code that constructs these
SQL strings (look for the constants namesQuery and attributesQuery in
query-metrics.ts) to append a configurable LIMIT clause (e.g., default to 10000)
or apply a hard cap, and expose the limit as a parameter or config value so
callers can override it; ensure the LIMIT is added after ORDER BY in
attributesQuery and similarly applied to namesQuery to prevent unbounded result
sets.
| "Exemplars.Value": z | ||
| .array(z.number()) | ||
| .transform((a) => (a.length === 0 ? undefined : a)) | ||
| .optional(), |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
rg -n "Exemplars.Value" --type=ts -C3Repository: kopai-app/kopai-mono
Length of output: 18942
🏁 Script executed:
# Find the chNumberArray definition
rg -n "chNumberArray" --type=ts -C5Repository: kopai-app/kopai-mono
Length of output: 3129
🏁 Script executed:
# Look for BucketCounts and PositiveBucketCounts usage
rg -n "BucketCounts\|PositiveBucketCounts" --type=ts -C3Repository: kopai-app/kopai-mono
Length of output: 46
🏁 Script executed:
# Check if there are any tests related to Exemplars
rg -n "Exemplars" --type=ts -C2Repository: kopai-app/kopai-mono
Length of output: 29933
🏁 Script executed:
# Check the exact ClickHouse DDL for BucketCounts and related fields
rg -n "BucketCounts\|PositiveBucketCounts\|Exemplars.Value" packages/clickhouse-datasource/src/datasource.test.ts -A 1 -B 1 | head -40Repository: kopai-app/kopai-mono
Length of output: 46
🏁 Script executed:
# Look for any ClickHouse query or migration files that define the schema
fd -e "\.sql" -e "migration\|schema\|ddl" --type f | head -20Repository: kopai-app/kopai-mono
Length of output: 46
🏁 Script executed:
# Check if there's any documentation about how ClickHouse serializes these values
rg -n "Float64\|UInt64\|Array" packages/clickhouse-datasource/src/datasource.test.ts | grep -E "(Exemplars|BucketCounts)"Repository: kopai-app/kopai-mono
Length of output: 46
🏁 Script executed:
# Look at the helper functions that transform these values
rg -n "toNumberArray\|parseNumber" packages/clickhouse-datasource/src/ch-row-schemas.ts -A 3Repository: kopai-app/kopai-mono
Length of output: 46
🏁 Script executed:
# Look at the full ch-row-schemas.ts file to understand the transformations
cat -n packages/clickhouse-datasource/src/ch-row-schemas.ts | head -100Repository: kopai-app/kopai-mono
Length of output: 4228
🏁 Script executed:
# Check the specific area around chNumberArray definition and usage
sed -n '70,90p' packages/clickhouse-datasource/src/ch-row-schemas.tsRepository: kopai-app/kopai-mono
Length of output: 644
🏁 Script executed:
# Look at how the schema is applied and what the actual transformation functions do
sed -n '1,70p' packages/clickhouse-datasource/src/ch-row-schemas.tsRepository: kopai-app/kopai-mono
Length of output: 2619
🏁 Script executed:
# Find the toNumberArray function if it exists
rg -n "toNumberArray\|toNumber" packages/clickhouse-datasource/src/ch-row-schemas.ts -B 2 -A 5Repository: kopai-app/kopai-mono
Length of output: 46
🏁 Script executed:
# Check the exact schema for ExplicitBounds to compare
sed -n '155,210p' packages/clickhouse-datasource/src/ch-row-schemas.tsRepository: kopai-app/kopai-mono
Length of output: 1887
🏁 Script executed:
# Look for any test data or fixtures that might show what ClickHouse actually returns
rg -n "Exemplars.Value.*\[" packages/clickhouse-datasource/src/datasource.test.ts -A 2 -B 2Repository: kopai-app/kopai-mono
Length of output: 1823
Use chNumberArray for Exemplars.Value to match bucket count handling — both can arrive as string arrays from ClickHouse.
The file's header (line 15) documents that ClickHouse returns UInt64/Int64 as strings via its wire protocol. While Exemplars.Value is typed as Array(Float64), the array itself may be serialized as strings. This is why BucketCounts, PositiveBucketCounts, and NegativeBucketCounts all use chNumberArray, which accepts z.union([z.string(), z.number()]) and transforms with toNumberArray.
Exemplars.Value should follow the same pattern for consistency and robustness.
Proposed fix
- "Exemplars.Value": z
- .array(z.number())
- .transform((a) => (a.length === 0 ? undefined : a))
- .optional(),
+ "Exemplars.Value": chNumberArray.optional(),Note: ExplicitBounds (line 188) has the same issue but is typed as z.array(z.number()) without the transform—consider applying the same fix there if ClickHouse can serialize it as strings.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "Exemplars.Value": z | |
| .array(z.number()) | |
| .transform((a) => (a.length === 0 ? undefined : a)) | |
| .optional(), | |
| "Exemplars.Value": chNumberArray.optional(), |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/clickhouse-datasource/src/ch-row-schemas.ts` around lines 161 - 164,
Change the schema for "Exemplars.Value" to use the existing chNumberArray helper
(the same one used by BucketCounts/PositiveBucketCounts/NegativeBucketCounts) so
it accepts both string and number elements and runs toNumberArray
transformation; similarly audit ExplicitBounds (currently z.array(z.number()))
and replace it with chNumberArray if ClickHouse can serialize bounds as strings
to keep parsing consistent. Update the reference to the chNumberArray symbol in
ch-row-schemas.ts and ensure no other logic assumes raw number arrays.
| // Cursor pagination | ||
| // Cursor format: "{nanosTimestamp}:0" — metrics use timestamp-only cursor | ||
| if (filter.cursor) { | ||
| const colonIdx = filter.cursor.indexOf(":"); | ||
| if (colonIdx === -1) { | ||
| throw new Error("Invalid cursor format: expected '{timestamp}:{id}'"); | ||
| } | ||
| const cursorTs = filter.cursor.slice(0, colonIdx); | ||
| params.cursorTs = nanosToDateTime64(cursorTs); | ||
|
|
||
| if (sortOrder === "DESC") { | ||
| conditions.push("TimeUnix < {cursorTs:DateTime64(9)}"); | ||
| } else { | ||
| conditions.push("TimeUnix > {cursorTs:DateTime64(9)}"); | ||
| } | ||
| } |
There was a problem hiding this comment.
Cursor timestamp is not validated before BigInt conversion.
If cursorTs is empty (cursor = ":0") or non-numeric (cursor = "abc:0"), nanosToDateTime64 will throw a raw SyntaxError from BigInt(...) with an unhelpful message. Since the format is already being validated for the colon separator, consider also validating that the timestamp portion is a non-empty numeric string to provide a clear error.
🛡️ Proposed fix
const cursorTs = filter.cursor.slice(0, colonIdx);
+ if (!/^\d+$/.test(cursorTs)) {
+ throw new Error(`Invalid cursor timestamp: expected numeric nanoseconds, got '${cursorTs}'`);
+ }
params.cursorTs = nanosToDateTime64(cursorTs);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/clickhouse-datasource/src/query-metrics.ts` around lines 154 - 169,
The cursor timestamp extracted from filter.cursor (variable cursorTs) isn't
validated before calling nanosToDateTime64, which can surface raw
BigInt/SyntaxError for empty or non-numeric values; update the cursor parsing
block to validate that cursorTs is a non-empty numeric string (e.g. /^[0-9]+$/)
and throw a clear Error (e.g. "Invalid cursor timestamp: expected numeric
nanoseconds") if it fails, then only call nanosToDateTime64 and set
params.cursorTs when validation passes; keep the existing logic that pushes the
TimeUnix condition based on sortOrder and uses params.cursorTs.
| const attrUnions: string[] = []; | ||
| for (const { type, table } of metricTypes) { | ||
| attrUnions.push( | ||
| `SELECT MetricName, '${type}' AS MetricType, 'attr' AS source, | ||
| attr_key, Attributes[attr_key] AS attr_value | ||
| FROM ${table} | ||
| ARRAY JOIN mapKeys(Attributes) AS attr_key | ||
| WHERE notEmpty(Attributes)` | ||
| ); | ||
| attrUnions.push( | ||
| `SELECT MetricName, '${type}', 'res_attr', | ||
| attr_key, ResourceAttributes[attr_key] AS attr_value | ||
| FROM ${table} | ||
| ARRAY JOIN mapKeys(ResourceAttributes) AS attr_key | ||
| WHERE notEmpty(ResourceAttributes)` | ||
| ); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if query-traces.ts and query-logs.ts include ScopeAttributes in their discovery/attribute queries
echo "=== query-traces.ts ==="
rg -n 'ScopeAttributes' --type=ts -g '*query-traces*'
echo ""
echo "=== query-logs.ts ==="
rg -n 'ScopeAttributes' --type=ts -g '*query-logs*'
echo ""
echo "=== query-metrics.ts ==="
rg -n 'ScopeAttributes' --type=ts -g '*query-metrics*'Repository: kopai-app/kopai-mono
Length of output: 537
🏁 Script executed:
# Check the context around lines 142-152 in query-metrics.ts to verify the claim about filtering by scopeAttributes
fd -e ts -g '*query-metrics*' -x wc -l {} \;Repository: kopai-app/kopai-mono
Length of output: 183
🏁 Script executed:
# Find query-metrics.ts file first, then examine lines 142-152
fd -e ts -g '*query-metrics*' | head -1 | xargs -I {} cat -n {} | sed -n '140,155p'Repository: kopai-app/kopai-mono
Length of output: 46
🏁 Script executed:
# Also check the buildDiscoverMetricsQueries function structure across files to compare patterns
echo "=== Checking buildDiscoverMetricsQueries pattern in query-metrics.ts ==="
fd -e ts -g '*query-metrics*' | head -1 | xargs -I {} grep -n 'buildDiscoverMetricsQueries\|Attributes\|ResourceAttributes' {} | head -40Repository: kopai-app/kopai-mono
Length of output: 385
🏁 Script executed:
# Get the full content of query-metrics.ts to understand the structure
cat -n packages/clickhouse-datasource/src/query-metrics.ts | head -250Repository: kopai-app/kopai-mono
Length of output: 8538
🏁 Script executed:
# Also check query-logs.ts to see if it includes ScopeAttributes in discovery queries
cat -n packages/clickhouse-datasource/src/query-logs.ts | grep -A 5 -B 5 'ScopeAttributes' | head -50Repository: kopai-app/kopai-mono
Length of output: 758
ScopeAttributes is filterable in buildMetricsQuery but not discoverable in buildDiscoverMetricsQueries.
buildMetricsQuery supports filtering by scopeAttributes (lines 142–152), but buildDiscoverMetricsQueries only discovers keys from Attributes and ResourceAttributes (lines 218–233). Users cannot discover available scope attribute keys before filtering, creating an inconsistency.
Add a third discovery block per metric type with source = 'scope_attr' matching the existing pattern:
attrUnions.push(
`SELECT MetricName, '${type}', 'res_attr',
attr_key, ResourceAttributes[attr_key] AS attr_value
FROM ${table}
ARRAY JOIN mapKeys(ResourceAttributes) AS attr_key
WHERE notEmpty(ResourceAttributes)`
);
+ attrUnions.push(
+ `SELECT MetricName, '${type}', 'scope_attr',
+ attr_key, ScopeAttributes[attr_key] AS attr_value
+ FROM ${table}
+ ARRAY JOIN mapKeys(ScopeAttributes) AS attr_key
+ WHERE notEmpty(ScopeAttributes)`
+ );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/clickhouse-datasource/src/query-metrics.ts` around lines 218 - 233,
In buildDiscoverMetricsQueries, add discovery for ScopeAttributes to match
buildMetricsQuery: inside the loop over metricTypes where attrUnions is built,
insert a third union that mirrors the two existing blocks but uses
ScopeAttributes and sets source = 'scope_attr' (e.g., `SELECT MetricName,
'${type}' AS MetricType, 'scope_attr' AS source, attr_key,
ScopeAttributes[attr_key] AS attr_value FROM ${table} ARRAY JOIN
mapKeys(ScopeAttributes) AS attr_key WHERE notEmpty(ScopeAttributes)`) so
buildDiscoverMetricsQueries (and the attrUnions array) returns keys for
Attributes, ResourceAttributes, and ScopeAttributes consistently with
buildMetricsQuery.
- keep limits consistent across signals - use parameteried attr keys for queries
* feat: add full-circle test (e2e and manual with docker compose) * feat(sqlite,ch,core): rename AggTemporality to AggregationTemporality - make ch schema the canonical one
Summary by CodeRabbit
Release Notes
New Features
Chores