feat: update ch parser and add WITH FILL docs#4271
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
|
📝 WalkthroughWalkthroughThe PR updates analytics documentation to describe automatic query filtering by workspace and API scoping, introduces time-series gap-filling guidance with ClickHouse SQL examples, replaces Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Possibly related PRs
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Thank you for following the naming conventions for pull request titles! 🙏 |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
apps/docs/analytics/query-examples.mdx (1)
920-1070: WITH FILL examples are well-structured and include important type-matching guidance.The new section effectively demonstrates gap filling for hourly, daily, and monthly aggregations (lines 940-1070). The type-matching note (lines 932-938) correctly distinguishes between DateTime (for minute/hour tables) and Date (for day/month tables), which is critical for WITH FILL correctness. The note at line 1068-1070 appropriately warns that WITH FILL only works with a single GROUP BY column on time.
However, there are two minor inconsistencies:
Line 1012-1013 (monthly example): The WHERE clause uses
toDate(toStartOfMonth(...))which wraps a Date-returning function intoDate()(redundant). For consistency with the daily example (line 981), this should be:WHERE time >= toStartOfMonth(now() - INTERVAL 12 MONTH)Line 1017-1018: Similarly, the FILL FROM/TO clauses should not wrap the
toStartOfMonth()result intoDate(). The WITH FILL bounds should match the actual column type.Apply this diff to correct type consistency in the monthly WITH FILL example:
WHERE time >= toDate(toStartOfMonth(now() - INTERVAL 12 MONTH)) - AND time <= toDate(toStartOfMonth(now())) + AND time <= toStartOfMonth(now()) GROUP BY time ORDER BY time ASC WITH FILL - FROM toDate(toStartOfMonth(now() - INTERVAL 12 MONTH)) - TO toDate(toStartOfMonth(now())) + FROM toStartOfMonth(now() - INTERVAL 12 MONTH) + TO toStartOfMonth(now()) STEP INTERVAL 1 MONTHAlso update the corresponding cURL query on lines 1027-1028 to reflect this change.
apps/docs/analytics/schema-reference.mdx (1)
55-65: Type distinction between DateTime and Date is helpful but needs clarity.Line 57 attempts to document the time column type distinction but is worded ambiguously: "DateTime for minute/hour tables, Date for day/month tables". This could be misread. The schema should explicitly state which tables use which type:
- Minute & Hour tables:
DateTime- Day & Month tables:
DateConsider rewording to:
DateTime for minute/hour tables; Date for day/month tablesor using a separate row per table type for clarity.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
go/go.sumis excluded by!**/*.sum
📒 Files selected for processing (11)
apps/docs/analytics/getting-started.mdx(1 hunks)apps/docs/analytics/overview.mdx(2 hunks)apps/docs/analytics/query-examples.mdx(2 hunks)apps/docs/analytics/quick-reference.mdx(2 hunks)apps/docs/analytics/schema-reference.mdx(2 hunks)apps/docs/errors/user/bad_request/invalid_analytics_query.mdx(5 hunks)go/Makefile(1 hunks)go/apps/api/openapi/openapi-generated.yaml(0 hunks)go/apps/api/openapi/spec/paths/v2/analytics/getVerifications/index.yaml(0 hunks)go/go.mod(1 hunks)go/pkg/clickhouse/query-parser/validation.go(1 hunks)
💤 Files with no reviewable changes (2)
- go/apps/api/openapi/spec/paths/v2/analytics/getVerifications/index.yaml
- go/apps/api/openapi/openapi-generated.yaml
🧰 Additional context used
🧠 Learnings (11)
📓 Common learnings
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4190
File: go/internal/services/keys/verifier.go:51-53
Timestamp: 2025-10-30T15:10:52.743Z
Learning: PR #4190 for unkeyed/unkey is focused solely on database schema and query changes for identity-based credits. It adds IdentityCredits and KeyCredits fields to structs and queries, but does not implement the priority enforcement logic in the usagelimiter. The logic implementation is intentionally deferred to a later PR in the stack.
📚 Learning: 2025-09-01T02:33:43.791Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/buf.gen.yaml:0-0
Timestamp: 2025-09-01T02:33:43.791Z
Learning: In the unkeyed/unkey repository, buf commands are executed from the `go/` directory where the `buf.yaml` file is located. This means the `out: gen` configuration in `go/buf.gen.yaml` generates files to `go/gen/` relative to the repository root, which aligns with the expected directory structure.
Applied to files:
go/Makefile
📚 Learning: 2025-09-01T01:57:42.227Z
Learnt from: imeyer
Repo: unkeyed/unkey PR: 3899
File: go/proto/metald/v1/metald.proto:5-9
Timestamp: 2025-09-01T01:57:42.227Z
Learning: In the unkeyed/unkey repository, buf is configured to properly resolve metald proto imports like "metald/v1/vm.proto" without needing the full "go/proto/" prefix. The buf lint command `buf lint --path proto/metald` passes successfully with these relative import paths.
Applied to files:
go/Makefile
📚 Learning: 2025-10-15T10:12:40.810Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 4098
File: go/proto/ctrl/v1/deployment.proto:33-36
Timestamp: 2025-10-15T10:12:40.810Z
Learning: In the Unkey codebase proto files (ctrl/v1/build.proto, ctrl/v1/deployment.proto, hydra/v1/deployment.proto), use `dockerfile_path` (not `docker_file_path`) for consistency in generated Go field names.
Applied to files:
go/Makefile
📚 Learning: 2025-09-12T08:01:20.792Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3944
File: go/pkg/db/acme_challenge_update_verified_with_expiry.sql_generated.go:31-39
Timestamp: 2025-09-12T08:01:20.792Z
Learning: Do not review or suggest changes to files with sql_generated.go suffix or other files marked as auto-generated (containing "Code generated by" comments), as these are generated by tools like sqlc and changes would be overwritten on regeneration.
Applied to files:
go/Makefile
📚 Learning: 2025-08-08T15:09:01.312Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3753
File: go/apps/api/openapi/config.yaml:9-10
Timestamp: 2025-08-08T15:09:01.312Z
Learning: Repo unkeyed/unkey: In go/apps/api/openapi, oapi-codegen doesn’t support OpenAPI 3.1 union nullability; overlay.yaml must be applied before codegen. The overlay key in oapi-codegen config isn’t supported—use a pre-step (programmatic or CLI) to merge overlay into the bundled spec, then run oapi-codegen.
Applied to files:
go/Makefile
📚 Learning: 2025-08-08T15:09:01.312Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3753
File: go/apps/api/openapi/config.yaml:9-10
Timestamp: 2025-08-08T15:09:01.312Z
Learning: Repo unkeyed/unkey: For go/apps/api/openapi, oapi-codegen is used and does not support OpenAPI 3.1 union types like [T, "null"]; an overlay step is required to downconvert to 3.0-style nullable before code generation.
Applied to files:
go/Makefile
📚 Learning: 2025-08-08T15:20:40.288Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3753
File: go/apps/api/openapi/config.yaml:9-10
Timestamp: 2025-08-08T15:20:40.288Z
Learning: Repo unkeyed/unkey: oapi-codegen v2.4+ (v2.5.0 in use) supports output-options.overlay in go/apps/api/openapi/config.yaml; the generator applies overlay.yaml at codegen time, so no separate pre-step is required if oapi-codegen is invoked with -config=config.yaml.
Applied to files:
go/Makefile
📚 Learning: 2025-08-08T15:09:01.312Z
Learnt from: Flo4604
Repo: unkeyed/unkey PR: 3753
File: go/apps/api/openapi/config.yaml:9-10
Timestamp: 2025-08-08T15:09:01.312Z
Learning: Repo unkeyed/unkey: In go/apps/api/openapi, overlays are required to downconvert 3.1 union nullability to 3.0-style nullable before running oapi-codegen; config.yaml’s output-options.overlay is not recognized by oapi-codegen, so overlays must be applied in a pre-step (programmatic or CLI) prior to codegen.
Applied to files:
go/Makefile
📚 Learning: 2025-04-08T09:34:24.576Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 2872
File: apps/dashboard/lib/trpc/routers/ratelimit/createNamespace.ts:36-39
Timestamp: 2025-04-08T09:34:24.576Z
Learning: In the Unkey dashboard, when making database queries involving workspaces, use `ctx.workspace.id` directly instead of fetching the workspace separately for better performance and security.
Applied to files:
apps/docs/analytics/getting-started.mdx
📚 Learning: 2025-04-08T09:34:24.576Z
Learnt from: ogzhanolguncu
Repo: unkeyed/unkey PR: 2872
File: apps/dashboard/lib/trpc/routers/ratelimit/createNamespace.ts:36-39
Timestamp: 2025-04-08T09:34:24.576Z
Learning: When querying or updating namespaces in the Unkey dashboard, always scope the operations to the current workspace using `eq(table.workspaceId, ctx.workspace.id)` to prevent cross-workspace access.
Applied to files:
apps/docs/analytics/getting-started.mdx
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test API / API Test Local
- GitHub Check: Test Go API Local / Test
- GitHub Check: Build / Build
- GitHub Check: Test Packages / Test
🔇 Additional comments (14)
apps/docs/analytics/query-examples.mdx (1)
89-157: Excellent addition ofsumIf()examples for outcome pivoting.The new "All outcomes in a single row" sections (lines 89-157) effectively demonstrate pivoting multiple outcome values into columns, which is a critical pattern for gap filling with WITH FILL. The examples are clear and properly formatted. The cURL examples correctly escape single quotes for JSON embedding.
apps/docs/analytics/schema-reference.mdx (1)
19-30: Automatic filtering documentation is clear and accurate.The updated Raw Events Table columns (lines 19-30) properly document that
workspace_idandkey_space_idare automatically filtered based on root key permissions. The descriptions clearly indicate users don't need to filter by workspace_id, and key_space_id is filtered when the root key is scoped to a single API.apps/docs/analytics/getting-started.mdx (1)
154-159: Automatic filtering note is clear and comprehensive.The new Note block (lines 154-159) effectively explains workspace-level and API-level automatic filtering, with clear distinction between
api.<api_id>.read_analytics(automatic) andapi.*.read_analytics(manual filtering required). This aligns well with the broader PR goal of clarifying filtering behavior.apps/docs/analytics/overview.mdx (2)
22-24: Automatic filtering note is concise and appropriate.The new Note (lines 22-24) provides a high-level overview of automatic workspace and API scoping, which serves as a good introductory reference pointing users to detailed docs if needed.
50-61: Available Data table is well-documented with clear filtering guidance.The updated table provides clear descriptions of each field and explicitly notes automatic filtering for
workspace_idand conditional automatic filtering forkey_space_id. The descriptions are consistent with the getting-started.mdx documentation.apps/docs/errors/user/bad_request/invalid_analytics_query.mdx (3)
39-51: api_id to key_space_id terminology update is correct and consistent.The replacement of
api_idwithkey_space_idin the "Missing comma" example (lines 39-51) aligns with the broader documentation updates and correctly reflects the actual schema. The examples remain syntactically correct.
59-69: Unclosed quote example correctly uses current terminology.The "Unclosed quote" example now properly uses
key_space_id(lines 60-62) instead of the outdatedapi_id, maintaining consistency with updated schema references.
104-107: Error resolution guidance is improved with schema reference link.The added guidance (line 107) to "Check the schema" with a link to the Schema Reference is helpful for users debugging query issues. Combined with the "Test incrementally" advice (line 105), this provides practical troubleshooting steps.
apps/docs/analytics/quick-reference.mdx (3)
134-154: WITH FILL example is clear and well-positioned in Quick Reference.The new "Filling Gaps in Time Series" section (lines 134-154) provides a concise example that demonstrates the gap-filling pattern. The tip on line 153 appropriately links to the detailed Query Examples section for more comprehensive guidance on hourly, daily, and monthly variants.
177-182: Automatic filtering note is consistent across documentation.The new Note (lines 177-182) correctly describes automatic workspace and API-scoped filtering, maintaining consistency with the getting-started.mdx and overview.mdx documentation. The clarification about
api.*.read_analyticsrequiring manualkey_space_idfiltering is particularly valuable in a quick reference context.
184-192: Time Ranges subsection improves usability of Quick Reference.The new "Time Ranges" subsection (lines 184-192) provides practical examples of common time-based WHERE clauses, making the Quick Reference more immediately actionable for users building queries.
go/pkg/clickhouse/query-parser/validation.go (1)
74-74: LGTM - sumif function added to whitelist.The addition of
sumifto the allowedFunctions map is correctly placed under conditional functions and follows the established naming pattern.go/Makefile (1)
47-47: LGTM - Cache proto generation enabled.The addition of
--path "./proto/cache"correctly extends the buf generate command to include cache proto definitions, following the established pattern of multiple path flags.go/go.mod (1)
11-11: LGTM - Parser dependency updated to support WITH FILL queries.The bump to v0.4.16 adds support for WITH FILL in ORDER BY, fulfilling the PR's stated objective.
Graphite Automations"Post a GIF when PR approved" took an action on this PR • (11/07/25)1 gif was posted to this PR based on Andreas Thomas's automation. |

What does this PR do?
Updates the clickhouse parser to the latest issue which now allows WITH FILL queries being parsed
AfterShip/clickhouse-sql-parser#211
adjust docs to talk about with fill and adds examples to it
removes select * from analytics docs
explains that the user doesnt need to filter by key_space_id and workspace_id
add analytics endpoint to sdk generation
fix cache file not being generated
Type of change
How should this be tested?
test one of the with fill queries against your local clickhouse and see it working
Checklist
Required
pnpm buildpnpm fmtmake fmton/godirectoryconsole.logsgit pull origin mainAppreciated