Skip to content

refactor: parameterize SQL queries in analytics repositories for improved security#2387

Merged
JivusAyrus merged 2 commits intomainfrom
suvij/eng-8607-wun-q425-11-multiple-sql-injection-instances-in-clickhouse
Dec 3, 2025
Merged

refactor: parameterize SQL queries in analytics repositories for improved security#2387
JivusAyrus merged 2 commits intomainfrom
suvij/eng-8607-wun-q425-11-multiple-sql-injection-instances-in-clickhouse

Conversation

@JivusAyrus
Copy link
Copy Markdown
Member

@JivusAyrus JivusAyrus commented Dec 3, 2025

Summary by CodeRabbit

  • Chores
    • Systematically upgraded database query infrastructure across analytics repositories and services, implementing improved parameter handling mechanisms for metrics, traces, usage analytics, dashboard views, and related data operations. These enhancements strengthen security and reliability throughout the analytics layer while maintaining full backward compatibility with existing application functionality.

✏️ Tip: You can customize this high-level summary in your review settings.

Checklist

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 3, 2025

Walkthrough

This PR systematically converts SQL query construction across 10 repository files from direct string interpolation to parameterized query bindings. Values for dates, IDs, multipliers, and granularity are now passed via typed parameter objects to queryPromise instead of being embedded directly in SQL strings.

Changes

Cohort / File(s) Summary
Core Repository Parameterization
controlplane/src/core/repositories/CacheWarmerRepository.ts
Replaced date literals and ID string interpolations with parameterized placeholders. Added single-quote escaping for operation hashes in IN clause construction. Introduced params objects for all affected queries.
Analytics Repository Parameterization
controlplane/src/core/repositories/analytics/AnalyticsDashboardViewRepository.ts, AnalyticsRequestViewRepository.ts, MetricsRepository.ts, MonthlyRequestViewRepository.ts, RouterMetricsRepository.ts, SubgraphMetricsRepository.ts, TraceRepository.ts
Consistently replaced inline SQL string interpolations with parameterized placeholders for graph IDs, organization IDs, date ranges, granularity, multipliers, quantiles, and service IDs. Each method now constructs and passes a params object to queryPromise.
Analytics Usage Repository
controlplane/src/core/repositories/analytics/UsageRepository.ts
Extended parameterization to three private methods: getUsageRequestSeries, getClientsWithOperations, and getMeta. Updated method signatures to accept a params parameter. Added single-quote escaping for dynamic field names in arrayJoinFields construction.
Service Parameterization
controlplane/src/core/services/SchemaUsageTrafficInspector.ts
Parameterized daysToConsider interval, NamedType/TypeName/FieldName predicates, and identifier filters. Added path segment escaping for single quotes and replaced direct value injections with placeholder-based bindings.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Consistency of pattern: The same parameterization approach is applied uniformly across all 10 files, reducing cognitive load for verification
  • Scope and repetition: While touching many files, the changes are largely mechanical transformations of the same refactoring template
  • Areas requiring extra attention:
    • UsageRepository.ts — Three private method signatures updated; verify all callers pass the new params argument correctly
    • CacheWarmerRepository.ts and SchemaUsageTrafficInspector.ts — Single-quote escaping logic for dynamic values; ensure escaping prevents injection without over-escaping
    • Parameter type annotations (:UInt32, :String, :Float64) — Confirm type correctness for ClickHouse query binding
    • Query result shapes — Verify parameterization doesn't affect result deserialization

Possibly related PRs

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main objective of the PR—parameterizing SQL queries in analytics repositories for improved security—which is the primary change across all modified files.
✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@StarpTech StarpTech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
controlplane/src/core/repositories/analytics/UsageRepository.ts (1)

116-144: Align placeholder types to UInt32 in getMeta for consistency

The getMeta method uses {startDate:String} and {endDate:String} placeholders, but parseTimeFilters returns numeric Unix timestamp values (via getUnixTimeInSeconds), which are passed as dateRange.start and dateRange.end. This misaligns with getUsageRequestSeries and getClientsWithOperations, which both correctly use {startDate:UInt32} and {endDate:UInt32} for the same numeric values from the same TimeFilters object.

Change lines 127-128:

-      toDateTime({startDate:String}) AS startDate,
-      toDateTime({endDate:String}) AS endDate
+      toDateTime({startDate:UInt32}) AS startDate,
+      toDateTime({endDate:UInt32}) AS endDate
🧹 Nitpick comments (3)
controlplane/src/core/repositories/analytics/AnalyticsDashboardViewRepository.ts (2)

289-306: Same IN clause pattern as above.

The subgraph ID handling here follows the same escaped interpolation pattern. If the array parameter approach is adopted for getSubgraphRates, apply it here as well for consistency.


227-252: SubgraphID IN clause uses string interpolation instead of parameterized approach.

While the subgraph IDs are escaped, they're still interpolated into the SQL string. Consider parameterizing this as an array once the ClickHouseClient is enhanced to support Array(String) serialization. Currently, the client only handles primitive types (string, number, boolean) passed as URL query parameters. The proper pattern would be:

-    // Properly escape subgraph IDs for SQL
-    const escapedSubgraphIds = subgraphs.map((s) => `'${s.id.replace(/'/g, "''")}'`).join(',');
...
-      AND SubgraphID IN (${escapedSubgraphIds})
+      AND SubgraphID IN ({subgraphIds:Array(String)})
...
     const params = {
       start: dateRange.start,
       end: dateRange.end,
       multiplier,
       federatedGraphId,
       organizationId,
+      subgraphIds: subgraphs.map((s) => s.id),
     };

This applies similarly to getSubgraphLatency at lines 270–306. Requires enhancing ClickHouseClient._getRequestOptions to serialize array parameters.

controlplane/src/core/repositories/analytics/UsageRepository.ts (1)

195-281: Escaping in getUnusedFields hardens tuple construction against injection

The new arrayJoinFields logic escapes single quotes in Field.name and Field.typeName before embedding them in the tuple literal, which is the main character needed to safely stay inside SQL string literals. Combined with parameterized OrganizationID / FederatedGraphID / date bounds, this significantly reduces injection risk for these dynamically generated tuples.

If you later want to go further, you could consider generating these tuples via repeated placeholders instead of literal construction, but that’s not strictly necessary given the current escaping.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5376bd9 and cac93fb.

📒 Files selected for processing (10)
  • controlplane/src/core/repositories/CacheWarmerRepository.ts (3 hunks)
  • controlplane/src/core/repositories/analytics/AnalyticsDashboardViewRepository.ts (8 hunks)
  • controlplane/src/core/repositories/analytics/AnalyticsRequestViewRepository.ts (2 hunks)
  • controlplane/src/core/repositories/analytics/MetricsRepository.ts (14 hunks)
  • controlplane/src/core/repositories/analytics/MonthlyRequestViewRepository.ts (1 hunks)
  • controlplane/src/core/repositories/analytics/RouterMetricsRepository.ts (2 hunks)
  • controlplane/src/core/repositories/analytics/SubgraphMetricsRepository.ts (12 hunks)
  • controlplane/src/core/repositories/analytics/TraceRepository.ts (3 hunks)
  • controlplane/src/core/repositories/analytics/UsageRepository.ts (15 hunks)
  • controlplane/src/core/services/SchemaUsageTrafficInspector.ts (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-29T10:28:04.846Z
Learnt from: JivusAyrus
Repo: wundergraph/cosmo PR: 2156
File: controlplane/src/core/repositories/SubgraphRepository.ts:1749-1751
Timestamp: 2025-08-29T10:28:04.846Z
Learning: In the controlplane codebase, authentication and authorization checks (including organization scoping) are handled at the service layer in files like unlinkSubgraph.ts before calling repository methods. Repository methods like unlinkSubgraph() in SubgraphRepository.ts can focus purely on data operations without redundant security checks.

Applied to files:

  • controlplane/src/core/repositories/analytics/AnalyticsRequestViewRepository.ts
  • controlplane/src/core/repositories/analytics/TraceRepository.ts
  • controlplane/src/core/repositories/analytics/SubgraphMetricsRepository.ts
  • controlplane/src/core/services/SchemaUsageTrafficInspector.ts
📚 Learning: 2025-09-10T09:59:17.257Z
Learnt from: JivusAyrus
Repo: wundergraph/cosmo PR: 2156
File: controlplane/src/core/bufservices/subgraph/checkSubgraphSchema.ts:417-419
Timestamp: 2025-09-10T09:59:17.257Z
Learning: When skipTrafficCheck is true in schema checks, the traffic inspection loop is completely bypassed (as shown in line 2040 of SubgraphRepository.ts with the continue statement), which means hasClientTraffic will be false. The traffic analysis doesn't run at all when skipTrafficCheck is true, unlike the previous understanding that it always ran to collect usage data.

Applied to files:

  • controlplane/src/core/services/SchemaUsageTrafficInspector.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build_test
  • GitHub Check: build_push_image
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (32)
controlplane/src/core/repositories/CacheWarmerRepository.ts (1)

49-96: LGTM! Proper parameterization implemented.

The SQL query has been correctly converted to use typed parameter bindings for all dynamic values (dates, IDs, quantile, limits). This eliminates SQL injection risks and improves maintainability.

controlplane/src/core/repositories/analytics/MonthlyRequestViewRepository.ts (1)

14-22: LGTM!

Clean parameterization of the OrganizationID filter. The placeholder syntax {organizationId:String} and params object pattern are correctly implemented for ClickHouse.

controlplane/src/core/repositories/analytics/AnalyticsRequestViewRepository.ts (2)

504-506: LGTM!

Correct parameterization for client name filtering. The comment clarifies the intent.


720-722: LGTM!

The security-critical scoping of data to organization and graph is now properly parameterized, preventing SQL injection in these authorization filters.

controlplane/src/core/repositories/analytics/TraceRepository.ts (2)

64-66: LGTM!

Good parameterization of all security-sensitive fields in both the base and recursive parts of the CTE. The {traceID:String} AS trace_id binding in the WITH clause is valid ClickHouse syntax.

Also applies to: 76-79


97-104: LGTM!

Clean params object construction with all required parameters passed to queryPromise.

controlplane/src/core/services/SchemaUsageTrafficInspector.ts (2)

67-77: LGTM!

Correct parameterization for namedType, typeName, and fieldName filters.


95-98: LGTM!

Date interval and ID filters are properly parameterized with correct ClickHouse type annotations.

controlplane/src/core/repositories/analytics/AnalyticsDashboardViewRepository.ts (4)

34-49: LGTM!

Clean parameterization for the weekly request series query.


75-106: LGTM!

Correct parameterization including the granule interval. The {granule:UInt32} type annotation is appropriate for time intervals.


130-144: LGTM!

Properly parameterized date range and ID filters.


168-187: LGTM!

The {multiplier:Float64} type annotation is correct for the rate calculation divisor.

controlplane/src/core/repositories/analytics/SubgraphMetricsRepository.ts (5)

74-210: Request‑rate metrics parameterization is consistent and complete

getSubgraphRequestRateMetrics now correctly builds params objects (queryRate, top5Params, querySeries) that cover all placeholders used in the SQL (startDate, endDate, organizationId, subgraphId, multiplier, granule) and merges in queryParams for filter placeholders. This looks functionally equivalent to the interpolated version and avoids direct value injection.


215-405: Latency metrics queries use fully bound parameters without regressions

In getSubgraphLatencyMetrics, the new params objects for queryLatency, queryTop5, and querySeries correctly provide startDate, endDate, organizationId, subgraphId, granule, and numeric quantile for the {…} placeholders, alongside queryParams. This keeps the histogram logic intact while removing inline interpolation.


411-557: Error percentage + series parameterization looks correct

For getSubgraphErrorMetrics, both queryPercentage, top5Params, and getSeries are now parameterized with the expected keys and still reuse whereSql plus queryParams for dynamic filters. I don’t see any missing or unused parameters here.


563-605: Error‑rate series now safely parameterized

getSubgraphErrorRateMetrics’s single params object aligns with all placeholders in the series query (startDate, endDate, organizationId, subgraphId, granule). The change is straightforward and removes the remaining interpolation of IDs and time bounds.


749-782: Filter query parameterization for subgraph metrics is sound

getSubgraphMetricFilters now binds startDate, endDate, organizationId, and subgraphId as parameters and passes them via queryPromise(query, params), while leaving the projection and grouping unchanged. This is consistent with the rest of the analytics repositories in this PR.

controlplane/src/core/repositories/analytics/MetricsRepository.ts (8)

62-198: Request‑rate metrics parameterization matches placeholders

In getRequestRateMetrics, the new params, top5Params, and querySeries params objects correctly supply startDate, endDate, organizationId, federatedGraphId, multiplier, and granule plus queryParams for filter placeholders. The SQL only references those names, so this is a clean, safe replacement for string interpolation.


203-394: Latency metrics parameter binding is coherent across all helpers

getLatencyMetrics’s queryLatency, queryTop5, and querySeries each construct params objects that fully cover the used placeholders (startDate, endDate, organizationId, federatedGraphId, granule, quantile) and preserve the histogram logic. Using Number.parseFloat(quantile) keeps the quantile numeric while avoiding inline literals.


399-546: Error metrics queries are correctly parameterized

For getErrorMetrics, both queryPercentage and the top‑5 / series queries now take params objects that supply startDate, endDate, organizationId, federatedGraphId, and granule along with queryParams for dynamic filters. This eliminates direct interpolation of IDs and times without changing the aggregation semantics.


551-606: Error‑rate series parameterization mirrors other series helpers

getErrorRateMetrics’s params object lines up with all {…} placeholders in the SQL (startDate, endDate, organizationId, federatedGraphId, granule). The mapping over the result set is unchanged, so this should be a safe refactor.


719-783: Metric filters query now safely binds IDs and date range

getMetricFilters adds a params object for startDate, endDate, organizationId, and federatedGraphId and passes it to queryPromise. That matches all placeholders in the filter query and keeps the filter options construction logic intact.


820-1051: Operations listing: parameter coverage for pagination, search, and deprecated CTEs looks complete

In getOperations, queryParamsWithPagination correctly merges:

  • filter queryParams
  • pagination (limit, offset)
  • IDs and time bounds (organizationId, federatedGraphId, startDate, endDate)
  • optional searchQueryPattern
  • optional deprecatedStart/deprecatedEnd when deprecatedFields.length > 0

All branches (latency/requests/errors) reference only these placeholders (plus any from whereSql built from queryParams), so the refactor removes inline values without introducing missing/bogus params.


1068-1208: Operations count: parameterized variants for both deprecated and non‑deprecated paths are consistent

getOperationsCount now:

  • Uses the same {startDate:UInt32}, {endDate:UInt32}, {organizationId:String}, {federatedGraphId:String}, and optional {searchQueryPattern:String} placeholders as other queries.
  • Supplies them via the inline queryPromise calls’ params objects in both the deprecated‑only and general cases.

This keeps behavior aligned with getOperations while avoiding raw interpolation of IDs, dates, and search text.


1211-1238: Clients list query: IDs and date range now bound instead of interpolated

getClients now builds a params object for startDate, endDate, organizationId, and federatedGraphId, matching the placeholders in the query. The mapping of results is unchanged; this is a straightforward hardening.

controlplane/src/core/repositories/analytics/UsageRepository.ts (5)

15-58: Usage request series: shared params and placeholders are aligned

getUsageRequestSeries now accepts a params object, extends it with startDate, endDate, and granule, and uses {startDate:UInt32}, {endDate:UInt32}, {granule:UInt32} placeholders. Combined with whereSql‑driven placeholders from params, this removes inline IDs/timestamps cleanly.


60-114: Clients‑with‑operations query uses params consistently

getClientsWithOperations follows the same pattern—merging params with startDate/endDate and binding them via {startDate:UInt32} / {endDate:UInt32}. That matches the SQL and keeps the grouping logic untouched.


150-193: Shared params in getFieldUsage are wired correctly into all three helpers

getFieldUsage now builds a single params object (IDs + optional typename/field/namedType) and a whereSql that references {…:String} placeholders. Passing that same whereSql/timeFilters/params triple into getUsageRequestSeries, getClientsWithOperations, and getMeta keeps everything in sync and centralizes the binding logic.


283-366: getUsedFields applies the same safe escaping and parameterization pattern

getUsedFields mirrors getUnusedFields: escaping single quotes in field names/typeNames, using {startDate:UInt32} / {endDate:UInt32} for time bounds, and binding organizationId / federatedGraphId as parameters. This keeps behavior consistent between “used” and “unused” queries.


368-460: Deprecated‑fields‑per‑operation query is correctly parameterized

getDeprecatedFieldsUsedInOperation:

  • Builds deprecatedFieldsArray with names/typeNames that escape single quotes.
  • Uses {startDate:UInt32}, {endDate:UInt32}, {organizationId:String}, {federatedGraphId:String}, {operationHash:String}, and optional {operationName:String} placeholders.
  • Supplies all of these keys in the params object, only adding operationName when needed.

That’s a solid hardening over any previous string interpolation.

controlplane/src/core/repositories/analytics/RouterMetricsRepository.ts (2)

29-64: Router runtime query now binds IDs via params as intended

getRouterRuntime correctly replaces direct embedding of FederatedGraphID, OrganizationID, and ServiceInstanceID with {…:String} placeholders and passes a matching params object. The aggregation logic over MetricName / MetricValue is untouched.


144-181: Active routers query parameterization is correct and minimal

getActiveRouters now uses {federatedGraphId:String} and {organizationId:String} placeholders and binds them via the params object. This achieves the PR’s security goal here without altering the shape of the returned RouterDTO[].

Comment thread controlplane/src/core/repositories/CacheWarmerRepository.ts
Comment thread controlplane/src/core/services/SchemaUsageTrafficInspector.ts
@JivusAyrus JivusAyrus changed the title refactor: parameterize SQL queries in analytics repositories for improved security and maintainability refactor: parameterize SQL queries in analytics repositories for improved security Dec 3, 2025
@JivusAyrus JivusAyrus merged commit 034d6a1 into main Dec 3, 2025
9 of 11 checks passed
@JivusAyrus JivusAyrus deleted the suvij/eng-8607-wun-q425-11-multiple-sql-injection-instances-in-clickhouse branch December 3, 2025 17:29
asoorm pushed a commit that referenced this pull request Dec 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants