Skip to content

feat: multiple otel profiles#2567

Open
sammaji wants to merge 2 commits intomainfrom
04-08-feat_multiple_otel_profiles
Open

feat: multiple otel profiles#2567
sammaji wants to merge 2 commits intomainfrom
04-08-feat_multiple_otel_profiles

Conversation

@sammaji
Copy link
Copy Markdown
Member

@sammaji sammaji commented Apr 8, 2026

Summary

Adds support for multiple OTEL collector profiles, allowing traces to be exported to multiple destinations simultaneously. The OTEL plugin configuration has been restructured to support an array of profiles, each with independent settings for collector URL, protocol, headers, TLS configuration, and metrics export.

Changes

  • Added SafeExtractInt64 utility function for safe type conversion from any to int64
  • Restructured OTEL plugin configuration to support multiple profiles instead of single collector
  • Added database migration to wrap legacy single-collector configs into new multi-profile format
  • Refactored OTEL plugin architecture with separate OtelProfile struct for each collector destination
  • Enhanced type conversion in OTEL span attributes to handle all integer and float slice types using generics
  • Consolidated TLS configuration logic into shared buildTLSConfig utility function
  • Updated UI to support multiple profiles with collapsible cards and profile management
  • Added profile enable/disable toggle and remove functionality in the UI
  • Improved metrics exporter initialization and error handling per profile
  • Enhanced header environment variable injection with proper error handling

Type of change

  • Feature
  • Refactor

Affected areas

  • Core (Go)
  • Plugins
  • UI (Next.js)

How to test

Configure multiple OTEL profiles in the observability settings:

{
  "enabled": true,
  "otel_config": {
    "profiles": [
      {
        "enabled": true,
        "service_name": "bifrost-jaeger",
        "collector_url": "http://jaeger:14268/api/traces",
        "protocol": "http",
        "trace_type": "otel"
      },
      {
        "enabled": true,
        "service_name": "bifrost-datadog",
        "collector_url": "https://trace.agent.datadoghq.com:443",
        "protocol": "http",
        "headers": {
          "DD-API-KEY": "env.DD_API_KEY"
        },
        "trace_type": "otel"
      }
    ]
  }
}

Send requests and verify traces appear in both collectors. Test profile enable/disable functionality and metrics export per profile.

Screenshots/Recordings

The UI now displays multiple collapsible profile cards with individual controls for each OTEL destination, replacing the single configuration form.

Breaking changes

  • Yes

The OTEL plugin configuration format has changed from a flat structure to a profiles array. Legacy configurations are automatically migrated via database migration, but any external tooling or documentation referencing the old format will need updates.

Migration: Single collector configs like {"collector_url": "...", "protocol": "http"} become {"profiles": [{"collector_url": "...", "protocol": "http"}]}.

Related issues

Enables multi-destination observability for enterprise deployments requiring traces in multiple systems simultaneously.

Security considerations

  • Enhanced CA certificate path validation prevents path traversal attacks
  • Environment variable injection in headers includes proper validation
  • TLS configuration consolidated with consistent security defaults

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 8, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Multiple OpenTelemetry profiles with per-profile enable/disable, dynamic UI to add/remove profiles, and per-profile initialization/shutdown.
    • Per-profile metrics exporters and trace-to-metrics extraction (span metrics, token/cost reporting, first-token latency).
  • Improvements

    • Environment-variable substitution for collector headers.
    • Centralized TLS handling and simplified insecure/plaintext credential selection; clearer per-profile error cleanup and lifecycle.
  • Updates

    • OTEL config/schema now requires profiles, narrows trace_type, and includes migrations from legacy single-collector configs.

Walkthrough

Refactors the OTEL plugin to support multiple per-profile collectors; adds TLS/header helpers, safe numeric extractors, and slice-to-attribute helpers; moves converter logic into profile scope; enhances metrics extraction; updates frontend/schema/forms for profiles; and adds a DB migration to convert legacy OTEL config.

Changes

Cohort / File(s) Summary
Core utilities
core/schemas/utils.go
Added exported SafeExtractInt64(value any) (int64, bool) and a minor whitespace/alignment tweak in DeepCopyChatTool.
OTEL converter & attributes
plugins/otel/converter.go
Moved several methods to *OtelProfile; made convertSpanToOTELSpan a package-level function; consolidated numeric handling to use schemas.SafeExtractInt64/SafeExtractFloat64; added generic intSliceToKeyValue and floatSliceToAttribute.
OTEL TLS & header helpers
plugins/otel/utils.go
Added injectEnvToHeaders, validateCACertPath, and buildTLSConfig to centralize env header substitution and TLS/CA handling.
Transport TLS delegation
plugins/otel/grpc.go, plugins/otel/http.go
Removed inline CA/PEM parsing and system-root path handling; delegate TLS construction to buildTLSConfig; use plaintext creds only when tlsCACert=="" && insecureMode.
OTEL plugin core
plugins/otel/main.go
Refactored to multi-profile model: added OtelProfileConfig/OtelProfile; Config now contains Profiles; per-profile Init/Inject/Cleanup and metrics handling; added env attribute loading, per-profile defaults, and error-path cleanup; removed ValidateConfig/GetMetricsExporter.
OTEL metrics & extraction
plugins/otel/metrics.go
Introduced generic onceCounter wrappers and constructors, changed NewMetricsExporter signature to include version, delegated TLS selection to helper, and added recordMetricsFromTrace to extract/record span-based metrics.
Config schema
transports/config.schema.json
Replaced flat OTEL config with required profiles array (minItems:1); moved former top-level fields into profiles[] items; added profiles[].enabled and additionalProperties:false; narrowed trace_type enum to genai_extension.
Frontend form UI
ui/app/workspace/observability/fragments/otelFormFragment.tsx
Reworked form to manage otel_config.profiles[] with add/remove profile cards, per-profile defaults, indexed bindings, per-profile validation triggers, and updated prop shape.
Frontend types/schemas
ui/lib/types/schemas.ts
Added otelProfileConfigSchema and exported OtelProfileConfigSchema type; otelConfigSchema now requires non-empty profiles[]; validation updated to per-profile paths.
Migrations
framework/configstore/migrations.go
Added migrationMigrateOtelConfigToProfiles and wired it into triggerMigrations to convert legacy single OTEL config into profiles[0]; no-op when already migrated.
Minor module & docs
plugins/otel/go.mod, plugins/otel/changelog.md
Marked github.com/bytedance/sonic as // indirect; added changelog entry feat: adds support for multiple otel profiles.
No-op / whitespace
transports/bifrost-http/server/plugins.go, .github/.../helm-release.yml, .github/.../validate-schema-sync.sh
Renamed unused ctx param to _ and performed trailing-newline/whitespace adjustments only.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Plugin as OtelPlugin
    participant Profile as OtelProfile
    participant Conv as Converter
    participant OTLP as OTLPClient
    participant Metrics as MetricsExporter

    Client->>Plugin: Inject(trace)
    Plugin->>Profile: forward to each enabled profile
    Profile->>Conv: convertTraceToResourceSpan(trace)
    Conv-->>Profile: ResourceSpan
    Profile->>OTLP: Export(ResourceSpan)
    Profile->>Metrics: recordMetricsFromTrace(trace)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇
I hopped through profiles, one by one,
TLS and headers neatly spun.
Traces converted, metrics played,
Old configs migrated, none delayed.
Carrots for every collector—yum! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 71.05% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: multiple otel profiles' clearly and concisely summarizes the main change—adding support for multiple OTEL collector profiles. It is specific, directly matches the changeset's primary objective, and follows conventional commit format.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering all key template sections including summary, detailed changes, type of change, affected areas, testing instructions, breaking changes disclosure, security considerations, and a complete checklist.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch 04-08-feat_multiple_otel_profiles

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Member Author

sammaji commented Apr 8, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@sammaji sammaji marked this pull request as ready for review April 8, 2026 14:57
Comment thread plugins/otel/utils.go
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 8, 2026

Confidence Score: 3/5

Not safe to merge as-is — the form validation bug makes the UI unusable when adding a new profile, and the global meter provider overwrite silently drops multi-profile metrics data.

Two P1 findings: the Zod superRefine ignoring per-profile enabled breaks the primary UX flow for the feature (adding a second profile), and the otel.SetMeterProvider overwrite is a silent data-routing bug for multi-profile metrics setups — the exact scenario this PR introduces. Both need fixes before the feature works correctly end-to-end.

ui/lib/types/schemas.ts (validation logic) and plugins/otel/metrics.go (global meter provider) require attention before merge.

Important Files Changed

Filename Overview
plugins/otel/main.go Refactored to support multiple collector profiles; Init fan-out logic is correct, and the env. header injection for nil-Enabled profiles flagged in a prior thread has been fixed.
plugins/otel/metrics.go Per-profile lazy-initialized metric wrappers are well structured, but otel.SetMeterProvider(provider) is called once per profile — with multiple metrics-enabled profiles the global provider is overwritten by the last one, silently misrouting metrics from any code using the global OTEL API.
plugins/otel/utils.go Consolidated TLS config and CA cert path validation; symlink rejection and absolute-path enforcement are solid security improvements.
plugins/otel/converter.go Generic slice converters for int/float types are clean; anyToKeyValue now handles all numeric slice variants correctly.
ui/lib/types/schemas.ts The superRefine URL check ignores per-profile enabled flags, blocking saves whenever an unfilled disabled profile exists — newly added profiles (disabled by default) immediately invalidate the form.
ui/app/workspace/observability/fragments/otelFormFragment.tsx Multi-profile card UI with collapsible headers and per-profile enable/disable is well built; currentConfig interface now correctly includes the profiles field.
framework/configstore/migrations.go Two migrations added: normalize_otel_trace_type (runs first) and migrate_otel_config_to_profiles; ordering is correct and the no-op guard prevents double-migration.
core/schemas/utils.go New SafeExtractInt64 covers all numeric types; the uint64→int64 silent overflow for values > MaxInt64 was flagged in a prior thread and remains unaddressed.
plugins/otel/grpc.go gRPC client correctly uses insecure.NewCredentials() for plaintext mode and credentials.NewTLS for TLS; no issues found.
plugins/otel/http.go HTTP client delegates fully to buildTLSConfig; connection pooling settings and 30s timeout are reasonable.
transports/bifrost-http/server/plugins.go OTEL plugin instantiation unchanged in structure; cleanly delegates marshalling to the new Config.Profiles shape.

Comments Outside Diff (1)

  1. plugins/otel/metrics.go, line 182 (link)

    P1 Global meter provider overwritten by last metrics-enabled profile

    otel.SetMeterProvider(provider) is called once per NewMetricsExporter invocation. With multiple profiles that have metrics_enabled: true, each call replaces the previous global meter provider. After Init finishes, only the last profile's provider is the global one. Any OTEL SDK instrumentation outside MetricsExporter (auto-instrumented libraries, middleware, HTTP handlers using otel.GetMeterProvider()) will silently send all their metrics only to the final profile's collector — dropping data for every earlier profile.

    If the intent is simply to expose the most-recently-created provider globally (e.g. for tests or external libraries), each profile should conditionally set it, or the global set should be removed entirely since m.meter is already bound to the right provider internally.

Reviews (12): Last reviewed commit: "updates helm-chart checks (#2718)" | Re-trigger Greptile

Comment thread core/schemas/utils.go
Comment thread transports/config.schema.json
Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx
Comment thread plugins/otel/utils.go
Comment thread plugins/otel/utils.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
ui/lib/types/schemas.ts (2)

592-665: ⚠️ Potential issue | 🟠 Major

Disabled OTEL profiles still fail validation.

Both refinements ignore profile.enabled, so a disabled card still needs a collector URL and a valid metrics endpoint. That makes the new per-profile enable/disable toggle ineffective.

Suggested fix
 export const otelProfileConfigSchema = z
 	.object({
 		enabled: z.boolean().default(true),
@@
 	})
 	.superRefine((data, ctx) => {
+		if (!data.enabled) return;
+
 		const protocol = data.protocol;
@@
 export const otelFormSchema = z
 	.object({
 		enabled: z.boolean().default(true),
 		otel_config: otelConfigSchema,
 	})
 	.superRefine((data, ctx) => {
 		if (data.enabled) {
 			data.otel_config.profiles.forEach((profile, i) => {
+				if (!profile.enabled) return;
+
 				const collectorUrl = (profile.collector_url || "").trim();
 				if (!collectorUrl) {
 					ctx.addIssue({
 						code: "custom",
 						path: ["otel_config", "profiles", i, "collector_url"],

Also applies to: 680-689

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/lib/types/schemas.ts` around lines 592 - 665, The refinement in
superRefine currently validates collector_url and metrics_endpoint regardless of
the profile's enabled flag; update the logic inside the superRefine callback
(where validateHttpUrl and validateHostPort are used) to first check the
profile.enabled flag and skip all collector/metrics validation when
profile.enabled is false. Specifically, for the collector_url checks
(collector_url + protocol) and the metrics_enabled/metrics_endpoint checks, wrap
them with a guard that returns/does nothing if data.profile?.enabled === false
(or equivalent), so disabled profiles do not require collector_url or
metrics_endpoint validation.

584-586: ⚠️ Potential issue | 🟠 Major

Don't default new OTEL profiles to insecure transport.

This makes secure transport opt-in. It also changes older profiles that omit the field: parsing and re-saving from the UI will now persist insecure: true, which means TLS verification is skipped for HTTP and plaintext is used for gRPC.

Suggested fix
-		insecure: z.boolean().default(true),
+		insecure: z.boolean().default(false),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/lib/types/schemas.ts` around lines 584 - 586, The schema currently
defaults the OTEL profile field insecure to true (insecure:
z.boolean().default(true)), making insecure transport the default; change this
so secure transport is the default by removing the unsafe default — either make
insecure optional (insecure: z.boolean().optional()) or set insecure:
z.boolean().default(false); update the schema where tls_ca_cert and insecure are
defined so missing older profiles are not converted to insecure when parsed and
re-saved (refer to the insecure and tls_ca_cert fields in the schema).
🧹 Nitpick comments (2)
ui/app/workspace/observability/fragments/otelFormFragment.tsx (2)

163-204: Missing data-testid on interactive profile header elements.

Per coding guidelines, new interactive UI elements need data-testid attributes. The collapsible profile header button, profile enable/disable switch, and remove profile button are missing test IDs.

Suggested data-testid additions
 								<button
 									type="button"
 									className="flex w-full items-center justify-between px-4 py-3 text-left"
 									onClick={() => setExpandedProfiles((prev) => ({ ...prev, [field.id]: !prev[field.id] }))}
+									data-testid={`otel-profile-header-btn-${index}`}
 								>
 												<Switch
 													checked={enabledField.value}
 													onCheckedChange={enabledField.onChange}
 													onClick={(e) => e.stopPropagation()}
 													disabled={!hasOtelAccess}
 													aria-label="Enable profile"
+													data-testid={`otel-profile-enable-toggle-${index}`}
 												/>
 											<Button
 												type="button"
 												variant="ghost"
 												size="icon"
 												className="size-7"
 												onClick={(e) => {
 													e.stopPropagation();
 													remove(index);
 												}}
 												disabled={!hasOtelAccess}
 												title="Remove profile"
+												data-testid={`otel-profile-remove-btn-${index}`}
 											>

As per coding guidelines: ui/**/*.{tsx,ts}: Add new interactive UI elements with data-testid attributes following the pattern: data-testid="<entity>-<element>-<qualifier>".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
163 - 204, Add data-testid attributes to the interactive elements in the profile
header: add data-testid="profile-header-button" to the outer button that toggles
expansion (the onClick using setExpandedProfiles), add
data-testid="profile-enabled-switch" to the Switch rendered inside the FormField
for name `otel_config.profiles.${index}.enabled`, and add
data-testid="profile-remove-button" to the remove Button that calls
remove(index); keep existing onClick handlers (stopPropagation/remove) and
disabled logic intact so behavior doesn't change.

461-471: Missing data-testid on Add Profile button.

Suggested fix
 				<Button
 					type="button"
 					variant="outline"
 					size="sm"
 					onClick={() => append({ ...DEFAULT_PROFILE })}
 					disabled={!hasOtelAccess}
 					className="flex items-center gap-2"
+					data-testid="otel-profile-add-btn"
 				>
 					<Plus className="size-4" />
 					Add Profile
 				</Button>

As per coding guidelines: ui/**/*.{tsx,ts}: Add new interactive UI elements with data-testid attributes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
461 - 471, The Add Profile Button (the Button that calls append({
...DEFAULT_PROFILE }) and uses hasOtelAccess) is missing a data-testid; add a
data-testid attribute (e.g., data-testid="add-profile-button" or following
project test id conventions) to that Button in otelFormFragment.tsx so tests can
target this interactive element while keeping the existing props (type, variant,
size, onClick, disabled, className) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/utils.go`:
- Around line 403-445: SafeExtractInt64 currently casts uint/uint64 and floats
directly which can overflow or silently truncate; update SafeExtractInt64 to
validate ranges and reject unsafe conversions: for unsigned integers (uint,
uint64, uint32, etc.) check value <= math.MaxInt64 before casting; for signed
integers ensure within [math.MinInt64, math.MaxInt64] where applicable; for
float32/float64 first reject math.IsNaN and math.IsInf, then require the value
to be within [math.MinInt64, math.MaxInt64] and equal to math.Trunc(f) (i.e., no
fractional part) before converting to int64; keep json.Number handling but
ensure its Int64() result is validated similarly; leave string handling via
strconv.ParseInt as-is (it already rejects out-of-range), and return (0,false)
for any value that fails these checks.

In `@plugins/otel/converter.go`:
- Around line 224-233: The function intSliceToKeyValue currently ignores the
boolean from schemas.SafeExtractInt64 and silently emits 0 for unrepresentable
values; update intSliceToKeyValue to check the returned ok flag for each element
from SafeExtractInt64 (call site: intSliceToKeyValue and functions
SafeExtractInt64), only append an AnyValue when ok is true, skip invalid
elements, and if no valid elements remain return nil instead of emitting zeros;
keep the final return using kvAny(key, arrValue(...)) when there is at least one
valid element.

In `@plugins/otel/utils.go`:
- Around line 32-57: The validateCACertPath function is too strict: instead of
requiring absolute paths and rejecting symlinks, allow relative paths by
resolving them to an absolute path and follow symlinks before validating;
specifically, in validateCACertPath(certPath string) remove the absolute-path
rejection of cleanPath, call filepath.Abs(cleanPath) (or filepath.Join with a
known base) then resolve symlinks with filepath.EvalSymlinks (or similar) to get
the realPath, use os.Stat(realPath) (not os.Lstat) to verify existence and
ensure info.Mode().IsRegular() on the resolved target; keep clear error messages
but do not blanket-reject symlink mounts used by k8s ConfigMaps/Secrets.
- Around line 42-56: Update the error messages in plugins/otel/utils.go to
follow Go error string conventions: make them start lowercase and avoid
unnecessary trailing punctuation. Specifically, change the fmt.Errorf messages
that reference certPath/cleanPath (the ones in the os.Lstat error branch and the
subsequent symlink and regular-file checks) to begin with "tls" (e.g., "tls ca
cert path must be absolute: %s", "tls ca cert path not accessible: %w", "tls ca
cert path cannot be a symlink: %s", "tls ca cert path is not a regular file:
%s") so they are lowercase and preserve the formatting verbs without adding
terminal punctuation.

In `@transports/config.schema.json`:
- Around line 1352-1395: The schema currently allows both URI and host:port for
collector_url and metrics_endpoint regardless of protocol which causes runtime
failures in plugins/otel/http.go and plugins/otel/grpc.go; update
transports/config.schema.json to add JSON Schema if/then blocks keyed on the
"protocol" property so that when "protocol" == "http" the collector_url and
metrics_endpoint must match an http/https URI (format uri and/or pattern
enforcing ^https?://), and when "protocol" == "grpc" they must match host:port
(pattern with hostname/IP and a valid port range 1-65535); apply the same
conditional rules for both "collector_url" and "metrics_endpoint" and keep
existing descriptions/defaults otherwise.
- Around line 1414-1417: The description for the "insecure" boolean is
misleading because runtime treats it differently for HTTP vs gRPC; update
transports/config.schema.json's "insecure" description to state both behaviors
(e.g., "When true: HTTP skips TLS verification (ignored if tls_ca_cert is set);
for gRPC this switches to plaintext credentials in plugins/otel/grpc.go") or
alternatively change the runtime in plugins/otel/grpc.go to match the current
description; ensure you reference the tls_ca_cert interaction and that
transports/config.schema.json remains the authoritative config definition.

In `@ui/lib/types/schemas.ts`:
- Around line 573-577: The trace_type enum in ui/lib/types/schemas.ts is out of
sync with the authoritative transports/config.schema.json; update the enum used
by trace_type so it matches the backend schema (i.e., remove "genai_extension"
and "vercel" and leave only "otel"), or if new trace types are intended, update
transports/config.schema.json to include those values and ensure both places
accept the same set; modify the z.enum call that defines trace_type to match the
authoritative schema and run validation to confirm consistency.

---

Outside diff comments:
In `@ui/lib/types/schemas.ts`:
- Around line 592-665: The refinement in superRefine currently validates
collector_url and metrics_endpoint regardless of the profile's enabled flag;
update the logic inside the superRefine callback (where validateHttpUrl and
validateHostPort are used) to first check the profile.enabled flag and skip all
collector/metrics validation when profile.enabled is false. Specifically, for
the collector_url checks (collector_url + protocol) and the
metrics_enabled/metrics_endpoint checks, wrap them with a guard that
returns/does nothing if data.profile?.enabled === false (or equivalent), so
disabled profiles do not require collector_url or metrics_endpoint validation.
- Around line 584-586: The schema currently defaults the OTEL profile field
insecure to true (insecure: z.boolean().default(true)), making insecure
transport the default; change this so secure transport is the default by
removing the unsafe default — either make insecure optional (insecure:
z.boolean().optional()) or set insecure: z.boolean().default(false); update the
schema where tls_ca_cert and insecure are defined so missing older profiles are
not converted to insecure when parsed and re-saved (refer to the insecure and
tls_ca_cert fields in the schema).

---

Nitpick comments:
In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 163-204: Add data-testid attributes to the interactive elements in
the profile header: add data-testid="profile-header-button" to the outer button
that toggles expansion (the onClick using setExpandedProfiles), add
data-testid="profile-enabled-switch" to the Switch rendered inside the FormField
for name `otel_config.profiles.${index}.enabled`, and add
data-testid="profile-remove-button" to the remove Button that calls
remove(index); keep existing onClick handlers (stopPropagation/remove) and
disabled logic intact so behavior doesn't change.
- Around line 461-471: The Add Profile Button (the Button that calls append({
...DEFAULT_PROFILE }) and uses hasOtelAccess) is missing a data-testid; add a
data-testid attribute (e.g., data-testid="add-profile-button" or following
project test id conventions) to that Button in otelFormFragment.tsx so tests can
target this interactive element while keeping the existing props (type, variant,
size, onClick, disabled, className) unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2c208935-9e73-4d8f-9bac-0fe646ca4a7d

📥 Commits

Reviewing files that changed from the base of the PR and between 75cb667 and 54d87e7.

📒 Files selected for processing (11)
  • core/schemas/utils.go
  • plugins/otel/converter.go
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts

Comment thread core/schemas/utils.go
Comment thread plugins/otel/converter.go
Comment thread plugins/otel/utils.go
Comment thread plugins/otel/utils.go
Comment thread transports/config.schema.json
Comment thread transports/config.schema.json
Comment thread ui/lib/types/schemas.ts Outdated
@sammaji sammaji force-pushed the 04-08-feat_multiple_otel_profiles branch from 54d87e7 to 248463c Compare April 14, 2026 08:01
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/otel/converter.go`:
- Around line 194-195: The current type switch treats []byte as []uint8 and
routes it to intSliceToKeyValue, which incorrectly converts binary blobs to
integer arrays; add an explicit case for []byte before the existing case []uint8
in converter.go and handle binary data properly by either encoding the bytes to
a hex (or base64) string and returning that as the attribute value, or map them
to OTEL's binary/bytes attribute type if supported; ensure the existing
intSliceToKeyValue remains for true []uint8-int-slice semantics only and update
any comments to document the chosen behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fd67e553-eb66-4265-ab88-f392bc597ebd

📥 Commits

Reviewing files that changed from the base of the PR and between 54d87e7 and 248463c.

📒 Files selected for processing (2)
  • core/schemas/utils.go
  • plugins/otel/converter.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/schemas/utils.go

Comment thread plugins/otel/converter.go
@sammaji sammaji force-pushed the 04-08-feat_multiple_otel_profiles branch 3 times, most recently from b1f4951 to fe4e0d8 Compare April 14, 2026 08:28
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ui/lib/types/schemas.ts (1)

714-833: ⚠️ Potential issue | 🟠 Major

Honor profile.enabled in OTEL validation.

enabled is now a per-profile toggle, but both refiners still validate disabled profiles. A blank disabled profile will still fail on collector_url here, and metrics_endpoint can still fail when metrics_enabled is left on, so the new toggle can't be used to park an incomplete destination.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/lib/types/schemas.ts` around lines 714 - 833, The profile-level "enabled"
flag is ignored by current validators causing disabled profiles to still be
validated; update otelProfileConfigSchema.superRefine to early-return or gate
all collector_url and metrics_endpoint checks (the blocks that call
validateHttpUrl and validateHostPort and the metrics_enabled check) behind if
(data.enabled) so disabled profiles skip validation, and update
otelFormSchema.superRefine to skip the per-profile collector_url presence check
by adding a guard like if (!profile.enabled) return/continue when iterating
profiles; reference otelProfileConfigSchema, validateHttpUrl, validateHostPort,
and otelFormSchema.superRefine/profile loop to locate and change the logic.
♻️ Duplicate comments (2)
core/schemas/utils.go (1)

403-446: ⚠️ Potential issue | 🔴 Critical

Unsafe numeric casts in SafeExtractInt64 can silently corrupt values.

Line 420/428 allows overflow-wrap for large unsigned values, and Line 430-432 accepts truncation/NaN/Inf. For a SafeExtractInt64 helper, these should be rejected.

Suggested fix
 import (
 	"encoding/json"
 	"fmt"
+	"math"
 	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
 	"strings"
@@
 func SafeExtractInt64(value any) (int64, bool) {
 	if value == nil {
 		return 0, false
 	}
 	switch v := value.(type) {
@@
 	case uint:
-		return int64(v), true
+		if uint64(v) > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
@@
 	case uint64:
-		return int64(v), true
+		if v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
 	case float32:
-		return int64(v), true
+		f := float64(v)
+		if math.IsNaN(f) || math.IsInf(f, 0) || math.Trunc(f) != f || f < math.MinInt64 || f > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(f), true
 	case float64:
-		return int64(v), true
+		if math.IsNaN(v) || math.IsInf(v, 0) || math.Trunc(v) != v || v < math.MinInt64 || v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
According to the Go language specification, what are the exact semantics when converting (1) uint64 values greater than math.MaxInt64 to int64 and (2) float64 NaN/Inf or out-of-range values to int64?
transports/config.schema.json (1)

1474-1517: ⚠️ Potential issue | 🟠 Major

Make OTEL endpoint shape conditional on protocol.

collector_url and metrics_endpoint still accept either a URI or host:port for both protocols, so grpc can validate with https://... and http can validate with bare host:port. That drifts from the runtime split in the OTEL exporters and lets invalid configs clear schema validation.

As per coding guidelines, transports/config.schema.json must be kept as the authoritative definition for all config.json fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1474 - 1517, collector_url and
metrics_endpoint currently allow both URI and host:port regardless of the
protocol; update transports/config.schema.json to make the endpoint shape
conditional on the protocol value by adding JSON Schema conditionals that
reference protocol: add an "if" block checking
{"properties":{"protocol":{"const":"grpc"}}} with a "then" that restricts
collector_url and metrics_endpoint to a URI (oneOf with {"format":"uri"}), and
an "else" that allows either a URI or host:port pattern (oneOf with
{"format":"uri"} and {"pattern":"^[^:\\s]+:\\d+$"}); apply these conditional
schemas to the collector_url and metrics_endpoint definitions so the schema
enforces different validation when protocol is "grpc" vs "http".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@framework/configstore/migrations.go`:
- Around line 6488-6496: The migration resets plugin.ConfigJSON and
plugin.EncryptionStatus but does not recompute the stored config hash, risking a
stale ConfigHash; call GeneratePluginHash(plugin), handle its error, assign the
returned value to plugin.ConfigHash before calling tx.Save(&plugin) so the
persisted hash matches the restructured plugin.Config and the BeforeSave flow.

In `@plugins/otel/metrics.go`:
- Around line 210-220: The branch that sets opts currently calls
otlpmetrichttp.WithInsecure() when config.Insecure is true and TLSCACert is
empty, which forces plain HTTP and prevents reaching HTTPS collectors with
verification skipped; instead detect when the endpoint is HTTPS and
config.Insecure is true (and TLSCACert is empty), build an http.Client whose
Transport uses a tls.Config with InsecureSkipVerify=true (do not call
buildTLSConfig(..., false) there), and append
otlpmetrichttp.WithHTTPClient(yourClient) to opts; ensure you never combine
otlpmetrichttp.WithInsecure() with otlpmetrichttp.WithTLSClientConfig(), and
keep the existing code path that uses buildTLSConfig when TLSCACert is provided.

In `@transports/config.schema.json`:
- Around line 1464-1546: The schema currently unconditionally requires
"collector_url", "trace_type", and "protocol" so disabled profiles can't be
stored; change the schema to make those fields required only when "enabled" is
true by adding an if/then conditional (e.g., an "if": { "properties": {
"enabled": { "const": true } } } and a corresponding "then": { "required":
["collector_url","trace_type","protocol"] } ) while keeping additionalProperties
false and existing field defs (collector_url, trace_type, protocol, enabled,
etc.) intact; update the block containing the "required" array so it is
moved/expressed under the "then" clause instead of at the top level.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 164-214: The header button currently nests interactive controls
(the Switch and the remove Button) which is invalid HTML; refactor the markup so
the expand/collapse trigger is a separate button (the element using onClick that
toggles setExpandedProfiles / isExpanded) and move the FormField Switch and the
remove Button out of that header button into their own sibling container (still
inside the profile row). Update any event handlers (remove(index),
setExpandedProfiles adjustments) to work from the moved Buttons and remove the
onClick stopPropagation hacks where no longer needed; keep references to the
same form control name `otel_config.profiles.${index}.enabled`, functions
`remove` and `setExpandedProfiles`, and the isExpanded state so behavior and
state updates remain identical.
- Around line 164-214: Add stable data-testid attributes to the new profile
controls: update the header trigger button (the outer <button> that calls
setExpandedProfiles) to include data-testid={`otel-profile-toggle-${index}`},
add data-testid={`otel-profile-enable-${index}`} to the per-profile Switch
rendered inside FormField (render block using enabledField), add
data-testid={`otel-profile-remove-${index}`} to the remove <Button> that calls
remove(index), and ensure the global "add profile" button (the add-profile
Button referenced around the add handler at the other block ~471-481) includes
data-testid="otel-profile-add-btn"; keep existing handlers (setExpandedProfiles,
enabledField.onChange, remove) unchanged and only add attributes following the
repo pattern.

---

Outside diff comments:
In `@ui/lib/types/schemas.ts`:
- Around line 714-833: The profile-level "enabled" flag is ignored by current
validators causing disabled profiles to still be validated; update
otelProfileConfigSchema.superRefine to early-return or gate all collector_url
and metrics_endpoint checks (the blocks that call validateHttpUrl and
validateHostPort and the metrics_enabled check) behind if (data.enabled) so
disabled profiles skip validation, and update otelFormSchema.superRefine to skip
the per-profile collector_url presence check by adding a guard like if
(!profile.enabled) return/continue when iterating profiles; reference
otelProfileConfigSchema, validateHttpUrl, validateHostPort, and
otelFormSchema.superRefine/profile loop to locate and change the logic.

---

Duplicate comments:
In `@transports/config.schema.json`:
- Around line 1474-1517: collector_url and metrics_endpoint currently allow both
URI and host:port regardless of the protocol; update
transports/config.schema.json to make the endpoint shape conditional on the
protocol value by adding JSON Schema conditionals that reference protocol: add
an "if" block checking {"properties":{"protocol":{"const":"grpc"}}} with a
"then" that restricts collector_url and metrics_endpoint to a URI (oneOf with
{"format":"uri"}), and an "else" that allows either a URI or host:port pattern
(oneOf with {"format":"uri"} and {"pattern":"^[^:\\s]+:\\d+$"}); apply these
conditional schemas to the collector_url and metrics_endpoint definitions so the
schema enforces different validation when protocol is "grpc" vs "http".
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 93b5ffd3-c48d-4d57-948d-907cbec26161

📥 Commits

Reviewing files that changed from the base of the PR and between 248463c and ac226cb.

📒 Files selected for processing (13)
  • core/schemas/utils.go
  • framework/configstore/migrations.go
  • plugins/otel/converter.go
  • plugins/otel/go.mod
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (2)
  • plugins/otel/go.mod
  • plugins/otel/utils.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • plugins/otel/http.go
  • transports/bifrost-http/server/plugins.go
  • plugins/otel/grpc.go

Comment thread framework/configstore/migrations.go
Comment thread plugins/otel/metrics.go
Comment thread transports/config.schema.json
Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/otel/metrics.go (1)

171-187: ⚠️ Potential issue | 🟠 Major

SetMeterProvider is called repeatedly in the multi-profile loop, causing the global provider to be last-write-wins.

In NewMetricsExporter(), otel.SetMeterProvider(provider) mutates global state on every profile initialization. Since metrics are recorded via the local m.meter (stored in MetricsExporter), the global provider is unused and serves only to confuse the architecture. Each SetMeterProvider call overwrites the previous profile's provider—only the last profile's metrics will be exported if any code uses the global otel.Meter().

To fix: either (1) remove SetMeterProvider() and keep providers private, or (2) set the global provider once to a multiplexing/fan-out implementation that routes metrics to all profiles.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 171 - 187, NewMetricsExporter currently
calls otel.SetMeterProvider(provider) for each profile which mutates global
state and causes last-write-wins; remove the repeated global mutation and keep
each provider local to MetricsExporter (use m.meter created from provider) OR
implement/set a single global multiplexing/fan-out MeterProvider once that
forwards to each profile provider; specifically, stop calling
otel.SetMeterProvider(...) inside NewMetricsExporter (or replace that call with
logic to register the provider with a shared multiplexing provider), and ensure
MetricsExporter continues to use its local meter (m.meter) for recording so
profile-specific metrics are exported correctly.
♻️ Duplicate comments (8)
core/schemas/utils.go (1)

403-445: ⚠️ Potential issue | 🟠 Major

Harden SafeExtractInt64 to reject unsafe casts.

Line 420, Line 428, Line 430, and Line 432 perform direct casts that can wrap (uint64) or silently coerce invalid values (NaN/Inf/fractional float) into int64. For a helper named SafeExtractInt64, this can corrupt downstream values.

Proposed fix
 import (
 	"encoding/json"
 	"fmt"
+	"math"
 	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
 	"strings"
@@
 func SafeExtractInt64(value any) (int64, bool) {
@@
 	case uint:
-		return int64(v), true
+		if uint64(v) > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
@@
 	case uint64:
-		return int64(v), true
+		if v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
 	case float32:
-		return int64(v), true
+		f := float64(v)
+		if math.IsNaN(f) || math.IsInf(f, 0) || math.Trunc(f) != f || f < math.MinInt64 || f > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(f), true
 	case float64:
-		return int64(v), true
+		if math.IsNaN(v) || math.IsInf(v, 0) || math.Trunc(v) != v || v < math.MinInt64 || v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 403 - 445, SafeExtractInt64 currently
does unsafe direct casts from unsigned integers and floats (uint64, uint,
float32, float64) that can wrap or silently coerce invalid values; update
SafeExtractInt64 to validate ranges and integrality before converting: for all
unsigned variants (uint, uint8, uint16, uint32, uint64) check value <=
math.MaxInt64 and return false otherwise; for float32/float64 reject NaN/±Inf
and only accept values where math.Trunc(f) == f and f is within [math.MinInt64,
math.MaxInt64]; keep existing safe handling for signed ints, json.Number, and
strconv.ParseInt for strings but remove the direct int64 casts that allow silent
corruption.
framework/configstore/migrations.go (1)

6488-6496: ⚠️ Potential issue | 🟠 Major

Recompute config_hash when rewriting the OTEL plugin config.

This migration changes the logical plugin config shape, but Line 6495 saves the row without updating plugin.ConfigHash. That leaves the otel plugin permanently dirty after upgrade, unlike the other config-shape migrations in this file that recompute hashes after mutation.

Suggested fix
 			plugin.Config = map[string]any{
 				"profiles": []any{cfgMap},
 			}
 			// Force BeforeSave to re-serialize + re-encrypt.
 			plugin.ConfigJSON = ""
 			plugin.EncryptionStatus = tables.EncryptionStatusPlainText
+			hash, err := GeneratePluginHash(plugin)
+			if err != nil {
+				return fmt.Errorf("failed to generate hash for plugin %s: %w", plugin.Name, err)
+			}
+			plugin.ConfigHash = hash
 
 			if err := tx.Save(&plugin).Error; err != nil {
 				return fmt.Errorf("failed to save migrated otel config: %w", err)
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 6488 - 6496, The migration
rewrites plugin.Config/ConfigJSON/EncryptionStatus but never updates
plugin.ConfigHash, leaving the otel plugin marked dirty; compute and assign the
new config hash on the plugin before persisting (i.e., after setting
plugin.Config, clearing plugin.ConfigJSON and updating plugin.EncryptionStatus
but before tx.Save(&plugin))—use the same hashing routine used elsewhere in this
file (the function or method that produces plugin.ConfigHash from the serialized
config) so the saved row has an updated plugin.ConfigHash.
plugins/otel/utils.go (1)

32-57: ⚠️ Potential issue | 🟠 Major

Resolve the CA file target instead of rejecting symlinked mounts.

This validator rejects the symlinked files Kubernetes uses for Secret/ConfigMap mounts and also rejects relative tls_ca_cert configs. Since both HTTP and gRPC OTEL clients now funnel through buildTLSConfig, that can block every OTEL profile that uses a mounted CA file.

Suggested fix
-func validateCACertPath(certPath string) error {
+func validateCACertPath(certPath string) (string, error) {
 	if certPath == "" {
-		return nil
+		return "", nil
 	}
 
-	// Clean the path to resolve any .. or . components
 	cleanPath := filepath.Clean(certPath)
-
-	// Require absolute paths to prevent relative path attacks
-	if !filepath.IsAbs(cleanPath) {
-		return fmt.Errorf("TLS CA cert path must be absolute: %s", certPath)
-	}
-
-	// Verify the file exists and is not a symlink
-	info, err := os.Lstat(cleanPath)
+	absPath, err := filepath.Abs(cleanPath)
 	if err != nil {
-		return fmt.Errorf("TLS CA cert path not accessible: %w", err)
+		return "", fmt.Errorf("failed to resolve TLS CA cert path %q: %w", certPath, err)
 	}
-	// Reject symlinks to prevent symlink-based path traversal
-	if info.Mode()&os.ModeSymlink != 0 {
-		return fmt.Errorf("TLS CA cert path cannot be a symlink: %s", certPath)
+	realPath, err := filepath.EvalSymlinks(absPath)
+	if err != nil {
+		return "", fmt.Errorf("TLS CA cert path not accessible: %w", err)
 	}
-	// Ensure path is a regular file, not directories, sockets, pipes, devices, etc.
+	info, err := os.Stat(realPath)
+	if err != nil {
+		return "", fmt.Errorf("TLS CA cert path not accessible: %w", err)
+	}
 	if !info.Mode().IsRegular() {
-		return fmt.Errorf("TLS CA cert path is not a regular file: %s", certPath)
+		return "", fmt.Errorf("TLS CA cert path is not a regular file: %s", realPath)
 	}
-
-	return nil
+	return realPath, nil
 }
@@
 	// TLS priority: custom CA > system roots > insecure
 	if tlsCACert != "" {
-		if err := validateCACertPath(tlsCACert); err != nil {
+		resolvedPath, err := validateCACertPath(tlsCACert)
+		if err != nil {
 			return nil, err
 		}
-		caCert, err := os.ReadFile(tlsCACert)
+		caCert, err := os.ReadFile(resolvedPath)
 		if err != nil {
 			return nil, fmt.Errorf("failed to load provided CA cert: %w", err)
 		}

Also applies to: 74-79

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/utils.go` around lines 32 - 57, The validator validateCACertPath
currently rejects symlinked mounts and relative paths; change it to resolve the
CA file target and accept symlinked ConfigMap/Secret mounts and relative
configs: use filepath.Abs (or filepath.Join with cwd) to canonicalize relative
certPath, then call filepath.EvalSymlinks (or os.Stat which follows symlinks) on
the cleaned/absolute path to get the real target and verify that the resolved
target exists and is a regular file (use os.Stat and Mode().IsRegular()), rather
than rejecting when os.Lstat reports a symlink; apply the same change to the
duplicate validation logic referenced around the second block (the other
validateCACertPath usage / related checks) and ensure buildTLSConfig continues
to read the resolved path.
transports/config.schema.json (2)

1474-1516: ⚠️ Potential issue | 🟠 Major

Make collector_url and metrics_endpoint conditional on protocol.

The profile item schema still accepts both URI and host:port shapes for both protocols. ui/lib/types/schemas.ts already validates HTTP vs gRPC differently, so config.json can still pass the authoritative schema and fail later when the OTEL clients are built.

As per coding guidelines, transports/config.schema.json is the authoritative definition for all config.json fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1474 - 1516, The schema currently
allows both URI and host:port for collector_url and metrics_endpoint regardless
of protocol; change this by adding a conditional schema at the enclosing OTEL
transport object that switches validation based on the protocol property: use
JSON Schema "if"/"then"/"else" (or dependentSchemas) on "protocol" (check for
"http" vs "grpc") and in the "then" require collector_url and metrics_endpoint
to match the HTTP form (format: "uri") while in the "else" require them to match
the gRPC form (pattern "^[^:\\s]+:\\d+$"); update the entries for collector_url
and metrics_endpoint to remove the current oneOf so they are validated by the
conditional block and reference the properties "collector_url",
"metrics_endpoint", and "protocol" to locate the change.

1464-1546: ⚠️ Potential issue | 🟠 Major

Disabled profiles still can't be stored as incomplete drafts.

enabled: false says the profile is skipped, and plugins/otel/main.go already skips disabled profiles before requiring collector_url, but this unconditional required list still rejects disabled/incomplete profiles. Move those requirements under an enabled=true condition.

As per coding guidelines, transports/config.schema.json is the authoritative definition for all config.json fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1464 - 1546, The schema currently
unconditionally requires collector_url, trace_type, and protocol which prevents
saving disabled/incomplete profiles; change transports/config.schema.json to
conditionally require those fields only when enabled is true by adding a JSON
Schema conditional (use "if": {"properties": {"enabled": {"const": true}}},
"then": {"required": ["collector_url","trace_type","protocol"]}) so disabled
profiles (enabled: false) are not rejected; keep the existing properties
(enabled, collector_url, trace_type, protocol, metrics_*, headers, tls_ca_cert,
insecure) and additionalProperties: false unchanged.
ui/app/workspace/observability/fragments/otelFormFragment.tsx (2)

200-255: ⚠️ Potential issue | 🟠 Major

Split the expand trigger from the enable/remove controls.

The header <button> still wraps the Switch and the remove Button, so interactive descendants remain nested inside another interactive element. That keeps the invalid HTML and keyboard/focus issue from the earlier review.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
200 - 255, The header's native <button> currently wraps interactive children
(Switch and remove Button) causing invalid nesting; refactor so the expand
trigger is separate from the enable/remove controls: convert the outer header
<button> into a non-interactive wrapper (e.g., div with role="button" and
keyboard handlers) or remove the Switch and the remove <Button> from inside the
header and render them as sibling elements, keeping their existing props
(FormField -> name "otel_config.profiles.${index}.enabled", Switch handlers,
remove(index) call and setExpandedProfiles update) and preserving
disabled/title/aria attributes; ensure the expand click still calls
setExpandedProfiles((prev) => ({ ...prev, [index]: !prev[index] })) and that
inner controls no longer rely on stopPropagation to avoid nested interactive
elements.

200-255: ⚠️ Potential issue | 🟡 Minor

Add test ids to the remaining profile actions.

The metrics toggle already has a stable selector, but the collapse trigger, per-profile enable switch, remove action, and add-profile button still don't. Please add data-testids for those controls as well.

As per coding guidelines, ui/**/*.{tsx,ts} must add new interactive UI elements with data-testid attributes following the pattern data-testid="<entity>-<element>-<qualifier>".

Also applies to: 573-583

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
200 - 255, Add stable data-testid attributes to the interactive controls in
otelFormFragment: add data-testid to the outer collapse trigger button (the
<button> that calls setExpandedProfiles) using pattern
"otelProfile-collapse-<index>", add data-testid to the per-profile enable Switch
(inside FormField rendering for name `otel_config.profiles.${index}.enabled`)
like "otelProfile-enable-<index>", add data-testid to the remove Button (the
ghost Button that calls remove(index)) like "otelProfile-remove-<index>", and
also add the same pattern for the add-profile button referenced around lines
573-583 (use "otelProfile-add" or "otelProfile-add-<qualifier>" as appropriate);
follow the repository convention data-testid="<entity>-<element>-<qualifier>"
and ensure values include the profile index where applicable and are present on
the actual interactive elements: the collapse trigger button, the Switch
component, the remove Button, and the add-profile Button.
plugins/otel/metrics.go (1)

210-220: ⚠️ Potential issue | 🟠 Major

WithInsecure() still breaks HTTPS collectors in skip-verify mode.

When insecure is true and tls_ca_cert is empty, this branch always forces otlpmetrichttp.WithInsecure(). That works for true plaintext endpoints, but it cannot reach an https://... collector with certificate verification skipped — the exporter is downgraded to HTTP instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 210 - 220, The current branch
unconditionally uses otlpmetrichttp.WithInsecure() when config.Insecure is true
and config.TLSCACert is empty, which breaks HTTPS collectors in skip-verify
mode; instead, detect the collector endpoint scheme and if it is HTTPS and
config.Insecure is true, call buildTLSConfig with skipVerify=true and pass the
resulting tls.Config to otlpmetrichttp.WithTLSClientConfig(tlsConfig); only use
otlpmetrichttp.WithInsecure() for plaintext HTTP endpoints. Update the logic
around config.TLSCACert, config.Insecure, buildTLSConfig,
otlpmetrichttp.WithInsecure and otlpmetrichttp.WithTLSClientConfig accordingly
(i.e., prefer TLS client config with InsecureSkipVerify for https+insecure,
fallback to WithInsecure only for http).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/otel/main.go`:
- Around line 101-112: The loop currently calls
injectEnvToHeaders(profileCfg.Headers) for every profile before checking
profileCfg.Enabled, which can cause Init to fail on disabled profiles with
unresolved env.* headers; modify the code so that injectEnvToHeaders is only
invoked for profiles that will be processed—i.e., move or duplicate the
injectEnvToHeaders(profileCfg.Headers) call inside the later for i, profileCfg
:= range config.Profiles loop after the Enabled check (the same loop that builds
profiles []*OtelProfile), so disabled profiles are skipped entirely; keep
attributesFromEnvironment and other logic unchanged.

In `@plugins/otel/metrics.go`:
- Around line 452-552: The current code selects a single llmSpan and records
metrics only for that final attempt, which hides retries and fallbacks; change
the logic to iterate over trace.Spans (filtering span.Kind ==
schemas.SpanKindLLMCall || schemas.SpanKindRetry) and for each span build
attributes via BuildBifrostAttributes (using that span's Attributes and method
fallback to span.Name), then call the metric methods per-span
(RecordUpstreamRequest, RecordUpstreamLatency,
RecordErrorRequest/RecordSuccessRequest, RecordInputTokens, RecordOutputTokens,
RecordCost, RecordStreamFirstTokenLatency) so every attempt contributes to
counts and usage instead of only the last attempt; keep the existing fallback to
trace.RootSpan only for traces with no LLM/retry spans.

---

Outside diff comments:
In `@plugins/otel/metrics.go`:
- Around line 171-187: NewMetricsExporter currently calls
otel.SetMeterProvider(provider) for each profile which mutates global state and
causes last-write-wins; remove the repeated global mutation and keep each
provider local to MetricsExporter (use m.meter created from provider) OR
implement/set a single global multiplexing/fan-out MeterProvider once that
forwards to each profile provider; specifically, stop calling
otel.SetMeterProvider(...) inside NewMetricsExporter (or replace that call with
logic to register the provider with a shared multiplexing provider), and ensure
MetricsExporter continues to use its local meter (m.meter) for recording so
profile-specific metrics are exported correctly.

---

Duplicate comments:
In `@core/schemas/utils.go`:
- Around line 403-445: SafeExtractInt64 currently does unsafe direct casts from
unsigned integers and floats (uint64, uint, float32, float64) that can wrap or
silently coerce invalid values; update SafeExtractInt64 to validate ranges and
integrality before converting: for all unsigned variants (uint, uint8, uint16,
uint32, uint64) check value <= math.MaxInt64 and return false otherwise; for
float32/float64 reject NaN/±Inf and only accept values where math.Trunc(f) == f
and f is within [math.MinInt64, math.MaxInt64]; keep existing safe handling for
signed ints, json.Number, and strconv.ParseInt for strings but remove the direct
int64 casts that allow silent corruption.

In `@framework/configstore/migrations.go`:
- Around line 6488-6496: The migration rewrites
plugin.Config/ConfigJSON/EncryptionStatus but never updates plugin.ConfigHash,
leaving the otel plugin marked dirty; compute and assign the new config hash on
the plugin before persisting (i.e., after setting plugin.Config, clearing
plugin.ConfigJSON and updating plugin.EncryptionStatus but before
tx.Save(&plugin))—use the same hashing routine used elsewhere in this file (the
function or method that produces plugin.ConfigHash from the serialized config)
so the saved row has an updated plugin.ConfigHash.

In `@plugins/otel/metrics.go`:
- Around line 210-220: The current branch unconditionally uses
otlpmetrichttp.WithInsecure() when config.Insecure is true and config.TLSCACert
is empty, which breaks HTTPS collectors in skip-verify mode; instead, detect the
collector endpoint scheme and if it is HTTPS and config.Insecure is true, call
buildTLSConfig with skipVerify=true and pass the resulting tls.Config to
otlpmetrichttp.WithTLSClientConfig(tlsConfig); only use
otlpmetrichttp.WithInsecure() for plaintext HTTP endpoints. Update the logic
around config.TLSCACert, config.Insecure, buildTLSConfig,
otlpmetrichttp.WithInsecure and otlpmetrichttp.WithTLSClientConfig accordingly
(i.e., prefer TLS client config with InsecureSkipVerify for https+insecure,
fallback to WithInsecure only for http).

In `@plugins/otel/utils.go`:
- Around line 32-57: The validator validateCACertPath currently rejects
symlinked mounts and relative paths; change it to resolve the CA file target and
accept symlinked ConfigMap/Secret mounts and relative configs: use filepath.Abs
(or filepath.Join with cwd) to canonicalize relative certPath, then call
filepath.EvalSymlinks (or os.Stat which follows symlinks) on the
cleaned/absolute path to get the real target and verify that the resolved target
exists and is a regular file (use os.Stat and Mode().IsRegular()), rather than
rejecting when os.Lstat reports a symlink; apply the same change to the
duplicate validation logic referenced around the second block (the other
validateCACertPath usage / related checks) and ensure buildTLSConfig continues
to read the resolved path.

In `@transports/config.schema.json`:
- Around line 1474-1516: The schema currently allows both URI and host:port for
collector_url and metrics_endpoint regardless of protocol; change this by adding
a conditional schema at the enclosing OTEL transport object that switches
validation based on the protocol property: use JSON Schema "if"/"then"/"else"
(or dependentSchemas) on "protocol" (check for "http" vs "grpc") and in the
"then" require collector_url and metrics_endpoint to match the HTTP form
(format: "uri") while in the "else" require them to match the gRPC form (pattern
"^[^:\\s]+:\\d+$"); update the entries for collector_url and metrics_endpoint to
remove the current oneOf so they are validated by the conditional block and
reference the properties "collector_url", "metrics_endpoint", and "protocol" to
locate the change.
- Around line 1464-1546: The schema currently unconditionally requires
collector_url, trace_type, and protocol which prevents saving
disabled/incomplete profiles; change transports/config.schema.json to
conditionally require those fields only when enabled is true by adding a JSON
Schema conditional (use "if": {"properties": {"enabled": {"const": true}}},
"then": {"required": ["collector_url","trace_type","protocol"]}) so disabled
profiles (enabled: false) are not rejected; keep the existing properties
(enabled, collector_url, trace_type, protocol, metrics_*, headers, tls_ca_cert,
insecure) and additionalProperties: false unchanged.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 200-255: The header's native <button> currently wraps interactive
children (Switch and remove Button) causing invalid nesting; refactor so the
expand trigger is separate from the enable/remove controls: convert the outer
header <button> into a non-interactive wrapper (e.g., div with role="button" and
keyboard handlers) or remove the Switch and the remove <Button> from inside the
header and render them as sibling elements, keeping their existing props
(FormField -> name "otel_config.profiles.${index}.enabled", Switch handlers,
remove(index) call and setExpandedProfiles update) and preserving
disabled/title/aria attributes; ensure the expand click still calls
setExpandedProfiles((prev) => ({ ...prev, [index]: !prev[index] })) and that
inner controls no longer rely on stopPropagation to avoid nested interactive
elements.
- Around line 200-255: Add stable data-testid attributes to the interactive
controls in otelFormFragment: add data-testid to the outer collapse trigger
button (the <button> that calls setExpandedProfiles) using pattern
"otelProfile-collapse-<index>", add data-testid to the per-profile enable Switch
(inside FormField rendering for name `otel_config.profiles.${index}.enabled`)
like "otelProfile-enable-<index>", add data-testid to the remove Button (the
ghost Button that calls remove(index)) like "otelProfile-remove-<index>", and
also add the same pattern for the add-profile button referenced around lines
573-583 (use "otelProfile-add" or "otelProfile-add-<qualifier>" as appropriate);
follow the repository convention data-testid="<entity>-<element>-<qualifier>"
and ensure values include the profile index where applicable and are present on
the actual interactive elements: the collapse trigger button, the Switch
component, the remove Button, and the add-profile Button.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4ea7da67-3434-4e54-990f-1227acabf541

📥 Commits

Reviewing files that changed from the base of the PR and between ac226cb and fe4e0d8.

📒 Files selected for processing (13)
  • core/schemas/utils.go
  • framework/configstore/migrations.go
  • plugins/otel/changelog.md
  • plugins/otel/converter.go
  • plugins/otel/go.mod
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
✅ Files skipped from review due to trivial changes (3)
  • plugins/otel/changelog.md
  • plugins/otel/go.mod
  • transports/bifrost-http/server/plugins.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • plugins/otel/http.go
  • plugins/otel/grpc.go
  • plugins/otel/converter.go

Comment thread plugins/otel/main.go Outdated
Comment thread plugins/otel/metrics.go Outdated
@sammaji sammaji force-pushed the 04-08-feat_multiple_otel_profiles branch from fe4e0d8 to 1f5f7c5 Compare April 14, 2026 09:53
Comment thread plugins/otel/main.go Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (10)
core/schemas/utils.go (1)

403-445: ⚠️ Potential issue | 🟠 Major

Reject unsafe casts in SafeExtractInt64.

uint/uint64 can wrap into negative values, and float branches currently allow truncation plus NaN/Inf/out-of-range conversions. For a helper named SafeExtractInt64, these should fail closed.

Proposed fix
 import (
 	"encoding/json"
 	"fmt"
+	"math"
 	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
@@
 	case uint:
-		return int64(v), true
+		if uint64(v) > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
@@
 	case uint64:
-		return int64(v), true
+		if v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
 	case float32:
-		return int64(v), true
+		f := float64(v)
+		if math.IsNaN(f) || math.IsInf(f, 0) || math.Trunc(f) != f || f < math.MinInt64 || f > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(f), true
 	case float64:
-		return int64(v), true
+		if math.IsNaN(v) || math.IsInf(v, 0) || math.Trunc(v) != v || v < math.MinInt64 || v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
In the Go spec (current stable), what is the defined behavior when converting:
1) uint64 values greater than math.MaxInt64 to int64
2) float64 NaN/Inf/out-of-range or fractional values to int64?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 403 - 445, SafeExtractInt64 currently
performs unsafe casts for unsigned integers and floats; update the function to
reject any uint/uint64 values > math.MaxInt64 and reject float32/float64 values
that are NaN/Inf/non-integer (fractional) or out of int64 range. Concretely: in
the uint/uint8/uint16/uint32/uint64 branches check the value against
math.MaxInt64 (promote to uint64 for comparison) and return false if it exceeds;
in float32/float64 branches convert to float64, use !math.IsFinite
(math.IsNaN/IsInf) to reject non-finite, check that math.Trunc(f)==f (no
fractional part) and that f is between math.MinInt64 and math.MaxInt64 before
casting to int64; keep json.Number handling as-is and leave string parsing
unchanged.
framework/configstore/migrations.go (1)

6488-6496: ⚠️ Potential issue | 🟠 Major

Refresh config_hash when rewriting the OTEL plugin config.

This migration changes the persisted config payload, but plugin.ConfigHash is never updated in the same write. That leaves the row hashed against the legacy shape, so config reconciliation can keep seeing it as dirty after upgrade.

💡 Suggested fix
 			plugin.Config = map[string]any{
 				"profiles": []any{cfgMap},
 			}
 			// Force BeforeSave to re-serialize + re-encrypt.
 			plugin.ConfigJSON = ""
 			plugin.EncryptionStatus = tables.EncryptionStatusPlainText
+			hash, err := GeneratePluginHash(plugin)
+			if err != nil {
+				return fmt.Errorf("failed to generate hash for plugin %s: %w", plugin.Name, err)
+			}
+			plugin.ConfigHash = hash
 
 			if err := tx.Save(&plugin).Error; err != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 6488 - 6496, The migration
updates the stored OTEL plugin payload (plugin.Config, clears plugin.ConfigJSON,
sets plugin.EncryptionStatus) but doesn’t refresh plugin.ConfigHash, leaving the
row with a stale hash; update plugin.ConfigHash before persisting by computing
the new hash using the same hashing routine used elsewhere (the codepath that
normally sets ConfigHash on save) and assign it to plugin.ConfigHash prior to
calling tx.Save(&plugin), ensuring the saved row has a matching hash for the new
config shape.
ui/app/workspace/observability/fragments/otelFormFragment.tsx (2)

200-228: ⚠️ Potential issue | 🟡 Minor

Add data-testids to the remaining profile controls.

The profile toggle, per-profile enable switch, remove action, and add-profile button still don't expose stable selectors. Please add test ids like otel-profile-toggle-${index}, otel-profile-enable-${index}, otel-profile-remove-${index}, and otel-profile-add-btn to match the rest of the OTEL form.

As per coding guidelines, ui/**/*.{tsx,ts}: Add new interactive UI elements with data-testid attributes following the pattern: data-testid="<entity>-<element>-<qualifier>".

Also applies to: 232-254, 572-582

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
200 - 228, Add stable data-testid attributes to the remaining OTEL profile
controls: add data-testid={`otel-profile-toggle-${index}`} on the clickable
profile header/div that uses setExpandedProfiles, add
data-testid={`otel-profile-enable-${index}`} on the per-profile Switch rendered
inside FormField for name `otel_config.profiles.${index}.enabled`, add
data-testid={`otel-profile-remove-${index}`} on the remove action/button that
deletes a profile (the element that calls the remove handler), and add
data-testid="otel-profile-add-btn" on the add-profile button used to append new
profiles; apply the same additions in the other similar blocks referenced around
the 232-254 and 572-582 ranges so tests can reliably select these elements.

200-208: ⚠️ Potential issue | 🟡 Minor

Don't make the collapse trigger a click-only div.

This header can't receive focus, doesn't expose expanded/collapsed state, and won't toggle with Enter/Space. Please use a real button for the trigger, or add full button semantics (role, tabIndex, keyboard handler, aria-expanded).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
200 - 208, The collapse header currently uses a clickable div which is not
accessible; replace that div with a semantic button (or if you must keep a div,
add role="button", tabIndex={0}, an onKeyDown handler that toggles on
Enter/Space, and aria-expanded bound to the expanded state) so keyboard users
and assistive tech can toggle it. Update the element wrapping the header (the
one using setExpandedProfiles and index) to expose
aria-expanded={!!expandedProfiles[index]} and ensure the toggle logic
(setExpandedProfiles callback) is invoked from both onClick and the new keyboard
handler.
transports/config.schema.json (3)

1541-1545: ⚠️ Potential issue | 🟠 Major

Required fields should be conditional on enabled=true for each profile.

Line 1541 currently blocks storing intentionally disabled/incomplete profiles, which conflicts with profile-level enable/disable behavior.

As per coding guidelines, "transports/config.schema.json: Update transports/config.schema.json when adding config fields; it is the authoritative definition for all config.json fields."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1541 - 1545, The schema currently
forces "collector_url", "trace_type", and "protocol" to always be required,
preventing intentionally disabled or incomplete profiles; change the schema so
those fields are only required when a profile's "enabled" is true by replacing
the unconditional "required" array with an if/then conditional (e.g., use "if":
{"properties": {"enabled": {"const": true}}}, "then": {"required":
["collector_url","trace_type","protocol"]}}) so that profiles with "enabled":
false can omit those properties; apply this to the transport profile schema
where the "required" array is defined.

1536-1539: ⚠️ Potential issue | 🟡 Minor

insecure description is ambiguous across HTTP vs gRPC behavior.

Line 1538 says “Skip TLS verification”, but runtime behavior differs by protocol (including plaintext paths). The schema text should reflect that split.

As per coding guidelines, "transports/config.schema.json: Update transports/config.schema.json when adding config fields; it is the authoritative definition for all config.json fields."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1536 - 1539, The "insecure"
schema description is ambiguous about HTTP vs gRPC behavior; update the
"insecure" property description to explicitly state protocol-specific semantics:
mention that when set true (and tls_ca_cert is not provided) HTTP clients will
disable TLS certificate verification (still use TLS but skip verification),
whereas gRPC may fall back to plaintext or disable TLS verification depending on
the gRPC transport implementation; reference the "insecure" field and the
"tls_ca_cert" interaction so readers know the behavior differs by protocol and
when plaintext may be used.

1474-1485: ⚠️ Potential issue | 🟠 Major

Endpoint validation must be protocol-specific (http(s):// vs host:port).

These fields currently accept both URI and host:port for both protocols, which allows schema-valid but runtime-invalid configs.

As per coding guidelines, "transports/config.schema.json: Update transports/config.schema.json when adding config fields; it is the authoritative definition for all config.json fields."

Also applies to: 1506-1517

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1474 - 1485, The collector_url
schema currently allows any URI or a host:port, causing protocol-mismatches;
update the "collector_url" oneOf so the URI branch requires an http or https
scheme (e.g., enforce pattern or format that starts with "http://" or
"https://") and keep the host:port branch limited to the pattern
"^[^:\\s]+:\\d+$" for non-schemed endpoints; apply the same protocol-specific
change to the other identical transport URL field occurrences (the duplicate
block referenced around the later section) so each URL entry only accepts either
an http(s) URI or a host:port, not arbitrary URIs.
plugins/otel/metrics.go (2)

452-522: ⚠️ Potential issue | 🟠 Major

Recording only the last LLM/retry span hides retry/fallback load.

Line 452 picks a single “final” attempt and records counters/tokens/cost from that one span only. This underreports upstream requests/errors/usage whenever retries or fallbacks occur.

Also applies to: 524-539, 541-552

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 452 - 522, Current code picks only the
latest LLM/retry span (llmSpan) and records metrics for that single span, which
misses requests/errors/usage from earlier retries/fallbacks; fix by iterating
over all trace.Spans where span.Kind == schemas.SpanKindLLMCall || span.Kind ==
schemas.SpanKindRetry, and for each span build attrs (using
getStringAttr/getIntAttr on span.Attributes or reuse BuildBifrostAttributes with
per-span values) and call the same metric methods (m.RecordUpstreamRequest,
m.RecordUpstreamLatency, m.RecordErrorRequest/m.RecordSuccessRequest) per-span
so counters/tokens/cost/latency reflect every attempt rather than only the final
one; ensure you still fall back to trace.RootSpan only if there are no
LLMCall/Retry spans.

210-220: ⚠️ Potential issue | 🟠 Major

WithInsecure() here can break HTTPS collectors when insecure=true.

Line 213 switches to plaintext mode instead of HTTPS-with-cert-verification-skipped behavior, so HTTPS endpoints may fail or silently downgrade behavior.

For go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp, what is the correct way to send to an HTTPS endpoint while skipping certificate verification? Specifically compare WithInsecure(), WithTLSClientConfig(), and WithHTTPClient() behavior and incompatibilities.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 210 - 220, The current branch uses
otlpmetrichttp.WithInsecure() when config.TLSCACert == "" && config.Insecure,
which forces plaintext and will break HTTPS collectors; instead, when
insecure=true you should create a TLS config that sets InsecureSkipVerify=true
(via buildTLSConfig or a dedicated builder) and pass it with
otlpmetrichttp.WithTLSClientConfig(tlsConfig) so HTTPS is preserved while
skipping cert verification, or construct a custom *http.Client and supply it
with otlpmetrichttp.WithHTTPClient(); update the conditional around
config.TLSCACert/config.Insecure (and the call sites buildTLSConfig,
WithInsecure, WithTLSClientConfig, WithHTTPClient) to use the TLSClientConfig or
HTTPClient approach rather than WithInsecure().
plugins/otel/utils.go (1)

32-57: ⚠️ Potential issue | 🟠 Major

Allow resolved symlink targets for CA certs instead of hard-rejecting symlinks.

Line 41 and Line 51 currently reject valid deployment patterns (notably Kubernetes Secret/ConfigMap mounts). Keep validation, but resolve to real path and validate the resolved target.

Suggested adjustment
 func validateCACertPath(certPath string) error {
 	if certPath == "" {
 		return nil
 	}
 
-	cleanPath := filepath.Clean(certPath)
-	if !filepath.IsAbs(cleanPath) {
-		return fmt.Errorf("TLS CA cert path must be absolute: %s", certPath)
-	}
-
-	info, err := os.Lstat(cleanPath)
+	cleanPath := filepath.Clean(certPath)
+	absPath, err := filepath.Abs(cleanPath)
+	if err != nil {
+		return fmt.Errorf("failed to resolve tls ca cert path: %w", err)
+	}
+	realPath, err := filepath.EvalSymlinks(absPath)
+	if err != nil {
+		return fmt.Errorf("tls ca cert path not accessible: %w", err)
+	}
+	info, err := os.Stat(realPath)
 	if err != nil {
 		return fmt.Errorf("TLS CA cert path not accessible: %w", err)
 	}
-	if info.Mode()&os.ModeSymlink != 0 {
-		return fmt.Errorf("TLS CA cert path cannot be a symlink: %s", certPath)
-	}
 	if !info.Mode().IsRegular() {
 		return fmt.Errorf("TLS CA cert path is not a regular file: %s", certPath)
 	}
 
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/utils.go` around lines 32 - 57, validateCACertPath currently
rejects symlinks; change it to resolve symlinks and validate the resolved target
instead: keep the initial filepath.Clean and absolute-path check, use os.Lstat
to detect if the path is a symlink, and if so call
filepath.EvalSymlinks(certPath) (or cleanPath) to get the real path, then use
os.Stat (not Lstat) on the resolved path to ensure it exists and
info.Mode().IsRegular() is true; return errors that mention the resolved path
where appropriate and preserve the existing error messages for non-symlink
failures in validateCACertPath.
🧹 Nitpick comments (1)
plugins/otel/main.go (1)

58-61: Tag Profiles as json:"profiles".

Without the tag this struct marshals as Profiles, not profiles, which diverges from the new schema/UI shape and the migrated DB payload.

💡 Suggested fix
 type Config struct {
-	Profiles []*OtelProfileConfig
+	Profiles []*OtelProfileConfig `json:"profiles"`
 }

As per coding guidelines, JSON uses snake_case matching provider API conventions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/main.go` around lines 58 - 61, The Config struct's Profiles
field is missing a JSON tag so it marshals as "Profiles" instead of "profiles";
update the Config definition to add the JSON tag on the Profiles field (e.g.,
add `json:"profiles"` to Profiles in the Config struct) so serialization matches
the new schema/UI and migrated DB payload.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/schemas/utils.go`:
- Around line 883-885: The copy currently aliases the slice by assigning
keyOrder: original.Function.Parameters.keyOrder which shares the backing array;
change the assignment in the copy routine that handles
original.Function.Parameters to deep-copy the slice (allocate a new slice with
same length and copy elements, handling nil cases) so the new object's keyOrder
is independent from original.Function.Parameters.keyOrder.

In `@plugins/otel/main.go`:
- Around line 101-107: The pre-pass over config.Profiles incorrectly only
expands headers when profileCfg.Enabled is non-nil true and doesn't guard
against nil profile entries; update the loop that iterates config.Profiles to
first skip nil profileCfg, compute enabled := profileCfg.Enabled == nil ||
*profileCfg.Enabled (treat nil as enabled), and if enabled call
injectEnvToHeaders(profileCfg.Headers); also apply the same fix to the second
similar loop mentioned (the block around injectEnvToHeaders at the other
location) and ensure any later loops that dereference profileCfg also check for
nil before accessing fields.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 124-129: The form's validation still requires collector_url for
every profile because otelFormSchema enforces collector_url when OTEL is
enabled, so toggling a profile off (profile.enabled === false) leaves
formState.isValid false and blocks save; update the schema (otelFormSchema) to
conditionally skip validating collector_url when a profile's enabled field is
false (use zod .refine/.optional/.when-like pattern or z.preprocess to make
collector_url optional when profile.enabled === false) or adjust the form logic
to treat disabled profiles as excluded from validation before calling
useForm/OtelFormSchema validation; ensure changes reference otelFormSchema and
the profile.enabled boolean so the UI switch truly disables the profile and
allows formState.isValid to be true for blank/disabled profiles.

In `@ui/lib/types/schemas.ts`:
- Around line 821-831: The validation currently requires collector_url for every
entry in data.otel_config.profiles when data.enabled is true; update the loop in
the data.otel_config.profiles.forEach block to skip profiles where
profile.enabled is explicitly false (i.e., only validate collector_url when
profile.enabled is true or undefined), and only call ctx.addIssue for
missing/empty profile.collector_url in those enabled profiles; reference the
existing data.enabled check, the profiles.forEach iteration,
profile.collector_url, profile.enabled, and ctx.addIssue to locate and change
the condition.

---

Duplicate comments:
In `@core/schemas/utils.go`:
- Around line 403-445: SafeExtractInt64 currently performs unsafe casts for
unsigned integers and floats; update the function to reject any uint/uint64
values > math.MaxInt64 and reject float32/float64 values that are
NaN/Inf/non-integer (fractional) or out of int64 range. Concretely: in the
uint/uint8/uint16/uint32/uint64 branches check the value against math.MaxInt64
(promote to uint64 for comparison) and return false if it exceeds; in
float32/float64 branches convert to float64, use !math.IsFinite
(math.IsNaN/IsInf) to reject non-finite, check that math.Trunc(f)==f (no
fractional part) and that f is between math.MinInt64 and math.MaxInt64 before
casting to int64; keep json.Number handling as-is and leave string parsing
unchanged.

In `@framework/configstore/migrations.go`:
- Around line 6488-6496: The migration updates the stored OTEL plugin payload
(plugin.Config, clears plugin.ConfigJSON, sets plugin.EncryptionStatus) but
doesn’t refresh plugin.ConfigHash, leaving the row with a stale hash; update
plugin.ConfigHash before persisting by computing the new hash using the same
hashing routine used elsewhere (the codepath that normally sets ConfigHash on
save) and assign it to plugin.ConfigHash prior to calling tx.Save(&plugin),
ensuring the saved row has a matching hash for the new config shape.

In `@plugins/otel/metrics.go`:
- Around line 452-522: Current code picks only the latest LLM/retry span
(llmSpan) and records metrics for that single span, which misses
requests/errors/usage from earlier retries/fallbacks; fix by iterating over all
trace.Spans where span.Kind == schemas.SpanKindLLMCall || span.Kind ==
schemas.SpanKindRetry, and for each span build attrs (using
getStringAttr/getIntAttr on span.Attributes or reuse BuildBifrostAttributes with
per-span values) and call the same metric methods (m.RecordUpstreamRequest,
m.RecordUpstreamLatency, m.RecordErrorRequest/m.RecordSuccessRequest) per-span
so counters/tokens/cost/latency reflect every attempt rather than only the final
one; ensure you still fall back to trace.RootSpan only if there are no
LLMCall/Retry spans.
- Around line 210-220: The current branch uses otlpmetrichttp.WithInsecure()
when config.TLSCACert == "" && config.Insecure, which forces plaintext and will
break HTTPS collectors; instead, when insecure=true you should create a TLS
config that sets InsecureSkipVerify=true (via buildTLSConfig or a dedicated
builder) and pass it with otlpmetrichttp.WithTLSClientConfig(tlsConfig) so HTTPS
is preserved while skipping cert verification, or construct a custom
*http.Client and supply it with otlpmetrichttp.WithHTTPClient(); update the
conditional around config.TLSCACert/config.Insecure (and the call sites
buildTLSConfig, WithInsecure, WithTLSClientConfig, WithHTTPClient) to use the
TLSClientConfig or HTTPClient approach rather than WithInsecure().

In `@plugins/otel/utils.go`:
- Around line 32-57: validateCACertPath currently rejects symlinks; change it to
resolve symlinks and validate the resolved target instead: keep the initial
filepath.Clean and absolute-path check, use os.Lstat to detect if the path is a
symlink, and if so call filepath.EvalSymlinks(certPath) (or cleanPath) to get
the real path, then use os.Stat (not Lstat) on the resolved path to ensure it
exists and info.Mode().IsRegular() is true; return errors that mention the
resolved path where appropriate and preserve the existing error messages for
non-symlink failures in validateCACertPath.

In `@transports/config.schema.json`:
- Around line 1541-1545: The schema currently forces "collector_url",
"trace_type", and "protocol" to always be required, preventing intentionally
disabled or incomplete profiles; change the schema so those fields are only
required when a profile's "enabled" is true by replacing the unconditional
"required" array with an if/then conditional (e.g., use "if": {"properties":
{"enabled": {"const": true}}}, "then": {"required":
["collector_url","trace_type","protocol"]}}) so that profiles with "enabled":
false can omit those properties; apply this to the transport profile schema
where the "required" array is defined.
- Around line 1536-1539: The "insecure" schema description is ambiguous about
HTTP vs gRPC behavior; update the "insecure" property description to explicitly
state protocol-specific semantics: mention that when set true (and tls_ca_cert
is not provided) HTTP clients will disable TLS certificate verification (still
use TLS but skip verification), whereas gRPC may fall back to plaintext or
disable TLS verification depending on the gRPC transport implementation;
reference the "insecure" field and the "tls_ca_cert" interaction so readers know
the behavior differs by protocol and when plaintext may be used.
- Around line 1474-1485: The collector_url schema currently allows any URI or a
host:port, causing protocol-mismatches; update the "collector_url" oneOf so the
URI branch requires an http or https scheme (e.g., enforce pattern or format
that starts with "http://" or "https://") and keep the host:port branch limited
to the pattern "^[^:\\s]+:\\d+$" for non-schemed endpoints; apply the same
protocol-specific change to the other identical transport URL field occurrences
(the duplicate block referenced around the later section) so each URL entry only
accepts either an http(s) URI or a host:port, not arbitrary URIs.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 200-228: Add stable data-testid attributes to the remaining OTEL
profile controls: add data-testid={`otel-profile-toggle-${index}`} on the
clickable profile header/div that uses setExpandedProfiles, add
data-testid={`otel-profile-enable-${index}`} on the per-profile Switch rendered
inside FormField for name `otel_config.profiles.${index}.enabled`, add
data-testid={`otel-profile-remove-${index}`} on the remove action/button that
deletes a profile (the element that calls the remove handler), and add
data-testid="otel-profile-add-btn" on the add-profile button used to append new
profiles; apply the same additions in the other similar blocks referenced around
the 232-254 and 572-582 ranges so tests can reliably select these elements.
- Around line 200-208: The collapse header currently uses a clickable div which
is not accessible; replace that div with a semantic button (or if you must keep
a div, add role="button", tabIndex={0}, an onKeyDown handler that toggles on
Enter/Space, and aria-expanded bound to the expanded state) so keyboard users
and assistive tech can toggle it. Update the element wrapping the header (the
one using setExpandedProfiles and index) to expose
aria-expanded={!!expandedProfiles[index]} and ensure the toggle logic
(setExpandedProfiles callback) is invoked from both onClick and the new keyboard
handler.

---

Nitpick comments:
In `@plugins/otel/main.go`:
- Around line 58-61: The Config struct's Profiles field is missing a JSON tag so
it marshals as "Profiles" instead of "profiles"; update the Config definition to
add the JSON tag on the Profiles field (e.g., add `json:"profiles"` to Profiles
in the Config struct) so serialization matches the new schema/UI and migrated DB
payload.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cd39d046-0653-4f82-ade5-3f9024f5f63c

📥 Commits

Reviewing files that changed from the base of the PR and between fe4e0d8 and 1f5f7c5.

📒 Files selected for processing (14)
  • core/schemas/utils.go
  • framework/configstore/migrations.go
  • plugins/otel/changelog.md
  • plugins/otel/converter.go
  • plugins/otel/go.mod
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (3)
  • plugins/otel/changelog.md
  • plugins/otel/go.mod
  • transports/bifrost-http/server/plugins.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • plugins/otel/http.go
  • plugins/otel/grpc.go
  • plugins/otel/converter.go

Comment thread core/schemas/utils.go
Comment thread plugins/otel/main.go Outdated
Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx
Comment thread ui/lib/types/schemas.ts
@sammaji sammaji force-pushed the 04-08-feat_multiple_otel_profiles branch from 1f5f7c5 to f1dbb63 Compare April 14, 2026 10:49
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ui/lib/types/schemas.ts (1)

734-806: ⚠️ Potential issue | 🟠 Major

Skip profile-level endpoint validation when enabled is false.

The backend treats enabled=false as "skip during export", but this superRefine still validates a disabled profile's collector_url format and still requires/validates metrics_endpoint when metrics_enabled remains true. Disabling a half-configured profile won't actually unblock save until those fields are fixed too.

Suggested fix
 export const otelProfileConfigSchema = z
 	.object({
 		enabled: z.boolean().default(true),
@@
 	})
 	.superRefine((data, ctx) => {
+		if (data.enabled === false) {
+			return;
+		}
+
 		const protocol = data.protocol;
 		const hostPortRegex = /^(?!https?:\/\/)([a-zA-Z0-9.-]+|\[[0-9a-fA-F:]+\]|\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})$/;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/lib/types/schemas.ts` around lines 734 - 806, The superRefine currently
validates collector_url and metrics_endpoint even when the profile is disabled;
update the logic in the superRefine callback to skip all endpoint validation
when data.enabled is false (i.e., only run the collector_url and
metrics_endpoint checks when data.enabled is true), and ensure the
metrics_enabled branch also checks data.enabled before requiring/validating
metrics_endpoint; locate the validation helpers validateHttpUrl and
validateHostPort and the protocol/collector_url/metrics_enabled/metrics_endpoint
checks inside superRefine and wrap or gate them with a single enabled check so
disabled profiles are not blocked by format validation.
♻️ Duplicate comments (5)
core/schemas/utils.go (2)

883-885: ⚠️ Potential issue | 🟠 Major

DeepCopyChatTool still aliases keyOrder.

At Line 884, assigning keyOrder directly shares the original slice backing array, so mutations can leak across copies.

Proposed fix
 			copyParams := &ToolFunctionParameters{
 				Type:                original.Function.Parameters.Type,
-				keyOrder:            original.Function.Parameters.keyOrder,
 				explicitEmptyObject: original.Function.Parameters.explicitEmptyObject,
 			}
+			if original.Function.Parameters.keyOrder != nil {
+				copyParams.keyOrder = make([]string, len(original.Function.Parameters.keyOrder))
+				copy(copyParams.keyOrder, original.Function.Parameters.keyOrder)
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 883 - 885, The DeepCopyChatTool
implementation is copying original.Function.Parameters.keyOrder by reference
which shares the underlying slice; update DeepCopyChatTool so keyOrder is
deep-copied: allocate a new slice (handle nil/zero length), copy the elements
from original.Function.Parameters.keyOrder into it, and assign that new slice to
the copied.Parameters.keyOrder to prevent mutations from leaking across copies.

403-445: ⚠️ Potential issue | 🟠 Major

SafeExtractInt64 still allows unsafe casts.

At Line 420, Line 428, Line 430, and Line 432, unchecked casts can wrap (uint64 > math.MaxInt64) or silently truncate/accept invalid float values (NaN, Inf, fractional/out-of-range). This can corrupt emitted attribute values.

Proposed fix
 import (
 	"encoding/json"
 	"fmt"
+	"math"
 	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
 	"strings"
@@
 	case uint:
-		return int64(v), true
+		if uint64(v) > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
@@
 	case uint64:
-		return int64(v), true
+		if v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
 	case float32:
-		return int64(v), true
+		f := float64(v)
+		if math.IsNaN(f) || math.IsInf(f, 0) || math.Trunc(f) != f || f < math.MinInt64 || f > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(f), true
 	case float64:
-		return int64(v), true
+		if math.IsNaN(v) || math.IsInf(v, 0) || math.Trunc(v) != v || v < math.MinInt64 || v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
In Go, what are the exact semantics when converting:
1) uint64 values greater than math.MaxInt64 to int64
2) float64 NaN/Inf or out-of-range values to int64
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 403 - 445, SafeExtractInt64 currently
performs unchecked casts that can wrap or silently accept invalid floats; update
the function to validate uint/unsigned values and floats before converting: in
SafeExtractInt64 (the switch cases handling uint, uint8, uint16, uint32, uint64
and float32/float64) reject any unsigned value greater than math.MaxInt64 (use
math.MaxInt64) instead of blindly casting, and for floats first check
math.IsNaN/IsInf and then ensure the value is within the int64 range and has no
fractional part (e.g., compare f == math.Trunc(f)); only then convert to int64
and return true, otherwise return 0,false. Include the same guards for the
json.Number path if it can produce out-of-range/invalid floats.
plugins/otel/metrics.go (2)

210-220: ⚠️ Potential issue | 🟠 Major

HTTP insecure path still risks losing HTTPS transport.

At Line 213, otlpmetrichttp.WithInsecure() is used for Insecure=true with no CA cert, which can force plaintext behavior and leaves no explicit HTTPS+skip-verify path.

Proposed fix direction
+import (
+    "crypto/tls"
+    "net/http"
+    "strings"
+)
@@
-	if config.TLSCACert == "" && config.Insecure {
-		opts = append(opts, otlpmetrichttp.WithInsecure())
+	if config.TLSCACert == "" && config.Insecure {
+		if strings.HasPrefix(strings.ToLower(config.Endpoint), "https://") {
+			httpClient := &http.Client{
+				Transport: &http.Transport{
+					TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, // `#nosec` G402
+				},
+			}
+			opts = append(opts, otlpmetrichttp.WithHTTPClient(httpClient))
+		} else {
+			opts = append(opts, otlpmetrichttp.WithInsecure())
+		}
 	} else {
For go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.43.0:
How do WithEndpointURL, WithInsecure, WithTLSClientConfig, and WithHTTPClient interact for HTTPS endpoints when certificate verification must be skipped?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 210 - 220, The current branch uses
otlpmetrichttp.WithInsecure() when config.Insecure is true and no TLSCACert,
which can force plaintext instead of HTTPS+skip-verify; change the logic in the
block around buildTLSConfig/WithInsecure so that when config.Insecure is true
you build a TLS config with InsecureSkipVerify (call buildTLSConfig with a flag
to enable skip-verify or add such behavior) and pass it via
otlpmetrichttp.WithTLSClientConfig(tlsConfig) (and only use
otlpmetrichttp.WithInsecure() when the configured endpoint is explicitly http),
referencing the existing buildTLSConfig, config.Insecure,
otlpmetrichttp.WithInsecure, otlpmetrichttp.WithTLSClientConfig, and any
endpoint handling (WithEndpointURL) to ensure HTTPS endpoints keep TLS but skip
verification instead of reverting to plaintext.

452-552: ⚠️ Potential issue | 🟠 Major

Final-attempt-only extraction undercounts retry/fallback traffic.

At Line 452-Line 464, selecting one span and then recording metrics once (Line 509 onward) hides intermediate failed attempts, so request/error/token/cost metrics no longer reflect total upstream load.

Proposed fix direction
-	// Prefer the last attempt span (LLM call or retry) so metrics reflect the final outcome.
-	var llmSpan *schemas.Span
-	for _, span := range trace.Spans {
-		...
-	}
-	if llmSpan == nil {
-		llmSpan = trace.RootSpan
-	}
-	if llmSpan == nil {
-		return
-	}
-	attrs := llmSpan.Attributes
+	attemptSpans := make([]*schemas.Span, 0, len(trace.Spans))
+	for _, span := range trace.Spans {
+		if span.Kind == schemas.SpanKindLLMCall || span.Kind == schemas.SpanKindRetry {
+			attemptSpans = append(attemptSpans, span)
+		}
+	}
+	if len(attemptSpans) == 0 && trace.RootSpan != nil {
+		attemptSpans = append(attemptSpans, trace.RootSpan)
+	}
+	for _, llmSpan := range attemptSpans {
+		attrs := llmSpan.Attributes
+		// existing attribute extraction + Record* calls per span
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 452 - 552, The current code picks only
the final llmSpan and emits metrics once, which undercounts retries/fallbacks;
instead iterate all spans in trace.Spans where span.Kind is SpanKindLLMCall or
SpanKindRetry and for each span extract attributes (using
getStringAttr/getIntAttr/getFloat64Attr, preferring "request.type" or falling
back to span.Name), build otelAttrs via
BuildBifrostAttributes/BifrostAttrParams, and call the existing metric emitters
(m.RecordUpstreamRequest, m.RecordUpstreamLatency,
m.RecordErrorRequest/m.RecordSuccessRequest, m.RecordInputTokens,
m.RecordOutputTokens, m.RecordCost, m.RecordStreamFirstTokenLatency) per span to
accumulate counts/latency/tokens/cost across all attempts rather than only the
last one.
ui/app/workspace/observability/fragments/otelFormFragment.tsx (1)

200-228: ⚠️ Potential issue | 🟡 Minor

Add data-testids to the new profile-card actions.

The new expand toggle, per-profile enable switch, remove action, and add-profile button still don't expose stable selectors. Please add ids such as otel-profile-toggle-${index}, otel-profile-enable-${index}, otel-profile-remove-${index}, and otel-profile-add-btn.

As per coding guidelines, ui/**/*.{tsx,ts}: Add new interactive UI elements with data-testid attributes following the pattern: data-testid="<entity>-<element>-<qualifier>".

Also applies to: 231-255, 571-582

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
200 - 228, The profile card interactive elements lack stable selectors—add
data-testid attributes for the expand toggle, per-profile enable switch, remove
action, and the add-profile button: on the container div that calls
setExpandedProfiles (use data-testid={`otel-profile-toggle-${index}`}), on the
Switch rendered inside FormField (data-testid={`otel-profile-enable-${index}`}),
on the remove action element (data-testid={`otel-profile-remove-${index}`}), and
on the add-profile button control (data-testid="otel-profile-add-btn"); ensure
you place these attributes directly on the rendered interactive elements where
functions/components like setExpandedProfiles, FormField/ Switch, and the
remove/add handlers are defined so the testids match the patterns requested.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 627-653: The Tooltip trigger must not be the actual disabled
<Button> because disabled elements don't emit hover/focus; update the
<TooltipTrigger asChild> usage so it wraps a non-disabled wrapper (e.g., a
<span> or <div>) and place the <Button> inside that wrapper instead of using the
Button as the trigger directly; ensure the wrapper forwards focus/hover (add
tabIndex={0} and aria-disabled when appropriate) and keep the Button's disabled
prop bound to hasOtelAccess || !form.formState.isDirty ||
!form.formState.isValid so the tooltip content (the <TooltipContent> message)
becomes reachable when the save is disabled.
- Around line 177-180: The onSubmit handler currently calls
onSave(data).finally(...) which detaches the promise and can cause unhandled
rejections; change onSubmit to be async, await the onSave(data) call inside a
try/catch, setIsSaving(true) before the await and setIsSaving(false) in finally,
and handle or rethrow the caught error as appropriate so rejections are not left
unhandled (refer to onSubmit, onSave, and setIsSaving).
- Around line 200-208: Replace the non-focusable <div> toggle with a real
<button type="button"> in the Otel form fragment so keyboard users can activate
it; keep the existing className, onClick handler that uses setExpandedProfiles
and the index/expanded state logic, add
aria-expanded={!!expandedProfiles[index]} to reflect state, and ensure the
enable switch and delete controls remain sibling elements (not nested inside the
button).

---

Outside diff comments:
In `@ui/lib/types/schemas.ts`:
- Around line 734-806: The superRefine currently validates collector_url and
metrics_endpoint even when the profile is disabled; update the logic in the
superRefine callback to skip all endpoint validation when data.enabled is false
(i.e., only run the collector_url and metrics_endpoint checks when data.enabled
is true), and ensure the metrics_enabled branch also checks data.enabled before
requiring/validating metrics_endpoint; locate the validation helpers
validateHttpUrl and validateHostPort and the
protocol/collector_url/metrics_enabled/metrics_endpoint checks inside
superRefine and wrap or gate them with a single enabled check so disabled
profiles are not blocked by format validation.

---

Duplicate comments:
In `@core/schemas/utils.go`:
- Around line 883-885: The DeepCopyChatTool implementation is copying
original.Function.Parameters.keyOrder by reference which shares the underlying
slice; update DeepCopyChatTool so keyOrder is deep-copied: allocate a new slice
(handle nil/zero length), copy the elements from
original.Function.Parameters.keyOrder into it, and assign that new slice to the
copied.Parameters.keyOrder to prevent mutations from leaking across copies.
- Around line 403-445: SafeExtractInt64 currently performs unchecked casts that
can wrap or silently accept invalid floats; update the function to validate
uint/unsigned values and floats before converting: in SafeExtractInt64 (the
switch cases handling uint, uint8, uint16, uint32, uint64 and float32/float64)
reject any unsigned value greater than math.MaxInt64 (use math.MaxInt64) instead
of blindly casting, and for floats first check math.IsNaN/IsInf and then ensure
the value is within the int64 range and has no fractional part (e.g., compare f
== math.Trunc(f)); only then convert to int64 and return true, otherwise return
0,false. Include the same guards for the json.Number path if it can produce
out-of-range/invalid floats.

In `@plugins/otel/metrics.go`:
- Around line 210-220: The current branch uses otlpmetrichttp.WithInsecure()
when config.Insecure is true and no TLSCACert, which can force plaintext instead
of HTTPS+skip-verify; change the logic in the block around
buildTLSConfig/WithInsecure so that when config.Insecure is true you build a TLS
config with InsecureSkipVerify (call buildTLSConfig with a flag to enable
skip-verify or add such behavior) and pass it via
otlpmetrichttp.WithTLSClientConfig(tlsConfig) (and only use
otlpmetrichttp.WithInsecure() when the configured endpoint is explicitly http),
referencing the existing buildTLSConfig, config.Insecure,
otlpmetrichttp.WithInsecure, otlpmetrichttp.WithTLSClientConfig, and any
endpoint handling (WithEndpointURL) to ensure HTTPS endpoints keep TLS but skip
verification instead of reverting to plaintext.
- Around line 452-552: The current code picks only the final llmSpan and emits
metrics once, which undercounts retries/fallbacks; instead iterate all spans in
trace.Spans where span.Kind is SpanKindLLMCall or SpanKindRetry and for each
span extract attributes (using getStringAttr/getIntAttr/getFloat64Attr,
preferring "request.type" or falling back to span.Name), build otelAttrs via
BuildBifrostAttributes/BifrostAttrParams, and call the existing metric emitters
(m.RecordUpstreamRequest, m.RecordUpstreamLatency,
m.RecordErrorRequest/m.RecordSuccessRequest, m.RecordInputTokens,
m.RecordOutputTokens, m.RecordCost, m.RecordStreamFirstTokenLatency) per span to
accumulate counts/latency/tokens/cost across all attempts rather than only the
last one.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 200-228: The profile card interactive elements lack stable
selectors—add data-testid attributes for the expand toggle, per-profile enable
switch, remove action, and the add-profile button: on the container div that
calls setExpandedProfiles (use data-testid={`otel-profile-toggle-${index}`}), on
the Switch rendered inside FormField
(data-testid={`otel-profile-enable-${index}`}), on the remove action element
(data-testid={`otel-profile-remove-${index}`}), and on the add-profile button
control (data-testid="otel-profile-add-btn"); ensure you place these attributes
directly on the rendered interactive elements where functions/components like
setExpandedProfiles, FormField/ Switch, and the remove/add handlers are defined
so the testids match the patterns requested.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 23b246b9-3db9-4125-8318-720bf92266b6

📥 Commits

Reviewing files that changed from the base of the PR and between 1f5f7c5 and f1dbb63.

📒 Files selected for processing (14)
  • core/schemas/utils.go
  • framework/configstore/migrations.go
  • plugins/otel/changelog.md
  • plugins/otel/converter.go
  • plugins/otel/go.mod
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (4)
  • plugins/otel/changelog.md
  • transports/bifrost-http/server/plugins.go
  • plugins/otel/go.mod
  • plugins/otel/utils.go
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/otel/http.go
  • plugins/otel/grpc.go
  • transports/config.schema.json
  • framework/configstore/migrations.go
  • plugins/otel/converter.go
  • plugins/otel/main.go

Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx
Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx
Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx
coderabbitai[bot]
coderabbitai Bot previously approved these changes Apr 14, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
core/schemas/utils.go (2)

881-886: ⚠️ Potential issue | 🟠 Major

Deep-copy keyOrder as well.

keyOrder still aliases the original slice backing array, so mutating the copied tool mutates the source and breaks the deep-copy contract.

Suggested fix
 			copyParams := &ToolFunctionParameters{
 				Type:                original.Function.Parameters.Type,
-				keyOrder:            original.Function.Parameters.keyOrder,
 				explicitEmptyObject: original.Function.Parameters.explicitEmptyObject,
 			}
+			if original.Function.Parameters.keyOrder != nil {
+				copyParams.keyOrder = make([]string, len(original.Function.Parameters.keyOrder))
+				copy(copyParams.keyOrder, original.Function.Parameters.keyOrder)
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 881 - 886,
original.Function.Parameters.keyOrder is being shallow-copied into the new
ToolFunctionParameters causing shared backing arrays; update the copy logic
inside the block that constructs &ToolFunctionParameters so keyOrder is
deep-copied (e.g., allocate a new slice and copy elements from
original.Function.Parameters.keyOrder) before assigning to the new struct's
keyOrder field to prevent mutations in the copy from affecting the source.

403-445: ⚠️ Potential issue | 🟠 Major

Reject lossy conversions in SafeExtractInt64.

Large uint/uint64 inputs can wrap negative here, and the float branches silently accept fractional, NaN, and Inf values. That will corrupt OTEL attributes instead of rejecting them. Negative signed ints are fine; the unsafe cases are the unsigned and float coercions.

Suggested fix
 import (
 	"encoding/json"
 	"fmt"
+	"math"
 	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
@@
 func SafeExtractInt64(value any) (int64, bool) {
 	if value == nil {
 		return 0, false
 	}
 	switch v := value.(type) {
@@
 	case uint:
-		return int64(v), true
+		if uint64(v) > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
@@
 	case uint64:
-		return int64(v), true
+		if v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
 	case float32:
-		return int64(v), true
+		f := float64(v)
+		if math.IsNaN(f) || math.IsInf(f, 0) || math.Trunc(f) != f || f < math.MinInt64 || f > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(f), true
 	case float64:
-		return int64(v), true
+		if math.IsNaN(v) || math.IsInf(v, 0) || math.Trunc(v) != v || v < math.MinInt64 || v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 403 - 445, The SafeExtractInt64 function
currently performs lossy conversions for unsigned integers and floats; update it
to reject unsafe cases: for uint/uint8/uint16/uint32/uint64 branches check the
value is <= math.MaxInt64 before casting (otherwise return 0,false); remove or
change float32/float64 branches to only accept non-NaN, non-Inf values that are
exact integers within the int64 range (e.g., use math.IsNaN/IsInf and compare v
== math.Trunc(v) and bounds against math.MaxInt64/math.MinInt64), returning
false for fractional/NaN/Inf; keep existing safe handling for signed ints,
json.Number (which already errors on decimals), and string parsing. Ensure these
checks are implemented inside SafeExtractInt64 so unsafe conversions are
rejected rather than silently coerced.
transports/config.schema.json (1)

1474-1516: ⚠️ Potential issue | 🟠 Major

Make OTEL endpoints conditional on protocol.

collector_url and metrics_endpoint currently accept both http(s)://... and host:port for every profile. That lets configs pass the authoritative schema with values that are invalid for the selected protocol.

Suggested shape
                       "items": {
                         "type": "object",
                         "properties": {
@@
                           "metrics_endpoint": {
                             "type": "string",
                             "description": "OTLP metrics endpoint URL (e.g., http://otel-collector:4318/v1/metrics for HTTP or otel-collector:4317 for gRPC)",
-                            "oneOf": [
-                              { "format": "uri" },
-                              { "pattern": "^[^:\\s]+:\\d+$" }
-                            ]
+                            "type": "string"
                           },
@@
                         "required": [
                           "collector_url",
                           "trace_type",
                           "protocol"
                         ],
+                        "allOf": [
+                          {
+                            "if": {
+                              "properties": { "protocol": { "const": "http" } },
+                              "required": ["protocol"]
+                            },
+                            "then": {
+                              "properties": {
+                                "collector_url": { "pattern": "^https?://" },
+                                "metrics_endpoint": { "pattern": "^https?://" }
+                              }
+                            }
+                          },
+                          {
+                            "if": {
+                              "properties": { "protocol": { "const": "grpc" } },
+                              "required": ["protocol"]
+                            },
+                            "then": {
+                              "properties": {
+                                "collector_url": { "pattern": "^[^:\\s]+:\\d+$" },
+                                "metrics_endpoint": { "pattern": "^[^:\\s]+:\\d+$" }
+                              }
+                            }
+                          }
+                        ],
                         "additionalProperties": false
                       }

As per coding guidelines, transports/config.schema.json is the authoritative definition for all config.json fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1474 - 1516, The schema currently
allows both URI and host:port formats for collector_url and metrics_endpoint
regardless of protocol; update transports/config.schema.json to make those
endpoints conditional on the protocol property by adding JSON Schema conditional
subschemas (if/then/else or dependentSchemas) keyed on "protocol" (e.g., if
{properties:{protocol:{const:"http"}}} then enforce "collector_url" and
"metrics_endpoint" to use "format":"uri", and if
{properties:{protocol:{const:"grpc"}}} then enforce "collector_url" and
"metrics_endpoint" to match the host:port "pattern":"^[^:\\s]+:\\d+$");
reference the existing property names collector_url, metrics_endpoint and
protocol when editing so configs validate only the allowed format for the
selected protocol.
plugins/otel/utils.go (1)

37-57: ⚠️ Potential issue | 🟠 Major

Don’t reject mounted CA bundles by path shape alone.

The absolute-path + no-symlink policy blocks standard Kubernetes Secret/ConfigMap mounts, which are symlink-backed. Resolve the operator-supplied path first, then validate the resolved target.

Suggested fix
 func validateCACertPath(certPath string) error {
 	if certPath == "" {
 		return nil
 	}
 
-	cleanPath := filepath.Clean(certPath)
-
-	if !filepath.IsAbs(cleanPath) {
-		return fmt.Errorf("TLS CA cert path must be absolute: %s", certPath)
-	}
-
-	info, err := os.Lstat(cleanPath)
+	cleanPath := filepath.Clean(certPath)
+	absPath, err := filepath.Abs(cleanPath)
 	if err != nil {
-		return fmt.Errorf("TLS CA cert path not accessible: %w", err)
+		return fmt.Errorf("failed to resolve TLS CA cert path: %w", err)
 	}
-	if info.Mode()&os.ModeSymlink != 0 {
-		return fmt.Errorf("TLS CA cert path cannot be a symlink: %s", certPath)
+
+	realPath, err := filepath.EvalSymlinks(absPath)
+	if err != nil {
+		return fmt.Errorf("TLS CA cert path not accessible: %w", err)
 	}
+
+	info, err := os.Stat(realPath)
+	if err != nil {
+		return fmt.Errorf("TLS CA cert path not accessible: %w", err)
+	}
+
 	if !info.Mode().IsRegular() {
-		return fmt.Errorf("TLS CA cert path is not a regular file: %s", certPath)
+		return fmt.Errorf("TLS CA cert path is not a regular file: %s", realPath)
 	}
 
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/utils.go` around lines 37 - 57, The code currently rejects
symlink-backed mounts; instead, after cleaning and confirming the supplied
certPath is absolute (filepath.Clean + filepath.IsAbs), resolve symlinks with
filepath.EvalSymlinks(cleanPath) to get resolvedPath, then use
os.Stat(resolvedPath) (not os.Lstat) to verify the target exists and is a
regular file (info.Mode().IsRegular()); remove the explicit symlink rejection
(info.Mode()&os.ModeSymlink check) and return clear errors if EvalSymlinks or
Stat fail or the resolved target is not a regular file.
🧹 Nitpick comments (1)
ui/app/workspace/observability/fragments/otelFormFragment.tsx (1)

573-584: Add data-testid attributes to new profile controls.

Per coding guidelines, new interactive UI elements need data-testid attributes. The following are missing:

  • "Add Profile" button (line 574)
  • Profile remove button (line 234)
  • Profile enable switch (line 224)
Suggested fix
 <Button
   type="button"
   variant="outline"
   size="sm"
   onClick={() => append({ ...DEFAULT_PROFILE })}
   disabled={!hasOtelAccess}
   className="flex items-center gap-2"
+  data-testid="otel-profile-add-btn"
 >
   <Plus className="size-4" />
   Add Profile
 </Button>
 <Switch
   checked={enabledField.value}
   onCheckedChange={enabledField.onChange}
   onClick={(e) => e.stopPropagation()}
   disabled={!hasOtelAccess}
   aria-label="Enable profile"
+  data-testid={`otel-profile-enable-${index}`}
 />
 <Button
   type="button"
   variant="ghost"
   size="icon"
   className="size-7"
   onClick={(e) => {
     e.stopPropagation();
     remove(index);
     ...
   }}
   disabled={!hasOtelAccess}
   title="Remove profile"
+  data-testid={`otel-profile-remove-${index}`}
 >

As per coding guidelines: ui/**/*.{tsx,ts}: Add new interactive UI elements with data-testid attributes following the pattern: data-testid="<entity>-<element>-<qualifier>".

Also applies to: 233-256, 219-231

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
573 - 584, The new interactive controls are missing data-testid attributes; add
data-testid attributes following the pattern
data-testid="<entity>-<element>-<qualifier>" to the "Add Profile" Button (the
JSX Button whose onClick calls append({ ...DEFAULT_PROFILE })), to the profile
remove control (the Button used to remove a profile in the profile list/map),
and to the profile enable control (the Switch/toggle used to enable/disable a
profile); ensure each uses clear names like data-testid="profile-add-button",
"profile-remove-button-<index>" or similar unique qualifier, and
"profile-enable-switch-<index>" so tests can reliably target them.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/otel/utils.go`:
- Around line 72-88: The function currently maps insecureMode to
cfg.InsecureSkipVerify which enables HTTPS-without-verification; remove that
mapping so insecureMode no longer sets cfg.InsecureSkipVerify. Keep the TLS CA
handling for tlsCACert (validateCACertPath, os.ReadFile, x509.NewCertPool,
cfg.RootCAs) but delete the else if insecureMode branch; instead leave TLS
config untouched and document (or add a comment) that callers of this helper
(HTTP/gRPC setup) must choose plaintext/insecure transport themselves when
insecureMode is requested. Ensure no other code paths in this helper set
InsecureSkipVerify.

---

Duplicate comments:
In `@core/schemas/utils.go`:
- Around line 881-886: original.Function.Parameters.keyOrder is being
shallow-copied into the new ToolFunctionParameters causing shared backing
arrays; update the copy logic inside the block that constructs
&ToolFunctionParameters so keyOrder is deep-copied (e.g., allocate a new slice
and copy elements from original.Function.Parameters.keyOrder) before assigning
to the new struct's keyOrder field to prevent mutations in the copy from
affecting the source.
- Around line 403-445: The SafeExtractInt64 function currently performs lossy
conversions for unsigned integers and floats; update it to reject unsafe cases:
for uint/uint8/uint16/uint32/uint64 branches check the value is <= math.MaxInt64
before casting (otherwise return 0,false); remove or change float32/float64
branches to only accept non-NaN, non-Inf values that are exact integers within
the int64 range (e.g., use math.IsNaN/IsInf and compare v == math.Trunc(v) and
bounds against math.MaxInt64/math.MinInt64), returning false for
fractional/NaN/Inf; keep existing safe handling for signed ints, json.Number
(which already errors on decimals), and string parsing. Ensure these checks are
implemented inside SafeExtractInt64 so unsafe conversions are rejected rather
than silently coerced.

In `@plugins/otel/utils.go`:
- Around line 37-57: The code currently rejects symlink-backed mounts; instead,
after cleaning and confirming the supplied certPath is absolute (filepath.Clean
+ filepath.IsAbs), resolve symlinks with filepath.EvalSymlinks(cleanPath) to get
resolvedPath, then use os.Stat(resolvedPath) (not os.Lstat) to verify the target
exists and is a regular file (info.Mode().IsRegular()); remove the explicit
symlink rejection (info.Mode()&os.ModeSymlink check) and return clear errors if
EvalSymlinks or Stat fail or the resolved target is not a regular file.

In `@transports/config.schema.json`:
- Around line 1474-1516: The schema currently allows both URI and host:port
formats for collector_url and metrics_endpoint regardless of protocol; update
transports/config.schema.json to make those endpoints conditional on the
protocol property by adding JSON Schema conditional subschemas (if/then/else or
dependentSchemas) keyed on "protocol" (e.g., if
{properties:{protocol:{const:"http"}}} then enforce "collector_url" and
"metrics_endpoint" to use "format":"uri", and if
{properties:{protocol:{const:"grpc"}}} then enforce "collector_url" and
"metrics_endpoint" to match the host:port "pattern":"^[^:\\s]+:\\d+$");
reference the existing property names collector_url, metrics_endpoint and
protocol when editing so configs validate only the allowed format for the
selected protocol.

---

Nitpick comments:
In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 573-584: The new interactive controls are missing data-testid
attributes; add data-testid attributes following the pattern
data-testid="<entity>-<element>-<qualifier>" to the "Add Profile" Button (the
JSX Button whose onClick calls append({ ...DEFAULT_PROFILE })), to the profile
remove control (the Button used to remove a profile in the profile list/map),
and to the profile enable control (the Switch/toggle used to enable/disable a
profile); ensure each uses clear names like data-testid="profile-add-button",
"profile-remove-button-<index>" or similar unique qualifier, and
"profile-enable-switch-<index>" so tests can reliably target them.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bd8926f0-2593-4337-a80d-ab763ac6b303

📥 Commits

Reviewing files that changed from the base of the PR and between f1dbb63 and be02248.

📒 Files selected for processing (14)
  • core/schemas/utils.go
  • framework/configstore/migrations.go
  • plugins/otel/changelog.md
  • plugins/otel/converter.go
  • plugins/otel/go.mod
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (3)
  • plugins/otel/changelog.md
  • plugins/otel/go.mod
  • transports/bifrost-http/server/plugins.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • plugins/otel/grpc.go
  • framework/configstore/migrations.go
  • plugins/otel/http.go
  • plugins/otel/metrics.go

Comment thread plugins/otel/utils.go
Comment thread ui/lib/types/schemas.ts
Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (5)
ui/app/workspace/observability/fragments/otelFormFragment.tsx (3)

198-200: ⚠️ Potential issue | 🟠 Major

Await onSave inside onSubmit.

onSave(data).finally(...) detaches the save promise from handleSubmit. If onSave rejects, this leaves an unhandled rejection even though isSaving gets reset.

Suggested fix
-  const onSubmit = (data: OtelFormSchema) => {
-    setIsSaving(true);
-    onSave(data).finally(() => setIsSaving(false));
-  };
+  const onSubmit = async (data: OtelFormSchema) => {
+    setIsSaving(true);
+    try {
+      await onSave(data);
+    } catch {
+      // onSave already reports the failure upstream.
+    } finally {
+      setIsSaving(false);
+    }
+  };
#!/bin/bash
# Verify the submit handler doesn't await or return the save promise.
nl -ba ui/app/workspace/observability/fragments/otelFormFragment.tsx | sed -n '196,202p'

Expected result: onSubmit is synchronous and calls onSave(data).finally(...) without await or return.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
198 - 200, The submit handler onSubmit currently calls onSave(data).finally(...)
without awaiting or returning it, which detaches the promise and can cause
unhandled rejections; update onSubmit to be async and either await onSave(data)
inside a try/finally or return the onSave(...) promise so rejections propagate,
and ensure setIsSaving(true) is paired with a finally block that calls
setIsSaving(false) (referencing onSubmit, onSave, and setIsSaving).

648-672: ⚠️ Potential issue | 🟡 Minor

The disabled save button can't trigger this tooltip.

TooltipTrigger asChild wraps the same button you disable when the form is pristine/invalid, but disabled buttons don't emit hover or focus events. The validation hint becomes unreachable exactly when it's needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
648 - 672, The tooltip is never reachable because TooltipTrigger asChild wraps
the disabled Button; wrap the Button in a focusable/hoverable wrapper instead so
events fire even when the button is disabled: replace the direct TooltipTrigger
asChild on Button (references: TooltipTrigger, Button, TooltipContent) with
TooltipTrigger asChild around a wrapper element (e.g., a span/div with
inline-block styling) that contains the Button, keep the Button
disabled/isLoading props as-is, and ensure the wrapper allows pointer events so
the TooltipContent can show when the form is pristine/invalid.

221-229: ⚠️ Potential issue | 🟡 Minor

Add test ids to the new profile controls.

The collapsible header, per-profile enable switch, remove button, and add-profile button are new interactive controls, but they still don't expose stable selectors. Please add data-testids for them using the repo's three-part convention.

As per coding guidelines, ui/**/*.{tsx,ts} must add new interactive UI elements with data-testid attributes following the pattern data-testid="<entity>-<element>-<qualifier>".

Also applies to: 243-249, 253-275, 593-603

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
221 - 229, Add stable test ids to the new interactive profile controls: add a
data-testid on the collapsible header div (the element that calls
setExpandedProfiles with index), add data-testid attributes to the per-profile
enable switch component (the component that toggles profile enabled/disabled),
to the per-profile remove button (the button that deletes a profile), and to the
global add-profile button (the control that appends a new profile). Use the repo
convention data-testid="<entity>-<element>-<qualifier>" and ensure unique
qualifiers per-profile (e.g., include the index or "index-<index>") so selectors
reference the profile index; update the JSX for the elements referenced by
setExpandedProfiles, the enable switch handler, the remove handler, and the
add-profile handler accordingly.
transports/config.schema.json (2)

1519-1529: ⚠️ Potential issue | 🟠 Major

Make endpoint validation conditional on protocol.

collector_url and metrics_endpoint still accept both URI and host:port forms regardless of the selected protocol. That lets protocol: "http" + collector:4317 and protocol: "grpc" + https://... pass the authoritative schema and fail later in OTEL initialization. Please gate both fields with if/then rules keyed on protocol so HTTP only accepts http(s)://... and gRPC only accepts host:port.

As per coding guidelines, transports/config.schema.json is the authoritative definition for all config.json fields.

Also applies to: 1553-1563

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1519 - 1529, Update the JSON
Schema rules so endpoint validation is conditional on the `protocol` value:
replace the current `oneOf` on `collector_url` and `metrics_endpoint` with
`if/then` branches keyed on `protocol` (e.g. `if:
{properties:{protocol:{const:"http"}}} then:
{properties:{collector_url:{type:"string",format:"uri"}}}` and a corresponding
`if protocol == "grpc" then` that enforces the `^[^:\s]+:\d+$` pattern), and
mirror the same conditional logic for `metrics_endpoint`; keep the existing
descriptions and types but move the format/pattern into the appropriate `then`
clauses so HTTP only allows URI forms and gRPC only allows host:port.

1583-1586: ⚠️ Potential issue | 🟡 Minor

Document the actual insecure behavior.

"Skip TLS verification" is misleading here. This flag is intentionally used for insecure/plaintext OTEL transport paths rather than HTTPS-with-verification-disabled, so the schema description should spell out the per-protocol behavior instead of implying InsecureSkipVerify.

Based on learnings, in plugins/otel/metrics.go the Insecure: true path intentionally uses plain HTTP when no CA cert is set, rather than HTTPS with certificate verification disabled.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1583 - 1586, Update the
"insecure" property description to reflect actual behavior: instead of saying
"Skip TLS verification", document that for OTEL/OTLP (see
plugins/otel/metrics.go) setting "insecure": true causes the client to use an
insecure/plaintext transport (HTTP) when no tls_ca_cert is provided, and that it
does not mean "InsecureSkipVerify" for HTTPS; mention the interaction with
tls_ca_cert so readers know plaintext is used when tls_ca_cert is absent and
insecure is true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 75-77: OtelFormFragment currently mixes legacy flat currentConfig
and new multi-profile shapes: update the OtelFormFragmentProps type so
currentConfig includes a profiles: Array<...> field (or create a migrated type
used by this fragment) so RawProfile (and expressions like
cfg?.profiles?.length) are valid, remove the old legacy useForm declaration and
keep only the profile-based const form = useForm(...) used for profiles, and
ensure any access to cfg.profiles is guarded (e.g., optional chaining) against
undefined during migration; reference RawProfile,
OtelFormFragmentProps["currentConfig"], cfg?.profiles?.length, and the duplicate
const form = useForm(...) to locate the fixes.

---

Duplicate comments:
In `@transports/config.schema.json`:
- Around line 1519-1529: Update the JSON Schema rules so endpoint validation is
conditional on the `protocol` value: replace the current `oneOf` on
`collector_url` and `metrics_endpoint` with `if/then` branches keyed on
`protocol` (e.g. `if: {properties:{protocol:{const:"http"}}} then:
{properties:{collector_url:{type:"string",format:"uri"}}}` and a corresponding
`if protocol == "grpc" then` that enforces the `^[^:\s]+:\d+$` pattern), and
mirror the same conditional logic for `metrics_endpoint`; keep the existing
descriptions and types but move the format/pattern into the appropriate `then`
clauses so HTTP only allows URI forms and gRPC only allows host:port.
- Around line 1583-1586: Update the "insecure" property description to reflect
actual behavior: instead of saying "Skip TLS verification", document that for
OTEL/OTLP (see plugins/otel/metrics.go) setting "insecure": true causes the
client to use an insecure/plaintext transport (HTTP) when no tls_ca_cert is
provided, and that it does not mean "InsecureSkipVerify" for HTTPS; mention the
interaction with tls_ca_cert so readers know plaintext is used when tls_ca_cert
is absent and insecure is true.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 198-200: The submit handler onSubmit currently calls
onSave(data).finally(...) without awaiting or returning it, which detaches the
promise and can cause unhandled rejections; update onSubmit to be async and
either await onSave(data) inside a try/finally or return the onSave(...) promise
so rejections propagate, and ensure setIsSaving(true) is paired with a finally
block that calls setIsSaving(false) (referencing onSubmit, onSave, and
setIsSaving).
- Around line 648-672: The tooltip is never reachable because TooltipTrigger
asChild wraps the disabled Button; wrap the Button in a focusable/hoverable
wrapper instead so events fire even when the button is disabled: replace the
direct TooltipTrigger asChild on Button (references: TooltipTrigger, Button,
TooltipContent) with TooltipTrigger asChild around a wrapper element (e.g., a
span/div with inline-block styling) that contains the Button, keep the Button
disabled/isLoading props as-is, and ensure the wrapper allows pointer events so
the TooltipContent can show when the form is pristine/invalid.
- Around line 221-229: Add stable test ids to the new interactive profile
controls: add a data-testid on the collapsible header div (the element that
calls setExpandedProfiles with index), add data-testid attributes to the
per-profile enable switch component (the component that toggles profile
enabled/disabled), to the per-profile remove button (the button that deletes a
profile), and to the global add-profile button (the control that appends a new
profile). Use the repo convention data-testid="<entity>-<element>-<qualifier>"
and ensure unique qualifiers per-profile (e.g., include the index or
"index-<index>") so selectors reference the profile index; update the JSX for
the elements referenced by setExpandedProfiles, the enable switch handler, the
remove handler, and the add-profile handler accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cc84cbd1-fe8c-40c6-b08c-67bc4e9a993c

📥 Commits

Reviewing files that changed from the base of the PR and between be02248 and 941bd17.

📒 Files selected for processing (3)
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • ui/lib/types/schemas.ts

Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx
@sammaji sammaji force-pushed the 04-08-feat_multiple_otel_profiles branch from 941bd17 to 18d2ba5 Compare April 15, 2026 08:50
@sammaji sammaji requested a review from a team as a code owner April 15, 2026 08:50
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@sammaji sammaji mentioned this pull request Apr 15, 2026
18 tasks
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/otel/metrics.go (1)

139-187: ⚠️ Potential issue | 🟠 Major

Don't replace the global meter provider per OTEL profile.

With multi-profile OTEL enabled, every NewMetricsExporter call overwrites the process-global provider. Anything instrumented via otel.Meter(...) will export only through the last initialized profile instead of all configured destinations. Keep these providers local, or install one shared global provider with multiple readers once at bootstrap.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 139 - 187, NewMetricsExporter currently
calls otel.SetMeterProvider(provider) which overwrites the process-global meter
provider per profile; instead avoid replacing the global provider here — remove
the otel.SetMeterProvider(provider) call and keep the created provider local so
callers use provider.Meter(...) returned from NewMetricsExporter (or have
bootstrap install one shared global provider with multiple sdkmetric.Readers
once). Update NewMetricsExporter to return the local provider/meter (or an API
to retrieve its meter) and ensure instrumentation uses the returned meter rather
than otel.Meter(...), leaving global provider installation to a single bootstrap
path.
♻️ Duplicate comments (8)
core/schemas/utils.go (2)

881-885: ⚠️ Potential issue | 🟠 Major

Deep-copy keyOrder here too.

keyOrder still aliases the source slice, so mutating either tool can leak ordering changes into the other and defeats this helper's copy isolation.

Proposed fix
 			copyParams := &ToolFunctionParameters{
 				Type:                original.Function.Parameters.Type,
-				keyOrder:            original.Function.Parameters.keyOrder,
 				explicitEmptyObject: original.Function.Parameters.explicitEmptyObject,
 			}
+			if original.Function.Parameters.keyOrder != nil {
+				copyParams.keyOrder = make([]string, len(original.Function.Parameters.keyOrder))
+				copy(copyParams.keyOrder, original.Function.Parameters.keyOrder)
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 881 - 885, The helper is shallow-copying
the slice field keyOrder on original.Function.Parameters which causes aliasing;
modify the copy logic in the block that creates copyParams (type
ToolFunctionParameters) to deep-copy the keyOrder slice (allocate a new slice
and copy elements) and assign that new slice to copyParams.keyOrder so mutations
to the new ToolFunctionParameters do not affect
original.Function.Parameters.keyOrder.

429-432: ⚠️ Potential issue | 🟠 Major

Reject lossy float conversions in SafeExtractInt64.

These branches still truncate fractional values and accept NaN/Inf/out-of-range floats with implementation-defined results. For a helper named SafeExtractInt64, callers should get (0, false) instead of a silently corrupted integer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 429 - 432, The float32/float64 branches
in SafeExtractInt64 currently truncate and accept NaN/Inf; update them to reject
any lossy or invalid conversions by: for float32 convert to float64 then for
float64 verify !math.IsNaN(v) && !math.IsInf(v,0), check that v equals
math.Trunc(v) (no fractional part), and ensure v is within int64 range (>=
math.MinInt64 && <= math.MaxInt64); only then return int64(v), true, otherwise
return 0, false. Reference the SafeExtractInt64 function and the float32/float64
type switch branches when making the change.
plugins/otel/metrics.go (1)

452-552: ⚠️ Potential issue | 🟠 Major

Only the final attempt contributes metrics.

This path records request/error/token/cost data from just the latest LLMCall/Retry span. A request that retries or falls back multiple times is emitted as one upstream request with only the last attempt's usage, which hides retry load and no longer matches the metric names. Iterate every relevant span and use trace.RootSpan only when none exist.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 452 - 552, The current code only uses
the final LLM/Retry span (llmSpan) to emit metrics, which hides retry/fallback
load; change the logic to iterate trace.Spans and for each span with Kind ==
schemas.SpanKindLLMCall || Kind == schemas.SpanKindRetry (falling back to
trace.RootSpan only if no such spans exist) build the per-span attrs (use
getStringAttr/getIntAttr/getFloat64Attr and prefer "request.type" or span.Name
for method) and call the same metric emitters (m.RecordUpstreamRequest,
RecordUpstreamLatency, RecordErrorRequest/RecordSuccessRequest,
RecordInputTokens, RecordOutputTokens, RecordCost,
RecordStreamFirstTokenLatency) for each matching span so every attempt
contributes metrics instead of only the last one.
framework/configstore/migrations.go (1)

6488-6496: ⚠️ Potential issue | 🟠 Major

Recompute config_hash for the migrated OTEL plugin.

This migration rewrites the plugin config shape on Line 6488, but it never updates config_hash. That leaves the stored hash out of sync with the migrated payload, so the otel plugin can keep showing up as dirty after upgrade.

Suggested fix
 			plugin.Config = map[string]any{
 				"profiles": []any{cfgMap},
 			}
 			// Force BeforeSave to re-serialize + re-encrypt.
 			plugin.ConfigJSON = ""
 			plugin.EncryptionStatus = tables.EncryptionStatusPlainText
+			hash, err := GeneratePluginHash(plugin)
+			if err != nil {
+				return fmt.Errorf("failed to generate hash for plugin %s: %w", plugin.Name, err)
+			}
+			plugin.ConfigHash = hash
 
 			if err := tx.Save(&plugin).Error; err != nil {
 				return fmt.Errorf("failed to save migrated otel config: %w", err)
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 6488 - 6496, The migration
updates plugin.Config (setting "profiles": []{cfgMap}), clears plugin.ConfigJSON
and sets plugin.EncryptionStatus, but never recomputes or updates the
plugin.config_hash, leaving the stored hash inconsistent; after setting
plugin.Config and before tx.Save(&plugin) compute the new config hash (the same
algorithm used elsewhere to derive config_hash from the serialized/encrypted
config) and assign it to plugin.ConfigHash (or plugin.config_hash field) so the
database write persists the updated hash along with the new config and
encryption status.
ui/app/workspace/observability/fragments/otelFormFragment.tsx (3)

648-674: ⚠️ Potential issue | 🟡 Minor

Use a non-disabled wrapper as the tooltip trigger.

When save is disabled, the Button inside TooltipTrigger asChild won’t emit hover/focus events, so the explanation text is unreachable exactly when users need it. Wrap the button in a non-disabled element and use that wrapper as the trigger instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
648 - 674, The TooltipTrigger currently uses the Button asChild so when Button
is disabled it doesn't emit hover/focus events and the TooltipContent becomes
unreachable; update the fragment around Tooltip/TooltipTrigger so a non-disabled
wrapper element (e.g., a <div> or <span> wrapper component) is used as the
TooltipTrigger target while keeping the disabled Button inside that wrapper
(keep TooltipProvider, Tooltip, TooltipTrigger, Button, and TooltipContent names
unchanged), and ensure the wrapper forwards pointer events and accessibility
attributes so the tooltip shows when the user hovers/focuses the wrapper even if
Button.disabled is true.

219-279: ⚠️ Potential issue | 🟡 Minor

Add data-testids to the new profile controls.

The profile expand/collapse trigger, per-profile enable switch, remove action, and add-profile button still don’t expose stable selectors. Please add test IDs for these controls using the repo’s 3-part pattern.

As per coding guidelines, ui/**/*.{tsx,ts} must add new interactive UI elements with data-testid attributes following the pattern data-testid="<entity>-<element>-<qualifier>".

Also applies to: 592-603

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
219 - 279, The new profile controls (the expand/collapse trigger div using
setExpandedProfiles, the per-profile enable Switch rendered inside FormField for
name "otel_config.profiles.${index}.enabled", the profile remove Button that
calls remove(index) and updates setExpandedProfiles, and the add-profile Button
elsewhere) need stable test selectors: add data-testid attributes following the
repo pattern "data-testid=\"profile-<element>-<qualifier>\"" (e.g.
profile-toggle-<index>, profile-enable-<index>, profile-remove-<index>,
profile-add-<qualifier>) on the outer toggle div, the Switch, the remove Button,
and the add-profile Button; ensure you include the index or appropriate
qualifier to keep them unique and do not change existing onClick logic (only add
the attributes).

196-199: ⚠️ Potential issue | 🟠 Major

Await onSave from the submit handler.

This still fires onSave(data) and drops the promise. If the save rejects, the rejection is detached from handleSubmit and can surface as an unhandled promise rejection.

Suggested fix
-  const onSubmit = (data: OtelFormSchema) => {
-    setIsSaving(true);
-    onSave(data).finally(() => setIsSaving(false));
-  };
+  const onSubmit = async (data: OtelFormSchema) => {
+    setIsSaving(true);
+    try {
+      await onSave(data);
+    } finally {
+      setIsSaving(false);
+    }
+  };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
196 - 199, The submit handler drops the promise from onSave so rejections become
unhandled; change onSubmit to either async and await the call (e.g., make
onSubmit async, setIsSaving(true); try { await onSave(data); } finally {
setIsSaving(false); }) or return the promise (return onSave(data).finally(() =>
setIsSaving(false))); update the function referenced as onSubmit to use one of
these patterns so errors propagate to the caller instead of being detached.
transports/config.schema.json (1)

1519-1561: ⚠️ Potential issue | 🟠 Major

Make endpoint validation depend on protocol.

collector_url and metrics_endpoint still accept both URI and host:port shapes regardless of the selected protocol. That lets configs pass the authoritative schema and then fail later when plugins/otel/main.go selects either the HTTP or gRPC client path.

As per coding guidelines, transports/config.schema.json is "the authoritative definition for all config.json fields."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1519 - 1561, collector_url and
metrics_endpoint currently accept both URI and host:port shapes regardless of
protocol; update the JSON schema to make their validation conditional on the
protocol value by using JSON Schema conditional subschemas (if/then/else) tied
to the "protocol" property: for protocol == "http" require a "format": "uri" for
collector_url and metrics_endpoint, and for protocol == "grpc" require the
"pattern": "^[^:\\s]+:\\d+$" (or conversely enforce host:port only for grpc);
apply the same conditional logic to both "collector_url" and "metrics_endpoint"
entries while leaving other fields (e.g., "metrics_enabled", "trace_type")
unchanged so plugins/otel/main.go will receive only schema-valid shapes for each
protocol.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@transports/config.schema.json`:
- Around line 1546-1562: Add a JSON Schema conditional that requires
metrics_endpoint whenever metrics_enabled is true: in
transports/config.schema.json add an if/then rule where if
properties.metrics_enabled.const is true, then require metrics_endpoint (and
optionally validate non-empty/URI via the existing oneOf). This mirrors the
runtime check in plugins/otel/main.go Init and ensures configs with
metrics_enabled: true must include a valid metrics_endpoint before runtime.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 39-54: Update the OTEL trace_type union and UI options to match
the authoritative schema: change the trace_type type in OtelFormFragmentProps
(the union used in the interface) to only include "genai_extension", and remove
"vercel" and "open_inference" from the traceTypeOptions used by the form (also
update the options defined around the 96-115 area). Also update the
corresponding validation in ui/lib/types/schemas.ts (the trace_type validation
block) to only allow "genai_extension" so the form and schema stay in sync with
transports/config.schema.json.
- Around line 62-74: DEFAULT_PROFILE is seeded with enabled: false and empty
fields and get written into otel_config.profiles by makeDefaultValues and
append({ ...DEFAULT_PROFILE }), which creates invalid/incomplete profile
objects; change this so the UI does not persist a disabled/draft entry: update
DEFAULT_PROFILE to contain a valid, enabled profile (e.g., enabled: true and a
non-empty collector_url or sensible defaults conforming to
OtelProfileConfigSchema) or remove enabled from the seed entirely, and change
makeDefaultValues and the code path that calls append({ ...DEFAULT_PROFILE }) to
only append fully valid profiles (e.g., append only on save) so that
otel_config.profiles never contains incomplete disabled entries; locate and
modify DEFAULT_PROFILE, makeDefaultValues, and the append call sites that write
to otel_config.profiles accordingly.

---

Outside diff comments:
In `@plugins/otel/metrics.go`:
- Around line 139-187: NewMetricsExporter currently calls
otel.SetMeterProvider(provider) which overwrites the process-global meter
provider per profile; instead avoid replacing the global provider here — remove
the otel.SetMeterProvider(provider) call and keep the created provider local so
callers use provider.Meter(...) returned from NewMetricsExporter (or have
bootstrap install one shared global provider with multiple sdkmetric.Readers
once). Update NewMetricsExporter to return the local provider/meter (or an API
to retrieve its meter) and ensure instrumentation uses the returned meter rather
than otel.Meter(...), leaving global provider installation to a single bootstrap
path.

---

Duplicate comments:
In `@core/schemas/utils.go`:
- Around line 881-885: The helper is shallow-copying the slice field keyOrder on
original.Function.Parameters which causes aliasing; modify the copy logic in the
block that creates copyParams (type ToolFunctionParameters) to deep-copy the
keyOrder slice (allocate a new slice and copy elements) and assign that new
slice to copyParams.keyOrder so mutations to the new ToolFunctionParameters do
not affect original.Function.Parameters.keyOrder.
- Around line 429-432: The float32/float64 branches in SafeExtractInt64
currently truncate and accept NaN/Inf; update them to reject any lossy or
invalid conversions by: for float32 convert to float64 then for float64 verify
!math.IsNaN(v) && !math.IsInf(v,0), check that v equals math.Trunc(v) (no
fractional part), and ensure v is within int64 range (>= math.MinInt64 && <=
math.MaxInt64); only then return int64(v), true, otherwise return 0, false.
Reference the SafeExtractInt64 function and the float32/float64 type switch
branches when making the change.

In `@framework/configstore/migrations.go`:
- Around line 6488-6496: The migration updates plugin.Config (setting
"profiles": []{cfgMap}), clears plugin.ConfigJSON and sets
plugin.EncryptionStatus, but never recomputes or updates the plugin.config_hash,
leaving the stored hash inconsistent; after setting plugin.Config and before
tx.Save(&plugin) compute the new config hash (the same algorithm used elsewhere
to derive config_hash from the serialized/encrypted config) and assign it to
plugin.ConfigHash (or plugin.config_hash field) so the database write persists
the updated hash along with the new config and encryption status.

In `@plugins/otel/metrics.go`:
- Around line 452-552: The current code only uses the final LLM/Retry span
(llmSpan) to emit metrics, which hides retry/fallback load; change the logic to
iterate trace.Spans and for each span with Kind == schemas.SpanKindLLMCall ||
Kind == schemas.SpanKindRetry (falling back to trace.RootSpan only if no such
spans exist) build the per-span attrs (use
getStringAttr/getIntAttr/getFloat64Attr and prefer "request.type" or span.Name
for method) and call the same metric emitters (m.RecordUpstreamRequest,
RecordUpstreamLatency, RecordErrorRequest/RecordSuccessRequest,
RecordInputTokens, RecordOutputTokens, RecordCost,
RecordStreamFirstTokenLatency) for each matching span so every attempt
contributes metrics instead of only the last one.

In `@transports/config.schema.json`:
- Around line 1519-1561: collector_url and metrics_endpoint currently accept
both URI and host:port shapes regardless of protocol; update the JSON schema to
make their validation conditional on the protocol value by using JSON Schema
conditional subschemas (if/then/else) tied to the "protocol" property: for
protocol == "http" require a "format": "uri" for collector_url and
metrics_endpoint, and for protocol == "grpc" require the "pattern":
"^[^:\\s]+:\\d+$" (or conversely enforce host:port only for grpc); apply the
same conditional logic to both "collector_url" and "metrics_endpoint" entries
while leaving other fields (e.g., "metrics_enabled", "trace_type") unchanged so
plugins/otel/main.go will receive only schema-valid shapes for each protocol.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 648-674: The TooltipTrigger currently uses the Button asChild so
when Button is disabled it doesn't emit hover/focus events and the
TooltipContent becomes unreachable; update the fragment around
Tooltip/TooltipTrigger so a non-disabled wrapper element (e.g., a <div> or
<span> wrapper component) is used as the TooltipTrigger target while keeping the
disabled Button inside that wrapper (keep TooltipProvider, Tooltip,
TooltipTrigger, Button, and TooltipContent names unchanged), and ensure the
wrapper forwards pointer events and accessibility attributes so the tooltip
shows when the user hovers/focuses the wrapper even if Button.disabled is true.
- Around line 219-279: The new profile controls (the expand/collapse trigger div
using setExpandedProfiles, the per-profile enable Switch rendered inside
FormField for name "otel_config.profiles.${index}.enabled", the profile remove
Button that calls remove(index) and updates setExpandedProfiles, and the
add-profile Button elsewhere) need stable test selectors: add data-testid
attributes following the repo pattern
"data-testid=\"profile-<element>-<qualifier>\"" (e.g. profile-toggle-<index>,
profile-enable-<index>, profile-remove-<index>, profile-add-<qualifier>) on the
outer toggle div, the Switch, the remove Button, and the add-profile Button;
ensure you include the index or appropriate qualifier to keep them unique and do
not change existing onClick logic (only add the attributes).
- Around line 196-199: The submit handler drops the promise from onSave so
rejections become unhandled; change onSubmit to either async and await the call
(e.g., make onSubmit async, setIsSaving(true); try { await onSave(data); }
finally { setIsSaving(false); }) or return the promise (return
onSave(data).finally(() => setIsSaving(false))); update the function referenced
as onSubmit to use one of these patterns so errors propagate to the caller
instead of being detached.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ec6e124-59ba-42d6-a339-49c10e5ab80a

📥 Commits

Reviewing files that changed from the base of the PR and between 941bd17 and 18d2ba5.

📒 Files selected for processing (16)
  • .github/workflows/helm-release.yml
  • .github/workflows/scripts/validate-schema-sync.sh
  • core/schemas/utils.go
  • framework/configstore/migrations.go
  • plugins/otel/changelog.md
  • plugins/otel/converter.go
  • plugins/otel/go.mod
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (6)
  • plugins/otel/changelog.md
  • .github/workflows/scripts/validate-schema-sync.sh
  • .github/workflows/helm-release.yml
  • plugins/otel/go.mod
  • transports/bifrost-http/server/plugins.go
  • plugins/otel/utils.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • plugins/otel/grpc.go
  • ui/lib/types/schemas.ts

Comment thread transports/config.schema.json
Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx
Comment thread ui/app/workspace/observability/fragments/otelFormFragment.tsx
@sammaji sammaji force-pushed the 04-08-feat_multiple_otel_profiles branch from 18d2ba5 to 03a1789 Compare April 17, 2026 02:09
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
framework/configstore/migrations.go (1)

6535-6543: ⚠️ Potential issue | 🟠 Major

Recompute ConfigHash before saving the migrated OTEL plugin.

This rewrites plugin.Config but leaves config_hash untouched, so the migrated row can look perpetually dirty to config reconciliation.

Suggested fix
 			plugin.Config = map[string]any{
 				"profiles": []any{cfgMap},
 			}
 			// Force BeforeSave to re-serialize + re-encrypt.
 			plugin.ConfigJSON = ""
 			plugin.EncryptionStatus = tables.EncryptionStatusPlainText
+
+			hash, err := GeneratePluginHash(plugin)
+			if err != nil {
+				return fmt.Errorf("failed to generate hash for plugin %s: %w", plugin.Name, err)
+			}
+			plugin.ConfigHash = hash
 
 			if err := tx.Save(&plugin).Error; err != nil {
 				return fmt.Errorf("failed to save migrated otel config: %w", err)
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 6535 - 6543, After updating
plugin.Config, ensure you recompute and set plugin.ConfigHash (using the
project's existing config-hash helper or the same algorithm used elsewhere)
before calling tx.Save(&plugin); in other words, after setting plugin.Config,
ConfigJSON = "" and EncryptionStatus = tables.EncryptionStatusPlainText, invoke
the existing recompute routine or compute the hash and assign it to
plugin.ConfigHash so the saved row reflects the new config and won't appear
dirty to reconciliation.
transports/config.schema.json (2)

1443-1459: ⚠️ Potential issue | 🟠 Major

metrics_endpoint should be required when metrics_enabled is true.

On Line 1443 and Line 1448, the schema allows metrics_enabled: true without metrics_endpoint, which pushes a config error to runtime instead of catching it at validation time.

Suggested schema patch
-                        "allOf": [
+                        "allOf": [
+                          {
+                            "if": {
+                              "properties": { "metrics_enabled": { "const": true } },
+                              "required": ["metrics_enabled"]
+                            },
+                            "then": {
+                              "required": ["metrics_endpoint"]
+                            }
+                          },
                           {
                             "if": {
                               "properties": { "protocol": { "const": "http" } },
                               "required": ["protocol"]
                             },

As per coding guidelines, "transports/config.schema.json: Update transports/config.schema.json when adding config fields; it is the authoritative definition for all config.json fields."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1443 - 1459, Add a JSON Schema
conditional so that when the boolean property metrics_enabled is true the
metrics_endpoint string becomes required; locate the metrics_enabled and
metrics_endpoint definitions in the schema and add an "if":
{"properties":{"metrics_enabled":{"const":true}}} with a matching "then":
{"required":["metrics_endpoint"]} (leave the existing oneOf/format/pattern for
metrics_endpoint intact) to ensure validation fails at schema-time instead of
runtime.

1416-1426: ⚠️ Potential issue | 🟠 Major

Protocol-specific endpoint constraints are still missing.

collector_url and metrics_endpoint currently accept both URI and host:port on any profile. At runtime, HTTP and gRPC expect different shapes, so configs can validate here but fail later.

Suggested schema patch
                         "properties": {
@@
                           "collector_url": {
                             "type": "string",
                             "description": "URL of the OpenTelemetry collector",
                             "oneOf": [
                               {
                                 "format": "uri"
                               },
                               {
                                 "pattern": "^[^:\\s]+:\\d+$"
                               }
                             ]
                           },
@@
                           "metrics_endpoint": {
                             "type": "string",
                             "description": "OTLP metrics endpoint URL (e.g., http://otel-collector:4318/v1/metrics for HTTP or otel-collector:4317 for gRPC)",
                             "oneOf": [
                               {
                                 "format": "uri"
                               },
                               {
                                 "pattern": "^[^:\\s]+:\\d+$"
                               }
                             ]
                           },
@@
-                        "required": [
+                        "allOf": [
+                          {
+                            "if": {
+                              "properties": { "protocol": { "const": "http" } },
+                              "required": ["protocol"]
+                            },
+                            "then": {
+                              "properties": {
+                                "collector_url": { "pattern": "^https?://" },
+                                "metrics_endpoint": { "pattern": "^https?://" }
+                              }
+                            }
+                          },
+                          {
+                            "if": {
+                              "properties": { "protocol": { "const": "grpc" } },
+                              "required": ["protocol"]
+                            },
+                            "then": {
+                              "properties": {
+                                "collector_url": { "pattern": "^[^:\\s]+:\\d+$" },
+                                "metrics_endpoint": { "pattern": "^[^:\\s]+:\\d+$" }
+                              }
+                            }
+                          }
+                        ],
+                        "required": [
                           "collector_url",
                           "trace_type",
                           "protocol"
                         ],

As per coding guidelines, "transports/config.schema.json: Update transports/config.schema.json when adding config fields; it is the authoritative definition for all config.json fields."

Also applies to: 1448-1458

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1416 - 1426, The schema currently
lets collector_url and metrics_endpoint accept either a URI or host:port for all
profiles; update transports/config.schema.json to add protocol-aware validation
using conditional subschemas (if/then/else) tied to the protocol field so HTTP
profiles require an http(s) URI and gRPC profiles require either a grpc(s) URI
or host:port as appropriate; specifically modify the definitions for
"collector_url" and "metrics_endpoint" (the symbols to change) to use JSON
Schema if/then clauses checking "protocol": "http" vs "grpc" and enforce the
correct "format": "uri" or the "pattern": "^[^:\\s]+:\\d+$" accordingly, and
mirror the same change for the other occurrence of these fields around the later
block (lines with the second collector_url/metrics_endpoint definition).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@framework/configstore/migrations.go`:
- Around line 6535-6543: After updating plugin.Config, ensure you recompute and
set plugin.ConfigHash (using the project's existing config-hash helper or the
same algorithm used elsewhere) before calling tx.Save(&plugin); in other words,
after setting plugin.Config, ConfigJSON = "" and EncryptionStatus =
tables.EncryptionStatusPlainText, invoke the existing recompute routine or
compute the hash and assign it to plugin.ConfigHash so the saved row reflects
the new config and won't appear dirty to reconciliation.

In `@transports/config.schema.json`:
- Around line 1443-1459: Add a JSON Schema conditional so that when the boolean
property metrics_enabled is true the metrics_endpoint string becomes required;
locate the metrics_enabled and metrics_endpoint definitions in the schema and
add an "if": {"properties":{"metrics_enabled":{"const":true}}} with a matching
"then": {"required":["metrics_endpoint"]} (leave the existing
oneOf/format/pattern for metrics_endpoint intact) to ensure validation fails at
schema-time instead of runtime.
- Around line 1416-1426: The schema currently lets collector_url and
metrics_endpoint accept either a URI or host:port for all profiles; update
transports/config.schema.json to add protocol-aware validation using conditional
subschemas (if/then/else) tied to the protocol field so HTTP profiles require an
http(s) URI and gRPC profiles require either a grpc(s) URI or host:port as
appropriate; specifically modify the definitions for "collector_url" and
"metrics_endpoint" (the symbols to change) to use JSON Schema if/then clauses
checking "protocol": "http" vs "grpc" and enforce the correct "format": "uri" or
the "pattern": "^[^:\\s]+:\\d+$" accordingly, and mirror the same change for the
other occurrence of these fields around the later block (lines with the second
collector_url/metrics_endpoint definition).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 19592709-b2d3-4cac-afd5-24171ffd7cea

📥 Commits

Reviewing files that changed from the base of the PR and between 18d2ba5 and 03a1789.

📒 Files selected for processing (16)
  • .github/workflows/helm-release.yml
  • .github/workflows/scripts/validate-schema-sync.sh
  • core/schemas/utils.go
  • framework/configstore/migrations.go
  • plugins/otel/changelog.md
  • plugins/otel/converter.go
  • plugins/otel/go.mod
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (6)
  • plugins/otel/changelog.md
  • .github/workflows/scripts/validate-schema-sync.sh
  • transports/bifrost-http/server/plugins.go
  • plugins/otel/go.mod
  • .github/workflows/helm-release.yml
  • plugins/otel/utils.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • plugins/otel/http.go
  • plugins/otel/grpc.go
  • core/schemas/utils.go
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx

@sammaji sammaji force-pushed the 04-08-feat_multiple_otel_profiles branch from 03a1789 to 7b3828d Compare April 17, 2026 02:35
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plugins/otel/metrics.go (1)

171-186: ⚠️ Potential issue | 🟡 Minor

Remove the unnecessary global meter provider call in per-profile exporter initialization.

otel.SetMeterProvider(provider) is called once per enabled profile in NewMetricsExporter(), which is invoked in a loop from Init(). However, the codebase never uses the global otel.Meter() API—each profile's exporter uses its own locally scoped provider.Meter() instance. The global mutation is a no-op with no functional impact but should be removed to avoid unnecessary side effects in multi-profile initialization.

Suggested fix
-	// Set as global provider
-	otel.SetMeterProvider(provider)
-
 	// Create meter
 	meter := provider.Meter("bifrost",
 		metric.WithInstrumentationVersion(version),
 	)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/otel/metrics.go` around lines 171 - 186, Remove the global mutation
otel.SetMeterProvider(provider) from NewMetricsExporter: the exporter already
uses the local provider.Meter(...) and Init() calls NewMetricsExporter() in a
loop per profile, so calling otel.SetMeterProvider(provider) is unnecessary and
may cause unwanted side-effects; simply delete the
otel.SetMeterProvider(provider) line in the NewMetricsExporter function and keep
the locally scoped provider and meter usage as-is (verify there are no remaining
references to the global otel.Meter() API before committing).
♻️ Duplicate comments (3)
ui/app/workspace/observability/fragments/otelFormFragment.tsx (3)

196-199: ⚠️ Potential issue | 🟠 Major

Await onSave inside onSubmit.

onSave(data).finally(...) still clears isSaving, but it leaves a rejected save promise unobserved. That can show up as an unhandled rejection even though the parent already handles the user-facing error path.

Suggested fix
-  const onSubmit = (data: OtelFormSchema) => {
-    setIsSaving(true);
-    onSave(data).finally(() => setIsSaving(false));
-  };
+  const onSubmit = async (data: OtelFormSchema) => {
+    setIsSaving(true);
+    try {
+      await onSave(data);
+    } catch {
+      // onSave already reports the failure.
+    } finally {
+      setIsSaving(false);
+    }
+  };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
196 - 199, The onSubmit function currently calls onSave(data).finally(...) which
can leave a rejected promise unobserved; update onSubmit to await the async
onSave call and handle errors so rejections are observed — e.g., make onSubmit
async, setIsSaving(true) before awaiting await onSave(data) inside try/catch,
rethrow or handle the error as appropriate, and ensure setIsSaving(false) runs
in a finally block; locate and change the onSubmit definition that references
OtelFormSchema, setIsSaving and onSave.

648-674: ⚠️ Potential issue | 🟡 Minor

Use a non-disabled wrapper as the tooltip trigger.

When save is disabled, the <Button> stops emitting hover/focus events, so the tooltip explaining why it is disabled becomes unreachable.

Suggested fix
             <TooltipProvider>
               <Tooltip>
                 <TooltipTrigger asChild>
-                  <Button
-                    type="submit"
-                    disabled={
-                      !hasOtelAccess ||
-                      !form.formState.isDirty ||
-                      !form.formState.isValid
-                    }
-                    isLoading={isSaving}
-                  >
-                    Save OTEL Configuration
-                  </Button>
+                  <span className="inline-flex">
+                    <Button
+                      type="submit"
+                      disabled={
+                        !hasOtelAccess ||
+                        !form.formState.isDirty ||
+                        !form.formState.isValid
+                      }
+                      isLoading={isSaving}
+                    >
+                      Save OTEL Configuration
+                    </Button>
+                  </span>
                 </TooltipTrigger>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
648 - 674, The tooltip is unreachable when the Button is disabled because a
disabled native button stops emitting hover/focus events; update the
TooltipTrigger to wrap a non-disabled element (e.g., a <span> or similar) and
render the disabled Button as its child so the trigger can still receive
hover/focus while the Button (controlled by hasOtelAccess /
form.formState.isDirty / form.formState.isValid and isSaving) remains
visually/semantically disabled. Locate the Tooltip/TooltipTrigger/TooltipContent
block around Save OTEL Configuration and change the trigger to use a
non-disabled wrapper element that contains the Button, keeping the same props
(disabled/isLoading) on the Button and preserving the conditional TooltipContent
logic.

219-269: ⚠️ Potential issue | 🟡 Minor

Add stable data-testids to the new profile controls.

The profile expand/collapse row, per-profile enable switch, remove action, and add-profile button are new interactive elements, but they still don't expose stable selectors.

Suggested fix
                 <div
                   className="flex w-full items-center justify-between px-4 py-3 text-left"
+                  data-testid={`otel-profile-toggle-${index}`}
                   onClick={() =>
                     setExpandedProfiles((prev) => ({
                       ...prev,
@@
                         <Switch
                           checked={enabledField.value}
                           onCheckedChange={enabledField.onChange}
                           onClick={(e) => e.stopPropagation()}
                           disabled={!hasOtelAccess}
                           aria-label="Enable profile"
+                          data-testid={`otel-profile-enable-${index}`}
                         />
@@
                       <Button
                         type="button"
                         variant="ghost"
                         size="icon"
                         className="size-7"
+                        data-testid={`otel-profile-remove-${index}`}
                         onClick={(e) => {
@@
         <Button
           type="button"
           variant="outline"
           size="sm"
           onClick={() => append({ ...DEFAULT_PROFILE })}
           disabled={!hasOtelAccess}
           className="flex items-center gap-2"
+          data-testid="otel-profile-add-btn"
         >

As per coding guidelines, ui/**/*.{tsx,ts}: "Add new interactive UI elements with data-testid attributes following the pattern: data-testid=\"<entity>-<element>-<qualifier>\"."

Also applies to: 593-603

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
219 - 269, Add stable data-testid attributes for the new per-profile interactive
controls using the pattern data-testid="<entity>-<element>-<qualifier>": add a
data-testid to the profile header div (the expand/collapse row that calls
setExpandedProfiles) like "profile-row-<index>-toggle", add to the per-profile
enable Switch rendered by FormField (name
"otel_config.profiles.${index}.enabled") like "profile-switch-<index>-enabled",
add to the remove Button (the Button that calls remove(index) and adjusts
setExpandedProfiles) like "profile-button-<index>-remove", and add the missing
data-testid to the add-profile button referenced elsewhere (the add-profile
handler around lines 593-603) like "profile-button-add"; ensure the index is
used as the qualifier so selectors remain stable per-profile.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@plugins/otel/main.go`:
- Around line 225-233: The parser in loadEnvAttributes silently drops attributes
whose values contain '=' because it uses strings.Split(attribute, "=") and
expects len(parts)==2; change the split to only on the first '=' (e.g., use
strings.SplitN(attribute, "=", 2) or find the first '=' and slice) so the
right-hand side keeps any '=' characters, then trim and pass the key/value into
kvStr; ensure OTELResponseAttributesEnvKey handling and the existing TrimSpace
calls remain.

---

Outside diff comments:
In `@plugins/otel/metrics.go`:
- Around line 171-186: Remove the global mutation
otel.SetMeterProvider(provider) from NewMetricsExporter: the exporter already
uses the local provider.Meter(...) and Init() calls NewMetricsExporter() in a
loop per profile, so calling otel.SetMeterProvider(provider) is unnecessary and
may cause unwanted side-effects; simply delete the
otel.SetMeterProvider(provider) line in the NewMetricsExporter function and keep
the locally scoped provider and meter usage as-is (verify there are no remaining
references to the global otel.Meter() API before committing).

---

Duplicate comments:
In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 196-199: The onSubmit function currently calls
onSave(data).finally(...) which can leave a rejected promise unobserved; update
onSubmit to await the async onSave call and handle errors so rejections are
observed — e.g., make onSubmit async, setIsSaving(true) before awaiting await
onSave(data) inside try/catch, rethrow or handle the error as appropriate, and
ensure setIsSaving(false) runs in a finally block; locate and change the
onSubmit definition that references OtelFormSchema, setIsSaving and onSave.
- Around line 648-674: The tooltip is unreachable when the Button is disabled
because a disabled native button stops emitting hover/focus events; update the
TooltipTrigger to wrap a non-disabled element (e.g., a <span> or similar) and
render the disabled Button as its child so the trigger can still receive
hover/focus while the Button (controlled by hasOtelAccess /
form.formState.isDirty / form.formState.isValid and isSaving) remains
visually/semantically disabled. Locate the Tooltip/TooltipTrigger/TooltipContent
block around Save OTEL Configuration and change the trigger to use a
non-disabled wrapper element that contains the Button, keeping the same props
(disabled/isLoading) on the Button and preserving the conditional TooltipContent
logic.
- Around line 219-269: Add stable data-testid attributes for the new per-profile
interactive controls using the pattern
data-testid="<entity>-<element>-<qualifier>": add a data-testid to the profile
header div (the expand/collapse row that calls setExpandedProfiles) like
"profile-row-<index>-toggle", add to the per-profile enable Switch rendered by
FormField (name "otel_config.profiles.${index}.enabled") like
"profile-switch-<index>-enabled", add to the remove Button (the Button that
calls remove(index) and adjusts setExpandedProfiles) like
"profile-button-<index>-remove", and add the missing data-testid to the
add-profile button referenced elsewhere (the add-profile handler around lines
593-603) like "profile-button-add"; ensure the index is used as the qualifier so
selectors remain stable per-profile.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a9ce0791-e453-43c2-b631-806ac5368cef

📥 Commits

Reviewing files that changed from the base of the PR and between 03a1789 and 7b3828d.

📒 Files selected for processing (6)
  • .github/workflows/helm-release.yml
  • .github/workflows/scripts/validate-schema-sync.sh
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (2)
  • .github/workflows/scripts/validate-schema-sync.sh
  • .github/workflows/helm-release.yml

Comment thread plugins/otel/main.go
coderabbitai[bot]
coderabbitai Bot previously approved these changes Apr 17, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (8)
core/schemas/utils.go (2)

883-885: ⚠️ Potential issue | 🟠 Major

keyOrder is shallow-copied and still aliases source state.

Assigning the slice directly shares the same backing array, so this deep copy is not isolated.

Proposed fix
 			copyParams := &ToolFunctionParameters{
 				Type:                original.Function.Parameters.Type,
-				keyOrder:            original.Function.Parameters.keyOrder,
 				explicitEmptyObject: original.Function.Parameters.explicitEmptyObject,
 			}
+			if original.Function.Parameters.keyOrder != nil {
+				copyParams.keyOrder = make([]string, len(original.Function.Parameters.keyOrder))
+				copy(copyParams.keyOrder, original.Function.Parameters.keyOrder)
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 883 - 885, The code assigns
original.Function.Parameters.keyOrder directly which aliases the source slice;
instead, allocate a new slice and copy elements so the new object's keyOrder is
independent. Locate the struct construction where keyOrder is set from
original.Function.Parameters.keyOrder and replace the direct assignment with a
deep copy (e.g., create a new slice with the same length and copy each element;
if elements are pointers or composite types, copy their contents as well) so
mutations to the new object's keyOrder won't affect
original.Function.Parameters.keyOrder.

403-446: ⚠️ Potential issue | 🟠 Major

SafeExtractInt64 is not actually safe for unsigned overflow and float edge cases.

uint/uint64 values above math.MaxInt64 can be misrepresented after cast, and float branches currently accept truncation plus NaN/Inf conversions. This can silently corrupt exported attribute values.

Proposed fix
 import (
 	"encoding/json"
 	"fmt"
+	"math"
 	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
 	"strings"
@@
 	case uint:
-		return int64(v), true
+		if uint64(v) > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
@@
 	case uint64:
-		return int64(v), true
+		if v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
 	case float32:
-		return int64(v), true
+		f := float64(v)
+		if math.IsNaN(f) || math.IsInf(f, 0) || math.Trunc(f) != f || f < math.MinInt64 || f > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(f), true
 	case float64:
-		return int64(v), true
+		if math.IsNaN(v) || math.IsInf(v, 0) || math.Trunc(v) != v || v < math.MinInt64 || v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
In Go (current spec), what is the defined behavior when converting:
1) uint64 values greater than math.MaxInt64 to int64
2) float64 NaN/Inf to int64
3) non-integer float64 (e.g., 1.9) to int64
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 403 - 446, SafeExtractInt64 currently
unsafely casts unsigned and float types; update SafeExtractInt64 to validate
ranges and non-fractional floats before returning an int64: for
uint/uint8/uint16/uint32/uint64 (and plain uint) check value <= math.MaxInt64
and reject otherwise; for float32/float64 use math.IsNaN and math.IsInf to
reject NaN/Inf, ensure the value is within [math.MinInt64, math.MaxInt64] and
has no fractional part (e.g., value == math.Trunc(value)) before converting;
apply the same float validation for json.Number parsed as float64 and for any
branch that truncates floats; return (0,false) when checks fail.Reference the
SafeExtractInt64 function and modify its uint*, float32/float64, and json.Number
branches accordingly.
framework/configstore/migrations.go (1)

6577-6584: ⚠️ Potential issue | 🟠 Major

Recompute ConfigHash before saving the migrated plugin.

Line 6583 persists a different OTEL config shape, but the migration never refreshes plugin.ConfigHash. That leaves the stored hash out of sync with the rewritten config and can make the plugin look dirty immediately after upgrade.

Suggested fix
 			plugin.Config = map[string]any{
 				"profiles": []any{cfgMap},
 			}
 			// Force BeforeSave to re-serialize + re-encrypt.
 			plugin.ConfigJSON = ""
 			plugin.EncryptionStatus = tables.EncryptionStatusPlainText
+			hash, err := GeneratePluginHash(plugin)
+			if err != nil {
+				return fmt.Errorf("failed to generate hash for plugin %s: %w", plugin.Name, err)
+			}
+			plugin.ConfigHash = hash
 			if err := tx.Save(&plugin).Error; err != nil {
 				return fmt.Errorf("failed to save migrated otel config: %w", err)
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 6577 - 6584, The migrated
plugin's stored config is changed (plugin.Config, plugin.ConfigJSON cleared and
plugin.EncryptionStatus set) but plugin.ConfigHash is not refreshed, leaving the
DB hash out-of-sync; before calling tx.Save(&plugin) recompute and assign
plugin.ConfigHash using the existing config-hash routine (e.g., call the
module/helper that serializes+hashes plugin.Config or invoke the entity's
BeforeSave-equivalent that updates the hash) so plugin.ConfigHash reflects the
new serialized/encrypted shape prior to saving.
transports/config.schema.json (2)

1443-1459: ⚠️ Potential issue | 🟠 Major

Require metrics_endpoint when metrics_enabled is true.

plugins/otel/main.go rejects enabled metrics without an endpoint, but the schema still allows that combination. Please encode the same rule here with an if/then on metrics_enabled so invalid configs are caught before init time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1443 - 1459, Add a conditional
rule to the JSON schema so that when metrics_enabled is true the
metrics_endpoint property is required: update the schema to include an "if" that
checks properties.metrics_enabled.const true and a "then" that adds "required":
["metrics_endpoint"] (and preserve existing oneOf/format/pattern for
metrics_endpoint). This enforces the same validation as plugins/otel/main.go and
prevents enabling metrics without an endpoint at schema-validation time.

1416-1459: ⚠️ Potential issue | 🟠 Major

Make OTEL endpoint validation conditional on protocol.

collector_url and metrics_endpoint currently accept both URI and host:port shapes for both protocols, so configs like collector_url: "otel:4317" with protocol: "http" still pass schema and then fail at runtime. Add if/then branches so HTTP requires http(s)://... and gRPC requires host:port.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@transports/config.schema.json` around lines 1416 - 1459, The schema currently
allows both URI and host:port for collector_url and metrics_endpoint regardless
of protocol; update the JSON schema to add conditional if/then blocks keyed on
the protocol property so that when "protocol" is "http" the collector_url and
metrics_endpoint subschemas require a URI format (http/https) and when
"protocol" is "grpc" they require the host:port pattern; modify the objects that
define "collector_url" and "metrics_endpoint" to remove the broad oneOf and
instead reference or inline these conditional validations so
runtime-incompatible values (e.g., "otel:4317" with protocol "http") will be
rejected by the schema validator.
ui/app/workspace/observability/fragments/otelFormFragment.tsx (3)

196-199: ⚠️ Potential issue | 🟠 Major

Await the save promise in onSubmit.

onSave(data).finally(...) detaches the rejection from handleSubmit, so failed saves can still surface as unhandled promise rejections. Make onSubmit async and await the save inside try/finally so setIsSaving(false) still always runs.

Suggested fix
-  const onSubmit = (data: OtelFormSchema) => {
-    setIsSaving(true);
-    onSave(data).finally(() => setIsSaving(false));
-  };
+  const onSubmit = async (data: OtelFormSchema) => {
+    setIsSaving(true);
+    try {
+      await onSave(data);
+    } finally {
+      setIsSaving(false);
+    }
+  };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
196 - 199, The onSubmit handler currently calls onSave(data).finally(...) which
detaches promise rejection from handleSubmit; change the onSubmit function to be
async, set setIsSaving(true) before awaits, then await onSave(data) inside a
try/finally block and call setIsSaving(false) in the finally so rejections
propagate to handleSubmit while ensuring isSaving is always cleared; update the
onSubmit definition that references OtelFormSchema, setIsSaving, and onSave to
this async/try-finally pattern.

219-273: ⚠️ Potential issue | 🟡 Minor

Add stable data-testids to the new profile controls.

The profile toggle row, enable switch, remove action, and add-profile button are new interactive elements but still don't expose stable selectors. That makes the feature harder to cover in E2E/tests than the neighboring OTEL controls that already have explicit test ids.

As per coding guidelines, "Add data-testid attributes to all interactive UI elements following the pattern: data-testid=\"<entity>-<element>-<qualifier>\"."

Also applies to: 592-603

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
219 - 273, The interactive profile controls lack stable test ids; add
data-testid attributes following the pattern "<entity>-<element>-<qualifier>" to
the profile row, enable switch, remove button, and the add-profile button:
update the div that toggles expansion (where setExpandedProfiles is called) to
include data-testid="profile-row-<index>" (or similar), the Switch rendered
inside FormField (name `otel_config.profiles.${index}.enabled`) to include
data-testid="profile-switch-<index>", the remove Button (which calls
remove(index) and uses Trash2) to include data-testid="profile-remove-<index>",
and the add-profile control referenced around lines 592-603 to include
data-testid="profile-add". Ensure the test ids use the index/qualifier
consistently and are added without altering click handlers or accessibility
attributes.

648-674: ⚠️ Potential issue | 🟡 Minor

The disabled save button cannot trigger this tooltip.

When the button is disabled, the current TooltipTrigger asChild never receives hover/focus events, so the “why can’t I save?” message is unreachable. Wrap the button in a non-disabled trigger element and keep the button disabled inside it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx` around lines
648 - 674, The Tooltip can't appear because TooltipTrigger asChild forwards
events to the disabled Button; instead wrap the Button in a focusable
non-disabled wrapper (e.g., a <span> or <div> with tabIndex={0}) and make that
wrapper the TooltipTrigger child while keeping Button disabled. Update the
fragment where TooltipTrigger, Button, TooltipContent and the form.formState
checks are used so TooltipTrigger wraps the wrapper element (not the disabled
Button), preserve the disabled/isLoading props on Button and keep the
conditional rendering of TooltipContent unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@core/schemas/utils.go`:
- Around line 883-885: The code assigns original.Function.Parameters.keyOrder
directly which aliases the source slice; instead, allocate a new slice and copy
elements so the new object's keyOrder is independent. Locate the struct
construction where keyOrder is set from original.Function.Parameters.keyOrder
and replace the direct assignment with a deep copy (e.g., create a new slice
with the same length and copy each element; if elements are pointers or
composite types, copy their contents as well) so mutations to the new object's
keyOrder won't affect original.Function.Parameters.keyOrder.
- Around line 403-446: SafeExtractInt64 currently unsafely casts unsigned and
float types; update SafeExtractInt64 to validate ranges and non-fractional
floats before returning an int64: for uint/uint8/uint16/uint32/uint64 (and plain
uint) check value <= math.MaxInt64 and reject otherwise; for float32/float64 use
math.IsNaN and math.IsInf to reject NaN/Inf, ensure the value is within
[math.MinInt64, math.MaxInt64] and has no fractional part (e.g., value ==
math.Trunc(value)) before converting; apply the same float validation for
json.Number parsed as float64 and for any branch that truncates floats; return
(0,false) when checks fail.Reference the SafeExtractInt64 function and modify
its uint*, float32/float64, and json.Number branches accordingly.

In `@framework/configstore/migrations.go`:
- Around line 6577-6584: The migrated plugin's stored config is changed
(plugin.Config, plugin.ConfigJSON cleared and plugin.EncryptionStatus set) but
plugin.ConfigHash is not refreshed, leaving the DB hash out-of-sync; before
calling tx.Save(&plugin) recompute and assign plugin.ConfigHash using the
existing config-hash routine (e.g., call the module/helper that
serializes+hashes plugin.Config or invoke the entity's BeforeSave-equivalent
that updates the hash) so plugin.ConfigHash reflects the new
serialized/encrypted shape prior to saving.

In `@transports/config.schema.json`:
- Around line 1443-1459: Add a conditional rule to the JSON schema so that when
metrics_enabled is true the metrics_endpoint property is required: update the
schema to include an "if" that checks properties.metrics_enabled.const true and
a "then" that adds "required": ["metrics_endpoint"] (and preserve existing
oneOf/format/pattern for metrics_endpoint). This enforces the same validation as
plugins/otel/main.go and prevents enabling metrics without an endpoint at
schema-validation time.
- Around line 1416-1459: The schema currently allows both URI and host:port for
collector_url and metrics_endpoint regardless of protocol; update the JSON
schema to add conditional if/then blocks keyed on the protocol property so that
when "protocol" is "http" the collector_url and metrics_endpoint subschemas
require a URI format (http/https) and when "protocol" is "grpc" they require the
host:port pattern; modify the objects that define "collector_url" and
"metrics_endpoint" to remove the broad oneOf and instead reference or inline
these conditional validations so runtime-incompatible values (e.g., "otel:4317"
with protocol "http") will be rejected by the schema validator.

In `@ui/app/workspace/observability/fragments/otelFormFragment.tsx`:
- Around line 196-199: The onSubmit handler currently calls
onSave(data).finally(...) which detaches promise rejection from handleSubmit;
change the onSubmit function to be async, set setIsSaving(true) before awaits,
then await onSave(data) inside a try/finally block and call setIsSaving(false)
in the finally so rejections propagate to handleSubmit while ensuring isSaving
is always cleared; update the onSubmit definition that references
OtelFormSchema, setIsSaving, and onSave to this async/try-finally pattern.
- Around line 219-273: The interactive profile controls lack stable test ids;
add data-testid attributes following the pattern
"<entity>-<element>-<qualifier>" to the profile row, enable switch, remove
button, and the add-profile button: update the div that toggles expansion (where
setExpandedProfiles is called) to include data-testid="profile-row-<index>" (or
similar), the Switch rendered inside FormField (name
`otel_config.profiles.${index}.enabled`) to include
data-testid="profile-switch-<index>", the remove Button (which calls
remove(index) and uses Trash2) to include data-testid="profile-remove-<index>",
and the add-profile control referenced around lines 592-603 to include
data-testid="profile-add". Ensure the test ids use the index/qualifier
consistently and are added without altering click handlers or accessibility
attributes.
- Around line 648-674: The Tooltip can't appear because TooltipTrigger asChild
forwards events to the disabled Button; instead wrap the Button in a focusable
non-disabled wrapper (e.g., a <span> or <div> with tabIndex={0}) and make that
wrapper the TooltipTrigger child while keeping Button disabled. Update the
fragment where TooltipTrigger, Button, TooltipContent and the form.formState
checks are used so TooltipTrigger wraps the wrapper element (not the disabled
Button), preserve the disabled/isLoading props on Button and keep the
conditional rendering of TooltipContent unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: feeb1bcb-aa04-4675-8b44-334f65ca1791

📥 Commits

Reviewing files that changed from the base of the PR and between 7b3828d and b3cf939.

📒 Files selected for processing (16)
  • .github/workflows/helm-release.yml
  • .github/workflows/scripts/validate-schema-sync.sh
  • core/schemas/utils.go
  • framework/configstore/migrations.go
  • plugins/otel/changelog.md
  • plugins/otel/converter.go
  • plugins/otel/go.mod
  • plugins/otel/grpc.go
  • plugins/otel/http.go
  • plugins/otel/main.go
  • plugins/otel/metrics.go
  • plugins/otel/utils.go
  • transports/bifrost-http/server/plugins.go
  • transports/config.schema.json
  • ui/app/workspace/observability/fragments/otelFormFragment.tsx
  • ui/lib/types/schemas.ts
✅ Files skipped from review due to trivial changes (3)
  • .github/workflows/helm-release.yml
  • .github/workflows/scripts/validate-schema-sync.sh
  • plugins/otel/changelog.md
🚧 Files skipped from review as they are similar to previous changes (6)
  • plugins/otel/go.mod
  • transports/bifrost-http/server/plugins.go
  • ui/lib/types/schemas.ts
  • plugins/otel/grpc.go
  • plugins/otel/utils.go
  • plugins/otel/metrics.go

coderabbitai[bot]
coderabbitai Bot previously approved these changes Apr 20, 2026
sammaji and others added 2 commits April 20, 2026 09:45
Adds a new Go-based schema synchronization tool (`schemasync`) to validate that Go config types stay in sync with `config.schema.json`. The tool recursively walks Go struct types via AST analysis to ensure field names, types, and enum values match between Go code and JSON schema definitions.

- **New schemasync tool**: Created `.github/workflows/scripts/schemasync/` with a Go program that uses `go/types` to analyze config structs and compare them against the JSON schema
- **CI integration**: Added schemasync validation step to the helm-release workflow with Go 1.26 setup
- **Enhanced MCP secret support**: Added `secretRef` configuration for MCP client connection strings in helm charts, allowing Kubernetes secret injection
- **Schema updates**: Updated `config.schema.json` and `values.schema.json` to reflect recent Go type changes including:
  - Renamed `concurrency_config` to `concurrency_and_buffer_size`
  - Added new provider fields (`id`, `description`, network/proxy configs)
  - Updated plugin placement enum to include `builtin`
  - Added `chain_rule` field for routing rules
  - Removed unused MCP websocket/http config objects
- **Validation script updates**: Modified existing helm schema validation to account for renamed types and removed obsolete checks

- [x] Feature
- [x] Chore/CI

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

The schemasync tool runs automatically in CI. To test locally:

```sh
.github/workflows/scripts/validate-schema-sync.sh

.github/workflows/scripts/validate-helm-config-fields.sh

helm template helm-charts/bifrost --values test-values.yaml
```

The tool validates:
- Go struct fields have corresponding JSON schema properties
- Enum constants in Go match schema enum arrays
- EnvVar-typed fields have helm chart secret support

N/A - This is a validation tool and schema update.

- [ ] Yes
- [x] No

The changes are additive validation tooling and schema updates that maintain backward compatibility.

This addresses the need for automated validation between Go config types and JSON schemas to prevent drift during development.

- The schemasync tool validates that EnvVar-typed fields (which may contain secrets) have proper Kubernetes secret injection support in helm charts
- Added MCP connection string secret reference capability to avoid hardcoding sensitive connection details

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
@sammaji sammaji changed the base branch from v1.5.0 to graphite-base/2567 April 20, 2026 04:16
@sammaji sammaji force-pushed the graphite-base/2567 branch from f42eac2 to 986552c Compare April 20, 2026 04:16
@sammaji sammaji force-pushed the 04-08-feat_multiple_otel_profiles branch from b3cf939 to 580d859 Compare April 20, 2026 04:16
@sammaji sammaji changed the base branch from graphite-base/2567 to main April 20, 2026 04:16
@sammaji sammaji dismissed coderabbitai[bot]’s stale review April 20, 2026 04:16

The base branch was changed.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (3)
framework/configstore/migrations.go (1)

6577-6585: ⚠️ Potential issue | 🟠 Major

Recompute ConfigHash before saving the migrated plugin.

This migration rewrites the OTEL config shape, but it persists the row with the old config_hash. That leaves the plugin dirty after upgrade and can trigger unnecessary reconciliation on the next sync.

Suggested fix
 			plugin.Config = map[string]any{
 				"profiles": []any{cfgMap},
 			}
 			// Force BeforeSave to re-serialize + re-encrypt.
 			plugin.ConfigJSON = ""
 			plugin.EncryptionStatus = tables.EncryptionStatusPlainText
+			hash, err := GeneratePluginHash(plugin)
+			if err != nil {
+				return fmt.Errorf("failed to generate hash for plugin %s: %w", plugin.Name, err)
+			}
+			plugin.ConfigHash = hash
 			if err := tx.Save(&plugin).Error; err != nil {
 				return fmt.Errorf("failed to save migrated otel config: %w", err)
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@framework/configstore/migrations.go` around lines 6577 - 6585, The migration
updates plugin.Config/ConfigJSON and EncryptionStatus but does not update the
persisted config hash; before calling tx.Save(&plugin) recompute and set the
plugin's config hash field (plugin.ConfigHash) to reflect the new
serialized/configured value so the row is not left dirty. Locate where
plugin.Config, plugin.ConfigJSON and plugin.EncryptionStatus are set and either
call the existing helper that recalculates the hash (e.g.,
RecomputeConfigHash/ComputeConfigHash) or serialize the new config to the same
form used elsewhere and assign the resulting hash to plugin.ConfigHash, then
proceed with tx.Save(&plugin).
core/schemas/utils.go (2)

882-886: ⚠️ Potential issue | 🟠 Major

Deep copy still aliases keyOrder instead of cloning it.

Line [884] copies the slice header directly, so mutations in the copied tool parameters can still mutate the original backing array.

Proposed fix
 		if original.Function.Parameters != nil {
 			copyParams := &ToolFunctionParameters{
 				Type:                original.Function.Parameters.Type,
-				keyOrder:            original.Function.Parameters.keyOrder,
 				explicitEmptyObject: original.Function.Parameters.explicitEmptyObject,
 			}
+			if original.Function.Parameters.keyOrder != nil {
+				copyParams.keyOrder = make([]string, len(original.Function.Parameters.keyOrder))
+				copy(copyParams.keyOrder, original.Function.Parameters.keyOrder)
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 882 - 886, The copy of
ToolFunctionParameters currently aliases the keyOrder slice (copyParams :=
&ToolFunctionParameters{ ..., keyOrder: original.Function.Parameters.keyOrder,
... }), so mutations to copyParams.keyOrder mutate the original; change this to
allocate and copy the slice (handle nil) by creating a new slice with the same
length/capacity and copying elements from original.Function.Parameters.keyOrder
before assigning to copyParams.keyOrder so the copy is independent.

403-446: ⚠️ Potential issue | 🟠 Major

SafeExtractInt64 still allows unsafe numeric coercions.

At Line [419] and Line [427], unchecked uint/uint64 casts can overflow into negative int64 values. At Line [429]-Line [432], float casts accept lossy values (NaN, Inf, fractional, out-of-range) instead of rejecting them.

Proposed fix
 import (
 	"encoding/json"
 	"fmt"
+	"math"
 	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
@@
 func SafeExtractInt64(value any) (int64, bool) {
@@
 	case uint:
+		if uint64(v) > math.MaxInt64 {
+			return 0, false
+		}
 		return int64(v), true
@@
 	case uint64:
+		if v > math.MaxInt64 {
+			return 0, false
+		}
 		return int64(v), true
 	case float32:
-		return int64(v), true
+		f := float64(v)
+		if math.IsNaN(f) || math.IsInf(f, 0) || math.Trunc(f) != f || f < math.MinInt64 || f > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(f), true
 	case float64:
-		return int64(v), true
+		if math.IsNaN(v) || math.IsInf(v, 0) || math.Trunc(v) != v || v < math.MinInt64 || v > math.MaxInt64 {
+			return 0, false
+		}
+		return int64(v), true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/schemas/utils.go` around lines 403 - 446, SafeExtractInt64 currently
permits unsafe coercions: update the uint/uint64 branches and the
float32/float64/json.Number/string handling to reject values that would overflow
or be lossy. Specifically, in SafeExtractInt64 check
uint/uint8/uint16/uint32/uint64 values against math.MaxInt64 and return false if
they exceed it; for float32/float64 first reject NaN and Inf, then ensure the
value is within int64 range and has no fractional part (e.g., math.Trunc(f)==f)
before converting; for json.Number and string parsing, parse into the
appropriate numeric type and validate the same range/no-fraction rules (or use
strconv.ParseInt when appropriate) and return false on out-of-range or lossy
values. Ensure all branches consistently return (0,false) on
invalid/unrepresentable inputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@core/schemas/utils.go`:
- Around line 882-886: The copy of ToolFunctionParameters currently aliases the
keyOrder slice (copyParams := &ToolFunctionParameters{ ..., keyOrder:
original.Function.Parameters.keyOrder, ... }), so mutations to
copyParams.keyOrder mutate the original; change this to allocate and copy the
slice (handle nil) by creating a new slice with the same length/capacity and
copying elements from original.Function.Parameters.keyOrder before assigning to
copyParams.keyOrder so the copy is independent.
- Around line 403-446: SafeExtractInt64 currently permits unsafe coercions:
update the uint/uint64 branches and the float32/float64/json.Number/string
handling to reject values that would overflow or be lossy. Specifically, in
SafeExtractInt64 check uint/uint8/uint16/uint32/uint64 values against
math.MaxInt64 and return false if they exceed it; for float32/float64 first
reject NaN and Inf, then ensure the value is within int64 range and has no
fractional part (e.g., math.Trunc(f)==f) before converting; for json.Number and
string parsing, parse into the appropriate numeric type and validate the same
range/no-fraction rules (or use strconv.ParseInt when appropriate) and return
false on out-of-range or lossy values. Ensure all branches consistently return
(0,false) on invalid/unrepresentable inputs.

In `@framework/configstore/migrations.go`:
- Around line 6577-6585: The migration updates plugin.Config/ConfigJSON and
EncryptionStatus but does not update the persisted config hash; before calling
tx.Save(&plugin) recompute and set the plugin's config hash field
(plugin.ConfigHash) to reflect the new serialized/configured value so the row is
not left dirty. Locate where plugin.Config, plugin.ConfigJSON and
plugin.EncryptionStatus are set and either call the existing helper that
recalculates the hash (e.g., RecomputeConfigHash/ComputeConfigHash) or serialize
the new config to the same form used elsewhere and assign the resulting hash to
plugin.ConfigHash, then proceed with tx.Save(&plugin).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7de8a571-6e42-434f-8b0e-da6e39a782f0

📥 Commits

Reviewing files that changed from the base of the PR and between b3cf939 and 580d859.

📒 Files selected for processing (4)
  • .github/workflows/helm-release.yml
  • .github/workflows/scripts/validate-schema-sync.sh
  • core/schemas/utils.go
  • framework/configstore/migrations.go
✅ Files skipped from review due to trivial changes (2)
  • .github/workflows/scripts/validate-schema-sync.sh
  • .github/workflows/helm-release.yml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants