awss3exporter: add upload metrics for parquet debugging#43
Conversation
Assisted-by: ChatGPT 5.4
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdded upload-related telemetry to the awss3 exporter: new metrics for upload attempts, failures, uploaded bytes, object size histogram, and a gauge for last successful upload; wired instruments into a telemetry builder, updated exporter upload flow to return/report uploaded bytes and timestamps, and extended tests, metadata, documentation, and changelog. Changes
Sequence Diagram(s)sequenceDiagram
participant Exporter
participant Uploader
participant Telemetry
Exporter->>Telemetry: recordUploadStart() (increment uploadAttempt, record start hist)
Exporter->>Uploader: Upload(ctx, data)
Uploader-->>Exporter: (uploadedBytes, err)
alt err != nil
Exporter->>Telemetry: recordUploadComplete(err) (increment uploadFailed with reason/outcome)
else success
Exporter->>Telemetry: recordUploadComplete(success) (add uploadBytes, record uploadObjectSize histogram, set lastSuccessfulUpload gauge)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@exporter/awss3exporter/exporter_test.go`:
- Around line 232-261: Extend TestExporterRecordsUploadFailureMetric to also
assert that success-only metrics are not reported when uploads fail: check that
"otelcol_exporter_awss3_upload_bytes",
"otelcol_exporter_awss3_upload_object_size", and
"otelcol_exporter_awss3_last_successful_upload_timestamp" either have zero
points or are absent from telemetry after calling exporter.ConsumeLogs (with
exporter.uploader set to uploaderStub{err: assert.AnError}); use the existing
telemetry helper functions (assertMetricDataPointCount/requireSumPoint or
equivalents) and the same tel instance to verify these metrics are not emitted
on failure, referencing the test name TestExporterRecordsUploadFailureMetric and
the exporter.uploader stub to locate the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 875a6abd-2c78-4b35-bc2f-9a6a6b97f03d
📒 Files selected for processing (3)
exporter/awss3exporter/exporter.goexporter/awss3exporter/exporter_test.goexporter/awss3exporter/metadata.yaml
Assisted-by: ChatGPT 5.4
Assisted-by: ChatGPT 5.4
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 02fc6490d8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
cubic analysis
No issues found across 3 files
Confidence score: 5/5
- Automated review surfaced no issues in the provided summaries.
- No files require special attention.
Auto-approved: This PR is low-impact as it only adds observability metrics (upload attempts, failures, bytes, sizes) and corresponding tests without altering the core S3 upload logic.
Linked issue analysis
Linked issue: SAW-6662: Design a Snowflake-optimized S3 export format for BigPanda log querying
| Status | Acceptance criteria | Notes |
|---|---|---|
| ❌ | define one shared Snowflake-facing query contract for BigPanda workloads | No schema or query contract files or SQL in diffs |
| ❌ | implement track A: Snowflake-friendly Parquet on plain S3 with Snowflake external table + secure view | No Parquet export, Snowflake external table, or view implementation |
| ❌ | implement track B: Snowflake-friendly schema on Amazon S3 Tables / Iceberg with Snowflake Iceberg integration | No S3 Tables/Iceberg schema or Snowflake Iceberg integration code |
| ❌ | align both tracks with BigPanda’s real query patterns: time-window filters | No query examples, indexing, or filter support in diffs |
| ❌ | align both tracks with BigPanda’s real query patterns: service filters | No service-filter schema or query support present |
| ❌ | align both tracks with BigPanda’s real query patterns: status filters | No status-filtering implementation or schema changes |
| ❌ | align both tracks with BigPanda’s real query patterns: message search | No message search indexing or query support in diffs |
| ❌ | align both tracks with BigPanda’s real query patterns: JSON-property lookups | No typed columns or JSON-property extraction code added |
| ❌ | align both tracks with BigPanda’s real query patterns: common aggregations | No aggregation examples, queries, or optimizations present |
| ❌ | compare performance, bytes scanned, Snowflake cost, and implementation / operational complexity | No benchmarks, byte measurements, cost estimates, or comparison reports |
| ❌ | shared target schema / query contract for BigPanda log queries (deliverable) | No shared target schema or contract artifacts in diffs |
| ❌ | working prototype for track A (deliverable) | No prototype code for plain S3 Parquet + Snowflake external table |
| ❌ | working prototype for track B (deliverable) | No prototype code for S3 Tables / Iceberg + Snowflake integration |
| ❌ | benchmark corpus and reproducible comparison results (deliverable) | No benchmark corpus, harness, or reproducible results included |
| ❌ | Snowflake cost estimate for BigPanda for both paths (deliverable) | No cost estimation or billing analysis present in diffs |
| ❌ | final recommendation for the production path (deliverable) | No recommendation document or decision record in changes |
| ❌ | both prototype paths are queryable in Snowflake with the agreed query corpus (success criteria) | No Snowflake-queryable artifacts or query corpus provided |
| ❌ | both paths are benchmarked on the same representative dataset and Snowflake setup (success criteria) | No benchmarking or Snowflake setup in diffs |
| ❌ | final output includes p50/p95, bytes scanned, cost estimate, and implementation/ops tradeoff summary (success criteria) | No performance metrics, summary, or tradeoff analysis added |
| ❌ | ticket closes with a recommendation for which path should be productionized next (success criteria) | No closing recommendation or decision included in PR |
Assisted-by: ChatGPT 5.4
There was a problem hiding this comment.
1 issue found across 7 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="exporter/awss3exporter/internal/metadata/generated_telemetry_test.go">
<violation number="1" location="exporter/awss3exporter/internal/metadata/generated_telemetry_test.go:62">
P3: Fix the tracer assertion failure message to reference `Tracer` instead of `Meter`.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.
|
@sawmills-architect-review please re-review this PR after the latest webhook fix; the prior run stalled before session start. |
Assisted-by: ChatGPT 5.4
|
@sawmills-architect-review re-review after startup orphan cleanup fix. |
Assisted-by: ChatGPT 5.4
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@exporter/awss3exporter/exporter_test.go`:
- Around line 195-229: The test currently uses assert.Greater(t,
lastSuccessfulUploadPoint.Value, int64(0)) which only verifies presence; change
it to assert the timestamp falls within the controlled window by replacing that
assertion with a range check using beforeConsume.Unix() and afterConsume.Unix()
(e.g. assert.GreaterOrEqual(t, lastSuccessfulUploadPoint.Value,
beforeConsume.Unix()) and assert.LessOrEqual(t, lastSuccessfulUploadPoint.Value,
afterConsume.Unix())), ensuring you cast types to int64 if needed and keep using
the lastSuccessfulUploadPoint variable from exporter_test.go.
In `@exporter/awss3exporter/exporter.go`:
- Around line 52-63: The metric name constants (e.g., flushStartMetricName,
uploadAttemptMetricName, uploadCompleteMetricName,
lastSuccessfulUploadMetricName, and the other constants in this file and the
similar block around the upload metrics at the later section) are defined with
the prefix "otelcol_exporter_awss3_*" but the PR metadata/docs use
"otelcol_awss3_*"; update all metric name constants in exporter.go to use the
consistent "otelcol_awss3_*" prefix (or change the docs to
"otelcol_exporter_awss3_*" if you prefer the other scheme) so runtime instrument
names match the generated metadata—ensure you change every occurrence (both the
block shown and the block at lines ~213-229) to the chosen prefix.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 42d4748e-63fb-4c47-a123-0f3fe2b470a2
📒 Files selected for processing (9)
exporter/awss3exporter/documentation.mdexporter/awss3exporter/exporter.goexporter/awss3exporter/exporter_test.goexporter/awss3exporter/generated_package_test.goexporter/awss3exporter/internal/metadata/generated_telemetry.goexporter/awss3exporter/internal/metadata/generated_telemetry_test.goexporter/awss3exporter/internal/metadatatest/generated_telemetrytest.goexporter/awss3exporter/internal/metadatatest/generated_telemetrytest_test.goexporter/awss3exporter/metadata.yaml
✅ Files skipped from review due to trivial changes (3)
- exporter/awss3exporter/generated_package_test.go
- exporter/awss3exporter/internal/metadatatest/generated_telemetrytest_test.go
- exporter/awss3exporter/internal/metadatatest/generated_telemetrytest.go
|
@sawmills-architect-review re-review after checkout-resolution fix. |
Assisted-by: ChatGPT 5.4
There was a problem hiding this comment.
Telemetry Review
Advisory: 2 medium findings in changed files.
Key findings:
- [MEDIUM] @ exporter/awss3exporter/exporter.go:356 — Five newly added metrics plus pre-existing flush_complete and flush_duration all carry a 'reason' attribute sourced from flushMetadata.reason — a raw string returned by any LogMarshalerWithFlushMetadata implementation with no enumeration constraint. flushOutcomeAttrs() appends the attribute unconditionally when non-empty. The open interface allows any future or third-party marshaler to inject per-batch, per-error, or per-message strings, causing unbounded series growth across all 7 instruments with no guard at the emission path.
- [MEDIUM] @ exporter/awss3exporter/metadata.yaml:12 — metadata.yaml defines telemetry metric keys with an exporter_ prefix, but mdatagen drops that segment and emits otelcol_awss3_* names. The manual constants in exporter.go use otelcol_exporter_awss3_* names. This prefix mismatch means the generated file and the manual file will never agree on metric names without an explicit correction to metadata.yaml.
These findings are advisory; address them before re-requesting review if applicable.
There was a problem hiding this comment.
0 issues found across 3 files (changes from recent commits).
Auto-approved: Adds internal telemetry metrics for AWS S3 exporter. Changes are primarily boilerplate generated code and tests. Core S3 upload logic remains intact, with low-risk instrumentation added.
|
@sawmills-architect-review re-review after issue-comment checkout fix. |
Assisted-by: ChatGPT 5.4
There was a problem hiding this comment.
Compared with previous telemetry review: 0 new, 1 still open, 0 fixed, 0 accepted.
Advisory: [MEDIUM] Five newly added metrics (uploadAttempt, uploadFailed, uploadBytes, uploadObjectSize, lastSuccessfulUpload) plus pre-existing flushComplete and flushDuration all carry a 'reason' attribute sourced from flushMetadata.reason — a raw string returned by any marshalerWithFlushMetadata implementation with no enumeration constraint. flushOutcomeAttrs() appends the attribute unconditionally when non-empty. The open interface allows any current or future marshaler to inject per-batch, per-error, or per-message strings, causing unbounded series growth across all 7 instruments. With 3 signal values × 2 outcome values × N reason values, cardinality scales linearly with distinct reason strings emitted. The org has an active awsS3 destination for logs, making this a live emission path.
Action: Constrain 'reason' to a closed enum at the interface boundary. Define a named type (e.g., FlushReason string) with a fixed set of constants (e.g., FlushReasonBatchFull, FlushReasonTimeout, FlushReasonShutdown) and update MarshalLogsWithFlushMetadata to return that type instead of a raw string. Alternatively, add an allowlist check in flushOutcomeAttrs() that maps unknown values to a sentinel like 'other' before appending the attribute, preventing unbounded series growth without breaking existing marshalers.
Location: exporter/awss3exporter/exporter.go:356.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@exporter/awss3exporter/documentation.md`:
- Line 1: Replace the reference-style comment "[comment]: <> (Code generated by
mdatagen. DO NOT EDIT.)" at the top of exporter/awss3exporter/documentation.md
with a plain HTML comment so markdownlint MD053 no longer treats it as an unused
link definition; update the file by removing the reference-style link syntax and
inserting a standard HTML comment containing the same message to indicate
generated code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cacf5cc7-f600-4d1a-8d3b-40abd67d296f
📒 Files selected for processing (8)
exporter/awss3exporter/documentation.mdexporter/awss3exporter/exporter_test.goexporter/awss3exporter/internal/metadata/generated_telemetry.goexporter/awss3exporter/internal/metadata/generated_telemetry_test.goexporter/awss3exporter/internal/metadatatest/generated_telemetrytest.goexporter/awss3exporter/internal/metadatatest/generated_telemetrytest_test.goexporter/awss3exporter/internal/upload/partition.goexporter/awss3exporter/metadata.yaml
✅ Files skipped from review due to trivial changes (3)
- exporter/awss3exporter/internal/upload/partition.go
- exporter/awss3exporter/internal/metadata/generated_telemetry_test.go
- exporter/awss3exporter/internal/metadatatest/generated_telemetrytest_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
- exporter/awss3exporter/metadata.yaml
- exporter/awss3exporter/exporter_test.go
- exporter/awss3exporter/internal/metadata/generated_telemetry.go
There was a problem hiding this comment.
0 issues found across 6 files (changes from recent commits).
Auto-approved: Adds internal telemetry metrics for AWS S3 exporter. Changes are primarily observability-focused, including generated metadata, documentation, and tests, with very low risk to core upload logic.
|
@sawmills-architect-review re-review this PR |
|
@sawmills-architect-review please re-review after the architect bot routing fix |
Assisted-by: ChatGPT 5.4
There was a problem hiding this comment.
🧹 Nitpick comments (1)
exporter/awss3exporter/internal/upload/writer.go (1)
70-70: Tighten theUploadsize contract on failure.
uploadedBytesis computed before the S3 call and returned even whenUploadObjectfails. The current exporter ignores it onerr, but this new interface is easy for a future caller to misuse and over-count failed uploads. Safer contract: return0when the upload fails, or explicitly document that the size is only valid whenerr == nil.Possible fix
_, err = sw.uploader.UploadObject(ctx, uploadInput) - return uploadedBytes, err + if err != nil { + return 0, err + } + return uploadedBytes, nilAlso applies to: 106-107
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@exporter/awss3exporter/internal/upload/writer.go` at line 70, The Upload implementation computes uploadedBytes from content.Len() before calling the S3 upload and returns that value even on error; change the contract so callers never get a non-zero size on failure by either computing uploadedBytes only after a successful UploadObject call or by setting uploadedBytes = 0 whenever UploadObject returns an error; update all error-return paths in the Upload method (including the secondary upload path currently around the retry/alternative upload logic) to return 0 on err and only return the actual byte count when err == nil, referencing the Upload function, the uploadedBytes variable, and the call to UploadObject.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@exporter/awss3exporter/internal/upload/writer.go`:
- Line 70: The Upload implementation computes uploadedBytes from content.Len()
before calling the S3 upload and returns that value even on error; change the
contract so callers never get a non-zero size on failure by either computing
uploadedBytes only after a successful UploadObject call or by setting
uploadedBytes = 0 whenever UploadObject returns an error; update all
error-return paths in the Upload method (including the secondary upload path
currently around the retry/alternative upload logic) to return 0 on err and only
return the actual byte count when err == nil, referencing the Upload function,
the uploadedBytes variable, and the call to UploadObject.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c8744a7b-62f7-411f-9349-8c9a0de0ffa2
📒 Files selected for processing (5)
exporter/awss3exporter/exporter.goexporter/awss3exporter/exporter_test.goexporter/awss3exporter/internal/upload/writer.goexporter/awss3exporter/internal/upload/writer_test.goexporter/awss3exporter/s3_writer_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- exporter/awss3exporter/exporter_test.go
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit b5ea95d. Configure here.
Assisted-by: ChatGPT 5.4
|
@sawmills-architect-review re-review after head-aware lifecycle watcher fix. |
|
@sawmills-architect-review re-review now that the stale run was cleared after the head-aware lifecycle watcher fix. |
There was a problem hiding this comment.
0 issues found across 1 file (changes from recent commits).
Requires human review: Changes the upload.Manager interface signature and internal upload logic. Modifying functional contracts in core exporter paths requires human validation for potential side effects.

awss3exporter: add upload metrics for parquet debugging
Adds the missing awss3 upload-side metrics needed to explain parquet handoff and S3 behavior; based on a fresh branch from current origin/main.
Description
Note
Medium Risk
Adds new internal metrics and changes the
upload.Managerinterface to return uploaded byte counts, which affects all uploader implementations and could impact upload call sites if missed. Runtime behavior is otherwise additive/observability-focused, with tests updated for success and failure paths.Overview
Adds upload-side telemetry to
awss3exporterto improve observability of parquet/S3 handoff: counters for upload attempts, failures, and uploaded bytes, a histogram for uploaded object sizes, and a gauge for last successful upload timestamp (with explicitBy/ms/sunits).Updates the upload path to have
upload.Manager.Uploadreturn the final uploaded size (post-compression) so the exporter can recordupload_bytes,upload_object_size, and last-success time only on successful uploads; expands/adjusts unit tests and generated metadata/docs/changelog accordingly.Reviewed by Cursor Bugbot for commit a9e975e. Bugbot is set up for automated code reviews on this repo. Configure here.
Summary by cubic
Adds upload-side telemetry to
awss3exporterto make Parquet handoff and S3 uploads observable. Supports SAW-6662 by tracking attempts, failures, bytes, object size, and the last successful upload; sizes reflect post-compression bytes.New Features
metadata.yaml, generatesdocumentation.md, and extends tests to cover success/failure and metric absence on failures.Bug Fixes
upload_bytesandupload_object_size; sets explicit metric units (By,s,ms, and1); fixes tracer assertion message and lints (bind templatenowtotime.Now, import order).Written for commit a9e975e. Summary will update on new commits.
Summary by CodeRabbit
New Features
Tests
Documentation