feat: add debug info upload service by korniltsev-grafanista · Pull Request #4648 · grafana/pyroscope

korniltsev-grafanista · 2025-11-28T07:53:53Z

Summary

Add a new DebuginfoService that allows profiling agents (e.g., the OTel eBPF profiler) to upload debug information (ELF executables) to Pyroscope's object storage. The symbolizer then uses these uploaded files as an additional source when resolving symbols, falling back to debuginfod if no upload is available.

This is an ALPHA state implementation for internal use and testing. Breaking changes will follow.

What's included

New proto API (debuginfo.v1alpha1): Bidirectional streaming Upload RPC using Connect. Clients send a ShouldInitiateUploadRequest, receive a response indicating whether to proceed, then stream UploadChunk messages. The API is marked v1alpha1 to signal experimental status.
Debug info store (pkg/debuginfo/store.go): Handles upload lifecycle — validates GNU build IDs, checks for duplicate/in-progress/stale uploads via ObjectMetadata persisted as JSON in object storage, streams the uploaded binary to the bucket.
Upload reader (pkg/debuginfo/reader/): An io.Reader adapter that assembles streamed chunks into a contiguous reader for the object storage upload.
Symbolizer integration: The symbolizer now checks the debug info upload bucket (keyed by tenant + GNU build ID) before falling back to debuginfod. The symbolizer bucket was changed from a prefixed bucket to the shared storage bucket, with explicit path prefixes (symbolizer/, debug-info/).
API wiring: The service is registered on the distributor's HTTP handler with auth and a 5MB per-message limit.

Key design decisions

Own proto, not Parca's: Inspired by Parca's debug info upload but uses Pyroscope's own protobuf definitions with Connect-Go handlers instead of gRPC.
Metadata in object storage: Upload state (UPLOADING/UPLOADED) is tracked via a JSON metadata file alongside the binary in the bucket, avoiding the need for an additional database.
Stale upload recovery: If a previous upload started but never finished and exceeds MaxUploadDuration + 2min, it's considered stale and can be retried.

Test plan

Unit tests for Store upload flow, metadata lifecycle, and edge cases (pkg/debuginfo/store_test.go)
Unit tests for UploadReader chunk assembly (pkg/debuginfo/reader/reader_test.go)
Symbolizer tests for the uploaded debug info fetch path (pkg/symbolizer/symbolizer_test.go)
Manual test: upload debug info via profiling agent, verify symbolization uses uploaded file

🤖 Generated with Claude Code

Note

Medium Risk
Adds a new authenticated streaming upload endpoint that writes executables/metadata to shared object storage and changes symbolization to read from those uploads, which can affect storage usage and symbol resolution behavior if misconfigured or abused.

Overview
Adds an alpha Connect bidirectional streaming DebuginfoService.Upload API (debuginfo.v1alpha1) plus generated Go/OpenAPI artifacts, enabling agents to upload ELF executables in chunks after an init/"should upload" handshake.

Implements pkg/debuginfo.Store to validate GNU build IDs, track upload state via JSON ObjectMetadata in the bucket, detect stale/in-progress uploads, enforce max upload size/duration, and stream chunks into object storage via a new UploadReader helper.

Wires the service into the distributor HTTP API with new connectOptionsDebugInfo (auth + 5MB read limit) and configuration flags, and updates the symbolizer to (1) use tenant-scoped object-store keys for cached Lidia data and (2) try fetching uploaded debuginfo (debug-info/<tenant>/<buildid>/exe) before calling debuginfod; tests updated/added accordingly.

^{Written by Cursor Bugbot for commit 68c2fab. This will update automatically on new commits. Configure here.}

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add debug-level logging to help troubleshoot debuginfo upload issues: - Log ShouldInitiateUpload results (build_id, decision, reason) - Log Upload gRPC results (success with size, or failure with error) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Debuginfod fetching is now handled in the symbolizer instead of in the parca debuginfo store. This removes the debuginfod client implementation, related tests, and test fixtures from the parca/debuginfo package. Also removes tenant ID handling from the debuginfo store since it's not needed for this use case. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…test Refactor the store test to use an HTTP/2 server with h2c and the util.AuthenticateUser middleware instead of a direct gRPC server. This better reflects the production setup where requests go through HTTP middleware for tenant authentication. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

marcsanmi · 2026-01-20T17:59:41Z

+
+	t := noop.Tracer{}
+	l = log.With(l, "component", "debug-info")
+	bucket = objstore.NewPrefixedBucket(bucket, symbolizer.BucketPrefixParcaDebugInfo)


I think we don't have a cleanup mechanism to control unbounded growth (neither for symbolizer path). Could it be an issue?

I think it's fine to keep lidia's indefinitely. And preferably to remove the debug information files once they are converted to lidia. All of this is hard to implement in a stateless manner in distributors, so I propose to postpone these decision until we have a proper place/way to implement this https://github.com/grafana/pyroscope-squad/issues/687

Co-authored-by: Marc Sanmiquel <marcsanmiquel@gmail.com>

korniltsev-grafanista · 2026-02-16T07:27:41Z

Just wanted to clarify, the PR was not ready for review, only the new protobufs and overall intention to use the new protobufs.

I also think we should have an idea how to move forward with infinite data growth (both lidia and debug info)

korniltsev-grafanista · 2026-03-13T00:27:35Z

I also think we should have an idea how to move forward with infinite data growth (both lidia and debug info)

I talked with @\simonswine and we agreed to solve this later, and merge this for private use, not for public.

Remove parcaBucket parameter from test helper and all callers after the symbolizer was refactored to use a single bucket for both lidia and parca debug info storage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move proto package from debuginfo.v1 to debuginfo.v1alpha1 and update all generated code and consumer imports accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Rename proto fields to snake_case (gnu_build_id, go_build_id, otel_file_id) - Rename enum DebuginfoType → Type, add TYPE_UNSPECIFIED, fix prefix - Fix bug: set STATE_UPLOADED after upload completes - Replace gRPC status.Error with connect.NewError for consistency - Rename fetchFromParca → fetchFromUploadedDebugInfo - Revert ProcessELFData back to private method on Symbolizer Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

korniltsev-grafanista · 2026-03-13T01:29:48Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9757e94dc9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

#4896) * add unit tests for debuginfo service Cover the store logic (ValidateGnuBuildID, validateInit, NewStore, shouldInitiateUpload, uploadIsStale, fetchMetadata, writeMetadata, ObjectPath, MetadataObjectPath) and the chunked upload reader (Read, Size, context cancellation) with ~43 test cases across two new test files. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix store_test.go to use base branch protobuf field names The base branch (debuginfo-upload) uses GnuBuildId and FileMetadata_TYPE_* naming in the generated protobuf, while this worktree had a divergent proto with GNU and DEBUGINFO_TYPE_* names. Align the test file with the base branch so CI compiles correctly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix gofmt alignment in store_test.go Align struct field formatting to pass gofmt check in CI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * add end-to-end connect client tests for debuginfo Upload RPC Add TestUploadE2E with three subtests that exercise the full Upload bidi streaming RPC over a real HTTP/2 test server with connect client: - full upload flow: init -> chunks -> verify stored data and metadata - already uploaded: pre-populate metadata, verify service declines upload - disabled service: verify service returns ReasonDisabled Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * remove unused net/http import from store_test.go Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix e2e test to use TLS for HTTP/2 bidi streaming BidiStream requires HTTP/2. Switch from h2c (cleartext) to TLS via StartTLS() which negotiates HTTP/2 through ALPN. Remove unused h2c and http2 imports. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix store.go and e2e test: align with base branch protobuf, fix upload state bug - Align store.go with base branch protobuf field names (GnuBuildId, TYPE_EXECUTABLE_FULL/NO_TEXT) and use connect.NewError instead of grpc status.Error - Add missing md.State = STATE_UPLOADED after successful upload - Fix e2e test to wait for server handler completion by calling Receive() after CloseRequest() before checking bucket contents Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Fix wrong error messages: "init.FileData" → "init.File" to match actual field name - Enforce MaxUploadSize in UploadReader to prevent unlimited chunked uploads - Tenant-isolate lidia cache paths to prevent cross-tenant cache poisoning Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

korniltsev-grafanista · 2026-03-13T03:41:44Z

@codex review

korniltsev-grafanista · 2026-03-13T03:42:32Z

@cursor-bugbot review

cursor

Cursor Bugbot has reviewed your changes and found 4 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for all 4 issues found in the latest run.

✅ Fixed: filepath.Join produces wrong paths on non-Unix OS
- Changed lidiaObjectPath to use path.Join so object-storage keys always use forward slashes across OSes.
✅ Fixed: Removed not-found cache causes extra bucket lookups
- Reintroduced the debuginfod not-found cache short-circuit in getLidiaBytes to avoid unnecessary object-store lookups for known-missing build IDs.
✅ Fixed: IDE run config commits debug-level logging settings
- Removed the debug logging/output-file/large upload-size flags from the shared .idea v2 run configuration and restored non-debug defaults.
✅ Fixed: Reader drops bytes when current chunk ends with data
- Reworked UploadReader.Read to return already-read bytes before advancing chunks and added tests covering n>0 with io.EOF behavior.

Or push these changes by commenting:

@cursor push 5b9bdf720c

Preview (5b9bdf720c)

diff --git a/.idea/runConfigurations/v2.xml b/.idea/runConfigurations/v2.xml
--- a/.idea/runConfigurations/v2.xml
+++ b/.idea/runConfigurations/v2.xml
@@ -1,10 +1,9 @@
 <component name="ProjectRunConfigurationManager">
   <configuration default="false" name="v2" type="GoApplicationRunConfiguration" factoryName="Go Application">
-    <output_file path="intellij-pyroscope.log" is_save="true" />
     <module name="pyroscope" />
     <working_directory value="$PROJECT_DIR$" />
     <go_parameters value="-tags embedassets" />
-    <parameters value="-log.level=debug -storage.backend=filesystem -write-path=segment-writer -enable-query-backend=true -symbolizer.enabled=true -debug-info.max-upload-size=1000000000" />
+    <parameters value="-storage.backend=filesystem -write-path=segment-writer -enable-query-backend=true" />
     <envs>
       <env name="PYROSCOPE_V2" value="true" />
     </envs>

diff --git a/pkg/debuginfo/reader/reader.go b/pkg/debuginfo/reader/reader.go
--- a/pkg/debuginfo/reader/reader.go
+++ b/pkg/debuginfo/reader/reader.go
@@ -24,39 +24,39 @@
 }
 
 func (r *UploadReader) Read(p []byte) (int, error) {
-	if r.cur == nil {
-		var err error
-		r.cur, err = r.next()
-		if err == io.EOF {
-			return 0, io.EOF
+	for {
+		if r.cur == nil {
+			var err error
+			r.cur, err = r.next()
+			if err == io.EOF {
+				return 0, io.EOF
+			}
+			if err != nil {
+				return 0, fmt.Errorf("get upload chunk (%d bytes read so far): %w", r.size, err)
+			}
 		}
-		if err != nil {
-			return 0, fmt.Errorf("get first upload chunk: %w", err)
+
+		i, err := r.cur.Read(p)
+		if i > 0 {
+			r.size += uint64(i)
+			if r.maxSize > 0 && r.size > uint64(r.maxSize) {
+				return 0, fmt.Errorf("upload size %d exceeds maximum allowed size of %d bytes", r.size, r.maxSize)
+			}
+			if err == io.EOF {
+				r.cur = nil
+			}
+			return i, nil
 		}
-	}
-	i, err := r.cur.Read(p)
-	if err != nil && err != io.EOF {
-		return 0, fmt.Errorf("read upload chunk (%d bytes read so far): %w", r.size, err)
-	}
-	if err == io.EOF {
-		r.cur, err = r.next()
+
 		if err == io.EOF {
-			return 0, io.EOF
+			r.cur = nil
+			continue
 		}
 		if err != nil {
-			return 0, fmt.Errorf("get next upload chunk (%d bytes read so far): %w", r.size, err)
+			return 0, fmt.Errorf("read upload chunk (%d bytes read so far): %w", r.size, err)
 		}
-		i, err = r.cur.Read(p)
-		if err != nil {
-			return 0, fmt.Errorf("read next upload chunk (%d bytes read so far): %w", r.size, err)
-		}
+		return 0, nil
 	}
-
-	r.size += uint64(i)
-	if r.maxSize > 0 && r.size > uint64(r.maxSize) {
-		return 0, fmt.Errorf("upload size %d exceeds maximum allowed size of %d bytes", r.size, r.maxSize)
-	}
-	return i, nil
 }
 
 func (r *UploadReader) next() (io.Reader, error) {

diff --git a/pkg/debuginfo/reader/reader_test.go b/pkg/debuginfo/reader/reader_test.go
--- a/pkg/debuginfo/reader/reader_test.go
+++ b/pkg/debuginfo/reader/reader_test.go
@@ -22,6 +22,20 @@
 	}
 }
 
+type eofAfterDataReader struct {
+	data []byte
+	read bool
+}
+
+func (r *eofAfterDataReader) Read(p []byte) (int, error) {
+	if r.read {
+		return 0, io.EOF
+	}
+	r.read = true
+	n := copy(p, r.data)
+	return n, io.EOF
+}
+
 func TestUploadReader_Read(t *testing.T) {
 	t.Parallel()
 
@@ -102,6 +116,28 @@
 		require.Error(t, err)
 		assert.Contains(t, err.Error(), "broken")
 	})
+
+	t.Run("reader returning data with EOF keeps bytes", func(t *testing.T) {
+		t.Parallel()
+
+		r := New(context.Background(), chunksFunc(nil), 0)
+		r.cur = &eofAfterDataReader{data: []byte("abc")}
+
+		data, err := io.ReadAll(r)
+		require.NoError(t, err)
+		assert.Equal(t, "abc", string(data))
+	})
+
+	t.Run("reader returning data with EOF does not overwrite buffer", func(t *testing.T) {
+		t.Parallel()
+
+		r := New(context.Background(), chunksFunc([][]byte{[]byte("def")}), 0)
+		r.cur = &eofAfterDataReader{data: []byte("abc")}
+
+		data, err := io.ReadAll(r)
+		require.NoError(t, err)
+		assert.Equal(t, "abcdef", string(data))
+	})
 }
 
 func TestUploadReader_Size(t *testing.T) {

diff --git a/pkg/symbolizer/symbolizer.go b/pkg/symbolizer/symbolizer.go
--- a/pkg/symbolizer/symbolizer.go
+++ b/pkg/symbolizer/symbolizer.go
@@ -9,6 +9,7 @@
 	"fmt"
 	"io"
 	"net/http"
+	"path"
 	"path/filepath"
 	"time"
 
@@ -417,6 +418,15 @@
 }
 
 func (s *Symbolizer) getLidiaBytes(ctx context.Context, buildID string) ([]byte, error) {
+	if client, ok := s.client.(*DebuginfodHTTPClient); ok {
+		if sanitizedBuildID, err := sanitizeBuildID(buildID); err == nil {
+			if found, _ := client.notFoundCache.Get(sanitizedBuildID); found {
+				s.metrics.cacheOperations.WithLabelValues("not_found", "get", statusSuccess).Inc()
+				return nil, buildIDNotFoundError{buildID: buildID}
+			}
+		}
+	}
+
 	tenantID, err := tenant.TenantID(ctx)
 	if err != nil {
 		return nil, err
@@ -461,7 +471,7 @@
 }
 
 func lidiaObjectPath(tenantID, buildID string) string {
-	return filepath.Join(bucketPrefix, tenantID, buildID)
+	return path.Join(bucketPrefix, tenantID, buildID)
 }
 
 // fetchLidiaFromDebuginfod fetches debug info from debuginfod and converts to Lidia format

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b33f1a7572

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-03-13T03:51:14Z

+	if err := s.bucket.Upload(ctx, ObjectPath(tenantID, id), r); err != nil {
+		return connect.NewError(connect.CodeInternal, fmt.Errorf("upload debuginfo: %w", err))
+	}
+	md.State = debuginfov1alpha1.ObjectMetadata_STATE_UPLOADED
+	md.FinishedAt = timestamppb.New(time.Now())


Reject empty upload streams before marking metadata uploaded

If a client sends init and then closes the stream without any chunk, debuginforeader.New(...).Read returns EOF immediately, bucket.Upload succeeds with a zero-byte object, and this code still marks metadata as STATE_UPLOADED; from that point shouldInitiateUpload returns ReasonDebuginfoAlreadyExists, so the tenant is effectively stuck with an unusable debug-info object until manual cleanup.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-13T03:51:14Z

 // fetchLidiaFromDebuginfod fetches debug info from debuginfod and converts to Lidia format
 func (s *Symbolizer) fetchLidiaFromDebuginfod(ctx context.Context, buildID string) ([]byte, error) {
-	debugReader, err := s.fetchFromDebuginfod(ctx, buildID)
+	debugReader, err := s.fetch(ctx, buildID)


Enforce symbol size limits for uploaded debug-info reads

Switching this call to s.fetch allows uploaded debug-info files to flow through fetchLidiaFromDebuginfod, but that path then does an unbounded io.ReadAll on the returned reader before limit checks; unlike the debuginfod HTTP client path, this bypasses SymbolizerMaxSymbolSizeBytes for uploaded files, so tenants with stricter symbol-size limits can still trigger large in-memory reads from object storage.

Useful? React with 👍 / 👎.

- Use path.Join instead of filepath.Join in lidiaObjectPath for OS-independent object storage keys - Restore tenant-scoped not-found cache in getLidiaBytes to reduce redundant object storage lookups for unknown build IDs - Extract MaxSizeReader from UploadReader into a composable wrapper - Handle simultaneous data+EOF in UploadReader.Read per io.Reader contract - Revert debug-specific IDE run configuration changes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

korniltsev-grafanista · 2026-03-13T04:09:30Z

@cursor-bugbot review

cursor

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for all 3 issues found in the latest run.

✅ Fixed: Silently swallowed errors hide uploaded debug info failures
- Added a debug log in fetch when uploaded debug info retrieval fails before falling back to debuginfod so fallback-causing errors are visible.
✅ Fixed: Not-found cache blocks newly uploaded debug info lookups
- Removed the tenant-scoped not-found cache short-circuit and write path from getLidiaBytes so newly uploaded debug info is not blocked by stale not-found entries.
✅ Fixed: MaxSizeReader drops read bytes on limit exceeded
- Updated MaxSizeReader.Read to return the actual bytes read with the size-limit error and added a test to verify partial-read behavior.

Or push these changes by commenting:

@cursor push 2ec2c9ddbb

Preview (2ec2c9ddbb)

diff --git a/pkg/debuginfo/reader/reader.go b/pkg/debuginfo/reader/reader.go
--- a/pkg/debuginfo/reader/reader.go
+++ b/pkg/debuginfo/reader/reader.go
@@ -94,7 +94,7 @@
 	n, err := r.r.Read(p)
 	r.n += int64(n)
 	if r.maxSize > 0 && r.n > r.maxSize {
-		return 0, fmt.Errorf("upload size %d exceeds maximum allowed size of %d bytes", r.n, r.maxSize)
+		return n, fmt.Errorf("upload size %d exceeds maximum allowed size of %d bytes", r.n, r.maxSize)
 	}
 	return n, err
 }

diff --git a/pkg/debuginfo/reader/reader_test.go b/pkg/debuginfo/reader/reader_test.go
--- a/pkg/debuginfo/reader/reader_test.go
+++ b/pkg/debuginfo/reader/reader_test.go
@@ -170,6 +170,21 @@
 		assert.Contains(t, err.Error(), "exceeds maximum allowed size")
 	})
 
+	t.Run("returns already-read bytes when limit exceeded", func(t *testing.T) {
+		t.Parallel()
+		r := New(context.Background(), chunksFunc([][]byte{
+			[]byte("abcdef"),
+		}))
+		lr := NewMaxSizeReader(r, 5)
+
+		buf := make([]byte, 16)
+		n, err := lr.Read(buf)
+		require.Error(t, err)
+		assert.Equal(t, 6, n)
+		assert.Equal(t, "abcdef", string(buf[:n]))
+		assert.Contains(t, err.Error(), "exceeds maximum allowed size")
+	})
+
 	t.Run("zero means unlimited", func(t *testing.T) {
 		t.Parallel()
 		r := New(context.Background(), chunksFunc([][]byte{

diff --git a/pkg/symbolizer/symbolizer.go b/pkg/symbolizer/symbolizer.go
--- a/pkg/symbolizer/symbolizer.go
+++ b/pkg/symbolizer/symbolizer.go
@@ -423,15 +423,6 @@
 		return nil, err
 	}
 
-	// Check not-found cache with tenant-scoped key to avoid repeated object storage lookups
-	// for build IDs that are known to be unavailable for this tenant.
-	notFoundCacheKey := tenantID + "/" + buildID
-	if httpClient, ok := s.client.(*DebuginfodHTTPClient); ok {
-		if found, _ := httpClient.notFoundCache.Get(notFoundCacheKey); found {
-			return nil, buildIDNotFoundError{buildID: buildID}
-		}
-	}
-
 	lidiaBytes, err := s.fetchLidiaFromObjectStore(ctx, tenantID, buildID)
 	if err == nil {
 		s.metrics.cacheOperations.WithLabelValues("object_storage", "get", statusSuccess).Inc()
@@ -441,13 +432,6 @@
 
 	lidiaBytes, err = s.fetchLidiaFromDebuginfod(ctx, buildID)
 	if err != nil {
-		// Cache not-found errors with tenant-scoped key to avoid repeated lookups.
-		var bnfErr buildIDNotFoundError
-		if errors.As(err, &bnfErr) {
-			if httpClient, ok := s.client.(*DebuginfodHTTPClient); ok {
-				httpClient.notFoundCache.SetWithTTL(notFoundCacheKey, true, 1, httpClient.cfg.NotFoundCacheTTL)
-			}
-		}
 		return nil, err
 	}
 
@@ -514,6 +498,8 @@
 func (s *Symbolizer) fetch(ctx context.Context, buildID string) (io.ReadCloser, error) {
 	if r, err := s.fetchFromUploadedDebugInfo(ctx, buildID); err == nil {
 		return r, nil
+	} else {
+		level.Debug(s.logger).Log("msg", "failed to fetch uploaded debug info, falling back to debuginfod", "buildID", buildID, "err", err)
 	}
 	return s.fetchFromDebuginfod(ctx, buildID)
 }

cursor · 2026-03-13T04:10:11Z

+		return r, nil
+	}
+	return s.fetchFromDebuginfod(ctx, buildID)
+}


Silently swallowed errors hide uploaded debug info failures

Medium Severity

The fetch method silently discards all errors from fetchFromUploadedDebugInfo, including non-transient errors like bucket connectivity issues or permission failures. When uploaded debug info exists but can't be read due to an infrastructure problem, the symbolizer silently falls through to debuginfod with no logging, making it impossible to diagnose why uploaded debug info is never used.

- Remove MaxSizeReader wrapper; check r.Size() after upload and delete oversized objects in store.go - Remove duplicate not-found cache in symbolizer (debuginfod client already caches not-found build IDs) - Bound io.ReadAll in fetchLidiaFromDebuginfod with readAllWithLimit to enforce SymbolizerMaxSymbolSizeBytes for uploaded debug-info files Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

Bugbot Autofix prepared fixes for both issues found in the latest run.

✅ Fixed: Upload size check happens after full upload completes
- The upload now streams through an io.LimitReader capped at MaxUploadSize+1, preventing unbounded writes to object storage before size validation.
✅ Fixed: Metadata left as UPLOADING after size-exceeded rejection
- On size-exceeded rejection the code now deletes both the uploaded object and its metadata so retries are not blocked by stale STATE_UPLOADING metadata.

Or push these changes by commenting:

@cursor push faf97c8dda

Preview (faf97c8dda)

diff --git a/pkg/debuginfo/store.go b/pkg/debuginfo/store.go
--- a/pkg/debuginfo/store.go
+++ b/pkg/debuginfo/store.go
@@ -165,11 +165,16 @@
 	}
 
 	r := debuginforeader.New(ctx, readChunksFromStream(stream))
-	if err := s.bucket.Upload(ctx, ObjectPath(tenantID, id), r); err != nil {
+	uploadReader := io.Reader(r)
+	if s.cfg.MaxUploadSize > 0 {
+		uploadReader = io.LimitReader(r, s.cfg.MaxUploadSize+1)
+	}
+	if err := s.bucket.Upload(ctx, ObjectPath(tenantID, id), uploadReader); err != nil {
 		return connect.NewError(connect.CodeInternal, fmt.Errorf("upload debuginfo: %w", err))
 	}
 	if s.cfg.MaxUploadSize > 0 && r.Size() > uint64(s.cfg.MaxUploadSize) {
 		_ = s.bucket.Delete(ctx, ObjectPath(tenantID, id))
+		_ = s.bucket.Delete(ctx, MetadataObjectPath(tenantID, id))
 		return connect.NewError(connect.CodeInvalidArgument,
 			fmt.Errorf("upload size %d exceeds maximum allowed size of %d bytes", r.Size(), s.cfg.MaxUploadSize))
 	}

diff --git a/pkg/debuginfo/store_test.go b/pkg/debuginfo/store_test.go
--- a/pkg/debuginfo/store_test.go
+++ b/pkg/debuginfo/store_test.go
@@ -657,4 +657,83 @@
 		require.NoError(t, stream.CloseRequest())
 		require.NoError(t, stream.CloseResponse())
 	})
+
+	t.Run("oversized upload is rejected and upload state is cleaned up", func(t *testing.T) {
+		t.Parallel()
+		store, bucket := newTestStore(t, Config{
+			Enabled:           true,
+			MaxUploadDuration: time.Minute,
+			MaxUploadSize:     8,
+		})
+		client := startTestServer(t, store)
+
+		ctx := tenant.InjectTenantID(context.Background(), "test-tenant")
+		stream := client.Upload(ctx)
+
+		err := stream.Send(&debuginfov1alpha1.UploadRequest{
+			Data: &debuginfov1alpha1.UploadRequest_Init{
+				Init: &debuginfov1alpha1.ShouldInitiateUploadRequest{
+					File: &debuginfov1alpha1.FileMetadata{
+						GnuBuildId: "aabbccdd",
+						Name:       "my-binary",
+						Type:       debuginfov1alpha1.FileMetadata_TYPE_EXECUTABLE_FULL,
+					},
+				},
+			},
+		})
+		require.NoError(t, err)
+
+		resp, err := stream.Receive()
+		require.NoError(t, err)
+		initResp := resp.GetInit()
+		require.NotNil(t, initResp)
+		assert.True(t, initResp.ShouldInitiateUpload)
+		assert.Equal(t, ReasonFirstTimeSeen, initResp.Reason)
+
+		for _, chunk := range [][]byte{[]byte("12345"), []byte("67890")} {
+			err = stream.Send(&debuginfov1alpha1.UploadRequest{
+				Data: &debuginfov1alpha1.UploadRequest_Chunk{
+					Chunk: &debuginfov1alpha1.UploadChunk{Chunk: chunk},
+				},
+			})
+			require.NoError(t, err)
+		}
+
+		require.NoError(t, stream.CloseRequest())
+		_, err = stream.Receive()
+		require.Error(t, err)
+		assert.Equal(t, connect.CodeInvalidArgument, connect.CodeOf(err))
+		assert.Contains(t, err.Error(), "exceeds maximum allowed size")
+		require.NoError(t, stream.CloseResponse())
+
+		id := mustValidateGnuBuildID(t, "aabbccdd")
+		objects := bucket.Objects()
+		_, objectExists := objects[ObjectPath("test-tenant", id)]
+		_, metadataExists := objects[MetadataObjectPath("test-tenant", id)]
+		assert.False(t, objectExists)
+		assert.False(t, metadataExists)
+
+		retryStream := client.Upload(ctx)
+		err = retryStream.Send(&debuginfov1alpha1.UploadRequest{
+			Data: &debuginfov1alpha1.UploadRequest_Init{
+				Init: &debuginfov1alpha1.ShouldInitiateUploadRequest{
+					File: &debuginfov1alpha1.FileMetadata{
+						GnuBuildId: "aabbccdd",
+						Name:       "my-binary",
+						Type:       debuginfov1alpha1.FileMetadata_TYPE_EXECUTABLE_FULL,
+					},
+				},
+			},
+		})
+		require.NoError(t, err)
+
+		retryResp, err := retryStream.Receive()
+		require.NoError(t, err)
+		retryInitResp := retryResp.GetInit()
+		require.NotNil(t, retryInitResp)
+		assert.True(t, retryInitResp.ShouldInitiateUpload)
+		assert.Equal(t, ReasonFirstTimeSeen, retryInitResp.Reason)
+		require.NoError(t, retryStream.CloseRequest())
+		require.NoError(t, retryStream.CloseResponse())
+	})
 }

cursor · 2026-03-13T04:23:29Z

+		_ = s.bucket.Delete(ctx, ObjectPath(tenantID, id))
+		return connect.NewError(connect.CodeInvalidArgument,
+			fmt.Errorf("upload size %d exceeds maximum allowed size of %d bytes", r.Size(), s.cfg.MaxUploadSize))
+	}


Upload size check happens after full upload completes

Medium Severity

The MaxUploadSize validation occurs after s.bucket.Upload has already written the entire payload to object storage. An authenticated client can stream unlimited data (in 5MB chunks) to the bucket; only after the upload finishes is the size checked and the object deleted. This enables resource exhaustion of the storage backend since there is no streaming size limit enforced during the upload itself.

cursor · 2026-03-13T04:23:29Z

+		_ = s.bucket.Delete(ctx, ObjectPath(tenantID, id))
+		return connect.NewError(connect.CodeInvalidArgument,
+			fmt.Errorf("upload size %d exceeds maximum allowed size of %d bytes", r.Size(), s.cfg.MaxUploadSize))
+	}


Metadata left as UPLOADING after size-exceeded rejection

Medium Severity

When an upload is rejected for exceeding MaxUploadSize, the object file is deleted via s.bucket.Delete but the metadata (written at line 163 as STATE_UPLOADING) is never cleaned up or updated. Subsequent upload attempts for the same build ID will see the orphaned STATE_UPLOADING metadata and be rejected with "upload in progress" until the staleness timer (MaxUploadDuration + 2 minutes) expires, effectively creating a temporary denial-of-service window for that build ID.

Additional Locations (1)

pkg/debuginfo/store.go#L156-L165

korniltsev-grafanista force-pushed the debuginfo-upload branch 5 times, most recently from d202551 to 8577f33 Compare January 16, 2026 07:36

parca symbol upload

5c053a7

korniltsev-grafanista force-pushed the debuginfo-upload branch from 8577f33 to 5c053a7 Compare January 16, 2026 07:37

test: add symbolizer test for parca bucket debuginfo source

3748d11

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

korniltsev-grafanista changed the title ~~WIP parca debug-info upload~~ parca debug-info upload Jan 16, 2026

korniltsev-grafanista marked this pull request as ready for review January 16, 2026 08:05

korniltsev-grafanista requested review from a team, aleks-p, alsoba13, marcsanmi and simonswine as code owners January 16, 2026 08:05

korniltsev-grafanista and others added 5 commits January 19, 2026 05:45

make go/mod

ebac661

make fmt

de30ec7

marcsanmi reviewed Jan 20, 2026

View reviewed changes

korniltsev-grafanista and others added 5 commits January 21, 2026 15:29

Update pkg/parca/debuginfo/metadata.go

6996be3

Co-authored-by: Marc Sanmiquel <marcsanmiquel@gmail.com>

remove writepath options from the new debuginfo handlers

a6e1367

level.Error

51b6c96

cleanup

c4a913d

remove cache

73a9406

korniltsev-grafanista mentioned this pull request Jan 22, 2026

feat: Add parcas symbols upload to pyroscope.ebpf grafana/alloy#4948

Merged

4 tasks

korniltsev-grafanista added 3 commits January 22, 2026 14:51

Merge branch 'main' into debuginfo-upload

ebf62a6

init

e0bb853

add gopclntab

2574f90

korniltsev-grafanista marked this pull request as draft February 16, 2026 07:25

korniltsev-grafanista changed the title ~~parca debug-info upload~~ debug-info upload Mar 13, 2026

Merge remote-tracking branch 'origin/main' into debuginfo-upload

7cc54f2

korniltsev-grafanista and others added 3 commits March 13, 2026 10:59

fix symbolizer tests after New() signature change

755bdd2

Remove parcaBucket parameter from test helper and all callers after the symbolizer was refactored to use a single bucket for both lidia and parca debug info storage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

rename debuginfo API package to v1alpha1 to mark as experimental

2693f99

Move proto package from debuginfo.v1 to debuginfo.v1alpha1 and update all generated code and consumer imports accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

revert lidia change

889b668

korniltsev-grafanista-yolo-vibecoder239 mentioned this pull request Mar 13, 2026

lidia: tolerate elf.ErrNoSymbols when both symbol tables are missing #4895

Merged

korniltsev-grafanista and others added 2 commits March 13, 2026 11:26

cleanup

2fcace4

chatgpt-codex-connector Bot reviewed Mar 13, 2026

View reviewed changes

Comment thread pkg/symbolizer/symbolizer.go

Comment thread pkg/debuginfo/store.go

korniltsev-grafanista-yolo-vibecoder239 mentioned this pull request Mar 13, 2026

feat(pyroscope): Replace Parca gRPC debuginfo upload with Pyroscope Connect API grafana/alloy#5779

Closed

5 tasks

korniltsev-grafanista-yolo-vibecoder239 and others added 2 commits March 13, 2026 13:06

fix proto formatting (buf format)

ba25e78

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

korniltsev-grafanista changed the title ~~debug-info upload~~ feat: add debug info upload service Mar 13, 2026

korniltsev-grafanista marked this pull request as ready for review March 13, 2026 03:42

cursor Bot reviewed Mar 13, 2026

View reviewed changes

Comment thread pkg/symbolizer/symbolizer.go

Comment thread pkg/symbolizer/symbolizer.go

Comment thread .idea/runConfigurations/v2.xml Outdated

Comment thread pkg/debuginfo/reader/reader.go

chatgpt-codex-connector Bot reviewed Mar 13, 2026

View reviewed changes

cursor Bot reviewed Mar 13, 2026

View reviewed changes

korniltsev-grafanista merged commit e9a5181 into main Mar 13, 2026
25 checks passed

korniltsev-grafanista deleted the debuginfo-upload branch March 13, 2026 04:38

Conversation

korniltsev-grafanista commented Nov 28, 2025 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's included

Key design decisions

Test plan

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

marcsanmi Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

korniltsev-grafanista Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

korniltsev-grafanista commented Feb 16, 2026

Uh oh!

korniltsev-grafanista commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

korniltsev-grafanista commented Mar 13, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

korniltsev-grafanista commented Mar 13, 2026

Uh oh!

korniltsev-grafanista commented Mar 13, 2026

Uh oh!

cursor Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

korniltsev-grafanista commented Mar 13, 2026

Uh oh!

cursor Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cursor Bot Mar 13, 2026

Choose a reason for hiding this comment

Silently swallowed errors hide uploaded debug info failures

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cursor Bot Mar 13, 2026

Choose a reason for hiding this comment

Upload size check happens after full upload completes

Uh oh!

cursor Bot Mar 13, 2026

Choose a reason for hiding this comment

Metadata left as UPLOADING after size-exceeded rejection

Uh oh!

Uh oh!

Reviewers

korniltsev-grafanista commented Nov 28, 2025 •

edited by cursor Bot

Loading

korniltsev-grafanista commented Mar 13, 2026 •

edited

Loading

cursor Bot left a comment •

edited

Loading

cursor Bot left a comment •

edited

Loading

cursor Bot left a comment •

edited

Loading