feat(pyroscope): Replace Parca gRPC debuginfo upload with Pyroscope Connect API#5779
Conversation
Alloy cannot use otel v1.40.0 yet (grafana/alloy#5779). grpc v1.78.0 only requires otel/sdk/metric v1.38.0, so v1.39.0 is sufficient for this module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| // newHTTP2Client wraps an existing HTTP client's transport with HTTP/2 support. | ||
| // For HTTPS, HTTP/2 is negotiated via ALPN automatically. For plain HTTP (h2c), | ||
| // we use the http2 transport with AllowHTTP and a custom dialer. | ||
| func newHTTP2Client(base *http.Client) *http.Client { |
There was a problem hiding this comment.
this is garbage. is there no way to configure h2 with commonconfig.NewClientFromConfig?
There was a problem hiding this comment.
this still looks like garbage and idk how to make it nice
korniltsev-grafanista-yolo-vibecoder239
left a comment
There was a problem hiding this comment.
Code Review
Overall this is a well-structured migration from Parca gRPC to Pyroscope Connect bidi streaming. Clean dependency removal, good fan-out upgrade, solid test coverage. A few issues worth addressing:
HIGH
1. Bidi stream never fully closed — CloseResponse() / final response never read
pyroscope_uploader.go:attemptUpload and receive_http/debuginfo.go:Upload
After sending all chunks and calling stream.CloseRequest(), neither the uploader nor the proxy ever calls stream.CloseResponse() or reads the server's final response. This means:
- The server may send a final error/confirmation that goes unread → silent upload failures
- HTTP/2 streams may leak (client holds the receive half-open)
The Connect bidi protocol expects the client to close both sides.
Fix: After CloseRequest(), drain the server response (or at minimum call CloseResponse()) in both files.
2. Proxy: all downstreams failing init silently returns "upload not needed"
receive_http/debuginfo.go, init loop (~lines 53-73 in the new code)
When ALL downstream clients fail during the init handshake (Send or Receive errors), the code continues past each. Result: downstreams is empty, anyShouldUpload is false, proxy sends back ShouldInitiateUpload: false — telling the caller "no upload needed" when all downstreams are unreachable. Silent data loss.
Fix: After the init loop, check len(downstreams) == 0 && len(clients) > 0 and return connect.NewError(connect.CodeUnavailable, ...).
3. h2c client drops auth headers and tracing from base client
write/write.go:newH2CClient()
The h2c client creates a fresh http2.Transport that does NOT inherit the base client's transport chain:
- No auth headers (basic auth, bearer token from
HTTPClientConfig) - No tracing (base transport is wrapped with
otelhttp.NewTransport) - No custom dialer (proxy settings,
DialContextfrom base)
Only Timeout, CheckRedirect, and Jar are copied. Debuginfo uploads over plain HTTP will lack authentication.
Fix: Wrap the h2c transport in the same tracing/auth middleware, or document this as a known limitation with a TODO.
MEDIUM
4. No stream cleanup on early error returns in attemptUpload()
pyroscope_uploader.go — when attemptUpload() returns early (e.g., stream.Send() fails in Step 1), the bidi stream is never closed. No defer for cleanup.
Fix: Add defer func() { _ = stream.CloseRequest(); _ = stream.CloseResponse() }() after stream := client.Upload(ctx).
5. Chunk read loop may lose last bytes (fragile io.EOF handling)
pyroscope_uploader.go, Step 4 chunk loop:
n, err := reader.Read(buffer)
if err == io.EOF {
break // If n > 0 here, those bytes are lost
}Per io.Reader contract, a reader MAY return n > 0, io.EOF on the final read. Safe with bufio.Reader today, but fragile if the reader changes. Standard pattern:
n, err := reader.Read(buffer)
if n > 0 { /* send chunk */ }
if errors.Is(err, io.EOF) { break }
if err != nil { return ... }6. Proxy chunk fan-out silently ignores downstream send errors
receive_http/debuginfo.go, chunk loop:
_ = ds.stream.Send(req) // error discardedA downstream that fails mid-upload continues receiving (and failing on) all subsequent chunks. Never removed from activeStreams, never logged.
Fix: Track per-stream errors atomically and either remove failed streams or log the failure.
7. Proxy: non-EOF receive errors treated as success
receive_http/debuginfo.go:
if err == io.EOF || err != nil {
break
}If upstream disconnects mid-stream (transport error, not EOF), the proxy breaks and returns nil — success. Partial uploads to downstreams are never flagged.
Fix: Differentiate io.EOF (normal) from other errors (propagate).
LOW / nits
parcadirectory name is now misleading — contains exclusively Pyroscope code. Follow-up rename recommended.readAtCloserSizereturning(0, nil)for non-Stater readers → permanent skip via retry cache. Pre-existing.- Fragile string matching:
err.Error() == "no backing file for anonymous memory"andinitResp.Reason == ReasonUploadInProgress. Pre-existing pattern. context.Background()in upload path (upload.go) — no cancellation possible for individual uploads.- No test for
stripTextSection=truepath (temp file +elfwriter.OnlyKeepDebug()).
Positive observations
- Clean dependency removal — all
buf.build/gen/go/parca-dev/parcagone from go.mod, go.sum, and source DebugInfoClients() []DebuginfoServiceClientis the right design for fan-out- Good integration tests with real HTTP/2 TLS test servers
- Thread safety properly handled throughout
- Safe defaults (
UploadEnabled: false) - Smart vdso skip
…ttp2 package (#5810) ## Summary Copy `http_config.go` from `prometheus/common/config` into a new local `promhttp2` package with minimal changes, so that we can modify the HTTP client behavior for the pyroscope write path in follow-up PRs. i.e. enabling h2c for #5779 Changes: - Copy `http_config.go` and its tests from prometheus/common into `promhttp2` package - Switch `write.go` to use `promhttp2.NewClientFromConfig` - Add a thin reflection wrapper for `NewOAuth2RoundTripper` to work around its unexported `*httpClientOptions` parameter - Add tests to detect upstream struct/signature drift on prometheus/common upgrades ## Test plan - [x] `go build ./internal/component/pyroscope/write/promhttp2/...` - [x] `go test ./internal/component/pyroscope/write/promhttp2/...` - [ ] CI passes 🤖 Generated with [Claude Code](https://claude.com/claude-code)
…onnect API Replace the old Parca gRPC-based debuginfo/symbol upload implementation with the new Pyroscope Connect bidirectional streaming API. This simplifies the upload protocol from a 4-step gRPC flow to a single bidi stream. Key changes: - New Connect-based PyroscopeSymbolUploader replacing ParcaSymbolUploader - Appender interface now exposes ConnectClient()/ConnectClients() instead of the old parca gRPC Client() - receive_http proxy fans out uploads to ALL downstream Connect clients - New undocumented symbol_upload_enabled config knob in pyroscope.ebpf (disabled by default, for internal use) - Removed all parca gRPC dependencies (buf.build/gen/go/parca-dev/parca) - Bumped github.com/grafana/pyroscope/api to debuginfo-upload branch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> refactor(pyroscope): Remove redundant ConnectClient() from Appender interface ConnectClients() (plural) already covers the single-client case, making ConnectClient() (singular) unnecessary. Remove it from the interface and all 10 implementations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> test(pyroscope): Add integration tests for PyroscopeSymbolUploader Add 7 tests exercising the uploader against a mock Connect server: - Success flow with data integrity verification - Server declining upload (cached in retry) - Upload-in-progress reason handling - Empty buildID fallback to fileID - Large file multi-chunk streaming (6.5MB → 3 chunks) - Dedup via in-progress tracker - End-to-end worker queue processing Also fix a bug where CloseResponse() after CloseRequest() canceled the HTTP/2 stream before the server finished reading buffered chunks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(pyroscope): Fix CI lint errors and update collector go.mod - Replace sync/atomic with go.uber.org/atomic (depguard) - Fix gofmt formatting in appender_mock.go, receive_http_test.go, relabel_test.go - Remove unused *httptest.Server return from startMockServer (unparam) - Run go mod tidy on collector module Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(pyroscope): Update extension/alloyengine go.mod for pyroscope/api dep Run generate-module-dependencies to sync extension/alloyengine module with updated pyroscope/api dependency and removal of parca deps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(pyroscope): Address PR review — don't put fileID hash in GnuBuildId When buildID is empty, pass it as empty GnuBuildId rather than substituting the fileID hash. The fileID already goes into OtelFileId. Update TestAttemptUpload_EmptyBuildID to verify both fields. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> test(pyroscope): Add integration tests for receive_http debuginfo proxy Add 5 tests for the debuginfo upload fan-out proxy in receive_http: - Single endpoint accepts upload: verifies init metadata and chunk data - Multiple endpoints all accept: all 3 receive identical data - Multiple endpoints all decline: proxy returns false, no chunks sent - Mixed accept/decline: only accepting endpoints receive chunks - No endpoints: returns connect.CodeUnavailable Also fix proxy bug: remove CloseResponse() on downstream streams that canceled HTTP/2 streams before servers finished reading buffered chunks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> feat(pyroscope): Add symb_cache_enabled config knob to pyroscope.ebpf Add undocumented symb_cache_enabled attribute (default true) to allow disabling the local irsymcache symbolizer. When false, the eBPF component skips creating the symbol cache and passes nil to the reporter, producing unsymbolized profiles. This is useful for testing server-side symbolization via debuginfo upload. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> chore(pyroscope): Bump pyroscope/api to downgrade-otel branch, revert otel SDK to v1.39.0 - Bump github.com/grafana/pyroscope/api to PR grafana#4897 revision which downgrades otel SDK dependency from v1.40.0 to v1.39.0 - Revert otel/sdk and otel/sdk/metric to v1.39.0 across all modules (go.mod, collector/go.mod, extension/alloyengine/go.mod) to fix runtime schema URL conflict between 1.37.0 and 1.39.0 - Rename parca_uploader_test.go to pyroscope_uploader_test.go Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(pyroscope): Add h2c transport for debuginfo upload and populate InitArguments Two fixes discovered during manual integration testing: 1. The Connect debuginfo client needs HTTP/2 for bidi streaming. For HTTPS this works via ALPN, but for plain HTTP endpoints we need h2c. Add newHTTP2Client() wrapper in pyroscope.write that creates an h2c- capable transport for the debuginfo Connect client. 2. Populate UploadJob.InitArguments with default values (CacheSize=1024, QueueSize=64, WorkerNum=4) in reportExecutableForDebugInfoUpload. Without these, the LRU cache fails with "capacity must be positive". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(pyroscope): Skip vdso upload in debuginfo uploader Virtual DSOs (linux-vdso, [vdso]) have no backing file and no build ID, so uploading them always fails. Skip them early in Upload() to avoid unnecessary error logs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> refactor(pyroscope): Address PR review comments - Remove unnecessary pyroscope_uploader_stub.go (old code never had one) - Move debuginfo InitArguments defaults to ebpf component args as a debug_info block, pass debuginfo.Arguments as-is to UploadJob - Fix h2c client: only use h2c transport for plain HTTP endpoints, pass base client through for HTTPS (HTTP/2 via ALPN) - Rename ConnectClients to DebugInfoClients across all interfaces and implementations Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> chore(pyroscope): Bump pyroscope/api to v1.3.2 release tag Update github.com/grafana/pyroscope/api from pseudo-version to the v1.3.2 release tag across all modules. This tag includes the otel SDK downgrade to v1.39.0, avoiding the runtime schema URL conflict. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> refactor(pyroscope): Remove h2c client, use HTTP/2 TLS in tests Remove the h2c transport from write.go — the debuginfo Connect client now uses the same HTTP client as the push client. For HTTPS endpoints, HTTP/2 is negotiated via ALPN automatically. Refactor proxy tests to use startProxyServer() which wraps the Component's Upload handler in an httptest TLS server with HTTP/2, eliminating the need for h2c in tests entirely. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(pyroscope): Add h2c transport for debuginfo upload Bidi streaming (debuginfo upload) requires HTTP/2. For HTTPS this works via ALPN, but both receive_http and pyroscope serve h2c (HTTP/2 over cleartext). Add newH2CClient() that uses an h2c transport for plain HTTP endpoints and passes through the base client for HTTPS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> refactor(pyroscope): Reuse base client settings in h2c transport The h2c transport now reuses the base HTTP client's DialContext (for proxy/timeout), Timeout, CheckRedirect, and Jar settings instead of creating an isolated client that ignores HTTPClientConfig. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> refactor(pyroscope): Clean up h2c client, document limitation Simplify newH2CClient and document that h2c is not supported by commonconfig.NewClientFromConfig or the Go standard library. Reference internal/service/cluster/cluster.go which uses the same pattern. For HTTPS endpoints the base client is returned as-is (HTTP/2 via ALPN). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(pyroscope): Explicitly set UploadEnabled=false in default DebugInfoArguments Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> refactor(pyroscope): Remove symbol_upload_enabled, use debug_info block Remove the standalone symbol_upload_enabled flag from pyroscope.ebpf args. The upload is now controlled by the upload attr in the debug_info block (DebugInfoArguments.UploadEnabled). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(pyroscope): Fix gofmt and stuttering function name - Fix gofmt alignment in appender_mock.go - Fix getDebugInfoDebugInfoClients -> getDebugInfoClients stutter Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
e44738f to
d64bb46
Compare
Summary
Replace the Parca gRPC-based debuginfo/symbol upload with the new Pyroscope Connect bidirectional streaming API (pyroscope#4648). This simplifies the upload protocol from a 4-step gRPC flow (
ShouldInitiateUpload→InitiateUpload→Upload→MarkUploadFinished) to a single bidi stream (Uploadwith init/chunk messages).ParcaSymbolUploaderwithPyroscopeSymbolUploaderusing Connect bidi streamingClient() debuginfogrpc.DebuginfoServiceClientwithConnectClient()/ConnectClients()across theAppenderinterface hierarchyreceive_httpdebuginfo proxy to fan out uploads to all downstream Connect clients (not just the first one)symbol_upload_enabledconfig knob topyroscope.ebpf(disabled by default, for internal use)buf.build/gen/go/parca-dev/parca)github.com/grafana/pyroscope/apito thedebuginfo-uploadbranchDetails
New upload protocol (Connect bidi stream)
The new protocol uses a single
UploadRPC with bidirectional streaming:ShouldInitiateUploadRequestwithFileMetadata(build ID, file name, type)ShouldInitiateUploadResponse(boolean + reason)UploadChunkmessages (3MB chunks)receive_http proxy fan-out
The
receive_httpcomponent now collects all Connect clients from all downstream appendables viaConnectClients()and replicates uploads to every endpoint that responds withshould_initiate_upload = true. Previously the proxy only forwarded to the first available gRPC client.Configuration
New undocumented attribute in
pyroscope.ebpf:When enabled,
ReportExecutablecallsreportExecutableForDebugInfoUploadwhich dispatches upload jobs through the appendable chain topyroscope.writeendpoints.Files changed
Deleted:
reporter/grpc_upload_client.go— old Parca gRPC streaming uploadreporter/parca_uploader.go— old Parca symbol uploaderwrite/debuginfo_client.go— old gRPC connection factoryreceive_http/grpc.go— old gRPC server helperCreated:
reporter/pyroscope_uploader.go— new Connect-based uploader (linux/amd64+arm64)reporter/pyroscope_uploader_stub.go— non-linux stubRenamed:
write/debuginfo/parca.go→write/debuginfo/upload.goTest plan
go build ./internal/component/pyroscope/...compilesgo test ./internal/component/pyroscope/...passes (all unit tests green)go mod tidyclean — no parca dependencies remainsymbol_upload_enabled = trueagainst a Pyroscope instance with debuginfo supportsymbol_upload_enabledis omitted (default false)🤖 Generated with Claude Code