gRPC batch without service name crashes the collector #1722

Closed
jpkrohling opened this issue Aug 8, 2019 · 3 comments · Fixed by #1723
@jpkrohling
Contributor

While developing the gRPC-based Jaeger span exporter for OpenTelemetry, I ended up with a batch that has no Process object. Sending such a batch to the collector via gRPC crashes the server with:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xbe67d1]

goroutine 187 [running]:
github.com/jaegertracing/jaeger/cmd/collector/app.metricsBySvc.ReportServiceNameForSpan(0xc0002d00f0, 0xc0002d0120, 0x1b893e0, 0xc0002638b0, 0xc0004a5368, 0xfa0, 0x13c3eb9, 0x8, 0xc0002d09c0, 0xc0002d1260, ...)
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/app/metrics.go:247 +0x41
github.com/jaegertracing/jaeger/cmd/collector/app.(*spanProcessor).enqueueSpan(0xc00052aea0, 0xc000308000, 0x13c0205, 0x5, 0x13bf162, 0x4, 0x0)
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/app/span_processor.go:145 +0x1b2
github.com/jaegertracing/jaeger/cmd/collector/app.(*spanProcessor).ProcessSpans(0xc00052aea0, 0xc0004b4038, 0x1, 0x1, 0x13c0205, 0x5, 0x13bf162, 0x4, 0xc0004f7a28, 0x9edddc, ...)
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/app/span_processor.go:129 +0x112
github.com/jaegertracing/jaeger/cmd/collector/app.(*GRPCHandler).PostSpans(0xc000394200, 0x1b85b60, 0xc00030e840, 0xc00052a5a0, 0xc000394200, 0xc00030e840, 0xc00060dba8)
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/cmd/collector/app/grpc_handler.go:46 +0x1c3
github.com/jaegertracing/jaeger/proto-gen/api_v2._CollectorService_PostSpans_Handler(0x11db5c0, 0xc000394200, 0x1b85b60, 0xc00030e840, 0xc00009c910, 0x0, 0x1b85b60, 0xc00030e840, 0xc0000909c0, 0x58)
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/proto-gen/api_v2/collector.pb.go:200 +0x23e
github.com/jaegertracing/jaeger/vendor/google.golang.org/grpc.(*Server).processUnaryRPC(0xc00021b980, 0x1b92fc0, 0xc000480a80, 0xc0000f4100, 0xc0001f3380, 0x26d7fb0, 0x0, 0x0, 0x0)
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/google.golang.org/grpc/server.go:972 +0x470
github.com/jaegertracing/jaeger/vendor/google.golang.org/grpc.(*Server).handleStream(0xc00021b980, 0x1b92fc0, 0xc000480a80, 0xc0000f4100, 0x0)
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/google.golang.org/grpc/server.go:1252 +0xda6
github.com/jaegertracing/jaeger/vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1(0xc0004a4214, 0xc00021b980, 0x1b92fc0, 0xc000480a80, 0xc0000f4100)
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/google.golang.org/grpc/server.go:691 +0x9f
created by github.com/jaegertracing/jaeger/vendor/google.golang.org/grpc.(*Server).serveStreams.func1
	/home/travis/gopath/src/github.com/jaegertracing/jaeger/vendor/google.golang.org/grpc/server.go:689 +0xa1

Of course, my OpenTelemetry code has to be fixed, but the collector should never crash because of bad data.
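
To illustrate the kind of guard this implies, here is a minimal sketch. The `Span` and `Process` types are simplified stand-ins for the Jaeger model, and `serviceNameOf` is a hypothetical helper, not the collector's actual code. It only shows reading the service name after checking for a nil `Process`, so that bad input is reported to the caller instead of causing a panic.

```go
package main

// Process and Span are simplified, hypothetical stand-ins for the
// Jaeger model types; only the fields needed for the example exist.
type Process struct {
	ServiceName string
}

type Span struct {
	OperationName string
	Process       *Process
}

// serviceNameOf returns the span's service name and true, or an empty
// string and false when the span or its Process is nil. Callers can
// then reject or count the span instead of dereferencing a nil pointer.
func serviceNameOf(span *Span) (string, bool) {
	if span == nil || span.Process == nil {
		return "", false
	}
	return span.Process.ServiceName, true
}
```

A caller on the metrics path could skip reporting, or report under a placeholder name, whenever the second return value is false. Which of the two is preferable is a design decision this sketch does not settle.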

@jpkrohling jpkrohling added the bug label Aug 8, 2019
@jpkrohling
Contributor Author

Once that panic is fixed, the same input triggers a second one, this time in the memory storage's WriteSpan:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x1020f6b]

goroutine 155 [running]:
github.com/jaegertracing/jaeger/plugin/storage/memory.(*Store).WriteSpan(0xc00052e000, 0xc0003f2000, 0x0, 0x0)
	/mnt/storage/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/plugin/storage/memory/memory.go:121 +0x7b
github.com/jaegertracing/jaeger/cmd/collector/app.(*spanProcessor).saveSpan(0xc0000a5200, 0xc0003f2000)
	/mnt/storage/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/cmd/collector/app/span_processor.go:113 +0x84
github.com/jaegertracing/jaeger/cmd/collector/app.ChainedProcessSpan.func1(0xc0003f2000)
	/mnt/storage/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/cmd/collector/app/model_consumer.go:34 +0x4d
github.com/jaegertracing/jaeger/cmd/collector/app.(*spanProcessor).processItemFromQueue(0xc0000a5200, 0xc0003e1860)
	/mnt/storage/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/cmd/collector/app/span_processor.go:139 +0x53
github.com/jaegertracing/jaeger/cmd/collector/app.NewSpanProcessor.func1(0x11d1c60, 0xc0003e1860)
	/mnt/storage/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/cmd/collector/app/span_processor.go:68 +0x45
github.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers.func1(0xc0001f0160, 0xc0004e7d80, 0xc0003e2f10)
	/mnt/storage/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/pkg/queue/bounded_queue.go:65 +0xb1
created by github.com/jaegertracing/jaeger/pkg/queue.(*BoundedQueue).StartConsumers
	/mnt/storage/jpkroehling/Projects/src/github.com/jaegertracing/jaeger/pkg/queue/bounded_queue.go:58 +0xae
exit status 2

@pavolloffay
Member

Maybe the process struct should never be nil.

@yurishkuro
Member

The business rule we need to enforce is: either spans have individual Process entries, or the batch has one they can fall back to. Both being null is invalid input.
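
A rough sketch of that rule, assuming simplified stand-in types rather than the real model (`Batch`, `Span`, `Process`, `normalizeBatch` and `errMissingProcess` are all hypothetical names for illustration): validate the whole batch first, then give each span without its own Process the batch-level one.

```go
package main

import "errors"

// Simplified, hypothetical stand-ins for the Jaeger model types.
type Process struct {
	ServiceName string
}

type Span struct {
	Process *Process
}

type Batch struct {
	Process *Process
	Spans   []*Span
}

var errMissingProcess = errors.New("span has no Process and the batch has none to fall back to")

// normalizeBatch enforces the rule: every span must end up with a
// Process, either its own or the batch's. It checks the whole batch
// before modifying anything, so a rejected batch is left untouched.
func normalizeBatch(b *Batch) error {
	if b == nil {
		return errors.New("nil batch")
	}
	for _, s := range b.Spans {
		if s == nil {
			return errors.New("nil span in batch")
		}
		if s.Process == nil && b.Process == nil {
			return errMissingProcess
		}
	}
	// All spans are valid; apply the batch-level fallback where needed.
	for _, s := range b.Spans {
		if s.Process == nil {
			s.Process = b.Process
		}
	}
	return nil
}
```

If a check like this ran at the receiving end, the handler could presumably return an error to the client for invalid input, and later stages such as the span processor and storage would never see a span with a nil Process.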
