Observation:
Traces are being dropped in one of our internal systems running on a PHP backend. This happens when the span/nested-span structure exceeds rawSpansGrouperConfig.maxSpanCount, which is used internally here: refer
We tried isolating the internal method line at which spans were getting dropped; the count of spans generated per route is around 1100, including nested spans.
Progress so far (see the command sketches after this list):
We looked at the configs of jaeger-agent and otel-collector; no significant spikes were seen in pod memory or CPU.
Next we checked the MTU limit on the pod, which uses jumbo frames at 9001 bytes (see "Network maximum transmission unit (MTU) for your EC2 instance"); this did not look like a bottleneck either.
Next we captured the network traffic on our pod using tshark and could see our spans in the dump, which confirmed that spans on the app side are started and closed properly.
We also ran our service for some routes pointing to a local jaeger all-in-one, but there we were able to see 400+ spans.
Finally, we tried increasing max.span.count on RSG from 250 to 2000 and re-sent a sample with ~1070 spans (1.3 MB payload); this time we were able to see the spans in our environment.
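For reference, the checks above were roughly along these lines; the interface name, namespace, and agent port are placeholders or jaeger defaults, not confirmed values from our setup:

```sh
# Pod memory/CPU (no spikes observed); requires metrics-server.
kubectl top pod -n <namespace>

# MTU on the pod's network interface (showed jumbo frames, mtu 9001).
ip link show eth0

# Capture agent traffic to confirm spans actually leave the app;
# 6831/udp is the default jaeger-agent compact-thrift port.
tshark -i eth0 -f "udp port 6831" -w /tmp/spans.pcap

# Local repro target: jaeger all-in-one (16686 = UI, 6831/udp = agent).
docker run --rm -p 16686:16686 -p 6831:6831/udp jaegertracing/all-in-one:latest
```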
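And this is roughly the override we applied in the final step; the exact key path is an assumption based on the rawSpansGrouperConfig.maxSpanCount name above and may differ per deployment:

```yaml
# Hypothetical values override; key path assumed from
# rawSpansGrouperConfig.maxSpanCount, not taken from the actual chart.
rawSpansGrouperConfig:
  maxSpanCount: 2000   # raised from 250
```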
The issue is that once the max count is reached, RSG should truncate the spans, but instead the entire trace payload is getting dropped.
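To make the expected contract concrete, here is a minimal sketch contrasting the two behaviors; the class and method names are hypothetical, and this is not the actual raw-spans-grouper code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; NOT the real RSG implementation.
class SpanGrouperSketch {
    private final int maxSpanCount;

    SpanGrouperSketch(int maxSpanCount) {
        this.maxSpanCount = maxSpanCount;
    }

    // Observed behavior: once a trace's span count exceeds maxSpanCount,
    // the whole trace disappears from the backend.
    List<String> dropWholeTrace(List<String> spans) {
        if (spans.size() > maxSpanCount) {
            return new ArrayList<>(); // entire trace payload lost
        }
        return spans;
    }

    // Expected behavior: keep the first maxSpanCount spans and drop the
    // overflow, so at least a truncated trace remains visible.
    List<String> truncateTrace(List<String> spans) {
        if (spans.size() > maxSpanCount) {
            return new ArrayList<>(spans.subList(0, maxSpanCount));
        }
        return spans;
    }
}
```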