Observation:
Traces are being dropped in one of our internal systems running on a PHP backend. This happens when the span/nested-span structure exceeds rawSpansGrouperConfig.maxSpanCount, which is used internally here: refer
We tried isolating the internal method line at which spans were getting dropped; the count of spans generated per route is around 1100, including nested spans.
Progress so far (see the command sketches after this list):
We looked at the configs of jaeger-agent and otel-collector; no significant spikes were seen in pod memory or CPU.
Next we checked the MTU limit on the pod, which uses jumbo frames at 9001 bytes (see "Network maximum transmission unit (MTU) for your EC2 instance"); this did not look like a bottleneck either.
Next we captured the network traffic on our pod using tshark and could see our spans in the dump, which confirmed that spans on the app side are started and closed properly.
We also ran our service for some routes pointing to a local jaeger all-in-one, but there we were able to see 400+ spans.
Finally, we tried increasing max.span.count on RSG from 250 to 2000 and re-sent a sample with ~1070 spans (1.3 MB payload); this time we were able to see the spans in our environment.
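For reference, the checks above were roughly along these lines; the interface name, namespace, and agent port are placeholders or jaeger defaults, not confirmed values from our setup:

```sh
# Pod memory/CPU (no spikes observed); requires metrics-server.
kubectl top pod -n <namespace>

# MTU on the pod's network interface (showed jumbo frames, mtu 9001).
ip link show eth0

# Capture agent traffic to confirm spans actually leave the app;
# 6831/udp is the default jaeger-agent compact-thrift port.
tshark -i eth0 -f "udp port 6831" -w /tmp/spans.pcap

# Local repro target: jaeger all-in-one (16686 = UI, 6831/udp = agent).
docker run --rm -p 16686:16686 -p 6831:6831/udp jaegertracing/all-in-one:latest
```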
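And this is roughly the override we applied in the final step; the exact key path is an assumption based on the rawSpansGrouperConfig.maxSpanCount name above and may differ per deployment:

```yaml
# Hypothetical values override; key path assumed from
# rawSpansGrouperConfig.maxSpanCount, not taken from the actual chart.
rawSpansGrouperConfig:
  maxSpanCount: 2000   # raised from 250
```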
The issue is that once the max count is reached, RSG should truncate the spans, but instead the entire trace payload is getting dropped.
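To make the expected contract concrete, here is a minimal sketch contrasting the two behaviors; the class and method names are hypothetical, and this is not the actual raw-spans-grouper code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; NOT the real RSG implementation.
class SpanGrouperSketch {
    private final int maxSpanCount;

    SpanGrouperSketch(int maxSpanCount) {
        this.maxSpanCount = maxSpanCount;
    }

    // Observed behavior: once a trace's span count exceeds maxSpanCount,
    // the whole trace disappears from the backend.
    List<String> dropWholeTrace(List<String> spans) {
        if (spans.size() > maxSpanCount) {
            return new ArrayList<>(); // entire trace payload lost
        }
        return spans;
    }

    // Expected behavior: keep the first maxSpanCount spans and drop the
    // overflow, so at least a truncated trace remains visible.
    List<String> truncateTrace(List<String> spans) {
        if (spans.size() > maxSpanCount) {
            return new ArrayList<>(spans.subList(0, maxSpanCount));
        }
        return spans;
    }
}
```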