Skip to content

fix for ignoring recording status of spans#136978

Closed
jackshirazi wants to merge 1 commit intoelastic:mainfrom
jackshirazi:main
Closed

fix for ignoring recording status of spans#136978
jackshirazi wants to merge 1 commit intoelastic:mainfrom
jackshirazi:main

Conversation

@jackshirazi
Copy link
Contributor

  1. APMTracer receives the parent task's trace context and sets that as the parent of a new span.
            final Context parentContext = getParentContext(traceContext);
                spanBuilder.setParent(parentContext);
  1. APMTracer requests a new span, setting the start time to the same as the transaction start time
            Instant startTime = traceContext.getTransient(Task.TRACE_START_TIME);
                spanBuilder.setStartTimestamp(startTime);
            final Span span = spanBuilder.startSpan();
  1. The span is created but is too many for the transaction because transaction_max_spans is exceeded for this span, so it is set as a non-recording span
  2. APMTracer stores the Context containing the span in its spans map, ie this new span is now the next parent span so will be stored as the task's trace context
  • this is incorrect because the span is a non-recording span, so should be discarded
  • the result of the incorrect approach is that you can have a chain of descendent spans in a transaction which can be unlimited in size, and all with the same start time

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v9.3.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Oct 22, 2025
@mosche mosche self-assigned this Oct 23, 2025
@mosche mosche added auto-backport Automatically create backport pull requests when merged :Core/Infra/Metrics Metrics and metering infrastructure v9.2.1 v8.19.7 v9.1.7 and removed needs:triage Requires assignment of a team area label labels Oct 23, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Oct 23, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

final Span span = spanBuilder.startSpan();
// If the agent decided not to record this span (e.g., due to transaction_max_spans), isRecording() will be false.
if (span.isRecording() == false) {
logger.warn("Elastic APM agent has reached the transaction_max_spans limit. Span [{}] [{}] will not be recorded.", spanId, spanName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately that would spam our logs massively thinking about the 500k discarded spams we've just seen.
I'll have a closer look, I think we need to track when we transition to not recording in the context somehow.

@mosche
Copy link
Contributor

mosche commented Oct 23, 2025

Thanks a lot and great find @jackshirazi 💚
I've opened #137015 including your fix to address the necessary changes in tests needed for this. (Since this PR was opened against your main branch of your fork, I wasn't able to push to this PR)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged :Core/Infra/Metrics Metrics and metering infrastructure external-contributor Pull request authored by a developer outside the Elasticsearch team Team:Core/Infra Meta label for core/infra team v8.19.7 v9.1.7 v9.2.1 v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments