
[receiver/tcplog] Add metrics to track payload size and connections created/closed #45204

Closed
anubhav21sharma wants to merge 2 commits into open-telemetry:main from anubhav21sharma:tcplog-receiver-new-metrics


@anubhav21sharma
Contributor

Description

This PR adds three new opt-in metrics to the tcplog receiver. They allow users to track the number of new TCP connections created, the number of TCP connections closed, and the size distribution of the payloads arriving at the receiver.

Metrics from a sample run:

# HELP otelcol_tcplog_receiver_connections_closed_total Total number of connections closed by the tcp log receiver
# TYPE otelcol_tcplog_receiver_connections_closed_total counter
otelcol_tcplog_receiver_connections_closed_total{client_address="localhost",otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version=""} 1

# HELP otelcol_tcplog_receiver_connections_created_total Total number of connections created by the tcp log receiver
# TYPE otelcol_tcplog_receiver_connections_created_total counter
otelcol_tcplog_receiver_connections_created_total{client_address="localhost",otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version=""} 1

# HELP otelcol_tcplog_receiver_payload_size_bytes Size of the payload size received by the tcp log receiver
# TYPE otelcol_tcplog_receiver_payload_size_bytes histogram
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="64"} 0
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="128"} 0
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="256"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="512"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="1024"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="2048"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="4096"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="8192"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="16384"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="32768"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="65536"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="524288"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="1.048576e+06"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="2.097152e+06"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="4.194304e+06"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="8.388608e+06"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="1.6777216e+07"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="3.3554432e+07"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="6.7108864e+07"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="1.34217728e+08"} 17390
otelcol_tcplog_receiver_payload_size_bytes_bucket{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version="",le="+Inf"} 17390
otelcol_tcplog_receiver_payload_size_bytes_sum{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version=""} 4.43445e+06
otelcol_tcplog_receiver_payload_size_bytes_count{otel_scope_name="github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp",otel_scope_schema_url="",otel_scope_version=""} 17390
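
As a quick sanity check on the sample output above, the histogram's `_sum` and `_count` series give the mean payload size, and the cumulative buckets show where the observations fall. The following is a minimal sketch that hardcodes the values from the sample run. The `bucket` type and the `meanPayload` and `firstNonEmptyBucket` helpers are illustrative names, not part of the receiver.

```go
package main

import "fmt"

// bucket is one cumulative Prometheus histogram bucket: count is the number
// of observations less than or equal to the upper bound le.
type bucket struct {
	le    float64
	count uint64
}

// meanPayload returns sum/count, or 0 when nothing was observed.
func meanPayload(sum float64, count uint64) float64 {
	if count == 0 {
		return 0
	}
	return sum / float64(count)
}

// firstNonEmptyBucket returns the upper bound of the first bucket whose
// cumulative count is non-zero, and false if every bucket is empty.
func firstNonEmptyBucket(buckets []bucket) (float64, bool) {
	for _, b := range buckets {
		if b.count > 0 {
			return b.le, true
		}
	}
	return 0, false
}

func main() {
	// Values copied from the sample run (first four buckets only).
	buckets := []bucket{{64, 0}, {128, 0}, {256, 17390}, {512, 17390}}
	mean := meanPayload(4.43445e+06, 17390)
	le, _ := firstNonEmptyBucket(buckets)
	fmt.Printf("mean payload: %.0f bytes, first non-empty bucket: le=%.0f\n", mean, le)
}
```

For this run the mean works out to 255 bytes, which agrees with every observation landing in the `le="256"` bucket.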

Link to tracking issue

Fixes #45146

Testing

New test cases added to ensure metrics are being correctly populated.

Contributor

@thompson-tomo thompson-tomo left a comment


It would be nice if these were generic collector metrics, similar to https://opentelemetry.io/docs/collector/internal-telemetry/

if c.Metrics.Enabled {
	meter := set.MeterProvider.Meter("github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/input/tcp")
	if metricPayloadSize, err = meter.Int64Histogram(
		"otelcol_tcplog_receiver_payload_size_bytes",

Suggested change
- "otelcol_tcplog_receiver_payload_size_bytes",
+ "otelcol_receiver_payload_size_bytes",

Or, to follow semantic conventions on namespacing:

Suggested change
- "otelcol_tcplog_receiver_payload_size_bytes",
+ "otel.col.receiver.payload.size",

Or, following additional semconv patterns:

Suggested change
- "otelcol_tcplog_receiver_payload_size_bytes",
+ "otel.col.receiver.network.io",

If we add an otel.component.type attribute capturing that it is a collector receiver, we could even have:

Suggested change
- "otelcol_tcplog_receiver_payload_size_bytes",
+ "otel.col.network.io",

All options use attributes to identify the receiver, i.e. otel.component.name, as well as the network direction.

}

if metricConnectionsCreated, err = meter.Int64Counter(
	"otelcol_tcplog_receiver_connections_created_total",

Suggested change
- "otelcol_tcplog_receiver_connections_created_total",
+ "otelcol_receiver_connections_created_total",

Or, to follow semantic conventions on namespacing:

Suggested change
- "otelcol_tcplog_receiver_connections_created_total",
+ "otel.col.receiver.network.connection.created",

Or even the following, which fits better with other conventions and allows consolidation:

Suggested change
- "otelcol_tcplog_receiver_connections_created_total",
+ "otel.col.receiver.network.connection.status",

With the network.connection.state attribute set to established.

If we add an otel.component.type attribute capturing that it is a collector receiver, we could even have:

Suggested change
- "otelcol_tcplog_receiver_connections_created_total",
+ "otel.col.network.connection.status",

All options use attributes to identify the receiver, i.e. otel.component.name.

}

if metricConnectionsClosed, err = meter.Int64Counter(
	"otelcol_tcplog_receiver_connections_closed_total",

Suggested change
- "otelcol_tcplog_receiver_connections_closed_total",
+ "otelcol_receiver_connections_closed_total",

Or, to follow semantic conventions on namespacing:

Suggested change
- "otelcol_tcplog_receiver_connections_closed_total",
+ "otel.col.receiver.network.connection.closed",

Or even the following, which fits better with other conventions and allows consolidation:

Suggested change
- "otelcol_tcplog_receiver_connections_closed_total",
+ "otel.col.receiver.network.connection.status",

With the network.connection.state attribute set to closed.

If we add an otel.component.type attribute capturing that it is a collector receiver, we could even have:

Suggested change
- "otelcol_tcplog_receiver_connections_closed_total",
+ "otel.col.network.connection.status",

All options use attributes to identify the receiver, i.e. otel.component.name.
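
To make the consolidation suggested above concrete, here is a rough sketch of one counter whose series are distinguished by attributes instead of by metric name. It is a standard-library model only, not the OpenTelemetry Go API: the `attrs` and `counter` types are hypothetical stand-ins, and the metric and attribute names are simply the ones proposed in this review.

```go
package main

import (
	"fmt"
	"sync"
)

// attrs is a hypothetical, simplified stand-in for an attribute set.
type attrs struct {
	componentName   string // models otel.component.name, e.g. "tcplog"
	connectionState string // models network.connection.state
}

// counter models a single monotonic instrument whose series are keyed by
// their attributes rather than by separate metric names.
type counter struct {
	mu     sync.Mutex
	name   string
	series map[attrs]int64
}

func newCounter(name string) *counter {
	return &counter{name: name, series: make(map[attrs]int64)}
}

// Add increments the series identified by a.
func (c *counter) Add(n int64, a attrs) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.series[a] += n
}

// Value returns the current value of the series identified by a.
func (c *counter) Value(a attrs) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.series[a]
}

func main() {
	status := newCounter("otel.col.network.connection.status")
	established := attrs{componentName: "tcplog", connectionState: "established"}
	closed := attrs{componentName: "tcplog", connectionState: "closed"}

	status.Add(1, established)
	status.Add(1, established)
	status.Add(1, closed)

	// Both states live under one metric, so currently open connections
	// can be derived from two series of the same name.
	open := status.Value(established) - status.Value(closed)
	fmt.Println(status.name, "open connections:", open)
}
```

With one metric name, a second receiver only needs a different component name value; no new metric definition is required.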

@anubhav21sharma
Contributor Author

> It would be nice if these were generic collector metrics, similar to https://opentelemetry.io/docs/collector/internal-telemetry/

Thank you @thompson-tomo for your inputs!

Do you mean to create these as standard metrics that are available for all receivers, i.e. add them as metrics to the receiverhelper package in the opentelemetry-collector repo?

Please let me know if I misunderstood this. Thank you!

@thompson-tomo
Contributor

That is correct. Note the following RFC, open-telemetry/opentelemetry-collector#11406, which provides more guidance on naming.

@anubhav21sharma
Contributor Author

I appreciate the input on naming conventions. I will change the metric names and attributes as per the RFC.

About the standardization of these metrics: I'm a bit hesitant to treat these as 'general' receiver metrics because, in my opinion, they don't map well to receivers that do not do any network I/O. For example, many receivers such as filelog, journald, and hostmetrics don't receive (or pull) data over the network, so the connection metrics wouldn't apply. Also, defining 'payload size' universally for all receivers might be tricky, since it is very contextual per receiver (and it might be a nonsensical measure for a few receivers, like the hostmetrics receiver). For such receivers, I believe exposing a standard metric like otel.col.network.connection.status, albeit with a zero value, creates noise.

I'm relatively new to this codebase and would definitely rely on the judgement of more experienced folks here, but I'm leaning slightly towards making this change only for the tcplog receiver, for the reasons given above.

But if we decide to move forward with the standardization of these metrics, I believe we need to cancel this PR and open a new issue in the core opentelemetry collector repo to discuss this further. Please let me know your thoughts. Really appreciate all your inputs!

@thompson-tomo
Contributor

thompson-tomo commented Jan 2, 2026

While I agree with you that the network namespace is not applicable to all receivers/components, I do think it is broadly applicable to a large number of them, and hence I prefer to facilitate reuse rather than duplication.

Looking at the RFC, we could do the following:

  • otelcol.receiver.consumed.size for the size metrics. This would be consistent with the RFC but differ from semconv. To me, the name otelcol.receiver.io with an otelcol.io.direction attribute would be a better fit.
  • otelcol.network.connection.status still seems a good fit: if no connection is used, the metric is simply omitted by the receiver. Otherwise you would create more noise, with each receiver having its own definition. It also fits with semconv.

It would be good if we kicked off a discussion about aligning internal telemetry definitions with semconv as that would resolve the naming question in point 1.
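
The "simply omitted" point can be illustrated with a small model: a shared metric definition only produces a series for a component once that component records a value, so receivers without network I/O add no zero-valued entries. This is a sketch under that assumption using only the standard library. `sharedCounter` is a hypothetical type, not collector code, and the exact export behaviour of the real SDK should be checked separately.

```go
package main

import (
	"fmt"
	"sort"
)

// sharedCounter is a hypothetical model of one metric definition shared by
// all receivers. A series exists only for components that recorded a value.
type sharedCounter struct {
	name   string
	series map[string]int64 // keyed by the otel.component.name value
}

func newSharedCounter(name string) *sharedCounter {
	return &sharedCounter{name: name, series: make(map[string]int64)}
}

// Add records n for the given component, creating its series on first use.
func (c *sharedCounter) Add(component string, n int64) {
	c.series[component] += n
}

// Export renders one line per existing series, sorted by component name.
func (c *sharedCounter) Export() []string {
	names := make([]string, 0, len(c.series))
	for name := range c.series {
		names = append(names, name)
	}
	sort.Strings(names)
	lines := make([]string, 0, len(names))
	for _, name := range names {
		lines = append(lines, fmt.Sprintf("%s{otel.component.name=%q} %d", c.name, name, c.series[name]))
	}
	return lines
}

func main() {
	status := newSharedCounter("otelcol.network.connection.status")

	// The tcplog receiver handles connections, so it records.
	status.Add("tcplog", 3)
	// A receiver such as hostmetrics never calls Add, so it has no series.

	for _, line := range status.Export() {
		fmt.Println(line)
	}
}
```

In this model only the tcplog series is exported. Nothing is emitted for receivers that never touch the metric.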

@anubhav21sharma
Contributor Author

Sounds fair. I'll cancel this PR then and raise a new issue to discuss this in the opentelemetry-collector repo.


Development

Successfully merging this pull request may close these issues.

[receiver/tcplog] Add internal metric to observe the payload size distribution

3 participants