[otelhttp] transport metrics #3769

Open. Wants to merge 26 commits into base branch main.
Commits (26):
2a075fd - otelhttp transport metrics (RangelReale, May 3, 2023)
9710398 - get attributes (RangelReale, May 3, 2023)
706a58a - changelog (RangelReale, May 4, 2023)
0c1ecad - add transport test (RangelReale, May 4, 2023)
6c046de - fix lint (RangelReale, May 4, 2023)
faed864 - try data race fix (RangelReale, May 8, 2023)
9304781 - use atomic instead of mutex (RangelReale, May 9, 2023)
f66d205 - fix grammar (RangelReale, May 10, 2023)
0d2323b - use semconv for http response (RangelReale, May 10, 2023)
d073e3c - don't add status code if error (RangelReale, May 17, 2023)
3602fb7 - remove unused import (RangelReale, May 17, 2023)
50b6074 - Merge branch 'main' into otelhttp-transport-metrics (RangelReale, Jul 27, 2023)
b4716e0 - fix conflicts (RangelReale, Jul 29, 2023)
4e23144 - Merge branch 'main' into otelhttp-transport-metrics (pellared, Jul 31, 2023)
250299e - Update CHANGELOG.md (pellared, Jul 31, 2023)
f93b5c1 - Update CHANGELOG.md (pellared, Jul 31, 2023)
bf3503f - fix lint (RangelReale, Aug 8, 2023)
eafcc75 - Merge branch 'main' into otelhttp-transport-metrics (RangelReale, Aug 8, 2023)
9fc2629 - fix changelog (RangelReale, Aug 8, 2023)
359859c - remove check when error (RangelReale, Aug 8, 2023)
45ceb1d - Update CHANGELOG.md (pellared, Aug 10, 2023)
165ee4b - Update instrumentation/net/http/otelhttp/common.go (RangelReale, Aug 15, 2023)
09066a8 - review fixes (RangelReale, Aug 15, 2023)
75c7903 - Merge remote-tracking branch 'origin/otelhttp-transport-metrics' into… (RangelReale, Aug 15, 2023)
4df192e - Merge branch 'main' into otelhttp-transport-metrics (RangelReale, Aug 15, 2023)
e28520c - Merge branch 'main' into otelhttp-transport-metrics (RangelReale, Aug 22, 2023)

1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -15,6 +15,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html)
- The `go.opentelemetry.io/contrib/exporters/autoexport` package to provide configuration of trace exporters with useful defaults and envar support. (#2753, #4100, #4129, #4132, #4134)
- `WithRouteTag` in `go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp` adds HTTP route attribute to metrics. (#615)
- Add `WithSpanOptions` option in `go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc`. (#3768)
- Add metrics to `NewTransport` in `go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp`. (#3769)

### Fixed

7 changes: 7 additions & 0 deletions instrumentation/net/http/otelhttp/common.go
@@ -37,6 +37,13 @@ const (
ServerLatency = "http.server.duration" // Incoming end to end duration, milliseconds
)

// Client HTTP metrics.
const (
ClientRequestContentLength = "http.client.request_content_length" // Outgoing request bytes total
ClientResponseContentLength = "http.client.response_content_length" // Outgoing response bytes total
ClientLatency = "http.client.duration" // Outgoing end to end duration, milliseconds
)

// Filter is a predicate used to determine whether a given http.request should
// be traced. A Filter must return true if the request should be traced.
type Filter func(*http.Request) bool
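
For context on how these new constants surface to users, here is a minimal sketch of wiring the instrumented transport to a meter provider. It mirrors the transport test added later in this PR; the target URL and the choice of a manual reader are illustrative assumptions rather than part of the change:

package main

import (
	"context"
	"fmt"
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel/sdk/metric"
	"go.opentelemetry.io/otel/sdk/metric/metricdata"
)

func main() {
	// A manual reader keeps the example self-contained; a real service would
	// more likely use a periodic reader with an exporter.
	reader := metric.NewManualReader()
	meterProvider := metric.NewMeterProvider(metric.WithReader(reader))

	// Wrap the default transport so every request records the client metrics
	// declared above (request size, response size, duration).
	client := http.Client{
		Transport: otelhttp.NewTransport(
			http.DefaultTransport,
			otelhttp.WithMeterProvider(meterProvider),
		),
	}

	resp, err := client.Get("https://example.com/") // illustrative URL
	if err == nil {
		resp.Body.Close()
	}

	// Collect and print the names of the recorded client metrics.
	var rm metricdata.ResourceMetrics
	if err := reader.Collect(context.Background(), &rm); err == nil {
		for _, sm := range rm.ScopeMetrics {
			for _, m := range sm.Metrics {
				fmt.Println(m.Name)
			}
		}
	}
}
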
4 changes: 2 additions & 2 deletions instrumentation/net/http/otelhttp/handler.go
@@ -216,15 +216,15 @@ func (h *middleware) serveHTTP(w http.ResponseWriter, r *http.Request, next http

next.ServeHTTP(w, r.WithContext(ctx))

setAfterServeAttributes(span, bw.read, rww.written, rww.statusCode, bw.err, rww.err)
setAfterServeAttributes(span, bw.read.Load(), rww.written, rww.statusCode, bw.err, rww.err)

// Add metrics
attributes := append(labeler.Get(), semconvutil.HTTPServerRequest(h.server, r)...)
if rww.statusCode > 0 {
attributes = append(attributes, semconv.HTTPStatusCode(rww.statusCode))
}
o := metric.WithAttributes(attributes...)
h.counters[RequestContentLength].Add(ctx, bw.read, o)
h.counters[RequestContentLength].Add(ctx, bw.read.Load(), o)
h.counters[ResponseContentLength].Add(ctx, rww.written, o)

// Use floating point division here for higher precision (instead of Millisecond method).
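
As an aside on the precision comment in the hunk above ("Use floating point division here for higher precision"), a tiny standalone illustration with a made-up duration:

package main

import (
	"fmt"
	"time"
)

func main() {
	d := 1537 * time.Microsecond
	fmt.Println(d.Milliseconds())                       // integer truncation: 1
	fmt.Println(float64(d) / float64(time.Millisecond)) // keeps precision: 1.537
}
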
106 changes: 106 additions & 0 deletions instrumentation/net/http/otelhttp/test/transport_test.go
@@ -16,22 +16,31 @@ package test

import (
"context"
"fmt"
"io"
"net"
"net/http"
"net/http/httptest"
"net/http/httptrace"
"runtime"
"strconv"
"strings"
"testing"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"

"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/codes"
"go.opentelemetry.io/otel/propagation"
"go.opentelemetry.io/otel/sdk/instrumentation"
"go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/metric/metricdata"
"go.opentelemetry.io/otel/sdk/metric/metricdata/metricdatatest"
sdktrace "go.opentelemetry.io/otel/sdk/trace"
"go.opentelemetry.io/otel/sdk/trace/tracetest"
semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
"go.opentelemetry.io/otel/trace"
)

@@ -238,3 +247,100 @@ func TestWithHTTPTrace(t *testing.T) {
assert.Equal(t, spans[2].SpanContext().SpanID(), spans[0].Parent().SpanID())
assert.Equal(t, spans[1].SpanContext().SpanID(), spans[2].Parent().SpanID())
}

func TestTransportMetrics(t *testing.T) {
reader := metric.NewManualReader()
meterProvider := metric.NewMeterProvider(metric.WithReader(reader))

content := []byte("Hello, world!")

ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
if _, err := w.Write(content); err != nil {
t.Fatal(err)
}
}))
defer ts.Close()

r, err := http.NewRequest(http.MethodGet, ts.URL, nil)
if err != nil {
t.Fatal(err)
}

tr := otelhttp.NewTransport(
http.DefaultTransport,
otelhttp.WithMeterProvider(meterProvider),
)

c := http.Client{Transport: tr}
res, err := c.Do(r)
if err != nil {
t.Fatal(err)
}
require.NoError(t, res.Body.Close())

host, portStr, _ := net.SplitHostPort(r.Host)
if host == "" {
host = "127.0.0.1"
}
port, err := strconv.Atoi(portStr)
if err != nil {
port = 0
}

rm := metricdata.ResourceMetrics{}
err = reader.Collect(context.Background(), &rm)
require.NoError(t, err)
require.Len(t, rm.ScopeMetrics, 1)
attrs := attribute.NewSet(
semconv.NetPeerName(host),
semconv.NetPeerPort(port),
semconv.HTTPURL(ts.URL),
semconv.HTTPFlavorKey.String(fmt.Sprintf("1.%d", r.ProtoMinor)),
semconv.HTTPMethod("GET"),
semconv.HTTPResponseContentLength(13),
semconv.HTTPStatusCode(200),
)
assertClientScopeMetrics(t, rm.ScopeMetrics[0], attrs)
}

func assertClientScopeMetrics(t *testing.T, sm metricdata.ScopeMetrics, attrs attribute.Set) {
assert.Equal(t, instrumentation.Scope{
Name: "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp",
Version: otelhttp.Version(),
}, sm.Scope)

require.Len(t, sm.Metrics, 3)

want := metricdata.Metrics{
Name: "http.client.request_content_length",
Data: metricdata.Sum[int64]{
DataPoints: []metricdata.DataPoint[int64]{{Attributes: attrs, Value: 0}},
Temporality: metricdata.CumulativeTemporality,
IsMonotonic: true,
},
}
metricdatatest.AssertEqual(t, want, sm.Metrics[0], metricdatatest.IgnoreTimestamp())

want = metricdata.Metrics{
Name: "http.client.response_content_length",
Data: metricdata.Sum[int64]{
DataPoints: []metricdata.DataPoint[int64]{{Attributes: attrs, Value: 13}},
Temporality: metricdata.CumulativeTemporality,
IsMonotonic: true,
},
}
metricdatatest.AssertEqual(t, want, sm.Metrics[1], metricdatatest.IgnoreTimestamp())

// Duration value is not predictable.
dur := sm.Metrics[2]
assert.Equal(t, "http.client.duration", dur.Name)
require.IsType(t, dur.Data, metricdata.Histogram[float64]{})
hist := dur.Data.(metricdata.Histogram[float64])
assert.Equal(t, metricdata.CumulativeTemporality, hist.Temporality)
require.Len(t, hist.DataPoints, 1)
dPt := hist.DataPoints[0]
assert.Equal(t, attrs, dPt.Attributes, "attributes")
assert.Equal(t, uint64(1), dPt.Count, "count")
assert.Equal(t, []float64{0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000}, dPt.Bounds, "bounds")
}
94 changes: 84 additions & 10 deletions instrumentation/net/http/otelhttp/transport.go
@@ -19,10 +19,13 @@ import (
"io"
"net/http"
"net/http/httptrace"
"time"

"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp/internal/semconvutil"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/codes"
"go.opentelemetry.io/otel/metric"
"go.opentelemetry.io/otel/propagation"
"go.opentelemetry.io/otel/trace"
)
@@ -32,12 +32,18 @@ import (
type Transport struct {
rt http.RoundTripper

tracer trace.Tracer
propagators propagation.TextMapPropagator
spanStartOptions []trace.SpanStartOption
filters []Filter
spanNameFormatter func(string, *http.Request) string
clientTrace func(context.Context) *httptrace.ClientTrace
tracer trace.Tracer
meter metric.Meter
propagators propagation.TextMapPropagator
spanStartOptions []trace.SpanStartOption
readEvent bool
filters []Filter
spanNameFormatter func(string, *http.Request) string
clientTrace func(context.Context) *httptrace.ClientTrace
getRequestAttributes func(*http.Request) []attribute.KeyValue
getResponseAttributes func(response *http.Response) []attribute.KeyValue
counters map[string]metric.Int64Counter
valueRecorders map[string]metric.Float64Histogram

Review comment on lines +48 to +49 (Member): Why are we using maps here? How about simply having multiple instrument fields (many metric.Int64Counter and metric.Float64Histogram fields) instead of these two maps?

Reply (Contributor, Author): The server handler does the same thing; this pattern seems to be used in a lot of instrumentations.

}

var _ http.RoundTripper = &Transport{}
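
For reference, the dedicated-fields shape suggested in the review comment above might look roughly like this. The struct and constructor names are assumptions, and the sketch presumes it lives in the otelhttp package so it can reuse the existing constants and handleErr helper:

// Sketch of the suggested alternative: dedicated instrument fields instead of
// the counters/valueRecorders maps.
type transportInstruments struct {
	requestBytesCounter  metric.Int64Counter
	responseBytesCounter metric.Int64Counter
	latencyHistogram     metric.Float64Histogram
}

func newTransportInstruments(meter metric.Meter) transportInstruments {
	var ti transportInstruments
	var err error
	ti.requestBytesCounter, err = meter.Int64Counter(ClientRequestContentLength)
	handleErr(err)
	ti.responseBytesCounter, err = meter.Int64Counter(ClientResponseContentLength)
	handleErr(err)
	ti.latencyHistogram, err = meter.Float64Histogram(ClientLatency)
	handleErr(err)
	return ti
}
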
@@ -63,14 +72,17 @@ func NewTransport(base http.RoundTripper, opts ...Option) *Transport {

c := newConfig(append(defaultOpts, opts...)...)
t.applyConfig(c)
t.createMeasures()

return &t
}

func (t *Transport) applyConfig(c *config) {
t.tracer = c.Tracer
t.meter = c.Meter
t.propagators = c.Propagators
t.spanStartOptions = c.SpanStartOptions
t.readEvent = c.ReadEvent
t.filters = c.Filters
t.spanNameFormatter = c.SpanNameFormatter
t.clientTrace = c.ClientTrace
@@ -80,10 +92,29 @@ func defaultTransportFormatter(_ string, r *http.Request) string {
return "HTTP " + r.Method
}

func (t *Transport) createMeasures() {
t.counters = make(map[string]metric.Int64Counter)
t.valueRecorders = make(map[string]metric.Float64Histogram)

requestBytesCounter, err := t.meter.Int64Counter(ClientRequestContentLength)
handleErr(err)

responseBytesCounter, err := t.meter.Int64Counter(ClientResponseContentLength)
handleErr(err)

clientLatencyMeasure, err := t.meter.Float64Histogram(ClientLatency)
handleErr(err)

t.counters[ClientRequestContentLength] = requestBytesCounter
t.counters[ClientResponseContentLength] = responseBytesCounter
t.valueRecorders[ClientLatency] = clientLatencyMeasure
}
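
The instruments above are registered with names only. If units and descriptions were wanted, the metric API's instrument options could be passed as well; in this sketch the unit and description strings are assumptions, not part of this PR:

requestBytesCounter, err := t.meter.Int64Counter(
	ClientRequestContentLength,
	metric.WithUnit("By"),                                  // unit string is an assumption
	metric.WithDescription("Outgoing request bytes total"), // description is an assumption
)
handleErr(err)
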

// RoundTrip creates a Span and propagates its context via the provided request's headers
// before handing the request to the configured base RoundTripper. The created span will
// end when the response body is closed or when a read from the body returns io.EOF.
func (t *Transport) RoundTrip(r *http.Request) (*http.Response, error) {
requestStartTime := time.Now()
for _, f := range t.filters {
if !f(r) {
// Simply pass through to the base RoundTripper if a filter rejects the request
@@ -109,21 +140,64 @@ func (t *Transport) RoundTrip(r *http.Request) (*http.Response, error) {
ctx = httptrace.WithClientTrace(ctx, t.clientTrace(ctx))
}

readRecordFunc := func(int64) {}
if t.readEvent {

Review comment: readEvent and readRecordFunc map to a "read" event on the span, but this is the client's Request object, so it's actually a "write" event, right? Because it's being written to the server? "read" events should be recorded from the bytes consumed by the http.Response.Body, right?

readRecordFunc = func(n int64) {
span.AddEvent("read", trace.WithAttributes(ReadBytesKey.Int64(n)))
}
}

var bw bodyWrapper
// if request body is nil or NoBody, we don't want to mutate the body as it
// will affect the identity of it in an unforeseeable way because we assert
// ReadCloser fulfills a certain interface, and it is indeed nil or NoBody.
if r.Body != nil && r.Body != http.NoBody {
bw.ReadCloser = r.Body
bw.record = readRecordFunc
r.Body = &bw
}

r = r.Clone(ctx) // According to RoundTripper spec, we shouldn't modify the origin request.
span.SetAttributes(semconvutil.HTTPClientRequest(r)...)
if t.getRequestAttributes != nil {
span.SetAttributes(t.getRequestAttributes(r)...)
}
t.propagators.Inject(ctx, propagation.HeaderCarrier(r.Header))

res, err := t.rt.RoundTrip(r)
if err != nil {
span.RecordError(err)
span.SetStatus(codes.Error, err.Error())
span.End()
return res, err
} else {
span.SetAttributes(semconvutil.HTTPClientResponse(res)...)
if t.getResponseAttributes != nil {
span.SetAttributes(t.getResponseAttributes(res)...)
}
span.SetStatus(semconvutil.HTTPClientStatus(res.StatusCode))
res.Body = newWrappedBody(span, res.Body)
}

// Add metrics
attributes := semconvutil.HTTPClientRequest(r)

Review comment (@jlordiales, Sep 12, 2023): This probably needs something similar to func HTTPServerRequestMetrics(server string, req *http.Request) []attribute.KeyValue, which avoids any high-cardinality attributes for metrics.

if t.getRequestAttributes != nil {
attributes = append(attributes, t.getRequestAttributes(r)...)
}
if err == nil {
attributes = append(attributes, semconvutil.HTTPClientResponse(res)...)
if t.getResponseAttributes != nil {
attributes = append(attributes, t.getResponseAttributes(res)...)
}
}
o := metric.WithAttributes(attributes...)
t.counters[ClientRequestContentLength].Add(ctx, bw.read.Load(), o)
if err == nil {
t.counters[ClientResponseContentLength].Add(ctx, res.ContentLength, o)

Review comment (@jdef, Sep 21, 2023): ContentLength may not be what we want here; it could be negative. Since bodyWrapper already tracks the count of bytes read from the response, it's probably better to refer to that count here. [EDIT] I see that newWrappedBody is used instead of bodyWrapper, so maybe it's not a drop-in replacement. Nonetheless, we need a more accurate byte count than http.Response.ContentLength is guaranteed to provide.

}

span.SetAttributes(semconvutil.HTTPClientResponse(res)...)
span.SetStatus(semconvutil.HTTPClientStatus(res.StatusCode))
res.Body = newWrappedBody(span, res.Body)
// Use floating point division here for higher precision (instead of Millisecond method).
elapsedTime := float64(time.Since(requestStartTime)) / float64(time.Millisecond)
t.valueRecorders[ClientLatency].Record(ctx, elapsedTime, o)

return res, err
}
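
Regarding the review note above about high-cardinality attributes: a client-side counterpart to semconvutil.HTTPServerRequestMetrics could look roughly like the sketch below. The function name and the exact attribute selection are assumptions for illustration; the point is that the full request URL is left out of the metric attributes:

import (
	"net"
	"net/http"
	"strconv"

	"go.opentelemetry.io/otel/attribute"
	semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)

// httpClientRequestMetrics returns a deliberately small, low-cardinality
// attribute set for client metrics: method, peer name, and peer port, but not
// the full URL.
func httpClientRequestMetrics(req *http.Request) []attribute.KeyValue {
	attrs := []attribute.KeyValue{semconv.HTTPMethod(req.Method)}
	host, portStr, err := net.SplitHostPort(req.Host)
	if err != nil {
		host = req.Host
	}
	if host != "" {
		attrs = append(attrs, semconv.NetPeerName(host))
	}
	if port, err := strconv.Atoi(portStr); err == nil && port > 0 {
		attrs = append(attrs, semconv.NetPeerPort(port))
	}
	return attrs
}
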
5 changes: 3 additions & 2 deletions instrumentation/net/http/otelhttp/wrap.go
@@ -18,6 +18,7 @@ import (
"context"
"io"
"net/http"
"sync/atomic"

"go.opentelemetry.io/otel/propagation"
)
@@ -30,14 +31,14 @@ type bodyWrapper struct {
io.ReadCloser
record func(n int64) // must not be nil

read int64
read atomic.Int64
err error
}

func (w *bodyWrapper) Read(b []byte) (int, error) {
n, err := w.ReadCloser.Read(b)
n1 := int64(n)
w.read += n1
w.read.Add(n1)
w.err = err
w.record(n1)
return n, err
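
Following up on the review note about http.Response.ContentLength: one alternative is to count the bytes actually read from the response body and record that total when the body reaches EOF or is closed. The sketch below is illustrative only; the countingBody type and its record hook are assumptions, not code from this PR, and it is not made concurrency-safe:

import "io"

// countingBody counts bytes actually read from a response body and reports
// the total once, when the body hits EOF or is closed.
type countingBody struct {
	io.ReadCloser
	n      int64
	done   bool
	record func(n int64) // e.g. adds to the http.client response-size counter
}

func (b *countingBody) Read(p []byte) (int, error) {
	n, err := b.ReadCloser.Read(p)
	b.n += int64(n)
	if err == io.EOF && !b.done {
		b.done = true
		b.record(b.n)
	}
	return n, err
}

func (b *countingBody) Close() error {
	if !b.done {
		b.done = true
		b.record(b.n)
	}
	return b.ReadCloser.Close()
}
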