Consolidate query metrics and include result tag #1075

Merged: 3 commits merged into jaegertracing:master from the refactormetrics branch on Sep 28, 2018

Conversation

@objectiser (Contributor) commented on Sep 20, 2018:

Signed-off-by: Gary Brown [email protected]

Which problem is this PR solving?

Currently jaeger-query produces a set of metrics for the following operations: find traces, get operations, get services and get trace.

The set of metrics is as follows (only one bucket is shown for each histogram):

jaeger_find_traces_attempts 1
jaeger_find_traces_errLatency_bucket{le="0.005"} 0
jaeger_find_traces_errors 0
jaeger_find_traces_okLatency_bucket{le="0.005"} 0
jaeger_find_traces_responses_bucket{le="0.005"} 1
jaeger_find_traces_successes 1

Errors and successes are tracked as separate latency histograms and counters (plus a further counter, attempts, representing the total count).

This PR consolidates the metrics to use a common name, with a result tag representing the ok or err state.

Short description of the changes

Have a single counter find_traces_requests with a result tag (ok, err) to replace the attempts/errors/successes counters. The total (formerly the attempts counter) is the sum of find_traces_requests over both tag values, and errors/successes can be determined by querying on the tag.

Also combined the two latency-based metrics into one, adding a result tag (ok/err).

jaeger_query_find_traces_latency_bucket{result="err",le="0.005"} 0
jaeger_query_find_traces_latency_bucket{result="ok",le="0.005"} 0
jaeger_query_find_traces_requests{result="err"} 0
jaeger_query_find_traces_requests{result="ok"} 0
jaeger_query_find_traces_responses_bucket{le="0.005"} 0

NOTE: The sum and count values for the histograms are not shown above.
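
For illustration, a minimal sketch of the consolidated construction, assembled from the diff excerpts quoted in the review threads below. The package and constructor names are hypothetical; the real code lives in storage/spanstore/metrics/decorator.go and the exact merged version may differ.

// Sketch only, assuming the jaeger-lib metrics.Factory calls shown in the diff excerpts.
package spanstoremetrics // hypothetical name; the real code is under storage/spanstore/metrics

import "github.com/uber/jaeger-lib/metrics"

// queryMetrics groups the consolidated metrics for one query operation
// (e.g. find_traces); the err/ok split now lives in the "result" tag.
type queryMetrics struct {
	Errors     metrics.Counter // requests{result="err"}
	Successes  metrics.Counter // requests{result="ok"}
	Responses  metrics.Timer   // histogram of items returned per request
	ErrLatency metrics.Timer   // latency{result="err"}
	OKLatency  metrics.Timer   // latency{result="ok"}
}

// newQueryMetrics is a hypothetical constructor; namespace is the operation
// name, e.g. "find_traces". (The "requests" name part was later dropped; see
// the naming discussion below.)
func newQueryMetrics(namespace string, metricsFactory metrics.Factory) *queryMetrics {
	scoped := metricsFactory.Namespace(namespace, nil)
	return &queryMetrics{
		Errors:     scoped.Counter("requests", map[string]string{"result": "err"}),
		Successes:  scoped.Counter("requests", map[string]string{"result": "ok"}),
		Responses:  scoped.Timer("responses", nil),
		ErrLatency: scoped.Timer("latency", map[string]string{"result": "err"}),
		OKLatency:  scoped.Timer("latency", map[string]string{"result": "ok"}),
	}
}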

scoped := metricsFactory.Namespace(namespace, nil)
metrics.Init(qMetrics, scoped, nil)
qMetrics := &queryMetrics{
Errors: scoped.Counter("requests", map[string]string{"result": "err"}),
objectiser (Contributor, Author) commented:

Is there a better name than requests and/or responses?
requests = number of query operations performed
responses = number of items returned per request

A member replied:

responses = number of items returned per request

what items? traces, spans?

objectiser (Contributor, Author) replied:

It depends on which operation: find traces, get operations, get services and get trace.
So jaeger_find_traces_responses would be tracking the number of traces returned.

objectiser (Contributor, Author) added:

On reflection, requests and responses are probably ok - requests is a counter, so it should be obvious that it relates to the number of requests; responses is a histogram representing the number of items in each response.

Another contributor commented:

Seems this discussion has finished already, but how about "operations" and "results"?

Even though "requests/responses" is probably OK as well, I would avoid it as it's a bit of a loaded term.

objectiser (Contributor, Author) replied:

Have removed the additional part of the name for those counters, so now it is just:

jaeger_query_find_traces{result="err"} 0
jaeger_query_find_traces{result="ok"} 1
jaeger_query_find_traces_latency_bucket{result="err",le="0.005"} 0
jaeger_query_find_traces_latency_bucket{result="ok",le="0.005"} 1
jaeger_query_find_traces_responses_bucket{le="0.005"} 1

@codecov bot commented on Sep 20, 2018:

Codecov Report

Merging #1075 into master will not change coverage.
The diff coverage is 100%.


@@          Coverage Diff           @@
##           master   #1075   +/-   ##
======================================
  Coverage     100%    100%           
======================================
  Files         140     140           
  Lines        6622    6625    +3     
======================================
+ Hits         6622    6625    +3
Impacted Files                           Coverage Δ
storage/spanstore/metrics/decorator.go   100% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f441f1d...03b899a.

@objectiser force-pushed the refactormetrics branch 2 times, most recently from 7663db2 to eda163f, on September 24, 2018 16:15
@objectiser (Contributor, Author) commented:

@jaegertracing/jaeger-maintainers Updated description with the new equivalent metrics. Any thoughts on the new names/tags?

@jpkrohling (Contributor) left a comment:

LGTM, but could you also add an entry to the changelog, mentioning that this is a breaking change? People relying on an existing metric name will be affected by this change.

}

func (q *queryMetrics) emit(err error, latency time.Duration, responses int) {
q.Attempts.Inc(1)
A contributor asked:

Is Attempts being used elsewhere?
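
For context, a minimal sketch of what emit could look like once the attempts counter is gone, continuing the hypothetical queryMetrics sketch above (and additionally needing the standard "time" import); this is an assumption, not necessarily the exact merged code:

// Sketch only: the total is now recoverable as the sum of the ok and err counters,
// so a separate Attempts counter is no longer needed.
func (q *queryMetrics) emit(err error, latency time.Duration, responses int) {
	if err != nil {
		q.Errors.Inc(1)              // e.g. requests{result="err"}
		q.ErrLatency.Record(latency) // e.g. latency{result="err"}
		return
	}
	q.Successes.Inc(1)          // e.g. requests{result="ok"}
	q.OKLatency.Record(latency) // e.g. latency{result="ok"}
	// Record the number of items returned; the Responses timer acts as a histogram.
	q.Responses.Record(time.Duration(responses))
}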

@jpkrohling (Contributor) left a comment:

LGTM

@jpkrohling merged commit b2aa771 into jaegertracing:master on Sep 28, 2018
@ghost removed the review label on Sep 28, 2018
Successes: scoped.Counter("", map[string]string{"result": "ok"}),
Responses: scoped.Timer("responses", nil),
ErrLatency: scoped.Timer("latency", map[string]string{"result": "err"}),
OKLatency: scoped.Timer("latency", map[string]string{"result": "ok"}),
A member asked:

why not use the annotation-based initialization as before? It keeps declaration of the struct and metrics name in a single place.

objectiser (Contributor, Author) replied:

Fixed in #1096
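
For reference, a sketch of the annotation-based style being asked about, assuming jaeger-lib's metric/tags struct tags and the metrics.Init call present in the old code; the exact annotations adopted in #1096 may differ:

// Sketch only: annotation-based initialization keeps the metric names and tags
// next to the struct fields, and metrics.Init fills the fields from those annotations.
package spanstoremetrics // hypothetical name

import "github.com/uber/jaeger-lib/metrics"

type queryMetrics struct {
	Errors     metrics.Counter `metric:"requests" tags:"result=err"`
	Successes  metrics.Counter `metric:"requests" tags:"result=ok"`
	Responses  metrics.Timer   `metric:"responses"`
	ErrLatency metrics.Timer   `metric:"latency" tags:"result=err"`
	OKLatency  metrics.Timer   `metric:"latency" tags:"result=ok"`
}

func newQueryMetrics(namespace string, factory metrics.Factory) *queryMetrics {
	qm := &queryMetrics{}
	scoped := factory.Namespace(namespace, nil)
	metrics.Init(qm, scoped, nil)
	return qm
}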

@objectiser deleted the refactormetrics branch on January 15, 2019 09:45
Labels: none yet. Projects: none yet. 4 participants.