This repository was archived by the owner on Aug 23, 2023. It is now read-only.

expose discarded samples metrics via prometheus /metrics endpoint #1203

Closed
woodsaj opened this issue Jan 7, 2019 · 3 comments · Fixed by #1278 or #1288

@woodsaj
Member

woodsaj commented Jan 7, 2019

We use Prometheus to scrape active_series counts on a per-org basis so we can provide our customers with a unified dashboard that shows usage for both the Graphite (metrictank) and Prometheus (cortex) hosted metrics services.

In addition to the current metrics, we also need to expose "discarded samples" numbers.
These metrics will be similar to the existing metrics we collect:
https://github.com/grafana/metrictank/blob/master/input/input.go#L47-L51
https://github.com/grafana/metrictank/blob/master/mdata/init.go#L31-L39

Unlike those, this will be a single counter metric, "discarded_samples_total", with labels for the org ID and the discard reason, e.g.:

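import "github.com/prometheus/client_golang/prometheus"

// DiscardedSamples counts discarded samples, partitioned by discard reason and org.
var DiscardedSamples = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Namespace: "metrictank",
		Name:      "discarded_samples_total",
		Help:      "The total number of samples that were discarded.",
	},
	[]string{"reason", "org"},
)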
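
To make the shape of the resulting series concrete, here is a minimal stdlib-only sketch of what a scrape of this counter might look like. It does not use the Prometheus client library: `discardedSamples`, `Inc` and `Render` are hypothetical stand-ins, the org IDs are made up, and the `# HELP` / `# TYPE` lines of a real /metrics response are left out. The full metric name follows from joining the namespace and name with an underscore.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// discardKey identifies one labelled series of the counter.
type discardKey struct {
	reason string
	org    string
}

// discardedSamples is a hypothetical stand-in for the proposed CounterVec.
type discardedSamples map[discardKey]uint64

// Inc records one discarded sample for the given reason and org.
func (d discardedSamples) Inc(reason string, org int) {
	d[discardKey{reason: reason, org: fmt.Sprint(org)}]++
}

// Render returns one line per series in the style of the Prometheus text
// format, sorted so the output is stable.
func (d discardedSamples) Render() string {
	lines := make([]string, 0, len(d))
	for k, v := range d {
		lines = append(lines, fmt.Sprintf(
			"metrictank_discarded_samples_total{org=%q,reason=%q} %d",
			k.org, k.reason, v))
	}
	sort.Strings(lines)
	return strings.Join(lines, "\n")
}

func main() {
	d := discardedSamples{}
	d.Inc("out-of-order", 1)
	d.Inc("out-of-order", 1)
	d.Inc("invalid-name", 2)
	// Prints:
	// metrictank_discarded_samples_total{org="1",reason="out-of-order"} 2
	// metrictank_discarded_samples_total{org="2",reason="invalid-name"} 1
	fmt.Println(d.Render())
}
```

With one series per (reason, org) pair, a per-org dashboard can sum over `reason` for a total, or break the total down by reason.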

The reason label will take values such as:

  • out-of-order (currently called tank.metrics_too_old)
  • new-value-for-timestamp (currently called tank.metrics_too_old)
  • received-too-late (currently called tank.add_to_closed_chunk)
  • invalid-timestamp (currently called input.%s.metricdata.invalid or input.%s.metricpoint.invalid)
  • invalid-interval (currently called input.%s.metricdata.invalid)
  • invalid-orgID (currently called input.%s.metricdata.invalid)
  • invalid-name (currently called input.%s.metricdata.invalid)
  • invalid-mtype (currently called input.%s.metricdata.invalid)
  • invalid-tag-format (currently called input.%s.metricdata.invalid)
  • unknown-point-id (currently called input.%s.metricpoint.unknown)
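
The list above is a many-to-one mapping from the proposed reason values to today's Carbon stats. As a rough sketch, it could be written down as a lookup table that also guards against unexpected label values, which keeps label cardinality bounded. The names `legacyStats` and `isKnownReason` are hypothetical and not part of metrictank; the reason strings are the ones proposed in this issue and may differ from what is finally merged.

```go
package main

import "fmt"

// legacyStats maps each proposed reason label to the Carbon stat(s) that
// currently count it. %s stands for the input plugin name.
var legacyStats = map[string][]string{
	"out-of-order":            {"tank.metrics_too_old"},
	"new-value-for-timestamp": {"tank.metrics_too_old"},
	"received-too-late":       {"tank.add_to_closed_chunk"},
	"invalid-timestamp":       {"input.%s.metricdata.invalid", "input.%s.metricpoint.invalid"},
	"invalid-interval":        {"input.%s.metricdata.invalid"},
	"invalid-orgID":           {"input.%s.metricdata.invalid"},
	"invalid-name":            {"input.%s.metricdata.invalid"},
	"invalid-mtype":           {"input.%s.metricdata.invalid"},
	"invalid-tag-format":      {"input.%s.metricdata.invalid"},
	"unknown-point-id":        {"input.%s.metricpoint.unknown"},
}

// isKnownReason reports whether reason is one of the proposed label values.
func isKnownReason(reason string) bool {
	_, ok := legacyStats[reason]
	return ok
}

func main() {
	for _, r := range []string{"invalid-timestamp", "bogus"} {
		fmt.Println(r, isKnownReason(r), legacyStats[r])
	}
}
```

The table also shows why the new metric is useful: seven distinct reasons are currently folded into the single input.%s.metricdata.invalid stat.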
@woodsaj
Member Author

woodsaj commented Jan 7, 2019

see also: #1201

@fkaleo fkaleo added the feature label Apr 10, 2019
@fkaleo fkaleo added this to the vnext milestone Apr 10, 2019
fkaleo added a commit that referenced this issue Apr 11, 2019
The metric takes a reason with values among:
- sample-out-of-order
- received-too-late
- invalid-timestamp
- invalid-interval
- invalid-orgID
- invalid-name
- invalid-mtype
- invalid-tag-format
- unknown-point-id

Fixes #1203
fkaleo added a commit that referenced this issue Apr 12, 2019
fkaleo added a commit that referenced this issue Apr 15, 2019
@Dieterbe Dieterbe reopened this Apr 17, 2019
@Dieterbe
Contributor

Dieterbe commented Apr 17, 2019

This is mostly finished, but needs a follow-up for new-value-for-timestamp;
see #1278 (comment)

@fkaleo
Contributor

fkaleo commented Apr 17, 2019

> This is mostly finished, but needs a follow-up for new-value-for-timestamp;
> see #1278 (comment)

Yep, will do with #1201 and #1202

fkaleo added a commit that referenced this issue Apr 17, 2019
- New Carbon metric 'tank.discarded.new_value_for_timestamp'
- Prometheus metric 'discarded_samples_total' has a new reason 'new-value-for-timestamp'

Fixes #1201, #1202, #1203