Track number of discarded samples while committing in TSDB #7213
pracucci wants to merge 2 commits into prometheus:master
Conversation
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Hmm, now that we have isolation, should we not be writing these to the WAL at all?
codesome left a comment
I can see this being useful. This is not to be confused with the metric being added for out-of-order samples in #6679, which counts out-of-order samples in both AddFast() and Commit(); an out-of-order sample in AddFast() does not mean it was discarded silently.
if !ok {
    total--
    discarded++
We can get rid of the total variable and use len(a.samples) - discarded when adding to the metric; that looks cleaner.
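The suggestion above can be sketched as follows. This is a minimal, hypothetical stand-in for the Commit() loop (the sample type and the appendOK callback are illustrative, not the actual TSDB code): only the discarded counter is maintained, and the committed count is derived as len(samples) - discarded.

```go
package main

import "fmt"

// sample is a hypothetical stand-in for the appender's buffered samples.
type sample struct {
	t int64
	v float64
}

// commit mimics the Commit() loop: try to append each sample and count the
// failures. There is no separate total that gets decremented on failure;
// the committed count is simply len(samples) - discarded.
func commit(samples []sample, appendOK func(sample) bool) (committed, discarded int) {
	for _, s := range samples {
		if ok := appendOK(s); !ok {
			discarded++
		}
	}
	return len(samples) - discarded, discarded
}

func main() {
	// The third sample is out of order within the batch.
	samples := []sample{{1, 1}, {2, 2}, {1, 3}}
	lastT := int64(-1)
	committed, discarded := commit(samples, func(s sample) bool {
		if s.t <= lastT {
			return false // simulate memSeries.append() rejecting out-of-order
		}
		lastT = s.t
		return true
	})
	fmt.Println(committed, discarded) // 2 1
}
```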
I had the same thought. But do we want things to be logged in the WAL even before we start to append to memory?

If users can't see them due to isolation, does it matter?

If we log after appending to memory, and the log fails, then users can still see those samples unless we remove them from memory. In that case, a restart would make those samples disappear. Another way to do this would be to identify the potential out of order in the

That'd only reduce the chances of it; it could still in theory happen.

Any suggestions on skipping the WAL for those samples? (Assuming we want to log before putting them in memory.)

I don't see a way to make it work.

Sounds like a metric, then.
Today, while supporting a customer, we realised there's a condition under which out-of-order samples may be written to the WAL (and thus sent to the remote write) but then silently discarded by TSDB.
We observed this issue with clashing series (the container name label was missing) scraped from cAdvisor. We noticed cAdvisor exposes metrics with explicit timestamps but, due to how it works, the timestamp may differ between series exposed during a single scrape.
In the case of clashing series, samples for the same series but with different timestamps may be appended during a single scrape. When this happens, such out-of-order samples are silently ignored during headAppender.Commit() but are written to the WAL anyway and thus sent to the remote write:
prometheus/tsdb/head.go
Lines 1059 to 1061 in 532f7bb
In this PR, I'm suggesting to add a metric to keep track of such silently discarded samples, so that when we see out-of-order samples received by the remote write target (e.g. Cortex) we have a way to check whether we hit this case.
Internal details
memSeries.append() is called within headAppender.Commit(). If memSeries.append() fails because of an out-of-order sample, it returns false and the sample is discarded, but there's no metric tracking it and Prometheus just silently moves on.

This issue is triggered by the fact that c.maxTime is updated in memSeries.append() (again, called during headAppender.Commit()), so storage.ErrOutOfOrderSample is not returned by memSeries.appendable() if the out-of-order occurs within a single scrape/transaction.

However, records are logged to the WAL within headAppender.Commit() but before memSeries.append(), and this may lead to out-of-order samples being written to the WAL (and thus sent to the remote write) but then silently discarded while committing to TSDB.
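The ordering problem described above can be illustrated with a minimal sketch. This is not the actual TSDB code: commitBatch, sample, and the simulated WAL/in-memory slices are hypothetical stand-ins that only model the sequence "log the whole batch to the WAL first, then append to memory one sample at a time, rejecting out-of-order". An in-batch out-of-order sample is therefore already in the WAL (and shipped to remote write) by the time the append rejects it.

```go
package main

import "fmt"

type sample struct {
	t int64
	v float64
}

// commitBatch is a simplified, hypothetical model of headAppender.Commit():
// the WAL record for the full batch is logged before any in-memory append.
func commitBatch(batch []sample) (walLen, memLen, discarded int) {
	// Step 1: the whole batch goes to the (simulated) WAL up front.
	wal := append([]sample(nil), batch...)

	// Step 2: append to the (simulated) in-memory series; an in-batch
	// out-of-order sample is rejected here, even though it is already
	// in the WAL and thus already sent to remote write.
	var mem []sample
	for _, s := range batch {
		if len(mem) > 0 && s.t <= mem[len(mem)-1].t {
			discarded++ // silently dropped today; this PR adds a metric here
			continue
		}
		mem = append(mem, s)
	}
	return len(wal), len(mem), discarded
}

func main() {
	// t=15 is out of order within the batch: logged to the WAL, not committed.
	walLen, memLen, discarded := commitBatch([]sample{{10, 1}, {20, 2}, {15, 3}})
	fmt.Println(walLen, memLen, discarded) // 3 2 1
}
```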