
Track number of discarded samples while committing in TSDB #7213

Closed
pracucci wants to merge 2 commits into prometheus:master from pracucci:track-number-of-discarded-samples

Conversation

@pracucci
Contributor

@pracucci pracucci commented May 6, 2020

Today, while supporting a customer, we realised there's a condition under which out-of-order samples may be written to the WAL (and thus sent to the remote write) but then silently discarded by TSDB.

We observed this issue with clashing series (the container name label was missing) scraped from cAdvisor. We've noticed cAdvisor exposes metrics with an explicit timestamp but, due to how it works, the timestamp may differ between series exposed during a single scrape.

In the case of clashing series, samples for the same series but with different timestamps may be appended during a single scrape. When this happens, such out-of-order samples are silently ignored during headAppender.Commit() but are written to the WAL anyway, and thus sent to the remote write:

prometheus/tsdb/head.go

Lines 1059 to 1061 in 532f7bb

if !ok {
	total--
}

In this PR, I'm suggesting adding a metric to keep track of such silently discarded samples, so that when we see out-of-order samples received by the remote write (i.e. Cortex) we have a way to check whether we hit this case.

Internal details

memSeries.append() is called within headAppender.Commit(). If memSeries.append() fails because of an out-of-order sample, it returns false and the sample is discarded, but there's no metric tracking it and Prometheus just silently moves on.

This issue is triggered by the fact that c.maxTime is updated in memSeries.append() (again, called during headAppender.Commit()), so storage.ErrOutOfOrderSample is not returned by memSeries.appendable() if the out-of-order sample occurs within a single scrape/transaction.

However, records are logged to the WAL within headAppender.Commit() but before memSeries.append(), which may lead to out-of-order samples being written to the WAL (and thus sent to the remote write) but then silently discarded while committing to TSDB.
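The sequence described above can be sketched with a simplified model (the types and function bodies below are illustrative stand-ins, not the actual code from tsdb/head.go): the WAL write happens up front, then the per-series append silently drops any sample whose timestamp is not newer than the series' maxTime within the same commit.

```go
package main

import "fmt"

// sample and memSeries are simplified stand-ins for the TSDB types.
type sample struct {
	t int64
	v float64
}

type memSeries struct {
	maxTime int64
}

// append mirrors the behaviour of memSeries.append(): it returns false
// (a silent drop) when the sample is not strictly newer than maxTime.
func (s *memSeries) append(t int64, v float64) bool {
	if t <= s.maxTime {
		return false
	}
	s.maxTime = t
	return true
}

// commit mirrors the relevant ordering in headAppender.Commit():
// all samples are logged to the WAL first, then appended in memory,
// so an out-of-order sample reaches the WAL but never the head block.
func commit(series *memSeries, samples []sample) (wal []sample, discarded int) {
	wal = append(wal, samples...) // WAL write happens before the in-memory append
	for _, smp := range samples {
		if ok := series.append(smp.t, smp.v); !ok {
			discarded++ // silently dropped today; this PR proposes counting these
		}
	}
	return wal, discarded
}

func main() {
	s := &memSeries{}
	// Clashing series: same labels but different timestamps in one scrape,
	// appended out of order within a single transaction.
	wal, discarded := commit(s, []sample{{t: 200, v: 1}, {t: 100, v: 2}})
	fmt.Printf("wal=%d discarded=%d\n", len(wal), discarded)
}
```

Running this prints `wal=2 discarded=1`: both samples were logged to the WAL, but only one made it into the head.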

pracucci added 2 commits May 6, 2020 13:11
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
@brian-brazil
Contributor

Hmm, now that we have isolation should we be not writing these to the WAL at all?

Member

@codesome codesome left a comment


I can see this being useful. This is not to be confused with the metric being added for out-of-order samples in #6679, which counts out-of-order samples in both AddFast and Commit(); an out-of-order sample in AddFast does not mean it was silently discarded.


if !ok {
	total--
	discarded++
Member


We can get rid of the total variable and use len(a.samples)-discarded when adding the metric; it looks cleaner.
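The suggestion above could look roughly like this (a sketch only; `a.samples` comes from the surrounding discussion, the rest is illustrative): rather than decrementing a running total for every dropped sample, track only the discarded count and derive the appended count at the end.

```go
package main

import "fmt"

// appendedCount derives the number of successfully appended samples
// from the batch size and the discarded count, instead of maintaining
// a separately decremented total variable.
func appendedCount(numSamples, discarded int) int {
	return numSamples - discarded
}

func main() {
	samples := make([]int64, 5) // stand-in for a.samples
	discarded := 2              // samples silently dropped during Commit()

	// Value to add to the samples-appended metric.
	fmt.Println(appendedCount(len(samples), discarded))
}
```

This prints `3` for the example batch of 5 samples with 2 discarded.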

@codesome
Member

codesome commented May 6, 2020

Hmm, now that we have isolation should we be not writing these to the WAL at all?

I had the same thought. But do we want things to be logged in the WAL even before we start to append in memory?

@brian-brazil
Contributor

If users can't see them due to isolation, does it matter?

@codesome
Member

codesome commented May 6, 2020

If we log after appending to memory, and the log fails, users can still see the samples unless we remove them from memory. In that case, a restart will make those samples disappear.

Another way to do this would be to identify the potential out-of-order samples in AddFast and not during Commit.

@brian-brazil
Contributor

Another way to do this would be to identify the potential out-of-order samples in AddFast and not during Commit.

That'd only reduce the chances of it; it could still happen in theory.

@codesome
Member

codesome commented May 6, 2020

Any suggestions on skipping the WAL for those samples? (Assuming we want to log before putting them in memory.)

@brian-brazil
Contributor

I don't see a way to make it work.

@codesome
Member

codesome commented May 6, 2020

Sounds like a metric, then.

@pracucci
Contributor Author

pracucci commented May 7, 2020

Closing because already done in #6679 by @codesome.

