Skip to content
This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

Expose prometheus_tsdb_lowest_timestamp metric #363

Merged
merged 4 commits into from
Sep 14, 2018

Conversation

bobmshannon
Copy link
Contributor

db.go Outdated
@@ -238,6 +244,9 @@ func Open(dir string, l log.Logger, r prometheus.Registerer, opts *Options) (db

go db.run()

head := db.Head()
db.metrics.startTime.Set(float64(head.minTime))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's only the lower bound of the head block, not the oldest timestamp in the database. Or do I miss something? It also doesn't take into account compaction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still learning about the internals of the TSDB so I probably did make some incorrect assumptions here.

Looking over the previous PRs for this feature perhaps we can just loop through each block using db.Blocks() and search for the oldest minTime, and have that run on each collection.

@krasi-georgiev
Copy link
Contributor

Is the original use case still valid?
@bkupidura did you find a workaround for your use case or you still need this metric?

@bkupidura
Copy link

Hey, im no longer working on this. I will ping right people to give You answer.

@ildarsv @mescanef @dkalashnik

@dkalashnik
Copy link

@krasi-georgiev Hi, original case is not valid anymore, we are using a sort of workaround.

@krasi-georgiev
Copy link
Contributor

Thanks
If the workaround works well we can close it and revisit if needed.

btw what was the workaround?

@dkalashnik
Copy link

@krasi-georgiev small proxy on top of several Prometheuses with data merge.

@simonpasquier
Copy link
Contributor

Hello @dkalashnik @bkupidura 😄
@gouthamve mentioned that reporting the oldest timestamp in the db is a nice-to-have and I kind of agree, especially if it's cheap and easy.

Copy link
Collaborator

@gouthamve gouthamve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for this, this is indeed a nice metric to have, and we can create an alert on it, to check if retention is working or now.

While you're on the right track, I've explained some changes which I think will make it reliable in all cases. Please let me know if anything is confusing.

db.go Outdated
}, func() float64 {
db.mtx.RLock()
defer db.mtx.RUnlock()
startTime := time.Now().Unix()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason to believe that the time stored is going to be the current time. For example, it could 10, 20, 30 or even time in the future. It's best initialise startTime as math.MaxInt64, and if there is no data in the block, set it to 0.

Further, the blocks are sorted by time. So doing just db.blocks[0].MinTime should suffice. And if len(db.blocks) == 0, then we should look at HeadBlock mintime.

Signed-off-by: Bob Shannon <[email protected]>
Copy link
Collaborator

@gouthamve gouthamve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, otherwise LGTM 👍

db.go Outdated
@@ -157,6 +158,17 @@ func newDBMetrics(db *DB, r prometheus.Registerer) *dbMetrics {
Name: "prometheus_tsdb_retention_cutoffs_failures_total",
Help: "Number of times the database failed to cut off block data from disk.",
})
m.startTime = prometheus.NewGaugeFunc(prometheus.GaugeOpts{
Name: "prometheus_tsdb_start_time_seconds",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prometheus_tsdb_lowest_timestamp makes more sense here, as start_time could be confused with the process start time and it may or may not be seconds :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep the unit suffix and enforce the metric's value to be in seconds though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. From a general TSDB perspective, there is no reason that the time passed into .Append(t int64, v float64), is a wall time. It could be anything, and people may even pass nano-seconds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Does it make sense to have an "opaque" metric then? Wouldn't it be better to expose the raw value to the caller (eg Prometheus) and let it compute the metric?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Prometheus appends millisecond timestamps to the TSDB, then this metric as collected might be slightly more difficult to work with using something like the built-in time() function for example. For that reason I agree that it would be nice to adhere to metric naming best practices and include the base unit as a suffix in the metric name, converting the millisecond timestamp to a unix timestamp if necessary. As @simonpasquier mentioned I would think that instrumenting this metric directly in Prometheus instead would allow for this.

With that being said perhaps there are other consumers of the TSDB where exposing this lowest timestamp metric directly would still be useful.

@bobmshannon bobmshannon changed the title Expose prometheus_tsdb_start_time_seconds metric Expose prometheus_tsdb_lowest_timestamp metric Jul 24, 2018
@simonpasquier
Copy link
Contributor

👍

@krasi-georgiev
Copy link
Contributor

@simonpasquier , @gouthamve so finally are we adding this?

@gouthamve
Copy link
Collaborator

👍

@gouthamve gouthamve merged commit cb7f320 into prometheus-junkyard:master Sep 14, 2018
@kfdm
Copy link

kfdm commented Sep 24, 2018

Sorry to bump an old thread, but since this does not have units on it, the timestamp seems to be in milliseconds. Is that correct ?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants