Expose prometheus_tsdb_lowest_timestamp metric #363
Signed-off-by: Bob Shannon <[email protected]>
0d58fba to e8e00c6
db.go
Outdated
@@ -238,6 +244,9 @@ func Open(dir string, l log.Logger, r prometheus.Registerer, opts *Options) (db

	go db.run()

	head := db.Head()
	db.metrics.startTime.Set(float64(head.minTime))
Hmm, that's only the lower bound of the head block, not the oldest timestamp in the database. Or am I missing something? It also doesn't take compaction into account.
I'm still learning about the internals of the TSDB, so I probably did make some incorrect assumptions here.
Looking over the previous PRs for this feature, perhaps we can just loop through each block using db.Blocks() and search for the oldest minTime, and have that run on each collection.
Is the original use case still valid?
Hey, I'm no longer working on this. I will ping the right people to give you an answer.
@krasi-georgiev Hi, the original case is not valid anymore; we are using a sort of workaround.
Thanks! By the way, what was the workaround?
@krasi-georgiev A small proxy on top of several Prometheuses with data merge.
Hello @dkalashnik @bkupidura 😄
Signed-off-by: Bob Shannon <[email protected]>
Hi, thanks for this. This is indeed a nice metric to have, and we can create an alert on it to check whether retention is working or not.
While you're on the right track, I've explained some changes which I think will make it reliable in all cases. Please let me know if anything is confusing.
db.go
Outdated
	}, func() float64 {
		db.mtx.RLock()
		defer db.mtx.RUnlock()
		startTime := time.Now().Unix()
There is no reason to believe that the time stored is going to be the current time. For example, it could be 10, 20, 30 or even a time in the future. It's best to initialise startTime as math.MaxInt64, and if there is no data in the block, set it to 0.
Further, the blocks are sorted by time. So doing just db.blocks[0].MinTime should suffice. And if len(db.blocks) == 0, then we should look at the HeadBlock mintime.
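The suggestion above can be sketched roughly as follows. This is a standalone illustration, not the code from this PR: the `block` type and the `lowestTimestamp` function are hypothetical stand-ins for the database's block list and head, and the sketch assumes both that the block list is already sorted by minimum time and that an empty head reports `math.MaxInt64` as a "no data yet" sentinel.

```go
package main

import (
	"fmt"
	"math"
)

// block is a hypothetical stand-in for a persisted block;
// only its minimum timestamp matters for this sketch.
type block struct {
	minTime int64
}

// lowestTimestamp returns the oldest timestamp known to the database.
// It assumes blocks is sorted by minTime in ascending order, so the first
// entry holds the lowest value. With no persisted blocks it falls back to
// the head's minimum time.
func lowestTimestamp(blocks []block, headMinTime int64) int64 {
	if len(blocks) > 0 {
		return blocks[0].minTime
	}
	if headMinTime == math.MaxInt64 {
		// Assumption: MaxInt64 marks a head with no samples yet,
		// so report 0 instead of a misleading huge value.
		return 0
	}
	return headMinTime
}

func main() {
	blocks := []block{{minTime: 10}, {minTime: 20}, {minTime: 30}}
	fmt.Println(lowestTimestamp(blocks, 40))         // uses the first block
	fmt.Println(lowestTimestamp(nil, 40))            // falls back to the head
	fmt.Println(lowestTimestamp(nil, math.MaxInt64)) // empty database
}
```

Nothing here depends on the timestamps being wall-clock times; the function only compares and returns whatever values were appended.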
Signed-off-by: Bob Shannon <[email protected]>
Minor nit, otherwise LGTM 👍
db.go
Outdated
@@ -157,6 +158,17 @@ func newDBMetrics(db *DB, r prometheus.Registerer) *dbMetrics {
		Name: "prometheus_tsdb_retention_cutoffs_failures_total",
		Help: "Number of times the database failed to cut off block data from disk.",
	})
	m.startTime = prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "prometheus_tsdb_start_time_seconds",
prometheus_tsdb_lowest_timestamp makes more sense here, as start_time could be confused with the process start time and it may or may not be seconds :)
I think we should keep the unit suffix and enforce the metric's value to be in seconds though.
No. From a general TSDB perspective, there is no reason to assume that the time passed into .Append(t int64, v float64) is a wall-clock time. It could be anything, and people may even pass nanoseconds.
Hmm. Does it make sense to have an "opaque" metric then? Wouldn't it be better to expose the raw value to the caller (e.g. Prometheus) and let it compute the metric?
If Prometheus appends millisecond timestamps to the TSDB, then this metric as collected might be slightly more difficult to work with using something like the built-in time() function, for example. For that reason I agree that it would be nice to adhere to metric naming best practices and include the base unit as a suffix in the metric name, converting the millisecond timestamp to a Unix timestamp in seconds if necessary. As @simonpasquier mentioned, I would think that instrumenting this metric directly in Prometheus instead would allow for this.
With that being said, perhaps there are other consumers of the TSDB where exposing this lowest timestamp metric directly would still be useful.
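As a rough illustration of the conversion discussed here, assuming the caller (for example Prometheus) appends millisecond timestamps: the helper name `millisToSeconds` is hypothetical, and the TSDB itself makes no such guarantee about units.

```go
package main

import "fmt"

// millisToSeconds converts a millisecond timestamp to seconds as a float64,
// the form a gauge with a _seconds suffix would be expected to expose.
// It is only meaningful if the appended timestamps really are milliseconds.
func millisToSeconds(ms int64) float64 {
	return float64(ms) / 1000
}

func main() {
	// A hypothetical lowest timestamp, in milliseconds since the Unix epoch.
	lowest := int64(1500000000123)
	fmt.Printf("%.3f\n", millisToSeconds(lowest))
}
```

A value converted this way can be compared directly against second-based values such as the result of time(), which is the convenience the comment above is after.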
Signed-off-by: Bob Shannon <[email protected]>
👍
@simonpasquier, @gouthamve so are we finally adding this?
👍
Sorry to bump an old thread, but since this does not have units on it, the timestamp seems to be in milliseconds. Is that correct?
Closes prometheus/prometheus#2988