rabbit_osiris_metrics: Use committed_offset counter#15390
rabbit_osiris_metrics: Use committed_offset counter#15390the-mikedavis wants to merge 1 commit intomainfrom
Conversation
This change uses the committed offset as the last offset in the calculation of available offsets rather than the last-written offset. Stream readers can't access tail data until it becomes committed so this is a more accurate count for "ready" messages.
| Data = osiris_counters:overview(), | ||
| _ = maps:map( | ||
| fun ({osiris_writer, QName}, #{offset := Offs, | ||
| fun ({osiris_writer, QName}, #{committed_offset := COffs, |
There was a problem hiding this comment.
Do we need to check the feature flag and use it to decide whether we should use offset or committed_offset?
There was a problem hiding this comment.
If those values can come from other nodes, we do. If they are entirely local state, maybe not.
There was a problem hiding this comment.
It should not crash as committed_offset is always available, but the stats will be slightly inaccurate before the changes in #15225 are enforced with the 4.3 feature flag. committed_offset is the offset of the last committed chunk pre-15225, so it is behind by the number of messages in the chunk compared to the actual last offset.
This should fix itself when the upgrade is done and after the streams restart, so I guess it is reasonable considering it is just stats.
WDYT @kjnilsson ?
There was a problem hiding this comment.
yeah it is mostly informational at this point so shouldn't matter, let's not bother the ff subsystem if we don't have to.
|
I looked into the tests and I think this makes some cases where we check message count flakier. I didn't see a specific test failing reliably - it made a few cases more likely to fail. I will try to give this another look at some point but there's no reason to rush it into 4.3.0 |
This change uses the committed offset as the last offset in the calculation of available offsets rather than the last-written offset. Stream readers can't access tail data until it becomes committed so this is a more accurate count for "ready" messages.
This is meant to extend #15225 so it's for v4.3.x only.