feat: add cloudnative-pg-mixin #1469

arikgrahl · 2025-08-01T12:37:19Z

As already described in the README.md:

A monitoring mixin for CloudNativePG, providing Grafana dashboards and Prometheus alerting rules for PostgreSQL clusters running on Kubernetes.

Dashboards

This mixin bundles the Grafana dashboard provided by CloudNativePG.

Prometheus Alerts

This mixin bundles the sample Prometheus Alert rules provided by CloudNativePG.

LongRunningTransaction: A query is taking longer than 5 minutes.

BackendsWaiting: A backend has been waiting for longer than 5 minutes

PGDatabase: Number of transactions from the frozen XID to the current one

PGReplication: The standby is lagging behind the primary

LastFailedArchiveTime: Checks the last time archiving failed. Will be < 0 when it has not failed.

DatabaseDeadlockConflicts: Checks the number of database conflicts

ReplicaFailingReplication: Checks if the replica is failing to replicate
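
To illustrate the shape of a threshold-style rule such as LongRunningTransaction, here is a minimal Python sketch. The 300-second threshold mirrors the "longer than 5 minutes" description above; the metric name `cnpg_backends_max_tx_duration_seconds`, the `for` duration, the severity label, and the `ThresholdRule` helper are assumptions for illustration and should be checked against the bundled rule files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdRule:
    """Illustrative stand-in for a Prometheus alerting rule (hypothetical type)."""

    alert: str
    metric: str              # assumed metric name, verify against the bundled rules
    threshold_seconds: float
    hold_for: str = "1m"     # assumed 'for' duration

    def expr(self) -> str:
        # PromQL comparison: the alert condition holds while the metric exceeds the threshold.
        return f"{self.metric} > {self.threshold_seconds:g}"

    def to_rule(self) -> dict:
        # Shape of one entry in a Prometheus rule group's 'rules' list.
        return {
            "alert": self.alert,
            "expr": self.expr(),
            "for": self.hold_for,
            "labels": {"severity": "warning"},  # assumed severity
        }

    def breaches(self, sample: float) -> bool:
        # Local check mirroring the comparison in expr().
        return sample > self.threshold_seconds


long_running = ThresholdRule(
    alert="LongRunningTransaction",
    metric="cnpg_backends_max_tx_duration_seconds",  # assumption
    threshold_seconds=5 * 60,
)
```

Calling `long_running.to_rule()` yields a dictionary that could be serialized into a rule group; the other alerts in the list would follow the same pattern with their own expressions.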

v-zhuravlev · 2025-08-06T07:40:09Z

Hi! Thanks for contributing. Is this a copy of the CloudNativePG dashboards and alerts? Would it be more practical to repackage their alerts and dashboards as a mixin in the CloudNativePG repo instead, so contributors and users can discover it more easily?

arikgrahl · 2025-08-10T17:00:04Z

Generally speaking, this approach makes sense.
I could try submitting a repackaging of the resources as a mixin in the repository, which is where the dashboards and alerts originate.

However, I'm not very optimistic that such a contribution would be accepted, as the repository appears to be limited to Helm charts:

This repository contains the Grafana Dashboards distributed as Helm Charts so they can packaged as a dependency to other projects.

Additionally, I’ve noticed that several mixins within this project have counterparts in their respective upstream repositories.
For example: github.com/ceph/ceph/…/ceph-cluster.json vs. github.com/grafana/jsonnet-libs/…/ceph-cluster.json

For this reason, I thought it might make sense to include the mixins here.
However, there may be subtle differences that I am not currently aware of.

Dasomeone · 2025-08-11T11:49:27Z

Hi @arikgrahl
Generally speaking we'd prefer to keep them together just for consistency as changes are made, but as you rightly pointed out that's not always the case, and we can (and will) absolutely accept the contribution if they turn it down!

Let's give it a shot and see what they say, otherwise happy to add it here for everyone. Perhaps even if they turn down the contribution a compromise can be reached with linking between the two?

Dasomeone · 2025-10-23T12:42:41Z

Hi @arikgrahl, any updates on this PR? Is this something you were able to contribute upstream or would you still like to have it live here?

aalhour

LGTM, just a couple of questions since I am not familiar with CNPG:

Does the operator expose more metrics from PG itself, or is that left to other projects, e.g. Prometheus and the like?
If the answer is yes, would you want to add more health metrics, such as commit latencies, slow queries, etc.?

arikgrahl · 2025-10-27T16:02:51Z

@Dasomeone, thanks for checking in and for all the feedback!
Unfortunately, I haven’t had the bandwidth to push this upstream so far, but I’d definitely still like the contribution to be made available, whether here or upstream. Since I can’t commit to a timeline for pursuing the upstream route, I’d be happy to have it live here for now if that works.

@aalhour, thank you for the review and your questions!
CloudNativePG exposes a fairly extensive set of metrics for the PostgreSQL databases it manages.
These include:

- CPU/memory usage
- session state (active vs. idle)
- transactions
  - committed vs. rolled back
  - longest transaction
- deadlocks
- blocked queries
- storage (PGData/WAL)
  - volume space usage
  - volume inode usage
- tuple I/O (deleted/inserted/fetched/returned/updated)
- block I/O (hit vs. read)
- database size
- WAL
  - segment archive status (ready vs. done)
  - archiver status (archived vs. failed)
  - last archive age
  - WAL count
- replication
  - replication lag
  - write lag
  - flush lag
  - replay lag

I hope this answers your question, but please let me know if you’d like more detail or have a specific definition of “health metrics” for PostgreSQL in mind.
Commit latencies should be covered by the existing metrics, but I’m not certain if there’s a direct metric for slow queries.
If there is a specific metric or dashboard you'd like added, let me know and I can look into including it!

Dasomeone · 2025-10-28T14:30:16Z

@arikgrahl Like I said originally, we'd prefer an attempt is made to commit this upstream so it lives closer to the codebase at first, but if they're unwilling to accept it we can absolutely store it here. Just hesitant to have it live here from the get-go.
Do let me know if you're able to upstream it or not :)
Maybe we can revisit in a few weeks?

arikgrahl requested a review from a team as a code owner August 1, 2025 12:37

arikgrahl marked this pull request as draft August 1, 2025 12:51

feat: add cloudnative-pg-mixin

8bd7aa7

arikgrahl force-pushed the cloudnative-pg-mixin branch from 5052974 to 8bd7aa7 Compare August 5, 2025 13:13

arikgrahl marked this pull request as ready for review August 5, 2025 13:15

Dasomeone added the monitoring-mixins label Aug 21, 2025

aalhour reviewed Oct 27, 2025
