add class ClusterReplicationCollector #166

themoriarti
Contributor

This class should collect the replication status of each LXC container and VM on every server in the cluster.

Member

znerol commented Oct 1, 2023

Thanks for taking the time to file a PR.

Unfortunately scraping /nodes has turned out to be inherently inefficient (see #55 and #58). Especially for big and growing deployments, this can get nasty quite quickly.

This PR is using the exact same known-to-be-faulty mechanism to collect the desired data. To make matters worse, the problematic loop would be running twice after that PR landed and as a result the time to collect all metrics will double for many users.

See the comments in #115 for alternative ideas on how to scrape config efficiently.

Also note that we might scrape /nodes in a different manner after #164 landed.
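To make the cost concern above concrete, here is a minimal sketch of why repeating a per-node loop in a second collector doubles the number of API requests. Everything in it is an assumption for illustration: `FakeProxmoxAPI` is a stand-in that only counts requests, and the sub-paths `first` and `second` are placeholders, not the exporter's real endpoints.

```python
from collections import Counter


class FakeProxmoxAPI:
    """Hypothetical stand-in for an API client; it only counts requests."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        self.requests = Counter()

    def get(self, path):
        self.requests[path] += 1
        if path == "/nodes":
            return [{"node": name} for name in self.node_names]
        # Any per-node path returns an empty payload in this sketch.
        return []


def collect_per_node(api, subpath):
    """One request for the node list plus one request per node."""
    results = {}
    for node in api.get("/nodes"):
        name = node["node"]
        results[name] = api.get(f"/nodes/{name}/{subpath}")
    return results


api = FakeProxmoxAPI(f"pve{i}" for i in range(1, 6))

collect_per_node(api, "first")   # placeholder for the existing per-node loop
after_one = sum(api.requests.values())

collect_per_node(api, "second")  # placeholder for a second collector repeating the loop
after_two = sum(api.requests.values())

print(after_one, after_two)
```

With five nodes, the first pass issues six requests and the second pass issues six more, so the total grows linearly with the cluster size for every collector that repeats the loop.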

@themoriarti
Contributor Author

> Thanks for taking the time to file a PR.
>
> Unfortunately scraping /nodes has turned out to be inherently inefficient (see #55 and #58). Especially for big and growing deployments, this can get nasty quite quickly.
>
> This PR is using the exact same known-to-be-faulty mechanism to collect the desired data. To make matters worse, the problematic loop would be running twice after that PR landed and as a result the time to collect all metrics will double for many users.
>
> See the comments in #115 for alternative ideas on how to scrape config efficiently.
>
> Also note that we might scrape /nodes in a different manner after #164 landed.

Yes, it really simplifies the process and speeds it up. It also makes it possible to install the exporter on each Proxmox server in the cluster and collect metrics from each server separately, although some cluster metrics will then be duplicated. I have changed how the status of replication tasks is collected.

Member

znerol commented Oct 9, 2023

> but some metrics of the cluster will be duplicated. I have made changes to the process of collecting replication tasks status.

Not sure what you mean by duplicated metrics. If you have specific feedback on the refactoring, then please comment over there: #164

Member

znerol commented Nov 5, 2023

I released 3.0.0 and also merged #198. This PR now needs a little refactoring for the new file layout.

znerol added a commit that referenced this pull request Apr 27, 2024
Add replication metrics as requested in issue #112.

* Replication metrics are fetched per node
* The metrics can be enabled or disabled

Based on the original PR #166, adapted to the new file structure.

---------

Signed-off-by: Sven Gerber <[email protected]>
Co-authored-by: znerol <[email protected]>
Co-authored-by: Marian Koreniuk <[email protected]>
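The two points in the commit message above (per-node fetching and an on/off switch) could look roughly like the following sketch. It is not the code from the referenced commit: `CollectorToggles`, `fetch_replication_for_node`, the metric name `replication_fail_count`, and the canned data are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CollectorToggles:
    """Hypothetical on/off switches; the real option names are not shown in this thread."""

    replication: bool = True


def fetch_replication_for_node(node):
    """Placeholder for a per-node API call; returns canned data in this sketch."""
    return [{"guest": 100, "fail_count": 0}]


def collect(node, toggles):
    """Return (metric_name, labels, value) samples for a single node."""
    samples = []
    if toggles.replication:
        for job in fetch_replication_for_node(node):
            labels = {"node": node, "guest": str(job["guest"])}
            samples.append(("replication_fail_count", labels, job["fail_count"]))
    return samples


enabled = collect("pve1", CollectorToggles(replication=True))
disabled = collect("pve1", CollectorToggles(replication=False))
print(len(enabled), len(disabled))
```

When the switch is off, the per-node fetch is skipped entirely, so users who do not need replication metrics pay no extra request cost.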
Member

znerol commented Apr 27, 2024

Closing in favor of #243 (also credited @themoriarti over there for the original work).

znerol closed this Apr 27, 2024