-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support metrics-based workload discovery #59
Comments
In our team we would love to see this option as well. We have played around with the newly introduced Our setup contains a central monitoring cluster where we would like to utilize krr to produce recommendations for the different clusters connected to this central monitoring solution. A metric-based workload discovery would allow us not to require Kubernetes API access to every downstream cluster that is reporting metrics to the central solution. As shown above the workload information should be extractable via the stored metrics with reasonable effort. |
You can probably try my fork branch, https://github.com/lujiajing1126/krr/tree/support-prom-discovery. We've already run this branch several times to provide resource recommendations for our production cluster. (maybe you need some tweak to the formatter) Just $ python krr.py simple -n <namespace> -p <prometheus-url> is enough. |
Hey @lujiajing1126, thank you for your work and pointing us to your fork. The results so far look verify promising. We were able to generate recommendations running against our central monitoring cluster without problems so far. Are you planning to contribute your changes back to krr? If you need any input or further testing which we can help with feel free to ping me! |
Thanks for your feedback! Sure, I am glad to contribute this feature back to the upstream. Since the branch contains several patches except the solution to this issue,
I would like to split it into several Pull Requests. But so far I did not receive any code review or comments. It seems maintainers are not actively considering contributions from the community. |
@lujiajing1126 hey, we appreciate the contribution and will review soon. I'm sorry about the delay. We're a small team, so it takes sometimes takes more time than we like. |
@lujiajing1126 Hey! I really like your idea and approach, and I am currently refactoring the way KRR is working with prometheus. But there is one thing that came to my mind about removing the usage of cluster API: we have an auto-discovery feature, that makes life easy for lots of people not making them to search for the link. But we can leave that I think, it will just not work when you don't have access, making you to provide the uri. But there is one more thing I am thinking about: Will it be possible to get data about CRDs using your approach, WDYT? (#65) |
@Avi-Robusta you are working with centralized monitoring, could you also join the discussion? |
Yes. I think so. We can generally provide two ways of workload discovery,
I am not familiar with ArgoCD. We need some investigation. But I think it is prevalent: advanced Kubernetes workload needs their own controller, e.g. And I would like to contribute my branch back to the upstream. (probably I need some more work to polish the code. I guess I could finish next week) Again, thanks to your team for the hardworking to bring us this great product! |
We agree that making it an optional behaviour is a good choise. Maybe by adding |
Hey all, just to update we have an updated PR with a new mode where KRR can operate only based on Prometheus data. So this will be supported in the near future. Please help by testing the PR! |
Is your feature request related to a problem? Please describe.
In some cases, we want to run the KRR program locally. But for the security consideration, the API server of the Kubernetes cluster cannot be accessed outside of the cluster.
So we can use the Prometheus-based workload discovery if
kube-state-metrics
is installed.Describe the solution you'd like
We can do the workload based discovery with the following steps,
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.
The text was updated successfully, but these errors were encountered: