Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Health Check Service for Native Link #757

Closed
DolceTriade opened this issue Mar 12, 2024 · 4 comments
Closed

Health Check Service for Native Link #757

DolceTriade opened this issue Mar 12, 2024 · 4 comments

Comments

@DolceTriade
Copy link
Contributor

It would be nice if nativelink supported a basic health check for liveness checks in kubernetes/envoy/etc.

It seems Nativelink already has healthchecks for its stores. If we could propagate that into either a simple HTTP /healthz endpoint (maybe on the admin port?) or implement https://crates.io/crates/tonic-health on one of the gRPC ports, that would be ideal.

Can be configurable via services in the nativelink config.

Happy to take a stab at a PR if it seems like a desirable thing and we can agree on the mechanism.

@DolceTriade DolceTriade changed the title Health Check for Native Link Health Check Service for Native Link Mar 12, 2024
@adam-singer
Copy link
Contributor

@DolceTriade the initial /status route is being used for checking health in some context. The plan is to map the internal meaning of health to the external systems that would query for it. /healthz for kubernetes (and other systems) I assume work the same where any route could be used and what matters is status code responses (at least thats what I've gathered from documentation). There are specifications that allow for query parameters to filter out health checks by some mappings. We don't have any gRPC port checking atm, I think tonic-health would be a good starting point on that side.

Please take a stab at it or we can discuss more on details for the external to internal mappings of what health could mean.

@DolceTriade
Copy link
Contributor Author

Hm, I didn't know about the /status admin endpoint. That's probably sufficient for my use cases. we can probably close this unless the gRPC check is important.

@allada
Copy link
Member

allada commented Mar 17, 2024

Yeah, when #643 is done, it should help discoverability.

@allada
Copy link
Member

allada commented Apr 19, 2024

This should be resolved now with a few recently landed changes. Here's the last one on the config change:
ea50856

@allada allada closed this as completed Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants