Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for grpc health check #835

Merged
merged 1 commit into from
Nov 15, 2021
Merged

Conversation

averzicco
Copy link
Contributor

Hi, This PR is to add support for gRPC health check to blackbox exporter.
I've read the discussion on issues #665 and #603, it seems that gRPC is out of scope for blackbox_exporter, if this is still the case just close the PR.

@roidelapluie
Copy link
Member

Hello, it is in scope again, as discussed during our public dev summits. I will review this asap.

@averzicco
Copy link
Contributor Author

great, I've checked the dev summit video and realized didn't implement the logic to specify the service name for the health check. I've just pushed the code which takes care of that

Copy link
Member

@brancz brancz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really awesome that you're taking this on, been wanting this for a long time but haven't had a chance to implement it! Thank you so much!

blackbox.yml Show resolved Hide resolved
config/config.go Outdated Show resolved Hide resolved
@roidelapluie
Copy link
Member

roidelapluie commented Oct 23, 2021

Can you please add a probe_grpc_duration_seconds with phase="check" ?

@roidelapluie
Copy link
Member

I am puzzled by the fact that we can not know the difference between SERVICE_UNKNOWN, UNKNOWN, and NOT_SERVING. Would we need an additional metric for health check response status?

@averzicco
Copy link
Contributor Author

I am puzzled by the fact that we can not know the difference between SERVICE_UNKNOWN, UNKNOWN, and NOT_SERVING. Would we need an additional metric for health check response status?

what do you prefer for the metric definition?
a gauge with status label

probe_grpc_response{status="UNKNOWN"}=0
probe_grpc_response{status="SERVING"}=1
probe_grpc_response{status="NOT_SERVING"}=0
probe_grpc_response{status="SERVICE_UNKNOWN"}=0

or a gauge with status as numeric value (same as value used in gRPC healtCheck proto)

probe_grpc_response=1

@roidelapluie
Copy link
Member

roidelapluie commented Oct 24, 2021

Since the numbers are pretty well defined in the proto, I'd go for the probe_grpc_healthcheck_serving_status=1.

@averzicco
Copy link
Contributor Author

Metric probe_grpc_healthcheck_serving_status added

@roidelapluie
Copy link
Member

After checking with others, it seems it would be preferable to have

probe_grpc_healthcheck_response{serving_status="UNKNOWN"}=0
probe_grpc_healthcheck_response{serving_status="SERVING"}=1
probe_grpc_healthcheck_response{serving_status="NOT_SERVING"}=0
probe_grpc_healthcheck_response{serving_status="SERVICE_UNKNOWN"}=0

@averzicco
Copy link
Contributor Author

Changed metric from probe_grpc_healthcheck_serving_status=1 to

probe_grpc_healthcheck_response{serving_status="UNKNOWN"}=0
probe_grpc_healthcheck_response{serving_status="SERVING"}=1
probe_grpc_healthcheck_response{serving_status="NOT_SERVING"}=0
probe_grpc_healthcheck_response{serving_status="SERVICE_UNKNOWN"}=0

Copy link
Member

@roidelapluie roidelapluie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM besides name change.

prober/grpc.go Outdated Show resolved Hide resolved
Signed-off-by: Alessandro Verzicco <[email protected]>
@roidelapluie
Copy link
Member

Thanks!

@autumnw
Copy link

autumnw commented Jan 22, 2022

Appreciate to all the people who joined conversation and contribute code here! This is so awesome!
Kudos to @averzicco!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants