Note
The vLLM Health Check support is currently in BETA. Its features and functionality are subject to change as we collect feedback. We are excited to hear any thoughts you have!
The vLLM backend supports checking vLLM Engine Health upon receiving each inference request. If the health check fails, the model state becomes NOT Ready at the server, which can be queried by the Repository Index or Model Ready APIs.
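For example, a client can detect a failed health check through either API. Below is a minimal sketch using the Triton Python client; the server URL and the model name `vllm_model` are placeholders for illustration, not names defined by this backend:

```python
import tritonclient.http as httpclient

# Connect to the Triton server (URL is a placeholder).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Model Ready API: reports False once a vLLM engine health check has failed.
ready = client.is_model_ready("vllm_model")
print(f"vllm_model ready: {ready}")

# Repository Index API: lists each model with its state and, if applicable,
# the reason it is not ready.
for model in client.get_model_repository_index():
    print(model["name"], model.get("state"), model.get("reason"))
```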
The Health Check is disabled by default. To enable it, set the following parameter in the model config to true:
```
parameters: {
  key: "ENABLE_VLLM_HEALTH_CHECK"
  value: { string_value: "true" }
}
```
and start the server with Model Control Mode EXPLICIT.
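With the server started in EXPLICIT mode (e.g. `tritonserver --model-control-mode=explicit`), models are loaded through the model control APIs rather than at startup. A sketch of loading the model and confirming readiness, reusing the placeholder names from the example above:

```python
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# In EXPLICIT mode, the model must be loaded explicitly before use.
client.load_model("vllm_model")

# If a later inference request trips the vLLM engine health check, the
# model flips to NOT Ready and this call returns False.
assert client.is_model_ready("vllm_model")
```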