Distributor Warning: removing ingester failing healthcheck #3028

@zhuyanxi

Here is the situation:

When an ingester pod is evicted, the distributor logs a warning:

level=warn ts=2020-08-13T06:29:31.836520675Z caller=pool.go:182 msg="removing ingester failing healthcheck" addr=10.42.7.94:9095 reason="rpc error: code = DeadlineExceeded desc = latest connection error: connection error: desc = \"transport: Error while dialing dial tcp 10.42.7.94:9095: connect: connection refused\""

I think this is because the ingester pod has already been evicted, so its IP address no longer exists. And if more than half of all ingesters are evicted, the distributor logs an error:

level=error ts=2020-08-13T06:44:59.834662076Z caller=pool.go:161 msg="error removing stale clients" err="too many failed ingesters"
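
To make the threshold I am describing concrete, here is a minimal sketch in Go of the behaviour as I observe it. It is only an assumption drawn from the logs above, not code taken from the distributor: the helper `checkIngesters` and its "more than half" rule are hypothetical, and the real check may depend on other settings such as the replication factor.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooManyFailed reuses the message seen in the distributor log.
// The check below is an assumption based on observed behaviour only.
var errTooManyFailed = errors.New("too many failed ingesters")

// checkIngesters is a hypothetical helper: it returns an error once
// more than half of all ingesters are unavailable (e.g. evicted).
func checkIngesters(total, failed int) error {
	if failed*2 > total {
		return errTooManyFailed
	}
	return nil
}

func main() {
	fmt.Println(checkIngesters(5, 2)) // 2 of 5 evicted: only warnings expected
	fmt.Println(checkIngesters(5, 3)) // 3 of 5 evicted: the error case
}
```

With 5 ingesters, evicting 2 only produces the warnings, while evicting 3 matches the case where I see the error.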

So what can I do to deal with this problem?

Thanks.
