Set function channel to idle to prevent DNS resolution of deleted pod #14750
Motivation
When running the Kubernetes runtime and deleting a function, it is possible to observe the following error log:
This error happens because the ManagedChannel is created to target the StatefulSet pods, and the function worker deletes the pods before shutting down the ManagedChannel. Note that nothing about pod deletion triggers the gRPC client to connect to the functions. The error comes from the way gRPC handles DNS: it re-resolves the target frequently, so it keeps trying to resolve pods that no longer exist.

There are two solutions: we could shut down the managed channel before deleting the pods, or we could set the channel to idle, which prevents any new DNS resolution (as long as there are no new connections). Given that the StatefulSet or the Service could fail to get deleted, it seems simpler to set the channel to idle and then shut it down only after the function has been deleted successfully.
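The ordering described above can be sketched as follows. This is a minimal illustration, not the code from this pull request: IdleableChannel is a hypothetical stand-in that models only the enterIdle and shutdown calls of a gRPC ManagedChannel (with simplified return types), and deleteFunction and its deleteResources callback are assumed names for whatever deletes the StatefulSet and Service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

public class IdleThenShutdownSketch {

    /** Hypothetical stand-in for the two ManagedChannel calls used here. */
    interface IdleableChannel {
        void enterIdle();
        void shutdown();
    }

    /** Test double that records the order in which the calls arrive. */
    static final class RecordingChannel implements IdleableChannel {
        final List<String> calls = new ArrayList<>();

        @Override
        public void enterIdle() {
            calls.add("enterIdle");
        }

        @Override
        public void shutdown() {
            calls.add("shutdown");
        }
    }

    /**
     * Puts the channel into idle first, then runs the (assumed) resource
     * deletion, and shuts the channel down only if the deletion succeeded.
     * On failure the channel is left idle so it can still be used later.
     */
    static boolean deleteFunction(IdleableChannel channel, BooleanSupplier deleteResources) {
        channel.enterIdle();
        boolean deleted = deleteResources.getAsBoolean();
        if (deleted) {
            channel.shutdown();
        }
        return deleted;
    }

    public static void main(String[] args) {
        RecordingChannel succeeded = new RecordingChannel();
        deleteFunction(succeeded, () -> true);
        System.out.println(succeeded.calls); // prints [enterIdle, shutdown]

        RecordingChannel failed = new RecordingChannel();
        deleteFunction(failed, () -> false);
        System.out.println(failed.calls); // prints [enterIdle]
    }
}
```

Keeping the channel alive when deletion fails is the reason for preferring idle over an early shutdown: the worker still has a usable channel if the StatefulSet or Service is left behind.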
Note that enterIdle is labeled as experimental. Even so, it looks like it is implemented in the ManagedChannelImpl to do exactly what we want: stop DNS resolution for as long as the channel stays in the "idle" state. Here is the associated Javadoc
Modifications
sucess.
Verifying this change
This is a trivial change that will not affect the logic of function deletion. It just ensures graceful shutdown and avoids benign errors that might otherwise confuse users.
Does this pull request potentially affect one of the following parts:
This is a backwards compatible change.
Documentation
no-need-doc
This is an internal cleanup.