Log collection on Windows node causes EKS pods to be stuck Terminating #11483

Open
backjo opened this issue Mar 28, 2022 · 0 comments


backjo commented Mar 28, 2022

Output of the info page (if this is a bug)

(Paste the output of the info page here)

Describe what happened:
When a pod is deleted on a Windows node, its container fails to terminate. From research in this area, it seems that a specific Windows condition needs to be handled: files in the "DeletePending" state. Without that handling, the Datadog Agent keeps file handles open on the log files of the containers being deleted, which prevents Docker from destroying the containers correctly.

See other log collectors that have had to implement fixes for this:
fluent/fluentd#3457
microsoft/Windows-Containers#106
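
To illustrate the handle-holding problem described above, here is a minimal Python sketch of a tailer that does not keep a log file open between polls. It is not the Datadog Agent's implementation and not the fix adopted by the projects linked above. The function name `poll_once` and the offset-based design are hypothetical, chosen only for illustration. It also does not query the Windows "DeletePending" state directly. It only shortens the time a handle is held, so a deletion can still collide with a read that is in progress.

```python
import os
from typing import List, Optional, Tuple


def poll_once(path: str, offset: int) -> Optional[Tuple[List[bytes], int]]:
    """Read complete lines appended to `path` since `offset`, then close the file.

    Returns (lines, new_offset), or None when the file can no longer be
    opened, which the caller should treat as "stop tailing this file".
    """
    try:
        # The handle only lives for the duration of this `with` block.
        with open(path, "rb") as f:
            size = os.fstat(f.fileno()).st_size
            if size < offset:
                offset = 0  # file was truncated or replaced: start over
            f.seek(offset)
            data = f.read()
    except (FileNotFoundError, PermissionError):
        # Assumption: a file that is gone, or that Windows refuses to open
        # (which may be the case while a delete is pending), is treated as
        # deleted in this sketch.
        return None

    # Consume only up to the last newline so a partial line is re-read later.
    end = data.rfind(b"\n") + 1
    lines = data[:end].splitlines()
    return lines, offset + end
```

A caller would invoke `poll_once` on a timer, carry the returned offset into the next call, and drop the file from its watch list when `None` comes back. The trade-off is extra open and close calls per poll in exchange for never holding a long-lived handle on a container's log file.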

Describe what you expected:
Pods should not be stuck terminating.

Steps to reproduce the issue:
Run Windows pods in an EKS environment (Docker as the container runtime) with standard log collection configured (via the Docker log files written to C:/ProgramData/docker/containers). After some volume of logs has been generated, attempt to terminate the pod.

An event with an error message like the following is then recorded on the pod:

RemoveContainer "[ID]" from runtime service failed: rpc error: code = Unknown desc = failed to remove container "[ID]": Error response from daemon: unable to remove filesystem for [ID]: CreateFile C:\ProgramData\docker\containers[ID][ID]-json.log.4: Access is denied.

Additional environment details (Operating System, Cloud provider, etc):
EKS, Windows Server 2019 Core AMIs
