awsvpc task / agent 1.29.1 / eni - ecs failed to start task no errors in console #2193
Comments
Hi. Also, from the trace you gave, it looks like the error might be caused by the agent not initializing a field upon restart. Did you happen to restart the instance or the agent before getting this failure?
I did restart the agent before receiving that error, but the ECS task wasn't working, so I googled it and found this thread. I basically changed the log level to debug, and the error occurred upon task registration. Note: I upgraded my VMs to ECS Agent 1.30 and the error no longer occurs.
I think we do have a bug, even in 1.30, though it does not affect the normal use case. It only affects an edge case where you restart the agent right after launching an awsvpc task (in that case the agent might crash like you saw before). We will work on fixing the edge case.
Hello, here is some interesting behavior using 1.30.0. Our setup kickstarts an EC2 instance with ecs-agent and everything is functional. Somehow, if we go on the host and restart the ecs container using The ecs agent is now able to provision the ecs tasks? I believe this might only happen on an "awsvpc trunk enabled" VM. On a non-awsvpc-trunk-enabled VM we didn't have the issue. Also, using awsvpc trunk we see some weird packet drop behavior (but it might be due to our custom iptables rules: connections get accepted but then hang, or packets get dropped... strange...)
Can you collect the logs on the instance (e.g. https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-logs-collector.html) and send them to fenxiong AT amazon.com so that I can take a look? Thanks.
Does that only happen when you turn on awsvpc trunking? Do you have an easy way to reproduce the issue?
So, I've tested the 1.32 patch. It seems there are still errors after a while... not sure why. A few days ago I reprovisioned some instances to get to 1.32.0 (using ECS AMI V1, not the ECS-optimized Amazon Linux 2 AMI). I was not able to place an awsvpc task, so I stopped the ecs-agent and restarted it (with debug logging on). I had to kill the task / get it created. Somehow it couldn't re-provision the task on that node... Here is the log: Hopefully that helps with the investigation...
The logs show that the agent is not able to launch the task because it's not able to find the pause container image
I am not certain how you ended up in that situation (i.e. able to place a task on an instance but no pause container image on it). This image should be loaded into Docker by the agent when the agent starts. Did you by any chance manually remove the
Hmmmm, good catch, I looked a bit too fast at the log file... We have a cronjob that deletes unused images... (I know ECS does that too, but we have other running processes that may generate images outside of ECS). Does the docker image pull of amazon/amazon-ecs-pause:0.1.0 only happen during the ecs-agent initialization?
OK, yes, from what I saw, the pause image is only added / extracted at ecs initialization...
Yes, that's right. The agent doesn't pull this image from anywhere. Instead, it loads this image into Docker (from a tar file bundled within the agent image itself) when it starts, and assumes that this image exists when running an awsvpc task. So you might want to modify your cronjob to skip deleting this image in this case.
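Since the agent assumes the pause image is already present, one way to spot this situation early is to check the local image list before restarting anything. This is only a rough sketch, not something the agent provides: `image_listed` and `local_images` are hypothetical helper names, the image tag is the one quoted earlier in this thread, and it assumes the Docker CLI is on the PATH.

```python
import subprocess

# Tag quoted earlier in this thread; other agent versions may use a different tag.
PAUSE_IMAGE = "amazon/amazon-ecs-pause:0.1.0"


def image_listed(listing: str, image: str = PAUSE_IMAGE) -> bool:
    """Return True if `image` appears as its own line in a repo:tag listing."""
    return image in {line.strip() for line in listing.splitlines()}


def local_images() -> str:
    """Return local images as repo:tag lines, one per line, via the Docker CLI."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a host, `image_listed(local_images())` returning `False` would suggest the image was removed after the agent started, and restarting the agent should load it again from its bundled tar file.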
Indeed, I'll add an exception for any amazon/ containers 👍 I guess we can close this bug.
Sounds good 👍 closing this
I had commented in ecs-cni, but maybe it should have been here?
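For anyone with a similar cleanup cronjob, the exception could look roughly like the sketch below. It assumes the job already has a list of unused images as repo:tag strings. `images_to_delete` and `PROTECTED_PREFIXES` are hypothetical names for illustration, not part of ECS or Docker.

```python
# Repository prefixes the cleanup job should never delete (assumption: the
# pause image lives under "amazon/", as in the tag mentioned in this thread).
PROTECTED_PREFIXES = ("amazon/",)


def images_to_delete(unused_images, protected_prefixes=PROTECTED_PREFIXES):
    """Return the unused images that are safe to remove.

    Any image whose name starts with a protected prefix is kept, so the
    pause image loaded by the agent survives the cleanup.
    """
    return [
        image
        for image in unused_images
        if not image.startswith(tuple(protected_prefixes))
    ]
```

For example, given `["amazon/amazon-ecs-pause:0.1.0", "myapp:old"]`, only `myapp:old` would be passed on to the deletion step.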
aws/amazon-ecs-cni-plugins#93
Observed Behavior
ECS task failed to start without any error message (awsvpc trunking wasn't enabled on the VM)
ecs agent version 1.29.1