-
Notifications
You must be signed in to change notification settings - Fork 619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the issue where agent could create the same container twice #939
Conversation
agent/engine/docker_task_engine.go
Outdated
@@ -236,6 +236,7 @@ func (engine *DockerTaskEngine) synchronizeState() { | |||
log.Warn("Could not find matching container for expected", "name", cont.DockerName) | |||
} else { | |||
cont.DockerID = describedCont.ID | |||
cont.Container.SetKnownStatus(api.ContainerCreated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to look here and set the state based on what comes back. It could be created, running, or stopped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it falls into this condition dockerID==""
, the container status shouldn't be > created. Otherwise the dockerID
field won't be empty. But have a check here will definitely add some confidence.
agent/engine/docker_task_engine.go
Outdated
@@ -562,9 +563,17 @@ func (engine *DockerTaskEngine) createContainer(task *api.Task, container *api.C | |||
// we have to do this in create, not start, because docker no longer handles |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this comment about HostConfig above line 579 instead of here? It's less clear what the comment is applying to now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good. please address @samuelkarp's comment.
agent/engine/docker_task_engine.go
Outdated
for _, v := range containerMap { | ||
if v.Container.Name == container.Name { | ||
dockerContainerName = v.DockerName | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should/can add a break
here as well.
7bb0ffa
to
1bec8f9
Compare
@@ -250,18 +250,25 @@ func (state *DockerTaskEngineState) AddContainer(container *api.DockerContainer, | |||
} | |||
|
|||
if container.DockerID != "" { | |||
// Update the container id to the state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a unit test covering the if
and else if
code paths?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added.
Can you update the CHANGELOG.md also as part of this PR? |
1bec8f9
to
bd3883a
Compare
CHANGELOG.md
Outdated
@@ -1,5 +1,8 @@ | |||
# Changelog | |||
|
|||
## |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we leave this as blank? We could consider adding a tentative version string with a *
annotation to indicate that this is not yet released (Example: 1.14.5*
or 1.15.0*
). Or, we can add a placeholder release tag (Example: UNRELEASED
). What do you think?
cc @jhaynes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go with a placeholder for now, and we'll fill in the actual version as part of the release process. "UNRELEASED", "IN-PROGRESS", or "PENDING" would be my suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote "UNRELEASED"
|
||
// TestAddContainer tests first add container with docker name and | ||
// then add the container with dockerID | ||
func TestAddContainer(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename the test to be more descriptive? You can use what you have written in the comments as well. Example:TestAddContainerNameAndID
.
0a62385
to
b18e674
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address the minor comments I made before merging this.
CHANGELOG.md
Outdated
@@ -1,5 +1,8 @@ | |||
# Changelog | |||
|
|||
## UNRELEASED | |||
* Bug - Fix a race condition where a container can be created twice when agent restart. [#939](https://github.com/aws/amazon-ecs-agent/pull/939) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: restarts
agent/engine/docker_task_engine.go
Outdated
// we die before 'createContainer' returns because we can inspect by | ||
// name | ||
engine.state.AddContainer(&api.DockerContainer{DockerName: dockerContainerName, Container: container}, task) | ||
seelog.Infof("Created container name mapping for task %s - %s -> %s", task, container, dockerContainerName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change task
to task.Arn
? We don't need the whole task to be logged here.
Save the generated docker container name and when agent restart restore the agent state, when agent try to create the container it will reuse the name so that no duplicate container will be created.
b18e674
to
49c36c7
Compare
Summary
Fix the issue if agent was restarted before the docker create API returned, the agent could create duplicate containers for the task.
Implementation details
Originally the agent stores the generated container name in memory before actually calling CreateContainer API, see here. And this information was only stored in the taskToID struct, due to lacking of docker ID here. But the
taskToID
struct wasn't stored in the state file bystatemanager
, so when the agent restart it will lose this information and then it will try to create the container with another name again.This PR make sure the the generated container name can be restored from agent state file if agent restarted before the
CreateContainer
API returned. Thus ensure the container won't be created twice. Also the agent would detect the container if it was created during the period of restarting, see here.Testing
make release
)go build -out amazon-ecs-agent.exe ./agent
)make test
) passgo test -timeout=25s ./agent/...
) passmake run-integ-tests
) pass.\scripts\run-integ-tests.ps1
) passmake run-functional-tests
) pass.\scripts\run-functional-tests.ps1
) passNew tests cover the changes:
yes
Description for the changelog
Bug - Fix an issue where a container could be created twice.
Licensing
This contribution is under the terms of the Apache 2.0 License:
yes