V1.14.5 release #999

Merged
merged 261 commits into from
Oct 3, 2017

jhaynes
Contributor

@jhaynes jhaynes commented Sep 30, 2017

1.14.5

  • Enhancement - Retry failed container image pull operations #975
  • Enhancement - Set read and write timeouts for websocket connections #993
  • Enhancement - Add support for the SumoLogic Docker log driver plugin
    #992
  • Bug - Fixed a memory leak issue when submitting the task state change #967
  • Bug - Fixed a race condition where a container can be created twice when agent restarts. #939
  • Bug - Fixed an issue where microsoft/windowsservercore:latest was not
    pulled on Windows under certain conditions.
    #990
  • Bug - Fixed an issue where task IAM role credentials could be logged to disk. #998

aaithal and others added 30 commits May 2, 2017 13:05
1. api: Added a new container state called `ContainerResourcesProvisioned`,
   which indicates that a container has completed provisioning all of its
   resources. Non-internal containers transition to this state without doing any
   additional work. However, containers that the ECS Agent adds to a task
   may need to perform additional actions. For example, the
   "pause" container is provisioned by invoking CNI plugins
2. api: Tasks do not transition into "RUNNING" unless all the containers in the
   task have transitioned into "ResourcesProvisioned" state. The `TaskStatus()`
   method in the `ContainerStatus` type has been updated for this
3. api: Similarly, the `ContainerStatus()` method in the `TaskStatus` type,
   which maps task transitions to container transitions, has been updated to
   reflect this change
4. api, engine: All references to `api.ContainerRunning` have been replaced with
   the new `GetContainerSteadyStateStatus()` method, which returns
   `ContainerResourcesProvisioned` instead of `ContainerRunning`
5. engine/dependencygraph: `seelog` logger has replaced the module logger
6. engine/dependencygraph: `onRunIsResolved` has been renamed to
   `onProvisionResourcesIsResolved` to better reflect what the method is
   supposed to do
7. engine: `DockerTaskEngine.transitionFunctionMap()` has been updated with the
   function pointer to the new `DockerTaskEngine.provisionContainerResources()`
   method to perform the necessary actions required to update the state
Also modify `task.dockerHostConfig()` to set the network mode
field in the container's host config JSON to `container:<pause-container>`
for non-pause containers in the task
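
To illustrate the network mode rule, here is a minimal sketch. The function `networkModeFor` and its parameters are hypothetical and are not taken from the agent's source; only the `container:<pause-container>` format comes from the commit message. The sketch leaves the pause container's own mode empty because the commit message does not say what it is set to.

```go
package main

import "fmt"

// networkModeFor returns the network mode string for a container in a task.
// Both parameters are hypothetical names used only for this sketch.
func networkModeFor(isPauseContainer bool, pauseContainerName string) string {
	if isPauseContainer {
		// Assumption: the pause container's own mode is out of scope here,
		// so the sketch returns an empty string for it.
		return ""
	}
	// Every other container in the task joins the pause container's
	// network namespace.
	return "container:" + pauseContainerName
}

func main() {
	fmt.Println(networkModeFor(false, "pause-abc123"))
}
```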
* Instead of assuming that ContainerRunning is the steady state for
  all containers, add the notion of each container being able to define
  its own steady state.
* Rename task.RunDependencies to task.SteadyStateDependencies
* Add unit tests for containerstatus and task status updates
This change adds the ability for each container to specify its own steady state.
Doing so enables the Task status of 'RUNNING' to be reached when all of the
containers in the task have reached their respective steady states. The 'pause'
container can specify its steady state as 'RESOURCES_PROVISIONED', whereas
normal containers can specify their steady states as 'RUNNING'.
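
The per-container steady state idea can be sketched roughly as follows. The types, field names, and the `allAtSteadyState` helper are simplified stand-ins chosen for illustration; they are assumptions and do not mirror the agent's actual `api` package.

```go
package main

import "fmt"

// ContainerStatus is a simplified stand-in for the agent's status type.
type ContainerStatus int

const (
	ContainerCreated ContainerStatus = iota
	ContainerRunning
	ContainerResourcesProvisioned
	ContainerStopped
)

// Container carries its own steady state (hypothetical field names).
type Container struct {
	Name        string
	KnownStatus ContainerStatus
	SteadyState ContainerStatus
}

// allAtSteadyState reports whether every container in the task has reached
// its own steady state, which is the condition for the task to be RUNNING.
func allAtSteadyState(containers []Container) bool {
	if len(containers) == 0 {
		return false
	}
	for _, c := range containers {
		if c.KnownStatus != c.SteadyState {
			return false
		}
	}
	return true
}

func main() {
	task := []Container{
		{Name: "pause", KnownStatus: ContainerResourcesProvisioned, SteadyState: ContainerResourcesProvisioned},
		{Name: "web", KnownStatus: ContainerCreated, SteadyState: ContainerRunning},
	}
	fmt.Println(allAtSteadyState(task)) // "web" is not running yet
	task[1].KnownStatus = ContainerRunning
	fmt.Println(allAtSteadyState(task))
}
```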

This change also adds a new field in the 'api.Container' type, called
'InternalContainerType'. We support two types of internal containers, one for
creating empty container volumes, another for provisioning the network namespace
(via the pause container). This gets us away from using container names for
determining different types of internal containers.
The IsInternal field in api.Container is being refactored as an enum instead
(as 'api.Container.Type ContainerType'), with values that indicate what type of
container it is. It can be one of 'ContainerNormal' (indicates that this is
not an internal container, but something that was sent as part of the payload
from the communication service) or 'ContainerEmptyHostVolume' (indicates that
this is an internal container created for attaching ephemeral empty host
volumes) or 'ContainerCNIPause' (indicates that this is an internal container
created for attaching ENIs or some other custom network configuration)
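
A rough sketch of such an enum in Go is shown below. The three constant names come from the commit message; the numeric values, the `String()` output, and the `IsInternal()` helper are assumptions made for illustration.

```go
package main

import "fmt"

// ContainerType distinguishes customer containers from agent-internal ones.
type ContainerType int

const (
	// ContainerNormal is a container sent in the task payload.
	ContainerNormal ContainerType = iota
	// ContainerEmptyHostVolume backs ephemeral empty host volumes.
	ContainerEmptyHostVolume
	// ContainerCNIPause holds the network namespace for the task.
	ContainerCNIPause
)

// String returns a readable name (the exact strings are hypothetical).
func (t ContainerType) String() string {
	switch t {
	case ContainerNormal:
		return "NORMAL"
	case ContainerEmptyHostVolume:
		return "EMPTY_HOST_VOLUME"
	case ContainerCNIPause:
		return "CNI_PAUSE"
	default:
		return "UNKNOWN"
	}
}

// IsInternal is a hypothetical helper that replaces a boolean IsInternal field.
func (t ContainerType) IsInternal() bool {
	return t == ContainerEmptyHostVolume || t == ContainerCNIPause
}

func main() {
	for _, t := range []ContainerType{ContainerNormal, ContainerEmptyHostVolume, ContainerCNIPause} {
		fmt.Printf("%s internal=%v\n", t, t.IsInternal())
	}
}
```

Checking the type, rather than matching on a container name, keeps the internal/external decision in one place.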

The 'api/json.go' file has been deleted and its contents moved to files for
each respective type that has custom marshal/unmarshal methods overridden.

The version of the ECS Data file has been incremented as well.
Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
* Return watcher, error from agent initialization
* Amend log messages to provide improved context

Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
Signed-off-by: Vinothkumar Siddharth <[email protected]>
… terminal containers

* bugfix: Container transition for RUNNING -> STOPPED was deemed not actionable
  in this method. Modified it so that if the container's known status is RUNNING or
  RESOURCES_PROVISIONED, we return true for the action needed flag
* The managedTask.handleStoppedToRunningContainerTransition method has been
  refactored to get rid of chained if conditions so that it's easier to read
* The DockerTaskEngine's transition function map has been refactored as a field
  in the DockerTaskEngine. This helps in testing as we can override this map
  in our tests. Ideally we would have used an interface method. But, that's a
  bigger refactor for another day
* Added unit tests for task manager for steady state == RESOURCES_PROVISIONED
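
The testing benefit of keeping the transition map as a field can be sketched as follows. The names `DockerTaskEngine`, `transitionFunctionMap`, and `provisionContainerResources` appear in these commit messages, but the signatures, the `applyTransition` helper, and the status constants below are assumptions for illustration only.

```go
package main

import "fmt"

// ContainerStatus is a simplified stand-in for the agent's status type.
type ContainerStatus int

const (
	ContainerPulled ContainerStatus = iota
	ContainerCreated
	ContainerRunning
	ContainerResourcesProvisioned
)

// transitionFunc is a hypothetical signature for a transition handler.
type transitionFunc func(containerName string) error

// DockerTaskEngine keeps the transition map as a field so tests can swap entries.
type DockerTaskEngine struct {
	transitionFunctionMap map[ContainerStatus]transitionFunc
}

// NewDockerTaskEngine wires the default handlers into the map.
func NewDockerTaskEngine() *DockerTaskEngine {
	engine := &DockerTaskEngine{}
	engine.transitionFunctionMap = map[ContainerStatus]transitionFunc{
		ContainerResourcesProvisioned: engine.provisionContainerResources,
	}
	return engine
}

// provisionContainerResources is a placeholder; the real work is not modeled.
func (engine *DockerTaskEngine) provisionContainerResources(containerName string) error {
	return fmt.Errorf("provisioning %q is not implemented in this sketch", containerName)
}

// applyTransition looks up and runs the handler for the target status.
func (engine *DockerTaskEngine) applyTransition(containerName string, next ContainerStatus) error {
	transition, ok := engine.transitionFunctionMap[next]
	if !ok {
		return fmt.Errorf("no transition function for status %d", next)
	}
	return transition(containerName)
}

func main() {
	engine := NewDockerTaskEngine()
	var provisioned []string
	// A test can override a single entry without touching Docker or CNI.
	engine.transitionFunctionMap[ContainerResourcesProvisioned] = func(name string) error {
		provisioned = append(provisioned, name)
		return nil
	}
	err := engine.applyTransition("pause", ContainerResourcesProvisioned)
	fmt.Println(err, provisioned)
}
```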
Here is a link to minor modifications to the source:
kubernetes/kubernetes#43578

Signed-off-by: Vinothkumar Siddharth <[email protected]>
samuelkarp and others added 26 commits September 26, 2017 11:34
Transition dependencies will deprecate and replace the
SteadyStateDependencies list that exists in api.Container.  Transition
dependencies could also replace implicit link- and volume-dependencies
in the future, but that change is not yet planned.
Code handling AdditionalLocalRoutes was inadvertently removed in
10fb083.  This commit adds it back, and
adjusts the TestSetupNS unit test to explicitly check for it.
…or adding and dropping Linux capabilities

The changes include the following:
- Model changes in ContainerDefinition of the form:
linuxParameters: {
    capabilities: {
        add: [""],
        drop: [""]
    }
}
- Functional tests that verify that the specified capabilities have been added to or dropped from the task's container
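
As a rough illustration of the model shape above, the following sketch decodes a `linuxParameters` fragment into Go structs. The struct names and the example capability values (`NET_ADMIN`, `MKNOD`) are assumptions; the generated client model in `ecs_client/model` may be shaped differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// KernelCapabilities lists capabilities to add and drop (hypothetical type name).
type KernelCapabilities struct {
	Add  []string `json:"add"`
	Drop []string `json:"drop"`
}

// LinuxParameters mirrors the linuxParameters object from the model change.
type LinuxParameters struct {
	Capabilities KernelCapabilities `json:"capabilities"`
}

// parseLinuxParameters decodes a JSON fragment into LinuxParameters.
func parseLinuxParameters(raw string) (LinuxParameters, error) {
	var params LinuxParameters
	err := json.Unmarshal([]byte(raw), &params)
	return params, err
}

func main() {
	params, err := parseLinuxParameters(`{"capabilities": {"add": ["NET_ADMIN"], "drop": ["MKNOD"]}}`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(params.Capabilities.Add, params.Capabilities.Drop)
}
```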
ecs_client/model, functional_tests: updated model, functional tests for adding and dropping Linux capabilities
The wc command prefixes spaces to the output, which
corrupts the GIT_PORCELAIN variable, thus failing
the build. This change removes spaces from the output.
This commit aims to make the websocket connection management
better by implementing the following improvements:

1. Set read and write deadlines for websocket ReadMessage and
WriteMessage operations. This is to ensure that these methods
do not hang and instead return an io timeout error if there are
issues with the connection
2. Reduce the scope of the lock in the Connect() method. The
lock was being held for the length of the Connect() method, which
meant that it wouldn't be relinquished if there was any delay
in establishing the connection. The scope of the lock has now
been reduced to just accessing the cs.conn variable
3. Start ACS heartbeat timer after the connection has been
established. The timer was being started before a call to
Connect, which meant that the connection could be prematurely
terminated for being idle if there was a delay in establishing
the connection

These changes should improve the disconnection behavior of the
websocket connection, which should help with scenarios where the
Agent never reconnects to ACS because it is blocked forever in the
Disconnect() method, waiting to acquire the lock (aws#985)
Increase the websocket read and write timeouts
as per review comments
The messages that come over the websockets can potentially contain sensitive
information that shouldn't be put in logs. Separately, if reading the message
results in an error, the content of the message is irrelevant and unreliable.
@petderek petderek merged commit 0dcd02c into aws:master Oct 3, 2017