Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull retries and configurable routes for awsvpc network mode #975

Merged
merged 9 commits into from
Sep 13, 2017
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@

## UNRELEASED
* Feature - Support for provisioning Tasks with ENIs
* Enhancement - Retry failed container image pull operations.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also added another feature that add customized route in the container namespace for task using awsvpc. If I read the code correctly, this only can be configured by the config file, would it be more convenient to use environment variable. Also we may need to check the route before adding it, as the task traffic inside the container namespace can be changed by adding some route?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right; I had intended to make it configurable by environment variable as well but apparently never wrote the code. I'll do that.

I don't think we need to check the route. This is a sufficiently advanced, optional feature that is used in fairly narrow use-cases.

* Bug - Fixed a memory leak issue when submitting the task state change [#967](https://github.com/aws/amazon-ecs-agent/pull/967)

## 1.14.4
Expand Down
7 changes: 4 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -175,9 +175,10 @@ configure them as something other than the defaults.
| `ECS_IMAGE_MINIMUM_CLEANUP_AGE` | 30m | The minimum time interval between when an image is pulled and when it can be considered for automated image cleanup. | 1h | 1h |
| `ECS_NUM_IMAGES_DELETE_PER_CYCLE` | 5 | The maximum number of images to delete in a single automated image cleanup cycle. If set to less than 1, the value is ignored. | 5 | 5 |
| `ECS_INSTANCE_ATTRIBUTES` | `{"stack": "prod"}` | These attributes take effect only during initial registration. After the agent has joined an ECS cluster, use the PutAttributes API action to add additional attributes. For more information, see [Amazon ECS Container Agent Configuration](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-config.html) in the Amazon ECS Developer Guide.| `{}` | `{}` |
| `ECS_ENABLE_TASK_ENI` | `false` | Whether to enable task networking for task to be launched with its own network interface | `false` | `false` |
| `ECS_CNI_PLUGINS_PATH` | `/ecs/cni` | The path where the cni binary file is located | `/amazon-ecs-cni-plugins` | `/amazon-ecs-cni-plugins` |
| `ECS_AWSVPC_BLOCK_IMDS` | `true` | Whether to block access to [Instance Metdata](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) for Tasks started with `awsvpc` network mode | `false` | `false`|
| `ECS_ENABLE_TASK_ENI` | `false` | Whether to enable task networking for task to be launched with its own network interface | `false` | Not applicable |
| `ECS_CNI_PLUGINS_PATH` | `/ecs/cni` | The path where the cni binary file is located | `/amazon-ecs-cni-plugins` | Not applicable |
| `ECS_AWSVPC_BLOCK_IMDS` | `true` | Whether to block access to [Instance Metdata](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html) for Tasks started with `awsvpc` network mode | `false` | Not applicable |
| `ECS_AWSVPC_ADDITIONAL_LOCAL_ROUTES` | `["10.0.15.0/24"]` | In `awsvpc` network mode, traffic to these prefixes will be routed via the host bridge instead of the task ENI | `[]` | Not applicable |

### Persistence

Expand Down
2 changes: 1 addition & 1 deletion agent/acs/model/ecsacs/api.go
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
// Copyright 2014-2016 Amazon.com, Inc. or its affiliates. All Rights Reserved.
// Copyright 2014-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"). You may
// not use this file except in compliance with the License. A copy of the
Expand Down
3 changes: 2 additions & 1 deletion agent/api/errors.go
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,8 @@ type NamedError interface {
ErrorName() string
}

// NamedError is a wrapper type for 'error' which adds an optional name and provides a symetric marshal/unmarshal
// DefaultNamedError is a wrapper type for 'error' which adds an optional name and provides a symmetric
// marshal/unmarshal
type DefaultNamedError struct {
Err string `json:"error"`
Name string `json:"name"`
Expand Down
34 changes: 24 additions & 10 deletions agent/config/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,7 @@ import (
"github.com/aws/amazon-ecs-agent/agent/engine/dockerclient"
"github.com/aws/amazon-ecs-agent/agent/utils"
"github.com/cihub/seelog"
cnitypes "github.com/containernetworking/cni/pkg/types"
)

const (
Expand Down Expand Up @@ -218,7 +219,8 @@ func fileConfig() (Config, error) {

err = json.Unmarshal(data, &cfg)
if err != nil {
seelog.Errorf("Error reading cfg json data, err %v", err)
seelog.Criticalf("Error reading cfg json data, err %v", err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth to logging the string(data) here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not a bad idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After further thought, I'm going to leave this out for now. I'm concerned about the case where a customer adds credentials (EngineAuthData) to the file and then we have a parse error; that would cause credentials to be written out to the log.

return cfg, err
}

// Handle any deprecated keys correctly here
Expand Down Expand Up @@ -293,11 +295,12 @@ func environmentConfig() (Config, error) {
}

taskCleanupWaitDuration := parseEnvVariableDuration("ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION")

availableLoggingDriversEnv := os.Getenv("ECS_AVAILABLE_LOGGING_DRIVERS")
loggingDriverDecoder := json.NewDecoder(strings.NewReader(availableLoggingDriversEnv))
var availableLoggingDrivers []dockerclient.LoggingDriver
err = loggingDriverDecoder.Decode(&availableLoggingDrivers)
// EOF means the string was blank as opposed to UnexepctedEof which means an
// EOF means the string was blank as opposed to UnexpectedEof which means an
// invalid parse
// Blank is not a warning; we have sane defaults
if err != io.EOF && err != nil {
Expand Down Expand Up @@ -327,20 +330,30 @@ func environmentConfig() (Config, error) {
cniPluginsPath := os.Getenv("ECS_CNI_PLUGINS_PATH")
awsVPCBlockInstanceMetadata := utils.ParseBool(os.Getenv("ECS_AWSVPC_BLOCK_IMDS"), false)

instanceAttributesEnv := os.Getenv("ECS_INSTANCE_ATTRIBUTES")
attributeDecoder := json.NewDecoder(strings.NewReader(instanceAttributesEnv))
var instanceAttributes map[string]string

err = attributeDecoder.Decode(&instanceAttributes)
if err != io.EOF && err != nil {
err := fmt.Errorf("Invalid format for ECS_INSTANCE_ATTRIBUTES. Expected a json hash")
seelog.Warn(err)
errs = append(errs, err)
instanceAttributesEnv := os.Getenv("ECS_INSTANCE_ATTRIBUTES")
err = json.Unmarshal([]byte(instanceAttributesEnv), &instanceAttributes)
if instanceAttributesEnv != "" {
if err != nil {
wrappedErr := fmt.Errorf("Invalid format for ECS_INSTANCE_ATTRIBUTES. Expected a json hash: %v", err)
seelog.Error(wrappedErr)
errs = append(errs, wrappedErr)
}
}
for attributeKey, attributeValue := range instanceAttributes {
seelog.Debugf("Setting instance attribute %v: %v", attributeKey, attributeValue)
}

var additionalLocalRoutes []cnitypes.IPNet
additionalLocalRoutesEnv := os.Getenv("ECS_AWSVPC_ADDITIONAL_LOCAL_ROUTES")
if additionalLocalRoutesEnv != "" {
err := json.Unmarshal([]byte(additionalLocalRoutesEnv), &additionalLocalRoutes)
if err != nil {
seelog.Errorf("Invalid format for ECS_AWSVPC_ADDITIONAL_LOCAL_ROUTES, expected a json array of CIDRs: %v", err)
errs = append(errs, err)
}
}

if len(errs) > 0 {
err = utils.NewMultiError(errs...)
} else {
Expand Down Expand Up @@ -379,6 +392,7 @@ func environmentConfig() (Config, error) {
InstanceAttributes: instanceAttributes,
CNIPluginsPath: cniPluginsPath,
AWSVPCBlockInstanceMetdata: awsVPCBlockInstanceMetadata,
AWSVPCAdditionalLocalRoutes: additionalLocalRoutes,
}, err
}

Expand Down
71 changes: 63 additions & 8 deletions agent/config/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
package config

import (
"encoding/json"
"errors"
"os"
"reflect"
Expand Down Expand Up @@ -80,53 +81,76 @@ func TestBrokenEC2MetadataEndpoint(t *testing.T) {
}

func TestEnvironmentConfig(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_CLUSTER", "myCluster")
defer os.Unsetenv("ECS_CLUSTER")
os.Setenv("ECS_RESERVED_PORTS_UDP", "[42,99]")
defer os.Unsetenv("ECS_RESERVED_PORTS_UDP")
os.Setenv("ECS_RESERVED_MEMORY", "20")
defer os.Unsetenv("ECS_RESERVED_MEMORY")
os.Setenv("ECS_CONTAINER_STOP_TIMEOUT", "60s")
defer os.Unsetenv("ECS_CONTAINER_STOP_TIMEOUT")
os.Setenv("ECS_AVAILABLE_LOGGING_DRIVERS", "[\""+string(dockerclient.SyslogDriver)+"\"]")
defer os.Unsetenv("ECS_AVAILABLE_LOGGING_DRIVERS")
os.Setenv("ECS_SELINUX_CAPABLE", "true")
defer os.Unsetenv("ECS_SELINUX_CAPABLE")
os.Setenv("ECS_APPARMOR_CAPABLE", "true")
defer os.Unsetenv("ECS_APPARMOR_CAPABLE")
os.Setenv("ECS_DISABLE_PRIVILEGED", "true")
defer os.Unsetenv("ECS_DISABLE_PRIVILEGED")
os.Setenv("ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION", "90s")
defer os.Unsetenv("ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION")
os.Setenv("ECS_ENABLE_TASK_IAM_ROLE", "true")
defer os.Unsetenv("ECS_ENABLE_TASK_IAM_ROLE")
os.Setenv("ECS_ENABLE_TASK_IAM_ROLE_NETWORK_HOST", "true")
defer os.Unsetenv("ECS_ENABLE_TASK_IAM_ROLE_NETWORK_HOST")
os.Setenv("ECS_DISABLE_IMAGE_CLEANUP", "true")
defer os.Unsetenv("ECS_DISABLE_IMAGE_CLEANUP")
os.Setenv("ECS_IMAGE_CLEANUP_INTERVAL", "2h")
defer os.Unsetenv("ECS_IMAGE_CLEANUP_INTERVAL")
os.Setenv("ECS_IMAGE_MINIMUM_CLEANUP_AGE", "30m")
defer os.Unsetenv("ECS_IMAGE_MINIMUM_CLEANUP_AGE")
os.Setenv("ECS_NUM_IMAGES_DELETE_PER_CYCLE", "2")
defer os.Unsetenv("ECS_NUM_IMAGES_DELETE_PER_CYCLE")
os.Setenv("ECS_INSTANCE_ATTRIBUTES", "{\"my_attribute\": \"testing\"}")
defer os.Unsetenv("ECS_INSTANCE_ATTRIBUTES")
os.Setenv("ECS_ENABLE_TASK_ENI", "true")
defer os.Unsetenv("ECS_ENABLE_TASK_ENI")
additionalLocalRoutesJSON := `["1.2.3.4/22","5.6.7.8/32"]`
os.Setenv("ECS_AWSVPC_ADDITIONAL_LOCAL_ROUTES", additionalLocalRoutesJSON)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you missed defer here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops!

defer os.Unsetenv("ECS_AWSVPC_ADDITIONAL_LOCAL_ROUTES")

conf, err := environmentConfig()
assert.Nil(t, err)
assert.NoError(t, err)
assert.Equal(t, "myCluster", conf.Cluster)
assert.Equal(t, 2, len(conf.ReservedPortsUDP))
assert.Contains(t, conf.ReservedPortsUDP, uint16(42))
assert.Contains(t, conf.ReservedPortsUDP, uint16(99))
assert.Equal(t, uint16(20), conf.ReservedMemory)

expectedDuration, _ := time.ParseDuration("60s")
assert.Equal(t, expectedDuration, conf.DockerStopTimeout)

assert.Equal(t, []dockerclient.LoggingDriver{dockerclient.SyslogDriver}, conf.AvailableLoggingDrivers)

assert.True(t, conf.PrivilegedDisabled)
assert.True(t, conf.SELinuxCapable, "Wrong value for SELinuxCapable")
assert.True(t, conf.AppArmorCapable, "Wrong value for AppArmorCapable")
assert.True(t, conf.TaskIAMRoleEnabled, "Wrong value for TaskIAMRoleEnabled")
assert.True(t, conf.TaskIAMRoleEnabledForNetworkHost, "Wrong value for TaskIAMRoleEnabledForNetworkHost")
assert.True(t, conf.ImageCleanupDisabled, "Wrong value for ImageCleanupDisabled")
assert.True(t, conf.TaskENIEnabled, "Wrong value for TaskNetwork")

assert.Equal(t, (30 * time.Minute), conf.MinimumImageDeletionAge)
assert.Equal(t, (2 * time.Hour), conf.ImageCleanupInterval)
assert.Equal(t, 30*time.Minute, conf.MinimumImageDeletionAge)
assert.Equal(t, 2*time.Hour, conf.ImageCleanupInterval)
assert.Equal(t, 2, conf.NumImagesToDeletePerCycle)
assert.Equal(t, "testing", conf.InstanceAttributes["my_attribute"])
assert.Equal(t, (90 * time.Second), conf.TaskCleanupWaitDuration)
assert.Equal(t, 90*time.Second, conf.TaskCleanupWaitDuration)
serializedAdditionalLocalRoutesJSON, err := json.Marshal(conf.AWSVPCAdditionalLocalRoutes)
assert.NoError(t, err, "should marshal additional local routes")
assert.Equal(t, additionalLocalRoutesJSON, string(serializedAdditionalLocalRoutesJSON))
}

func TestTrimWhitespace(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_CLUSTER", "default \r")
os.Setenv("ECS_ENGINE_AUTH_TYPE", "dockercfg\r")

Expand All @@ -153,6 +177,8 @@ func TestTrimWhitespace(t *testing.T) {
}

func TestConfigBoolean(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_DISABLE_METRICS", "true")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand Down Expand Up @@ -253,6 +279,8 @@ func TestValidFormatParseEnvVariableDuration(t *testing.T) {
}

func TestInvalidTaskCleanupTimeout(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION", "1s")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -267,6 +295,8 @@ func TestInvalidTaskCleanupTimeout(t *testing.T) {
}

func TestTaskCleanupTimeout(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION", "10m")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -281,6 +311,8 @@ func TestTaskCleanupTimeout(t *testing.T) {
}

func TestInvalidReservedMemory(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_RESERVED_MEMORY", "-1")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -295,6 +327,8 @@ func TestInvalidReservedMemory(t *testing.T) {
}

func TestReservedMemory(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_RESERVED_MEMORY", "1")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -309,6 +343,8 @@ func TestReservedMemory(t *testing.T) {
}

func TestTaskIAMRoleEnabled(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_ENABLE_TASK_IAM_ROLE", "true")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -321,6 +357,8 @@ func TestTaskIAMRoleEnabled(t *testing.T) {
}

func TestTaskIAMRoleForHostNetworkEnabled(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_ENABLE_TASK_IAM_ROLE_NETWORK_HOST", "true")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -333,6 +371,8 @@ func TestTaskIAMRoleForHostNetworkEnabled(t *testing.T) {
}

func TestCredentialsAuditLogFile(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
dummyLocation := "/foo/bar.log"
os.Setenv("ECS_AUDIT_LOGFILE", dummyLocation)
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
Expand All @@ -346,6 +386,8 @@ func TestCredentialsAuditLogFile(t *testing.T) {
}

func TestCredentialsAuditLogDisabled(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_AUDIT_LOGFILE_DISABLED", "true")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -358,6 +400,8 @@ func TestCredentialsAuditLogDisabled(t *testing.T) {
}

func TestImageCleanupMinimumInterval(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_IMAGE_CLEANUP_INTERVAL", "1m")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -370,6 +414,8 @@ func TestImageCleanupMinimumInterval(t *testing.T) {
}

func TestImageCleanupMinimumNumImagesToDeletePerCycle(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_NUM_IMAGES_DELETE_PER_CYCLE", "-1")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
if err != nil {
Expand All @@ -382,9 +428,18 @@ func TestImageCleanupMinimumNumImagesToDeletePerCycle(t *testing.T) {
}

func TestAWSVPCBlockInstanceMetadata(t *testing.T) {
os.Setenv("AWS_DEFAULT_REGION", "foo-bar-1")
defer os.Unsetenv("AWS_DEFAULT_REGION")
os.Setenv("ECS_AWSVPC_BLOCK_IMDS", "true")
defer os.Unsetenv("ECS_AWSVPC_BLOCK_IMDS")
cfg, err := NewConfig(ec2.NewBlackholeEC2MetadataClient())
assert.NoError(t, err)
assert.True(t, cfg.AWSVPCBlockInstanceMetdata)
}

func TestInvalidAWSVPCAdditionalLocalRoutes(t *testing.T) {
os.Setenv("ECS_AWSVPC_ADDITIONAL_LOCAL_ROUTES", `["300.300.300.300/64"]`)
defer os.Unsetenv("ECS_AWSVPC_ADDITIONAL_LOCAL_ROUTES")
_, err := environmentConfig()
assert.Error(t, err)
}
Loading