-
Notifications
You must be signed in to change notification settings - Fork 619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Temporary fix for #1127 #1227
Temporary fix for #1127 #1227
Conversation
this addresses #1127 |
agent/config/config.go
Outdated
@@ -346,6 +346,7 @@ func environmentConfig() (Config, error) { | |||
} | |||
containerMetadataEnabled := utils.ParseBool(os.Getenv("ECS_ENABLE_CONTAINER_METADATA"), false) | |||
dataDirOnHost := os.Getenv("ECS_HOST_DATA_DIR") | |||
cpuUnboundedTasksEnabled := utils.ParseBool(os.Getenv("ECS_ENABLE_CPU_UNBOUNDED_TASKS"), false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this into the Windows-only codepath since this doesn't apply to Linux?
Can we name this in a way that is Windows-specific? ECS_ENABLE_CPU_UNBOUNDED_WINDOWS
Can we name this in a way to set expectations that we might remove it in the future? ECS_ENABLE_CPU_UNBOUNDED_WINDOWS_WORKAROUND
agent/config/types.go
Outdated
|
||
// CpuUnboundedTasksEnabled specifies if agent can run a mix of CPU bounded and | ||
// unbounded tasks for windows | ||
CpuUnboundedTasksEnabled bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this to a Windows-only type? Can we put Windows in the name?
Also: s/Cpu/CPU/
ac5c1d5
to
cb35812
Compare
39d5e00
to
ded74dc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes look good. i have some minor nits and comments. You're missing README.md
CHANGELOG
entries for this.
agent/api/task.go
Outdated
@@ -145,6 +145,8 @@ type Task struct { | |||
|
|||
// lock is for protecting all fields in the task struct | |||
lock sync.RWMutex | |||
|
|||
cpuUnboundedWindowsEnabled bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cpuUnboundedWindowsEnabled
is quite a long name. Can you make it shorter? cpuUnboundedWindows
perhaps? Enabled
seems redundant since its already a bool
.
Also, the right thing to do here is to probably create Task
struct in task_windows.go
and make this a part of that struct. But, we can do that in a separate PR.
agent/api/task_windows.go
Outdated
@@ -69,8 +71,13 @@ func (task *Task) platformHostConfigOverride(hostConfig *docker.HostConfig) erro | |||
// Convert the CPUShares to CPUPercent | |||
hostConfig.CPUPercent = hostConfig.CPUShares * percentageFactor / int64(cpuShareScaleFactor) | |||
if hostConfig.CPUPercent == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you refactor this as if hostConfig.CPUPercent == 0 && hostConfig.CPUShares != 0
?
agent/config/config_windows.go
Outdated
@@ -94,6 +95,8 @@ func (cfg *Config) platformOverrides() { | |||
|
|||
// ensure TaskResourceLimit is disabled | |||
cfg.TaskCPUMemLimit = ExplicitlyDisabled | |||
|
|||
cfg.CPUUnboundedWindowsEnabled = utils.ParseBool(os.Getenv("ECS_ENABLE_CPU_UNBOUNDED_WINDOWS_WORKAROUND"), false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need the string WORKAROUND
in the name here? Seems unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion by @samuelkarp to name it this way was to convey this could be a temporary fix/hack, until cpu shares are fixed in windows. Moreover allowing both cpu unbounded and bounded tasks in an instance can have serious consequences like bounded tasks not getting their allotted shares and 'workaround' may warn the customers to use it with caution. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just having a flag named ECS_ENABLE_CPU_UNBOUNDED_WINDOWS
would probably indicate that this is a workaround imo. I don't have strong opinions here. The env var name does seem too long, however.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like the env var name to convey that we might remove it as a configuration option at some point (i.e., when we fix the CPU bursting behavior). I think WORKAROUND
or HACK
is helpful in pointing out that this is not an intended long-term approach.
bc323cb
to
479d19a
Compare
CHANGELOG.md
Outdated
@@ -1,5 +1,9 @@ | |||
# Changelog | |||
|
|||
## Unreleased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're going with the convention of next-version.dev
instead of the Unreleased
tag. So, this should probably be 1.17.1-dev
README.md
Outdated
@@ -177,6 +177,7 @@ additional details on each available environment variable. | |||
| `ECS_HOST_DATA_DIR` | `/var/lib/ecs` | The source directory on the host from which ECS_DATADIR is mounted. We use this to determine the source mount path for container metadata files in the case the ECS Agent is running as a container. We do not use this value in Windows because the ECS Agent is not running as container in Windows. | `/var/lib/ecs` | `Not used` | | |||
| `ECS_ENABLE_TASK_CPU_MEM_LIMIT` | `true` | Whether to enable task-level cpu and memory limits | `true` | `false` | | |||
| `ECS_CGROUP_PATH` | `/sys/fs/cgroup` | The root cgroup path that is expected by the ECS agent. This is the path that accessible from the agent mount. | `/sys/fs/cgroup` | Not applicable | | |||
| `ECS_ENABLE_CPU_UNBOUNDED_WINDOWS_HACK` | `true` | When `true`, ECS will allow CPU unbounded(CPU=`0`) tasks to run along with CPU bounded tasks in Windows. | Not applicable | `false` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not have the word HACK
anywhere in these flags. Feel free to switch back to WORKAROUND
, but I'd highly discourage using HACK
here.
agent/api/task_unix.go
Outdated
@@ -40,6 +42,89 @@ const ( | |||
bytesPerMegabyte = 1024 * 1024 | |||
) | |||
|
|||
// Task is the internal representation of a task in the ECS agent | |||
type Task struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of this code duplication is leading me to think that may be this is not the best way to do this. Because, adding a new field to the task now requires modifications in 2 different files. Also, 2 * code = 2 * bugs. How about something like this:
// task.go, no build flags
type Task struct {
... // all common fields
additionalFields
}
// task_unix.go, with the negative windows platform flag
type additionalFields struct {
// empty for now
}
// task_windows.go, with the windows build flag
type additionalFields struct {
cpuUnboundedEnabled bool
}
If that sounds reasonable, we should rewrite config.go
in a similar vein as well. I'm sure you can come with a better name for additionalFields
:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should try to use the same struct. The Task API from the end user perspective is going to be the same, and forking it to have different implementations internally may become confusing down the road for us. What would you think about simply ignoring unrelated / unimplemented fields on each platform? This is the pattern used in the Docker API, and it works pretty well.
agent/api/task_windows.go
Outdated
|
||
// cpuUnboundedEnabled determines whether a mix of unbounded and bounded CPU tasks | ||
// are allowed to run in the instance | ||
cpuUnboundedEnabled bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cpuUnbounded
is good enough i think.
479d19a
to
77d2a10
Compare
agent/config/types_windows.go
Outdated
|
||
// PlatformVariables consists of configuration variables specific to Windows | ||
type PlatformVariables struct { | ||
// CPUUnbounded specifies if agent can run a mix of CPU bounded and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: tabbing seems off here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes lgtm
This is a temporary workaround for the issue aws#1127
77d2a10
to
b3021b3
Compare
Summary
Allow a mix of CPU unbounded and bounded tasks for windows
Implementation details
Testing
make release
)go build -out amazon-ecs-agent.exe ./agent
)make test
) passgo test -timeout=25s ./agent/...
) passmake run-integ-tests
) pass.\scripts\run-integ-tests.ps1
) passmake run-functional-tests
) pass.\scripts\run-functional-tests.ps1
) passNew tests cover the changes: yes
Description for the changelog
Licensing
This contribution is under the terms of the Apache 2.0 License: yes