Enable metrics for windows #1077

Merged
merged 3 commits into from
Nov 16, 2017

@richardpen richardpen commented Nov 15, 2017

Summary

Fix #608

Implementation details

Change the default metrics configuration on Windows so that metrics are enabled by default.

Testing

  • Builds on Linux (make release)
  • Builds on Windows (go build -o amazon-ecs-agent.exe ./agent)
  • Unit tests on Linux (make test) pass
  • Unit tests on Windows (go test -timeout=25s ./agent/...) pass
  • Integration tests on Linux (make run-integ-tests) pass
  • Integration tests on Windows (.\scripts\run-integ-tests.ps1) pass
  • Functional tests on Linux (make run-functional-tests) pass
  • Functional tests on Windows (.\scripts\run-functional-tests.ps1) pass
  • Manual test.

New tests cover the changes:
yes

Description for the changelog

Licensing

This contribution is under the terms of the Apache 2.0 License:
yes

Contributor

@aaithal aaithal left a comment

looks good, i have a couple of minor comments.

// dockerStatsToContainerStats returns a new object of the ContainerStats object from docker stats.
func dockerStatsToContainerStats(dockerStats *docker.Stats) (*ContainerStats, error) {
	if numCores == uint64(0) {
		seelog.Error("Invalid number of cpu cores")
Contributor

Please add more context here: "Stats: invalid number of cpu cores in response"

func dockerStatsToContainerStats(dockerStats *docker.Stats) (*ContainerStats, error) {
	if numCores == uint64(0) {
		seelog.Error("Invalid number of cpu cores")
		return nil, fmt.Errorf("Invalid number of cpu cores")

	// The length of PercpuUsage represents the number of cores in an instance.
	if len(dockerStats.CPUStats.CPUUsage.PercpuUsage) == 0 {
		seelog.Debug("Invalid container statistics reported, invalid stats payload from docker")
		return nil, fmt.Errorf("Invalid container statistics reported")
Contributor

please fix the error message here as well.

)

// dockerStatsToContainerStats returns a new object of the ContainerStats object from docker stats.
func dockerStatsToContainerStats(dockerStats *docker.Stats) (*ContainerStats, error) {
Contributor

Can you add a unit test for this?

Author

It should be covered here

Contributor

it doesn't cover numCores block, right?

Author

yes, you're right, adding it.
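
One way a test can reach the `numCores == 0` branch is to keep the core count in a package-level variable that the test temporarily overrides. The sketch below assumes that arrangement; `numCores` as declared here, `perCoreUsage`, and `withNumCores` are hypothetical stand-ins for illustration, not the agent's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// numCores is a package-level variable (an assumption for this sketch) so
// that tests can override the value normally taken from the runtime.
var numCores = uint64(runtime.NumCPU())

// perCoreUsage is a hypothetical stand-in for the conversion under test.
func perCoreUsage(totalUsage uint64) (uint64, error) {
	if numCores == 0 {
		return 0, errors.New("invalid number of cpu cores acquired from the system")
	}
	return totalUsage / numCores, nil
}

// withNumCores runs fn with numCores temporarily set to n and restores the
// previous value afterwards, even if fn panics.
func withNumCores(n uint64, fn func()) {
	saved := numCores
	defer func() { numCores = saved }()
	numCores = n
	fn()
}

func main() {
	// Exercise the error branch that a real core count would never reach.
	withNumCores(0, func() {
		_, err := perCoreUsage(100)
		fmt.Println("zero cores:", err)
	})
	// Exercise the normal path with a fixed, machine-independent core count.
	withNumCores(4, func() {
		usage, err := perCoreUsage(100)
		fmt.Println("four cores:", usage, err)
	})
}
```

In a real test file the two closures would become test cases that assert on the returned error and value instead of printing them.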

@richardpen richardpen force-pushed the metrics branch 2 times, most recently from 3411046 to abe50e7 Compare November 15, 2017 21:26

// dockerStatsToContainerStats returns a new object of the ContainerStats object from docker stats.
func dockerStatsToContainerStats(dockerStats *docker.Stats) (*ContainerStats, error) {
	if numCores == uint64(0) {
Contributor

When would this happen?

Author

In theory it shouldn't happen, since the value is acquired from the runtime, but I don't know when it would happen.

Contributor

@sharanyad we should always remember that the runtime can send us weird, unexpected values (which has happened in the past). Since in this case it would lead to a panic, it's better to be protected.
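
For context on the panic: in Go, integer division by a zero-valued variable is a run-time panic rather than an error value. The following is a minimal sketch of the difference between an unguarded and a guarded division; `unguarded` and `guarded` are illustrative names, not functions from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// unguarded divides without checking the divisor. With cores == 0 the
// division panics; the deferred recover turns that panic into an error so
// this demonstration can keep running.
func unguarded(total, cores uint64) (result uint64, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	return total / cores, nil
}

// guarded checks the divisor first and returns an ordinary error instead.
func guarded(total, cores uint64) (uint64, error) {
	if cores == 0 {
		return 0, errors.New("invalid number of cpu cores")
	}
	return total / cores, nil
}

func main() {
	_, err := unguarded(8, 0)
	fmt.Println("unguarded:", err)

	_, err = guarded(8, 0)
	fmt.Println("guarded:", err)

	usage, err := guarded(8, 4)
	fmt.Println("guarded, 4 cores:", usage, err)
}
```

Without the `recover`, the unguarded version would take the whole process down, which is why an explicit check is cheap insurance even when the value "should" never be zero.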

		return nil, fmt.Errorf("Invalid container statistics reported, no cpu core usage reported")
	}

	cpuUsage := dockerStats.CPUStats.CPUUsage.TotalUsage / numCores
Contributor

Should we do a zero check for numCores here too?

Author

I will add that.

func dockerStatsToContainerStats(dockerStats *docker.Stats) (*ContainerStats, error) {
	if numCores == uint64(0) {
		seelog.Error("Invalid number of cpu cores acquired from the system")
		return nil, fmt.Errorf("invalid number of cpu cores acquired from the system")
Contributor

Can we still send memoryUsage and timestamp information since they don't require numCores ?

Author

I don't have a strong opinion here, but my suggestion is that we shouldn't send part of the metrics if something went wrong. I think sending no metrics is better than sending wrong metrics.

@@ -0,0 +1,40 @@
// +build windows,!integration
// Copyright 2014-2016 Amazon.com, Inc. or its affiliates. All Rights Reserved.
Contributor

nit: 2017

		return nil, fmt.Errorf("invalid number of cpu cores acquired from the system")
	}

	cpuUsage := dockerStats.CPUStats.CPUUsage.TotalUsage / numCores
Contributor

Can we verify that dividing by numCores is appropriate on Windows? From when I was playing around with it on a 4-core machine, docker stats showed 100% usage on Windows when using all 4 cores but showed 400% on Linux for the same workload.

Contributor

@petderek petderek Nov 15, 2017

Does '100%' mean different things on each platform?

Author

@samuelkarp confirmed in that case, the metrics will be the same as docker stats, it will show 100%.

Contributor

So do we need to divide by numCores or should we remove that?

Author

we should keep it, also this code indicates the same.

Contributor

As discussed offline: I don't have an objection to merging this as long as we close out on the right math before we release anything.

@samuelkarp samuelkarp changed the base branch from dev to windows November 16, 2017 00:47
@richardpen richardpen force-pushed the metrics branch 2 times, most recently from 3882502 to 317a3d8 Compare November 16, 2017 01:01
@richardpen richardpen merged commit d46f8fa into aws:windows Nov 16, 2017
@richardpen richardpen deleted the metrics branch November 21, 2017 01:05
@samuelkarp samuelkarp mentioned this pull request Nov 22, 2017
8 tasks
@samuelkarp samuelkarp added this to the 1.16.0 milestone Nov 22, 2017
5 participants