-
Notifications
You must be signed in to change notification settings - Fork 618
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move TCS Client to ecs-agent module #3726
Conversation
10146b7
to
5c6d50b
Compare
@@ -474,9 +474,8 @@ func (engine *DockerStatsEngine) publishMetrics(includeServiceConnectStats bool) | |||
metricsMetadata, taskMetrics, metricsErr := engine.GetInstanceMetrics(includeServiceConnectStats) | |||
if metricsErr == nil { | |||
metricsMessage := ecstcs.TelemetryMessage{ | |||
Metadata: metricsMetadata, | |||
TaskMetrics: taskMetrics, | |||
IncludeServiceConnectStats: includeServiceConnectStats, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is includeServiceConnectStats
value no more used here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm good catch, I think I miss some change, I should have 1) removed this field from the telemetryMessage
type as well as the condition here, or 2) keep use this value. I was planning (but failed) to do 1). I want to do 1) because:
- ServiceConnect is associated with task. Therefore, having or not having service connect metrics "should" be associated with each "TaskMetric", instead of a "TelemetryMessage", as "TelemetryMessage" contains all tasks on the instance. For Fargate it is a little bit different, since each Fargate instance corresponding to one agent corresponding to one task.
- EC2 agent collects service connect metrics once per minute, i.e. every 3 telemetryMessages should have 1 message contains service connect metrics. Therefore, when we call
GetInstanceMetrics
, we do need to pass in this parameter to actually try to collect service connect metrics. Whether or not a task has service connect metrics is a different story. - On the other side of the channel, TCS client,
- when receiving the telemetry message, it calls
publishMetrics
code when either metrics is not disabled, or there is SC metrics, or both. However, when metrics are disabled, SC metrics are still being collected, which means in the case that metrics are disabled and there is SC metrics,getInstanceMetrics
will collect those task metrics of which task has SC enabled, and task metrics will only have SC metrics here, and it will form atelemetryMessage
. The only case, that the condition in channel receiver side skip the publish, is when both metrics is disabled and there is no SC metrics, there will not be a telemetry message at all here, so the condition can be removed. In fact, it would not even get to that line - the instance will be an idle instance here. - when sending the metrics, it checks whether each
taskMetric
object contains with service connect metric by checking this. There is no need for this value. To be more specific, whenIncludeServiceConnectStats
is set to true, there might be sometaskMetric
intaskMetrics
contains SC stats; whenIncludeServiceConnectStats
is set to false, there will be notaskMetric
intaskMetrics
contains SC stats. As a result, checking if SC stats present intaskMetric
should be a simpler and direct way to achieve same goal.
- when receiving the telemetry message, it calls
By the way, currently we do use this includeServiceConnectStats
parameter in TCSClient's metricsToPublishMetricRequests
function under /agent, while Fargate's implementation of TCSClient does not contain it. Again it is looping through each taskMetric
object and trying to copy its service connect stats, this flag does not provide any value.
Would you prefer 1) removed this field from the telemetryMessage
type as well as the condition here, or 2) keep use this value ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer 1 as long as we have a clear change in our tests that reflects the code change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we have getInstance test changes verifying there is no empty telemetry message under all cases this and We also have SC integ test to verify cases where metrics are disabled this.
The missed part should no longer have a condition and therefore no specific test for that (we do not have a test for it before either after we moved to channel based stats engine, as the metrics disabled tests are moved to engine_test here and not able to catch the miss). We also have unit tests to test channel send/receive flow - this and this.
Is there any test you can think of that I should add?
@@ -474,9 +474,8 @@ func (engine *DockerStatsEngine) publishMetrics(includeServiceConnectStats bool) | |||
metricsMetadata, taskMetrics, metricsErr := engine.GetInstanceMetrics(includeServiceConnectStats) | |||
if metricsErr == nil { | |||
metricsMessage := ecstcs.TelemetryMessage{ | |||
Metadata: metricsMetadata, | |||
TaskMetrics: taskMetrics, | |||
IncludeServiceConnectStats: includeServiceConnectStats, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Summary
Move TCS Client to ecs-agent module, and switch to use wsclient in ecs-agent module.
Implementation details
agent/tcs/client
andagent/tcs/model
to be underecs-agent/tcs
go.mod
in agent model accordinglygo.mod
in ecs-agent model for necessary dependency (uuid) for generating uuid when making request to TACSecs-agent/wsclient
with updated interface; also remove statsEngine from tcs client creation as they are decoupled nowTesting
New tests cover the changes: Yes. Updated tests for new interface.
Github workflow passed.
Description for the changelog
Move TCS Client to ecs-agent module
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.