This repository has been archived by the owner on Sep 21, 2023. It is now read-only.

Support monitoring the shipper from the Elastic Agent #267

Closed
Tracked by #16
cmacknz opened this issue Feb 28, 2023 · 4 comments · Fixed by elastic/elastic-agent#2427
Labels
Team:Elastic-Agent Label for the Agent team

Comments

@cmacknz
Member

cmacknz commented Feb 28, 2023

Enabling agent.monitoring.metrics in an agent policy will spawn a Metricbeat instance to collect metrics from processes supervised by the agent. This Metricbeat instance needs to have a module enabled to capture the shipper's CPU utilization, memory usage, and application-specific metrics. The monitoring configuration for this Metricbeat instance is defined in the v1_monitor.go file.

The agent currently knows how to start both the beat/metrics module for monitoring beats and the http/metrics module for monitoring the agent itself:

https://github.com/elastic/elastic-agent/blob/f27f8d01bcb57f21903c1bc4927d7bf62ba46595/internal/pkg/agent/application/monitoring/v1_monitor.go#L786-L796

	inputs := []interface{}{
		map[string]interface{}{
			idKey:        "metrics-monitoring-beats",
			"name":       "metrics-monitoring-beats",
			"type":       "beat/metrics",
			useOutputKey: monitoringOutput,
			"data_stream": map[string]interface{}{
				"namespace": monitoringNamespace,
			},
			"streams": beatsStreams,
		},
		map[string]interface{}{
			idKey:        "metrics-monitoring-agent",
			"name":       "metrics-monitoring-agent",
			"type":       "http/metrics",

Likely the easiest path to support the shipper is to use the http/metrics input to collect the shipper's metrics. A recent example doing this for custom Filebeat metrics can be found in elastic/elastic-agent#2171, with the corresponding updates to the mappings in the Elastic Agent integration in elastic/integrations#5077.

The shipper metrics will need to be written to a dedicated data stream if elastic/elastic-agent#1814 is not completed first. This will require updating the list of possible monitoring data streams in the Fleet UI, as described in that issue.

Acceptance Criteria:

  • Enabling agent.monitoring.metrics causes the agent to start a Metricbeat monitoring input capable of collecting the shipper's metrics and sending them to Fleet.
  • A demo can be performed proving that shipper CPU, memory, queue, output, and other application-specific metrics are shipped to the expected data stream in Fleet and can be queried and inspected.

The work to update the Elastic Agent monitoring dashboard to account for the new metrics is captured in #54.

@cmacknz
Member Author

cmacknz commented Mar 8, 2023

We should consider having the shipper report metrics in a format the beat/metrics input understands, so that we can reuse it for collecting memory and CPU metrics from the shipper.

https://www.elastic.co/guide/en/beats/metricbeat/current/metricbeat-metricset-beat-stats.html

@fearful-symmetry
Contributor

@cmacknz I was still wondering about the whole "format like beat metrics" thing, since we also have queue metrics and other data that's shipper-specific. Should we expand the beat/metrics input to recognize shipper data, or just have two inputs for shipper data?

@cmacknz
Member Author

cmacknz commented Mar 21, 2023

Yes, the approach to take is to use beat/metrics to get the standard Beat metrics, and then use http/metrics to get the shipper-specific metrics. This is the approach used to capture Filebeat input-specific metrics: the agent change was in elastic/elastic-agent#2171, with the Elastic Agent integration mappings updated in elastic/integrations#5077. This requires the fewest changes to the system.

We could extend the beat/metrics module to capture the shipper metrics based on configuration, but this module is used by stack monitoring, is owned by the stack monitoring team, and would have to be tested against every stack monitoring use case. We should avoid modifying it for the shipper, which is an extremely special case: stack monitoring will never monitor a standalone shipper.

@cmacknz
Member Author

cmacknz commented Mar 30, 2023

Example of creating a new dataset for monitoring: elastic/kibana#149974. The list is hard-coded in Fleet; without modifying Fleet, the agent won't have permission to write to these indices.

We should use elastic_agent.shipper as the dataset name; we'll need one data stream for logs and one for metrics.
