[Stack Monitoring] Support for apm-server/beats Agent subprocesses #144701
Comments
Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI) |
@klacabane @cmacknz do you see any downside to supporting the Beats metrics as they are in the SM UI in terms of forwards compatibility? In the future v2 Agent architecture, we're likely to break out some inputs from filebeat and metricbeat into separate processes. Will this cause a breaking change for the SM UI if we start supporting these metrics now? |
In addition to changing the input architecture, we are also adding new metrics and given the chance would probably like to deprecate some of the existing Beat metrics. I don't think there is a great understanding of the implications of adding or changing metrics on stack monitoring within the agent team. In general for the existing Beat metrics we just never change them to avoid breaking anything, but this also means we don't improve them. I am in favour of deferring this until the changes to the agent architecture are complete if we can. If we don't want to defer this, we should review the existing metrics and align on which ones we want to keep in the long term. |
I'm not very inclined to couple Stack Monitoring to the agent internals, as stated here #120415 (comment). Adding the legacy mappings will also make that dependency two-way, which is not desirable when we already have a new agent architecture planned and likely different document shapes. It depends on how far ahead those changes are, but Stack Monitoring is only looking at maintaining feature parity with metricbeat collection, and we don't plan on improving the existing solution moving forward as it will be superseded. The future solution is based on packages, and I'd say it would be a good opportunity to already invest in building dashboards directly in the elastic agent package instead of supporting it in SM |
Summarizing spread out pieces about apm and beats monitoring in Agent world:
For standalone mode, we’ll create a Beats package (#144995) that will spawn the corresponding metricbeat module so that standalone processes, apm-server included, can still be monitored and surfaced in Stack Monitoring.
For agent subprocesses, they are a detail of the Agent internals. If we want to leak this information we should be careful and avoid spreading it to Stack Monitoring because it will confuse users that didn’t explicitly set up beats and see them in Stack Monitoring (Agents are not shown in SM). I suggest keeping that information encapsulated within the Elastic-Agent package and creating Dashboards that monitor the processes. To ease discoverability of these dashboards, Stack Monitoring could detect whether metrics are collected by agents and offer a quick link to the relevant Dashboards.
APM running under agent is trickier because the internals are already exposed, one is directly configuring the server when adding the integration and it’s specified in the documentation[1]. The server is also unlikely to be replaced/refactored like other beats. No strong opinion here we could read from |
I think this makes sense, we really don't want to expose the implementation details of the agent for monitoring. We are starting to think more about how to improve the existing agent monitoring dashboards within Fleet, which we should avoid duplicating. @joshdover or @kpollich likely have some valuable input here on how to tie this into stack monitoring for agent. |
@klacabane we haven't yet decided how to move forward with this topic, but we have internal conversations on whether to build a dedicated UI or dashboards. |
Thanks! So we need a way to monitor apm-server under agent until an alternative exists. If we want to read from |
Yes, that's how it is set up on ESS. But the configuration has to be passed down from the Elastic Agent ( |
Would that be an acceptable solution for users? My concern is polluting the The beats package will be available starting with 8.7.0, and one could follow the same steps as described in the docs but for the agent package instead of the metricbeat module. |
Let's include @joshdover and @cmacknz for this question. From an APM perspective, I think it is fine to use a configuration option for exposing a metrics endpoint, but I'm not certain that is aligned with the elastic-agent team's vision. |
Braindump of opinions:
|
Thanks @joshdover Looks like we're aligned on defining clear boundaries between Agent and Stack monitoring.
With these scenarios, no agent-produced data ( |
👍 from me |
SGTM; would prefer the dashboards being part of the |
SGTM. @simitt yes let's put them in the apm package to clarify ownership. One caveat is that the data streams are currently defined in This just means that we'll need to coordinate changes across these two packages for this change and any future ones. |
Are both apm and elastic-agent packages stack-aligned ? Syncing the changes would be a difficult feat otherwise |
APM packages are bundled with Kibana, and therefore aligned with the Kibana version. |
Because apm would rely on data streams defined in the elastic-agent package (which does not appear to be stack bound), I'm concerned about breaking changes going uncaught:
I think these concerns are mostly mitigated by the coupling of the data streams to the elastic-agent internals, so those would only happen when shipping a new stack version, with a long enough window to catch potential issues. Coordination would also cover this, but I'm wondering how we could automatically detect such breakage. Maybe a CI step that installs both packages and runs assertions on the dashboard? Do we have such a capability at the moment? |
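One possible shape for the assertion mentioned above, as a minimal sketch only. It assumes the CI step can already obtain the installed data stream mappings and the list of fields a dashboard references; both inputs and both function names are hypothetical, and this does not describe an existing capability.

```python
def collect_mapping_fields(properties, prefix=""):
    """Flatten a nested Elasticsearch "properties" mapping into dotted field paths."""
    fields = set()
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if "properties" in spec:
            # Object field: recurse into its children.
            fields |= collect_mapping_fields(spec["properties"], prefix=f"{path}.")
        else:
            # Leaf field (keyword, long, alias, ...).
            fields.add(path)
    return fields


def missing_dashboard_fields(dashboard_fields, mappings):
    """Return the dashboard fields that the data stream mappings do not define."""
    available = collect_mapping_fields(mappings.get("properties", {}))
    return sorted(set(dashboard_fields) - available)


# Hypothetical mappings and dashboard fields, for illustration only.
example_mappings = {
    "properties": {
        "beat": {"properties": {"stats": {"properties": {"uptime": {"type": "long"}}}}}
    }
}
example_dashboard_fields = ["beat.stats.uptime", "beat.stats.removed_field"]
```

A CI job could fail whenever `missing_dashboard_fields` returns a non-empty list for any dashboard in either package.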
Closing this as the initial discussion came to a conclusion and the dependency concern between packages is out of scope. Ownership can be discussed when planning for new dashboards. |
Summary
An elastic-agent may spawn beats and apm-server subprocesses depending on its configuration. When that's the case, the metrics for these processes will be ingested under metrics-elastic_agent.(apm-server|metricbeat)-*. Stack Monitoring should be able to interpret and surface these processes just like we do with the metricbeat data stored in .monitoring-beats-mb.
Right now the mappings for these streams (stored in the elastic_agent package) are not aligned with Stack Monitoring expectations, and adding the metrics-elastic_agent.* pattern to the appropriate queries won't be enough: we'll either have to update the mappings to carry the legacy aliases or update queries to also look for the ECS format. The former approach is less work and is also consistent with other stack packages (es/kibana/logstash), so I'm inclined to go that route.
AC
metrics-elastic_agent.* pattern in appropriate places
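The two options described in the summary can be sketched as follows. This is an illustration under stated assumptions, not the Stack Monitoring implementation: the legacy and ECS field names are hypothetical placeholders, the helper names are invented for this example, and the index names are copied as written in the summary. The only Elasticsearch features relied on are the `alias` field type and the `bool`/`exists` queries.

```python
# Index names as they appear in the summary above.
LEGACY_INDEX = ".monitoring-beats-mb"
AGENT_PATTERNS = [
    "metrics-elastic_agent.apm-server-*",
    "metrics-elastic_agent.metricbeat-*",
]

# Hypothetical legacy -> ECS field pair; the real field names are not given in the issue.
LEGACY_TO_ECS = {"beats_stats.metrics.events.acked": "beat.stats.events.acked"}


def index_patterns():
    """Comma-separated index expression covering legacy and agent-produced data."""
    return ",".join([LEGACY_INDEX, *AGENT_PATTERNS])


def alias_mappings(legacy_to_ecs):
    """Option 1: mappings that expose each ECS field under its legacy name via an alias."""
    properties = {}
    for legacy, ecs in legacy_to_ecs.items():
        *parents, leaf = legacy.split(".")
        node = properties
        for part in parents:
            # Create intermediate object fields as needed.
            node = node.setdefault(part, {"properties": {}})["properties"]
        node[leaf] = {"type": "alias", "path": ecs}
    return {"properties": properties}


def either_field_exists(legacy, ecs):
    """Option 2: a query clause that accepts documents in either format."""
    return {
        "bool": {
            "should": [
                {"exists": {"field": legacy}},
                {"exists": {"field": ecs}},
            ],
            "minimum_should_match": 1,
        }
    }
```

With option 1 the existing queries keep using the legacy field names and only the index expression changes; with option 2 every query that touches a renamed field needs a clause like the one above.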