Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Agent not Sending Metrics to Instance Manager #6148

Closed
wants to merge 2 commits into from

Conversation

saedx1
Copy link
Contributor

@saedx1 saedx1 commented Aug 7, 2024

Proposed changes

Describe the use case and detail of the change. If this PR addresses an issue on GitHub, make sure to include a link to
that issue here in this description (not in the title of the PR).

The issue is that the agent was not sending metrics to instance manager it was reporting and showing up in the dashboard. This is due to the metrics feature being mistyped (or possible its naming changing from before).
This table shows the acceptable values: https://docs.nginx.com/nginx-management-suite/nginx-agent/install-nginx-agent/#cli-flags-and-environment-variables .

Feel free to test this, as I really do not have the time to do proper unit testing if you have one. However, I can confirm that I started seeing metrics in instance manager right after this change in the configmap.

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that all unit tests pass after adding my changes
  • I have updated necessary documentation
  • I have rebased my branch onto main
  • I will ensure my PR is targeting the main branch and pulling from my branch from my own fork

@saedx1 saedx1 requested a review from a team as a code owner August 7, 2024 10:49
@github-actions github-actions bot added the helm_chart Pull requests that update the Helm Chart label Aug 7, 2024
@vepatel
Copy link
Contributor

vepatel commented Aug 7, 2024

Hi @saedx1 metrics-sender is a valid feature, https://github.com/nginx/agent/blob/v2.36.1/sdk/agent/config/config_helpers.go#L27, are you using this alongside NAP-WAF as then NIC-Agent integration is specifically for the scenario of Service Monitoring

@pdabelf5
Copy link
Collaborator

pdabelf5 commented Aug 7, 2024

features:
  - metrics

includes metrics-sender, metrics-collection & metrics-throttle nginx/agent#453.

@saedx1
Copy link
Contributor Author

saedx1 commented Aug 7, 2024

@vepatel Yes, I am using this with NAP-WAF. Again, feel free to test metrics without the change in this PR, but I cannot get any metrics with metrics-sender in Instance Manager.

@pdabelf5 That is also fine, but the documentation does not mention anything but metrics, and I cannot see metrics in instance manager with metrics-sender, I'd appreciate a link to documentation where it mentions the difference between the three.

Also, the chart makes it hard to do any custom changes to agent configs as far as I can tell (without kubectl patch or something of the like).

@vepatel
Copy link
Contributor

vepatel commented Aug 7, 2024

@saedx1 are none of the metrics showing up without your changes?

@saedx1
Copy link
Contributor Author

saedx1 commented Aug 7, 2024

@vepatel That is correct. Nginx+ Instance Counts + Security Monitoring are working fine, but not instance/host metrics (CPU Utilization, Memory Usage, Status Code Counts, ...)

@vepatel
Copy link
Contributor

vepatel commented Aug 7, 2024

@saedx1 that seems like correct, NIC only does SM at the moment by default

@saedx1
Copy link
Contributor Author

saedx1 commented Aug 7, 2024

@vepatel I am not sure I get your point. I do get instance metrics if I do the change suggested in this PR. So clearly it is doable (unless you believe that should not work with NIC).

@pdabelf5
Copy link
Collaborator

pdabelf5 commented Aug 7, 2024

@saedx1 have you tried the helm value nginxAgent.customConfigMap to supply a custom nginx agent configuration file in a ConfigMap?

As @vepatel mentioned, what is included by default is the configuration required to support NIC + Security Monitoring. The custom configmap is there to allow you to enable additional features of NGINX Agent.

@saedx1
Copy link
Contributor Author

saedx1 commented Aug 7, 2024

@pdabelf5 Got it.
So, is it just the default and I am free to switch it up with custom configmap? or is it that we should not be using instance manager to monitor nginx-ic-nap-v5 instances?

Apologies, I realize this is a PR, but hopefully the answer to this question would lead to the end of the discussion.

@vepatel
Copy link
Contributor

vepatel commented Aug 7, 2024

Hi @saedx1, you're indeed free to bring in any valid configmap for customConfigMap setting and pass it's name to https://github.com/nginxinc/kubernetes-ingress/blob/v3.6.1/charts/nginx-ingress/values.yaml#L649.
As for:

we should not be using instance manager to monitor nginx-ic-nap-v5 instances

We certainly do not put any restriction but again NIC's default use-case with Agent is Security Monitoring as described here.
hope this clarifies everything and we'll add the customConfigMap option in above linked doc for better visibility

@pdabelf5
Copy link
Collaborator

pdabelf5 commented Aug 8, 2024

So, is it just the default and I am free to switch it up with custom configmap?

yes, the default is there to support Security Monitoring, the custom configmap is there to allow the user to customise the agent configuration to their needs.

@saedx1 saedx1 closed this Aug 8, 2024
@brianehlert
Copy link
Collaborator

Agent v2 is part of the NIC + WAF image.
It was added specifically to support getting violations into NIM + Security Monitor.

By default Agent v2 also requires logs be written to disk (and tails those logs for metrics).
This is a configuration that we do NOT recommend. So, we defaulted to turn that functionality off.

Why don't we recommend writing logs to disk?
Because logrotate jobs (or any cron job) don't run within containers. Therefore, you will get into a state of volume filling and pod crashing as a result.

@brianehlert
Copy link
Collaborator

brianehlert commented Aug 9, 2024

There was recently an internal experiment related to getting around this log tailing behavior of Agentv2.
The following was the recommendation that came out of that:
(note the addition of "metrics" and "exclude_logs" to the configmap)

kind: ConfigMap
 apiVersion: v1
 name: <configmap name>
 namespace: <namespace where NGINX Ingress Controller will be installed>
 data:
   nginx-agent.conf: |-
     log:
       level: error
       path: ""
     server:
       host: "<FQDN or IP address of NGINX Instance Manager>"
       grpcPort: 443
     nginx:
       exclude_logs: "/var/log/nginx/*"
     features:
     - registration
     - nginx-counting
     - metrics-sender
     - dataplane-status
     - metrics
     extensions:
     - nginx-app-protect
     - nap-monitoring
     nginx_app_protect:
       report_interval: 15s
       precompiled_publication: true
     nap_monitoring:
       collector_buffer_size: 20000
       processor_buffer_size: 20000
       syslog_ip: 127.0.0.1
       syslog_port: 1514 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
helm_chart Pull requests that update the Helm Chart
Projects
Status: Done 🚀
Development

Successfully merging this pull request may close these issues.

4 participants