Add workqueue prometheus metrics #1266

Merged 4 commits on Dec 9, 2020
6 changes: 6 additions & 0 deletions cmd/nginx-ingress/main.go
@@ -368,6 +368,7 @@ func main() {
managerCollector = collectors.NewLocalManagerMetricsCollector(constLabels)
controllerCollector = collectors.NewControllerMetricsCollector(*enableCustomResources, constLabels)
processCollector := collectors.NewNginxProcessesMetricsCollector(constLabels)
workQueueCollector := collectors.NewWorkQueueMetricsCollector(constLabels)

err = managerCollector.Register(registry)
if err != nil {
@@ -383,6 +384,11 @@ func main() {
if err != nil {
glog.Errorf("Error registering NginxProcess Prometheus metrics: %v", err)
}

err = workQueueCollector.Register(registry)
if err != nil {
glog.Errorf("Error registering WorkQueue Prometheus metrics: %v", err)
}
}

useFakeNginxManager := *proxyURL != ""
8 changes: 6 additions & 2 deletions docs-web/logging-and-monitoring/prometheus.md
@@ -27,7 +27,7 @@ The Ingress Controller exports the following metrics:
* Exported by NGINX/NGINX Plus. Refer to the [NGINX Prometheus Exporter developer docs](https://github.com/nginxinc/nginx-prometheus-exporter#exported-metrics) to find more information about the exported metrics.
* There is a Grafana dashboard for NGINX Plus metrics located in the root repo folder.
* Calculated by the Ingress Controller:
* `controller_upstream_server_response_latency_ms_count`. Bucketed response times from when NGINX establishes a connection to an upstream server to when the last byte of the response body is received by NGINX. **Note**: The metric for the upstream isn't available until traffic is sent to the upstream. The metric isn't enabled by default. To enable the metric, set the `-enable-latency-metrics` command-line argument.
* `controller_upstream_server_response_latency_ms_count`. Bucketed response times from when NGINX establishes a connection to an upstream server to when the last byte of the response body is received by NGINX. **Note**: The metric for the upstream isn't available until traffic is sent to the upstream. The metric isn't enabled by default. To enable the metric, set the `-enable-latency-metrics` command-line argument.
* Ingress Controller metrics
* `controller_nginx_reloads_total`. Number of successful NGINX reloads. This includes the label `reason` with 2 possible values `endpoints` (the reason for the reload was an endpoints update) and `other` (the reload was caused by something other than an endpoint update like an ingress update).
* `controller_nginx_reload_errors_total`. Number of unsuccessful NGINX reloads.
@@ -37,7 +37,11 @@ The Ingress Controller exports the following metrics:
* `controller_ingress_resources_total`. Number of handled Ingress resources. This metric includes the label type, that groups the Ingress resources by their type (regular, [minion or master](/nginx-ingress-controller/configuration/ingress-resources/cross-namespace-configuration)). **Note**: The metric doesn't count minions without a master.
* `controller_virtualserver_resources_total`. Number of handled VirtualServer resources.
* `controller_virtualserverroute_resources_total`. Number of handled VirtualServerRoute resources. **Note**: The metric counts only VirtualServerRoutes that have a reference from a VirtualServer.
* Workqueue metrics. **Note**: the workqueue is a queue used by the Ingress Controller to process changes to the relevant resources in the cluster, such as Ingress resources. The Ingress Controller uses only one queue. The metrics for that queue have the label `name="taskQueue"`.
* `workqueue_depth`. Current depth of the workqueue.
* `workqueue_queue_duration_seconds`. How long in seconds an item stays in the workqueue before being requested.
* `workqueue_work_duration_seconds`. How long in seconds processing an item from the workqueue takes.

**Note**: all metrics have the namespace nginx_ingress. For example, nginx_ingress_controller_nginx_reloads_total.
**Note**: all metrics have the namespace `nginx_ingress`. For example, `nginx_ingress_controller_nginx_reloads_total`.

**Note**: all metrics include the label `class`, which is set to the class of the Ingress Controller. The class is configured via the `-ingress-class` command-line argument.
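
The namespace note above also applies to the new workqueue metrics. The Prometheus Go client joins the non-empty `Namespace`, `Subsystem` and `Name` fields with underscores, so the names in the docs gain the `nginx_ingress_` prefix when scraped. The sketch below reimplements that joining rule with the standard library only; `fqName` is a hypothetical helper written for illustration, not a function from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName joins the non-empty parts with underscores. It is a simplified,
// hypothetical stand-in for how the Prometheus Go client builds a fully
// qualified metric name from Namespace, Subsystem and Name.
func fqName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// The three metric names added by this PR, as listed in the docs.
	names := []string{"depth", "queue_duration_seconds", "work_duration_seconds"}
	for _, name := range names {
		fmt.Println(fqName("nginx_ingress", "workqueue", name))
	}
}
```

The two histogram metrics are exposed as several series each, with the usual `_bucket`, `_sum` and `_count` suffixes appended to the fully qualified name.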
18 changes: 3 additions & 15 deletions internal/k8s/task_queue.go
@@ -29,7 +29,7 @@ type taskQueue struct {
// The sync function is called for every element inserted into the queue.
func newTaskQueue(syncFn func(task)) *taskQueue {
return &taskQueue{
queue: workqueue.New(),
queue: workqueue.NewNamed("taskQueue"),
sync: syncFn,
workerDone: make(chan struct{}),
}
@@ -55,7 +55,6 @@ func (tq *taskQueue) Enqueue(obj interface{}) {
}

glog.V(3).Infof("Adding an element with a key: %v", task.Key)

tq.queue.Add(task)
}

@@ -103,30 +102,19 @@ func (tq *taskQueue) Shutdown() {
// kind represents the kind of the Kubernetes resources of a task
type kind int

// resources
const (
// ingress resource
ingress = iota
// endpoints resource
endpoints
// configMap resource
configMap
// secret resource
secret
// service resource
service
// virtualserver resource
virtualserver
// virtualServeRoute resource
virtualServerRoute
// globalConfiguration resource
globalConfiguration
// transportserver resource
transportserver
// policy resource
policy
// appProtectPolicy resource
appProtectPolicy
// appProtectlogconf resource
appProtectLogConf
)

@@ -166,7 +154,7 @@ func newTask(key string, obj interface{}) (task, error) {
} else if objectKind == appProtectLogConfGVK.Kind {
k = appProtectLogConf
} else {
return task{}, fmt.Errorf("Unknow unstructured kind: %v", objectKind)
return task{}, fmt.Errorf("Unknown unstructured kind: %v", objectKind)
}
default:
return task{}, fmt.Errorf("Unknown type: %v", t)
6 changes: 2 additions & 4 deletions internal/metrics/collectors/processes.go
@@ -10,15 +10,14 @@ import (
"github.com/prometheus/client_golang/prometheus"
)

// NginxProcessesMetricsCollector implements NginxPorcessesCollector interface and prometheus.Collector interface
// NginxProcessesMetricsCollector implements prometheus.Collector interface
type NginxProcessesMetricsCollector struct {
// Metrics
workerProcessTotal *prometheus.GaugeVec
}

// NewNginxProcessesMetricsCollector creates a new NginxProcessMetricsCollector
func NewNginxProcessesMetricsCollector(constLabels map[string]string) *NginxProcessesMetricsCollector {
pc := &NginxProcessesMetricsCollector{
return &NginxProcessesMetricsCollector{
workerProcessTotal: prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Name: "nginx_worker_processes_total",
@@ -29,7 +28,6 @@ func NewNginxProcessesMetricsCollector(constLabels map[string]string) *NginxProc
[]string{"generation"},
),
}
return pc
}

// updateWorkerProcessCount sets the number of NGINX worker processes
119 changes: 119 additions & 0 deletions internal/metrics/collectors/workqueue.go
@@ -0,0 +1,119 @@
package collectors

import (
"github.com/prometheus/client_golang/prometheus"
"k8s.io/client-go/util/workqueue"
)

// WorkQueueMetricsCollector collects the metrics about the work queue, which the Ingress Controller uses to process changes to the resources in the cluster.
// It implements the prometheus.Collector and workqueue.MetricsProvider interfaces.
type WorkQueueMetricsCollector struct {
depth *prometheus.GaugeVec
latency *prometheus.HistogramVec
workDuration *prometheus.HistogramVec
}

// NewWorkQueueMetricsCollector creates a new WorkQueueMetricsCollector
func NewWorkQueueMetricsCollector(constLabels map[string]string) *WorkQueueMetricsCollector {
const workqueueSubsystem = "workqueue"
var latencyBucketSeconds = []float64{0.1, 0.5, 1, 5, 10, 50}

return &WorkQueueMetricsCollector{
depth: prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Namespace: metricsNamespace,
Subsystem: workqueueSubsystem,
Name: "depth",
Help: "Current depth of workqueue",
ConstLabels: constLabels,
},
[]string{"name"},
),
latency: prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Namespace: metricsNamespace,
Subsystem: workqueueSubsystem,
Name: "queue_duration_seconds",
Help: "How long in seconds an item stays in workqueue before being processed",
Buckets: latencyBucketSeconds,
ConstLabels: constLabels,
},
[]string{"name"},
),
workDuration: prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Namespace: metricsNamespace,
Subsystem: workqueueSubsystem,
Name: "work_duration_seconds",
Help: "How long in seconds processing an item from workqueue takes",
Buckets: latencyBucketSeconds,
ConstLabels: constLabels,
},
[]string{"name"},
),
}
}
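
Both histograms share the `latencyBucketSeconds` bounds. Prometheus histogram buckets are cumulative: each bucket counts the observations less than or equal to its upper bound, and an implicit `+Inf` bucket counts everything. The following standard-library sketch shows how a handful of made-up latencies would be distributed across these bounds; `cumulativeCounts` and the sample values are assumptions for illustration and are not part of this PR.

```go
package main

import "fmt"

// latencyBucketSeconds mirrors the bucket upper bounds used by the collector.
var latencyBucketSeconds = []float64{0.1, 0.5, 1, 5, 10, 50}

// cumulativeCounts returns, for each upper bound, how many observations are
// less than or equal to it. The final element is the implicit +Inf bucket.
// This is a hypothetical helper that imitates histogram bucketing.
func cumulativeCounts(bounds, observations []float64) []int {
	counts := make([]int, len(bounds)+1)
	for _, o := range observations {
		for i, b := range bounds {
			if o <= b {
				counts[i]++
			}
		}
		counts[len(bounds)]++ // every observation falls into +Inf
	}
	return counts
}

func main() {
	// Made-up queue latencies in seconds.
	observed := []float64{0.05, 0.3, 0.7, 2, 60}
	counts := cumulativeCounts(latencyBucketSeconds, observed)
	for i, b := range latencyBucketSeconds {
		fmt.Printf("le=%g: %d\n", b, counts[i])
	}
	fmt.Printf("le=+Inf: %d\n", counts[len(latencyBucketSeconds)])
}
```

One consequence of these bounds is that any item taking longer than 50 seconds is only visible in the `+Inf` bucket, so the histogram cannot tell a 60-second item from a 600-second one.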

// Collect implements the prometheus.Collector interface Collect method
func (wqc *WorkQueueMetricsCollector) Collect(ch chan<- prometheus.Metric) {
wqc.depth.Collect(ch)
wqc.latency.Collect(ch)
wqc.workDuration.Collect(ch)
}

// Describe implements the prometheus.Collector interface Describe method
func (wqc *WorkQueueMetricsCollector) Describe(ch chan<- *prometheus.Desc) {
wqc.depth.Describe(ch)
wqc.latency.Describe(ch)
wqc.workDuration.Describe(ch)
}

// Register registers all the metrics of the collector
func (wqc *WorkQueueMetricsCollector) Register(registry *prometheus.Registry) error {
workqueue.SetProvider(wqc)
return registry.Register(wqc)
}
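
`Register` installs the collector as the process-wide workqueue metrics provider. The sketch below models, under assumptions, why the PR also switches `task_queue.go` to `workqueue.NewNamed("taskQueue")`: as far as client-go's behavior is understood here (not confirmed by this diff), a queue picks up its metrics when it is created, unnamed queues get no metrics, and only the first provider set takes effect. All types in the sketch (`factory`, `provider`, `recordingProvider`, and so on) are hypothetical stand-ins, not the client-go API.

```go
package main

import (
	"fmt"
	"sync"
)

// gauge is a hypothetical stand-in for workqueue.GaugeMetric.
type gauge interface {
	Inc()
	Dec()
}

// provider is a hypothetical stand-in for workqueue.MetricsProvider,
// reduced to a single method.
type provider interface {
	NewDepthMetric(name string) gauge
}

// noopGauge discards updates.
type noopGauge struct{}

func (noopGauge) Inc() {}
func (noopGauge) Dec() {}

// countingGauge records the current depth for one queue name.
type countingGauge struct{ value int }

func (g *countingGauge) Inc() { g.value++ }
func (g *countingGauge) Dec() { g.value-- }

// recordingProvider hands out one gauge per queue name.
type recordingProvider struct{ gauges map[string]*countingGauge }

func (p *recordingProvider) NewDepthMetric(name string) gauge {
	g := &countingGauge{}
	p.gauges[name] = g
	return g
}

// factory holds the process-wide provider; only the first setProvider call wins.
type factory struct {
	once     sync.Once
	provider provider
}

func (f *factory) setProvider(p provider) {
	f.once.Do(func() { f.provider = p })
}

// newDepthGauge is called when a queue is created. Queues without a name,
// or created before any provider is set, get a no-op gauge.
func (f *factory) newDepthGauge(name string) gauge {
	if name == "" || f.provider == nil {
		return noopGauge{}
	}
	return f.provider.NewDepthMetric(name)
}

func main() {
	f := &factory{}
	p := &recordingProvider{gauges: map[string]*countingGauge{}}
	f.setProvider(p)

	unnamed := f.newDepthGauge("")        // analogous to workqueue.New()
	named := f.newDepthGauge("taskQueue") // analogous to workqueue.NewNamed("taskQueue")
	unnamed.Inc()
	named.Inc()
	named.Inc()
	named.Dec()

	fmt.Println(len(p.gauges), p.gauges["taskQueue"].value)
}
```

If this model matches the real behavior, the ordering in `main.go` matters: the collector has to be registered before the task queue is constructed, otherwise the queue would keep its no-op metrics.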

// NewDepthMetric implements the workqueue.MetricsProvider interface NewDepthMetric method
func (wqc *WorkQueueMetricsCollector) NewDepthMetric(name string) workqueue.GaugeMetric {
return wqc.depth.WithLabelValues(name)
}

// NewLatencyMetric implements the workqueue.MetricsProvider interface NewLatencyMetric method
func (wqc *WorkQueueMetricsCollector) NewLatencyMetric(name string) workqueue.HistogramMetric {
return wqc.latency.WithLabelValues(name)
}

// NewWorkDurationMetric implements the workqueue.MetricsProvider interface NewWorkDurationMetric method
func (wqc *WorkQueueMetricsCollector) NewWorkDurationMetric(name string) workqueue.HistogramMetric {
return wqc.workDuration.WithLabelValues(name)
}

// noopMetric is a no-op metric that satisfies the workqueue.CounterMetric, workqueue.GaugeMetric,
// workqueue.SettableGaugeMetric and workqueue.HistogramMetric interfaces
type noopMetric struct{}

func (noopMetric) Inc() {}
func (noopMetric) Dec() {}
func (noopMetric) Set(float64) {}
func (noopMetric) Observe(float64) {}

// NewAddsMetric implements the workqueue.MetricsProvider interface NewAddsMetric method
func (*WorkQueueMetricsCollector) NewAddsMetric(string) workqueue.CounterMetric {
return noopMetric{}
}

// NewUnfinishedWorkSecondsMetric implements the workqueue.MetricsProvider interface NewUnfinishedWorkSecondsMetric method
func (*WorkQueueMetricsCollector) NewUnfinishedWorkSecondsMetric(string) workqueue.SettableGaugeMetric {
return noopMetric{}
}

// NewLongestRunningProcessorSecondsMetric implements the workqueue.MetricsProvider interface NewLongestRunningProcessorSecondsMetric method
func (*WorkQueueMetricsCollector) NewLongestRunningProcessorSecondsMetric(string) workqueue.SettableGaugeMetric {
return noopMetric{}
}

// NewRetriesMetric implements the workqueue.MetricsProvider interface NewRetriesMetric method
func (*WorkQueueMetricsCollector) NewRetriesMetric(string) workqueue.CounterMetric {
return noopMetric{}
}
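
The `noopMetric` type above works for all four unused metrics because each workqueue metric interface is small, and one empty struct with four methods satisfies them all. The sketch below shows the same pattern with locally declared interfaces, so it compiles without client-go. The interface names are hypothetical local copies, and their method sets are assumptions inferred from the methods `noopMetric` defines in this PR.

```go
package main

import "fmt"

// Local, hypothetical copies of the four small metric interfaces, so that
// the sketch compiles without importing client-go.
type counterMetric interface{ Inc() }

type gaugeMetric interface {
	Inc()
	Dec()
}

type settableGaugeMetric interface{ Set(float64) }

type histogramMetric interface{ Observe(float64) }

// noopMetric discards every update.
type noopMetric struct{}

func (noopMetric) Inc()            {}
func (noopMetric) Dec()            {}
func (noopMetric) Set(float64)     {}
func (noopMetric) Observe(float64) {}

// Compile-time checks: the build fails if noopMetric stops satisfying any
// of the interfaces.
var (
	_ counterMetric       = noopMetric{}
	_ gaugeMetric         = noopMetric{}
	_ settableGaugeMetric = noopMetric{}
	_ histogramMetric     = noopMetric{}
)

func main() {
	var c counterMetric = noopMetric{}
	c.Inc() // does nothing, by design
	fmt.Println("noopMetric satisfies all four interfaces")
}
```

The blank-identifier assignments are a common Go idiom for documenting and enforcing which interfaces a type is meant to implement; a similar check against the real `workqueue` interfaces could be added to the collector if desired.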