Stack monitoring for Logstash#6436
Conversation
Initial commit of a simple operator. The first operator will create: * A Service exposing the logstash metrics API, so it can be used for monitoring purposes * Secrets holding logstash.yml * A StatefulSet deploying the logstash pods * Pods with default liveness and readiness probes The sample logstash yml file as located in config/samples/logstash/logstash.yaml will create: ``` ✗ kubectl tree logstash logstash-sample NAMESPACE NAME READY REASON AGE default Logstash/logstash-sample - 4m5s default ├─Secret/logstash-sample-ls-config - 4m4s default ├─Service/logstash-sample-ls-http - 4m5s default │ └─EndpointSlice/logstash-sample-ls-http-kwfsg - 4m5s default └─StatefulSet/logstash-sample-ls - 4m4s default ├─ControllerRevision/logstash-sample-ls-79bd59f869 - 4m4s default ├─Pod/logstash-sample-ls-0 True 3m59s default ├─Pod/logstash-sample-ls-1 True 3m59s default └─Pod/logstash-sample-ls-2 True 3m59s ``` And shows status: ``` ✗ kubectl get logstash logstash-sample NAME AVAILABLE EXPECTED AGE VERSION logstash-sample 3 3 9m1s 8.6.1 ``` Still TODO: * Unit Tests * End to end Tests * Certificate handling on the HTTP Metrics API Tidy up Co-authored-by: Kaise Cheng <kaise.cheng@elastic.co>
Rudimentary tests for validation and naming
This reverts commit 6eabf4c.
This reverts commit 8016b7b. Remving e2e tests to get e2e tests prior to logstash working
This reverts commit b5c775e. And re-adds the e2e tests
This reverts commit d4908b7.
Add a test to make sure that Logstash is running, by testing the metrics API endpoint and checking that the value of status is "green"
This commit fixes the readiness probe, by setting the default visibility of the logstash API to 0.0.0.0, allowing access to the metrics API to the probe
This reverts commit 60736da.
- update min version to 8.7.0 - update sample - update test - remove log4j2 issue doc
|
@barkbay |
| name: logstash-sample | ||
| spec: | ||
| count: 1 | ||
| version: 8.7.0 |
There was a problem hiding this comment.
Currently, this causes the TestSample e2e test to fail, due to the lack of 8.7.0 docker image for logstash
barkbay
left a comment
There was a problem hiding this comment.
Sorry for the delay. I only reviewed the e2e test in this review. I'm still running a few other tests and I think I will ✅ soon.
|
|
||
| // IsSupportedVersion returns true if the resource version is supported for Stack Monitoring, else returns false | ||
| func IsSupportedVersion(v string) error { | ||
| func IsSupportedVersion(v string, minVersion version.Version) error { |
There was a problem hiding this comment.
Not from your PR: the godoc is wrong, this function returns an error not a boolean.
|
|
||
| // checks that the sidecar beats have sent data in the monitoring clusters | ||
| steps := func(k *test.K8sClient) test.StepList { | ||
| return checks.MonitoredSteps(&monitored, k) |
There was a problem hiding this comment.
Note that CheckBeatSidecars will be a no-op as it implicitly creates a list of Elasticsearch Pods only. It might be a good opportunity to either make it more generic or rename it CheckBeatSidecarsInElasticsearch:
diff --git a/test/e2e/test/checks/monitoring.go b/test/e2e/test/checks/monitoring.go
index 604e753cf..313935676 100644
--- a/test/e2e/test/checks/monitoring.go
+++ b/test/e2e/test/checks/monitoring.go
@@ -58,7 +58,7 @@ func (c stackMonitoringChecks) Steps() test.StepList {
}
}
-func (c stackMonitoringChecks) CheckBeatSidecars() test.Step {
+func (c stackMonitoringChecks) CheckBeatSidecarsInElasticsearch() test.Step {
return test.Step{
Name: "Check that beat sidecars are running",
Test: test.Eventually(func() error {| if (logstash.Status.ExpectedNodes != expected.ExpectedNodes) || | ||
| (logstash.Status.AvailableNodes != expected.AvailableNodes) || | ||
| (logstash.Status.Version != expected.Version) { | ||
| return fmt.Errorf("expected status %+v but got %+v", expected, logstash.Status) |
There was a problem hiding this comment.
Should we also check monitoringAssociationStatus.?
There was a problem hiding this comment.
not really, this check doesn't involve monitoring
There was a problem hiding this comment.
However, it is involved in all cases, whether monitoring is activated or not? I was thinking about something along these lines:
func CheckStatus(b Builder, k *test.K8sClient) test.Step {
return test.Step{
Name: "Logstash status should have the correct status",
Test: test.Eventually(func() error {
....
expectedAssociationInStatus := len(logstash.Spec.Monitoring.Metrics.ElasticsearchRefs) + len(logstash.Spec.Monitoring.Metrics.ElasticsearchRefs)
actualAssociationInStatus := len(logstash.Status.MonitoringAssociationStatus)
if expectedAssociationInStatus != actualAssociationInStatus {
return fmt.Errorf("expected %d associations in status but got %d", expectedAssociationInStatus, actualAssociationInStatus)
}
for a, s := range logstash.Status.MonitoringAssociationStatus {
if s != v1.AssociationEstablished {
return fmt.Errorf("association %s has status %s ", a, s)
}
}There was a problem hiding this comment.
I see. You are right. Added it to status check. Thank you!
Co-authored-by: Michael Morello <michael.morello@gmail.com>
Co-authored-by: Michael Morello <michael.morello@gmail.com>
Co-authored-by: Michael Morello <michael.morello@gmail.com>
barkbay
left a comment
There was a problem hiding this comment.
If there are more than 1 monitoring cluster referenced in metrics then the status states that all the associations are Established, while this configuration is not supported (and therefore, nothing is actually deployed):
apiVersion: logstash.k8s.elastic.co/v1alpha1
kind: Logstash
metadata:
annotations:
association.k8s.elastic.co/es-conf-272414756: '{....}'
association.k8s.elastic.co/es-conf-3519008756: '{....}'
association.k8s.elastic.co/es-conf-4222863958: '{....}'
creationTimestamp: "2023-03-21T08:07:05Z"
generation: 1
labels:
test-name: test-ls-mon-a
name: test-ls-mon-a-5ptv
namespace: e2e-mercury
resourceVersion: "757992"
uid: 74383387-d449-459f-a3fb-07279bef0368
spec:
count: 1
monitoring:
logs:
elasticsearchRefs:
- name: test-ls-mon-logs-lp4x
namespace: e2e-mercury
metrics:
elasticsearchRefs:
- name: test-ls-mon-metrics1-rr2c
namespace: e2e-mercury
- name: test-ls-mon-metrics2-zmvs
namespace: e2e-mercury
podTemplate: {}
version: 8.7.0-SNAPSHOT
status:
monitoringAssociationStatus:
e2e-mercury/test-ls-mon-logs-lp4x: Established
e2e-mercury/test-ls-mon-metrics1-rr2c: Established
e2e-mercury/test-ls-mon-metrics2-zmvs: Established
observedGeneration: 1I'm not sure this comes from your PR. Would you mind opening an issue if it's not the case as it seems wrong and misleading? Note that I was not running the webhook, which I think would have prevented this from happening, still looks wrong anyway.
test/e2e/test/logstash/builder.go
Outdated
| "sigs.k8s.io/controller-runtime/pkg/client" | ||
|
|
||
| lsctl "github.com/elastic/cloud-on-k8s/v2/pkg/controller/logstash" | ||
|
|
There was a problem hiding this comment.
nit: Imports formatting seems off
test/e2e/test/k8s_client.go
Outdated
| commonv1.TypeLabelName: label.Type, | ||
| label.ClusterNameLabelName: esName, | ||
| commonv1.TypeLabelName: typeLabel, | ||
| label.ClusterNameLabelName: name, |
There was a problem hiding this comment.
I think this will prevent the Logstash Pods from being listed (and any other Pod other than ES ones I guess) because they do not have a ClusterNameLabelName label.
There was a problem hiding this comment.
Reverted the changes of CheckBeatSidecars and renamed it as suggested
Co-authored-by: Michael Morello <michael.morello@gmail.com>
Co-authored-by: Michael Morello <michael.morello@gmail.com>
This reverts commit 218a1e9.
@barkbay Create an issue for tracking #6558. I could not replicate the problem locally. Please update the issue if you have steps to reproduce it. |
This commit needs to rebase onto #6404
Following the prior art, the PR integrates stack monitoring for Logstash by adding metricsbeat and filebeat as sidecars to deliver metrics and log to monitoring es cluster
There is a sample resource in
config/samples/logstash/logstash_stackmonitor.yamlto deploy and test. It will createAnd show status:
This PR includes tests
TestLogstashStackMonitoringsidecar_test.goStill todo: