Logstash - add ability to reload pipeline(s) without triggering full pod restart #6674
|
@robbavey I tried to run logstash.yaml sample and update the Secret |
|
@kaisecheng It takes a surprisingly long time - I've seen it take up to 3 minutes for the change to get picked up. For manual testing, I typically make and apply the change then watch the pod logs. |
This could be due to the kubelet's sync interval, which is 60 seconds plus some jitter, and I guess there is probably also some delay on the Logstash side before the change is detected in the filesystem. We use a trick for Elasticsearch to force an immediate sync for the secrets where timely sync is important: we annotate the Pods (we use a timestamp, but the actual annotation is irrelevant), which forces a sync. I am not sure whether pipeline changes need such urgent propagation to warrant the extra complexity, but I thought I'd mention it: |
kaisecheng
left a comment
Thanks for taking care of the pipeline reload issue. It works as expected. I have tested pipeline updates through pipelines and pipelinesRef. Both take 1 to 2 minutes locally to reflect the change. The e2e test looks good. I ran it with export E2E_TAGS=e2e; make e2e-local TESTS_MATCH=TestPipelineConfigLogstash.
The only question on my mind is whether we should enable config.reload.automatic: true in ECK by default, because it is not enabled in Docker or in any other distribution.
|
@kaisecheng It's a good question, and worth discussing - I think when a change is made to a
My thoughts here are that a change to a definition in the CRD, such as a change to a This change allows that change to be acted upon with as little disruption as we can get away with - we can now limit the change to pipeline(s), rather than restarting the whole pod. If we don't set But, let's discuss |
|
Agree that having pipeline auto-reload is a better experience. Sadly, it is not the default in Logstash. When Users may expect pod restart when pipeline changes, just like changing in I am still not sure whether changing the default reload behavior only in ECK is right, but open to this change. |
|
@barkbay @pebrc Ready for review by ECK team. I implemented the suggested optimization to speed up pipeline loading - thanks for the tip! @kaisecheng and I discussed the |
|
buildkite test this -f p=gke,t=TestLogstashPipelineReload -m s=8.7.0 |
diff --git a/pkg/controller/common/reconciler/secret.go b/pkg/controller/common/reconciler/secret.go
index 0b6026f87..50004fd80 100644
--- a/pkg/controller/common/reconciler/secret.go
+++ b/pkg/controller/common/reconciler/secret.go
@@ -30,11 +30,17 @@ const (
SoftOwnerKindLabel = "eck.k8s.elastic.co/owner-kind"
)
+func WithPostUpdate(f func()) func(p *Params) {
+ return func(p *Params) {
+ p.PostUpdate = f
+ }
+}
+
// ReconcileSecret creates or updates the actual secret to match the expected one.
// Existing annotations or labels that are not expected are preserved.
-func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret, owner client.Object) (corev1.Secret, error) {
+func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret, owner client.Object, opts ...func(*Params)) (corev1.Secret, error) {
var reconciled corev1.Secret
- if err := ReconcileResource(Params{
+ params := Params{
Context: ctx,
Client: c,
Owner: owner,
@@ -54,7 +60,11 @@ func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret,
reconciled.Annotations = maps.Merge(reconciled.Annotations, expected.Annotations)
reconciled.Data = expected.Data
},
- }); err != nil {
+ }
+ for _, opt := range opts {
+ opt(&params)
+ }
+ if err := ReconcileResource(params); err != nil {
return corev1.Secret{}, err
}
return reconciled, nil
diff --git a/pkg/controller/logstash/pipeline.go b/pkg/controller/logstash/pipeline.go
index 6cbfee388..447ed7b8b 100644
--- a/pkg/controller/logstash/pipeline.go
+++ b/pkg/controller/logstash/pipeline.go
@@ -41,7 +41,13 @@ func reconcilePipeline(params Params) error {
},
}
- if err := reconcileSecretWithFastUpdate(params, expected); err != nil {
+ if _, err := reconciler.ReconcileSecret(params.Context, params.Client, expected, &params.Logstash,
+ reconciler.WithPostUpdate(func() {
+ annotation.MarkPodsAsUpdated(params.Context, params.Client,
+ client.InNamespace(params.Logstash.Namespace),
+ NewLabelSelectorForLogstash(params.Logstash),
+ )
+ })); err != nil {
return err
}
return nil
If we want to reuse the existing secret reconciliation, we could add a slice of option functions at the end.
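The option-function pattern suggested in this comment can be tried in isolation. The following is a simplified sketch, not the ECK reconciler: `Params` is reduced to the fields needed for the demonstration, and `reconcile` with its `changed` flag is a hypothetical stand-in for the real create-or-update logic.

```go
package main

import "fmt"

// Params is a stand-in for the reconciler parameters; only the hook and a
// name are modelled here.
type Params struct {
	Name       string
	PostUpdate func()
}

// Option mutates Params before reconciliation runs.
type Option func(*Params)

// WithPostUpdate registers a hook to run after a successful update.
func WithPostUpdate(f func()) Option {
	return func(p *Params) { p.PostUpdate = f }
}

// reconcile pretends to update a resource when changed is true, then runs the
// optional hook. It reports whether an update happened.
func reconcile(name string, changed bool, opts ...Option) bool {
	params := Params{Name: name}
	for _, opt := range opts {
		opt(&params)
	}
	if !changed {
		return false // nothing to update, so the hook must not fire
	}
	// A real reconciler would write the resource to the API server here.
	if params.PostUpdate != nil {
		params.PostUpdate()
	}
	return true
}

func main() {
	calls := 0
	hook := WithPostUpdate(func() { calls++ })

	reconcile("pipeline-secret", false, hook) // unchanged: hook skipped
	reconcile("pipeline-secret", true, hook)  // changed: hook runs
	reconcile("pipeline-secret", true)        // changed, no hook supplied

	fmt.Println("post-update hook calls:", calls)
}
```

Because the options are variadic, existing callers that pass no options keep compiling unchanged, which is what makes this a low-risk way to extend a shared function signature.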
Co-authored-by: Peter Brachwitz <peter.brachwitz@gmail.com>
pkg/controller/logstash/pipeline.go
Outdated
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/tracing"
"github.com/elastic/cloud-on-k8s/v2/pkg/controller/logstash/pipelines"

"github.com/elastic/cloud-on-k8s/v2/pkg/utils/maps"
Nit: group imports (max 3 groups: stdlib / external deps / internal deps).
pkg/controller/logstash/pipeline.go
Outdated
// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated
// to trigger a quicker reload of the updated secret than waiting for the kubelet sync interval to kick in
-// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated
-// to trigger a quicker reload of the updated secret than waiting for the kubelet sync interval to kick in
+// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated
+// to trigger a quicker reload of the updated secret rather than waiting for the kubelet sync to kick in.
// We intentionally DO NOT pass the configHash here. We don't want to consider the pipeline definitions in the
// hash of the config to ensure that a pipeline change does not automatically trigger a restart
// of the pod, but allows Logstash's automatic reload of pipelines to take place
if err := reconcilePipeline(params); err != nil {
Should we pass the configHash when config.reload.automatic equals false?
It's a good question. I'm erring on the side of 'no' at the moment, but I think this is something that could change after the technical preview depending on feedback.
My reasoning on this is that, with the default value of false, non-k8s Logstash doesn't react to pipeline changes at all, and changing this semantic to restart Logstash completely on pipeline changes feels like very different behaviour.
Thinking about how we could add flexibility, I wonder if we might want to introduce something for ECK here, along the lines of:
config.reload.restart_policy: detected_only|all|none, which would set config.reload.automatic: true for detected_only and false for all or none, passing the configHash if the value is all and not if it is none.
cc @flexitrev, @roaksoax, @jsvd
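To make the proposed mapping concrete, here is a hedged sketch of how the three values could translate into operator behaviour. Nothing here exists in ECK or Logstash: `RestartPolicy`, `Behaviour`, and `behaviourFor` are hypothetical names, and treating detected_only as not passing the configHash is an assumption based on the current behaviour in this PR rather than something the proposal states.

```go
package main

import "fmt"

// RestartPolicy models the hypothetical config.reload.restart_policy setting.
type RestartPolicy string

const (
	DetectedOnly RestartPolicy = "detected_only"
	All          RestartPolicy = "all"
	None         RestartPolicy = "none"
)

// Behaviour describes what the operator would do for a given policy.
type Behaviour struct {
	AutomaticReload bool // value written for config.reload.automatic
	HashPipelines   bool // whether pipeline definitions feed the configHash
}

// behaviourFor maps a policy to its behaviour as described in the proposal.
func behaviourFor(p RestartPolicy) (Behaviour, error) {
	switch p {
	case DetectedOnly:
		// Assumption: Logstash reloads changed pipelines itself, so the
		// pipelines stay out of the hash and the pod is not restarted.
		return Behaviour{AutomaticReload: true, HashPipelines: false}, nil
	case All:
		// A pipeline change alters the hash and restarts the pod.
		return Behaviour{AutomaticReload: false, HashPipelines: true}, nil
	case None:
		// Pipeline changes are neither reloaded nor cause a restart.
		return Behaviour{AutomaticReload: false, HashPipelines: false}, nil
	default:
		return Behaviour{}, fmt.Errorf("unknown restart policy %q", p)
	}
}

func main() {
	for _, p := range []RestartPolicy{DetectedOnly, All, None} {
		b, err := behaviourFor(p)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: automatic=%t hashPipelines=%t\n", p, b.AutomaticReload, b.HashPipelines)
	}
}
```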
Co-authored-by: Thibault Richard <thbkrkr@users.noreply.github.com>
This commit adds the ability to reload Logstash pipelines when a pipeline changes, without triggering a full restart of the pod. It leverages Logstash's ability to watch pipeline definitions and reload automatically when a change is detected.
A Logstash config directory typically includes logstash.yml, pipelines.yml, jvm.options and log4j2.properties, which are required to run Logstash. While Logstash can store the contents of a pipeline definition in any location, the pipelines.yml definition file, which states where these definition files are, must be in the same config directory as the other setup files.
To enable us to have a mixture of copied and generated files in this config directory, an initContainer is used, with a small script to prepare the config directory. The script places the contents of /usr/share/logstash/config into a shared config volume, together with the pipelines.yml and logstash.yml secrets created by the Logstash operator.
Additionally, we no longer include changes to pipelines in the configuration hash that triggers a pod restart, but instead write the config.reload.automatic: true setting into logstash.yml.
Note that triggering a reload is not immediate - there may be a delay measured in minutes between the pipeline definition being changed, and that being reflected in a pipeline reload.
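The effect of leaving pipelines out of the configuration hash can be illustrated with a small sketch. This is an illustration under stated assumptions, not the operator's hashing code: `restartHash` and its `includePipelines` flag are hypothetical, and FNV-1a is used only as a convenient standard-library hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// restartHash returns a hash over the inputs that should cause a pod restart
// when they change. Pipelines are only hashed when includePipelines is true.
func restartHash(settings, pipelines []byte, includePipelines bool) uint32 {
	h := fnv.New32a()
	_, _ = h.Write(settings)
	if includePipelines {
		_, _ = h.Write(pipelines)
	}
	return h.Sum32()
}

func main() {
	settings := []byte("config.reload.automatic: true\n")
	pipelinesV1 := []byte("pipeline.workers: 1\n")
	pipelinesV2 := []byte("pipeline.workers: 2\n")

	// Pipelines excluded: the hash is stable across a pipeline edit, so the
	// pod template does not change and Logstash reloads the pipeline itself.
	fmt.Println("excluded, unchanged hash:",
		restartHash(settings, pipelinesV1, false) == restartHash(settings, pipelinesV2, false))

	// Pipelines included: the same edit changes the hash, which would roll
	// the pod.
	fmt.Println("included, unchanged hash:",
		restartHash(settings, pipelinesV1, true) == restartHash(settings, pipelinesV2, true))
}
```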