Logstash - add ability to reload pipeline(s) without triggering full pod restart by robbavey · Pull Request #6674 · elastic/cloud-on-k8s

robbavey · 2023-04-10T21:35:48Z

This commit adds the ability to reload logstash pipelines when the pipeline changes, without triggering a full restart of the pod, leveraging Logstash's ability to watch pipeline definitions, and reload automatically if a change is discovered.

A logstash config directory typically includes logstash.yml, pipelines.yml, jvm.options and log4j2.properties required to run logstash - while logstash can store the contents of a pipeline definition in any location, the pipelines.yml definition file, which states where these definition files are must be in the same config directory as the other setup files.

To enables us to have a mixture of copied and generated files in this config diretory, an initContainer is used, with a small script to prepare the config directory.

The script copies the contents of the /usr/share/logstash/config into the a shared config volume
Additionally softlinks are created into this volume from the pipelines.yml and logstash.yml secrets created by the logstash operator.

Additionally, we now do not include changes to pipelines in the configuration hash that triggers a pod reload, but instead write the config.reload.automatic: true setting in to logstash.yml

Note that triggering a reload is not immediate - there may be a delay measured in minutes between the pipeline definition being changed, and that being reflected in a pipeline reload.

kaisecheng · 2023-04-11T16:03:56Z

@robbavey I tried to run logstash.yaml sample and update the Secret logstash-sample-ls-pipeline with kubectl apply -f, but unable to update the pipeline to test the reload. Do you have any trick to update the pipeline?

robbavey · 2023-04-11T16:18:40Z

@kaisecheng It takes a surprisingly long time - I've seen it take up to 3 minutes for the change to get picked up. For manual testing, I typically make and apply the change then watch the pod logs.

…pipelines

pebrc · 2023-04-11T21:09:26Z

It takes a surprisingly long time - I've seen it take up to 3 minutes for the change to get picked up

This could be due to the kubelets sync interval which is 60 seconds plus some jitter and I guess there is probably also some delay on the Logstash side before the change is detected in the filesystem. We use a trick for Elasticsearch to force an immediate sync for the secrets where timely sync is important: we annotate the Pods (we use a timestamp but the actual annotation is irrelevant) which forces a sync.

I am not sure if pipeline changes need such urgent propagation to warrant such extra complexity but I thought I mention it:

cloud-on-k8s/pkg/controller/elasticsearch/certificates/transport/reconcile.go

Line 166 in 85aa75b

annotation.MarkPodAsUpdated(ctx, c, pod)

kaisecheng

Thanks for taking care of the pipeline reload issue. It works as expected. I have tested pipeline updates through pipelines and pipelinesRef. Both take 1 ~ 2 minutes locally to reflect the change. The e2e test looks good. I run it with export E2E_TAGS=e2e; make e2e-local TESTS_MATCH=TestPipelineConfigLogstash.

The only question on my mind is whether we should enable config.reload.automatic: true in ECK by default, because it is not enabled in docker nor any distribution

robbavey · 2023-04-11T23:08:15Z

@kaisecheng It's a good question, and worth discussing -

I think when a change is made to a pipeline or pipelineRef in the CRD, we can choose to

Restart the pipeline(s)
Restart the Pod
Do nothing

My thoughts here are that a change to a definition in the CRD, such as a change to a pipeline or pipelineRef is a deliberate action, that a user might expect to be applied on without manual intervention - similar to a change in config or configRef, which would trigger a restart of a pod.

This change allows that change to be acted upon with as little disruption as we can get away with - we can now limit the change to pipeline(s), rather than restarting the whole pod. If we don't set config.reload.automatic, a user would have to intervene manually, and delete the pod(s) to have the pipeline change come into effect, if we don't restart the pod.

But, let's discuss

kaisecheng · 2023-04-12T12:19:31Z

Agree that having pipeline auto-reload is a better experience. Sadly, it is not the default in Logstash. When config.reload.automatic is set to false, changes in pipeline do not restart the pipeline nor restart the pod.

Users may expect pod restart when pipeline changes, just like changing in config configRef results in a pod restart.
I think making pod restart when reload is set to false and pipeline restart when reload is set to true is truly making reloading a configurable flag. It can be done by passing the boolean flag to reconcilePipeline() to update configHash.

I am still not sure whether changing the default reload behavior only in ECK is right, but open to this change.

…e changes

robbavey · 2023-04-13T14:41:18Z

@barkbay @pebrc Ready for review by ECK team. I implemented the suggested optimization to speed up pipeline loading - thanks for the tip!

@kaisecheng and I discussed the config.reload.automatic: true setting and we decided to keep the implementation as is - set the config.reload.automatic setting to true to allow pipeline changes to be reacted upon automatically, but we will note this in the documentation, and explain how to turn off automatic reloads.

pebrc · 2023-04-21T11:02:51Z

buildkite test this -f p=gke,t=TestLogstashPipelineReload -m s=8.7.0

pebrc

LGTM

pkg/controller/logstash/driver.go

pkg/controller/logstash/initcontainer.go

pebrc · 2023-04-21T10:53:36Z

pkg/controller/logstash/pipeline.go

+			)
+		},
+	})
+}


diff --git a/pkg/controller/common/reconciler/secret.go b/pkg/controller/common/reconciler/secret.go index 0b6026f87..50004fd80 100644 --- a/pkg/controller/common/reconciler/secret.go +++ b/pkg/controller/common/reconciler/secret.go @@ -30,11 +30,17 @@ const ( SoftOwnerKindLabel = "eck.k8s.elastic.co/owner-kind" ) +func WithPostUpdate(f func()) func(p *Params) { + return func(p *Params) { + p.PostUpdate = f + } +} + // ReconcileSecret creates or updates the actual secret to match the expected one. // Existing annotations or labels that are not expected are preserved. -func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret, owner client.Object) (corev1.Secret, error) { +func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret, owner client.Object, opts ...func(*Params)) (corev1.Secret, error) { var reconciled corev1.Secret - if err := ReconcileResource(Params{ + params := Params{ Context: ctx, Client: c, Owner: owner, @@ -54,7 +60,11 @@ func ReconcileSecret(ctx context.Context, c k8s.Client, expected corev1.Secret, reconciled.Annotations = maps.Merge(reconciled.Annotations, expected.Annotations) reconciled.Data = expected.Data }, - }); err != nil { + } + for _, opt := range opts { + opt(&params) + } + if err := ReconcileResource(params); err != nil { return corev1.Secret{}, err } return reconciled, nil diff --git a/pkg/controller/logstash/pipeline.go b/pkg/controller/logstash/pipeline.go index 6cbfee388..447ed7b8b 100644 --- a/pkg/controller/logstash/pipeline.go +++ b/pkg/controller/logstash/pipeline.go @@ -41,7 +41,13 @@ func reconcilePipeline(params Params) error { }, } - if err := reconcileSecretWithFastUpdate(params, expected); err != nil { + if _, err := reconciler.ReconcileSecret(params.Context, params.Client, expected, &params.Logstash, + reconciler.WithPostUpdate(func() { + annotation.MarkPodsAsUpdated(params.Context, params.Client, + client.InNamespace(params.Logstash.Namespace), + NewLabelSelectorForLogstash(params.Logstash), + ) + })); err != nil { return err } return nil

If we want to reuse the existing secret reconciliation we could add a slice of option functions at the end

Co-authored-by: Peter Brachwitz <peter.brachwitz@gmail.com>

thbkrkr

Damn, I had reviewed this two days ago but didn't submit it sorry.
Looks good! I spotted a few minor things, nothing blocking to merge.

pkg/controller/logstash/initcontainer.go

thbkrkr · 2023-04-19T14:55:24Z

pkg/controller/logstash/pipeline.go

 	"github.com/elastic/cloud-on-k8s/v2/pkg/controller/common/tracing"
 	"github.com/elastic/cloud-on-k8s/v2/pkg/controller/logstash/pipelines"
+
+	"github.com/elastic/cloud-on-k8s/v2/pkg/utils/maps"


Nit: group imports (max 3 groups: stdlib / external deps / internal deps).

thbkrkr · 2023-04-19T14:59:35Z

pkg/controller/logstash/pipeline.go

+// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated
+// to trigger a quicker reload of the updated secret than waiting for the kubelet sync interval to kick in


Suggested change

// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated

// to trigger a quicker reload of the updated secret than waiting for the kubelet sync interval to kick in

// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated

// to trigger a quicker reload of the updated secret rather than waiting for the kubelet sync to kick in.

thbkrkr · 2023-04-21T13:30:53Z

pkg/controller/logstash/driver.go

+	// We intentionally DO NOT pass the configHash here. We don't want to consider the pipeline definitions in the
+	// hash of the config to ensure that a pipeline change does not automatically trigger a restart
+	// of the pod, but allows Logstash's automatic reload of pipelines to take place
+	if err := reconcilePipeline(params); err != nil {


Should we pass the configHash when config.reload.automaticequals false?

It's a good question. I'm erring on the side of 'no' at the moment, but I think this is something that could change after the technical preview depending on feedback.

My reasoning on this is that the false (default) value of non-k8s logstash doesn't react to pipeline changes at all, and to change this semantic to restart logstash completely on pipeline changes feels like very different behaviour.

Thinking about how we could add flexibility, I wonder if we might want to introduce something for ECK here, along the lines of:

config.reload.restart_policy: detected_only|all|none, which would either set config.reload.automatic: true for detected_only, and false for all or none, passing the configHash if the value is all, and not if it is none.

cc @flexitrev, @roaksoax, @jsvd

Co-authored-by: Thibault Richard <thbkrkr@users.noreply.github.com>

robbavey · 2023-04-21T15:18:05Z

Ready for another round @pebrc @thbkrkr

robbavey added 6 commits April 7, 2023 16:54

Copy pipeline and logstash.yml to initContainer

0111ad9

Working

b5f0ed8

Tidy up

8500980

Lint fixes

603a83e

Add e2e tests for pipeline changes

0c3aeb4

Remove unnecessary configHash from pipeline reconciliation

505f1fc

botelastic bot added the triage label Apr 10, 2023

robbavey requested a review from kaisecheng April 10, 2023 21:36

thbkrkr added the >enhancement Enhancement of existing functionality label Apr 11, 2023

botelastic bot removed the triage label Apr 11, 2023

Remove unused import

a1aadae

robbavey mentioned this pull request Apr 11, 2023

Logstash pipelines #6480

Merged

robbavey added 2 commits April 11, 2023 12:35

Merge remote-tracking branch 'upstream/feature/logstash' into reload_…

a5d84ce

…pipelines

Update tests with pipeline reload functionality

5843ff7

kaisecheng reviewed Apr 11, 2023

View reviewed changes

Mark pods as updated to speed up discovery and application of pipelin…

d09f7fe

…e changes

robbavey requested review from barkbay and pebrc April 20, 2023 21:31

pebrc approved these changes Apr 21, 2023

View reviewed changes

Apply suggestions from code review

fb157ad

Co-authored-by: Peter Brachwitz <peter.brachwitz@gmail.com>

thbkrkr reviewed Apr 21, 2023

View reviewed changes

Code review suggestions

598fe50

Apply suggestions from code review

1884ff9

Co-authored-by: Thibault Richard <thbkrkr@users.noreply.github.com>

pebrc approved these changes Apr 21, 2023

View reviewed changes

robbavey merged commit f984355 into elastic:feature/logstash Apr 21, 2023

thbkrkr added the :logstash label Jun 23, 2023

		// This function reconciles the secret, but then adds a postUpdate step to mark the pods as updated
		// to trigger a quicker reload of the updated secret than waiting for the kubelet sync interval to kick in

Conversation

robbavey commented Apr 10, 2023

Uh oh!

kaisecheng commented Apr 11, 2023

Uh oh!

robbavey commented Apr 11, 2023

Uh oh!

pebrc commented Apr 11, 2023

Uh oh!

kaisecheng left a comment

Choose a reason for hiding this comment

Uh oh!

robbavey commented Apr 11, 2023

Uh oh!

kaisecheng commented Apr 12, 2023

Uh oh!

robbavey commented Apr 13, 2023

Uh oh!

pebrc commented Apr 21, 2023

Uh oh!

pebrc left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

pebrc Apr 21, 2023

Choose a reason for hiding this comment

Uh oh!

thbkrkr left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

thbkrkr Apr 19, 2023

Choose a reason for hiding this comment

Uh oh!

thbkrkr Apr 19, 2023

Choose a reason for hiding this comment

Uh oh!

thbkrkr Apr 21, 2023

Choose a reason for hiding this comment

Uh oh!

robbavey Apr 21, 2023

Choose a reason for hiding this comment

Uh oh!

robbavey commented Apr 21, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

thbkrkr left a comment •

edited

Loading