Standby operator pod starts reconciling after namespace change #2341

gyfora · 2024-04-11T15:20:19Z

Bug Report

Based on our production observations, in an HA setup with 2 operator pods the standby (follower) appears to start reconciling resources in parallel with the leader after a namespace change event.

This is what we see:

The leader operator reconciles correctly for a couple of days.
At some point we get the following log on the standby:

  Changing namespaces on 'flinkdeploymentcontroller' Controller to [...]...

In the Flink operator, this log appears together with the call: controller.changeNamespaces(namespaces);

After this log (and the namespace change), both the standby and the leader start to reconcile events. No LeaderElection-related logs can be seen around that time.

This causes all kinds of issues with the managed resources :)
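To make the suspected sequence concrete, here is a minimal, self-contained sketch of how a namespace change could start event processing on a standby. It is not the Java Operator SDK's actual code: `SketchController`, `guardOnLeadership`, `onStartedLeading` and `isReconciling` are hypothetical names, and the assumption that the event processor is restarted as part of the namespace change is only a guess at the mechanism.

```java
import java.util.Set;

// Hypothetical stand-in for a controller. None of these names come from the
// Java Operator SDK; they only model the behaviour described in this report.
class SketchController {
    // false models the reported behaviour, true models a leadership check.
    private final boolean guardOnLeadership;
    private boolean leader;
    private boolean processorRunning;
    private Set<String> namespaces = Set.of();

    SketchController(boolean guardOnLeadership) {
        this.guardOnLeadership = guardOnLeadership;
    }

    // Called when this pod wins the leader election.
    void onStartedLeading() {
        leader = true;
        processorRunning = true;
    }

    // Assumption: the processor is stopped while namespaces are swapped,
    // then started again afterwards.
    void changeNamespaces(Set<String> newNamespaces) {
        processorRunning = false;
        namespaces = Set.copyOf(newNamespaces);
        // Without the guard, the restart happens even on a standby pod.
        if (!guardOnLeadership || leader) {
            processorRunning = true;
        }
    }

    boolean isReconciling() {
        return processorRunning;
    }

    Set<String> watchedNamespaces() {
        return namespaces;
    }
}

class NamespaceChangeSketch {
    public static void main(String[] args) {
        // Standby pod, no leadership check: starts reconciling after the change.
        SketchController unguarded = new SketchController(false);
        unguarded.changeNamespaces(Set.of("ns-a", "ns-b"));
        System.out.println("unguarded standby reconciling: " + unguarded.isReconciling());

        // Standby pod with a leadership check: stays idle after the change.
        SketchController guarded = new SketchController(true);
        guarded.changeNamespaces(Set.of("ns-a", "ns-b"));
        System.out.println("guarded standby reconciling: " + guarded.isReconciling());

        // Leader with the same check: keeps reconciling after the change.
        SketchController leaderPod = new SketchController(true);
        leaderPod.onStartedLeading();
        leaderPod.changeNamespaces(Set.of("ns-a", "ns-b"));
        System.out.println("guarded leader reconciling: " + leaderPod.isReconciling());
    }
}
```

If the real cause is along these lines, a check on leadership before restarting the processor would keep the standby idle while the leader continues as before.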

Environment

EKS


gyfora · 2024-04-11T15:20:25Z

cc @csviri

csviri · 2024-04-11T15:22:19Z

thx @gyfora , we will take a look soon

csviri self-assigned this Apr 11, 2024

csviri linked a pull request Apr 12, 2024 that will close this issue

fix: change namespace starts processor on namespace change even if not leader #2344

Merged

csviri closed this as completed in #2344 Apr 15, 2024
