-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix topic operator loop for unmanaged topics #10451
Conversation
a4abdbf
to
41330db
Compare
41330db
to
48f1cfe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think this is the right fix for this issue. Also, this issue would likely manifest also in other situations with managed topics.
It seems like the Topic operator keeps updating the .status
withotu changing it whihc is wrong. It needs to use the StatusDiff
class to detect if something other than the timestamp changed in the status and only if it did, it should update the custom resource status. That should be done regardless whether it is managed or not.
Aside from that, I would expect the topic to not be in the Ready
state when it is not managed but be in some special state such as Unmanaged
to indicate that the state has been reflected by the Topic Operator. Likely it should not have the topic ID set either as it will not know if it changed or not? But that is not the core of the issue here and can be discussed separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the good direction. I left one comment. But the tests also seemed to have failed in the Topic Operator which seems to be related I guess?
if (statusChanged(reconcilableTopic.kt(), oldStatus)) { | ||
|
||
StatusDiff statusDiff = new StatusDiff(oldStatus, reconcilableTopic.kt().getStatus()); | ||
if (!statusDiff.isEmpty() || differentConditions(oldStatus.getConditions(), reconcilableTopic.kt().getStatus().getConditions())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The StatusDiff should include the conditions. So why do we need the second part of the condition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, we can remove it, I misslooked how StatusDiff evaluates the conditions path.
I'll remove it and have a look at the tests as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for working on this. I like the StatusDiff logic and the new Unmanaged condition type, but they require some additional changes.
The setObservedGeneration
and setTopicName
should be computed before the StatusDiff when reconciliation is enabled. This is why the following integration test is failing:
shouldUpdateTopicInKafkaWhenListConfigChangedInKube
You should also handle the corner case of "new AND unmanaged" KTs been created (the expected status should be "Unmanaged" instead of "Ready"). This is why the following integration tests are failing:
shouldNotCreateTopicInKafkaWhenUnmanagedTopicCreatedInKube
shouldRestoreFinalizerIfRemoved
(unmanaged KTs only)
Likely it should not have the topic ID set either as it will not know if it changed or not?
I agree that we should reset the topicId when the KT is paused/unmanaged, and set the topicId when switching back to unpaused/managed. This is actually a separate bug, so I logged #10467 that we should address with a dedicated PR.
pom.xml
Outdated
@@ -162,6 +162,7 @@ | |||
<jupiter.version>5.8.2</jupiter.version> | |||
<strimzi-test-container.version>0.107.0</strimzi-test-container.version> | |||
<mockserver.version>5.13.2</mockserver.version> | |||
<awaitility.version>4.2.2</awaitility.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need an extra dependency when we have TopicOperatroTestUtils.waitUntilCondition
, and a number of wait methods in TopicControllerIT
that you can use or adapt?
Tests should be fixed, but I still need to have a look at the test and remove awaitability lib :) Sorry for the delay on this. |
00c30b6
to
6a3e244
Compare
No problem. Can you please rebase and fix checkstyle errors? |
23fd983
to
75963bd
Compare
I think this is already done in a different PR, so maybe this should be closed already :) |
If you are referring to #10481, that addresses a different issue. There is some overlap (introduction of the "Unmanaged" status condition) but there is no fix for the tight reconciliation loop reported by you. The important bit to deliver here is the introduction of StatusDiff. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Federico. This still fixes the main issue.
if (statusChanged(reconcilableTopic.kt(), oldStatus)) { | ||
|
||
StatusDiff statusDiff = new StatusDiff(oldStatus, reconcilableTopic.kt().getStatus()); | ||
if (!statusDiff.isEmpty() || synchronizableAndDifferentGenerations(reconcilableTopic.kt(), oldStatus)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand why the synchronizableAndDifferentGenerations
. What exactly does it check that isn't covered by the StatusDiff? synchronizableAndDifferentGenerations
seems to to return true for manage unpaused topics with different generations. But even StatusDiff should be able to handle that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@padilo I tested your changes after removing synchronizableAndDifferentGenerations
and they look good. If you remove that code, then we are good to go. Thanks.
75963bd
to
ea813f6
Compare
done :) |
There is an unused import to fix. |
Signed-off-by: Pablo Díaz <[email protected]>
ea813f6
to
3bcb22a
Compare
/azp run regression |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM assuming the system tests pass. Thanks for the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
Type of change
Description
Fixes a bug in the topic operator with unmanaged topics, that keeps updating the custom resources even isn't going to update the Kafka Topic
Closes #10380
Checklist
Please go through this checklist and make sure all applicable tasks have been done