Operator crashing after sometime #188
Comments
Hi Saikiran, I don't think you're doing anything wrong. The error is happening at the fabric8 level. Let us get back to you ASAP with some ideas.
Hi @SaikiranDaripelli, in this case the restart is a simple workaround from our side, see: We will try to take a look at this soon.
@adam-sandor we could try to reconnect automatically from our side, but that should be done after the changes currently in progress.
Yeah, it would be great if we could do something about this. I guess many users of the KubernetesClient don't have this problem, as they don't watch things for a long time, but an operator does that by definition.
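To make the "reconnect automatically" idea concrete, here is a rough sketch of re-establishing a watch with capped exponential backoff instead of restarting the whole process. It deliberately uses no fabric8 or SDK types: `Connector`, `ReconnectingWatch`, `connectWithRetry` and the `sleeper` parameter are all hypothetical names made up for illustration, and the real fix in the SDK may look quite different.

```java
import java.util.function.LongConsumer;

/** Hypothetical helper; none of these names come from fabric8 or the operator SDK. */
final class ReconnectingWatch {

    /** Hypothetical hook that (re)opens the watch; it may fail, e.g. when the connection is closed. */
    interface Connector {
        void connect() throws Exception;
    }

    private ReconnectingWatch() {}

    /**
     * Calls connector.connect() until it succeeds or maxAttempts is reached.
     * The delay doubles after every failure and is capped at maxDelayMs.
     * The sleeper is injected so callers can pass a real sleep or a test stub.
     * Returns the 1-based number of the attempt that succeeded.
     */
    static int connectWithRetry(Connector connector, int maxAttempts,
                                long initialDelayMs, long maxDelayMs,
                                LongConsumer sleeper) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMs;
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                connector.connect();
                return attempt;
            } catch (Exception e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    sleeper.accept(delay);                 // wait before the next attempt
                    delay = Math.min(delay * 2, maxDelayMs);
                }
            }
        }
        // Only at this point would falling back to a process restart make sense.
        throw new IllegalStateException("could not re-establish the watch", lastFailure);
    }
}
```

In an operator, something like this would be invoked from whatever callback reports that the watch was closed, with `sleeper` set to a real sleep; giving up after `maxAttempts` keeps the current restart behaviour as a last resort.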
Thanks for answering my query. I went through the fabric8 issue, and they seem to suggest handling it on the client end.
@SaikiranDaripelli in short, no, because by default we check whether the generation increased, and in this case it won't have increased. (That check can be turned off, in which case the event will be reprocessed, because we cannot know whether it happened during an execution of the controller or not.) In this case we are maintaining the state (last processed generation) in memory. The issue: https://github.com/ContainerSolutions/java-operator-sdk/issues/38
Thanks, then retries without a restart will solve my current issue. Regarding storing state, can't the SDK itself do what I am doing right now as a workaround, i.e. store the last successfully processed generation in the status sub-resource upon successful controller execution, and discard the event if the current generation matches the one in the status sub-resource?
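For readers following along, the workaround described above could be sketched roughly as follows. This is not SDK code: `ResourceSnapshot`, `GenerationFilter` and the `observedGeneration` status field are hypothetical names chosen for illustration, and the sketch assumes the status sub-resource is enabled so that writing the status does not itself change `metadata.generation`.

```java
/** Hypothetical stand-in for a custom resource; not a type from the SDK or fabric8. */
final class ResourceSnapshot {
    final long generation;          // metadata.generation of the resource
    final Long observedGeneration;  // hypothetical status field; null until the first successful run

    ResourceSnapshot(long generation, Long observedGeneration) {
        this.generation = generation;
        this.observedGeneration = observedGeneration;
    }
}

/** Hypothetical helper that decides whether an event still needs to be handled. */
final class GenerationFilter {

    private GenerationFilter() {}

    /** Process the event unless the status already records the current generation. */
    static boolean shouldProcess(ResourceSnapshot resource) {
        Long observed = resource.observedGeneration;
        return observed == null || observed.longValue() != resource.generation;
    }

    /** After a successful controller execution, record the generation that was handled. */
    static ResourceSnapshot markProcessed(ResourceSnapshot resource) {
        return new ResourceSnapshot(resource.generation, resource.generation);
    }
}
```

Because the last processed generation lives in the resource rather than in operator memory, a restarted operator can skip events for resources it has already handled.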
@SaikiranDaripelli this could be done; it would be nicer if we could do this transparently. With the approach you are suggesting, we should probably provide some interface for getting the latest generation from the resource (the name of the field can differ between users). So this is definitely one of the ways to go. We will take a look after the changes we are currently working on.
How about putting that into an annotation?
Hi, I am encountering the same issue with the release version not matching. @SaikiranDaripelli, can you please share how you solved this issue on your end? Is there any other solution for this?
@PookiPok Right now there is no way to stop the operator controller from restarting.
@SaikiranDaripelli So is there any workaround for this for now?
@PookiPok @SaikiranDaripelli restarting the controller is basically the workaround (it restarts, but at least the system does not stop working) :( We can try to improve on this in the current version, but we are working on a big change now, and it will be easier to fix there.
Thank you, waiting for this fix in the next release.
I'm facing a similar issue with my operator. Because of the restarts, the pod is ending up in crash loop status. Is there any update on the fix, or any workaround which we can use?
@adam-sandor @charlottemach @kirek007 We should consider fixing this in the current version (before the event sources are released, since that might take a long time).
Hi,
We have an operator written using this SDK, and the operator pod is restarting every few hours with the exception below.
The code I am using is
Am I using it wrong?