Kopf stops receiving namespace events #232

Open
kopf-archiver bot opened this issue Aug 18, 2020 · 1 comment
Labels
archive bug Something isn't working

Comments


kopf-archiver bot commented Aug 18, 2020

An issue by logicfox at 2019-11-14 09:13:38+00:00
Original URL: zalando-incubator/kopf#232
 

Expected Behavior

Kopf should actively receive all namespace events.

import kopf

# Raw event handler: called for every watch event on core/v1 namespaces.
@kopf.on.event('', 'v1', 'namespaces')
async def handle_event(event, **kwargs):
    logger = kwargs["logger"]
    logger.debug(f"Event: {event}. Cause: {kwargs.get('cause')}.")

Actual Behavior

Kopf receives events for a while and then stops receiving them. The create, update, and delete event handlers are no longer triggered, and the events do not show up in the raw event handler either.

Steps to Reproduce the Problem

  1. Set up kopf to listen to Namespace events (as shown above)
  2. Log a message when events occur
  3. Create and delete namespaces on a cron (once an hour or so). Notice that Kopf stops receiving events after a period of time.

Specifications

  • Platform: Azure Kubernetes Service
  • Kubernetes version: 1.13.10
  • Python version: python:3.8.0-slim-buster
  • Python packages installed: kopf requests requests_oauthlib parse

Commented by nolar at 2019-11-14 09:32:20+00:00
 

logicfox Can you please add Kopf's version too? Run pip freeze | grep kopf or kopf --version


Commented by logicfox at 2019-11-14 09:53:17+00:00
 

Sure

kopf==0.22

Commented by nolar at 2019-11-14 11:28:12+00:00
 

Maybe a duplicate of #204 #142 (not certain though).

logicfox Can you please try it with kopf>=0.23rc2? Specifically, kopf==0.23rc1 switches all the I/O internally to asyncio+aiohttp (#227). This already solved some issues with the synchronous sockets freezing in some cases, and maybe solves all the other issues with similar symptoms.

Please be aware of the massive changes in this RC (see the 0.23rc1 and, optionally, 0.23rc2 release notes) if you have a pre-existing operator that could be affected. In theory, it should be fully backward compatible and safe, but who knows what can break in practice.


Commented by logicfox at 2019-11-18 21:36:25+00:00
 

nolar Sorry, I couldn't test this earlier. But it looks like the problem is still there in the master branch. watch seems to freeze after a while. I'm going to test this with the raw Kubernetes Python client to see if it's an issue with my cluster.


Commented by corka149 at 2020-04-29 18:48:28+00:00
 

We experienced the same issue until we upgraded Kubernetes to 1.15.10 in AKS. In addition, I changed the version of Kopf from 0.25 to 0.26.

Regarding the situation before the upgrade: I noticed that events for CRDs were still being received.


Commented by atamgp at 2020-05-03 08:29:05+00:00
 

Not sure if this is related, but on [email protected] and [email protected] I tried:

@kopf.on.login()
def login_fn(**kwargs):
    # return kopf.login_via_client(**kwargs)
    return kopf.login_via_pykube(**kwargs)

@kopf.on.event('', 'v1', 'namespaces')
# @kopf.on.create('core', 'v1', 'namespaces')

Results in

aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('{eksUrl}/api/v1/namespaces')

In the same prompt, kubectl get namespaces works.

I upgraded kopf to 0.27rc5 and got it working with:

@kopf.on.login()
def login_fn(**kwargs):
    return kopf.login_via_client(**kwargs)

@kopf.on.create('', 'v1', 'namespaces')

Commented by jumpojoy at 2020-07-29 18:15:08+00:00
 

By default, timeoutSeconds for the watch session is set neither in Kopf (https://github.com/nolar/kopf/blob/master/kopf/structs/configuration.py#L68) nor in the Kubernetes API (https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/). As a result, the session might get stuck forever. Setting watching.server_timeout to some value might help here. It is important to set server_timeout to a value less than watching.client_timeout (which is the aiohttp session's global timeout).

@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
  settings.watching.server_timeout = 300

I think it is not only the watch session that might get stuck, as other calls don't have a default timeout configured either. I've proposed setting timeouts globally per aiohttp session (#377), but it looks like it is not possible to override settings in the way proposed in the patch, so it has to be updated.
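The constraint mentioned above (server_timeout must be smaller than client_timeout) can be sketched as a small helper. This is only an illustration, not part of Kopf: the helper name apply_watch_timeouts and the margin parameter are hypothetical, and the values 300 and 60 are arbitrary. The only Kopf names assumed are settings.watching.server_timeout and settings.watching.client_timeout, as named in the comment above, plus the startup handler shown there. The Kopf import is guarded so the helper can also be tried without Kopf installed.

```python
from types import SimpleNamespace


def apply_watch_timeouts(settings, server_timeout=300.0, margin=60.0):
    """Set both watch timeouts so the server side always expires first.

    `settings` is any object with a `.watching` attribute, e.g. the
    kopf.OperatorSettings passed to a startup handler. The helper name
    and `margin` are hypothetical, chosen for this sketch only.
    """
    if server_timeout <= 0 or margin <= 0:
        raise ValueError("server_timeout and margin must be positive")
    # Ask the API server to end each watch session after this many seconds.
    settings.watching.server_timeout = server_timeout
    # Give the client a strictly larger limit, so that it only fires
    # when the server-side timeout did not end the session.
    settings.watching.client_timeout = server_timeout + margin
    return settings


try:
    import kopf
except ImportError:  # Kopf not installed: only the helper is usable.
    kopf = None

if kopf is not None:
    @kopf.on.startup()
    def configure(settings: kopf.OperatorSettings, **_):
        apply_watch_timeouts(settings, server_timeout=300, margin=60)
```

With these values the watch session is ended by the server after 300 seconds, and the client-side limit of 360 seconds acts only as a fallback for a connection that stays silent past that point.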

@kopf-archiver kopf-archiver bot closed this as completed Aug 18, 2020
@kopf-archiver kopf-archiver bot changed the title [archival placeholder] Kopf stops receiving namespace events Aug 19, 2020
@kopf-archiver kopf-archiver bot added the bug Something isn't working label Aug 19, 2020
@kopf-archiver kopf-archiver bot reopened this Aug 19, 2020

ps-jay commented Oct 7, 2020

Just to keep this issue ticking along since the project moved to the nolar space.

I'm on Kubernetes v1.18.6, using Kopf v0.27, and have observed this issue.
@jumpojoy's solution seems to have fixed it for me:

@kopf.on.startup()
def configure(settings: kopf.OperatorSettings, **_):
  settings.watching.server_timeout = 300

MarkusH added a commit to crate/crate-operator that referenced this issue Nov 2, 2020
Every now and then, the operator might get stuck when watching for
events on the K8s API. By setting some timeouts this can be mitigated.

Refs nolar/kopf#232
Refs https://kopf.readthedocs.io/en/latest/configuration/#api-timeouts
mergify bot pushed a commit to crate/crate-operator that referenced this issue Nov 2, 2020
Every now and then, the operator might get stuck when watching for
events on the K8s API. By setting some timeouts this can be mitigated.

Refs nolar/kopf#232
Refs https://kopf.readthedocs.io/en/latest/configuration/#api-timeouts