Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

watcher on 0.92 does not get all objects during Init #1524

Closed
jfeldman325 opened this issue Jun 19, 2024 · 4 comments · Fixed by #1525
Closed

watcher on 0.92 does not get all objects during Init #1524

jfeldman325 opened this issue Jun 19, 2024 · 4 comments · Fixed by #1525
Assignees
Labels
bug Something isn't working runtime controller runtime related
Milestone

Comments

@jfeldman325
Copy link

Current and expected behavior

in 0.92 if you initialize a watcher with an Api object created with Api::all the watcher only reflects objects withing the current namespace into the store. Example:

    let subscriber = pod_writer
        .subscribe()
        .expect("subscribers can only be created from shared stores");
    // (2.5): create a watch stream
    let config = Config::default().concurrency(0);
    let new_client = Client::try_default().await.unwrap();
    let pods: Api<Pod> = Api::all(new_client);

    let pod_watch = watcher(pods, watcher::Config::default())
        .default_backoff()
        .reflect_shared(pod_writer)
        .modify(|pod| {
            pod.managed_fields_mut().clear();
            pod.spec.as_mut().unwrap().containers = vec![]; // We don't need any of this info, reduce memory footprint
        });

This snippit in 0.92 once the reflector returns as ready only contains pods from the current namespace when calling reader.state() , reverting to 0.91 fixes the issue and returns pods across all namespaces.

Possible solution

No response

Additional context

No response

Environment

Client Version: v1.28.4
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.24.12

Configuration and features

No response

Affected crates

No response

Would you like to work on fixing this bug?

None

@jfeldman325 jfeldman325 added the bug Something isn't working label Jun 19, 2024
@clux
Copy link
Member

clux commented Jun 19, 2024

uhh...wtf. this is actually not just related to reflectors (or the unstable stream sharing api that's included above), but seemingly watchers in general. tried using the pod_watcher example and modified the Api::namespaced to an Api::all and i am only getting a subset of the pods...

thanks for reporting this!

@clux clux added the runtime controller runtime related label Jun 19, 2024
@jfeldman325
Copy link
Author

uhh...wtf. this is actually not just related to reflectors (or the unstable stream sharing api that's included above), but seemingly watchers in general. tried using the pod_watcher example and modified the Api::namespaced to an Api::all and i am only getting a subset of the pods...

thanks for reporting this!

No problem! It was very confusing to debug since using the api object to simply list all still returns the expected result!

@clux
Copy link
Member

clux commented Jun 19, 2024

Confirming that this is a result of one of the many watcher changes in 0.92, yes.

The bug in the InitPage watcher state specifically going straight to InitListed even though we just iterated through one page.

State::InitPage {
continue_token,
mut objects,
last_bookmark,
} => {
if let Some(next) = objects.pop_front() {
return (Some(Ok(Event::InitApply(next))), State::InitPage {
continue_token,
objects,
last_bookmark,
});
}
if let Some(resource_version) = last_bookmark {
// we have drained the last page - move on to next stage
return (Some(Ok(Event::InitDone)), State::InitListed { resource_version });
}

it should be starting a new api.list from the previous continue_token (basically a fix is to check for whether continue_token.is_some() before returning InitListed.

@clux
Copy link
Member

clux commented Jun 19, 2024

Will have to do a patch release for this. It breaks store completeness and is a pretty bad bug. Possibly even yank worthy after a patch release.

As for why we never saw this:

  • unit tests did not catch this because they typically do a single page out of laziness and then check that regular watch events follow (which they do).
  • all the controllers i manually e2e tested, run on frequently updated objects, meaning the issue would "self-resolve" in some sense (eventually you'd get all objects)
  • mock test should have caught this, but it had a hole in its tests that only verified the continue token once (where it didn't matter)

we do need a better test case for this long term though. separate issue.

@clux clux changed the title Watcher on Namespaced Resource with Api::all Doesn't Reflect All Namespaces watcher on 0.92 does not get all objects during Init Jun 19, 2024
clux added a commit that referenced this issue Jun 19, 2024
cannot believe i missed this :(
fixes #1524

Signed-off-by: clux <[email protected]>
@clux clux self-assigned this Jun 19, 2024
@clux clux added this to the 0.92.1 milestone Jun 19, 2024
@clux clux closed this as completed in 5320d8f Jun 19, 2024
clux added a commit that referenced this issue Jun 19, 2024
* Fix watcher not fully paginating on Init

cannot believe i missed this :(
fixes #1524

Signed-off-by: clux <[email protected]>

* upgrade mock test (these fail on master)

Signed-off-by: clux <[email protected]>

---------

Signed-off-by: clux <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working runtime controller runtime related
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants