Contributor

@juanxiu juanxiu commented Aug 20, 2025

What does this PR do / why we need it:
This PR enhances the KubernetesBackend by modifying the Get method to use the informer cache instead of querying the Kubernetes API server directly when retrieving Application resources. It leverages the generic Informer's Lister() method to efficiently access cached resources, thereby reducing load on the API server and improving performance. Corresponding unit tests have been added to verify correct informer cache usage. This change creates a foundation for more efficient resource management via the informer caching mechanism.

Which issue(s) this PR fixes:

Fixes #251

How to test changes / Special notes to the reviewer:

  • Unit tests utilize a fake clientset combined with informer creation to validate that the Get method correctly retrieves Application resources from the informer cache instead of making direct API calls.
  • Manual or integration tests may confirm informer startup, cache synchronization, and API call reduction.
  • Reviewer should verify that the Get method no longer calls the API server directly and uses cached data.
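The verification described above can be sketched with the standard library alone. This is a toy model, not the PR's test code: `countingAPI`, `backend`, and `newTestBackend` are hypothetical names, and a plain map stands in for the informer cache that the fake clientset would populate in the real tests. The idea is to count direct API calls and assert that a cached Get leaves the counter at zero.

```go
package main

import (
	"errors"
	"fmt"
)

// Application is a hypothetical stand-in for the real Application resource.
type Application struct {
	Namespace string
	Name      string
}

// countingAPI is a hypothetical fake of the API server client. It records
// how often it is queried directly.
type countingAPI struct {
	calls int
	apps  map[string]*Application
}

// Get simulates a direct API call and increments the call counter.
func (c *countingAPI) Get(key string) (*Application, error) {
	c.calls++
	app, ok := c.apps[key]
	if !ok {
		return nil, errors.New("not found: " + key)
	}
	return app, nil
}

// backend is a toy backend whose Get is served from a local cache.
// In a real setup the cache would be filled by the informer.
type backend struct {
	api   *countingAPI
	cache map[string]*Application
}

// Get looks the object up in the cache only; it never touches b.api.
func (b *backend) Get(namespace, name string) (*Application, error) {
	key := namespace + "/" + name
	app, ok := b.cache[key]
	if !ok {
		return nil, errors.New("not found in cache: " + key)
	}
	return app, nil
}

// newTestBackend builds a backend whose cache already holds one Application.
func newTestBackend() *backend {
	apps := map[string]*Application{
		"argocd/guestbook": {Namespace: "argocd", Name: "guestbook"},
	}
	return &backend{api: &countingAPI{apps: apps}, cache: apps}
}

func main() {
	b := newTestBackend()
	app, err := b.Get("argocd", "guestbook")
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Printf("got %s/%s with %d direct API calls\n", app.Namespace, app.Name, b.api.calls)
}
```

With a real fake clientset, the equivalent check would likely inspect the recorded actions on the fake rather than a hand-rolled counter.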

Checklist

  • Documentation update is required by this PR (and has been updated) OR no documentation update is required.

Collaborator

@jannfis jannfis left a comment

Thanks @juanxiu for this PR.

I have a comment requiring some more discussion, PTAL.

if !ok {
	return nil, fmt.Errorf("object is not an Application: %T", obj)
}
return app, nil
Collaborator

Items returned from the cache need to be treated as read-only. I believe that's why unrelated unit tests are failing and panicking.

There are two options imho:

  1. We clearly document that objects returned by this function are to be treated as read-only, because they are retrieved directly from the cache, and the caller needs to make a copy if they want to modify it, or
  2. We return a copy of the object retrieved from the cache, to take the burden off the caller.

The first option puts more responsibility on the caller, but is resource efficient. The second option lets the caller modify the returned objects freely, but at the cost of increased memory consumption.

I have not yet made up my mind with regards to which solution I'd prefer. Which one do you think makes more sense?
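To make the trade-off between the two options concrete, here is a minimal sketch using only the standard library. The `store` type and its `GetShared` and `GetCopy` methods are hypothetical and only model the two options; they are not the backend's actual API. The `Application` type is a simplified stand-in with a hand-written `DeepCopy`.

```go
package main

import "fmt"

// Application is a simplified, hypothetical stand-in for the real resource.
type Application struct {
	Name   string
	Labels map[string]string
}

// DeepCopy returns a copy that shares no memory with the receiver.
func (a *Application) DeepCopy() *Application {
	out := &Application{Name: a.Name, Labels: make(map[string]string, len(a.Labels))}
	for k, v := range a.Labels {
		out.Labels[k] = v
	}
	return out
}

// store is a toy cache keyed by name.
type store struct {
	items map[string]*Application
}

// GetShared models option 1: the caller receives the cached pointer and
// must treat it as read-only.
func (s *store) GetShared(name string) *Application {
	return s.items[name]
}

// GetCopy models option 2: the caller receives a private copy.
func (s *store) GetCopy(name string) *Application {
	app, ok := s.items[name]
	if !ok {
		return nil
	}
	return app.DeepCopy()
}

// newStore returns a store holding one Application labelled env=prod.
func newStore() *store {
	return &store{items: map[string]*Application{
		"guestbook": {Name: "guestbook", Labels: map[string]string{"env": "prod"}},
	}}
}

func main() {
	s := newStore()

	// Option 2: writing to the copy leaves the cached object untouched.
	s.GetCopy("guestbook").Labels["env"] = "dev"
	fmt.Println("after GetCopy write:  ", s.items["guestbook"].Labels["env"])

	// Option 1: writing through the shared pointer changes the cache itself.
	s.GetShared("guestbook").Labels["env"] = "dev"
	fmt.Println("after GetShared write:", s.items["guestbook"].Labels["env"])
}
```

In the shared case every other reader of the cache sees the modification, which is why option 1 relies entirely on caller discipline.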

Contributor Author

In option 1, when objects are retrieved through multiple informers, there is a risk that callers may forget to use DeepCopy(). Such oversights can make it difficult to trace and resolve bugs. On the other hand, callers can use resources efficiently and have the flexibility to decide when to create copies.

Option 2 always returns a copy of the object from the function, so callers cannot control when the copy is made. However, callers can still customize behavior by registering event handlers with the informer. Returning a copy from the function enforces consistent object usage and prevents inconsistent handling of copying across different callers.

Personally, I chose option 2 because I believe minimizing the potential for bugs and ensuring consistent and stable behavior throughout the codebase is important.

For these reasons, I have modified the code to return app.DeepCopy(), nil.
Additionally, I plan to update the List methods in the Kubernetes backend for Application and AppProject resources to use the informer cache as well. I would appreciate it if we could merge this after those changes are completed.
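One reason to return a deep copy rather than a dereferenced struct copy is that a plain value copy in Go still shares maps and slices with the original. The sketch below illustrates this with a simplified, hypothetical `Application` type and a hand-written `DeepCopy`; `shallowCopy` is a hypothetical helper that exists only for the comparison.

```go
package main

import "fmt"

// Application is a simplified, hypothetical stand-in for the real resource.
type Application struct {
	Name   string
	Labels map[string]string
}

// DeepCopy duplicates the struct and the map it references.
func (a *Application) DeepCopy() *Application {
	out := &Application{Name: a.Name, Labels: make(map[string]string, len(a.Labels))}
	for k, v := range a.Labels {
		out.Labels[k] = v
	}
	return out
}

// shallowCopy copies the struct by value. The Labels field still refers to
// the same underlying map as the original.
func shallowCopy(a *Application) *Application {
	c := *a
	return &c
}

func main() {
	cached := &Application{Name: "guestbook", Labels: map[string]string{"env": "prod"}}

	s := shallowCopy(cached)
	s.Name = "renamed"      // affects only the copy
	s.Labels["env"] = "dev" // leaks into the cached object
	fmt.Println("after shallow copy write:", cached.Name, cached.Labels["env"])

	d := cached.DeepCopy()
	d.Labels["env"] = "test" // stays private to the copy
	fmt.Println("after deep copy write:   ", cached.Name, cached.Labels["env"])
}
```

The same consideration would apply to a cache-backed List: each element of the returned slice would need its own deep copy for callers to modify results safely.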

Collaborator

It seems there are other problems with this approach, at least regarding the tests. Some of them are still failing.

Contributor Author

There is no issue with loading the informer cache itself. The failures come from the unit tests. To load the cache, informer.Run must be executed, which requires calling the Start method of either the manager or the server. Until now, the test code has been written on the assumption that Start is never called. As a result, once Start is added, timing issues related to goroutine execution occur during test runs. How can we resolve this situation?

wq.On("Get").Return(&ev, false)
wq.On("Done", &ev)
s, err := NewServer(context.Background(), fac, "argocd", WithGeneratedTokenSigningKey(), WithAutoNamespaceCreate(true, "", nil))
s.Start(context.Background(), make(chan error))
Contributor Author

Tests written this way need a Start call so that the informer cache gets loaded.

Contributor Author

@jannfis Can you take a look at this issue and let me know if you have any ideas to resolve it?

Development

Successfully merging this pull request may close these issues.

Kubernetes backend should use informer cache for retrieving resources
2 participants