Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API function TryUpdatingCache #630

Merged
merged 1 commit into from
Aug 2, 2019

Conversation

saschagrunert
Copy link
Member

The function can be used to reload the registries without harming the
cache on failure. On success, the cache will be updated as intended.
This function can be reused by the current GetRegistries API.

@vrothberg
Copy link
Member

This seems redundant to invalidating the cache and then calling GetRegistries() (minus the error case). Is there a concrete use case for that? I prefer keeping the API small but if there's a non-hypothetical use case, I would not block.

@saschagrunert
Copy link
Member Author

This seems redundant to invalidating the cache and then calling GetRegistries() (minus the error case). Is there a concrete use case for that? I prefer keeping the API small but if there's a non-hypothetical use case, I would not block.

My use case is that I only want to override the cache on successful parsing of the configuration, not in every case.


// GetRegistriesUncached loads the registries from the provided `SystemContext`
// without using the internal cache. On success, the found registries will be
// memoized into the internal registry cache.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; s/memoized/memorized/ or perhaps "added"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://en.wikipedia.org/wiki/Memoization ; though maybe that indicates that using simpler words would be easier to understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I meant, but I changed to to added.

Copy link
Member

@TomSweeneyRedHat TomSweeneyRedHat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will let @vrothberg and @mtrmac decide if the function should be added or not.

@vrothberg
Copy link
Member

My use case is that I only want to override the cache on successful parsing of the configuration, not in every case.

Who would be the consumer of the API?

@saschagrunert
Copy link
Member Author

My use case is that I only want to override the cache on successful parsing of the configuration, not in every case.

Who would be the consumer of the API?

The CRI-O live config reload feature. The thing is that I don't want to destroy the cache unless the config is valid.

@vrothberg
Copy link
Member

Thinking about it, I guess you're pointing to CRI-O's reload function where it could make sense. If @mtrmac is on board, I would prefer to add a parameter to GetRegistries(..) (e.g., updateCache bool). With the next release of c/image, consumers of sysregistriesv2 are very likely to be broken (s/URL/Location and friends).

@vrothberg
Copy link
Member

... [to be broken], so we could add the parameter.

@mtrmac
Copy link
Collaborator

mtrmac commented May 14, 2019

Thinking about it, I guess you're pointing to CRI-O's reload function where it could make sense. If @mtrmac is on board, I would prefer to add a parameter to GetRegistries(..) (e.g., updateCache bool).

Neither of the two APIs match how I imagine the callers work (nor how I see CRI-O working today).

Most callers should never call GetRegistries; they would (have c/image/docker implicitly) call FindRegistry; what value of updateCache should FindRegistry use for GetRegistries(updateCache)?

For a daemon, I imagine there’s a SIGHUP or similar mechanism to reload configuration (or, perhaps, “reload before starting to process every request, but not every single time while processing it); in that case “reload” is what the caller wants, “reload and return the results” seems like pointless extra return value.

Alternatively, I could imagine a “never cache, reload all the time if the file is valid” configuration — besides being a bit risky, that looks like an extra bit in SystemContext, so that it applies to all the implicit FindRegistry calls as well.

What precisely is the caller doing and what does it want?

@saschagrunert
Copy link
Member Author

For a daemon, I imagine there’s a SIGHUP or similar mechanism to reload configuration (or, perhaps, “reload before starting to process every request, but not every single time while processing it); in that case “reload” is what the caller wants, “reload and return the results” seems like pointless extra return value.

This API would be for the configuration reload of registries.conf during CRI-O runtime. This works by listening on SIGHUP as you already mentioned. My initial approach was to call InvalidateCache(), but this API does not return any result and immediately destroys the cache. Afterwards I would have to call GetRegistries(), which is currently the only function which repopulates the cache and loads the config from the TOML file. For debugging purposes I would give the user the feedback that the registries are verified and which registries were found. As a user I would like to see that my newly configured registry has been added successfully.

What precisely is the caller doing and what does it want?

The caller mainly wants to recreate the registries cache, but only on successful parsing of registries.conf. Beside this, the function should return a meaningful error and the result of the registry cache repopulation.

@vrothberg
Copy link
Member

This API would be for the configuration reload of registries.conf during CRI-O runtime. This works by listening on SIGHUP as you already mentioned. My initial approach was to call InvalidateCache(), but this API does not return any result and immediately destroys the cache. Afterwards I would have to call GetRegistries(), which is currently the only function which repopulates the cache and loads the config from the TOML file. For debugging purposes I would give the user the feedback that the registries are verified and which registries were found. As a user I would like to see that my newly configured registry has been added successfully.

That's pretty much how the API is intended to be used.

The caller mainly wants to recreate the registries cache, but only on successful parsing of registries.conf.

That is the key difference to the current API but the question remains why that is needed instead of using the current API? Could an empty cache lead to issues?

In case of CRI-O, I can imagine that the SIGHUP reload aims at try-reload semantics, meaning that the data will only be updated if it's valid and without error to avoid race-conditions.

@saschagrunert
Copy link
Member Author

saschagrunert commented May 15, 2019

That is the key difference to the current API but the question remains why that is needed instead of using the current API? Could an empty cache lead to issues?

Yes, I think so. Lets imagine the following scenario:

  1. Kubernetes pulls images via CRI-O, whereas the cache of registries.conf is populated and everything works as intended.
  2. The admin changes registries.conf, all background pulls are fine because the cache serves the registries
  3. The admin sends SIGHUP to CRI-O.
  • If we now invalidate the cache without pre-checking the configuration, then all pulls from that point in time are failing if the configuration is invalid.
  • If we pre-check the configuration and only invalidate the cache on successful validation, then everything should be fine.

In case of CRI-O, I can imagine that the SIGHUP reload aims at try-reload semantics, meaning that the data will only be updated if it's valid and without error to avoid race-conditions.

Yes, that's exactly the case. I really want to be sure that everything works before changing the internal application state. For the log_level it works the same way right now:
https://github.com/cri-o/cri-o/blob/0c50025d5f0351ae9acbc1fc46d1afd38c25f953/server/config.go#L218-L232

@mtrmac
Copy link
Collaborator

mtrmac commented May 15, 2019

For a daemon, I imagine there’s a SIGHUP or similar mechanism to reload configuration (or, perhaps, “reload before starting to process every request, but not every single time while processing it); in that case “reload” is what the caller wants, “reload and return the results” seems like pointless extra return value.

This API would be for the configuration reload of registries.conf during CRI-O runtime. This works by listening on SIGHUP as you already mentioned. My initial approach was to call InvalidateCache(), but this API does not return any result and immediately destroys the cache. Afterwards I would have to call GetRegistries(), which is currently the only function which repopulates the cache and loads the config from the TOML file. For debugging purposes I would give the user the feedback that the registries are verified and which registries were found. As a user I would like to see that my newly configured registry has been added successfully.

So, maybe: TryReloadingConfiguration (TryRefreshingCache? whatever, naming is hard) which either reloads the configuration and succeeds (and if we wanted to complicate things significantly, would return also an indication whether the configuration has changed[1]); or keeps the current cached configuration, if any, and fails.

On success, the code can optionally call GetRegistries to see what the changed configuration is, and log that.


[1] This is not something an external caller can quite trivially do, because adding an extra member to a structure does not break the API/ABI, but it can invalidate external code which tries to compare two Registry values.

@mtrmac
Copy link
Collaborator

mtrmac commented May 15, 2019

A further thought — do we even want InvalidateCache to exist at all, or should there only be that Try… API, which preserves the previous cache on failure?

I can see an argument for actually strictly&reliably failing if the configuration is currently invalid, instead of continuing to use some hard-to-describe old configuration and hiding the problem. It’s not very strong, though.

I suppose keeping InvalidateCache around doesn’t hurt, so we can default to keeping it, and give callers both options.

@saschagrunert
Copy link
Member Author

saschagrunert commented May 16, 2019

So, maybe: TryReloadingConfiguration (TryRefreshingCache? whatever, naming is hard) which either reloads the configuration and succeeds (...); or keeps the current cached configuration, if any, and fails.

Yeah, that's what the current implementaion already does, right? Or should I remove the []Registries from the return parameters? (From my POV this would over complicate things and double the work done internally.)

I suppose keeping InvalidateCache around doesn’t hurt, so we can default to keeping it, and give callers both options.

I agree and would not remove it.

@vrothberg
Copy link
Member

Yeah, that's what the current implementaion already does, right? Or should I remove the []Registries from the return parameters? (From my POV this would over complicate things and double the work done internally.)

I'd go with a TryUpdatingCache(..) ([]Registries, error). Callers can still ignore the first return value if they don't need it.

I also agree with keeping InvalidateCache(). It doesn't hurt to keep it around.

@saschagrunert
Copy link
Member Author

I'd go with a TryUpdatingCache(..) ([]Registries, error). Callers can still ignore the first return value if they don't need it.

Renamed the function according to your suggestion.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but needs a rebase

@saschagrunert
Copy link
Member Author

Rebased on top of the latest master branch.

@vrothberg
Copy link
Member

@mrunalp @mtrmac PTAL

@rhatdan
Copy link
Member

rhatdan commented Aug 1, 2019

@saschagrunert Are you still interested in this, can you rebase?

@saschagrunert
Copy link
Member Author

@saschagrunert Are you still interested in this, can you rebase?

Yes, rebased on top of the latest master branch

@rhatdan rhatdan changed the title Add API function GetRegistriesUncached Add API function TryReloadConfiguration Aug 1, 2019
@rhatdan
Copy link
Member

rhatdan commented Aug 1, 2019

@mtrmac @vrothberg PTAL

@saschagrunert saschagrunert changed the title Add API function TryReloadConfiguration Add API function TryUpdatingCache Aug 1, 2019
Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK on the API (though note the V2RegistriesConf comment); I’m afraid the implementation needs more updates for the interim code changes.

pkg/sysregistriesv2/system_registries_v2.go Outdated Show resolved Hide resolved
pkg/sysregistriesv2/system_registries_v2_test.go Outdated Show resolved Hide resolved
@saschagrunert
Copy link
Member Author

The CI needs #670 to succeed.

The function can be used to reload the registries without harming the
cache if it fails. On success, the cache will be updated as intended.
This function can be reused by the current `GetRegistries` API.

Signed-off-by: Sascha Grunert <[email protected]>
@saschagrunert
Copy link
Member Author

Rebased.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's get this into the v3.0.0 release.

@vrothberg vrothberg mentioned this pull request Aug 2, 2019
@vrothberg
Copy link
Member

Green, pulling the trigger. Thanks @saschagrunert 👍

@vrothberg vrothberg merged commit d2897d7 into containers:master Aug 2, 2019
@saschagrunert saschagrunert deleted the registry-cache branch August 2, 2019 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants