-
Notifications
You must be signed in to change notification settings - Fork 395
Cleanup sysregsv2 #464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup sysregsv2 #464
Conversation
0f91c67 to
5d3ff08
Compare
| } | ||
| // parseURL parses the input string, performs some sanity checks and returns | ||
| // the sanitized input string. The input must be a valid Reference | ||
| // github.com/containers/image/docker/reference. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Careful... in c/image/docker/reference, example.com is (domain="", name within domain = example.com).
So…
| // strip password for security reasons | ||
| uri.User = url.UserPassword(uri.User.Username(), "xxxxxx") | ||
| return url.URL{}, fmt.Errorf("unsupported username/password: %q", uri) | ||
| if _, err := reference.Parse(trimmed); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… this seems to mostly work, but it accepts test_com (which is an invalid DNS name).
I think this is good enough (at least we are not, it seems, rejecting valid hostname.com[:port] input, and any hostname.com/names… input will be parsed correctly), but instead of the comment about a valid Reference there should be a comment warning about the mismatch between what the parser sees and what the data mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will correct the comment. It's a bit frustrating how tough parsing URI's without scheme can be. net/url.URL really didn't help much as, for instance, the scheme of example.com:5000 would be example.com.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test_com issue did not satisfy me, so I had another look at net/url and found the following comment:
URLs that do not start with a slash after the scheme are interpreted as:
scheme:opaque[?query][#fragment]
By considering the URL.Opaque, we can get a thorough validation. I will update the code and you can check if you prefer docker/reference or that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
test_comissue did not satisfy me, so I had another look at net/url and found the following comment
FWIW the current parseURL still accepts test_com; the URL parser is not in the business of enforcing DNS formatting of host names.
As with the previous version, I think accepting that is good enough.
| if len(v1Registries) > 0 { | ||
| if len(registries) > 0 { | ||
| return nil, fmt.Errorf("mixing sysregistry v1/v2 is not supported") | ||
| return nil, errors.Wrap(&InvalidRegistries{}, "mixing sysregistry v1/v2 is not supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn’t this, and all other uses of InvalidRegistries, be return nil, InvalidRegistries{s:"…"} ? It’s surprising for no user of InvalidRegistries to set the s field (why does it exist at all, then?), and if I’m reading it correctly, the errors.Wrapf use would result in an extra trailing ": ".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(It might also be more convenient to use type InvalidRegistries string, because the error case would be just return nil, InvalidRegistries("…"). OTOH it would be easier to accidentally mix the types than when a struct is involved. Either is perfectly fine.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, changed to &InvalidRegistries{s: ...}.
| } | ||
|
|
||
| if reg.Prefix == "" { | ||
| reg.Prefix = reg.URL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/validateRegistries/cleanAndValidateRegistries/, or something like that, maybe?
It’s surprising for a “validate” function to modify its input.
| for _, reg := range registries { | ||
| others, exists := regMap[reg.URL] | ||
| if !exists { | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can this ever happen? And if it can’t, shouldn’t the function fail with Internal error instead of silently continuing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I basically forgot to remove those lines; they're debugging artifacts.
Forbid the usage of URI schemes (e.g., http, https, etc.) in the URLs of
registries and mirrors. The reasoning behind is that the `Insecure`
option is sufficient to control skipping of the certs verification and
to allow non-TLS connections.
The change of forbidding URI schemes entails some changes in the code:
- Move from net/url.URL to simple strings as the parsing does not work
properly when URI schemes aren't present.
- The rather complex embedding of different types to allow
(un)marshalling can now be fully avoided.
- Update the tests accordingly.
Refer to the following issue for a background discussion:
containers/podman#721
Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
A registry can be defined multiple times, for instance, to allow different prefixes to be backed by a different set of mirrors. However, we need to make sure that there is no conflict among those registries (e.g., one being secure while the other isn't). Also, add a custom InvalidRegistries error to distinguish between errors caused by an invalid or unsupported configuration and other errors, for instance, caused by unmarshalling issues. Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
5d3ff08 to
ecdf786
Compare
|
@mtrmac I addressed the PR according to your comments. I also squashed the last two commits into one. |
|
Man, that If so, LGTM, and I’ll add the comment after merging this PR. |
Yes, the comment is correct. I'll make sure to have more verbose comments for such wild cases in the future. As always, thanks a lot, @mtrmac! |
|
@baude @mheon @mtrmac @rhatdan @runcom @TomSweeneyRedHat (please add more to list if needed) As you might know, my current plan is to port libpod to use sysregistriesv2 to allow the usage of mirrors. However, there is one thing I noticed that might hurt us in the future: libpod is vendoring buildah, and both are using the sysregistries package (libpod provides even a wrapper package for it). This means that I'd need to update buildah first and then libpod (with vendoring in the updated buildah). It's not only my laziness speaking to avoid chain updates (now and in the future), but also from an architectural point of view: Does it make sense to refactor the pull/push code from buildah and libpod into c/images? buildah and libpod already use c/images, but what we need is an easier to use API with the notion of registries (and mirrors) as we're having with the sysregistriesv2 package. The API certainly requires some flexibility, and we might hit some rough edges here and there but we had the ultimate gain of fast delivery and reduced redundancy. Testing efforts could also be centralized in c/images. @mtrmac, sorry for hijacking this PR to kickoff the discussion, but I couldn't figure out a better way. |
|
For my part, I'd like to see push/pull code move out into a library. We've already split it into a separate package in the |
As mentioned earlier, I’d love for c/image/docker to transparently handle the The one thing I’d like to currently keep out is using the search list for unqualified names, because as it is right now it breaks name matching in signature verification. Or figure out some way to define reasonable name matching semantics for the unqualified names. |
|
I am fine with working on getting a library that could be shared, between the two tools. |
@mtrmac Can you elaborate a bit on this issue? Isn't the search list used already in both, Podman/libpod and Buildah? |
Yes :( Imprecisely, signature verification takes the image reference to be pulled, looks up the relevant public key, and then requires a signature with 1) that public key 2) matching the image reference. The lookup uses a hierarchical namespace looking like Docker references, i.e. something like hostname[/ns/…[/repo]]. This namespace design does not fit the search behavior, where input can be an unqualified With the current search implementations, So, suppose the search list is ( But a MITM attacker can make (In a sense, the problem here is that for unqualified |
|
As always, thanks a lot, @mtrmac for the details. To be honest, I am not quite sure how to continue from here. I cannot start to port libpod to sysregsv2 unless buildah uses it as well, so I think breaking the push/pull code into a library makes sense at that point. But I don't think that we cannot not support the search registries as I consider that a really important usability feature. However, you have valid arguments against adding the push/pull API to containers/image (see above). @mtrmac How could we proceed? I am also okay with porting Buildah first and leave the situation as is. |
(Why? The code can read the old format, so AFAICT as long the config file is not updated, the code can be updated in both independently.)
*shrug* Yeah, I’m not fighting that in
Hopefully not controversial:
As for the search behavior, either it can directly ported from the old to the new API in both packages without modification and kept separate, or the two can share it in some subpackage. They are vendoring each other, so I’m not sure what is the best place—up to the maintainers of these repositories. |
That's right. What I meant was using mirrors actually.
Sounds like a good plan to me, thanks for the proposal. There is just one thing in the current design of sysregsv2 that I really want to change: currently, the API is built to be wrapped by another one (e.g., @mtrmac If you agree, I will prepare a bigger PR including your proposed changes (see above) and the change of v2 to a singleton design.
Sounds good. |
(Is it that expensive, if the kernel is likely to cache everything in memory? A quick benchmark on a SSD workstation shows about 71 us per load, when done in a tight loop, for me, which is negligible compared to the network accesses done by users of the data. I wouldn’t bother unless there was a known benchmark which suffered from this, but that’s no reason to stop you.) Fine with me, as long as the implementation accounts for multi-threading and for multiple users using different config paths within the same process (i.e. a singleton lock-protected hash table or something like that); the support for different config paths via Alternatively, the data could be cached within a |
I think it doesn't hurt per load but when accumulating over time, I'd love my CPU to not waste those cycles. I think it's rather straight forward to implement and other users might benefit from it.
I love this idea. As for multi-threading, I think the only critical section is writing the hash table (e.g., when reading a path for the first time in parallel). All following accesses are reads. Kudos for your patience discussing designs, much appreciated! |
This PR is based on a discussion at containers/podman#721 and includes the following changes: