Skip to content

Conversation

@vrothberg
Copy link
Member

@vrothberg vrothberg commented Jun 8, 2018

This PR is based on a discussion at containers/podman#721 and includes the following changes:

  • sysregistriesv2: forbid URI schemes
  • sysregistriesv2: improve config validation

@vrothberg vrothberg force-pushed the cleanup-sysregsv2 branch from 0f91c67 to 5d3ff08 Compare June 8, 2018 13:50
@rhatdan
Copy link
Member

rhatdan commented Jun 8, 2018

@mtrmac @runcom PTAL

}
// parseURL parses the input string, performs some sanity checks and returns
// the sanitized input string. The input must be a valid Reference
// github.com/containers/image/docker/reference.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Careful... in c/image/docker/reference, example.com is (domain="", name within domain = example.com).

So…

// strip password for security reasons
uri.User = url.UserPassword(uri.User.Username(), "xxxxxx")
return url.URL{}, fmt.Errorf("unsupported username/password: %q", uri)
if _, err := reference.Parse(trimmed); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… this seems to mostly work, but it accepts test_com (which is an invalid DNS name).

I think this is good enough (at least we are not, it seems, rejecting valid hostname.com[:port] input, and any hostname.com/names… input will be parsed correctly), but instead of the comment about a valid Reference there should be a comment warning about the mismatch between what the parser sees and what the data mean.

Copy link
Member Author

@vrothberg vrothberg Jun 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will correct the comment. It's a bit frustrating how tough parsing URI's without scheme can be. net/url.URL really didn't help much as, for instance, the scheme of example.com:5000 would be example.com.

Copy link
Member Author

@vrothberg vrothberg Jun 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test_com issue did not satisfy me, so I had another look at net/url and found the following comment:

URLs that do not start with a slash after the scheme are interpreted as:

scheme:opaque[?query][#fragment]

By considering the URL.Opaque, we can get a thorough validation. I will update the code and you can check if you prefer docker/reference or that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test_com issue did not satisfy me, so I had another look at net/url and found the following comment

FWIW the current parseURL still accepts test_com; the URL parser is not in the business of enforcing DNS formatting of host names.

As with the previous version, I think accepting that is good enough.

if len(v1Registries) > 0 {
if len(registries) > 0 {
return nil, fmt.Errorf("mixing sysregistry v1/v2 is not supported")
return nil, errors.Wrap(&InvalidRegistries{}, "mixing sysregistry v1/v2 is not supported")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn’t this, and all other uses of InvalidRegistries, be return nil, InvalidRegistries{s:"…"} ? It’s surprising for no user of InvalidRegistries to set the s field (why does it exist at all, then?), and if I’m reading it correctly, the errors.Wrapf use would result in an extra trailing ": ".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(It might also be more convenient to use type InvalidRegistries string, because the error case would be just return nil, InvalidRegistries("…"). OTOH it would be easier to accidentally mix the types than when a struct is involved. Either is perfectly fine.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, changed to &InvalidRegistries{s: ...}.

}

if reg.Prefix == "" {
reg.Prefix = reg.URL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/validateRegistries/cleanAndValidateRegistries/, or something like that, maybe?

It’s surprising for a “validate” function to modify its input.

for _, reg := range registries {
others, exists := regMap[reg.URL]
if !exists {
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this ever happen? And if it can’t, shouldn’t the function fail with Internal error instead of silently continuing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I basically forgot to remove those lines; they're debugging artifacts.

Forbid the usage of URI schemes (e.g., http, https, etc.) in the URLs of
registries and mirrors.  The reasoning behind is that the `Insecure`
option is sufficient to control skipping of the certs verification and
to allow non-TLS connections.

The change of forbidding URI schemes entails some changes in the code:

- Move from net/url.URL to simple strings as the parsing does not work
  properly when URI schemes aren't present.

- The rather complex embedding of different types to allow
  (un)marshalling can now be fully avoided.

- Update the tests accordingly.

Refer to the following issue for a background discussion:
    containers/podman#721

Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
A registry can be defined multiple times, for instance, to allow
different prefixes to be backed by a different set of mirrors.
However, we need to make sure that there is no conflict among those
registries (e.g., one being secure while the other isn't).

Also, add a custom InvalidRegistries error to distinguish between errors
caused by an invalid or unsupported configuration and other errors, for
instance, caused by unmarshalling issues.

Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
@vrothberg vrothberg force-pushed the cleanup-sysregsv2 branch from 5d3ff08 to ecdf786 Compare June 11, 2018 12:52
@vrothberg
Copy link
Member Author

@mtrmac I addressed the PR according to your comments. I also squashed the last two commits into one.

@mtrmac
Copy link
Collaborator

mtrmac commented Jun 11, 2018

Man, that url.Parse use is some non-obvious code. @vrothberg , can you check that the comment added in the top commit of https://github.com/mtrmac/image/tree/sysregistriesv2-comments describes the situation correctly, please?

If so, LGTM, and I’ll add the comment after merging this PR.

@vrothberg
Copy link
Member Author

Man, that url.Parse use is some non-obvious code. @vrothberg , can you check that the comment added in the top commit of https://github.com/mtrmac/image/tree/sysregistriesv2-comments describes the situation correctly, please?

Yes, the comment is correct. I'll make sure to have more verbose comments for such wild cases in the future. As always, thanks a lot, @mtrmac!

@vrothberg
Copy link
Member Author

@baude @mheon @mtrmac @rhatdan @runcom @TomSweeneyRedHat (please add more to list if needed)

As you might know, my current plan is to port libpod to use sysregistriesv2 to allow the usage of mirrors. However, there is one thing I noticed that might hurt us in the future:

libpod is vendoring buildah, and both are using the sysregistries package (libpod provides even a wrapper package for it). This means that I'd need to update buildah first and then libpod (with vendoring in the updated buildah). It's not only my laziness speaking to avoid chain updates (now and in the future), but also from an architectural point of view:

Does it make sense to refactor the pull/push code from buildah and libpod into c/images?

buildah and libpod already use c/images, but what we need is an easier to use API with the notion of registries (and mirrors) as we're having with the sysregistriesv2 package. The API certainly requires some flexibility, and we might hit some rough edges here and there but we had the ultimate gain of fast delivery and reduced redundancy. Testing efforts could also be centralized in c/images.

@mtrmac, sorry for hijacking this PR to kickoff the discussion, but I couldn't figure out a better way.

@mheon
Copy link
Member

mheon commented Jun 12, 2018

For my part, I'd like to see push/pull code move out into a library. We've already split it into a separate package in the libpod codebase to try and keep it separate from everything else, and I'd be happy to see it move completely out of our repo into a place where it can be shared with other projects more easily.

@mtrmac
Copy link
Collaborator

mtrmac commented Jun 12, 2018

Does it make sense to refactor the pull/push code from buildah and libpod into c/images?

As mentioned earlier, I’d love for c/image/docker to transparently handle the Prefix/URL mapping, mirrors support, and “insecure”/“blocked” configuration.

The one thing I’d like to currently keep out is using the search list for unqualified names, because as it is right now it breaks name matching in signature verification. Or figure out some way to define reasonable name matching semantics for the unqualified names.

@mtrmac
Copy link
Collaborator

mtrmac commented Jun 12, 2018

👍 Thanks again!

Approved with PullApprove

@rhatdan
Copy link
Member

rhatdan commented Jun 12, 2018

I am fine with working on getting a library that could be shared, between the two tools.

@vrothberg vrothberg deleted the cleanup-sysregsv2 branch June 14, 2018 12:04
@vrothberg
Copy link
Member Author

The one thing I’d like to currently keep out is using the search list for unqualified names, because as it is right now it breaks name matching in signature verification. Or figure out some way to define reasonable name matching semantics for the unqualified names.

@mtrmac Can you elaborate a bit on this issue? Isn't the search list used already in both, Podman/libpod and Buildah?

@mtrmac
Copy link
Collaborator

mtrmac commented Jun 14, 2018

The one thing I’d like to currently keep out is using the search list for unqualified names, because as it is right now it breaks name matching in signature verification. Or figure out some way to define reasonable name matching semantics for the unqualified names.

@mtrmac Can you elaborate a bit on this issue? Isn't the search list used already in both, Podman/libpod and Buildah?

Yes :(

Imprecisely, signature verification takes the image reference to be pulled, looks up the relevant public key, and then requires a signature with 1) that public key 2) matching the image reference.

The lookup uses a hierarchical namespace looking like Docker references, i.e. something like hostname[/ns/…[/repo]]. This namespace design does not fit the search behavior, where input can be an unqualified fedora (which does not mean “host name fedora”).

With the current search implementations, podman/buildah take the unqualified input (fedora), and iteratively try to prepend various search registries, and use that result as “the image reference to be pulled”: {docker.io/,registry.fedoraproject.org,registry.access.redhat.com}/fedora — which means that the signature checking expects the (search hostname + user input) pair to match, and that’s not what the user actually wanted.

So, suppose the search list is (registry.fedoraproject.org, docker.io), and the user pulls unqualified fedora. Usually, r.fp.org is available, the image is pulled from there, and a signature is verified using the r.fp.org/fedora public key and expects the r.fp.org/fedora name to match.

But a MITM attacker can make r.fp.org unavailable, and the search will continue trying docker.io/fedora, and a signature will be verified using the docker.io/fedora public key and expect the docker.io/fedora name to match. Or, if docker.io is configured not to require signatures, just making r.fp.org seem down allows the attacker to supply any arbitrary unsigned image and it will be accepted (assuming they can get docker.io to serve it).

(In a sense, the problem here is that for unqualified fedora the user genuinely does not know, or does not care, what server/identity/public key to use. But that’s on us for designing the UI the way that the user can get away with not knowing/caring. I don’t know what the best solution here is, maybe we should replace the general search list with an alias mechanism (so that fedora always means r.fp.org/fedora, whether or not the server is up), maybe the signature policy mechanism needs somehow extending. But until then I would prefer c/image not to accept such input and pretend to do signature verification when it is completely unclear what it is verifying.)

@vrothberg
Copy link
Member Author

As always, thanks a lot, @mtrmac for the details.

To be honest, I am not quite sure how to continue from here. I cannot start to port libpod to sysregsv2 unless buildah uses it as well, so I think breaking the push/pull code into a library makes sense at that point. But I don't think that we cannot not support the search registries as I consider that a really important usability feature. However, you have valid arguments against adding the push/pull API to containers/image (see above).

@mtrmac How could we proceed? I am also okay with porting Buildah first and leave the situation as is.

@mtrmac
Copy link
Collaborator

mtrmac commented Jun 15, 2018

I cannot start to port libpod to sysregsv2 unless buildah uses it as well

(Why? The code can read the old format, so AFAICT as long the config file is not updated, the code can be updated in both independently.)

But I don't think that we cannot not support the search registries as I consider that a really important usability feature.

*shrug* Yeah, I’m not fighting that in libpod/buildah.

@mtrmac How could we proceed? I am also okay with porting Buildah first and leave the situation as is.

Hopefully not controversial:

  • Teach c/image/docker to read sysregistriesv2 for the “insecure” flag.
    • (We might also want to modify or augment c/image/types.SystemContext.DockerInsecureSkipVerify EDIT to be a *bool so that a caller can opt into full verification even if the config says “insecure”. Not sure.)
    • Then drop the GetInsecureRegistries callers from libpod (buildah currently doesn’t have any AFAICS).
  • Implement the URL/Prefix mapping and Mirror features of sysregistriesv2 in c/image/docker. (No need to update libpod/buildah, these features are new and have no pre-existing equivalent.) (This assumes that callers will not want detailed control of the mirror search process; hopefully that is a reasonable assumption.)

As for the search behavior, either it can directly ported from the old to the new API in both packages without modification and kept separate, or the two can share it in some subpackage. They are vendoring each other, so I’m not sure what is the best place—up to the maintainers of these repositories.

@vrothberg
Copy link
Member Author

vrothberg commented Jun 18, 2018

(Why? The code can read the old format, so AFAICT as long the config file is not updated, the code can be updated in both independently.)

That's right. What I meant was using mirrors actually.

  • Teach c/image/docker to read sysregistriesv2 for the “insecure” flag.
    [...]
  • Implement the URL/Prefix mapping and Mirror features of sysregistriesv2 in c/image/docker.

Sounds like a good plan to me, thanks for the proposal.

There is just one thing in the current design of sysregsv2 that I really want to change: currently, the API is built to be wrapped by another one (e.g., FindUnqualifiedSerachRegistries expects registries as an input). With the design you described above, the config is used at many different places, so to avoid continuously reloading the same config file, I'd like to change the API to a singleton. This avoids redundantly loading the cofig file, which would be expensive for daemons (e.g, CRI-O).

@mtrmac If you agree, I will prepare a bigger PR including your proposed changes (see above) and the change of v2 to a singleton design.

As for the search behavior, either it can directly ported from the old to the new API in both packages without modification and kept separate, or the two can share it in some subpackage. They are vendoring each other, so I’m not sure what is the best place—up to the maintainers of these repositories.

Sounds good.

@mtrmac
Copy link
Collaborator

mtrmac commented Jun 18, 2018

With the design you described above, the config is used at many different places, so to avoid continuously reloading the same config file, I'd like to change the API to a singleton. This avoids redundantly loading the cofig file, which would be expensive for daemons (e.g, CRI-O).

(Is it that expensive, if the kernel is likely to cache everything in memory? A quick benchmark on a SSD workstation shows about 71 us per load, when done in a tight loop, for me, which is negligible compared to the network accesses done by users of the data. I wouldn’t bother unless there was a known benchmark which suffered from this, but that’s no reason to stop you.)

Fine with me, as long as the implementation accounts for multi-threading and for multiple users using different config paths within the same process (i.e. a singleton lock-protected hash table or something like that); the support for different config paths via types.SystemContext serves as an alternative to detailed Golang controls overriding the config contents, so that needs to work.

Alternatively, the data could be cached within a types.SystemContext (maybe still with some protection against multi-threaded uses and path changes); most “serious” users are going to have a single long-term instance of SystemContext, and it is passed through the call stack (and unlike context.Context, a callee can modify it so that it affects the caller). OTOH it is possible for types.SystemContext to be nil, and in that case the caching would be impossible or have to rely on the global variables.

@vrothberg
Copy link
Member Author

(Is it that expensive, if the kernel is likely to cache everything in memory? A quick benchmark on a SSD workstation shows about 71 us per load, when done in a tight loop, for me, which is negligible compared to the network accesses done by users of the data. I wouldn’t bother unless there was a known benchmark which suffered from this, but that’s no reason to stop you.)

I think it doesn't hurt per load but when accumulating over time, I'd love my CPU to not waste those cycles. I think it's rather straight forward to implement and other users might benefit from it.

Fine with me, as long as the implementation accounts for multi-threading and for multiple users using different config paths within the same process (i.e. a singleton lock-protected hash table or something like that); the support for different config paths via types.SystemContext serves as an alternative to detailed Golang controls overriding the config contents, so that needs to work.

I love this idea. As for multi-threading, I think the only critical section is writing the hash table (e.g., when reading a path for the first time in parallel). All following accesses are reads. Kudos for your patience discussing designs, much appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants