
@yhcote
Contributor

@yhcote yhcote commented Nov 6, 2021

This implements the support for inbound SIF files in the image package.

  • Outbound support coming soon.
  • This version requires the fakeroot and squashfs-tools distribution packages. The near-term intent is to do all of this in Go, starting with replacing the use of fakeroot.
  • Go tests coming next.

@yhcote changed the title from "sif: initial sif transport implementation" to "wip - sif: initial sif transport implementation" on Nov 12, 2021
@tri-adam
Contributor

@yhcote @Vrotberg @mtrmac would love to collaborate with you all on this. We're actively maintaining a github.com/sylabs/sif/v2 module and would welcome a discussion about re-use of (at a minimum) the core SIF implementation.

@yhcote
Contributor Author

yhcote commented Nov 17, 2021

> @yhcote @Vrotberg @mtrmac would love to collaborate with you all on this. We're actively maintaining a github.com/sylabs/sif/v2 module and would welcome a discussion about re-use of (at a minimum) the core SIF implementation.

@tri-adam I checked quickly and saw two different repos with SIF updates. Since the current feature requires only simple data object dumps from the SIF file, the current import is sufficient. The best course of action right now, to get the ball rolling, is to at least get this initial PR in and let the community decide which improvements (some already listed in the PR comment) they want to see next. The PR needs a few touch-ups, but those should be done this week.

Bring sif code in the repo instead of pulling it in at build time.

Resolves PR code review discussion.

Signed-off-by: Yannick Cote <[email protected]>
@yhcote changed the title from "wip - sif: initial sif transport implementation" to "sif: initial sif transport implementation" on Nov 17, 2021
@tri-adam
Contributor

> @tri-adam I checked quickly and saw two different repos with SIF updates.

Yes, there is github.com/sylabs/sif/v2 and github.com/hpcng/sif/v2. github.com/hpcng/sif/v2 is not actively maintained, other than occasionally releasing code from github.com/sylabs/sif/v2 (ex. https://github.com/hpcng/sif/releases/tag/v2.0.0).

github.com/sylabs/sif/v2 contains both features and bug fixes not yet found in the other module. For that reason, I'd expect github.com/sylabs/sif/v2 to be the natural choice if there's interest in code re-use.

> Since the current feature requires simple data object dumps from the SIF file the current import is sufficient. The best course of action right now to get the ball rolling is to at least get this initial PR in and let the community decide which improvements (some already listed in the PR comment) they want to see next.

Unless I'm missing something, using github.com/sylabs/sif/v2 rather than duplicating the code in containers/image/sif/sif would be straightforward and would mean less code to maintain in containers/image. I've taken a stab at that here, as a practical example: tri-adam@1dd4610.

Anyway, I just want to make it crystal clear that we're open to collaboration on this. Excited to see where it goes, in any case!

@rhatdan
Member

rhatdan commented Jan 4, 2022

@mtrmac @vrothberg Can we get a fresh review on this?

Collaborator

@mtrmac mtrmac left a comment

I didn’t check the sif/sif package in detail yet — I do default to thinking we should use the maintained external dependency instead, but I’ll check the code etc. first.

Highlights:

  • Sprintf on untrusted input
  • Do we want a temporary compressed representation at all?
  • Removing on-disk data immediately after use
  • Policy namespaces are inconsistent (and must have tests even if the final design is trivial)
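
The first highlight does not say where Sprintf is applied. Assuming it refers to formatting values read from the SIF file into a command line for a helper such as unsquashfs, a minimal sketch of the usual alternative is to pass each value as its own argument so that no shell ever parses it. The helper name unsquashCommand and the file names are hypothetical and not taken from this PR.

```go
package main

import (
	"fmt"
	"os/exec"
)

// unsquashCommand is a hypothetical helper (not from this PR). It builds an
// unsquashfs invocation in which dest and src are separate argv entries, so
// their contents are never interpreted by a shell.
func unsquashCommand(dest, src string) *exec.Cmd {
	return exec.Command("unsquashfs", "-d", dest, src)
}

func main() {
	// A hostile destination name stays a single literal argument.
	cmd := unsquashCommand("out; rm -rf ~", "image.squashfs")
	fmt.Printf("%q\n", cmd.Args)
}
```

The command is only constructed here, not run; the point is that the untrusted string never passes through a format string that is later split or evaluated.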

// LoadContainerFp is responsible for loading a SIF container file. It takes
// a *os.File pointing to an opened file, and whether the file is opened as
// read-only for arguments.
func LoadContainerFp(fp *os.File, rdonly bool) (fimg FileImage, err error) {
Collaborator

None of this looks like something that should be a committed part of the stable c/image API.

Either move the subpackage code to sif/internal (or even sif/internal/sif), or perhaps use the external dependency as discussed elsewhere.

// LICENSE file distributed with the sources of this project regarding your
// rights to use or distribute this software.

// +build linux
Collaborator

Absolutely non-blocking:

Which parts are Linux-specific? The syscall uses of Mmap?

It would be attractive to make this usable on macOS or Windows workstations without a Linux VM.

Collaborator

With just these build conditionals removed, the code seems to at least build fine on macOS.

Of course things like unsquashfs or fakeroot might not be available, but the error handling needs to be user-acceptable for a Linux-only version as well. At least from a very quick search, both seem to be packaged in Homebrew.

Collaborator

fakeroot does exist in homebrew, but it installs with

Warning: fakeroot has been deprecated because it does not build!

and it also doesn’t work — in wildly unpredictable ways (sometimes files are not created, sometimes chown fails with EPERM, sometimes chown doesn’t fail but doesn’t do anything).

So making this Linux-only seems reasonable for now. (Ideally we shouldn’t need fakeroot, and would do the conversion from SquashFS to tar mostly in memory.)

Comment on lines +17 to 19
// The sif transport is registered by sif*.go
// The ostree transport is registered by ostree*.go
// The storage transport is registered by storage*.go
Collaborator

Please keep the comments in alphabetical order.

@@ -0,0 +1,121 @@
// +build linux
Collaborator

@mtrmac mtrmac Jan 5, 2022

Due to the impact on signature verification, we require the Transport implementations to have pretty close to 100% test coverage. Those tests can mostly be copy&pasted from similar sources as this file (dir/docker/archive).

@@ -0,0 +1,121 @@
// +build linux

package sifimage
Collaborator

Shouldn’t the package name match the directory name?

searchDesc := sif.Descriptor{Datatype: sif.DataDeffile}
resultDescs, _, err := image.fimg.GetFromDescr(searchDesc)
if err == nil && resultDescs != nil {
// we assume in practice that typical SIF files don't hold multiple deffiles
Collaborator

Should there be a warning, or at least a debug log, in that case?
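
A minimal sketch of what such a log could look like. The descriptor type is a hypothetical stand-in for sif.Descriptor, and the standard log package stands in for logrus only to keep the example self-contained; none of these names are from the PR.

```go
package main

import (
	"fmt"
	"log"
)

// descriptor is a hypothetical stand-in for sif.Descriptor, reduced to the
// one field this sketch needs.
type descriptor struct {
	ID uint32
}

// firstDescriptor returns the first element of descs and whether one exists.
// When more than one descriptor matched, it logs that the rest are ignored
// instead of dropping them silently.
func firstDescriptor(kind string, descs []descriptor) (descriptor, bool) {
	if len(descs) == 0 {
		return descriptor{}, false
	}
	if len(descs) > 1 {
		log.Printf("sif: found %d %s objects, using only the first (ID %d)",
			len(descs), kind, descs[0].ID)
	}
	return descs[0], true
}

func main() {
	d, ok := firstDescriptor("deffile", []descriptor{{ID: 1}, {ID: 2}})
	fmt.Println(d.ID, ok)
}
```

The same helper would cover the EnvVar lookup, since both code paths make the same "first one wins" assumption.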

if err == nil && resultDescs != nil {
// we assume in practice that typical SIF files don't hold multiple deffiles
image.deffile = resultDescs[0]
image.defReader = io.NewSectionReader(image.fimg.Fp, image.deffile.Fileoff, image.deffile.Filelen)
Collaborator

defReader is only used in generateConfig immediately below, so maybe it can just be a local variable.

searchDesc = sif.Descriptor{Datatype: sif.DataEnvVar}
resultDescs, _, err = image.fimg.GetFromDescr(searchDesc)
if err == nil && resultDescs != nil {
// we assume in practice that typical SIF files don't hold multiple EnvVar sets
Collaborator

Should there be a warning, or at least a debug log, in that case?

Comment on lines +59 to +66
// look for an environment variable set object
searchDesc = sif.Descriptor{Datatype: sif.DataEnvVar}
resultDescs, _, err = image.fimg.GetFromDescr(searchDesc)
if err == nil && resultDescs != nil {
// we assume in practice that typical SIF files don't hold multiple EnvVar sets
image.env = resultDescs[0]
image.envReader = io.NewSectionReader(image.fimg.Fp, image.env.Fileoff, image.env.Filelen)
}
Collaborator

I can’t see any use of .env or .envReader. What does this code do?

)

var (
sifLoggerBuf bytes.Buffer
Collaborator

Is this available in any way? I can’t see a way to read this data, which makes the logger pointless.

(c/image otherwise uses logrus).

if err = image.generateRunscript(); err != nil {
return errors.Wrap(err, "generating runscript")
}
image.cmdlist = []string{"/podman/runscript"}
Collaborator

image.cmdlist serves two different purposes: a list of bash commands (e.g. the input to generateRunscript), and a single command to be run by the runtime. Those should be two separate variables.

return nil
}

func (image SifImage) GetConfig(config *imgspecv1.Image) error {
Collaborator

The name of this function doesn’t match what it does. Maybe a Command() []string method that just returns the command to run?
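
A rough sketch of how the two cmdlist purposes could be separated and exposed through such a method. The type and field names are hypothetical; only the /podman/runscript path comes from the reviewed code.

```go
package main

import "fmt"

// sifImage is a hypothetical, reduced version of the PR's SifImage type.
type sifImage struct {
	runscriptLines []string // shell lines written into the generated runscript
	command        []string // what the runtime executes, e.g. the runscript path
}

// Command returns a copy of the command the container runtime should run, so
// callers cannot modify the image's internal slice.
func (i sifImage) Command() []string {
	return append([]string(nil), i.command...)
}

func main() {
	img := sifImage{
		runscriptLines: []string{"echo hello"},
		command:        []string{"/podman/runscript"},
	}
	fmt.Println(img.Command())
}
```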

@mtrmac
Collaborator

mtrmac commented Jan 5, 2022

@yhcote @tri-adam Looking at this PR’s

	if string(fimg.Header.Version[:HdrVersionLen-1]) > HdrVersion {
		return fmt.Errorf("invalid SIF file: Version %s want <= %s", fimg.Header.Version, HdrVersion)
	}
…
	HdrVersion      = "02"        // SIF SPEC VERSION

(≤ , and 2)

vs. https://github.com/sylabs/sif/blob/e807689a1a94304812a6c20573d0d11484b4e2af/pkg/sif/load.go#L29-L31 and https://github.com/sylabs/sif/blob/e807689a1a94304812a6c20573d0d11484b4e2af/pkg/sif/sif.go#L108-L114 (strict equality, and 1).

What’s that about?

Obviously a file with version "1" passes both checks, so at least that’s good. Is that the only one that matters?
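
To make the two behaviours concrete, here is a small sketch of both checks side by side. It assumes the version field is three bytes (two ASCII digits plus a trailing NUL); the function names are hypothetical and the strict variant is only an approximation of the linked Sylabs code.

```go
package main

import "fmt"

// hdrVersionLen is an assumption: two ASCII digits plus a trailing NUL byte.
const hdrVersionLen = 3

// acceptLenient mirrors the check quoted from this PR: any version that sorts
// at or below max as a string is accepted.
func acceptLenient(version [hdrVersionLen]byte, max string) bool {
	return string(version[:hdrVersionLen-1]) <= max
}

// acceptStrict sketches a strict-equality check: the whole array, trailing
// NUL included, must equal the expected value.
func acceptStrict(version, want [hdrVersionLen]byte) bool {
	return version == want
}

func main() {
	v01 := [hdrVersionLen]byte{'0', '1', 0}
	v02 := [hdrVersionLen]byte{'0', '2', 0}

	// "01" is accepted by both checks.
	fmt.Println(acceptLenient(v01, "02"), acceptStrict(v01, v01)) // true true
	// "02" is accepted only by the lenient check.
	fmt.Println(acceptLenient(v02, "02"), acceptStrict(v02, v01)) // true false
}
```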

I see sylabs/sif@f68560d exists in various repos, and then there’s sylabs/sif@7441391 .

(Speaking purely for myself in personal capacity, I don’t really want to dig up historical controversies, if any, and especially I’d prefer if c/image didn’t end up having to play arbiter between competing variants of a format, or worse, having to maintain its own private copy that has to do non-obvious workarounds to accept all of those competing variants at the same time. I’m really hoping for something like “$answer is the obviously correct one, some of the code is just behind”…)

@dtrudg

dtrudg commented Jan 5, 2022

Hi @mtrmac @yhcote (@tri-adam)

The HdrVersion '01' is the one that really makes sense here, as it's the prevailing version. Regrettably, during the development of the early 3.x releases of Singularity there were some changes like this one (the increment, followed by a reversion of the increment) that are not great in hindsight...

The SIF v1.0.3 tag that contains the reversion from 02->01 in sylabs/sif@7441391 was first included in Singularity 3.2.0. Versions since then, including current versions of Singularity, are creating SIFs with a header version of 01.

04:47 PM $ singularity sif header test.sif 
Launch Script:        #!/usr/bin/env run-singularity
Version:              01

Essentially all SIF containers you would find in the wild will say 01.

I'll defer to @tri-adam re: the version check in the current SIF code.

@dtrudg

dtrudg commented Jan 5, 2022

Further to my previous comment....

Singularity v3.1.1 used SIF v1.0.2, in which HdrVersion '01' was in effect (before the increment).
https://github.com/sylabs/singularity/blob/a83e020c14d6ee6af84831ff0c68acb2f176f0d4/Gopkg.toml#L51
https://github.com/sylabs/sif/blob/39d9b2aa8931c90c87fa13c28677f2296e9ad91f/pkg/sif/sif.go#L94

Singularity v3.2.0 used SIF v1.0.3, in which HdrVersion '01' was in effect (after the reversion).
https://github.com/sylabs/singularity/blob/8ed39ade65934bf8cc202c7c5d3bf6ac2ae17c9d/go.mod#L75
https://github.com/sylabs/sif/blob/a9c017e66287673d786e56f758912dff0d765312/pkg/sif/sif.go#L96

So, we had no official tagged release that should have given a HdrVersion '02' SIF image. They should only appear if someone had built from development code pulling in SIF at that point. The code pulled into containers/image in this PR wasn't used in a release of Singularity.

@mtrmac
Collaborator

mtrmac commented Jan 6, 2022

> So, we had no official tagged release that should have given a HdrVersion '02' SIF image.

Thanks, that’s very helpful.

@mtrmac
Collaborator

mtrmac commented Jan 6, 2022

(Accidentally sent message, see below for the finished one.)

@mtrmac mtrmac closed this Jan 6, 2022
@mtrmac mtrmac reopened this Jan 6, 2022
@mtrmac
Collaborator

mtrmac commented Jan 6, 2022

> github.com/hpcng/sif/v2 is not actively maintained, other than occasionally releasing code from github.com/sylabs/sif/v2 (ex. https://github.com/hpcng/sif/releases/tag/v2.0.0).

For the record, looking at the v1.7.0, v2.0.0, and latest common v2.2.3 tags, that seems true enough — there are some differences between the corresponding tags, but mostly just a timing difference (a commit missing from one version but added later, or the like), and almost all of the new code seems cherry-picked from the Sylabs repo.

@mtrmac
Collaborator

mtrmac commented Jan 6, 2022

> I've taken a stab at that here, as a practical example: tri-adam@1dd4610.

Comparing with the current PR, I think LoadContainerFromPath should come with OptLoadWithFlag(os.O_RDONLY). Otherwise the two look pretty similar. The only not-quite-trivial behavior difference I saw is things like string(fimg.Header.Version[:HdrVersionLen-1]) vs. comparing the complete array against a value that includes a trailing 0; an implementation difference is that the Sylabs version doesn’t try to use mmap.

Using the external library would also not drag in any extra dependencies.

@mtrmac
Collaborator

mtrmac commented Jan 8, 2022

> Unless I'm missing something, using github.com/sylabs/sif/v2 rather than duplicating the code in containers/image/sif/sif would be straight forward, and mean less code to maintain in containers/image. I've taken a stab at that here, as a practical example: tri-adam@1dd4610.

@tri-adam Could you sign this commit (git commit --amend -s) per https://github.com/containers/image/blob/main/CONTRIBUTING.md#sign-your-prs , and submit that branch as a PR, please? Then we can integrate it, along with other finishing touches I’m working on.

@mtrmac
Collaborator

mtrmac commented Jan 10, 2022

To test this, e.g.

# This is probably not the right way to download the image, it’s just a guess that seems to work
curl -L 'https://library.sylabs.io/v1/imagefile/library/default/lolcow:latest?arch=amd64' -o lolcow.sif
bin/skopeo copy --debug sif:lolcow.sif dir:t
podman run --privileged dir:t

or, of course, directly podman pull sif:… / podman run sif:….

@tri-adam
Contributor

tri-adam commented Jan 10, 2022

> @tri-adam Could you sign this commit (git commit --amend -s) per https://github.com/containers/image/blob/main/CONTRIBUTING.md#sign-your-prs , and submit that branch as a PR, please? Then we can integrate it, along with other finishing touches I’m working on.

@mtrmac done, please see #1436. I haven't re-based or otherwise modified the original branch, but happy to do so if it'd help. Cheers.

@mtrmac
Collaborator

mtrmac commented Jan 10, 2022

I have taken over this effort in #1438 . Thank you very much, @yhcote and @tri-adam !

@mtrmac mtrmac closed this Jan 10, 2022