Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide large file progress reporting hooks to copy operations #219

Merged
merged 1 commit into from
Feb 24, 2017

Conversation

erikh
Copy link
Contributor

@erikh erikh commented Jan 19, 2017

Again, including the other commits for ease of getting things to compile; there
is an example at the end. Only the last commit is relevant in this PR.

This provides a hook and an interval timer for progressed copy operations, such
as image packaging. It only covers the docker endpoints for now and is
untested, again, as a proof of concept for approval.

Users can specify a function which is called at each interval as long as the
copy is progressing. This function is passed the current byte offset of the
copy operation. They can use this to calculate sizes, or percentages, or the
phase of the moon at the time the current byte is downloading. Resolution of
the timer is completely independent.

Here's some code that works with this: it reports byte offsets every 100ms
during both download and upload operations during this layer edit.

Thank you again for the consideration.

package main

import (
	"fmt"
	"time"

	"github.com/containers/image/copy"
	"github.com/containers/image/docker/daemon"
	"github.com/containers/image/docker/reference"
	"github.com/containers/image/types"
)

func main() {
	ref, err := daemon.ParseReference("docker.io/library/golang:latest")
	if err != nil {
		panic(err)
	}

	copyCtx := &types.SystemContext{
		CopyHook: func(offset uint64) {
			fmt.Println(offset)
		},
		CopyHookInterval: 100 * time.Millisecond,
	}

	img, err := ref.NewImage(copyCtx)
	if err != nil {
		panic(err)
	}
	defer img.Close()

	tgtRef, _ := reference.ParseNamed("docker.io/erikh/test:latest")
	tgt, err := daemon.NewReference("", tgtRef)
	if err != nil {
		panic(err)
	}

	b, _, err := img.Manifest()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))

	var i int
	err = copy.Image(nil, tgt, ref, &copy.Options{
		SourceCtx:        copyCtx,
		DestinationCtx:   copyCtx,
		RemoveSignatures: true,
		LayerCopyHook: func(srcLayer types.BlobInfo) bool {
			i++
			return i < 2 // only the first layer
		},
	})

	if err != nil {
		panic(err)
	}
}

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There already is a progress reporting mechanism in imageCopier.copyBlobFromStream. Having every source/destination implement this individually seems wasteful.

Would it work for you to instead somehow parametrize the progress reporting in imageCopier.copyBlobFromStream, so that a caller could provide its own reporting object / callback / something in copy.Options, and if that is defined, it would be used instead of writing a progress bar to copy.Options.ReportWriter?

@runcom , you probably know much more about the available options and the commonly used interfaces in this area.

}
}()

resp, err := c.ImageLoad(ctx, reader, true)
if sysctx.CopyHook != nil && sysctx.CopyHookInterval > 0 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sysctx is allowed to be nil

@@ -73,8 +74,14 @@ func newImageSource(ctx *types.SystemContext, ref daemonReference) (types.ImageS
}
}()

if _, err := io.Copy(tarCopyFile, inputStream); err != nil {
return nil, err
if ctx == nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx != nil && ctx.CopyHookInterval > 0 && … ?

@erikh
Copy link
Contributor Author

erikh commented Jan 19, 2017 via email

@erikh erikh force-pushed the tar-hook branch 2 times, most recently from 64293b8 to 672557f Compare January 20, 2017 08:19
@erikh
Copy link
Contributor Author

erikh commented Jan 20, 2017

I've rebased this to the point tests should pass; I will leave it alone until we decide what to do. In the meantime, I'm going to port box to use containers/image so I can get a better feel for what I actually need, but I'd like to keep my progress meters :)

@erikh
Copy link
Contributor Author

erikh commented Jan 20, 2017

I have made an attempt to pretty up the API; let me know what you think. This
supplies the copy hook at copy.Options time (for both source and destination
reporting) and in the systemcontext still (e.g., if you only want to display a
meter for a single direction, which we do in box).

@mtrmac
Copy link
Collaborator

mtrmac commented Jan 20, 2017

  • This still forces every single ImageSource and ImageDestination to implement progress reporting separately; do you really need that, or do you intend to use them through copy.Image? If the latter, a hook in copyBlobFromStream should suffice.
  • Is it really necessary to have a separate progress report for the source and the destination? copyBlobFromStream connects the two using a series of pipes etc., which may do some compression/decompression but should generally be much faster than downloads over the network, so I’d expect progress reporting on the source and destination to report very similar percentages (though perhaps different raw values if one stream is compressed and the other uncompressed)
  • As it is, CopyHook has no information at all about which stream it is being called for, it just receives a successive series of completely context-less integers which have no obvious relation to each other. (This is already true when hooking both the source and destination, and it will get much worse if [performance] parallel layers upload/download in copy #105 ever happens). Hence my question to @runcom about existence of a commonly accepted interface which does provide sufficient information and context. E.g., I guess (without knowing the field) that docker/docker/pkg/progress might be an option—it is still a series of unconnected data points, with the recipient having to sort them out, but at least they have a disambiguating ID for each of the parallel operations.

@erikh
Copy link
Contributor Author

erikh commented Jan 21, 2017

  • This still forces every single ImageSource and ImageDestination to
    implement progress reporting separately; do you really need that, or do you
    intend to use them through copy.Image? If the latter, a hook in
    copyBlobFromStream should suffice.

Not exactly; if you pass the copyfunc into the options it is directly added to
each context so that you don't have to do it yourself, OR you can do the
contexts to scope it.

  • Is it really necessary to have a separate progress report for the source
    and the destination? copyBlobFromStream connects the two using a series of
    pipes etc., which may do some compression/decompression but should generally
    be much faster than downloads over the network, so I’d expect progress
    reporting on the source and destination to report very similar percentages
    (though perhaps different raw values if one stream is compressed and the
    other uncompressed)

This is for flexibility, it can be useful when doing it unidirectionally.
Try the latest version of the source (below) to see what I mean. The first pass at
ref.NewImage implements a source copy (download) hook, while the others
process the image copy.

package main

import (
	"fmt"
	"time"

	"github.com/containers/image/copy"
	"github.com/containers/image/docker/daemon"
	"github.com/containers/image/docker/reference"
	"github.com/containers/image/signature"
)

func main() {
	ref, err := daemon.ParseReference("docker.io/library/golang:latest")
	if err != nil {
		panic(err)
	}

  // you can do it this way too

	img, err := ref.NewImage(&types.SystemContext{
  })
			CopyHookInterval: 100 * time.Millisecond,
			CopyHook: func(offset uint64) {
				fmt.Println(offset)
			},
	if err != nil {
		panic(err)
	}
	defer img.Close()

	tgtRef, _ := reference.ParseNamed("docker.io/erikh/test:latest")
	tgt, err := daemon.NewReference("", tgtRef)
	if err != nil {
		panic(err)
	}

	b, _, err := img.Manifest()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))

	pc, err := signature.NewPolicyContext(&signature.Policy{
		Default: []signature.PolicyRequirement{signature.NewPRInsecureAcceptAnything()},
	})

	if err != nil {
		panic(err)
	}

	err = copy.Image(
		pc,
		tgt,
		ref,
		&copy.Options{
			CopyHookInterval: 100 * time.Millisecond,
			CopyHook: func(offset uint64) {
				fmt.Println(offset)
			},
			RemoveSignatures: true,
		})

	if err != nil {
		panic(err)
	}
}

> - As it is, `CopyHook` has no information at all about which stream it is
> being called for, it just receives a successive series of completely
> context-less integers which have no obvious relation to each other. (This is
> already true when hooking both the source and destination, and it will get
> much worse if #105 ever happens). Hence my question to @runcom about
> existence of a commonly accepted interface which does provide sufficient
> information and context. E.g., I _guess_ (without knowing the field) that
> `docker/docker/pkg/progress` might be an optionit is still a series of
> unconnected data points, with the recipient having to sort them out, but at
> least they have a disambiguating `ID` for each of the parallel operations.

Perhaps I need to comment the source; the successive series is a increasing
value based on the progress of the download. It has context, and can be
leveraged with closures to get flexibility while processing it.

I really would prefer a hook if possible. Dealing with docker's progress meters
is pretty terrible and I have a nice, colorized/tty-aware UI with clear progress
information that only triggers when necessary (e.g, for copying large files and
layers only). I really don't want to ditch that. Basically, I'd really like to
avoid using a canned progress meter. It's really not the responsibility of this
library to implement progress meters if you ask me, but I realize it's not my
say too. :)

@runcom
Copy link
Member

runcom commented Jan 21, 2017

  • As it is, CopyHook has no information at all about which stream it is
    being called for, it just receives a successive series of completely
    context-less integers which have no obvious relation to each other. (This is
    already true when hooking both the source and destination, and it will get
    much worse if [performance] parallel layers upload/download in copy #105 ever happens). Hence my question to @runcom about
    existence of a commonly accepted interface which does provide sufficient
    information and context. E.g., I guess (without knowing the field) that
    docker/docker/pkg/progress might be an option—it is still a series of
    unconnected data points, with the recipient having to sort them out, but at
    least they have a disambiguating ID for each of the parallel operations.

Perhaps I need to comment the source; the successive series is a increasing
value based on the progress of the download. It has context, and can be
leveraged with closures to get flexibility while processing it.

I really would prefer a hook if possible. Dealing with docker's progress meters
is pretty terrible and I have a nice, colorized/tty-aware UI with clear progress
information that only triggers when necessary (e.g, for copying large files and
layers only). I really don't want to ditch that. Basically, I'd really like to
avoid using a canned progress meter. It's really not the responsibility of this
library to implement progress meters if you ask me, but I realize it's not my
say too. :)

I agree the docker progress package isn't exactly what we're looking for there (and I'm not in favor of it here!).
This copy function hook provides a nice abstraction to me though, even given @mtrmac concerns in the first comment. We could find a way to still have this closure passed down addressing @mtrmac concerns.

@erikh
Copy link
Contributor Author

erikh commented Jan 21, 2017

I had another idea: what if we pass a send-only channel that passes a struct,
which gives us the flexibility of modifying the payload later.

e.g., the struct could have the byte offset, but also the percentage if know it.

The channel is optional. They can do whatever the heck they want with it then. :)

@mtrmac
Copy link
Collaborator

mtrmac commented Jan 23, 2017

  • This still forces every single ImageSource and ImageDestination to
    implement progress reporting separately; do you really need that, or do you
    intend to use them through copy.Image? If the latter, a hook in
    copyBlobFromStream should suffice.

Not exactly; if you pass the copyfunc into the options it is directly added to
each context so that you don't have to do it yourself, OR you can do the
contexts to scope it.

That’s not what I mean; what you say allows the caller to set a single hook, but the implementations are not shared. We have 6 transports, * 2 for source/destination, that’s 12 places which would need to now care about progress reporting, and 12 places which would have to be kept in sync. Whereas a progress reporting in copyBlobFromStream or so would be a single piece of code to maintain (which also means that it could, without 12 times the pain, be a bit more complex).

  • Is it really necessary to have a separate progress report for the source
    and the destination? copyBlobFromStream connects the two using a series of
    pipes etc., which may do some compression/decompression but should generally
    be much faster than downloads over the network, so I’d expect progress
    reporting on the source and destination to report very similar percentages
    (though perhaps different raw values if one stream is compressed and the
    other uncompressed)

This is for flexibility, it can be useful when doing it unidirectionally.
Try the latest version of the source (below) to see what I mean.

I can’t see that this code demonstrates anything; it doesn’t treat sources/destinations differently at all if I am not mistaken (though it also appears somehow truncated?!).

Sure, a hook in copyBlobFromStream would be useless for users who deal with ImageSource/ImageDestination directly, without using copy.Image. I’m leaning towards saying that such callers are dealing with deep enough internals that they can bother to do a tiny bit of extra work themselves. (Besides, a shared hook like your streamcopy.WithProgress, or more likely an io.Reader wrapper like pb.ProgressBar.NewProxyReader, is easy enough to hook up if the caller were dealing with a single stream at a time.)

  • As it is, CopyHook has no information at all about which stream it is
    being called for, it just receives a successive series of completely
    context-less integers which have no obvious relation to each other. (This is
    already true when hooking both the source and destination, and it will get
    much worse if [performance] parallel layers upload/download in copy #105 ever happens).

Perhaps I need to comment the source; the successive series is a increasing
value based on the progress of the download.

But which download? With the existing code, copying a 3-layer image with layers A, B, C, the hook could very well receive
1, 100, 1024, 0, 50, 500, 2, 200, 2048, 50, 100, 1000, … (the values being progress of, in order, writing A, reading A, reading B, writing B, writing C, reading C, writing A, reading A, reading B, writing B, writing C, reading C, …), and the hook has no idea which values belongs to which stream.

Basically, I'd really like to
avoid using a canned progress meter. It's really not the responsibility of this
library to implement progress meters if you ask me, but I realize it's not my
say too. :)

Sure, copy.Options should receive an interface which implements the actual data collection, or drawing if any. So, it would be a docker/docker/pkg/progress.Output, not directly docker/docker/pkg/jsomessage or whatever draws the progress bars. (Not saying that that is necessarily a good interface. Just that this might be a problem which already has a nice solution, both as a data type and as a UI drawing consumer)

@mtrmac
Copy link
Collaborator

mtrmac commented Jan 23, 2017

I had another idea: what if we pass a send-only channel that passes a struct,
which gives us the flexibility of modifying the payload later.

e.g., the struct could have the byte offset, but also the percentage if know it.

The channel is optional. They can do whatever the heck they want with it then. :)

Well, a ProgressUpdate <- chan ProgressStruct and a func ProgressUpdate(ProgressStruct) are semantically substitutable, and different callers may prefer one over another. But I expect that a real progress-bar implementation would want to know more:

  • Of course, progress of one of the progress-reporting operations has been updated
    • This will always include at least a raw value; it may also report other stats (current / average network throughput; is the speed limited by network or local storage speed?)
  • A new progress-reporting operation has been added to the work plan: (e.g. first we download a manifest; after doing that and parsing it, we may decide that we need to copy N layers).
    • That operation may or may not have a duration known in advance (file size if copying it; perhaps a percentage if doing a scan of something, or compressing/decompressing as a separate step)
    • That operation may or may not have a human-readable name (specific path, “creating a temporary copy”, vs. unreadable digest values)
  • One of the progress-reporting operations has successfully finished.
  • One of the progress-reporting operations has failed, with a specific error message.

For all of this, a typed interface with appropriate method names would probably be a better fit than a channel (which could be used, but every consumer would then have to do a switch incomingValue.ReportKind or something similar using the type system).

And yes, this seems fairly complex, which is why I keep asking whether there is a nice existing interface definition we could just use without having to invent this.

@erikh
Copy link
Contributor Author

erikh commented Jan 24, 2017

Yeah see I don't think that's necessary at all -- nor will we have the capability of determining things like network bandwidth consumed without a lot of extra tooling.

However, one can derive that pretty trivially with the implementation above -- just divide the number by the time transpired at any reporting interval.

What I really just want -- and literally nothing else -- is something that has nothing to do with progress bars itself, but the reporting of the progress. Error states, etc can come from different sources.

I can build progress meters from that.

@mtrmac
Copy link
Collaborator

mtrmac commented Jan 24, 2017

Yeah see I don't think that's necessary at all

I can buy that for the extra details and errors, but not the start/stop indications.

How can anyone consume the one-value "here’s an integer” data when copy.Image makes many copies? How can that happen without “new copy starting” notification, when the caller does not know in advance how many copies will be made?

giuseppe pushed a commit to giuseppe/image that referenced this pull request Jan 24, 2017
@erikh
Copy link
Contributor Author

erikh commented Feb 6, 2017

Here's the latest program and patch:

package main

import (
	"fmt"
	"time"

	"github.com/containers/image/copy"
	"github.com/containers/image/docker/daemon"
	"github.com/containers/image/docker/reference"
	"github.com/containers/image/signature"
	"github.com/containers/image/types"
)

func main() {
	ref, err := daemon.ParseReference("docker.io/library/golang:latest")
	if err != nil {
		panic(err)
	}

	signal := make(chan types.SignalProperties)

	go func() {
		for prop := range signal {
			fmt.Println(prop.Artifact, prop.Delta)
		}
	}()

	// you can do it this way too

	img, err := ref.NewImage(&types.SystemContext{
		SignalInterval: 100 * time.Millisecond,
		Signal:         signal,
	})
	if err != nil {
		panic(err)
	}
	defer img.Close()

	close(signal)

	tgtRef, _ := reference.ParseNamed("docker.io/erikh/test:latest")
	tgt, err := daemon.NewReference("", tgtRef)
	if err != nil {
		panic(err)
	}

	if err != nil {
		panic(err)
	}

	b, _, err := img.Manifest()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))

	pc, err := signature.NewPolicyContext(&signature.Policy{
		Default: []signature.PolicyRequirement{signature.NewPRInsecureAcceptAnything()},
	})

	if err != nil {
		panic(err)
	}

	signal = make(chan types.SignalProperties)

	go func() {
		for prop := range signal {
			fmt.Println(prop.Artifact, prop.Delta)
		}
	}()

	err = copy.Image(
		pc,
		tgt,
		ref,
		&copy.Options{
			SignalInterval:   100 * time.Millisecond,
			Signal:           signal,
			RemoveSignatures: true,
		})

	close(signal)

	if err != nil {
		panic(err)
	}
}

As you can see I've moved it to use a channel with a communicated struct that
current contains the offset of the copy (which is labeled delta here on
accident, will fix in a bit). The only thing I don't like about this interface
is that you have to close your own channels (in the source case, we can't avoid
it), but that's not really terrible.

@mtrmac
Copy link
Collaborator

mtrmac commented Feb 6, 2017

Thanks.

  • I still think that the progress hook should only be in copy.Image, not in the individual transport implementations; the underlying implementation can be made available to low-level users who want to call PutBlob/GetBlob directly.
  • Also, an interface would be more suitable than a channel, because it allows “starting an artifact copy” / “ending an artifact copy” notifications in a clean, type-safe way. At least the latter is very valuable.

@erikh
Copy link
Contributor Author

erikh commented Feb 6, 2017 via email

@mtrmac
Copy link
Collaborator

mtrmac commented Feb 6, 2017

@runcom , opinions on the channel vs. interface issue?

@erikh
Copy link
Contributor Author

erikh commented Feb 7, 2017

Hey @mtrmac, I updated to use copy.Image and I think you will find the API
considerably more malleable this way. I also am passing a types.ImageBlob as
the artifact now instead of a string.

@erikh
Copy link
Contributor Author

erikh commented Feb 7, 2017

The last posted example should still work.

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This broadly makes sense to me, these are just implementation-level comments.

copy/copy.go Outdated
@@ -173,6 +179,8 @@ func Image(policyContext *signature.PolicyContext, destRef, srcRef types.ImageRe
diffIDsAreNeeded: src.UpdatedImageNeedsLayerDiffIDs(manifestUpdates),
canModifyManifest: canModifyManifest,
reportWriter: reportWriter,
signalInterval: options.SignalInterval,
signal: options.Signal,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

options may be nil

copy/copy.go Outdated
stream = r
} else {
stream = destStream
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the same stage description structure as the rest of the steps, this function is pretty complex and the regularity makes it easier to understand:

  • Create a separate newline-separated section for this stage in the pipeline, starting with a // === -prefixed comment
  • Use destStream as both the input and output of that pipeline stage (i.e. if ic.signal != nil … { … destStream = r } with no else branch)

"github.com/containers/image/types"
)

const megaByte = float64(1024 * 1024)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not used.

// WithProgress implements io.Copy with a buffered reader, then measures
// reports to the offsetFunc with the count processed at the interval.
// If io.EOF is not returned then the error is returned. Otherwise, it is nil.
// WithProgress closes the writer after finishing.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment needs updating (e.g. there is no offsetFunc)

// reports to the offsetFunc with the count processed at the interval.
// If io.EOF is not returned then the error is returned. Otherwise, it is nil.
// WithProgress closes the writer after finishing.
func WithProgress(writer io.WriteCloser, reader io.Reader, artifact types.BlobInfo, interval time.Duration, signal chan types.SignalProperties) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a value from a goroutine is not really useful.

(It might well make sense to encapsulate a bit more, i.e. to have a public function here which accepts an io.Reader and returns an io.Reader, making the pipe and the goroutine entirely invisible to the rest of the code.)

} else {
return nil
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goto seems unnecessary; AFAICS this guarantees that there will not be a “signal” with the final 100% count value. But…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you just objecting to it becuase it uses goto?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am not dogmatic about goto, though it does seem to me that a goto-less equivalent would be a bit shorter. The objection was really about skipping the final update with 100% progress.

Though, with the point about io.Reader returning rn >0 && err != nil, I expect this to be restructured so much that nothing about these particular lines will remain applicable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I'll review and adjust. Thanks.

}
if err != nil {
return err
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is documented that io.Reader can return rn >0 && err != nil at the same time; i.e. the code should first write any outstanding data if any, and then check err. (See the implementation of io.Copy)


if interval > 0 && signal != nil {
if time.Since(t) > interval {
signal <- types.SignalProperties{Delta: count, Artifact: artifact}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So Delta is not a “delta” (change) since the last update, but the total offset? It would be more natural to call the member Offset then.

copy/copy.go Outdated
// intermediary to signal progress information to any registered channel.
if ic.signal != nil && ic.signalInterval > 0 {
r, w := io.Pipe()
go streamcopy.WithProgress(w, destStream, inputInfo, ic.signalInterval, ic.signal)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn’t this use srcInfo instead of inputInfo? inputInfo may be {Digest: "", Size:-1} in the “compressing on the fly” case.

types/types.go Outdated

// SignalProperties is used to pass information from the copy code to a monitor which
// can use the real-time information to produce output or react to changes.
type SignalProperties struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

”Signal” is pretty generic. s/Signal/Progress/g?

copy/copy.go Outdated
// the pipe is here to facilitate the stream copier which will act as an
// intermediary to signal progress information to any registered channel.
if ic.signal != nil && ic.signalInterval > 0 {
r, w := io.Pipe()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defer r.Close() somewhere, otherwise we could exit with an unreferenced unclosed PipeReader and a PipeWriter still blocked on trying to write to it.

@erikh
Copy link
Contributor Author

erikh commented Feb 7, 2017 via email

@erikh erikh force-pushed the tar-hook branch 2 times, most recently from 9374e25 to c0cb039 Compare February 7, 2017 16:48
@erikh
Copy link
Contributor Author

erikh commented Feb 7, 2017

Not finshed with review comments, but I did add a test if you'd like to peek.

@erikh
Copy link
Contributor Author

erikh commented Feb 9, 2017

I've updated to use your new pattern but the tests aren't passing. Not sure how to best test this yet, so I'm going to tinker for a while. Let me know if you'd like additional changes.

@erikh erikh force-pushed the tar-hook branch 4 times, most recently from 749d70e to a553ec7 Compare February 9, 2017 03:55
@erikh
Copy link
Contributor Author

erikh commented Feb 9, 2017

giving up on a test unless you have a better idea :) Let me know if you'd like anything changed, but I believe this is merge-worthy at this point.

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! ACK.

It might be useful to have a NewProgressReader as a public API (it would be very easy to combine with GetBlob/PutBlob for those who don’t use copy.Image), but that is not at all essential for this PR and it can be done at any later time if needed.

WRT testing, it seems to me that the difficult question is what behavior should the test check for. It is possible to create an artificial test expecting exactly the implementation artifacts of a particular implementation; for a generic test, there’s not that much I suppose (artifact ID matches, offset is in the expected interval and doesn’t go backwards, perhaps with a delayingReader as an input the reported offset is never larger than what we know the reader has provided.)

We do have test-skopeo which ensures that this has not broken anything in the default case, at least.

copy/copy.go Outdated
options = &Options{}
}

if options.ReportWriter != nil {
reportWriter = options.ReportWriter
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last thing, please move the reportWriter := ioutil.Discard initialization near this if.

@erikh
Copy link
Contributor Author

erikh commented Feb 10, 2017 via email

@erikh
Copy link
Contributor Author

erikh commented Feb 10, 2017

This should be ready now.

@mtrmac
Copy link
Collaborator

mtrmac commented Feb 10, 2017

👍 pending tests. Thanks!

Approved with PullApprove

@erikh
Copy link
Contributor Author

erikh commented Feb 15, 2017

Just kicked the tests, sorry this took so long.

@erikh
Copy link
Contributor Author

erikh commented Feb 16, 2017

tests are passing, looks like it needs another approval to get through

@mtrmac
Copy link
Collaborator

mtrmac commented Feb 21, 2017

@runcom PTAL

@@ -92,14 +95,22 @@ type Options struct {
ReportWriter io.Writer
SourceCtx *types.SystemContext
DestinationCtx *types.SystemContext
ProgressInterval time.Duration // time to wait between reports to signal the progress channel
Progress chan types.ProgressProperties // Reported to when ProgressInterval has arrived for a single artifact+offset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is a receive only channel, could you strongly type it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work because we do send to the channel eventually from our code, not the foreign code.

Unless I misunderstood you, when I implemented this, as soon as the send is hit by the compiler in copy/progress_reader.go it aborts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm ok not blocking this PR I guess

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICS this should actually be a send-only channel. The code calling copy.Image needs a bidirectional channel, but copy.Image will only be sending to it, and the rest of the code calling copy.Image will only be receiving from it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so yeah, my point was: let's strongly type the direction when we can in order to allow the go compiler (or runtime?) to be more efficient (I read this somewhere, I don't remember where, I just remember strongly typing the direction of the channel helped). Again, not blocking this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, strongly typing is useful enough just for the documentation value; but not blocking for me either.

// progressReader is a reader that reports its progress on an interval.
type progressReader struct {
source io.Reader
channel chan types.ProgressProperties
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto for the receive only channel I guess

@runcom
Copy link
Member

runcom commented Feb 23, 2017

Could you rebase this @erikh ?

@erikh
Copy link
Contributor Author

erikh commented Feb 23, 2017

done

@runcom
Copy link
Member

runcom commented Feb 24, 2017

@erikh one last check before I merge, is there any skopeo PR which takes care of the test failure here?

@mtrmac
Copy link
Collaborator

mtrmac commented Feb 24, 2017

@erikh one last check before I merge, is there any skopeo PR which takes care of the test failure here?

I think that test failure (manifest/manifest.go:57: undefined: v1.MediaTypeImageManifestList) is #240 . This should not break skopeo AFAICS.

@runcom
Copy link
Member

runcom commented Feb 24, 2017

👍

merging then

Approved with PullApprove

@runcom runcom merged commit f8bbb39 into containers:master Feb 24, 2017
@mtrmac
Copy link
Collaborator

mtrmac commented Feb 24, 2017

(Um, it would still have been useful to merge #240 first and see the tests pass. But, well, it’s done, not worth reverting at this point just for insiting on the format process.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants