algorithm: Deprecate ErrDigestInvalidLength #35

wking · 2017-06-09T21:34:09Z

This error was added in 9721254 (Validate digest length on parsing, 2015-12-02) and motivated with:

> The new error is mostly so that digest.Set doesn't need a special way to figure out if invalid digest had a algorithm prefix or not.

That doesn't make sense to me, because Digest.Validate returns ErrDigestUnsupported when there is no algorithm prefix (I've added a test to confirm this as well), and that seems orthogonal to encoding validation (which is the only place ErrDigestInvalidLength could matter). From the other changes in distribution/distribution#1231, I think the issue was just that invalid lengths weren't raising errors at all.

I see no reason to distinguish between invalid lengths, invalid characters, and other invalid-encoded-portion errors, so this commit collapses the error into ErrDigestInvalidFormat.

The code I'm removing assumed hex (with .Size() * 2), so this PR removes that assumption. The sha256 and other hex-based algorithms are still getting their lengths validated via anchoredEncodedRegexps.

This PR is based on my earlier suggestion. I think @AkihiroSuda kept the error because of @stevvooe's comment:

> This needs to remain. Are we no longer throwing this error?

but with this PR we will continue throwing an error in the invalid-length case and folks who are switching on ErrDigestInvalidLength can continue to do so. It's just that they'll no longer be able to distinguish between ErrDigestInvalidLength and ErrDigestInvalidFormat, and since I see no reason to do that, I don't see it as an important backwards-compat concern. If someone does have a reason to make that distinction, pointing at that code would be a good reason to close this PR.

This error was added in 9721254 (Validate digest length on parsing, 2015-12-02) and motivated with [1]:

> The new error is mostly so that digest.Set doesn't need a special
> way to figure out if invalid digest had a algorithm prefix or not.

That doesn't make sense to me, because Digest.Validate returns ErrDigestUnsupported when there is no algorithm prefix (I've added a test to confirm this as well), and that seems orthogonal to encoding validation (which is the only place ErrDigestInvalidLength could matter). From the other changes in [1], I think the issue was just that invalid lengths weren't raising errors at all.

I see no reason to distinguish between invalid lengths, invalid characters, and other invalid-encoded-portion errors, so this commit collapses the error into ErrDigestInvalidFormat.

The code I'm removing assumed hex (with '.Size() * 2'), so this commit removes that assumption. The sha256 and other hex-based algorithms are still getting their lengths validated via anchoredEncodedRegexps.

[1]: distribution/distribution#1231

Signed-off-by: W. Trevor King <wking@tremily.us>

dmcgowan · 2017-06-09T21:59:28Z

I agree that throwing 2 different errors with similar meaning is not the way to go. Clients who are interested in these values would then have to check 2 separate values, although likely when they were created separately it was assumed that clients would not be switching off invalid data (not actionable) and just propagating the error or discarding the value. A better solution involves wrapping such as github.com/pkg/errors, but we are not going to pull in a vendor to this package. The best thing we can probably do now is just close this PR and revisit it if pkg/errors is upstreamed into golang or when we add support for other encodings and need logic smarter than .Size() * 2 to determine the expected size. No longer throwing the invalid length error just reduces the messaging to the callers, who are very unlikely to be switching off these error messages anyway. Checking 2 values instead of 1 is not ideal if they are, but it is not worth reducing the error granularity or adding error type complexity.

wking · 2017-06-09T22:20:29Z

On Fri, Jun 09, 2017 at 02:59:29PM -0700, Derek McGowan wrote:

> A better solution involves wrapping such as `github.com/pkg/errors`, but we are not going to pull in a vendor to this package.

I don't think that's a better solution, because anchoredEncodedRegexps is checking both the character validity and length at the same time [1]. Why would you even *want* two errors to distinguish between invalid characters and invalid lengths, even if we had a way to wrap them together for easier catching?

> The best thing we can probably do now is just close this PR and revisit it if `pkg/errors` is upstreamed into golang or when we add support for other encodings and need logic smarter than `.Size() * 2` to determine the expected size.

But we already check the length via the regexp [2]. Why do it twice? The current way adds a redundant (hex-assuming) size check to valid digests, and the only upside is a possibly-cheaper rejection (vs. the regexp check) of encoded portions with the wrong length. I haven't benchmarked either the size check or the regexp check, and I don't know what fraction of the encoded portions seen in production have invalid lengths, but if most of your encoded portions are valid, this PR will make validation faster by one size check with no downside.

> … but it is not worth reducing the error granularity…

I agree with the portions that I've replaced with ellipses, but why is it not worth reducing error granularity if clients are not acting on the distinction?

[1]: https://github.com/opencontainers/go-digest/blob/279bed98673dd5bef374d3b6e4b09e2af76183bf/algorithm.go#L55-L61
[2]: https://github.com/opencontainers/go-digest/blob/279bed98673dd5bef374d3b6e4b09e2af76183bf/algorithm.go#L188

stevvooe · 2017-06-10T00:13:44Z

@wking Length and format errors are fundamentally different. This PR is fairly naive.

wking · 2017-06-10T04:22:59Z

> Length and format errors are fundamentally different.

Maybe in some cases. For example, sha256:abcdef0123456789 is too short but otherwise fine, so I'm fine expecting the ErrDigestInvalidLength it actually returns. And sha256:E58FCF7418D4390DEC8E8FB69D88C06EC07039D651FEDD3AA72AF9972E7D046B is the right length and is wrong only in its characters (uppercase hex), so I'm fine expecting the ErrDigestInvalidFormat it actually returns. But there are a number of murky corner cases, and it's not obvious to me from the error docs what to expect in the following cases:

`sha256:` violates both the encoded grammar (which contains the one-or-more +) and the sha256-specific expected 64-char size by having a zero-length encoded part. It contains no invalid characters. Should this be an invalid format (because it does not match the grammar) or an invalid length (because the encoded part is 0 chars long, not 64)? The master branch returns ErrDigestInvalidFormat, but I don't think that's obviously a better choice than ErrDigestInvalidLength.
`sha256:d41d8cd98f00b204e9800m98ecf8427e` violates the sha256 encoding requirements by containing an m, and the sha256-specific expected 64-char size by being only 32 chars long. Should this be an invalid format (because of the m) or an invalid length (because the encoded part is 32 chars long, not 64)? The master branch returns ErrDigestInvalidLength, but I don't think that's obviously a better choice than ErrDigestInvalidFormat.
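The ambiguity in these corner cases can be sketched as a question of check ordering. The two validators below are hypothetical and are not the go-digest implementation; they only show that with two separate errors, the answer for an input that is wrong in both ways depends on which check happens to run first.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

var (
	errFormat = errors.New("invalid checksum digest format") // illustrative message
	errLength = errors.New("invalid checksum digest length") // illustrative message

	// hexChars checks characters only (one or more lowercase hex digits).
	hexChars = regexp.MustCompile(`^[a-f0-9]+$`)
)

const sha256HexLen = 64 // hex-encoded sha256 length

// lengthFirst checks the length before the character set.
func lengthFirst(encoded string) error {
	if len(encoded) != sha256HexLen {
		return errLength
	}
	if !hexChars.MatchString(encoded) {
		return errFormat
	}
	return nil
}

// charsFirst checks the character set before the length.
func charsFirst(encoded string) error {
	if !hexChars.MatchString(encoded) {
		return errFormat
	}
	if len(encoded) != sha256HexLen {
		return errLength
	}
	return nil
}

func main() {
	// Empty encoded part, and a 32-char encoded part containing an 'm'.
	for _, encoded := range []string{"", "d41d8cd98f00b204e9800m98ecf8427e"} {
		fmt.Printf("%q: lengthFirst=%v charsFirst=%v\n",
			encoded, lengthFirst(encoded), charsFirst(encoded))
	}
}
```

Both orderings are defensible, and they disagree on exactly the inputs listed above, so a caller cannot rely on the distinction without knowing the implementation.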

So to find potential effects due to these murky cases, let's look for some consumers distinguishing between the two cases:

opencontainers/image-spec $ git grep 'ErrDigestInvalid[FL]' v1.0.0-rc6
…no hits…
opencontainers/image-tools $ git grep 'ErrDigestInvalid[FL]' 38db2e4 | grep -v /go-digest/
…no hits…
opencontainers/runc $ git grep 'ErrDigestInvalid[FL]' v1.0.0-rc3
…no hits…
opencontainers/containerd $ git grep 'ErrDigestInvalid[FL]' v0.2.8
…no hits…
moby/moby $ git grep 'ErrDigestInvalid[FL]' v17.05.0-ce | grep -v '/go-digest/\|/docker/distribution/'
…no hits…
docker/distribution $ git grep 'ErrDigestInvalid[FL]' v2.6.1 | grep -v '^v2.6.1:digest/'
v2.6.1:reference/reference.go:  // ErrDigestInvalidFormat represents an error while trying to parse a string as a tag.
v2.6.1:reference/reference.go:  ErrDigestInvalidFormat = errors.New("invalid digest format")
v2.6.1:reference/reference.go:          return nil, ErrDigestInvalidFormat
v2.6.1:reference/reference_test.go:                     err:   digest.ErrDigestInvalidLength,
v2.6.1:registry/handlers/images.go:                                     if verificationError == digest.ErrDigestInvalidFormat {
v2.6.1:registry/handlers/images.go:             case digest.ErrDigestInvalidFormat:
v2.6.1:registry/storage/cache/cachecheck/suite.go:              MediaType: "application/octet-stream"}); err != digest.ErrDigestInvalidFormat {
v2.6.1:registry/storage/cache/cachecheck/suite.go:      if _, err := cache.Stat(ctx, ""); err != digest.ErrDigestInvalidFormat {

Going through those:

reference/reference.go is defining its own local ErrDigestInvalidFormat and comparing it to its own local digest regexp. This is independent of the go-digest code (they have a forked copy under digest/), and the lack of an analog to ErrDigestInvalidLength suggests the distinction I'm trying to drop in this PR is not important. The code in this area is also calling the forked go-digest analog internally.
reference/reference_test.go is expecting ErrDigestInvalidLength in a test including sha256:ffffffffffffffffffffffffffffffffff which matches our master.
registry/handlers/images.go is converting ErrDigestInvalidFormat to ErrorCodeDigestInvalid and using ErrorCodeUnknown for ErrDigestInvalidLength (and again here). But I'd expect ErrorCodeDigestInvalid to be a better match than ErrorCodeUnknown for both, based on their documentation.
registry/storage/cache/cachecheck/suite.go is expecting ErrDigestInvalidFormat in a test for sha384:abc, but our current master will return ErrDigestInvalidLength in that case.
registry/storage/cache/cachecheck/suite.go is expecting ErrDigestInvalidFormat in a test for the empty digest, which matches our master.

So of the docker/distribution cases, one is a wrapping validation framework that defines its own ErrDigestInvalidFormat for both length and valid-char errors, one has a pretty API version of ErrDigestInvalidFormat but none for ErrDigestInvalidLength, one disagrees with our master about the error (for sha384:abc), and two agree with our master about the error (for the empty digest and sha256:ffffffffffffffffffffffffffffffffff). That doesn't seem like a ringing endorsement for “these are fundamentally different” to me.

So is there anywhere that this distinction matters? The only downstream effect I can see from this consolidation is improved errors in the docker/distribution API. Are there any consumers that would be negatively impacted by this change?
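As a rough sketch of that downstream effect: the handler shape described above gives only the format error a dedicated API code, so folding length errors into format errors changes what the client sees. Every name below (`apiCode`, `toAPICode`, `consolidate`, and the code strings) is hypothetical and is not taken from docker/distribution.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errInvalidFormat = errors.New("invalid checksum digest format") // illustrative message
	errInvalidLength = errors.New("invalid checksum digest length") // illustrative message
)

// apiCode is a stand-in for a registry API error code.
type apiCode string

const (
	codeDigestInvalid apiCode = "DIGEST_INVALID"
	codeUnknown       apiCode = "UNKNOWN"
)

// toAPICode mirrors the handler shape described above: only the format
// error has a dedicated code, everything else falls through to unknown.
func toAPICode(err error) apiCode {
	if err == errInvalidFormat {
		return codeDigestInvalid
	}
	return codeUnknown
}

// consolidate models the proposed change by folding length errors into
// format errors before they reach the handler.
func consolidate(err error) error {
	if err == errInvalidLength {
		return errInvalidFormat
	}
	return err
}

func main() {
	fmt.Println(toAPICode(errInvalidLength))              // handler sees a separate length error
	fmt.Println(toAPICode(consolidate(errInvalidLength))) // handler sees the consolidated error
}
```

Under these assumptions the handler needs no change at all to start reporting the more specific code for wrong-length digests.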

stevvooe · 2017-09-08T23:36:56Z

@wking The problem with this PR is that it removes the size check. That is not required to deprecate the length error.

wking · 2017-09-14T01:47:26Z

> The problem with this PR is that it removes the size check.

No it doesn't. We're still checking size here, used here. Also note that all the old unit tests still raise the expected error; none are failing to raise an error. If you feel it is missing a case, feel free to propose a new test case.

wking force-pushed the deprecate-ErrDigestInvalidLength branch from b297dcc to 1167650 June 9, 2017 21:35

wking mentioned this pull request Jun 9, 2017

algorithm: ErrDigestInvalidFormat on Validate() even for unknown algorithms #36

Closed

stevvooe closed this Jun 10, 2017
