-
Notifications
You must be signed in to change notification settings - Fork 166
wfe: fix a race in orderForDisplay
#500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wfe: fix a race in orderForDisplay
#500
Conversation
I have an integration test for an ACME extension ([1]) that instantiate
Pebble. I noticed that I would sometimes get test failures under
`go test -race` like this:
```
==================
WARNING: DATA RACE
Write at 0x00c00026cd80 by goroutine 143:
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).orderForDisplay.func2()
/home/timg/source/pebble/wfe/wfe.go:1876 +0x255
math/rand.(*Rand).Shuffle()
/usr/local/go/src/math/rand/rand.go:265 +0x96
math/rand.Shuffle()
/usr/local/go/src/math/rand/rand.go:470 +0x38
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).orderForDisplay()
/home/timg/source/pebble/wfe/wfe.go:1875 +0x204
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).Order()
/home/timg/source/pebble/wfe/wfe.go:2021 +0x5ed
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).Order-fm()
<autogenerated>:1 +0x69
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.func1()
/home/timg/source/pebble/wfe/wfe.go:306 +0xa09
github.com/letsencrypt/pebble/v2/wfe.wfeHandlerFunc.ServeHTTP()
/home/timg/source/pebble/wfe/wfe.go:146 +0x56
github.com/letsencrypt/pebble/v2/wfe.(*topHandler).ServeHTTP()
/home/timg/source/pebble/wfe/wfe.go:158 +0x5b
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.StripPrefix.func2()
/usr/local/go/src/net/http/server.go:2282 +0x471
net/http.HandlerFunc.ServeHTTP()
/usr/local/go/src/net/http/server.go:2220 +0x47
net/http.(*ServeMux).ServeHTTP()
/usr/local/go/src/net/http/server.go:2747 +0x255
net/http.serverHandler.ServeHTTP()
/usr/local/go/src/net/http/server.go:3210 +0x2a1
net/http.(*conn).serve()
/usr/local/go/src/net/http/server.go:2092 +0x12a4
net/http.(*Server).Serve.gowrap3()
/usr/local/go/src/net/http/server.go:3360 +0x4f
Previous read at 0x00c00026cd80 by goroutine 274:
github.com/letsencrypt/pebble/v2/ca.(*CAImpl).CompleteOrder()
/home/timg/source/pebble/ca/ca.go:448 +0x76f
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder.func1()
/home/timg/source/pebble/wfe/wfe.go:2215 +0x8d
Goroutine 143 (running) created at:
net/http.(*Server).Serve()
/usr/local/go/src/net/http/server.go:3360 +0x8ec
net/http.(*Server).ServeTLS()
/usr/local/go/src/net/http/server.go:3401 +0x706
net/http.ServeTLS()
/usr/local/go/src/net/http/server.go:2875 +0x269
github.com/tgeoghegan/oidf-box/test.setupPebble.func1()
/home/timg/source/oidf-box/test/harness.go:297 +0x19c
Goroutine 274 (running) created at:
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder()
/home/timg/source/pebble/wfe/wfe.go:2214 +0x2fa4
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder-fm()
<autogenerated>:1 +0x69
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.func1()
/home/timg/source/pebble/wfe/wfe.go:306 +0xa09
github.com/letsencrypt/pebble/v2/wfe.wfeHandlerFunc.ServeHTTP()
/home/timg/source/pebble/wfe/wfe.go:146 +0x56
github.com/letsencrypt/pebble/v2/wfe.(*topHandler).ServeHTTP()
/home/timg/source/pebble/wfe/wfe.go:158 +0x5b
github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.StripPrefix.func2()
/usr/local/go/src/net/http/server.go:2282 +0x471
net/http.HandlerFunc.ServeHTTP()
/usr/local/go/src/net/http/server.go:2220 +0x47
net/http.(*ServeMux).ServeHTTP()
/usr/local/go/src/net/http/server.go:2747 +0x255
net/http.serverHandler.ServeHTTP()
/usr/local/go/src/net/http/server.go:3210 +0x2a1
net/http.(*conn).serve()
/usr/local/go/src/net/http/server.go:2092 +0x12a4
net/http.(*Server).Serve.gowrap3()
/usr/local/go/src/net/http/server.go:3360 +0x4f
==================
```
What's happening is that `CAImpl.CompleteOrder` is fetching an `Order`
from the "database" (which is in-memory) and iterating over the
identifiers therein. `WebFrontEndImpl.orderForDisplay` fetches an
`Order` from the database and then wants to construct a shuffled version
of the order to return to the client to ensure nobody makes assumptions
about the layout of these objects.
`orderForDisplay` tries to take a copy of the `acme.Order`, but
evidently that copies *pointers* to the `[]acme.Identifier Identifiers`
and `[]string Authorizations` fields, which means shuffling them in the
copy shuffles them in the DB object and makes the race detector
unhappy.
Explicit `slices.Clone` calls make sure we're not mutating objects from
the database.
[1]: https://github.com/tgeoghegan/oidf-box
wfe/wfe.go
Outdated
| // below could mutate the order in the database, causing data races. | ||
| result := order.Order | ||
| result.Identifiers = slices.Clone(order.Order.Identifiers) | ||
| result.Authorizations = slices.Clone(order.Order.Authorizations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must admit that I'm not certain if this is the "right", Goish, Gothonic way to fix this, so I'm happy to be shown better solutions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely works. But it has the downside of not catching any new pointer-valued fields which are added to the Order type in the future.
I think it would be slightly more idiomatic to instead say:
result := acme.Order{
Status: order.Order.Status,
Error: order.Order.Error,
etc
Identifiers: slices.Clone(order.Order.Identifiers),
etc
}This has a similar downside -- if a new field is added to the Order type, it may not get added to this operation. But that will result in the field being missing, rather than subtler bugs, so is much more likely to be caught by unit or integration tests.
This solution is so much more verbose that I'm honestly not sure it counts as "better", but I do think it has worthwhile advantages. Up to you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the thing where unmentioned struct fields get filled with the zero value kinda works against us here. I like yours better because then at least we're not mutating an object after creating it, but I don't know how to do better.
wfe/wfe.go
Outdated
| // below could mutate the order in the database, causing data races. | ||
| result := order.Order | ||
| result.Identifiers = slices.Clone(order.Order.Identifiers) | ||
| result.Authorizations = slices.Clone(order.Order.Authorizations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely works. But it has the downside of not catching any new pointer-valued fields which are added to the Order type in the future.
I think it would be slightly more idiomatic to instead say:
result := acme.Order{
Status: order.Order.Status,
Error: order.Order.Error,
etc
Identifiers: slices.Clone(order.Order.Identifiers),
etc
}This has a similar downside -- if a new field is added to the Order type, it may not get added to this operation. But that will result in the field being missing, rather than subtler bugs, so is much more likely to be caught by unit or integration tests.
This solution is so much more verbose that I'm honestly not sure it counts as "better", but I do think it has worthwhile advantages. Up to you!
wfe/wfe.go
Outdated
| // use as the result. We have to make sure to copy the identifiers and | ||
| // authorizations *slices* and not *pointers to the slices* or the mutations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more correct way to phrase this would be that we have to copy the contents of the idents and authzs slices, rather than copying the slices themselves, because slices are just pointers with some extra metadata.
Those fields aren't "pointers to slices" (that would be type *[]string).
* 'main' of https://github.com/letsencrypt/pebble: (35 commits) add overriding of ARI response (letsencrypt#501) wfe: fix a race in `orderForDisplay` (letsencrypt#500) Bump golang.org/x/ dependencies (letsencrypt#499) currectly triggers BadSignatureAlgorithmProblem at JWS parse time (letsencrypt#492) use newer validation subdomain for dns-account-01 (fix CI eggsampler/acme error) (letsencrypt#498) Orders don't have a "deactivated" status. (letsencrypt#301) Update golangci-lint (letsencrypt#488) build(deps): bump github.com/go-jose/go-jose/v4 from 4.0.4 to 4.0.5 (letsencrypt#487) Truncate ARI timestamps to millisecond resolution (letsencrypt#485) return logical and compliant ARI windows for expiring certs (letsencrypt#484) Update dependencies (letsencrypt#481) docs: rm mention of subproblems being unimpl'd (letsencrypt#479) Fix(NOISSUE): Fix docker compose file example in README.md (letsencrypt#475) Add support for ACME Profiles (letsencrypt#473) Simplify KU, EKU, and SKID fields of issued certs (letsencrypt#472) Update golangci-lint to 1.60.2 (letsencrypt#474) Update /x/net for compatibility with go1.23 (letsencrypt#470) Reject extra command line args and fix README invocation (letsencrypt#467) Document exposing API and management ports when not using docker-compose.yaml (letsencrypt#465) Implement latest draft-ietf-acme-ari spec (letsencrypt#461) ...
I have an integration test for an ACME extension (1) that instantiate Pebble. I noticed that I would sometimes get test failures under
go test -racelike this:What's happening is that
CAImpl.CompleteOrderis fetching anOrderfrom the "database" (which is in-memory) and iterating over the identifiers therein.WebFrontEndImpl.orderForDisplayfetches anOrderfrom the database and then wants to construct a shuffled version of the order to return to the client to ensure nobody makes assumptions about the layout of these objects.orderForDisplaytries to take a copy of theacme.Order, but evidently that copies pointers to the[]acme.Identifier Identifiersand[]string Authorizationsfields, which means shuffling them in the "copy" shuffles them in the DB object and makes the race detector unhappy.