Skip to content

Conversation

@tgeoghegan
Copy link
Contributor

I have an integration test for an ACME extension (1) that instantiate Pebble. I noticed that I would sometimes get test failures under go test -race like this:

==================
WARNING: DATA RACE
Write at 0x00c00026cd80 by goroutine 143:
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).orderForDisplay.func2()
      /home/timg/source/pebble/wfe/wfe.go:1876 +0x255
  math/rand.(*Rand).Shuffle()
      /usr/local/go/src/math/rand/rand.go:265 +0x96
  math/rand.Shuffle()
      /usr/local/go/src/math/rand/rand.go:470 +0x38
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).orderForDisplay()
      /home/timg/source/pebble/wfe/wfe.go:1875 +0x204
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).Order()
      /home/timg/source/pebble/wfe/wfe.go:2021 +0x5ed
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).Order-fm()
      <autogenerated>:1 +0x69
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.func1()
      /home/timg/source/pebble/wfe/wfe.go:306 +0xa09
  github.com/letsencrypt/pebble/v2/wfe.wfeHandlerFunc.ServeHTTP()
      /home/timg/source/pebble/wfe/wfe.go:146 +0x56
  github.com/letsencrypt/pebble/v2/wfe.(*topHandler).ServeHTTP()
      /home/timg/source/pebble/wfe/wfe.go:158 +0x5b
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.StripPrefix.func2()
      /usr/local/go/src/net/http/server.go:2282 +0x471
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2220 +0x47
  net/http.(*ServeMux).ServeHTTP()
      /usr/local/go/src/net/http/server.go:2747 +0x255
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:3210 +0x2a1
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:2092 +0x12a4
  net/http.(*Server).Serve.gowrap3()
      /usr/local/go/src/net/http/server.go:3360 +0x4f

Previous read at 0x00c00026cd80 by goroutine 274:
  github.com/letsencrypt/pebble/v2/ca.(*CAImpl).CompleteOrder()
      /home/timg/source/pebble/ca/ca.go:448 +0x76f
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder.func1()
      /home/timg/source/pebble/wfe/wfe.go:2215 +0x8d

Goroutine 143 (running) created at:
  net/http.(*Server).Serve()
      /usr/local/go/src/net/http/server.go:3360 +0x8ec
  net/http.(*Server).ServeTLS()
      /usr/local/go/src/net/http/server.go:3401 +0x706
  net/http.ServeTLS()
      /usr/local/go/src/net/http/server.go:2875 +0x269
  github.com/tgeoghegan/oidf-box/test.setupPebble.func1()
      /home/timg/source/oidf-box/test/harness.go:297 +0x19c

Goroutine 274 (running) created at:
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder()
      /home/timg/source/pebble/wfe/wfe.go:2214 +0x2fa4
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder-fm()
      <autogenerated>:1 +0x69
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.func1()
      /home/timg/source/pebble/wfe/wfe.go:306 +0xa09
  github.com/letsencrypt/pebble/v2/wfe.wfeHandlerFunc.ServeHTTP()
      /home/timg/source/pebble/wfe/wfe.go:146 +0x56
  github.com/letsencrypt/pebble/v2/wfe.(*topHandler).ServeHTTP()
      /home/timg/source/pebble/wfe/wfe.go:158 +0x5b
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.StripPrefix.func2()
      /usr/local/go/src/net/http/server.go:2282 +0x471
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2220 +0x47
  net/http.(*ServeMux).ServeHTTP()
      /usr/local/go/src/net/http/server.go:2747 +0x255
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:3210 +0x2a1
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:2092 +0x12a4
  net/http.(*Server).Serve.gowrap3()
      /usr/local/go/src/net/http/server.go:3360 +0x4f
==================

What's happening is that CAImpl.CompleteOrder is fetching an Order from the "database" (which is in-memory) and iterating over the identifiers therein. WebFrontEndImpl.orderForDisplay fetches an Order from the database and then wants to construct a shuffled version of the order to return to the client to ensure nobody makes assumptions about the layout of these objects.

orderForDisplay tries to take a copy of the acme.Order, but evidently that copies pointers to the []acme.Identifier Identifiers and []string Authorizations fields, which means shuffling them in the "copy" shuffles them in the DB object and makes the race detector unhappy.

I have an integration test for an ACME extension ([1]) that instantiate
Pebble. I noticed that I would sometimes get test failures under
`go test -race` like this:

```
==================
WARNING: DATA RACE
Write at 0x00c00026cd80 by goroutine 143:
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).orderForDisplay.func2()
      /home/timg/source/pebble/wfe/wfe.go:1876 +0x255
  math/rand.(*Rand).Shuffle()
      /usr/local/go/src/math/rand/rand.go:265 +0x96
  math/rand.Shuffle()
      /usr/local/go/src/math/rand/rand.go:470 +0x38
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).orderForDisplay()
      /home/timg/source/pebble/wfe/wfe.go:1875 +0x204
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).Order()
      /home/timg/source/pebble/wfe/wfe.go:2021 +0x5ed
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).Order-fm()
      <autogenerated>:1 +0x69
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.func1()
      /home/timg/source/pebble/wfe/wfe.go:306 +0xa09
  github.com/letsencrypt/pebble/v2/wfe.wfeHandlerFunc.ServeHTTP()
      /home/timg/source/pebble/wfe/wfe.go:146 +0x56
  github.com/letsencrypt/pebble/v2/wfe.(*topHandler).ServeHTTP()
      /home/timg/source/pebble/wfe/wfe.go:158 +0x5b
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.StripPrefix.func2()
      /usr/local/go/src/net/http/server.go:2282 +0x471
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2220 +0x47
  net/http.(*ServeMux).ServeHTTP()
      /usr/local/go/src/net/http/server.go:2747 +0x255
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:3210 +0x2a1
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:2092 +0x12a4
  net/http.(*Server).Serve.gowrap3()
      /usr/local/go/src/net/http/server.go:3360 +0x4f

Previous read at 0x00c00026cd80 by goroutine 274:
  github.com/letsencrypt/pebble/v2/ca.(*CAImpl).CompleteOrder()
      /home/timg/source/pebble/ca/ca.go:448 +0x76f
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder.func1()
      /home/timg/source/pebble/wfe/wfe.go:2215 +0x8d

Goroutine 143 (running) created at:
  net/http.(*Server).Serve()
      /usr/local/go/src/net/http/server.go:3360 +0x8ec
  net/http.(*Server).ServeTLS()
      /usr/local/go/src/net/http/server.go:3401 +0x706
  net/http.ServeTLS()
      /usr/local/go/src/net/http/server.go:2875 +0x269
  github.com/tgeoghegan/oidf-box/test.setupPebble.func1()
      /home/timg/source/oidf-box/test/harness.go:297 +0x19c

Goroutine 274 (running) created at:
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder()
      /home/timg/source/pebble/wfe/wfe.go:2214 +0x2fa4
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).FinalizeOrder-fm()
      <autogenerated>:1 +0x69
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.func1()
      /home/timg/source/pebble/wfe/wfe.go:306 +0xa09
  github.com/letsencrypt/pebble/v2/wfe.wfeHandlerFunc.ServeHTTP()
      /home/timg/source/pebble/wfe/wfe.go:146 +0x56
  github.com/letsencrypt/pebble/v2/wfe.(*topHandler).ServeHTTP()
      /home/timg/source/pebble/wfe/wfe.go:158 +0x5b
  github.com/letsencrypt/pebble/v2/wfe.(*WebFrontEndImpl).HandleFunc.StripPrefix.func2()
      /usr/local/go/src/net/http/server.go:2282 +0x471
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2220 +0x47
  net/http.(*ServeMux).ServeHTTP()
      /usr/local/go/src/net/http/server.go:2747 +0x255
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:3210 +0x2a1
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:2092 +0x12a4
  net/http.(*Server).Serve.gowrap3()
      /usr/local/go/src/net/http/server.go:3360 +0x4f
==================
```

What's happening is that `CAImpl.CompleteOrder` is fetching an `Order`
from the "database" (which is in-memory) and iterating over the
identifiers therein. `WebFrontEndImpl.orderForDisplay` fetches an
`Order` from the database and then wants to construct a shuffled version
of the order to return to the client to ensure nobody makes assumptions
about the layout of these objects.

`orderForDisplay` tries to take a copy of the `acme.Order`, but
evidently that copies *pointers* to the `[]acme.Identifier Identifiers`
and `[]string Authorizations` fields, which means shuffling them in the
copy shuffles them in the DB object and makes the race detector
unhappy.

Explicit `slices.Clone` calls make sure we're not mutating objects from
the database.

[1]: https://github.com/tgeoghegan/oidf-box
wfe/wfe.go Outdated
// below could mutate the order in the database, causing data races.
result := order.Order
result.Identifiers = slices.Clone(order.Order.Identifiers)
result.Authorizations = slices.Clone(order.Order.Authorizations)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit that I'm not certain if this is the "right", Goish, Gothonic way to fix this, so I'm happy to be shown better solutions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely works. But it has the downside of not catching any new pointer-valued fields which are added to the Order type in the future.

I think it would be slightly more idiomatic to instead say:

result := acme.Order{
	Status: order.Order.Status,
	Error: order.Order.Error,
  etc
	Identifiers: slices.Clone(order.Order.Identifiers),
  etc
}

This has a similar downside -- if a new field is added to the Order type, it may not get added to this operation. But that will result in the field being missing, rather than subtler bugs, so is much more likely to be caught by unit or integration tests.

This solution is so much more verbose that I'm honestly not sure it counts as "better", but I do think it has worthwhile advantages. Up to you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the thing where unmentioned struct fields get filled with the zero value kinda works against us here. I like yours better because then at least we're not mutating an object after creating it, but I don't know how to do better.

aarongable
aarongable previously approved these changes May 20, 2025
wfe/wfe.go Outdated
// below could mutate the order in the database, causing data races.
result := order.Order
result.Identifiers = slices.Clone(order.Order.Identifiers)
result.Authorizations = slices.Clone(order.Order.Authorizations)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely works. But it has the downside of not catching any new pointer-valued fields which are added to the Order type in the future.

I think it would be slightly more idiomatic to instead say:

result := acme.Order{
	Status: order.Order.Status,
	Error: order.Order.Error,
  etc
	Identifiers: slices.Clone(order.Order.Identifiers),
  etc
}

This has a similar downside -- if a new field is added to the Order type, it may not get added to this operation. But that will result in the field being missing, rather than subtler bugs, so is much more likely to be caught by unit or integration tests.

This solution is so much more verbose that I'm honestly not sure it counts as "better", but I do think it has worthwhile advantages. Up to you!

aarongable
aarongable previously approved these changes May 20, 2025
wfe/wfe.go Outdated
Comment on lines 1843 to 1844
// use as the result. We have to make sure to copy the identifiers and
// authorizations *slices* and not *pointers to the slices* or the mutations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more correct way to phrase this would be that we have to copy the contents of the idents and authzs slices, rather than copying the slices themselves, because slices are just pointers with some extra metadata.

Those fields aren't "pointers to slices" (that would be type *[]string).

@aarongable aarongable merged commit 39dbb64 into letsencrypt:main May 21, 2025
14 checks passed
kwatson added a commit to kwatson/letsencrypt-pebble that referenced this pull request Jun 9, 2025
* 'main' of https://github.com/letsencrypt/pebble: (35 commits)
  add overriding of ARI response (letsencrypt#501)
  wfe: fix a race in `orderForDisplay` (letsencrypt#500)
  Bump golang.org/x/ dependencies (letsencrypt#499)
  currectly triggers BadSignatureAlgorithmProblem at JWS parse time (letsencrypt#492)
  use newer validation subdomain for dns-account-01 (fix CI eggsampler/acme error) (letsencrypt#498)
  Orders don't have a "deactivated" status. (letsencrypt#301)
  Update golangci-lint (letsencrypt#488)
  build(deps): bump github.com/go-jose/go-jose/v4 from 4.0.4 to 4.0.5 (letsencrypt#487)
  Truncate ARI timestamps to millisecond resolution (letsencrypt#485)
  return logical and compliant ARI windows for expiring certs (letsencrypt#484)
  Update dependencies (letsencrypt#481)
  docs: rm mention of subproblems being unimpl'd (letsencrypt#479)
  Fix(NOISSUE): Fix docker compose file example in README.md (letsencrypt#475)
  Add support for ACME Profiles (letsencrypt#473)
  Simplify KU, EKU, and SKID fields of issued certs (letsencrypt#472)
  Update golangci-lint to 1.60.2 (letsencrypt#474)
  Update /x/net for compatibility with go1.23 (letsencrypt#470)
  Reject extra command line args and fix README invocation (letsencrypt#467)
  Document exposing API and management ports when not using docker-compose.yaml (letsencrypt#465)
  Implement latest draft-ietf-acme-ari spec (letsencrypt#461)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants