-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
int: reduce allocation by representing small ints as pointers #280
Conversation
This change defines low-level accessors for the small and big arms of the int union so that the representation can be easily changed in a follow-up. Change-Id: I7c4ae279a6d2e7b76e102ba5d01a3cd1c56fb368
starlark/int_generic.go
Outdated
@@ -0,0 +1,33 @@ | |||
//+build !linux,!darwin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pair of lines should be one line:
// +build !linux,!darwin !amd64,!arm64,!mips64x,!ppc64x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I always forget whether space means AND or OR.
starlark/int_posix64.go
Outdated
package starlark | ||
|
||
// Optimized Int implementation for 64-bit machines running POSIX (for mmap). | ||
// It reserves a portion of the address space to represent int32 values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider expanding this:
It reserves a 4GB range of the virtual address space using mmap and represents int32 values
as addresses within that range. This disambiguates int32 values from big.Int pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
var smallints = reserveAddresses(1 << 32) | ||
|
||
func reserveAddresses(len int) uintptr { | ||
b, err := unix.Mmap(-1, 0, len, unix.PROT_READ, unix.MAP_PRIVATE|unix.MAP_ANONYMOUS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to use something similar to NaN boxing instead? That would let you avoid the mmap
.
amd64 processors only support 48-bit virtual addresses. They're sign-extended to 64-bits. There's a big range in the middle that could be used to store other values. I believe arm64 processors have a similar limitation. I don't know about mips.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't NaN-boxing the trick by which JS implementations embed 53-bit ints, 48-bit pointers, and other tagged value types into each of the many kinds of NaN? We're trying to hide numbers inside pointers, but Nan-boxing goes in the opposite direction: hiding pointers inside numbers, which is problematic for garbage collection. (big.Ints would falsely appear unreachable.)
The use of mmap is not such a problem: it is relatively portable (it's available as VirtualAlloc on Windows), and the current approach doesn't depend on features of the hardware other than word size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant NaN boxing as an analogy: using the dead bits of a large type (in this case unsafe.Pointer
) to store a smaller numeric value (int32
).
Fair enough on mmap
though. I'm still stuck in the dark ages of 32-bit where virtual address space was a scarce resource. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you were willing to understand what "valid" pointers are on the host system, you could take a space from the "invalid" pointers by fiat, without asking mmap for address range. On amd64 the valid pointers start with 17 0s or 17 1s; if you pick a different prefix, like "01", then you could store up to 62-bit numbers in the remaining bits. But on non-amd64 that specific trick wouldn't work. On some 64-bit systems Go supports (I forget which one) the entire 64-bit range is technically valid. The mmap trick is more portable for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to stick with the mmap version for now because it seems less fragile overall, but I will remember these ideas for when I implement my next proper compiler to object code. :)
This change defines a new representation for Int on 64-bit machines running a POSIX operating system, by reserving a 4GB portion of the address space. Pointers to addresses in this region represent int32 values, and are disjoint from all *big.Int pointers, allowing us to represent the union in a single pointer. This means the conversion from Int to Value does not allocate. The gauss benchmark (added in this CL) shows -40% ns, -48% bytes, -63% allocs: Benchmark/bench_gauss-12 84 13648744 ns/op 3175816 B/op 105862 allocs/op (before) Benchmark/bench_gauss-12 55 24283703 ns/op 6119844 B/op 289862 allocs/op (after) On 32-bit machines, or those running a non-POSIX system, we continue to use the old representation. Change-Id: I2915a8f8abff18ab2eba2891c352700c3ab0d4c4
var smallints = reserveAddresses(1 << 32) | ||
|
||
func reserveAddresses(len int) uintptr { | ||
b, err := unix.Mmap(-1, 0, len, unix.PROT_READ, unix.MAP_PRIVATE|unix.MAP_ANONYMOUS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just meant NaN boxing as an analogy: using the dead bits of a large type (in this case unsafe.Pointer
) to store a smaller numeric value (int32
).
Fair enough on mmap
though. I'm still stuck in the dark ages of 32-bit where virtual address space was a scarce resource. :)
It turns out there's quite a lot of spare address space just sitting there awaiting one's creativity. This change only squandered 1 part in 2^32 of it. :) |
@alandonovan This is fun! Out of interest, do you have performance numbers on how much this helped on the targeted platforms? (at least for the microbenchmarks you wrote) |
Take a look at the commit message: 29194bd |
Aha, of course -- I was being too PR-centric. Thanks! |
No description provided.