sync: Pool example suggests incorrect usage #23199
Comments
I should also note that if #22950 is done, then usages like this will cause large buffers to be pinned forever, since this example has a steady state of …
Here's an even worse situation than earlier (suggested by @bcmills):

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	pool := sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}
	processRequest := func(size int) {
		b := pool.Get().(*bytes.Buffer)
		time.Sleep(500 * time.Millisecond) // Simulate processing time
		b.Grow(size)
		pool.Put(b)
		time.Sleep(1 * time.Millisecond) // Simulate idle time
	}

	// Simulate a steady stream of infrequent large requests.
	go func() {
		for {
			processRequest(1 << 28) // 256MiB
		}
	}()

	// Simulate a storm of small requests.
	for i := 0; i < 1000; i++ {
		go func() {
			for {
				processRequest(1 << 10) // 1KiB
			}
		}()
	}

	// Continually run a GC and track the allocated bytes.
	var stats runtime.MemStats
	for i := 0; ; i++ {
		runtime.ReadMemStats(&stats)
		fmt.Printf("Cycle %d: %dB\n", i, stats.Alloc)
		time.Sleep(time.Second)
		runtime.GC()
	}
}
```

Rather than a single one-off large request, let there be a steady stream of occasional large requests intermixed with a large number of small requests. As this snippet runs, the heap keeps growing over time. The large requests are "poisoning" the pool, such that most of the small requests eventually pin a large-capacity buffer under the hood.
Yikes. My goal in adding the example was to try to show the easiest-to-understand use case for a Pool.
The solution is of course to put only buffers with small byte slices back into the pool:

```go
if b.Cap() <= 1<<12 {
	pool.Put(b)
}
```
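A slightly fuller sketch of that approach, wrapping the pool behind helpers (the `getBuffer`/`putBuffer` names and the 1<<12 threshold are illustrative, not from the original comment):

```go
package bufpool

import (
	"bytes"
	"sync"
)

var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

func getBuffer() *bytes.Buffer {
	return bufPool.Get().(*bytes.Buffer)
}

// putBuffer returns small buffers to the pool and drops large ones, so a
// rare large request cannot pin a huge backing array inside the pool.
func putBuffer(b *bytes.Buffer) {
	if b.Cap() > 1<<12 { // ~4 KiB cap limit; tune for the workload
		return // let the GC reclaim the oversized buffer
	}
	b.Reset()
	bufPool.Put(b)
}
```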
Alternatively, you could use an array of …
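That truncated suggestion presumably refers to keeping an array of pools bucketed by size class; a minimal sketch under that assumption (the size classes and names are made up for illustration):

```go
package sizedpool

import "sync"

// pools[i] holds []byte buffers with capacity 1 << (10 + i): 1 KiB, 2 KiB, ..., 1 MiB.
var pools [11]sync.Pool

func poolIndex(size int) int {
	for i := range pools {
		if size <= 1<<(10+i) {
			return i
		}
	}
	return -1 // larger than any size class
}

// Get returns a zero-length buffer with capacity >= size.
func Get(size int) []byte {
	i := poolIndex(size)
	if i < 0 {
		return make([]byte, 0, size) // too big to pool; allocate directly
	}
	if b, ok := pools[i].Get().([]byte); ok {
		return b[:0]
	}
	return make([]byte, 0, 1<<(10+i))
}

// Put returns a buffer to the pool matching its capacity, if there is one.
// Buffers that do not match a size class exactly are simply dropped, so no
// pool ever accumulates elements of a different size than it hands out.
func Put(b []byte) {
	for i := range pools {
		if cap(b) == 1<<(10+i) {
			pools[i].Put(b[:0])
			return
		}
	}
}
```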
There are many possible solutions: the important thing is to apply one of them. A related problem can arise with goroutine stacks in conjunction with “worker pools”, depending on when and how often the runtime reclaims large stacks. (IIRC that has changed several times over the lifetime of the Go runtime, so I'm not sure what the current behavior is.) If you have a pool of worker goroutines executing callbacks that can vary significantly in stack usage, you can end up with all of the workers consuming very large stacks even if the overall fraction of large tasks remains very low.
Do you have any suggestions for better use cases we could include in the example that are reasonably compact? Maybe the solution is not to recommend a sync.Pool at all anymore? This is my understanding from a comment I read about how the GC makes this more or less useless.
Would changing the example to use an array (fixed size) rather than a slice solve this problem?
I legitimately don't think there ever was a time to generally recommend …

Sorry to interject randomly, but I saw this thread on Twitter and have strong opinions on this feature.
We would certainly like to get to this point, and the GC has improved a lot, but for high-churn allocations with obvious lifetimes and no need for zeroing, …
That's clearly true, but even right now it's partly by chance that these examples are eventually dropping the large buffers. And in the more realistic stochastic mix example, it's not clear to me that #22950 would make it any better or worse. I agree with @dsnet's original point that we should document that …
We've used sync.Pool for dealing with network packets, and others (such as lucas-clemente/quic-go) do too, because for those use cases you gain performance. However, in those cases the pooled buffers all have the same fixed maximum size. We did the same even for structs, such as when parsing packets into a struct.
I guess a minimal example of such a usage would be:

```go
package main

import (
	"net"
	"sync"
)

const MaxPacketSize = 4096

var bufPool = sync.Pool{
	New: func() interface{} {
		return make([]byte, MaxPacketSize)
	},
}

func process(outChan chan []byte) {
	for data := range outChan {
		// process data

		// Re-slice to maximum capacity and return it
		// for re-use. This is important to guarantee that
		// all calls to Get() will return a buffer of
		// length MaxPacketSize.
		bufPool.Put(data[:MaxPacketSize])
	}
}

func reader(conn net.PacketConn, outChan chan []byte) {
	for {
		data := bufPool.Get().([]byte)
		n, _, err := conn.ReadFrom(data)
		if err != nil {
			break
		}
		outChan <- data[:n]
	}
	close(outChan)
}

func main() {
	N := 3
	var wg sync.WaitGroup

	outChan := make(chan []byte, N)
	wg.Add(N)
	for i := 0; i < N; i++ {
		go func() {
			process(outChan)
			wg.Done()
		}()
	}

	wg.Add(1)
	conn, err := net.ListenPacket("udp", "localhost:10001")
	if err != nil {
		panic(err.Error())
	}
	go func() {
		reader(conn, outChan)
		wg.Done()
	}()

	wg.Wait()
}
```

but of course... whether this is going to be faster than just relying on the GC depends on how many packets per second you have to deal with, what exactly you do with the data, and so on. In the real world you'd benchmark both with and without the sync.Pool and compare the two. At the time we wrote our code, a significant amount of time was spent allocating new buffers, and with a scheme like the above we managed to increase the throughput. Of course, one should re-benchmark this with every update to the GC.
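As a rough illustration of that kind of comparison, a benchmark along these lines could go in a `_test.go` file (the names and the simulated work are made up; a real benchmark would exercise the actual packet-processing path):

```go
package main

import (
	"sync"
	"testing"
)

const maxPacketSize = 4096

var pktPool = sync.Pool{
	New: func() interface{} { return make([]byte, maxPacketSize) },
}

// consume stands in for whatever real work is done with a packet.
func consume(b []byte) {
	for i := range b {
		b[i] = byte(i)
	}
}

// BenchmarkAlloc allocates a fresh buffer per packet and lets the GC reclaim it.
func BenchmarkAlloc(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			buf := make([]byte, maxPacketSize)
			consume(buf)
		}
	})
}

// BenchmarkPool reuses fixed-size buffers through a sync.Pool.
func BenchmarkPool(b *testing.B) {
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			buf := pktPool.Get().([]byte)
			consume(buf)
			pktPool.Put(buf) // storing a slice (not a pointer) allocates a header; fine for a sketch
		}
	})
}
```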
Change https://golang.org/cl/136035 mentions this issue.
Change https://golang.org/cl/136115 mentions this issue.
Change https://golang.org/cl/136116 mentions this issue.
Something about using raw …

It is difficult to use raw …
This uses a sync.Pool wrapper (called pool.Dynamic) that prevents the pool from holding on to very large buffers indefinitely, while still amortizing the cost of allocation. The policy appears to give good results both with the pre-existing tests and the specific tests added for the pool. This is useful because the library is also used from long-running server contexts, where it would be unfortunate to pin very large buffers for too long. See golang/go#23199. Example algorithm run (from the test): ``` num allocs value target capacity 1 1 100000 100000.000000 100000 2 1 1 52048.000000 100000 3 1 1 28072.000000 100000 4 1 1 16084.000000 100000 5 1 1 10090.000000 100000 6 1 1 7093.000000 100000 7 2 10 5594.500000 4096 8 2 1 4845.250000 4096 9 2 1 4470.625000 4096 10 2 1 4283.312500 4096 11 2 1 4189.656250 4096 12 2 1 4142.828125 4096 13 2 1 4119.414062 4096 14 2 1 4107.707031 4096 15 2 12 4101.853516 4096 16 2 1 4098.926758 4096 17 2 1 4097.463379 4096 18 2 1 4096.731689 4096 19 2 1 4096.365845 4096 20 2 1 4096.182922 4096 21 2 1 4096.091461 4096 22 2 1 4096.045731 4096 23 2 1000 4096.022865 4096 24 2 100 4096.011433 4096 25 3 10000 10000.000000 10000 26 4 100000 100000.000000 100000 27 4 1 52048.000000 100000 28 4 100000 100000.000000 100000 29 4 1 52048.000000 100000 30 4 50000 51024.000000 100000 31 4 1 27560.000000 100000 32 4 1 15828.000000 100000 33 4 25000 25000.000000 100000 34 4 1 14548.000000 100000 35 4 1 9322.000000 100000 36 5 1 6709.000000 4096 37 6 100000 100000.000000 100000 38 6 1 52048.000000 100000 39 6 1 28072.000000 100000 40 6 1 16084.000000 100000 41 6 1 10090.000000 100000 42 6 1 7093.000000 100000 43 7 1 5594.500000 4096 44 7 1 4845.250000 4096 45 7 1 4470.625000 4096 46 7 1 4283.312500 4096 47 7 100 4189.656250 4096 48 7 100 4142.828125 4096 49 7 100 4119.414062 4096 50 7 1 4107.707031 4096 51 7 1 4101.853516 4096 52 7 1 4098.926758 4096 53 7 1 4097.463379 4096 54 7 1 4096.731689 4096 55 7 100 4096.365845 4096 56 7 200 4096.182922 4096 57 7 300 4096.091461 4096 58 7 100 4096.045731 4096 59 7 50 4096.022865 4096 60 7 50 4096.011433 4096 61 7 50 4096.005716 4096 62 7 50 4096.002858 4096 63 7 50 4096.001429 4096 64 7 1 4096.000715 4096 65 7 1 4096.000357 4096 66 7 1 4096.000179 4096 67 7 1 4096.000089 4096 68 8 100000000 100000000.000000 100000000 69 8 1000000 50500000.000000 100000000 70 8 100000 25300000.000000 100000000 71 8 10000 12655000.000000 100000000 72 8 1000 6329548.000000 100000000 73 9 100 3166822.000000 4096 74 9 10 1585459.000000 4096 75 9 1 794777.500000 4096 76 9 1 399436.750000 4096 77 9 500 201766.375000 4096 78 9 2020 102931.187500 4096 79 9 400 53513.593750 4096 80 9 3984 28804.796875 4096 81 9 5 16450.398438 4096 82 9 200 10273.199219 4096 83 9 500 7184.599609 4096 84 10 40000 40000.000000 40000 85 10 35000 37500.000000 40000 86 11 45000 45000.000000 45000 87 11 42000 43500.000000 45000 88 11 38000 40750.000000 45000 89 11 38000 39375.000000 45000 90 11 39000 39187.500000 45000 91 11 41000 41000.000000 45000 92 11 42000 42000.000000 45000 93 11 42000 42000.000000 45000 94 11 2000 23048.000000 45000 95 11 4000 13572.000000 45000 96 11 3949 8834.000000 45000 97 11 2011 6465.000000 45000 98 11 4096 5280.500000 45000 99 11 33 4688.250000 45000 100 11 0 4392.125000 45000 101 12 4938 4938.000000 4938 102 12 1 4517.000000 4938 103 12 1 4306.500000 4938 104 12 1200 4201.250000 4938 105 12 2400 4148.625000 4938 106 12 1200 4122.312500 4938 107 12 200 4109.156250 4938 108 12 400 4102.578125 4938 109 12 600 4099.289062 4938 110 12 700 4097.644531 4938 111 12 
100 4096.822266 4938 112 12 400 4096.411133 4938 113 12 500 4096.205566 4938 114 12 700 4096.102783 4938 115 12 600 4096.051392 4938 116 12 900 4096.025696 4938 117 12 1000 4096.012848 4938 118 12 1100 4096.006424 4938 119 12 1200 4096.003212 4938 120 12 1000 4096.001606 4938 ``` Benchmarks also show that the pool does retain the buffer, as performance is not worsened over the previous commit: ``` $ git checkout main TMPDIR="$HOME/tmp/tmpdir" mkdir "$TMPDIR" || true for file in jar/testdata/* ; do RTMPDIR="$TMPDIR/$(basename $file)" mkdir "$RTMPDIR" || true ln -fv "$PWD/$file" "$RTMPDIR" done for commit in $(git log --pretty=oneline | head -5 | awk '{print $1}' | tac) ; do git checkout $commit go build hyperfine --ignore-failure --warmup 1 "./log4jscanner $TMPDIR/400mb_jar_in_jar.jar" rm log4jscanner done HEAD is now at 48d70bf jar: add benchmarks with 400mb_jar_in_jar.jar Time (mean ± σ): 2.026 s ± 0.324 s [User: 2.363 s, System: 1.269 s] Range (min … max): 1.651 s … 2.749 s 10 runs HEAD is now at bf524fa jar: close the zip.File reader before recursing Time (mean ± σ): 1.908 s ± 0.297 s [User: 2.084 s, System: 1.218 s] Range (min … max): 1.502 s … 2.567 s 10 runs HEAD is now at 4b23cd3 jar: prefer io.ReadFull over io.ReadAll Time (mean ± σ): 445.9 ms ± 51.2 ms [User: 401.7 ms, System: 79.9 ms] Range (min … max): 386.3 ms … 566.1 ms 10 runs HEAD is now at 37376ef jar: reuse buffers for nested .jar's Time (mean ± σ): 464.5 ms ± 41.8 ms [User: 420.5 ms, System: 93.7 ms] Range (min … max): 409.2 ms … 545.5 ms 10 runs HEAD is now at c17a81b jar: do not keep large buffers unnecessarily Time (mean ± σ): 436.1 ms ± 26.2 ms [User: 409.5 ms, System: 77.6 ms] Range (min … max): 390.2 ms … 472.7 ms 10 runs ```
Change https://go.dev/cl/455236 mentions this issue.
Change https://go.dev/cl/464344 mentions this issue.
Logger.Log methods are called in a highly concurrent manner in many servers. Acquiring a lock in each method call results in high lock contention, especially since each lock covers a non-trivial amount of work (e.g., formatting the header and outputting to the writer).

Changes made:

* Modify the Logger to use atomics so that the header formatting can be moved out of the critical lock section. Acquiring the flags does not occur in the same critical section as outputting to the underlying writer, so this introduces a very slight consistency instability where concurrently calling multiple Logger.Output along with Logger.SetFlags may cause the older flags to be used by some ongoing Logger.Output calls even after Logger.SetFlags has returned.
* Use a sync.Pool to buffer the intermediate buffer. This approach is identical to how fmt does it, with the same max cap mitigation for #23199.
* Only protect outputting to the underlying writer with a lock to ensure serialized ordering of output.

Performance:

```
name           old time/op  new time/op  delta
Concurrent-24  19.9µs ± 2%  8.3µs ± 1%   -58.37%  (p=0.000 n=10+10)
```

Updates #19438

Change-Id: I091beb7431d8661976a6c01cdb0d145e37fe3d22
Reviewed-on: https://go-review.googlesource.com/c/go/+/464344
TryBot-Result: Gopher Robot <[email protected]>
Run-TryBot: Joseph Tsai <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
Reviewed-by: Bryan Mills <[email protected]>
Change https://go.dev/cl/471200 mentions this issue.
I think one issue is that …
So I believe capping the length or capacity of a slice does not actually reduce the size of the underlying backing array, so this would not actually save any more memory than stuffing the whole buffer in anyway. The only way to truly shrink the memory usage of a slice would be to reallocate it at a smaller size and then copy the data into… but at that point, in this particular use case, we would be better off just dropping the whole slice on the ground and letting the GC clean it up.
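A small self-contained illustration of that point (the sizes are arbitrary): re-slicing keeps the original backing array reachable, so only allocating a fresh, smaller slice and copying actually releases the large allocation.

```go
package main

import "fmt"

func main() {
	big := make([]byte, 1<<20) // 1 MiB backing array

	// Capping the length (or even the capacity, with a 3-index slice) does
	// not shrink the allocation: the small slice still pins all 1 MiB.
	capped := big[:10:10]
	fmt.Println(len(capped), cap(capped)) // 10 10, but 1 MiB remains live

	// To actually release the memory, allocate a smaller array, copy into
	// it, and drop every reference to the big one.
	small := make([]byte, 10)
	copy(small, big[:10])
	big, capped = nil, nil
	fmt.Println(len(small), cap(small)) // 10 10, and the 1 MiB can now be collected
}
```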
For the buffer-specific problem, it would be great to support a new Buffer type with multiple underlying (same-capacity) slices, so that we can instead put the underlying slices in the Pool.
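A rough sketch of that idea (entirely hypothetical, not an existing or proposed API): a write-only buffer built from fixed-size chunks, where every pooled element has the same capacity.

```go
package chunkbuf

import "sync"

const chunkSize = 4096

var chunkPool = sync.Pool{
	New: func() interface{} { return new([chunkSize]byte) },
}

// Buffer accumulates bytes in fixed-size pooled chunks, so releasing it
// never requires the pool to hold one huge contiguous array.
type Buffer struct {
	chunks []*[chunkSize]byte
	n      int // bytes used in the last chunk
}

func (b *Buffer) Write(p []byte) (int, error) {
	written := len(p)
	for len(p) > 0 {
		if len(b.chunks) == 0 || b.n == chunkSize {
			b.chunks = append(b.chunks, chunkPool.Get().(*[chunkSize]byte))
			b.n = 0
		}
		last := b.chunks[len(b.chunks)-1]
		c := copy(last[b.n:], p)
		b.n += c
		p = p[c:]
	}
	return written, nil
}

// Release returns every chunk to the pool. All chunks have identical
// capacity, so the pool never ends up pinning oversized elements.
func (b *Buffer) Release() {
	for _, c := range b.chunks {
		chunkPool.Put(c)
	}
	b.chunks, b.n = nil, 0
}
```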
The operation of `sync.Pool` assumes that the memory cost of each element is approximately the same in order to be efficient. This property can be seen by the fact that `Pool.Get` returns you a random element, and not the one that has "the greatest capacity" or what not. In other words, from the perspective of the `Pool`, all elements are more or less the same.

However, the `Pool` example stores `bytes.Buffer` objects, which have an underlying `[]byte` of varying capacity depending on how much of the buffer is actually used.

Dynamically growing an unbounded buffer can cause a large amount of memory to be pinned and never be freed in a live-lock situation. Consider the following:
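A minimal sketch of the scenario being described, assuming an initial burst of ten 256 MiB requests (2.5 GiB total) followed by a steady stream of 1 KiB requests; the exact shape of the original snippet may have differed.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	pool := sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}
	processRequest := func(size int) {
		b := pool.Get().(*bytes.Buffer)
		time.Sleep(500 * time.Millisecond) // Simulate processing time
		b.Grow(size)
		pool.Put(b)
	}

	// An initial burst of large requests...
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			processRequest(1 << 28) // 256 MiB
		}()
	}
	wg.Wait()

	// ...followed by a steady stream of small requests.
	for i := 0; i < 10; i++ {
		go func() {
			for {
				processRequest(1 << 10) // 1 KiB
			}
		}()
	}

	// Run a GC each second and report the allocated bytes.
	var stats runtime.MemStats
	for i := 0; ; i++ {
		runtime.ReadMemStats(&stats)
		fmt.Printf("Cycle %d: %dB\n", i, stats.Alloc)
		time.Sleep(time.Second)
		runtime.GC()
	}
}
```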
Depending on timing, the above snippet takes around 35 GC cycles for the initial set of large requests (2.5GiB) to finally be freed, even though each of the subsequent writes only uses around 1KiB. This can happen in a real server handling lots of small requests, where large buffers allocated by some prior request end up being pinned for a long time, since they are not in the `Pool` long enough to be collected.

The example claims to be based on `fmt` usage, but I'm not convinced that `fmt`'s usage is correct. It is susceptible to the live-lock problem described above. I suspect this hasn't been an issue in most real programs since `fmt.PrintX` is typically not used to write very large strings. However, other applications of `sync.Pool` may certainly have this issue.

I suggest we fix the example to store elements of fixed size and document this.
\cc @kevinburke @LMMilewski @bcmills