runtime: lock-free channels #8899
What is the update on this?
Unfortunately the CL seems to have gone stale, @dvyukov, due to changes to the runtime's code.
I have implemented an extremely simple lock-free channel here; if there's interest, I'm willing to implement it in the runtime and submit a CL (and re-license it under the Go license, of course).
➜ go test -bench=. -benchmem -cpu 1,4,8,32 -benchtime 3s
# ch := NewSize(runtime.NumCPU())
BenchmarkLFChan 30000000 168 ns/op 40 B/op 4 allocs/op
BenchmarkLFChan-4 30000000 175 ns/op 45 B/op 4 allocs/op
BenchmarkLFChan-8 20000000 205 ns/op 45 B/op 4 allocs/op
BenchmarkLFChan-32 20000000 201 ns/op 45 B/op 4 allocs/op
# ch := make(chan interface{}, runtime.NumCPU())
BenchmarkChan 50000000 115 ns/op 8 B/op 1 allocs/op
BenchmarkChan-4 20000000 261 ns/op 8 B/op 1 allocs/op
BenchmarkChan-8 20000000 331 ns/op 8 B/op 1 allocs/op
BenchmarkChan-32 10000000 532 ns/op 8 B/op 1 allocs/op
PASS
ok github.com/OneOfOne/lfchan 51.663s
@OneOfOne The tricky part will be to support blocking on an empty/full chan and select statements.
@dvyukov I'll work on fixing points 2 and 3 and ping you back, and if I get your approval I'll try to figure out point 1.
Your benchmarks look good, but any change to channels has to work efficiently with select statements.
@dvyukov I updated the package and fixed all your notes. @ianlancetaylor I added Select support to the package.
➜ go test -bench=. -benchmem -cpu 1,4,8,32 -benchtime 3s
# ch := NewSize(100)
BenchmarkLFChan 20000000 292 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-4 20000000 202 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-8 30000000 161 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-32 20000000 215 ns/op 8 B/op 1 allocs/op
# ch := make(chan interface{}, 100)
BenchmarkChan 10000000 371 ns/op 8 B/op 1 allocs/op
BenchmarkChan-4 10000000 378 ns/op 8 B/op 1 allocs/op
BenchmarkChan-8 10000000 506 ns/op 8 B/op 1 allocs/op
BenchmarkChan-32 10000000 513 ns/op 8 B/op 1 allocs/op
I'll benchmark on up to 80 cores later today and post results.
On an 80-core Intel host:
$ go test -bench=. -benchmem -cpu 1,2,4,8,16,32,48,64,72,80 -benchtime 10s
PASS
BenchmarkLFChan 30000000 506 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-2 20000000 1107 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-4 10000000 1611 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-8 10000000 1710 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-16 10000000 2165 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-32 10000000 2192 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-48 10000000 2288 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-64 10000000 2354 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-72 5000000 2454 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-80 5000000 2554 ns/op 8 B/op 1 allocs/op
BenchmarkChan 20000000 768 ns/op 8 B/op 1 allocs/op
BenchmarkChan-2 20000000 1188 ns/op 8 B/op 1 allocs/op
BenchmarkChan-4 20000000 3215 ns/op 8 B/op 1 allocs/op
BenchmarkChan-8 5000000 3657 ns/op 8 B/op 1 allocs/op
BenchmarkChan-16 10000000 2734 ns/op 8 B/op 1 allocs/op
BenchmarkChan-32 5000000 2387 ns/op 8 B/op 1 allocs/op
BenchmarkChan-48 5000000 2448 ns/op 8 B/op 1 allocs/op
BenchmarkChan-64 5000000 2452 ns/op 8 B/op 1 allocs/op
BenchmarkChan-72 5000000 2552 ns/op 8 B/op 1 allocs/op
BenchmarkChan-80 5000000 2659 ns/op 8 B/op 1 allocs/op
ok github.com/OneOfOne/lfchan 436.003s
On a 48-core AMD host:
$ go test -bench=. -benchmem -cpu 1,2,4,8,16,32,48 -benchtime 10s
PASS
BenchmarkLFChan 20000000 734 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-2 10000000 1271 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-4 20000000 1140 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-8 20000000 1097 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-16 10000000 1257 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-32 10000000 1564 ns/op 8 B/op 1 allocs/op
BenchmarkLFChan-48 10000000 2172 ns/op 8 B/op 1 allocs/op
BenchmarkChan 20000000 937 ns/op 8 B/op 1 allocs/op
BenchmarkChan-2 20000000 1147 ns/op 8 B/op 1 allocs/op
BenchmarkChan-4 10000000 1721 ns/op 8 B/op 1 allocs/op
BenchmarkChan-8 10000000 2372 ns/op 8 B/op 1 allocs/op
BenchmarkChan-16 10000000 2349 ns/op 8 B/op 1 allocs/op
BenchmarkChan-32 5000000 2295 ns/op 8 B/op 1 allocs/op
BenchmarkChan-48 5000000 2753 ns/op 8 B/op 1 allocs/op
ok github.com/OneOfOne/lfchan 276.712s
@jonhoo thanks for running the benchmarks. It is rather odd how different the numbers are; I wonder if it has something to do with having multiple CPU sockets.
@OneOfOne These machines are also somewhat older, so each core is slower than what you'd find in a laptop nowadays.
Any idea why adding HW threads doesn't result in increased parallelism?
@RLH: see the linked pages for hardware descriptions. I'm not pinning threads, but I never run with
@OneOfOne There are several problems with this implementation:
@dvyukov Point 3 is fixed with OneOfOne/lfchan@bdddd90.
Have you tried to write a SPIN model for your implementation? The runtime used to have a SPIN model for the scheduler, but it got out of date pretty quickly. If we add more lock-free stuff to the runtime, I'd suggest we start with a formal model and strive to keep it always in sync with the source code.
@minux I'm not sure what a SPIN model is.
I believe the benchmarks in https://github.com/OneOfOne/lfchan/blob/master/chan_test.go are not benchmarking what you want. I think a better approach would be to use one reader and one writer goroutine and let RunParallel handle the concurrency level for you; also avoid using atomic operations and wait groups. I also want to point out that the following result shows that the benchmarked operation became
For a lock-free implementation, this code sure does spend a long time in
@jonhoo I fixed that; would you be so kind as to rerun the benchmarks? I simplified them per @kostya-sh's suggestion and removed the GOMAXPROCS call. @dvyukov pointed me in the right direction with his
This bug was originally about lock-free channels in the runtime. All the recent discussion is about an external package. Can we move the discussion elsewhere on GitHub (e.g. https://github.com/OneOfOne/lfchan/issues) so I can remain subscribed to this bug for its original purpose?
Created an issue over at OneOfOne/lfchan#3. Will continue posting benchmark results there.
@OneOfOne The queue is still not FIFO after 2f9d3a217eadbc557e758649fa153dd59ff14c11, and len can still return -1. You can block and unblock goroutines without runtime support: sync.Mutex+Cond can block/unblock goroutines. You can stub the runtime scheduler by implementing two functions on top of Mutex+Cond: park(g *G) and unpark(g *G).
@jonhoo how did you generate the plot?
@dvyukov ping?
After the recent fairness changes in channels, my algorithm does not apply per se; it does not have the fairness properties that the current code provides.
@dvyukov any plans to revisit this in the future?
No particular plans.
What are the fairness properties this implementation is missing?
@stjepang I've lost all context already. What implementation do you mean? Why do you think there are fairness problems?
@dvyukov In order to improve the performance of Go's channels, you designed almost-lock-free channels and wrote this implementation. Now it's apparently abandoned, as you commented:
I was wondering what the fairness changes are that make your algorithm inapplicable. Why exactly didn't your algorithm make it as the official Go channel implementation? What problems with it still need to be solved?
I mean this change:
It was constantly deprioritized relative to other runtime changes, so it constantly got outdated and I wasn't able to keep updating it. As far as I remember, it all worked. Now the major problem is rebasing it onto the current runtime code. We also need to decide what to do with respect to the fairness change 16740.
@dvyukov, I might be wrong, but I've noticed you have not been as active as before. Is this a result of your busy schedule, did you lose the zeal after multiple rejections of some of your proposals, or are you working on a secret project we can't know about yet? :)
@olekukonko working on a non-secret project you can know about: https://github.com/google/syzkaller
@dvyukov, the project looks interesting; I will be experimenting with it. Thanks for the prompt response. Regards.
Are there any plans to revisit this?
2021 now. Hopefully someday we'll see this implemented 😄
Candidate -> https://github.com/alphadose/ZenQ |