Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync/atomic: CAS on ARM does not provide memory fence #7977

Closed
rui314 opened this issue May 12, 2014 · 11 comments
Closed

sync/atomic: CAS on ARM does not provide memory fence #7977

rui314 opened this issue May 12, 2014 · 11 comments
Milestone

Comments

@rui314
Copy link
Member

rui314 commented May 12, 2014

What does 'go version' print?
go version devel +2e591e82a8c8 Fri May 02 13:17:55 2014 -0700 + linux/amd64

What happened?
It's pointed out in https://golang.org/cl/91230048 that Go's CompareAndSwap
functions for ARM don't provide memory fence. As a result, on non-Linux OSes,
synchronization mechanisms including Mutex don't guarantee the Go memory model.

We need to add memory fence operations to CAS functions.
@minux
Copy link
Member

minux commented May 12, 2014

Comment 1:

We will need a way to test this.
BTW, we can test it on linux, because src/sync/atomic/export_linux_arm_test.go exported
generalCAS64 to tests on linux/arm, and we do have SMP linux machines.
If we could make a test, say, running in less than 30s, I think it's worthwhile to
enable it for
-short tests.

Labels changed: added release-go1.3maybe, repo-main.

Status changed to Accepted.

@rsc
Copy link
Contributor

rsc commented May 12, 2014

Comment 2:

See my comments on the CL. I don't believe this is worth the risk of fixing now. It
needs to come up in practice first.

Labels changed: added release-go1.4, removed release-go1.3maybe.

@dvyukov
Copy link
Member

dvyukov commented May 13, 2014

Comment 3:

FTR, a comment by minux from https://golang.org/cl/91230048/:
-----
src/pkg/runtime/asm_arm.s:579: BYTE $0xf3; BYTE $0xbf; BYTE $0x8f; BYTE $0x5b // DMB ISH
DMB is only supported on ARMv7.
for V6, we will need to use something like:
mcr p15, 0, REG, c7, c10, 5
for dmb (REG is an ARM register that must contain 0).

@minux
Copy link
Member

minux commented May 13, 2014

Comment 4:

I've verified that ARMv7 also supports the cp15 way of data memory barrier.
so for 1.4 we can use that unconditionally.

@dvyukov
Copy link
Member

dvyukov commented May 13, 2014

Comment 5:

Is there any performance difference? Do we care?
What is the minimal support ARM version for linux? Does cp15 work there?

@minux
Copy link
Member

minux commented May 13, 2014

Comment 6:

on linux, cas32 is always using kernel provided routine, so no problem there.
for cas64, we have a three way choice:
1. for modern ones (>=3.1), cas64 will be handled by the kernel provided routine,
2. if we are running on ARMv6 or above on older kernels, we will use the generic
LDREX/STREX
based implementation, and we need DMB in this case, at least in theory.
3. if we are running on ARMv5E on older kernel, use lock-based cas64 emulation from
runtime.
in that case, the memory barrier should be provided by the mutex?
on ARMv5E, cp15 isn't supported, but I don't think ARMv5 supports MP either.
on ARMv6+, cp15 is supported, and depending on the kernel SMP configuration, dmb in the
kernel provided cas64 primitive will either be replaced with a simple compiler memory
barrier
(asm (::"memory")), or it the real cp15 mcr instruction. So for SMP configurations,
choice 1 and 2
shouldn't have any performance differences.
Using DMB definitely have adverse performance impact on non-SMP ARMv6+ CPUs, but I guess
we can't do any better (actually i think whether the lack of DMB is a problem or not
depends on
the implementation of the cpu)

@rsc
Copy link
Contributor

rsc commented Sep 18, 2014

Comment 7:

Labels changed: added release-none, removed release-go1.4.

@crawshaw crawshaw modified the milestone: Go1.5 Mar 18, 2015
@crawshaw
Copy link
Member

crawshaw commented Apr 6, 2015

In attempting to swap out the darwin/arm builder for a 4th generation iPad (Apple A6X chip) I ran across this possibly-related error message:

fatal error: unexpected signal during runtime execution
[signal 0xb code=0x1 addr=0x0 pc=0x4035e9c]

runtime stack:
runtime.throw(0x417cb70, 0x2a)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/panic.go:543 +0x7c
runtime.sigpanic()
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/sigpanic_unix.go:12 +0x50
runtime.acquirep(0x0)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/proc1.go:2603 +0x54
runtime.stopm()
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/proc1.go:1028 +0x17c
runtime.findrunnable(0x50120a00)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/proc1.go:1360 +0x57c
runtime.schedule()
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/proc1.go:1459 +0x1d8
runtime.goschedImpl(0x501a8460)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/proc1.go:1533 +0xfc
runtime.gosched_m(0x501a8460)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/proc1.go:1541 +0x38
runtime.mcall(0x5018c000)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/asm_arm.s:178 +0x5c


goroutine 1 [chan receive]:
testing.RunTests(0x4195374,(lldb)  0x4204ca8, 0x2a, 0x2a, 0x1)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/testing/testing.go:561 +0x780
testing.(*M).Run(0x50116120, 0x4218328)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/testing/testing.go:490 +0x74
main.main()
        /var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/go-build399665451/sync/atomic/_test/_testmain.go:142 +0x1b4


goroutine 17 [syscall, locked to thread]:
runtime.goexit()
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/asm_arm.s:982 +0x4


goroutine 165 [chan receive]:
sync/atomic_test.TestStoreLoadRelAcq32(0x5011ce40)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/sync/atomic/atomic_test.go:1327 +0x218
testing.tRunner(0x5011ce40, 0x4204e40)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/testing/testing.go:452 +0xc8
created by testing.RunTests
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/testing/testing.go:560 +0x754


goroutine 167 [runnable]:
runtime.Gosched()
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/runtime/proc.go:157 +0x10
sync/atomic_test.TestStoreLoadRelAcq32.func1(0x1, 0x50118a20, 0x50224000, 0x5011ce40, 0x5011eec0)
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/sync/atomic/atomic_test.go:1314 +0xec
created by sync/atomic_test.TestStoreLoadRelAcq32
        /private/var/folders/q8/bf_4b1ts2zj0l7b0p1dv36lr0000gp/T/gobuilder/darwin-arm-crawshaw-dfc9e264d18a/go/src/sync/atomic/atomic_test.go:1325 +0x1ec

FAIL sync/atomic_test 21.858s

@rsc rsc added this to the Unplanned milestone Apr 10, 2015
@crawshaw
Copy link
Member

I'm a little worried about not fixing this for Go 1.5, because it may affect iOS.

This can be fixed for iOS in a similar way to the linux fix. Because iOS always implies cgo, we can pass the darwin OSAtomicCompareAndSwap32Barrier symbol into the runtime and use it the way we do the linux-provided CAS.

Does that seem reasonable?

@minux
Copy link
Member

minux commented May 6, 2015

Here is the disassembly for armv7s version of OSAtomicCompareAndSwap64Barrier:
_OSAtomicCompareAndSwap64Barrier:
2f9d4198 b580 push {r7, lr}
2f9d419a 466f mov r7, sp
2f9d419c f8d79008 ldr.w r9, [r7, #8]
2f9d41a0 e8d9ce7f ldrexd r12, lr, [r9]
2f9d41a4 ea8e0e01 eor.w lr, lr, r1
2f9d41a8 ea8c0c00 eor.w r12, r12, r0
2f9d41ac ea5c0c0e orrs.w r12, r12, lr
2f9d41b0 d10f bne 0x2f9d41d2
2f9d41b2 f3bf8f5a dmb ishst
2f9d41b6 e8c9237c strexd r12, r2, r3, [r9]
2f9d41ba f1bc0f00 cmp.w r12, #0x0
2f9d41be d00a beq 0x2f9d41d6
2f9d41c0 e8d9ce7f ldrexd r12, lr, [r9]
2f9d41c4 ea8e0e01 eor.w lr, lr, r1
2f9d41c8 ea8c0c00 eor.w r12, r12, r0
2f9d41cc ea5c0c0e orrs.w r12, r12, lr
2f9d41d0 d0f1 beq 0x2f9d41b6
2f9d41d2 2000 movs r0, #0x0
2f9d41d4 bd80 pop {r7, pc}
2f9d41d6 2001 movs r0, #0x1
2f9d41d8 f3bf8f5b dmb ish
2f9d41dc bd80 pop {r7, pc}
2f9d41de bf00 nop

I've verified that the arm64 version doesn't have barrier.

We can just add DMB instruction to our CAS implementation.

@gopherbot
Copy link
Contributor

CL https://golang.org/cl/12950 mentions this issue.

@rsc rsc closed this as completed in 4bd8040 Jul 30, 2015
@mikioh mikioh modified the milestones: Go1.5, Unplanned Jul 31, 2015
@golang golang locked and limited conversation to collaborators Aug 5, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants