-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/cgo: specify rules for passing pointers between Go and C #12416
Comments
This addresses pretty much all of my concerns about being able to build plumbing from Go to native libraries or syscalls. Rule #3 is potentially a little problematic (imagine calling into a USB read syscall for a device that simply does not return data for a loooong time), but can be worked around (periodic timeout and retry, two part operation w/ select + read, etc), without too much suffering. |
Presumably this requires either a) a GC that doesn't move objects; or b) a form or (automatic, via CGO) pinning of Go pointers passed to C. |
@swetland - I've harassed Ian enough on this issue already, but the workaround for Rule number 3 is to allocate memory with malloc, and let that pointer leak into the Go code instead of a Go pointer leaking into C. The GC will tolerate it, does not need to worry about collecting it, and you can use a finalizer on the Go object encapsulating the state of the USB device to ensure that if the object is not freed by close, then it is freed by the finalizer. This is the idiom I've used in the past, and it's the one I recommend until you have performance problems that would justify trying something else, and my first answer for the first round of performance problems is "we need to improve the inliner" so that encapsulation of the malloc'd buffer is a zero-cost abstraction. |
@ianlancetaylor For the first rule, can you clarify what "Go code may pass a Go pointer to C provided that the Go memory to which it points does not contain any Go pointers." means and perhaps some further rationale behind this restriction? I understand why C code mustn't mutate Go pointers, but there seems to be a significant gap between those two restrictions. For example, in the code below, is this rule satisfied or violated by passing a pointer to a pointer-free subregion of a Go struct? If it's violated, do you propose the toolchain should reject this still? If so, how?
Also, I assume by "Go pointers" you actually mean to include all pointer-like types (i.e., strings, slices, functions, interfaces, maps, channels)? |
I think your fill example satisfies the rules.
The rule seems to forbid passing a *N to C function, but I agree the
rule need more clarification as to what constitute a Go pointer. For
example, what if the N struct is declared as:
type N struct {
Next unsafe.Pointer
Buf [1024]byte
}
and the programmer guarantee that Next pointer will be pointing to
C.malloc'ed memory?
|
Thanks for starting the work! That said, some aspects are still not very clear to me -- I'd be very grateful for some clarifications. I'm especially interested from the point of view of the "Windows world", i.e. interacting with WinAPI "syscalls" & some other custom DLLs:
I'd be really very grateful if answers to those questions could be incorporated into this proposal... |
@griesemer Yes. If the GC requires notification about pinned pointers, cgo will generate the calls to pin them. |
Likewise if the GC needs to know when an object is no longer pinned, cgo On Mon, Aug 31, 2015 at 3:11 PM, Ian Lance Taylor [email protected]
|
If at some point we have a moving GC, then the GC will need to update all Go pointers to point to their new location. We can work around that for a Go pointer explicitly passed to C code, by temporarily pinning the pointer. However, if the memory to which that Go pointer points itself holds pointers, then we must either pin those pointers too or we must modify them while the C code is running. The former is difficult and causes pinning to become a transitive operation. The latter is racy. The only way this restriction could be a problem would be if Go code wanted to pass a complex data structure into C code, where the data structure is allocated by Go code in Go memory. It would have to use C types, of course, or the C code wouldn't be able to use it. That is not a case that seems likely to arise often, and in particular it seems often best to allocate these complex data structures using C memory via C.malloc. So I think it's preferable to use a relatively simple rule here, rather than try to develop a more complex set of restrictions. This proposal aims to find out whether people can live with the rule. Your code example satisfies the rule. The Go memory to which the pointer points is the [1024]byte.
Yes. |
@minux Your example is fine. I tried to clearly define, for purposes of this proposal, a Go pointer as a pointer into memory allocated by the Go allocator. It is a purely dynamic concept that has nothing to do with types. The cgo part of the proposal has to do with types, but that is not the actual rules about what is permitted. |
Clearly the syscall package has to consider the same set of issues, but these rules are intended to apply to cgo. We control all the entry points to the syscall package, so if appropriate we can do something different. I think that for Windows we essentially need to define for each entry point what is permitted. For example, currently on Unix the function syscall.ForkExec takes a pointer to a syscall.ProcAttr and that structure is permitted to contain Go pointers. This is fine, and it demonstrates that the syscall functions are not required to precisely follow the rules for cgo.
What matters for this proposal is whether the value is a pointer into Go memory. The actual type of the value does not matter.
One goal of this proposal is to avoid any documented rules for pinning pointers. Sure, this may be implemented by pinning internally, but I don't think we want to make pinning pointers part of the Go runtime API.
What matters is the values stored in this uintptr fields. Update: some of my above comments were incorrect. A Go value with type uintptr is not going to be treated as a Go pointer. While the type of the pointer doesn't matter, it does have to be a pointer. |
On Mon, Aug 31, 2015 at 3:22 PM, Ian Lance Taylor [email protected]
|
@ianlancetaylor So it sounds like the rationale then is you want to do direct pinning without transitive pinning, and transitive pinning is necessary if C code is going to access Go pointers in Go memory. Your proposed solution seems to try to use the C type system to prevent C code from accessing Go pointers in Go memory. For completeness, I see another solution of simply ruling that C code may not access Go pointers in Go memory. I.e., if a *N pointer is passed to C code, the C code may freely read/write Buf, but not Next. Instead, the C code would need to call into a Go helper if it wants to access the pointers. (Analogously, Go code needs to call into C helpers to access unusual C struct members like bit fields.) |
@minux Yes, we could add that dynamic check. I can't decide how much it would help. To be safe, it would have to consider the type of the argument, so that we don't give an error for *float64 where the float value happens to look like a pointer. But then a Go pointer converted to uintptr will slip through. So it becomes another partial check. It might still be worth doing, of course. |
@mdempsky I think I would say that we want to put the smallest possible number of restrictions on future garbage collection work. Prohibiting passing any Go pointers into C is too severe and prevents too many reasonable coding patterns. This is the minimal next step. To be very pedantic, the proposed solution is the set of rules listed in the issue. The cgo work is an attempt to detect violations of those rules at build time.
I don't see any technical problems with this. But it seems easier to get wrong, and doesn't seem any simpler. |
IIUC, the rule stating that one should not store pointers to (I am not complaining, I am just trying to make sure I correctly understand the proposal) does this mean that the "only" way to write extension modules (for CPython, C-ruby, C-shiny-interpreter) which usually involves wrapping a |
@sbinet Yes, I think you are correct. Sorry. Serialization is one approach, but it is not the only approach. Another approach is a map[int]interface{}. For each Go value you need to pass to Python, increment a counter to get a token value, and store the Go value in the map at that location. When calling back into Go, pass the token value. The Go side uses the token to look into the map to fetch the value. When you release the Python value, tell the Go side to delete the map entry. This mechanism is probably simplest if the Python code never needs to examine the Go value. |
couldn't this be implemented as some kind of C-api on top of |
Is the map a global? That would imply contention and/or synchronization. We don't have a notion of thread-local/Goroutine-local storage in Go, do we? This would also need to survive a transition from Go to C to Go to make this work. |
On Mon, Aug 31, 2015 at 6:02 PM, dr2chase [email protected] wrote:
|
I'm sorry, I don't understand what you are asking. For an example of the map approach, see https://github.com/swig/swig/blob/master/Lib/go/goruntime.swg#L408 . |
Design doc at https://go-review.googlesource.com/14112 . It incorporates some of the above discussion. |
CL https://golang.org/cl/14112 mentions this issue. |
Thanks for some clarifications, and mention of syscall in the doc.
Would those per-entry-point "detailed" features be available to third-party library/package writers? I believe there's significant value in being able to call WinAPI functions from pure Go code, which then can be just "go get", without need for a whole MinGW toolchain installed and well-configured (and necessarily from some specific, this-month-fashionable fork). And I believe there are third-party libraries making use of that, and not only on Windows but I believe on Linux too. Not to mention golang.org/x/sys, I suppose? And, right, WinAPI calls do love to receive pointers to structs with pointers (including ASCIIZ strings), that's quite true.
I'm not sure I understand still. In Go ~1.4 the boundary between what to GC was a "pointer" vs. "not a pointer, just an int, even if same bit pattern" seemed to be placed quite explicitly between unsafe.Pointer and uintptr. From the above I seem to understand this may possibly change? That said, from other comments I'm starting to understand the specifics here are maybe expected to be fleshed out later (or do I misunderstand?) |
I don't know. The syscall and x/sys packages can provide functions to handle structs with pointers, doing things like copying the pointers into non-garbage-collected memory. I think it would be fine to make something like that generally available. I think we have to be careful about what we make generally available, to avoid overly constraining future GC work.
There is not any intent to change the garbage collector behaviour. This proposal is about the very specific problem of how to safely pass a Go pointer to C code. You're right: I was wrong about the types. A value of type uintptr is not a pointer. You are already in potential trouble if you convert a Go pointer to uintptr. This proposal won't make that worse or better. The type of a Go pointer doesn't matter, but it does have to be some sort of pointer type. |
Is there a working example of passing callback function into a C code? It seems that example from the wiki isn't working anymore. |
@idavydov please see https://golang.org/wiki/Questions |
This reverts commit e4d361c. The above commit is not working properly as it will convert wrongly characters. Seem to be caused by improper use pointer because iconv expecting pointer to pointer which it seem to allocate the memory internally. This allocated memory is not being passed out to golang side as the converted characters is not visible to golang side. This is a quick fix before proper inspection should be done on iconv pkg. However, without that commit mean that this pkg can not support go1.6 and above as document here: golang/go#12416
Discussion at: golang/go#12416 https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md Issues with the Go memory management code were hacked around by adding a new check that ensured that pointers to memory managed by Go are never passed to cgo code, since one can not be sure how and where pointers to Go memory are being stored in the C code. As a result, the memory must be manually malloc'd and free'd by cgo, even if it's a Go object in the C memory. The userdata fix makes this change explicit, by running a malloc and free using cgo, to explicitly control the memory management.
Re-implement the cgo wrappers for flate, bzip2, and brotli to properly follow the C-to-Go pointer rules outlined in (golang/go#12416). Also, move the wrappers to seperate packages and update the bench tool to use the new wrappers.
Issues #12116 and #8310 discuss the problem of when it is OK to pass pointers between Go and C code when using cgo. This proposal is a specific set of rules for when it is OK, plus some modifications to cgo to partially enforce the rules.
Definitions: a Go pointer is a pointer to memory allocated by the Go runtime. A C pointer is a pointer to memory not allocated by the Go runtime, such as by C.malloc.
The proposed rules are:
The proposal is that we declare that all Go programs that adhere to those rules are, and will remain, valid, under the Go 1 compatibility guarantee. Go programs that do not adhere to these rules may break today or in future releases.
We further propose some modifications to cgo that will help enforce these rules. The cgo changes are not the rules; the rules are as above. The cgo changes are intended to make it easier to detect, at build time, programs that violate the rules.
These cgo rules are more strict than the rules about passing pointers. They will prohibit cases that are permitted. It is also possible to work around them, even without using unsafe, and write code that is not blocked by cgo but that violates the above rules. I will provide examples in a separate design doc.
The text was updated successfully, but these errors were encountered: