protoc-gen-go: unexport XXX_ fields from generated types #276
Comments
(Note that the accessor code could be significantly simpler if golang/go#18617 is accepted.) |
That works if we were using v.Interface() but I think most of the time we are using v.Field(i).Addr().UnsafePointer() or v.Field(i).Set(foo), and reflect will reject those. |
More generally, I do think this is an interesting thing to keep in mind, but we should also figure out where we are headed. For example, hypothetically, if we move to completely generated code, then the XXX_ fields can be xxx_ fields directly, with no problem. And again, hypothetically, if we move to completely unsafe code (no safe fallback for App Engine), the XXX_ fields can be xxx_ fields directly too, since we can write to them using unsafe.Pointer arithmetic. |
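As a rough illustration of the "completely unsafe" option, the sketch below writes to an unexported field by combining the field offset reported by reflect with unsafe pointer arithmetic. The `record` type, its `xxx_unrecognized` field, and `unrecognizedPtr` are assumptions made up for this example; it is not how the proto package is implemented.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// record is a hypothetical generated message whose bookkeeping field
// has been unexported.
type record struct {
	Name             string
	xxx_unrecognized []byte
}

// unrecognizedPtr returns a pointer to the unexported field by adding
// the field's offset to the address of the struct. reflect reports the
// offset even for unexported fields; only access through reflect.Value
// is restricted.
func unrecognizedPtr(r *record) *[]byte {
	f, ok := reflect.TypeOf(r).Elem().FieldByName("xxx_unrecognized")
	if !ok {
		return nil
	}
	return (*[]byte)(unsafe.Add(unsafe.Pointer(r), f.Offset))
}

func main() {
	r := &record{Name: "demo"}
	p := unrecognizedPtr(r)
	*p = []byte{0x01, 0x02}
	fmt.Println(len(r.xxx_unrecognized))
}
```

A real implementation would look the offset up by type from outside the generated package; the direct field read in `main` is only there to show that the write landed.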
We can use v.Interface() with
func unsafeGetUnrecognized(v reflect.Value) (*XXX_InternalUnrecognized, bool) {
	if v.IsNil() {
		return nil, false
	}
	// Fast path: reuse the field offset cached for this message type.
	offset, ok := unrecOffsets.Load(v.Type())
	if ok {
		addr := uintptr(v.Addr().UnsafePointer()) + offset.(uintptr)
		return (*XXX_InternalUnrecognized)(unsafe.Pointer(addr)), true
	}
	// Slow path: locate the field through the unexported accessor method.
	w, ok := v.Interface().(interface {
		unrecognized() *XXX_InternalUnrecognized
	})
	if !ok {
		return nil, false
	}
	p := w.unrecognized()
	// Cache the offset so later calls for this type take the fast path.
	offset = uintptr(unsafe.Pointer(p)) - uintptr(v.Addr().UnsafePointer())
	unrecOffsets.LoadOrStore(v.Type(), offset)
	return p, true
}
That said, it's not obvious to me that that's even necessary: if we're already paying the overhead of The
p, _ := getUnrecognized(v.Interface())
*p = foo
Agreed. I think we should unexport these fields no matter what, but the implementation details depend heavily on how we decide to optimize the rest of the package. |
Is there a possibility of golang/protobuf starting to generate some marshaling and unmarshaling code? |
@awalterschulze We're looking into ways to speed up marshal and unmarshal. Entirely generated code seems unlikely to be the right end point (too much code, not enough performance win), but we're evaluating multiple points in the solution space. Code not depending on the proto package at all is an interesting idea, although that implies even more generated code. I don't think that was on our radar. |
I am moving the conversation to another issue, since I feel I am hijacking the current issue. |
@awalterschulze It's at least partly relevant to this discussion, because cutting the dependency on the |
I know at least that the double-embedding trick caused me a lot of work when it was done for extensions. So I would opt for leaving this as is from a purely selfish point of view. But that's probably not a good enough reason not to do this.
That's true, but note that these fields are already excluded from the compatibility guidelines:
So pushing third-party protobuf tools to use only the stable part of the API is probably a good thing anyway. |
Yes, fair enough. I'll do the work :) Only using the compatible parts is not possible for all cases. To do better marshaling and unmarshaling testing, I would like to be able to generate messages with populated XXX_unrecognized fields. So my plugin, which generates randomly populated messages, needs to be able to access that field.
Couldn't you populate the exported fields, marshal the message, and inject pre-serialized data for unrecognized fields, and then unmarshal the message? It's an extra marshal/unmarshal round-trip, but that doesn't seem terrible for a test utility and allows you to work with only the exported API (and the documented proto encoding spec).
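One way to picture the "inject pre-serialized data" step is sketched below using only the documented wire encoding and the standard library. `appendUnknownVarint` is a hypothetical helper, and the `known` bytes stand in for the output of a real marshal call. Field number 1000 is assumed to be a number the target message does not define.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// appendUnknownVarint appends one varint-typed field (wire type 0) to
// an already serialized message. The tag is (fieldNum << 3) | wireType,
// and both the tag and the value use base-128 varint encoding.
func appendUnknownVarint(b []byte, fieldNum int, value uint64) []byte {
	const wireVarint = 0
	tag := uint64(fieldNum)<<3 | wireVarint
	b = binary.AppendUvarint(b, tag)
	return binary.AppendUvarint(b, value)
}

func main() {
	// Stand-in for marshaled output: field 1, length-delimited, "hi".
	known := []byte{0x0a, 0x02, 'h', 'i'}

	// Inject a field the message type does not declare. Unmarshaling
	// the result into a proto2 message should leave these bytes in its
	// unknown-field storage.
	withUnknown := appendUnknownVarint(known, 1000, 42)
	fmt.Printf("% x\n", withUnknown)
}
```

For nested messages the same idea applies, but the enclosing length prefixes have to be rewritten as well, which is the extra work the round-trip approach implies.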
All of those are implementation details, though: they're features that are normally internal to functions in the Also note that with the double-embedding approach, code within the generated packages would still have access to the (unexported) second-embedding struct and its fields. So if we're careful in how we define the first-embedding API, it may still be possible for I would like to see the |
The current way allows me to inject those values on several levels in deeply nested messages. The need for the proto package is a sore point for me and one that I would like to remove in future, at least as an option for gogoprotobuf. Even with a limited use case I still think removing this dependency is of great value in simplifying some installations. It would be great if the resulting xxx field can still be touched by generated code without going into the proto package. I would really love to one day have an extension to not import the proto package. This obviously won't work for |
That should be straightforward for proto3, since it uses It may be possible for proto2 messages that don't reserve extension fields, but we'd need to figure out what to do about unknown fields: proto2 requires that messages preserve unknown fields, but at the moment we don't have an exported, stable API for accessing or manipulating them and I'm not sure that we should. |
Exactly, if I just stop registering messages, only extensions and Any stop working :) But if this change goes through in such a way that a proto package is needed, then my plan only works for proto3, which is quite a shame. So if something has to change, then I am for the publicly available XXX_ becoming a private xxx_. As I understand it, XXX_InternalUnrecognized would require the use of a proto package?
Why do you want to not have a proto package? It seems like there's an underlying root cause there (like maybe package management) that we should be addressing instead. |
Dependencies, given even the best package management, still need to be managed. But there is also confusion when using gogoprotobuf and golang/protobuf. Also, vendoring a protobuf project has its limits. |
latest code occurs, again |
XXX_ fields are back again ... :D |
Renaming as "unexported XXX_ fields" since the current title suggests that "XXX_" fields will one day disappear; which is not true. The generator will need to inject some fields necessary for proto functionality. The goal is to have them be unexported and not user-visible either through documentation or by direct field access. |
Any update on this issue 👍 |
this really needs to be resolved. completely makes testing protobuf structs a nightmare. |
Please explain? |
fields that get populated with data during serialization, are nested, and no clean way to reset them to zero. means that one can't compare them with a structure made in a test without jumping through hoops. |
That has nothing to do with this issue though. This issue is about unexporting the XXX_ fields, which doesn't change the fact that the internal fields will still be populated; and doing something like You want something like #762 |
it used to work just fine. ;) but yes something like #762 would work. |
We modify protoc-gen-go to stop generating exported XXX fields.
The unsafe implementation is unaffected by this change since unsafe can access fields regardless of visibility. However, for the purego implementation, we need to respect Go visibility rules as enforced by the reflect package. We work around this by generating an exporter function that, given a reference to the message and the field to export, returns a reference to the unexported field value. This exporter function is protected by a constant such that it is not linked into the final binary in non-purego build environments.
Updates #276
Change-Id: Idf5c1f158973fa1c61187ff41440acb21c5dac94
Reviewed-on: https://go-review.googlesource.com/c/protobuf/+/185141
Reviewed-by: Damien Neil <[email protected]>
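The exporter idea in the commit message can be pictured roughly as follows. This is a simplified sketch, not the generated code itself: `Msg`, `exportMsg`, `exporters`, and the `purego` constant are hypothetical names, and a real build would set the constant through build tags rather than hard-coding it.

```go
package main

import "fmt"

// purego is a hypothetical stand-in for a constant selected by build
// tags. When it is false the compiler can drop the guarded
// registration, so the exporter need not be linked into the binary.
const purego = true

// Msg is a hypothetical generated message with unexported bookkeeping.
type Msg struct {
	Name          string
	unknownFields []byte
}

// exportMsg lives in the generated package, so it may take the address
// of unexported fields. Given the message and a field index, it returns
// a pointer to that field.
func exportMsg(v interface{}, i int) interface{} {
	switch m := v.(*Msg); i {
	case 0:
		return &m.unknownFields
	default:
		return nil
	}
}

// exporters is what a reflection-based runtime in another package
// would consult instead of touching the fields directly.
var exporters = map[string]func(interface{}, int) interface{}{}

func main() {
	if purego {
		exporters["Msg"] = exportMsg
	}
	m := &Msg{Name: "demo"}
	if export, ok := exporters["Msg"]; ok {
		p := export(m, 0).(*[]byte)
		*p = []byte{0xff}
	}
	fmt.Println(len(m.unknownFields))
}
```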
The v1.20.0 release of |
The exported XXX_ fields on generated message structs are awkward. They make "obvious" reflection code over proto messages easy to get wrong (by not skipping the XXX_ fields), and they pollute the documentation of the generated packages with irrelevant implementation details.
I believe it is possible to unexport them. They appear to need to be exported due to the restriction on reflect.Value.Interface of unexported fields, but we can observe that every nested proto.Message is reached through only exported fields or XXX_ fields.
One technique we could use to access the fields after unexporting them is double-embedding. Start with an exported type in the proto package with an unexported (pointer) accessor method. Add one (unexported) struct type per generated proto package which embeds the exported struct. Embed this unexported struct in each message which needs the corresponding field.
The unexported struct field is itself unexported, but since that embeds a struct with an unexported method defined in the proto package it is accessible within proto via type-assertion to an interface containing the method.
A rough sketch: https://play.golang.org/p/yDjZZtGmB6
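For readers who cannot open the playground link, the following is an independent, single-file approximation of the double-embedding idea, not a copy of the linked sketch. All type and function names are hypothetical, and the three roles (proto package, generated package, caller) are collapsed into one package, so it demonstrates method promotion but not the cross-package visibility rules.

```go
package main

import "fmt"

// InternalUnrecognized plays the role of the exported type that would
// live in the proto package. Its accessor method is unexported, so
// only that package could name it in an interface.
type InternalUnrecognized struct {
	data []byte
}

func (u *InternalUnrecognized) unrecognized() *InternalUnrecognized { return u }

// internalFields plays the role of the one unexported struct per
// generated package. It embeds the exported type (first embedding).
type internalFields struct {
	InternalUnrecognized
}

// Message is a generated message. It embeds the unexported struct
// (second embedding), so no XXX_ field shows up in its exported API.
type Message struct {
	Name string
	internalFields
}

// getUnrecognized is what the proto package would do: type-assert the
// message to an interface containing the unexported method, which is
// promoted through both levels of embedding.
func getUnrecognized(m interface{}) (*InternalUnrecognized, bool) {
	g, ok := m.(interface {
		unrecognized() *InternalUnrecognized
	})
	if !ok {
		return nil, false
	}
	return g.unrecognized(), true
}

func main() {
	m := &Message{Name: "demo"}
	if u, ok := getUnrecognized(m); ok {
		u.data = []byte{0x08, 0x01}
	}
	fmt.Println(len(m.data))
}
```

Because the accessor has a pointer receiver, the assertion succeeds for `*Message` but not for a `Message` value, which matches how generated messages are normally passed around.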