added Sizer type assert at cachedsize method. The sizecache is not ca… #412
…lculated if the struct conforms to the Sizer interface
…both reflect and unsafe cases
proto/table_marshal.go
Outdated
@@ -213,6 +213,9 @@ func (u *marshalInfo) size(ptr pointer) int {
// cachedsize gets the size from cache. If there is no cache (i.e. message is not generated),
// fall back to compute the size.
func (u *marshalInfo) cachedsize(ptr pointer) int {
if s, ok := structPointer_Interface(ptr, u.typ).(Sizer); ok { |
Really clean solution :)
Let's merge this into swaptypeassert and then review the whole swaptypeassert branch. What do you think?
That sounds like a plan. I am not sure about the performance impact, though.
Is it possible to rather do the Size call where the marshaler check already happens? There is a check for a marshaler:
if u.hasmarshaler {
	m := ptr.asPointerTo(u.typ).Interface().(Marshaler)
	b, _ := m.Marshal()
	return len(b)
}
Maybe that is also a good place to check for a sizer, possibly even store whether the message has a sizer in a
I think we will end up in the same situation.
If we want to add a hassizer check in the marshalInfo.size(pointer) method, we might have to replace the marshalInfo.cachedSize(pointer) calls in: to just normal marshalInfo.size(pointer) calls. This is just off the top of my head; there might still be a better place for the check.
It would be great if we could do a check, call Size and place the result in cachedSize, but maybe I am missing the whole bug? @dsnet do you want to get in on this? We are trying to fix swapping the type assert that you suggested, before we merge
After the quick email exchange we just had with Walter, I think that by only checking the Sizer interface in cachedsize we might be missing something or doing unnecessary work. I might have to rethink this.
@jmarais this pull request really showed me the problem, but I have a suggestion to go another way. I have a small reproducing example:
option (gogoproto.sizer_all) = true;
message FindMistake {
optional bool Field1 = 1;
}
message NinOptStruct {
optional FindMistake Field4 = 4;
}
It seems that Marshal assumes that xxx_messageInfo_NinOptStruct.Size(m) has already been called.
func (m *NinOptStruct) XXX_Size() int {
if mm, ok := (interface{})(m).(proto.Sizer); ok {
// xxx_messageInfo_NinOptStruct.Size(m) // uncomment this to make code work
return mm.Size()
}
return xxx_messageInfo_NinOptStruct.Size(m)
}
I have modified the cachedsize method to get a stack trace:
func (u *marshalInfo) cachedsize(ptr pointer) int {
if u.sizecache.IsValid() {
panic("valid")
return int(atomic.LoadInt32(ptr.offset(u.sizecache).toInt32()))
}
return u.size(ptr)
}
stack trace:
Here on line 2237 we find that u.cachedsize is being called, on the assumption that the size has already been cached.
// makeMessageMarshaler returns the sizer and marshaler for a message field.
// u is the marshal info of the message.
func makeMessageMarshaler(u *marshalInfo) (sizer, marshaler) {
return func(ptr pointer, tagsize int) int {
p := ptr.getPointer()
if p.isNil() {
return 0
}
siz := u.size(p)
return siz + SizeVarint(uint64(siz)) + tagsize
},
func(b []byte, ptr pointer, wiretag uint64, deterministic bool) ([]byte, error) {
p := ptr.getPointer()
if p.isNil() {
return b, nil
}
b = appendVarint(b, wiretag)
siz := u.cachedsize(p) // line 2237
b = appendVarint(b, uint64(siz))
return u.marshal(b, p, deterministic)
}
}
But since we generated the Size method, there is no reason that the size should already be cached. My preferred solution here is to not call our generated Size method inside XXX_Size. So the XXX_Size method will always look like this:
func (m *NinOptStruct) XXX_Size() int {
return xxx_messageInfo_NinOptStruct.Size(m)
}
There is no reason to optimize the calculation of Size with generated code if you aren't using the generated Marshalers. What do you think? If you agree, do you think you can make the change in generator.go?
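The failure mode described above can be sketched in isolation. This is a simplified model, not the gogo/protobuf code: `inner` and `outer` stand in for FindMistake and NinOptStruct, `sizeCache` stands in for the size cache field, and every function name here is hypothetical. It only shows why a marshaler that reads a nested cached size breaks when the table-driven size walk was skipped.

```go
package main

import "fmt"

// inner stands in for FindMistake; sizeCache is only filled by the size walk.
type inner struct {
	field1    *bool
	sizeCache int32
}

// outer stands in for NinOptStruct.
type outer struct {
	field4 *inner
}

// tableSizeInner models the table-driven sizer: it computes the size
// and, as a side effect, stores it in the cache.
func tableSizeInner(m *inner) int {
	n := 0
	if m.field1 != nil {
		n = 2 // 1 tag byte + 1 value byte
	}
	m.sizeCache = int32(n)
	return n
}

// tableSizeOuter recurses into the nested message, which fills its cache.
func tableSizeOuter(m *outer) int {
	if m.field4 == nil {
		return 0
	}
	s := tableSizeInner(m.field4)
	return 1 + 1 + s // tag + length prefix (s < 128) + payload
}

// generatedSizeOuter models a generated Size method: same result, no caching.
func generatedSizeOuter(m *outer) int {
	if m.field4 == nil {
		return 0
	}
	s := 0
	if m.field4.field1 != nil {
		s = 2
	}
	return 1 + 1 + s
}

// marshalOuter trusts the nested cache for the length prefix,
// like the u.cachedsize(p) call in makeMessageMarshaler.
func marshalOuter(m *outer) []byte {
	var b []byte
	if m.field4 == nil {
		return b
	}
	b = append(b, 0x22)                     // field 4, wire type 2
	b = append(b, byte(m.field4.sizeCache)) // length prefix read from the cache
	if m.field4.field1 != nil {
		v := byte(0)
		if *m.field4.field1 {
			v = 1
		}
		b = append(b, 0x08, v) // field 1, varint
	}
	return b
}

func main() {
	t := true
	m := &outer{field4: &inner{field1: &t}}

	fmt.Println(generatedSizeOuter(m))   // size is right, but nothing was cached
	fmt.Printf("% x\n", marshalOuter(m)) // length prefix is stale (zero)

	tableSizeOuter(m)                    // the size walk fills the nested cache
	fmt.Printf("% x\n", marshalOuter(m)) // length prefix is now correct
}
```

In this model, calling only the generated-style size function leaves the nested length prefix at zero, which matches the observation that uncommenting the xxx_messageInfo_NinOptStruct.Size(m) call makes the code work.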
…ither use MarshalTo or default marshaling. hassizer property now checked in the marshalinfo size method if the type has a Size() method. removed the Sizer interface check in the generated XXX_Size methods.
"reflect" | ||
)

var sizerType = reflect.TypeOf((*Sizer)(nil)).Elem()
I guess we also have to remember ProtoSizer which has a ProtoSize method, instead of a Size method.
I guess we should then call either size or protosize if it is available?
Yes, if we call size at all, given the insight we had offline and in the next comment.
proto/table_marshal.go
Outdated
}

s := ptr.asPointerTo(u.typ).Interface().(Sizer)
n := s.Size()
As discussed, we are now effectively calculating the size twice.
Yeah. But the current size recursively calls size again where needed, and that initializes the struct fields' marshalInfos, calculates each struct field's size, and stores it at the cache location. If we don't call size on struct fields, we don't cache their sizes. Unfortunately, the sizers don't really need this recursive size call.
We could check whether the thing is both a marshaler and a sizer; then it can just return the size, because we don't need the caching in that case. But other than that, we do need to be able to cache the size for the default marshaler. Hmm. I might have another idea.
…e and cached size methods to just return the message size without looking at any other struct fields.
proto/table_marshal.go
Outdated
@@ -167,6 +169,17 @@ func (u *marshalInfo) size(ptr pointer) int {
u.computeMarshalInfo()
}

// Uses the message's Size method if available
if u.hassizer {
I think, like you suggested, the best we can do is move these checks into the if u.hasmarshaler branch, because then it is quicker to call Size than to marshal, but otherwise let's have cachedsize run its course.
If we move these checks into u.hasmarshaler, we won't be calling any other struct field's size methods, and thus won't be initializing and caching their sizes. I think the best we could do then is just use the default sizing even if it is a sizer, and only use the Size method if the message is a marshaler as well.
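The rule proposed here can be sketched as follows. This is an illustration under stated assumptions, not the merged implementation: `info`, `newInfo`, and `defaultSize` are hypothetical names, and `defaultSize` stands in for the table-driven walk that initializes and caches nested sizes.

```go
package main

import "fmt"

// Local stand-ins for the interfaces discussed in the thread.
type Marshaler interface{ Marshal() ([]byte, error) }
type Sizer interface{ Size() int }

// info is a hypothetical, simplified marshalInfo.
type info struct {
	hasmarshaler bool
	hassizer     bool
	defaultSize  func(msg interface{}) int // models the caching size walk
}

func newInfo(msg interface{}, defaultSize func(interface{}) int) *info {
	_, m := msg.(Marshaler)
	_, s := msg.(Sizer)
	return &info{hasmarshaler: m, hassizer: s, defaultSize: defaultSize}
}

// size uses the custom Size method only when the message is also a
// marshaler; in every other case the default walk runs so caches get filled.
func (u *info) size(msg interface{}) int {
	if u.hasmarshaler {
		if u.hassizer {
			return msg.(Sizer).Size() // cheaper than marshaling just to count bytes
		}
		b, _ := msg.(Marshaler).Marshal()
		return len(b)
	}
	return u.defaultSize(msg)
}

// custom is both a marshaler and a sizer.
type custom struct{}

func (custom) Marshal() ([]byte, error) { return []byte{1, 2, 3}, nil }
func (custom) Size() int                { return 3 }

// sizerOnly has a Size method but no custom marshaler.
type sizerOnly struct{}

func (sizerOnly) Size() int { return 99 }

func main() {
	walk := func(interface{}) int { return 7 } // placeholder for the default sizing

	fmt.Println(newInfo(custom{}, walk).size(custom{}))       // uses Size
	fmt.Println(newInfo(sizerOnly{}, walk).size(sizerOnly{})) // uses the default walk
}
```

The sizer-only message deliberately ignores its own Size method, which is the trade-off described above: the default walk is what populates the cached sizes that the default marshaler later reads.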
proto/table_marshal.go
Outdated
@@ -213,7 +227,15 @@ func (u *marshalInfo) size(ptr pointer) int {
// cachedsize gets the size from cache. If there is no cache (i.e. message is not generated),
// fall back to compute the size.
func (u *marshalInfo) cachedsize(ptr pointer) int {
if u.sizecache.IsValid() {
if u.hassizer {
These will now never be cached. I can see that it will work, but a cached size should win in speed, I would think.
Yeah, I agree.
benchcmp_marshaling_33637a95againstc36358ef.txt |
… of the message is also a marshaler
proto/table_marshal.go
Outdated
messageset bool // uses message set wire format
hasmarshaler bool // has custom marshaler
hassizer bool // has custom sizer
hasprotosizer bool // has custom protosizer
Maybe we can put hassizer and hasprotosizer later in the struct, with a newline in between (for gofmt), to minimize the diff here?
g.Out()
g.P("}")
g.P("return xxx_messageInfo_", ccTypeName, ".Size(m)")
if (gogoproto.IsMarshaler(g.file.FileDescriptorProto, message.DescriptorProto) ||
Oh nice :)
g.Out()
g.P("}")
g.P("return xxx_messageInfo_", ccTypeName, ".Marshal(b, m, deterministic)")
if gogoproto.IsMarshaler(g.file.FileDescriptorProto, message.DescriptorProto) ||
If we do a compile time check here, then we should do the same for the XXX_Unmarshal method. We should just try to be consistent.
Cool benchmarks. Maybe in a separate pull request, it might be time to update https://github.com/gogo/protobuf/blob/master/bench.md :) Ok but I think this is very close to merging. Thanks so much :D |
test/mixbench/marshal.txt
Outdated
@@ -1,73 +1,73 @@
goos: darwin
goos: linux
Maybe we update this as part of another pull request, which includes the update of bench.md
Sorry, I missed this. My intention wasn't to update the benches.
Looks good to me :) Great work! |