(WIP) Improve Binary Encode Performance #964
Thanks for the PR, Elias!
Unfortunately, resizable ArrayBuffers are relatively new. We'll have to wait until they are more widely available before we can use them.
I left a couple of informative comments.
// NodeJS strings are by default UTF-8, so we can assume the byte length as the length of
// the string.
const valueBytesLength = value.length;
Unfortunately this won't work. Strings are UTF-16, and `length` returns the number of code units.
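To illustrate the point above, here is a minimal sketch comparing `String.prototype.length` with the UTF-8 byte length. The helper name `utf8ByteLength` is made up for this example and is not part of the PR; it relies only on the standard `TextEncoder`.

```typescript
// TextEncoder always produces UTF-8.
const encoder = new TextEncoder();

// Hypothetical helper: number of bytes the string occupies once UTF-8 encoded.
function utf8ByteLength(value: string): number {
  return encoder.encode(value).byteLength;
}

const ascii = "hello";
const accented = "h\u00e9llo"; // "héllo"
const emoji = "\u{1F600}"; // a single code point outside the BMP

// .length counts UTF-16 code units, not UTF-8 bytes.
console.log(ascii.length, utf8ByteLength(ascii)); // 5 5
console.log(accented.length, utf8ByteLength(accented)); // 5 6
console.log(emoji.length, utf8ByteLength(emoji)); // 2 4
```

The two values only agree for pure ASCII input, so using `value.length` as the byte length would under-allocate for any other text.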
@@ -80,7 +80,7 @@ function writeFields(
   }
   if (opts.writeUnknownFields) {
     for (const { no, wireType, data } of msg.getUnknown() ?? []) {
-      writer.tag(no, wireType).raw(data);
+      writer.tag(no, wireType).bytes(data);
The `bytes` method prefixes the length. The change will corrupt data, and needs to be reverted.
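A rough sketch of why the extra length prefix corrupts unknown fields. The functions `varint`, `raw`, and `bytes` below are simplified stand-ins written for this example, not the library's `BinaryWriter`; they only mimic the behaviour described in this comment.

```typescript
// Hypothetical stand-in: encode a small non-negative integer as a varint.
function varint(n: number): number[] {
  const out: number[] = [];
  while (n > 0x7f) {
    out.push((n & 0x7f) | 0x80);
    n >>>= 7;
  }
  out.push(n);
  return out;
}

// Stand-in for raw(): append the bytes unchanged.
function raw(data: Uint8Array): number[] {
  return Array.from(data);
}

// Stand-in for bytes(): write a varint length prefix, then the payload.
function bytes(data: Uint8Array): number[] {
  return varint(data.length).concat(Array.from(data));
}

// An unknown field already stored in wire format: the varint for 150.
const unknownData = new Uint8Array([0x96, 0x01]);
const fieldTag = varint((1 << 3) | 0); // field 1, wire type 0 (varint)

console.log(fieldTag.concat(raw(unknownData))); // [ 8, 150, 1 ]
console.log(fieldTag.concat(bytes(unknownData))); // [ 8, 2, 150, 1 ]
```

With the extra prefix, a reader would decode field 1 as the value 2 and then try to interpret 150 as the start of the next tag.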
for (const item of list) {
  writeScalarValue(writer, scalarType, item as ScalarValue);
}
writer.join();
writer.tag(field.number, WireType.LengthDelimited);
`fork()` and `join()` will take everything between, and write it length-prefixed. The tag needs to come first, then the length prefix, then the data.
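The ordering can be sketched with a toy writer. `SketchWriter` is a hypothetical, simplified class written for this illustration (it only handles small non-negative integers) and is not the library's implementation; it just follows the `fork()`/`join()` behaviour described in this comment.

```typescript
const LENGTH_DELIMITED = 2; // wire type for length-delimited records

// Hypothetical toy writer, for illustration only.
class SketchWriter {
  private chunk: number[] = [];
  private stack: number[][] = [];

  // Write a small non-negative integer as a varint.
  uint32(n: number): this {
    while (n > 0x7f) {
      this.chunk.push((n & 0x7f) | 0x80);
      n >>>= 7;
    }
    this.chunk.push(n);
    return this;
  }

  tag(fieldNo: number, wireType: number): this {
    return this.uint32((fieldNo << 3) | wireType);
  }

  // Start collecting bytes separately from what was written so far.
  fork(): this {
    this.stack.push(this.chunk);
    this.chunk = [];
    return this;
  }

  // Append everything written since fork(), prefixed with its length.
  join(): this {
    const body = this.chunk;
    const parent = this.stack.pop();
    if (parent === undefined) throw new Error("join() without fork()");
    this.chunk = parent;
    this.uint32(body.length);
    for (const b of body) this.chunk.push(b);
    return this;
  }

  finish(): number[] {
    return this.chunk.slice();
  }
}

// Correct order for a packed field 4 holding [1, 2, 3]: tag, fork, data, join.
const ok = new SketchWriter().tag(4, LENGTH_DELIMITED).fork();
for (const v of [1, 2, 3]) ok.uint32(v);
console.log(ok.join().finish()); // [ 34, 3, 1, 2, 3 ]

// Order from the diff: fork, data, join, then tag. The tag lands after the data.
const bad = new SketchWriter().fork();
for (const v of [1, 2, 3]) bad.uint32(v);
console.log(bad.join().tag(4, LENGTH_DELIMITED).finish()); // [ 3, 1, 2, 3, 34 ]
```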
break;
}
writer.join();
writer.tag(field.number, WireType.LengthDelimited);
Same as L178.
// TODO(ekrekr): this is really slow, because it has to allocate a whole new array buffer.
// Instead we should be writing the message directly to the original arraybuffer, then inserting
// the length beforehand.
Yes. Unfortunately, varints have variable length, so this isn't straightforward.
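A small sketch of the problem: the number of bytes needed for the length prefix depends on the length itself, which is only known after the message body has been written. `varintSize` is a helper made up for this example.

```typescript
// Hypothetical helper: number of bytes a non-negative integer takes as a varint.
// Each varint byte carries 7 bits of payload.
function varintSize(n: number): number {
  let size = 1;
  while (n > 0x7f) {
    n = Math.floor(n / 128);
    size++;
  }
  return size;
}

console.log(varintSize(127)); // 1
console.log(varintSize(128)); // 2
console.log(varintSize(16383)); // 2
console.log(varintSize(16384)); // 3
```

Reserving a single byte for the prefix before writing the body only works while the body stays under 128 bytes. Beyond that, the body either has to be shifted to make room, or the writer has to over-reserve space up front.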
/**
 * Encode UTF-8 text to an existing binary.
 */
encodeInto: (
It's a breaking change to add a mandatory property to the interface, so we'll have to find other means.
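One possible direction, sketched purely as an assumption and not as what the maintainers decided: make the new member optional and fall back to the existing method when it is absent, so existing implementations keep compiling. The interface `SketchTextEncoding` and the function `writeText` are hypothetical names for this example.

```typescript
// Hypothetical interface: encode() already exists, encodeInto() is new and optional.
interface SketchTextEncoding {
  encode(text: string): Uint8Array;
  // Optional, so implementations written before the change remain valid.
  // Returns the number of bytes written to target.
  encodeInto?(text: string, target: Uint8Array): number;
}

// Write text into target, using encodeInto() when available.
function writeText(enc: SketchTextEncoding, text: string, target: Uint8Array): number {
  if (enc.encodeInto !== undefined) {
    return enc.encodeInto(text, target);
  }
  // Fallback: allocate, then copy. set() throws a RangeError if target is too small.
  const encoded = enc.encode(text);
  target.set(encoded);
  return encoded.byteLength;
}

const textEncoder = new TextEncoder();

// An implementation from before the change: no encodeInto().
const legacy: SketchTextEncoding = {
  encode: (text) => textEncoder.encode(text),
};

// An implementation that opts in, backed by the standard TextEncoder.encodeInto().
const modern: SketchTextEncoding = {
  encode: (text) => textEncoder.encode(text),
  encodeInto: (text, target) => textEncoder.encodeInto(text, target).written ?? 0,
};

const buffer = new Uint8Array(16);
console.log(writeText(legacy, "h\u00e9llo", buffer)); // 6
console.log(writeText(modern, "h\u00e9llo", buffer)); // 6
```

The fallback path still pays for the intermediate allocation, so the performance gain would only apply to implementations that provide the optional method.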
Work in progress: I haven't yet come up with a way to efficiently fork a new message directly within the contiguous memory space for length-delimited records.
This PR improves encoding efficiency by avoiding copies of array buffers: data is written directly to an expanding array buffer that grows within a contiguous memory space.
TODO: add performance tests and flamegraphs once working.