-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace flatbuffers #2121
Comments
Cap'n Proto seems like the way to go for maximum performance. There were good points provided in issue #269 and having @kentonv himself on board we couldn't go wrong with serialization (even if we need some tweaks to meet the requirements). |
I'm not sure if Cap'n Proto will perform much better than FlatBuffers. In theory, they should be similar, as they are both zero-copy protocols. I don't know very much about the details of FlatBuffers, so it's entirely possible there's some reason I'm not aware of why it would be significantly slower (or faster?) than Cap'n Proto. You could certainly try it and see what happens. If they do in fact perform similarly, then I think you're out of options. I don't think there's any other serialization framework which would be expected to perform better than these for this use case. |
Some of our messages are very small - two int32 integers. We need a framework which can take a struct like struct ReadMessage {
rid: i32;
} and compile into JavaScript code which only does the bare minimum: interface ReadMessage { rid: number };
function ReadMessageParse(i32: Int32Array) => ReadMessage {
assert(i32[0] == READ_MESSAGE_ID);
return { rid: i32[1] };
}
function ReadMessageSerialize(msg: ReadMessage, i32: Int32Array): void {
i32[0] = READ_MESSAGE_ID;
i32[1] = msg.rid;
} We know this is fast... Unfortunately existing systems seem to have a lot more code. I'd be interested to know if cap'n'proto can produce minimal code like this. |
Hmm. You could short-circuit Cap'n Proto to provide something like that. Fundamentally when you have a struct reader in Cap'n Proto, it's just a pointer to a buffer, and each accessor method simply reads from / writes to an offset in that buffer. So if you directly construct a struct reader from a buffer that contains only that one struct, then that's basically what you get. However, I'm not sure how you extend that to support strings, or any sort of variable-length fields. At that point, you fundamentally need some sort of way to delimit objects, deal with memory allocation, etc. That's where the complexity comes from. Could you give me an example of what kind of code you'd hope to see generated for a struct that contains some string fields? |
Oh, also: Is forwards/backwards-compatibility important to you? Or can you always assume the TypeScript and Rust sides are updated in lockstep? If compatibility is not an issue I suppose there's a lot you can simplify, but serialization frameworks usually prioritize compatibility. |
I think the zero-copy parameter solves the problem of variable length. The read operation currently looks like this: function reqRead(
rid: number,
zeroCopy: Uint8Array
): [flatbuffers.Builder, msg.Any, flatbuffers.Offset, Uint8Array] {
const builder = flatbuffers.createBuilder();
const inner = msg.Read.createRead(builder, rid);
return [builder, msg.Any.Read, inner, zeroCopy];
}
export async function read(rid: number, zeroCopy: Uint8Array): Promise<ReadResult> {
// dispatch.sendAsync sends our serialized flatbuffers int8Array with a "zeroCopy" handle
// to c++ v8 bindings that then passes it along to rust land via ffi. zeroCopy is used as a fixed
// length buffer in this case the length being the number of bytes you want to read from a file.
// We never have to copy zeroCopy we can just write directly to it from the other side hence the
// name "zeroCopy".
return resRead(await dispatch.sendAsync(...reqRead(rid, zeroCopy)));
} We don't perform any serialization on the zero copy parameter, so It can be any length it needs. |
Only for strings and buffers, but Servo's Tendril project solves similar problems. |
First part of this transition has landed in 00ac871 The new serialization method doesn't generalize - it's just the simplest thing that would work for read, write (and soon accept). It will require more work to migrate other ops to this system. |
Just a suggestion, have you looked at Abomonation? It's super-fast and also super-unsafe - but perhaps no moreso than rolling your own. |
https://github.com/FishOrBear/SimpleFlatBuffers |
@FishOrBear oh very nice! I wonder if you could give an example of how translating, say, our chown function would look using SimpleFlatBuffer |
I am not familiar with
If you use let path: string;
let f = new SimpleFlatBuffers();
f.writeString(path);
//If finished
f.finish();
//send buffers
let bytes = f.bb; Unlike FlatBuffers, SimpleFlatBuffer uses sequential write data. The party receiving the data can also directly read the data. let f = new SimpleFlatBuffers(bytes);
let path = f.readString(); I am not familiar with the interaction of js and c++ or rust. If the passed uint8array is passed in memory, then even we can make a memory alignment pass object. For example in C++: Struct A
{
Int32 a;
Int32 b;
} Pass binary memory to js. Let f = new SimpleFlatBuffers(btyes);
Let a = f.readInt32();
Let b = f.readInt32(); |
In addition, some of the performance of flatbuffers can be optimized by using DataView. Part of it might be through optimizing String Decode. |
Our benchmarks are finally at the point where we're able to measure the overhead of serialization. The gap below is due to serialization:
Flatbuffers appears to now be our bottleneck (see benchmarks in #1966)
Here are our requirements for a new serialization library:
&[u8]
in Rust orUint8Array
in JS). That is, it should not return a newly allocated slice.String
andi32
types within a struct.The text was updated successfully, but these errors were encountered: