Rkyv read native artifact from archive #2190
I'm unsure whether we should prefer rkyv deserialize over rkyv archive. The deserialize approach requires minimal change (just the change in this PR, which is almost done except for moving the rkyv impls for PrimaryMap/IndexMap inside wasmer). The latter (archive) is fast and, more importantly, I just found that its speed is consistent.
With the archive approach, every access to ModuleMetadata becomes a method call. Accessing a ModuleMetadata field of type PrimaryMap/IndexMap also requires implementing ArchivedPrimaryMap/ArchivedIndexMap with methods that mirror every PrimaryMap/IndexMap method in use. |
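To make the trade-off concrete, here is a minimal sketch of the wrapper idea, using only the standard library. The fields `name` and `function_names`, the borrowed `ArchivedModuleMetadata<'a>` layout, and the accessor names are hypothetical stand-ins for illustration; they are not wasmer's real `ModuleMetadata` nor the type rkyv would generate.

```rust
use std::collections::HashMap;

// Hypothetical stand-in; the real ModuleMetadata has many more fields.
#[derive(Debug, PartialEq)]
struct ModuleMetadata {
    name: String,
    function_names: HashMap<u32, String>,
}

// Hypothetical stand-in for an archived form: a borrowed view whose
// map is stored as a key-sorted list instead of a HashMap.
struct ArchivedModuleMetadata<'a> {
    name: &'a str,
    function_names: Vec<(u32, &'a str)>, // must be sorted by key
}

enum ModuleMetadataWrap<'a> {
    Original(ModuleMetadata),
    Archived(ArchivedModuleMetadata<'a>),
}

impl<'a> ModuleMetadataWrap<'a> {
    // Field access turns into a method that dispatches on the variant.
    fn name(&self) -> &str {
        match self {
            Self::Original(m) => m.name.as_str(),
            Self::Archived(a) => a.name,
        }
    }

    // Every map method that callers use has to be mirrored on the
    // archived side as well; here only a lookup by index.
    fn function_name(&self, index: u32) -> Option<&str> {
        match self {
            Self::Original(m) => m.function_names.get(&index).map(String::as_str),
            Self::Archived(a) => a
                .function_names
                .binary_search_by_key(&index, |(k, _)| *k)
                .ok()
                .map(|i| a.function_names[i].1),
        }
    }
}
```

Callers would then write `meta.name()` instead of `meta.name`, which is the "field access becomes method calls" cost described above.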
@ailisp we just merged #2195, which moves Also, I think it is critical that serialization/deserialization remains constant regardless of the platform/architecture. That means that serializing the same content on macOS (with the M1 / arm64 chip) should give the same result as doing it on Linux (with a typical x86_64 chip), and it should be the same for 32-bit architectures. This is critical to allow cross-compilation from other devices, that is, compiling the file on one system and running it on another. Note: if we can ensure constant serialization across architectures with the archive approach, I'm happy to choose that one! Thanks! |
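As an illustration of what platform-independent bytes require, the sketch below fixes both the integer width (`u64` for the length, never `usize`) and the byte order (little endian) explicitly. The functions `encode_le` and `decode_le` are hypothetical helpers written for this example; they are not part of wasmer, bincode, or rkyv.

```rust
use std::convert::TryFrom;

/// Encode a list of u32 values as a u64 length followed by each value,
/// all little endian. The output does not depend on the host's
/// pointer width or native byte order.
fn encode_le(values: &[u32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + values.len() * 4);
    out.extend_from_slice(&(values.len() as u64).to_le_bytes());
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Decode the format written by `encode_le`; returns None on malformed input.
fn decode_le(bytes: &[u8]) -> Option<Vec<u32>> {
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(bytes.get(..8)?);
    // On a 32-bit host a huge length is rejected instead of truncated.
    let len = usize::try_from(u64::from_le_bytes(len_bytes)).ok()?;
    let body = bytes.get(8..)?;
    if body.len() != len.checked_mul(4)? {
        return None;
    }
    Some(
        body.chunks_exact(4)
            .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

A format that instead writes `usize` values or native-endian integers can produce different bytes on arm64, x86_64, and 32-bit targets, which is the failure mode this comment is guarding against.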
That's great to hear! Would you also plan to move IndexMap into it? It's okay if not; I can implement the traits on some WasmerIndexMapWrapper type.
Good point! I'll do some tests. The rkyv author mentioned that the major caveat is that it uses the platform's native endianness and has not been fully tested across platforms. Other than that, it should theoretically work. Archive internally encodes every pointer as a relative offset (a u64) that indicates where the object is located in the resulting [u8]. I consider little endian common enough (mac/linux/windows on x86 and ARM), and the author also gives a solution for handling cross-platform endianness: https://davidkoloski.me/rkyv/faq.html#how-does-rkyv-handle-endianness Also, I suppose constant serialization means "the object deserialized back is Eq" when serializing on any platform and deserializing on any platform, rather than "the serialized bytes are identical byte for byte", because in testing I saw that bincode doesn't produce the same bytes when serializing the same ModuleMetadata twice. |
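The distinction between "round-trips to an equal value" and "identical bytes" can be sketched with a plain `HashMap`. Two maps with the same contents compare equal, yet their iteration order is unspecified, so an encoder that walks the map may emit the entries in a different order each time. Whether this is what caused the bincode difference seen here is an assumption, not something the thread confirms; `encode`, `decode`, and `sample` are hypothetical helpers for this example.

```rust
use std::collections::HashMap;

/// Write each (key, value) pair in the map's iteration order.
/// That order is unspecified, so the bytes are not guaranteed stable.
fn encode(map: &HashMap<u32, u32>) -> Vec<u8> {
    let mut out = Vec::with_capacity(map.len() * 8);
    for (k, v) in map {
        out.extend_from_slice(&k.to_le_bytes());
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Rebuild the map; the order of pairs in the input does not matter.
fn decode(bytes: &[u8]) -> HashMap<u32, u32> {
    bytes
        .chunks_exact(8)
        .map(|c| {
            let k = u32::from_le_bytes([c[0], c[1], c[2], c[3]]);
            let v = u32::from_le_bytes([c[4], c[5], c[6], c[7]]);
            (k, v)
        })
        .collect()
}

/// Build the same logical content in a fresh map each time.
fn sample() -> HashMap<u32, u32> {
    (0..64).map(|i| (i, i * i)).collect()
}
```

A test in this spirit would assert `decode(&encode(&a)) == a` rather than `encode(&a) == encode(&b)`; only the former is guaranteed to hold.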
Just thought of this. |
@syrusakbary I noticed there are some commented-out lines using flexbuffers in wasmer. How is that going? |
Build failed: |
It seems like cargo deny check is failing |
bors r+ |
bors r- |
Canceled. |
@ailisp we need to merge this suggestion #2190 (comment) Otherwise tests will fail: |
Btw, do you use some script / tool to run all tests locally? Sorry for making you restart the tests back and forth.
These are the main tests we run: make test
make test-capi
make test-integration |
bors r+ |
2190: Rkyv read native artifact from archive r=syrusakbary a=ailisp

# Description

As mentioned in #2180, loading a native artifact is slow because of deserializing ModuleMetadata. This draft PR shows how fast it can be when ModuleMetadata is serialized as an archive into a [u8] that can be read back directly and interpreted as nested struct/HashMap/PrimaryMap without deserialization.

Run
```
cargo bench -p wasmer-cache --features singlepass
```
from the root dir; you will see a lot of timings at the end:
```
...
3.447994ms // bincode::deserialize
rkyv 22ns
rkyv deser 386.539µs
```

This PR uses a forked rkyv, to impl rkyv traits on PrimaryMap and IndexMap.

TODO:
- [x] In the forked rkyv, the impl of rkyv traits on IndexMap is not guaranteed to keep indexmap's order; it is just a quick hack to see timings (it reuses ArchivedHashMap). Need to impl a real ArchivedIndexMap.
- ~[ ] replace each use of `ModuleMetadata` and children with `ArchivedModuleMetadata` or something like:~
  ```
  enum ModuleMetadataWrap {
      Original(ModuleMetadata),
      Archived(ArchivedModuleMetadata)
  }
  ```
  ~and turn field accesses into `ModuleMetadataWrap` method calls~
- [x] rkyv deserialize is not as fast as rkyv archive, but it is fast enough, and its advantage is minimal change to wasmer's struct field access code, so I plan to use rkyv deserialize.
- [x] Add a test to ensure rkyv deserializes correctly. I added an assert that the deserialized object is Eq to the original; it needs to be turned into a unit test.
- [x] Move the impl rkyv::* for PrimaryMap/IndexMap inside this PR, instead of using a fork of rkyv, using a remote-derive technique.

# Review

- [ ] Add a short description of the change to the CHANGELOG.md file

Co-authored-by: Bo Yao <[email protected]>
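The gap between the `rkyv` and `rkyv deser` timings comes from reading in place versus building owned values. The sketch below illustrates that idea with a toy layout of my own (payload bytes followed by a 16-byte root holding a backward offset and a length, both u64 little endian). It is not rkyv's actual format, and `archive`, `view_name`, and `deserialize_name` are hypothetical names.

```rust
use std::convert::TryFrom;

/// Toy layout: [payload][back_offset: u64 LE][len: u64 LE].
/// The root is the last 16 bytes; the payload starts `back_offset`
/// bytes before the root, so the buffer can be relocated freely.
fn archive(name: &str) -> Vec<u8> {
    let mut buf = Vec::with_capacity(name.len() + 16);
    buf.extend_from_slice(name.as_bytes());
    let back_offset = buf.len() as u64; // payload starts at position 0
    buf.extend_from_slice(&back_offset.to_le_bytes());
    buf.extend_from_slice(&(name.len() as u64).to_le_bytes());
    buf
}

fn read_u64(buf: &[u8], pos: usize) -> Option<u64> {
    let slice = buf.get(pos..pos.checked_add(8)?)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(slice);
    Some(u64::from_le_bytes(raw))
}

/// "Archive" style access: borrow the string straight out of the
/// buffer, with bounds checks but no allocation or copy.
fn view_name(buf: &[u8]) -> Option<&str> {
    let root = buf.len().checked_sub(16)?;
    let back = usize::try_from(read_u64(buf, root)?).ok()?;
    let len = usize::try_from(read_u64(buf, root + 8)?).ok()?;
    let start = root.checked_sub(back)?;
    let end = start.checked_add(len)?;
    if end > root {
        return None;
    }
    std::str::from_utf8(buf.get(start..end)?).ok()
}

/// "Deserialize" style access: same lookup, then copy into an owned String.
fn deserialize_name(buf: &[u8]) -> Option<String> {
    view_name(buf).map(str::to_owned)
}
```

With nested maps and many strings, the copying path allocates once per value while the viewing path only follows offsets, which is consistent with the direction of the benchmark numbers above, though this toy does not reproduce them.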
Build failed: |
@syrusakbary Running cargo fmt --all on two computers gave me different formatting results. There are two reasons. One of them is that I had a https://github.com/near/nearcore/blob/master/rustfmt.toml Or, it would be helpful to ask contributors to always upgrade to the latest stable Rust (as used in CI) before running
|
Now make lint, make test and make test-capi pass. make test-integration fails for the same reason as on master, which looks unrelated to this PR:
|
Let's open a PR for |
I've looked, and for some reason tests are failing on Aarch64, but I believe that is completely unrelated to this PR. Merging manually |
@ailisp if you can also open another PR integrating rkyv into the JIT engine that would be great |
Sounds good. We've decided to use the JIT engine now |