Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

missleading benchmarks #26

Open
Licenser opened this issue Sep 5, 2023 · 4 comments
Open

missleading benchmarks #26

Licenser opened this issue Sep 5, 2023 · 4 comments

Comments

@Licenser
Copy link

Licenser commented Sep 5, 2023

Hi,

I wanted to suggest changing the benchmark output slightly, as they are presented, it is somewhat misleading.

The way serde_json and simd-json treat the Dom is very different from how simdjson treats the Dom. Both are valid tradeoffs to make, but comparing them is not very meaningful.

Both serde_json and simd-json when presenting a Dom create a nested data structure that is modifiable and has indexed maps - a data structure on its own. That comes at the cost of allocations and filling data structures, but it's a valid tradeoff when either map are accessed frequently, or the date needs to be modified.

simdjson presents a pointer to the tape as a Dom, which means it does not perform extra allocations but does not allow mutations, and lookups are always in linear time.

Again, both are valid tradeoffs for different use cases. However, comparing them is problematic as what we compare isn't the same result.

I think the best way would be to create a third category aside of Dom, Struct called Tape, which is the fully validated JSON but not put in a nested data structure. serde_json does not provide an interface like that, simd-json does provide to_tape which provides an equivalent data structure to simdjson but without the nicer access functions (so that should be easy to implement oneself or add).

@SunDoge
Copy link
Owner

SunDoge commented Sep 6, 2023

Thanks for your reminder. I'll convert the dom to serde_json's Value and rebenchmark it.

@aminya
Copy link
Contributor

aminya commented Sep 6, 2023

@SunDoge I think both should be included in the benchmarks. Converting might not be needed for all the applications.

@Licenser
Copy link
Author

Licenser commented Sep 7, 2023

Ja both is definetly the best, and if not all libraries support all target formats isn't a big issue

@Licenser
Copy link
Author

FWIW simd-json has now DOM like read-only access to the tape so it would be possible to include the DOM versions in the benchmark as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants