Optimize binary get_number implementation by reading multiple bytes at once #4391

Draft: wants to merge 2 commits into base: develop

@TianyiChen TianyiChen commented Jun 7, 2024

This PR improves the performance of the binary get_number implementation by reading multiple bytes at once, which reduces per-call overhead, especially when the input comes from file I/O. It adds get_elements to the input adapters and allows each adapter to select a more efficient underlying call for reading multiple bytes when one is available.
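
To illustrate the idea, here is a minimal sketch of an adapter that offers both a per-byte read and a bulk read. The class name `file_adapter_sketch` and the helper `read_u32_be` are hypothetical and are not taken from this PR's diff; the sketch only assumes that a `FILE*`-backed adapter can forward a multi-byte request to a single `std::fread` call instead of several `std::fgetc` calls.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical adapter for illustration; not the library's actual class.
class file_adapter_sketch
{
  public:
    explicit file_adapter_sketch(std::FILE* f) : m_file(f) {}

    // Per-byte path: one library call for every byte consumed.
    int get_character()
    {
        return std::fgetc(m_file);
    }

    // Bulk path: a single fread for `count` objects of type T.
    // Returns the number of bytes actually read.
    template<typename T>
    std::size_t get_elements(T* dest, std::size_t count = 1)
    {
        return std::fread(dest, 1, sizeof(T) * count, m_file);
    }

  private:
    std::FILE* m_file;
};

// Hypothetical helper: read a big-endian 32-bit unsigned integer
// (the byte order MessagePack uses) with one bulk read.
inline bool read_u32_be(file_adapter_sketch& in, std::uint32_t& out)
{
    unsigned char buf[4];
    if (in.get_elements(buf, 4) != 4)
    {
        return false;  // unexpected end of input
    }
    out = (static_cast<std::uint32_t>(buf[0]) << 24) |
          (static_cast<std::uint32_t>(buf[1]) << 16) |
          (static_cast<std::uint32_t>(buf[2]) << 8) |
          static_cast<std::uint32_t>(buf[3]);
    return true;
}
```

With the per-byte path, the same 4-byte integer would cost four `get_character` calls; the bulk path issues one call and checks the returned byte count to detect a truncated input.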

Performance when reading MessagePack from a C-style FILE

develop:

FromMsgpack/floats             166717521 ns    163715750 ns            4 bytes_per_second=52.4267Mi/s
FromMsgpack/signed_ints        171707844 ns    167527000 ns            4 bytes_per_second=51.234Mi/s
FromMsgpack/unsigned_ints      167388344 ns    164910750 ns            4 bytes_per_second=52.0468Mi/s
FromMsgpack/small_signed_ints  102578387 ns    100799000 ns            7 bytes_per_second=46.3689Mi/s

branch:

FromMsgpack/floats              61241576 ns     60370091 ns           11 bytes_per_second=142.174Mi/s
FromMsgpack/signed_ints         67659257 ns     62698083 ns           12 bytes_per_second=136.895Mi/s
FromMsgpack/unsigned_ints       59830260 ns     57518000 ns           13 bytes_per_second=149.224Mi/s
FromMsgpack/small_signed_ints   63264757 ns     61850167 ns           12 bytes_per_second=75.5688Mi/s
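
The speedup implied by the two result sets can be worked out from the first time column of each row. The sketch below only does that arithmetic; the `row` struct and the function names are hypothetical, and the numbers are copied from the output above.

```cpp
#include <cstdio>

// Ratio of the time before the change to the time after it.
constexpr double speedup(double before_ns, double after_ns)
{
    return before_ns / after_ns;
}

// One benchmark row: name plus the first time column on each branch.
struct row
{
    const char* name;
    double develop_ns;
    double branch_ns;
};

// Values copied from the benchmark output above.
static const row rows[] = {
    {"FromMsgpack/floats",            166717521.0, 61241576.0},
    {"FromMsgpack/signed_ints",       171707844.0, 67659257.0},
    {"FromMsgpack/unsigned_ints",     167388344.0, 59830260.0},
    {"FromMsgpack/small_signed_ints", 102578387.0, 63264757.0},
};

// Print one line per benchmark with its speedup factor.
inline void print_speedups()
{
    for (const row& r : rows)
    {
        std::printf("%-32s %.2fx\n", r.name, speedup(r.develop_ns, r.branch_ns));
    }
}
```

By this measure the branch is roughly 2.7x, 2.5x, and 2.8x faster for floats, signed ints, and unsigned ints, and roughly 1.6x faster for small signed ints, where each value occupies fewer bytes and there is less to gain from a bulk read.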

Questions:

  • For get_elements in wide_string_input_adapter: without this member the code fails to compile. If I leave it unimplemented, the tests still pass locally, which seems to make sense: if wchar is used, the content likely contains non-ASCII text, and we would not want to interpret it as binary numbers, which is the only place get_elements is currently used. I am not sure whether we are expected to fall back to reading each character one by one or to do something else. Currently it just falls back to the get_character method.
  • The current benchmarks do not include one that reads or writes binary data from a file, so the benchmark results won't change with this PR unless such tests are added. Shall we add them, like the one in "Reading multiple bytes from an input adapter? #4389 (comment)"?
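
For the first question, the character-by-character fallback could look roughly like the sketch below. Both `char_only_adapter_sketch` and `get_elements_fallback` are hypothetical names for illustration and are not the PR's code; the only assumption is that the adapter exposes a `get_character` call that returns the next byte or a negative value at the end of the input.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical adapter that only supports per-character reads.
class char_only_adapter_sketch
{
  public:
    explicit char_only_adapter_sketch(std::string data) : m_data(std::move(data)) {}

    // Returns the next byte as a value in 0..255, or -1 at end of input.
    int get_character()
    {
        if (m_pos >= m_data.size())
        {
            return -1;
        }
        return static_cast<unsigned char>(m_data[m_pos++]);
    }

  private:
    std::string m_data;
    std::size_t m_pos = 0;
};

// Generic fallback: fill `dest` byte by byte through get_character.
// Returns the number of bytes written, which is less than
// sizeof(T) * count if the input ends early.
template<typename Adapter, typename T>
std::size_t get_elements_fallback(Adapter& in, T* dest, std::size_t count = 1)
{
    unsigned char* out = reinterpret_cast<unsigned char*>(dest);
    const std::size_t wanted = sizeof(T) * count;
    std::size_t got = 0;
    while (got < wanted)
    {
        const int c = in.get_character();
        if (c < 0)
        {
            break;  // end of input reached before the request was satisfied
        }
        out[got++] = static_cast<unsigned char>(c);
    }
    return got;
}
```

A fallback of this shape keeps the same return contract as a bulk read (bytes actually read), so callers can treat both paths the same way; it simply gives up the performance benefit for adapters that have no cheaper multi-byte primitive.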

Pull request checklist

Read the Contribution Guidelines for detailed information.

  • Changes are described in the pull request, or an existing issue is referenced.
  • The test suite compiles and runs without error.
  • Code coverage is 100%. Test cases can be added by editing the test suite.
  • The source code is amalgamated; that is, after making changes to the sources in the include/nlohmann directory, run make amalgamate to create the single-header files single_include/nlohmann/json.hpp and single_include/nlohmann/json_fwd.hpp. The whole process is described here.

Please don't

  • The C++11 support varies between different compilers and versions. Please note the list of supported compilers. Some compilers like GCC 4.7 (and earlier), Clang 3.3 (and earlier), or Microsoft Visual Studio 13.0 and earlier are known not to work due to missing or incomplete C++11 support. Please refrain from proposing changes that work around these compiler's limitations with #ifdefs or other means.
  • Specifically, I am aware of compilation problems with Microsoft Visual Studio (there even is an issue label for this kind of bug). I understand that even in 2016, complete C++11 support isn't there yet. But please also understand that I do not want to drop features or uglify the code just to make Microsoft's sub-standard compiler happy. The past has shown that there are ways to express the functionality such that the code compiles with the most recent MSVC - unfortunately, this is not the main objective of the project.
  • Please refrain from proposing changes that would break JSON conformance. If you propose a conformant extension of JSON to be supported by the library, please motivate this extension.
  • Please do not open pull requests that address multiple issues.

@github-actions github-actions bot added the L label Jun 7, 2024
@TianyiChen TianyiChen force-pushed the msgpack-int branch 2 times, most recently from 11070e5 to b25a53f Compare June 7, 2024 01:13
github-actions bot commented Jun 7, 2024

🔴 Amalgamation check failed! 🔴

The source code has not been amalgamated. @TianyiChen
Please read and follow the Contribution Guidelines.

@coveralls
Coverage Status

coverage: 99.951% (-0.05%) from 100.0%
when pulling ea8b03d on TianyiChen:msgpack-int
into 8c391e0 on nlohmann:develop.
