-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let json:decode/3 keep whitespaces #8809
Let json:decode/3 keep whitespaces #8809
Conversation
CT Test Results 2 files 95 suites 57m 32s ⏱️ Results for commit 5b6e52e. ♻️ This comment has been updated with latest results. To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run test locally. Artifacts// Erlang/OTP Github Action Bot |
c142d08
to
ceb3c11
Compare
@michalmuskala Can you take a look.. |
lib/stdlib/src/json.erl
Outdated
@@ -1386,16 +1386,18 @@ object_key(_, Original, Skip, Acc, Stack, Decode) -> | |||
|
|||
continue(<<Rest/bits>>, Original, Skip, Acc, Stack0, Decode, Value) -> | |||
case Stack0 of | |||
[] -> terminate(Rest, Original, Skip, Acc, Value); | |||
[] -> terminate(Rest, Rest, Acc, Value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks the binary pattern optimisation since the tail is now used in two paces - I'm fairly sure it will significantly affect performance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what we could see, looking at the beam asm,
the only difference was that it creates a sub-binary in the case when Stack = [],
which should be ok since that is only for the last "term" in the json string.
But I'm not fluent in beam-asm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks fine, the optimization is disabled for this particular call but not the others. Should this become a performance problem, it's not a huge deal for the compiler to allow the reused Rest
context to be passed alongside a new tail binary representing Rest
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed up a new variant that postpones the creation and only does when necessary.
`json:decode/3` always stripped leading whitespaces in the `Rest` binary, which could be problematic if user expected them. E.g `json:decode(<<"foo\n bar">>, ok, #{})` returned: `{<<"foo">>, ok, <<"bar">>}` instead of `{<<"foo">>, ok, <<"\n bar">>}`. If `Rest` only contains whitespaces they are removed, so that the user can match on empty binary to know if they should continue the decoding loop. E.g `json:decode(<<"foo\n ">>, ok, #{})` still returns: `{<<"foo">>, ok, <<>>}`
ea95878
to
5b6e52e
Compare
json:decode/3
always stripped leading whitespaces in theRest
binary, which could be problematic if user expected them.E.g
json:decode(<<"foo\n bar">>, ok, #{})
returned:{<<"foo">>, ok, <<"bar">>}
instead of{<<"foo">>, ok, <<"\n bar">>}
.If
Rest
only contains whitespaces they are removed, so that the user can match on empty binary to know if they should continue the decoding loop.E.g
json:decode(<<"foo\n ">>, ok, #{})
still returns:{<<"foo">>, ok, <<>>}