|
I found another optimization that improves both parsing speed and the memory it allocates. A previous PR made parsing slower by always allocating a string to parse integers and floats from, and it also made parsing consume more memory. I was concerned about it at the time for exactly that reason, but I agreed that correctness is more important than performance. However, we can have both! For "small" integers (fewer than 19 digits, never floats) we can compute the int value directly and always know that we are doing it right.

Before the last commit:

After the last commit:

So a bit faster and 20MB less, about 16% less in this case. But the memory allocated here depends on how many numbers there are and how long they are. For example, using this file:

running the benchmark, before this PR:

after this PR:

so 5MB less than before, from a total of 14MB, that's about 30% less memory! |
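The fast path can be sketched roughly like this (a hypothetical Ruby illustration, not the actual Crystal lexer code; `parse_small_int` is an invented name): since `Int64::MAX` has 19 digits, any integer with fewer than 19 digits can be accumulated digit by digit with no risk of overflow and no intermediate string allocation.

```ruby
# Hypothetical sketch of the "small integer" fast path: accumulate the
# value digit by digit. With fewer than 19 digits an Int64 cannot
# overflow (Int64::MAX has 19 digits), so no intermediate string is needed.
def parse_small_int(input, pos)
  value = 0
  digits = 0
  negative = input[pos] == "-"
  pos += 1 if negative
  while pos < input.size && ("0".."9").cover?(input[pos])
    value = value * 10 + (input[pos].ord - "0".ord)
    digits += 1
    pos += 1
  end
  # Floats, and integers long enough to risk overflow, must take the
  # slower string-allocating path instead (signalled here by nil).
  return nil if digits == 0 || digits >= 19
  return nil if pos < input.size && ".eE".include?(input[pos])
  [negative ? -value : value, pos]
end

parse_small_int("12345,", 0) # => [12345, 5]
parse_small_int("3.14", 0)   # => nil (a float: falls back)
```

The real implementation works on the lexer's byte stream rather than a Ruby string, but the overflow reasoning is the same.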
|
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/performance-issues-with-the-json-parser/6678/17 |
|
I think one of the "peek" branches is wrong. I'll see if I can merge it. I don't have a test to reproduce it yet, just one scenario I ran into. |
Let's make the JSON data contain some long strings. They are Base64-encoded in my case, but that doesn't matter.

```crystal
json = %|{"a_base64":#{("a" * 5000).inspect},"b_base64":#{("a" * 10000).inspect}}|
json = "[" + ([json] * 2500).join(",") + "]"
File.write("json.json", json)
```

The Crystal benchmark (benchmark-pk.cr):

```crystal
require "json"
require "benchmark"

file_io = File.open("json.json")
file_string = File.read("json.json")

Benchmark.bm do |x|
  x.report("JSON.parse (string)") do
    JSON.parse(file_string)
  end
end
```

The Ruby equivalent (benchmark-pk.rb):

```ruby
require "json"
require "benchmark"

file_io = File.open("json.json")
str = File.read("json.json")

Benchmark.bm(19) do |x|
  x.report("JSON.parse(str)") do
    JSON.parse(str)
  end
end
```

```console
$ crystal build --release benchmark-pk.cr
```
```console
$ /usr/bin/time -l ./benchmark-pk
                          user     system      total        real
JSON.parse (string)   0.174718   0.005743   0.180461 (  0.180668)
        0,20 real         0,18 user         0,02 sys
  94814208  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
      5877  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
        39  involuntary context switches
1868924524  instructions retired
 591412605  cycles elapsed
  93389632  peak memory footprint
```

That's before your changes, on Apple aarch64/arm64.

```console
$ /usr/bin/time -l ruby ./benchmark-pk.rb
                          user     system      total        real
JSON.parse(str)       0.063432   0.003553   0.066985 (  0.067092)
        0,12 real         0,10 user         0,01 sys
  90554368  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
      5642  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
        43  involuntary context switches
1390183537  instructions retired
 367126927  cycles elapsed
  86574272  peak memory footprint
```

That's using Ruby 3.3.0. |
|
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/performance-issues-with-the-json-parser/6678/20 |
|
Starting with Ruby 3.3 you can enable YJIT at runtime. I now put a small Ruby snippet at the top of my Ruby files to enable it. |
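The exact snippet isn't shown above; a common pattern for opting into YJIT at runtime on Ruby 3.3+ (while staying harmless on older Rubies or builds compiled without YJIT) looks like this:

```ruby
# Enable YJIT at runtime when the running Ruby supports it (3.3+ with a
# YJIT-enabled build); on anything else this line is a no-op.
RubyVM::YJIT.enable if defined?(RubyVM::YJIT.enable)
```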
I get the same times with or without YJIT. |
|
And just for the record, here's my benchmark in Ruby with Oj:

```ruby
require "benchmark"
require "json"
require "oj"

file_io = File.open("json.json")
str = File.read("json.json")

Benchmark.bm(19) do |x|
  x.report("JSON.parse(str)") do
    JSON.parse(str)
  end
  x.report("Oj.load(str)") do
    Oj.load(str)
  end
end
```

i.e. Oj is faster than the Crystal version by a factor of 0.181/0.032 = 5.66. |
|
There are more improvements to be made here. I'd like to send them one by one in small PRs; if I put them all together here, the chances of this being merged are very small. |
|
That said, I don't think we'll reach Oj's level of optimization. Its C code is pretty hand-crafted. We could try to do the same, but that file is copyrighted... |
|
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/performance-issues-with-the-json-parser/6678/25 |
```diff
 # :nodoc:
 class JSON::Lexer::StringBased < JSON::Lexer
-  def initialize(string)
+  def initialize(string : String)
```
polish: This could be `@string : String`.
|
|
```crystal
pos = 0
```

```crystal
while true
```
thought: I'm wondering if the strategy here could be based on byte search (`peek.index('"')`) instead? The implementation for that can be more efficient than iterating over each byte: `Slice#index` on a byte buffer is backed by `memchr`, so it depends on how optimized the libc implementation is.
We would still need to sanity-check for escape sequences and disallowed characters in the potential string, though, which would reduce the effectiveness. I think it could be better overall that way, but I'm not sure. Just wanted to leave this thought here. It should be fine to take the current implementation for now.
Might be true. When I needed to find the end of a JSON string in a `Bytes`, I came up with something like this:

```crystal
def json_bytes_end_of_string_index(haystack : Bytes, offset : Int32 = 0) : Int32?
  index = haystack.index('"'.ord.to_u8, offset) # memchr()
  return nil if !index
  return index unless haystack[index - 1] == '\\'.ord.to_u8
  # Fall back to the more complete bytewise implementation below.
  # ...
end
```

I did not check if it makes sense to use strpbrk() to find the next "interesting" byte, such as quote, backslash, ...
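The strpbrk()-style idea can be sketched as follows (a hypothetical Ruby illustration, not the lexer's actual code; `end_of_json_string` is an invented helper): jump to the next "interesting" byte, either a closing quote or a backslash, skipping over escape sequences. This avoids the false positive that a bare memchr for `"` can hit when the quote is escaped.

```ruby
# Hypothetical sketch: find the index of the closing quote of a JSON
# string body by jumping between "interesting" characters (quote or
# backslash) instead of inspecting every byte one by one.
INTERESTING = /["\\]/

def end_of_json_string(str, offset)
  pos = offset
  while (idx = str.index(INTERESTING, pos))
    return idx if str[idx] == '"'
    pos = idx + 2 # a backslash: skip it and the escaped character
  end
  nil # unterminated string
end

end_of_json_string('abc\\"def"rest', 0) # => 8 (the escaped quote at 4 is skipped)
```

A real scanner would additionally have to reject disallowed bytes (e.g. raw control characters) inside the string, which is part of why the commenters above note the gain may be reduced.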
That's the other optimization I was thinking of. It makes string parsing much faster. I didn't want to include it in this PR, though!
|
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/stringpool-make-the-hash-key-part-of-the-public-api/6766/5 |
This reverts commit 9ef6366.
I saw this thread, so...
Two things here:
I used this Ruby/Crystal file to generate a big JSON:
Here's the benchmark:
Before:
After:
I think the difference between before and after will be bigger if there are more strings to parse.
A note for the forum thread: I tried parsing that same big file with Ruby 3.1 and Ruby was (slightly) slower: 16 seconds in Crystal vs. 19 seconds in Ruby. This is on a Mac, so I don't know why it was slower in Crystal for the OP (maybe it's different on Linux?).
Regarding memory: Crystal needs 117MB to load that entire data into memory, but it's the same in Ruby, so I'm not sure how memory can be further optimized...
Please review carefully! I think tests for when the peek buffer is incomplete or unavailable might not exist right now. Feel free to continue working on top of this PR (pushing commits to this branch or creating another PR from this code).