encoding/json: performance slower than expected #5683

gopherbot · 2013-06-11T18:24:19Z

by reid.write:

STR:
1. clone the git repository here: [email protected]:tarasglek/jsonbench.git
2. Generate some sample JSON data using the instructions in the README
3. Run the go json benchmark which is in gojson/src/jsonbench/json.go

What is the expected output?
I expected to see performance roughly in line with Java (using Jackson for json
parsing). On my test machine, the Java benchmark results in the following:
Processing 436525928 bytes took 5491 ms (75.82 MB/s)

What do you see instead?
Significantly slower performance, using the same input file:
Duration: 27.497043818s, 15.14 MB/s

Which compiler are you using (5g, 6g, 8g, gccgo)?
Not sure

Which operating system are you using?
Linux dell-ubuntu 3.2.0-45-generic #70-Ubuntu SMP Wed May 29 20:12:06 UTC 2013 x86_64
x86_64 x86_64 GNU/Linux

Which version are you using?  (run 'go version')
go version go1.1 linux/amd64

rsc · 2013-06-14T13:08:07Z

Comment 1:

If you want us to investigate further, please attach a data file and a program.
I tried looking in your github repo but I couldn't figure out how to generate a data
file (enable telemetry means nothing to me) and it's probably good for us to have the
same data file anyway. 
I did look at the sources. Your Java program is just tokenizing the JSON, not building a
data structure. Go is actually building a data structure for you. So you are comparing
apples and oranges.
A more equal comparison would be to have Go unmarshal into
var x struct{}
dec.Decode(&x)
which will parse the JSON but throw away all the data. 
In practice, you should be parsing into a real struct anyway. I think you'll find that
case is much nicer in Go than in Java, and probably competitive in speed.

Status changed to WaitingForReply.

gopherbot · 2013-06-14T18:15:18Z

Comment 2 by reid.write:

Thanks for the feedback!
You're right, it's much faster with the unmarshaling approach you recommended:
Duration: 10.034308451s, 47.98 MB/s
I will update the Java test to be more apples to apples.  The Java version is basically
only doing JSON Validation, so I'll modify it to parse the actual document structure.
In the meantime, I've added a new make target `make sample-data` that will generate a
~500MB test file, as well as a few more make targets to simplify running the various
tests.
Again, Go's performance when actually parsing the objects is:
Duration: 35.149537439s, 14.27 MB/s
For reference, here is the performance for a few of the other languages/runtimes on my
machine:
Javascript (spidermonkey):
39.550669327341836MB/s 525822000bytes in 12.679s
Python (simplejson):
37 MB/s 525831000bytes in 13seconds
C++ (rapidjson):
163 MB/s 525831000 bytes in 3 seconds

davecheney · 2013-06-27T11:18:46Z

Comment 3:

@reid - is there anything left to do on this ticket ?

gopherbot · 2013-07-04T14:35:36Z

Comment 4 by reid.write:

I've made changes to the Java benchmark to make for a more apples-to-apples comparison. 
Go is still significantly slower than most of the other versions, and it should be
fairly straightforward to actually run the tests now (make sample-data; make java; make
python; make go)
If someone is able to look into improving json parsing performance, that would be much
appreciated!

adg · 2013-07-04T23:10:48Z

Comment 5:

If you don't publish your benchmark data and methodology (code), then your complaints
aren't actionable.
Anyone working on this needs to be making the same comparison. Saying "make it faster"
is not helpful.

alberts · 2013-07-05T06:11:22Z

Comment 6:

It's right there in the first line, isn't it?
https://github.com/tarasglek/jsonbench

gopherbot · 2013-07-09T19:01:21Z

Comment 7 by reid.write:

The motivation for filing this issue is that we are logging terabytes of Firefox
performance data in (compressed) JSON format and we'd really like to be able to process
& analyze this data set using a modern language like Go instead of Java / C++.
Here are the latest steps to reproduce:
1. clone (or update) the git repository here: git clone
[email protected]:tarasglek/jsonbench.git
2. Generate some sample JSON data: make sample-data
3. Run the go json benchmark: make go
Optionally, you can run some of the other language benchmarks using "make python", "make
java", etc.
Here are the results I get on my test machine:
make go: 14.36 MB/s in 34.932531708s
make python_simplejson: 39 MB/s 525831000bytes in 12seconds
make java: 48.82 MB/s 525822000 bytes in 10.27 s
Note: The # bytes is different for java because it's not counting end-of-line characters.

rsc · 2013-07-09T19:14:28Z

Comment 8:

You are measuring something that is likely not relevant to the final
program, namely unmarshaling into a generic data type
(map[string]interface{}).
Almost any Go program doing JSON for data structure processing unmarshals
into a struct. Even if the struct lists every field in the input, that form
will be more compact in memory and therefore faster to create.
Russ

gopherbot · 2013-09-14T14:00:29Z

Comment 9 by runner.mei:

I have the same feeling, I found the problem in the json/stream.go, as following:
   155  func (enc *Encoder) Encode(v interface{}) error {
   156      if enc.err != nil {
   157          return enc.err
   158      }
   159      enc.e.Reset()
   160      err := enc.e.marshal(v)
   161      if err != nil {
   162          return err
   163      }
   164  
   165      // Terminate each value with a newline.
   166      // This makes the output look a little nicer
   167      // when debugging, and some kind of space
   168      // is required if the encoded value was a number,
   169      // so that the reader knows there aren't more
   170      // digits coming.
   171      enc.e.WriteByte('\n')
   172  
---------------------------- here ----------------------------------
   173      if _, err = enc.w.Write(enc.e.Bytes()); err != nil {
   174          enc.err = err
   175      }
   176      return err
   177  }
In general the "w" is the buffer,  Write method had an unnecessary memory copy

rsc · 2013-11-27T18:45:41Z

Comment 10:

Labels changed: added go1.3maybe.

rsc · 2013-12-04T01:31:04Z

Comment 11:

Labels changed: added release-none, removed go1.3maybe.

rsc · 2013-12-04T01:50:32Z

Comment 12:

Labels changed: added repo-main.

gaurish · 2015-12-17T05:48:21Z

is there any work planned on this?

bradfitz · 2015-12-17T05:57:36Z

Nobody is working on it.

kevinburke · 2016-06-27T00:07:09Z

FWIW, I've resurrected the package in question, added structs so it's not just parsing into a struct{}, and added benchmarks around it that follow the format of the benchmarks in the standard library. Those are here: https://github.com/kevinburke/jsonbench/blob/master/go/bench_test.go

The previous check for characters inside of a JSON string that needed to be escaped performed seven different boolean comparisons before determining that a ASCII character did not need to be escaped. Most characters do not need to be escaped, so this check can be done in a more performant way. Use the same strategy as the unicode package for precomputing a range of characters that need to be escaped, then do a single lookup into a character array to determine whether the character needs escaping. On an AWS c4.large node: $ benchstat benchmarks/master-bench benchmarks/json-table-bench name old time/op new time/op delta CodeEncoder-2 19.0ms ± 0% 15.5ms ± 1% -18.16% (p=0.000 n=19+20) CodeMarshal-2 20.1ms ± 1% 16.8ms ± 2% -16.35% (p=0.000 n=20+21) CodeDecoder-2 49.3ms ± 1% 49.5ms ± 2% ~ (p=0.498 n=16+20) DecoderStream-2 416ns ± 0% 416ns ± 1% ~ (p=0.978 n=19+19) CodeUnmarshal-2 51.0ms ± 1% 50.9ms ± 1% ~ (p=0.490 n=19+17) CodeUnmarshalReuse-2 48.5ms ± 2% 48.5ms ± 2% ~ (p=0.989 n=20+19) UnmarshalString-2 541ns ± 1% 532ns ± 1% -1.75% (p=0.000 n=20+21) UnmarshalFloat64-2 485ns ± 1% 481ns ± 1% -0.92% (p=0.000 n=20+21) UnmarshalInt64-2 429ns ± 1% 427ns ± 1% -0.49% (p=0.000 n=19+20) Issue10335-2 631ns ± 1% 619ns ± 1% -1.84% (p=0.000 n=20+20) NumberIsValid-2 19.1ns ± 0% 19.1ns ± 0% ~ (all samples are equal) NumberIsValidRegexp-2 689ns ± 1% 690ns ± 0% ~ (p=0.150 n=20+20) SkipValue-2 14.0ms ± 0% 14.0ms ± 0% -0.05% (p=0.000 n=18+18) EncoderEncode-2 525ns ± 2% 512ns ± 1% -2.33% (p=0.000 n=20+18) name old speed new speed delta CodeEncoder-2 102MB/s ± 0% 125MB/s ± 1% +22.20% (p=0.000 n=19+20) CodeMarshal-2 96.6MB/s ± 1% 115.6MB/s ± 2% +19.56% (p=0.000 n=20+21) CodeDecoder-2 39.3MB/s ± 1% 39.2MB/s ± 2% ~ (p=0.464 n=16+20) CodeUnmarshal-2 38.1MB/s ± 1% 38.1MB/s ± 1% ~ (p=0.525 n=19+17) SkipValue-2 143MB/s ± 0% 143MB/s ± 0% +0.05% (p=0.000 n=18+18) I also took the data set reported in golang#5683 (browser telemetry data from Mozilla), added named structs for the data set, and turned it into a proper benchmark: https://github.com/kevinburke/jsonbench/blob/master/go/bench_test.go The results from that test are similarly encouraging. On a 64-bit Mac: $ benchstat benchmarks/master-benchmark benchmarks/json-table-benchmark name old time/op new time/op delta CodeMarshal-4 1.19ms ± 2% 1.08ms ± 2% -9.33% (p=0.000 n=21+17) Unmarshal-4 3.09ms ± 3% 3.06ms ± 1% -0.83% (p=0.027 n=22+17) UnmarshalReuse-4 3.04ms ± 1% 3.04ms ± 1% ~ (p=0.169 n=20+15) name old speed new speed delta CodeMarshal-4 80.3MB/s ± 1% 88.5MB/s ± 1% +10.29% (p=0.000 n=21+17) Unmarshal-4 31.0MB/s ± 2% 31.2MB/s ± 1% +0.83% (p=0.025 n=22+17) On the c4.large: $ benchstat benchmarks/master-bench benchmarks/json-table-bench name old time/op new time/op delta CodeMarshal-2 1.10ms ± 1% 0.98ms ± 1% -10.12% (p=0.000 n=20+54) Unmarshal-2 2.82ms ± 1% 2.79ms ± 0% -1.09% (p=0.000 n=20+51) UnmarshalReuse-2 2.80ms ± 0% 2.77ms ± 0% -1.03% (p=0.000 n=20+52) name old speed new speed delta CodeMarshal-2 87.3MB/s ± 1% 97.1MB/s ± 1% +11.27% (p=0.000 n=20+54) Unmarshal-2 33.9MB/s ± 1% 34.2MB/s ± 0% +1.10% (p=0.000 n=20+51) For what it's worth, I tried other heuristics - short circuiting the conditional for common ASCII characters, for example: if (b >= 63 && b != 92) || (b >= 39 && b <= 59) || (rest of the conditional) This offered a speedup around 7-9%, not as large as the submitted change. Change-Id: Idcf88f7b93bfcd1164cdd6a585160b7e407a0d9b

gopherbot · 2016-06-27T05:00:26Z

CL https://golang.org/cl/24466 mentions this issue.

The previous check for characters inside of a JSON string that needed to be escaped performed seven different boolean comparisons before determining that a ASCII character did not need to be escaped. Most characters do not need to be escaped, so this check can be done in a more performant way. Use the same strategy as the unicode package for precomputing a range of characters that need to be escaped, then do a single lookup into a character array to determine whether the character needs escaping. On an AWS c4.large node: $ benchstat benchmarks/master-bench benchmarks/json-table-bench name old time/op new time/op delta CodeEncoder-2 19.0ms ± 0% 15.5ms ± 1% -18.16% (p=0.000 n=19+20) CodeMarshal-2 20.1ms ± 1% 16.8ms ± 2% -16.35% (p=0.000 n=20+21) CodeDecoder-2 49.3ms ± 1% 49.5ms ± 2% ~ (p=0.498 n=16+20) DecoderStream-2 416ns ± 0% 416ns ± 1% ~ (p=0.978 n=19+19) CodeUnmarshal-2 51.0ms ± 1% 50.9ms ± 1% ~ (p=0.490 n=19+17) CodeUnmarshalReuse-2 48.5ms ± 2% 48.5ms ± 2% ~ (p=0.989 n=20+19) UnmarshalString-2 541ns ± 1% 532ns ± 1% -1.75% (p=0.000 n=20+21) UnmarshalFloat64-2 485ns ± 1% 481ns ± 1% -0.92% (p=0.000 n=20+21) UnmarshalInt64-2 429ns ± 1% 427ns ± 1% -0.49% (p=0.000 n=19+20) Issue10335-2 631ns ± 1% 619ns ± 1% -1.84% (p=0.000 n=20+20) NumberIsValid-2 19.1ns ± 0% 19.1ns ± 0% ~ (all samples are equal) NumberIsValidRegexp-2 689ns ± 1% 690ns ± 0% ~ (p=0.150 n=20+20) SkipValue-2 14.0ms ± 0% 14.0ms ± 0% -0.05% (p=0.000 n=18+18) EncoderEncode-2 525ns ± 2% 512ns ± 1% -2.33% (p=0.000 n=20+18) name old speed new speed delta CodeEncoder-2 102MB/s ± 0% 125MB/s ± 1% +22.20% (p=0.000 n=19+20) CodeMarshal-2 96.6MB/s ± 1% 115.6MB/s ± 2% +19.56% (p=0.000 n=20+21) CodeDecoder-2 39.3MB/s ± 1% 39.2MB/s ± 2% ~ (p=0.464 n=16+20) CodeUnmarshal-2 38.1MB/s ± 1% 38.1MB/s ± 1% ~ (p=0.525 n=19+17) SkipValue-2 143MB/s ± 0% 143MB/s ± 0% +0.05% (p=0.000 n=18+18) I also took the data set reported in #5683 (browser telemetry data from Mozilla), added named structs for the data set, and turned it into a proper benchmark: https://github.com/kevinburke/jsonbench/blob/master/go/bench_test.go The results from that test are similarly encouraging. On a 64-bit Mac: $ benchstat benchmarks/master-benchmark benchmarks/json-table-benchmark name old time/op new time/op delta CodeMarshal-4 1.19ms ± 2% 1.08ms ± 2% -9.33% (p=0.000 n=21+17) Unmarshal-4 3.09ms ± 3% 3.06ms ± 1% -0.83% (p=0.027 n=22+17) UnmarshalReuse-4 3.04ms ± 1% 3.04ms ± 1% ~ (p=0.169 n=20+15) name old speed new speed delta CodeMarshal-4 80.3MB/s ± 1% 88.5MB/s ± 1% +10.29% (p=0.000 n=21+17) Unmarshal-4 31.0MB/s ± 2% 31.2MB/s ± 1% +0.83% (p=0.025 n=22+17) On the c4.large: $ benchstat benchmarks/master-bench benchmarks/json-table-bench name old time/op new time/op delta CodeMarshal-2 1.10ms ± 1% 0.98ms ± 1% -10.12% (p=0.000 n=20+54) Unmarshal-2 2.82ms ± 1% 2.79ms ± 0% -1.09% (p=0.000 n=20+51) UnmarshalReuse-2 2.80ms ± 0% 2.77ms ± 0% -1.03% (p=0.000 n=20+52) name old speed new speed delta CodeMarshal-2 87.3MB/s ± 1% 97.1MB/s ± 1% +11.27% (p=0.000 n=20+54) Unmarshal-2 33.9MB/s ± 1% 34.2MB/s ± 0% +1.10% (p=0.000 n=20+51) For what it's worth, I tried other heuristics - short circuiting the conditional for common ASCII characters, for example: if (b >= 63 && b != 92) || (b >= 39 && b <= 59) || (rest of the conditional) This offered a speedup around 7-9%, not as large as the submitted change. Change-Id: Idcf88f7b93bfcd1164cdd6a585160b7e407a0d9b Reviewed-on: https://go-review.googlesource.com/24466 Reviewed-by: Joe Tsai <[email protected]> Run-TryBot: Joe Tsai <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>

valyala · 2018-05-29T21:57:56Z

I would also point out there are tools for generating custom serializers and deserializers which should allow you to greatly exceed the performance of the generic JSON library for any specific object.

As an example I added fastjson benchmark to the https://github.com/kevinburke/jsonbench/blob/master/go/bench_test.go and got the following results on go tip:

$ GOMAXPROCS=1 go test -bench=Unmarshal
goos: linux
goarch: amd64
pkg: github.com/kevinburke/jsonbench/go
BenchmarkUnmarshal         	     500	   2715217 ns/op	  35.23 MB/s	  462379 B/op	    7256 allocs/op
BenchmarkUnmarshalFastjson 	   10000	    205919 ns/op	 464.57 MB/s	     132 B/op	       0 allocs/op
BenchmarkUnmarshalReuse    	     500	   2808449 ns/op	  34.06 MB/s	  435406 B/op	    7220 allocs/op

The code for the added benchmark:

func BenchmarkUnmarshalFastjson(b *testing.B) {
        if codeJSON == nil {
                b.StopTimer()
                codeInit()
                b.StartTimer()
        }
        b.ReportAllocs()
        var p fastjson.Parser
        for i := 0; i < b.N; i++ {
                if _, err := p.ParseBytes(codeJSON); err != nil {
                        b.Fatal("Unmarshal:", err)
                }
        }
        b.SetBytes(int64(len(codeJSON)))
}

as · 2018-05-30T01:20:55Z

The primary benefit I see from having a "faster" library is that it will reduce the number of third-party JSON parsers.

These parsers prioritize speed over simplicity, safety, compatibility, and security. It would be nice not to encounter the same 10-25 packages with the same micro-benchmark leaflets every so often. Going through the task of deleting them from a codebase requires a lot of patience and is often time consuming even with special-purpose tooling for the task.

gopherbot · 2018-07-07T14:24:52Z

Change https://golang.org/cl/122459 mentions this issue: encoding/json: tweak the size of the safe tables

gopherbot · 2018-07-07T17:23:42Z

Change https://golang.org/cl/122460 mentions this issue: encoding/json: encode struct field names ahead of time

gopherbot · 2018-07-07T21:01:47Z

Change https://golang.org/cl/122462 mentions this issue: encoding/json: remove alloc when encoding short byte slices

gopherbot · 2018-07-08T16:00:56Z

Change https://golang.org/cl/122468 mentions this issue: encoding/json: various minor speed-ups

gopherbot · 2018-07-08T16:00:56Z

Change https://golang.org/cl/122467 mentions this issue: encoding/json: defer error context work until necessary

gopherbot · 2018-07-22T12:57:05Z

Change https://golang.org/cl/125418 mentions this issue: encoding/json: prepare struct field names as []byte

gopherbot · 2018-07-22T12:57:05Z

Change https://golang.org/cl/125416 mentions this issue: encoding/json: simplify the structEncoder type

gopherbot · 2018-07-22T12:58:50Z

Change https://golang.org/cl/125417 mentions this issue: encoding/json: inline fieldByIndex

Struct field names are static, so we can run HTMLEscape on them when building each struct type encoder. Then, when running the struct encoder, we can select either the original or the escaped field name to write directly. When the encoder is not escaping HTML, using the original string works because neither Go struct field names nor JSON tags allow any characters that would need to be escaped, like '"', '\\', or '\n'. When the encoder is escaping HTML, the only difference is that '<', '>', and '&' are allowed via JSON struct field tags, hence why we use HTMLEscape to properly escape them. All of the above lets us encode field names with a simple if/else and WriteString calls, which are considerably simpler and faster than encoding an arbitrary string. While at it, also include the quotes and colon in these strings, to avoid three WriteByte calls in the loop hot path. Also added a few tests, to ensure that the behavior in these edge cases is not broken. The output of the tests is the same if this optimization is reverted. name old time/op new time/op delta CodeEncoder-4 7.12ms ± 0% 6.14ms ± 0% -13.85% (p=0.004 n=6+5) name old speed new speed delta CodeEncoder-4 272MB/s ± 0% 316MB/s ± 0% +16.08% (p=0.004 n=6+5) name old alloc/op new alloc/op delta CodeEncoder-4 91.9kB ± 0% 93.2kB ± 0% +1.43% (p=0.002 n=6+6) name old allocs/op new allocs/op delta CodeEncoder-4 0.00 0.00 ~ (all equal) Updates #5683. Change-Id: I6f6a340d0de4670799ce38cf95b2092822d2e3ef Reviewed-on: https://go-review.googlesource.com/122460 Run-TryBot: Daniel Martí <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]>

If the encoded bytes fit in the bootstrap array encodeState.scratch, use that instead of allocating a new byte slice. Also tweaked the Encoding vs Encoder heuristic to use the length of the encoded bytes, not the length of the input bytes. Encoding is used for allocations of up to 1024 bytes, as we measured 2048 to be the point where it no longer provides a noticeable advantage. Also added some benchmarks. Only the first case changes in behavior. name old time/op new time/op delta MarshalBytes/32-4 420ns ± 1% 383ns ± 1% -8.69% (p=0.002 n=6+6) MarshalBytes/256-4 913ns ± 1% 915ns ± 0% ~ (p=0.580 n=5+6) MarshalBytes/4096-4 7.72µs ± 0% 7.74µs ± 0% ~ (p=0.340 n=5+6) name old alloc/op new alloc/op delta MarshalBytes/32-4 112B ± 0% 64B ± 0% -42.86% (p=0.002 n=6+6) MarshalBytes/256-4 736B ± 0% 736B ± 0% ~ (all equal) MarshalBytes/4096-4 7.30kB ± 0% 7.30kB ± 0% ~ (all equal) name old allocs/op new allocs/op delta MarshalBytes/32-4 2.00 ± 0% 1.00 ± 0% -50.00% (p=0.002 n=6+6) MarshalBytes/256-4 2.00 ± 0% 2.00 ± 0% ~ (all equal) MarshalBytes/4096-4 2.00 ± 0% 2.00 ± 0% ~ (all equal) Updates #5683. Change-Id: I5fa55c27bd7728338d770ae7c0756885ba9a5724 Reviewed-on: https://go-review.googlesource.com/122462 Run-TryBot: Daniel Martí <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]>

Calling Name on a reflect.Type is somewhat expensive, as it involves a number of nested calls and string handling. This cost was showing up when decoding structs, as we were calling it to set up an error context. We can avoid the extra work unless we do encounter an error, which makes decoding via struct types faster. name old time/op new time/op delta CodeDecoder-4 31.0ms ± 1% 29.9ms ± 1% -3.69% (p=0.002 n=6+6) name old speed new speed delta CodeDecoder-4 62.6MB/s ± 1% 65.0MB/s ± 1% +3.83% (p=0.002 n=6+6) Updates #5683. Change-Id: I48a3a85ef0ba96f524b7c3e9096cb2c4589e077a Reviewed-on: https://go-review.googlesource.com/122467 Run-TryBot: Daniel Martí <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]>

Reuse v.Type() and cachedTypeFields(t) when decoding maps and structs. Always use the same data slices when in hot loops, to ensure that the compiler generates good code. "for i < len(data) { use(d.data[i]) }" makes it harder for the compiler. Finally, do other minor clean-ups, such as deduplicating switch cases, and using a switch instead of three chained ifs. The decoder sees a noticeable speed-up, in particular when decoding structs. name old time/op new time/op delta CodeDecoder-4 29.8ms ± 1% 27.5ms ± 0% -7.83% (p=0.002 n=6+6) name old speed new speed delta CodeDecoder-4 65.0MB/s ± 1% 70.6MB/s ± 0% +8.49% (p=0.002 n=6+6) Updates #5683. Change-Id: I9d751e22502221962da696e48996ffdeb777277d Reviewed-on: https://go-review.googlesource.com/122468 Run-TryBot: Daniel Martí <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]>

structEncoder had two slices - the list of fields, and a list containing the encoder for each field. structEncoder.encode then looped over the fields, and indexed into the second slice to grab the field encoder. However, this makes it very hard for the compiler to be able to prove that the two slices always have the same length, and that the index expression doesn't need a bounds check. Merge the two slices into one to completely remove the need for bounds checks in the hot loop. While at it, don't copy the field elements when ranging, which greatly speeds up the hot loop in structEncoder. name old time/op new time/op delta CodeEncoder-4 6.18ms ± 0% 5.56ms ± 0% -10.08% (p=0.002 n=6+6) name old speed new speed delta CodeEncoder-4 314MB/s ± 0% 349MB/s ± 0% +11.21% (p=0.002 n=6+6) name old alloc/op new alloc/op delta CodeEncoder-4 93.2kB ± 0% 62.1kB ± 0% -33.33% (p=0.002 n=6+6) Updates #5683. Change-Id: I0dd47783530f439b125e084aede09dda172eb1e8 Reviewed-on: https://go-review.googlesource.com/125416 Run-TryBot: Daniel Martí <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]>

This function was only used in a single place - in the field encoding loop within the struct encoder. Inlining the function call manually lets us get rid of the call overhead. But most importantly, it lets us simplify the logic afterward. We no longer need to use reflect.Value{} and !fv.IsValid(), as we can skip the field immediately. The two factors combined (mostly just the latter) give a moderate speed improvement to this hot loop. name old time/op new time/op delta CodeEncoder-4 6.01ms ± 1% 5.91ms ± 1% -1.66% (p=0.002 n=6+6) name old speed new speed delta CodeEncoder-4 323MB/s ± 1% 328MB/s ± 1% +1.69% (p=0.002 n=6+6) Updates #5683. Change-Id: I12757c325a68abb2856026cf719c122612a1f38e Reviewed-on: https://go-review.googlesource.com/125417 Run-TryBot: Daniel Martí <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>

gopherbot · 2018-08-25T22:57:56Z

Change https://golang.org/cl/131401 mentions this issue: encoding/json: remove a branch in the structEncoder loop

gopherbot · 2018-08-25T22:57:56Z

Change https://golang.org/cl/131400 mentions this issue: encoding/json: avoid some more pointer receivers

A few encoder struct types, such as map and slice, only encapsulate other prepared encoder funcs. Using pointer receivers has no advantage, and makes calling these methods slightly more expensive. Not a huge performance win, but certainly an easy one. The struct types used in the benchmark below contain one slice field and one pointer field. name old time/op new time/op delta CodeEncoder-4 5.48ms ± 0% 5.39ms ± 0% -1.66% (p=0.010 n=6+4) name old speed new speed delta CodeEncoder-4 354MB/s ± 0% 360MB/s ± 0% +1.69% (p=0.010 n=6+4) Updates #5683. Change-Id: I9f78dbe07fcc6fbf19a6d96c22f5d6970db9eca4 Reviewed-on: https://go-review.googlesource.com/131400 Run-TryBot: Daniel Martí <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>

Encoders like map and array can use the much cheaper "i > 0" check to see if we're not writing the first element. However, since struct fields support omitempty, we need to keep track of that separately. This is much more expensive - after calling the field encoder itself, and retrieving the field via reflection, this branch was the third most expensive piece of this field loop. Instead, hoist the branch logic outside of the loop. The code doesn't get much more complex, since we just delay the writing of each byte until the next iteration. Yet the performance improvement is noticeable, even when the struct types in CodeEncoder only have 2 and 7 fields, respectively. name old time/op new time/op delta CodeEncoder-4 5.39ms ± 0% 5.31ms ± 0% -1.37% (p=0.010 n=4+6) name old speed new speed delta CodeEncoder-4 360MB/s ± 0% 365MB/s ± 0% +1.39% (p=0.010 n=4+6) Updates #5683. Change-Id: I2662cf459e0dfd68e56fa52bc898a417e84266c2 Reviewed-on: https://go-review.googlesource.com/131401 Run-TryBot: Daniel Martí <[email protected]> Reviewed-by: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>

mvdan · 2018-08-26T23:16:38Z

I think the encoder is reasonably optimized - there is no more low-hanging fruit that I can see. It should get considerable improvements once the compiler gets better at certain things like inlining and removing nil/bounds checks.

The decoder has plenty of room to improve, though. It seems to me like the single biggest win would be #16212. The encoder already prepares the reflect work upfront and caches it globally by type.

I wonder if there's a point to leaving such a broad issue open. Though I do think that issues to improve decoding performance, such as the one mentioned above, would be useful. Perhaps even one to track the overall decoding slowness. But this issue seems too broad to be useful at this point.

bradfitz · 2018-08-27T02:29:04Z

@mvdan, can you post a benchmark summary here of Go 1.11 vs Go tip after your various changes?

I'm fine closing this and opening more specific issues. Just cross-reference them to this.

mvdan · 2018-08-27T21:06:39Z

Of course. I don't have 1.11 installed yet (I know), but barely any performance work happened in the package during the 1.11 cycle, so these 1.10 vs tip numbers should still be useful.

name           old time/op    new time/op    delta
CodeEncoder-4    7.43ms ± 0%    5.35ms ± 1%  -28.01%  (p=0.002 n=6+6)
CodeDecoder-4    30.8ms ± 1%    27.3ms ± 0%  -11.42%  (p=0.004 n=6+5)

name           old speed      new speed      delta
CodeEncoder-4   261MB/s ± 0%   363MB/s ± 1%  +38.91%  (p=0.002 n=6+6)
CodeDecoder-4  62.9MB/s ± 1%  70.9MB/s ± 1%  +12.71%  (p=0.002 n=6+6)

name           old alloc/op   new alloc/op   delta
CodeEncoder-4    91.9kB ± 0%    62.1kB ± 0%  -32.38%  (p=0.002 n=6+6)
CodeDecoder-4    2.74MB ± 0%    2.74MB ± 0%   -0.04%  (p=0.010 n=6+4)

name           old allocs/op  new allocs/op  delta
CodeEncoder-4      0.00           0.00          ~     (all equal)
CodeDecoder-4     90.3k ± 0%     77.5k ± 0%  -14.09%  (p=0.002 n=6+6)

Also note that lots of other changes happened to improve performance in earlier releases, such as @kevinburke's table lookups for the encoder. So the speedup should be much higher if we compared tip with an even older Go release.

mvdan · 2018-08-31T20:40:07Z

Closing as discussed above - discussion will continue on other json performance issues, after being cross-referenced to this one.

gopherbot added waitingforreply labels Dec 4, 2013

rsc added this to the Unplanned milestone Apr 10, 2015

rsc removed release-none labels Apr 10, 2015

mschuett mentioned this issue Sep 23, 2015

CPU usage really high DECK36/go-log2gelf#3

Open

harshavardhana mentioned this issue Jul 14, 2016

Room for upload speed optimization ? minio/minio#2208

Closed

hellerve mentioned this issue Feb 23, 2017

Birdwatcher performance is suboptimal ecix/birdwatcher#7

Open

dsnet added the Performance label Aug 23, 2018

mvdan closed this as completed Aug 31, 2018

mvdan mentioned this issue Aug 31, 2018

encoding/json: do all reflect work before decoding #16212

Open

mvdan mentioned this issue Nov 22, 2018

encoding/json: speed up the decoding scanner #28923

Closed

golang locked and limited conversation to collaborators Aug 31, 2019

gopherbot added the FrozenDueToAge label Aug 31, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

encoding/json: performance slower than expected #5683

encoding/json: performance slower than expected #5683

gopherbot commented Jun 11, 2013

rsc commented Jun 14, 2013

gopherbot commented Jun 14, 2013

davecheney commented Jun 27, 2013

gopherbot commented Jul 4, 2013

adg commented Jul 4, 2013

alberts commented Jul 5, 2013

gopherbot commented Jul 9, 2013

rsc commented Jul 9, 2013

gopherbot commented Sep 14, 2013

rsc commented Nov 27, 2013

rsc commented Dec 4, 2013

rsc commented Dec 4, 2013

gaurish commented Dec 17, 2015

bradfitz commented Dec 17, 2015

kevinburke commented Jun 27, 2016

gopherbot commented Jun 27, 2016

valyala commented May 29, 2018

as commented May 30, 2018

gopherbot commented Jul 7, 2018

gopherbot commented Jul 7, 2018

gopherbot commented Jul 7, 2018

gopherbot commented Jul 8, 2018

gopherbot commented Jul 8, 2018

gopherbot commented Jul 22, 2018

gopherbot commented Jul 22, 2018

gopherbot commented Jul 22, 2018

gopherbot commented Aug 25, 2018

gopherbot commented Aug 25, 2018

mvdan commented Aug 26, 2018

bradfitz commented Aug 27, 2018

mvdan commented Aug 27, 2018

mvdan commented Aug 31, 2018

encoding/json: performance slower than expected #5683

encoding/json: performance slower than expected #5683

Comments

gopherbot commented Jun 11, 2013

rsc commented Jun 14, 2013

gopherbot commented Jun 14, 2013

davecheney commented Jun 27, 2013

gopherbot commented Jul 4, 2013

adg commented Jul 4, 2013

alberts commented Jul 5, 2013

gopherbot commented Jul 9, 2013

rsc commented Jul 9, 2013

gopherbot commented Sep 14, 2013

rsc commented Nov 27, 2013

rsc commented Dec 4, 2013

rsc commented Dec 4, 2013

gaurish commented Dec 17, 2015

bradfitz commented Dec 17, 2015

kevinburke commented Jun 27, 2016

gopherbot commented Jun 27, 2016

valyala commented May 29, 2018

as commented May 30, 2018

gopherbot commented Jul 7, 2018

gopherbot commented Jul 7, 2018

gopherbot commented Jul 7, 2018

gopherbot commented Jul 8, 2018

gopherbot commented Jul 8, 2018

gopherbot commented Jul 22, 2018

gopherbot commented Jul 22, 2018

gopherbot commented Jul 22, 2018

gopherbot commented Aug 25, 2018

gopherbot commented Aug 25, 2018

mvdan commented Aug 26, 2018

bradfitz commented Aug 27, 2018

mvdan commented Aug 27, 2018

mvdan commented Aug 31, 2018