Optimization of fastJsonNode #1227

Closed
wants to merge 12 commits into from


tzdybal
Contributor

@tzdybal tzdybal commented Jul 21, 2017

fastJsonNode has been heavily refactored and optimized. The code is simpler, consumes less memory, and executes faster.

Changes:

  • slices used instead of maps
  • unified attrs and children containers into one (attrs), removed fastJsonAttr type
  • (min-)heap used instead of sorting
  • much simplified processing (because of single container in fastJsonNode)
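
The "heap instead of sorting" idea from the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `node`, `nodeHeap`, and `drainSorted` are hypothetical names, and ordering by attribute name is an assumption. Only `container/heap` from the standard library is used.

```go
package main

import (
	"container/heap"
	"fmt"
)

// node is a hypothetical, simplified stand-in for fastJsonNode.
type node struct {
	attr string
}

// nodeHeap is a slice-backed min-heap of nodes, ordered by attribute name
// (the ordering key is an assumption made for this sketch).
type nodeHeap []*node

func (h nodeHeap) Len() int            { return len(h) }
func (h nodeHeap) Less(i, j int) bool  { return h[i].attr < h[j].attr }
func (h nodeHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *nodeHeap) Push(x interface{}) { *h = append(*h, x.(*node)) }

// Pop removes and returns the last element; container/heap moves the
// minimum there before calling it.
func (h *nodeHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// drainSorted heapifies the slice in place, then pops every node,
// yielding attribute names in ascending order without a separate sort pass.
func drainSorted(h *nodeHeap) []string {
	heap.Init(h)
	out := make([]string, 0, h.Len())
	for h.Len() > 0 {
		out = append(out, heap.Pop(h).(*node).attr)
	}
	return out
}

func main() {
	h := &nodeHeap{{attr: "name"}, {attr: "age"}, {attr: "friend"}}
	fmt.Println(drainSorted(h)) // [age friend name]
}
```

Popping from the heap while encoding means the children are consumed in order as they are written, which matches the `heap.Pop(&fj.attrs)` call quoted in the review below.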

Performance comparison on an artificial test (committed to query/outputnode_test.go). Times were measured without the profiler. Memory was measured for the entire test execution, including all setup, with the default profiler configuration, because -memprofilerate=1 was much too slow.

| Branch | master | improve/encode_memory | Improvement |
|---|---|---|---|
| Time | 172.283s | 0.336s | > 512x |
| Total memory | 43.65GB | 283.30MB | > 157x |

In fact, the memory improvement for the encode function is much bigger, because encode doesn't allocate memory. Creating a fastJsonNode is also much cheaper in terms of memory.
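
One way to check a claim like "encode doesn't allocate" without a full memory profile is `testing.AllocsPerRun`. The sketch below is not the PR's test: `writePair` and `allocsPerEncode` are hypothetical helpers that mimic writing into a pre-grown `bytes.Buffer`, which is an assumption about how an allocation-free encoder might behave.

```go
package main

import (
	"bytes"
	"fmt"
	"testing"
)

// writePair is a hypothetical stand-in for one encoding step: it writes
// "key":value into an existing buffer without building intermediate strings.
func writePair(out *bytes.Buffer, key string, val []byte) {
	out.WriteByte('"')
	out.WriteString(key)
	out.WriteString(`":`)
	out.Write(val)
}

// allocsPerEncode reports the average number of heap allocations per call
// when the output buffer already has enough capacity.
func allocsPerEncode() float64 {
	var out bytes.Buffer
	out.Grow(1024) // allocate once, up front
	val := []byte(`"Alice"`)
	return testing.AllocsPerRun(100, func() {
		out.Reset() // keeps the underlying storage
		writePair(&out, "name", val)
	})
}

func main() {
	fmt.Println("allocations per encode:", allocsPerEncode())
}
```

A per-call allocation count is a narrower measurement than the whole-run totals in the table above, so the two are complementary rather than interchangeable.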



@manishrjain
Contributor

I don't totally understand it, but looks like a great change. I've got a few comments. Once @janardhan1993 approves this, feel free to submit.


Reviewed 3 of 3 files at r1.
Review status: all files reviewed at latest revision, 6 unresolved discussions, some commit checks broke.


query/outputnode.go, line 292 at r1 (raw file):

func makeScalarNode(attr string, isChild bool, val []byte) *fastJsonNode {
	return &fastJsonNode{attr, 0, isChild, val, nil}

Please use the field names as well to qualify the variables.


query/outputnode.go, line 296 at r1 (raw file):

func makeNestedNode(attr string, isChild bool, val *fastJsonNode) *fastJsonNode {
	return &fastJsonNode{attr, 0, isChild, nil, []*fastJsonNode{val}}

Please use the field names to qualify the variables.
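
For illustration, the two constructors could look like this with keyed composite literals. The struct definition is not shown in the review, so the field names `order` and `scalarVal` below are hypothetical, and the field types are inferred from the positional literals quoted above.

```go
package main

import "fmt"

// fastJsonNode here is a guess at the shape implied by the positional
// literals in the review; field names other than attrs are hypothetical.
type fastJsonNode struct {
	attr      string
	order     int
	isChild   bool
	scalarVal []byte
	attrs     []*fastJsonNode
}

// makeScalarNode names each field, so zero-valued fields (order, attrs)
// can simply be omitted instead of being passed as 0 and nil.
func makeScalarNode(attr string, isChild bool, val []byte) *fastJsonNode {
	return &fastJsonNode{attr: attr, isChild: isChild, scalarVal: val}
}

// makeNestedNode wraps a single child node under the given attribute.
func makeNestedNode(attr string, isChild bool, val *fastJsonNode) *fastJsonNode {
	return &fastJsonNode{attr: attr, isChild: isChild, attrs: []*fastJsonNode{val}}
}

func main() {
	leaf := makeScalarNode("name", false, []byte(`"Alice"`))
	parent := makeNestedNode("friend", true, leaf)
	fmt.Println(parent.attr, len(parent.attrs), string(parent.attrs[0].scalarVal))
}
```

Keyed literals keep compiling correctly if fields are later reordered or added, which is the usual motivation for this kind of review comment.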


query/outputnode.go, line 427 at r1 (raw file):

	if fj.attrs.Len() > 0 {
		out.WriteRune('{')
		curr := heap.Pop(&fj.attrs).(*fastJsonNode)

s/curr/cur


query/outputnode.go, line 550 at r1 (raw file):

		// Merging with parent.
		parentSlice = merge(parentSlice, childSlice)
		//} else {

Remove?


query/outputnode_test.go, line 34 at r1 (raw file):

func makeFastJsonNode() *fastJsonNode {
	return &fastJsonNode{
	// uncoment for master branch

remove?


query/outputnode_test.go, line 41 at r1 (raw file):

func TestEncodeMemory(t *testing.T) {

Remove blank line.


Comments from Reviewable

@tzdybal
Contributor Author

tzdybal commented Jul 24, 2017

Review status: all files reviewed at latest revision, 6 unresolved discussions.


query/outputnode.go, line 292 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Please use the field names as well to qualify the variables.

Done.


query/outputnode.go, line 296 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Please use the field names to qualify the variables.

Done.


query/outputnode.go, line 427 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

s/curr/cur

Done.


query/outputnode.go, line 550 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Remove?

Done.


query/outputnode_test.go, line 34 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

remove?

Done.


query/outputnode_test.go, line 41 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Remove blank line.

Done.



@janardhan1993
Contributor

:lgtm:


Reviewed 2 of 3 files at r2.
Review status: 1 of 2 files reviewed at latest revision, 7 unresolved discussions.


query/outputnode.go, line 350 at r2 (raw file):

func (fj *fastJsonNode) SetUID(uid uint64, attr string) {
	fj.attrs = append(fj.attrs, makeScalarNode(attr, false, []byte(fmt.Sprintf("\"%#x\"", uid))))

I am not that familiar with this code, but if addlistchild, addmapchild or setuid are called twice, would we get the same output twice?
If we are running in debug mode and the user has also requested uid, SetUID would be called twice.



@tzdybal
Contributor Author

tzdybal commented Jul 24, 2017

Review status: 1 of 4 files reviewed at latest revision, 7 unresolved discussions.


query/outputnode.go, line 350 at r2 (raw file):

Previously, janardhan1993 (Janardhan Reddy) wrote…

I am not that familiar with this code, but if addlistchild, addmapchild or setuid are called twice, would we get the same output twice?
If we are running in debug mode and the user has also requested uid, SetUID would be called twice.

Added special handling in the SetUID function (with a test covering this behaviour). Cases where the user asks for the same predicate multiple times are handled earlier in processing.
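
The actual special handling is not shown in this thread. As a rough sketch of one way to make SetUID idempotent, the version below skips the append when an attribute with the same name already exists. The linear scan, the trimmed-down struct, and the field name `scalarVal` are all assumptions made for illustration.

```go
package main

import "fmt"

// fastJsonNode is reduced to the fields this sketch needs;
// scalarVal is a hypothetical field name.
type fastJsonNode struct {
	attr      string
	scalarVal []byte
	attrs     []*fastJsonNode
}

// SetUID adds the uid attribute once. A second call with the same attribute
// name (for example debug mode plus an explicit uid request) is a no-op.
// The duplicate check via linear scan is an assumption, not the PR's code.
func (fj *fastJsonNode) SetUID(uid uint64, attr string) {
	for _, a := range fj.attrs {
		if a.attr == attr {
			return
		}
	}
	fj.attrs = append(fj.attrs, &fastJsonNode{
		attr:      attr,
		scalarVal: []byte(fmt.Sprintf("\"%#x\"", uid)),
	})
}

func main() {
	fj := &fastJsonNode{}
	fj.SetUID(1, "_uid_")
	fj.SetUID(1, "_uid_") // ignored: attribute already present
	fmt.Println(len(fj.attrs), string(fj.attrs[0].scalarVal)) // 1 "0x1"
}
```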



@tzdybal tzdybal closed this in 5efbe7d Jul 24, 2017
@pawanrawal pawanrawal deleted the improve/encode_memory branch December 19, 2017 08:42
jarifibrahim pushed a commit that referenced this pull request Mar 16, 2020
Important changes
```
 - Changes to overlap check in compaction.
 - Remove 'this entry should've been caught' log.
 - Changes to write stalling on levels 0 and 1.
 - Compression is disabled by default in Badger.
 - Bloom filter caching in a separate ristretto cache.
 - Compression/Encryption in background.
 - Disable cache by default in badger.
```

The following new changes are being added from badger
`git log ab4352b00a17...91c31ebe8c22`

```
91c31eb Disable cache by default (#1257)
eaf64c0 Add separate cache for bloom filters (#1260)
1bcbefc Add BypassDirLock option (#1243)
c6c1e5e Add support for watching nil prefix in subscribe API (#1246)
b13b927 Compress/Encrypt Blocks in the background (#1227)
bdb2b13 fix changelog for v2.0.2 (#1244)
8dbc982 Add Dkron to README (#1241)
3d95b94 Remove coveralls from Travis Build(#1219)
5b4c0a6 Fix ValueThreshold for in-memory mode (#1235)
617ed7c Initialize vlog before starting compactions in db.Open (#1226)
e908818 Update CHANGELOG for Badger 2.0.2 release. (#1230)
bce069c Fix int overflow for 32bit (#1216)
e029e93 Remove ExampleDB_Subscribe Test (#1214)
8734e3a Add missing package to README for badger.NewEntry (#1223)
78d405a Replace t.Fatal with require.NoError in tests (#1213)
c51748e Fix flaky TestPageBufferReader2 test (#1210)
eee1602 Change else-if statements to idiomatic switch statements. (#1207)
3e25d77 Rework concurrency semantics of valueLog.maxFid (#1184) (#1187)
4676ca9 Add support for caching bloomfilters (#1204)
c3333a5 Disable compression and set ZSTD Compression Level to 1 (#1191)
0acb3f6 Fix L0/L1 stall test (#1201)
7e5a956 Support disabling the cache completely. (#1183) (#1185)
82381ac Update ristretto to version  8f368f2 (#1195)
3747be5 Improve write stalling on level 0 and 1
5870b7b Run all tests on CI (#1189)
01a00cb Add Jaegar to list of projects (#1192)
9d6512b Use fastRand instead of locked-rand in skiplist (#1173)
2698bfc Avoid sync in inmemory mode (#1190)
2a90c66 Remove the 'this entry should've caught' log from value.go (#1170)
0a06173 Fix checkOverlap in compaction (#1166)
0f2e629 Fix windows build (#1177)
03af216 Fix commit sha for WithInMemory in CHANGELOG. (#1172)
23a73cd Update CHANGELOG for v2.0.1 release. (#1181)
465f28a Cast sz to uint32 to fix compilation on 32 bit (#1175)
ea01d38 Rename option builder from WithInmemory to WithInMemory. (#1169)
df99253 Remove ErrGCInMemoryMode in CHANGELOG. (#1171)
8dfdd6d Adding changes for 2.0.1 so far (#1168)
```
jarifibrahim pushed a commit that referenced this pull request Jul 11, 2020
This commit brings the following new changes from badger.
This commit also disables conflict detection in badger to save memory.

```
0dfb8b4 Changelog for v20.07.0 (#1411)
03ba278 Add missing changelog for v2.0.3 (#1410)
6001230 Revert "Compress/Encrypt Blocks in the background (#1227)" (#1409)
800305e Revert "Buffer pool for decompression (#1308)" (#1408)
63d9309 Revert "fix: Fix race condition in block.incRef (#1337)" (#1407)
e0d058c Revert "add assert to check integer overflow for table size (#1402)" (#1406)
d981f47 return error if the vlog writes exceeds more that 4GB. (#1400)
7f4e4b5 add assert to check integer overflow for table size (#1402)
8e896a7 Add a contribution guide (#1379)
b79aeef Avoid panic on multiple closer.Signal calls (#1401)
717b89c Enable cross-compiled 32bit tests on TravisCI (#1392)
09dfa66 Update ristretto to commit f66de99 (#1391)
509de73 Update head while replaying value log (#1372)
e013bfd Rework DB.DropPrefix (#1381)
3042e37 pre allocate cache key for the block cache and the bloom filter cache (#1371)
675efcd Increase default valueThreshold from 32B to 1KB (#1346)
158d927 Remove second initialization of writech in Open (#1382)
d37ce36 Tests: Use t.Parallel in TestIteratePrefix tests  (#1377)
3f4761d Force KeepL0InMemory to be true when InMemory is true (#1375)
dd332b0 Avoid panic in filltables() (#1365)
c45d966 Fix assert in background compression and encryption. (#1366)
```