Write providers to disk to avoid memory leaks #2860

whyrusleeping · 2016-06-16T23:29:42Z

This is a rough first hack on this. It works as intended, but I'm not happy with the actual way of putting the information on disk.

Currently, it puts the list of peers providing a given key to the datastore at /providers/<KEY>. This means that every time a provider is added, we marshal all the providers and write them to the datastore.

License: MIT
Signed-off-by: Jeromy [email protected]

whyrusleeping · 2016-06-16T23:31:45Z

I'm thinking it might be best to just write /providers/<KEY>/<peerid> for each provider entry instead of storing the whole set at once
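
A minimal sketch of the per-entry layout proposed above, not the code from this PR. It assumes a plain in-memory map (`memStore`, a hypothetical stand-in for the datastore) and hypothetical helper names (`provEntryKey`, `addProvider`, `providersFor`); keys and peer IDs are assumed to be already encoded so that they contain no '/'.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// memStore is a hypothetical stand-in for a key-value datastore.
type memStore map[string][]byte

const providersKeyPrefix = "/providers/"

// provEntryKey builds /providers/<KEY>/<peerid>. Both parts are assumed to be
// already encoded so that neither contains a '/'.
func provEntryKey(encKey, encPeer string) string {
	return providersKeyPrefix + encKey + "/" + encPeer
}

// addProvider writes one small entry. The existing provider set for the key
// is never read back or re-marshaled.
func addProvider(s memStore, encKey, encPeer string) {
	s[provEntryKey(encKey, encPeer)] = []byte{} // the value could hold a timestamp
}

// providersFor lists the peers recorded for encKey by scanning the key prefix.
func providersFor(s memStore, encKey string) []string {
	prefix := providersKeyPrefix + encKey + "/"
	var peers []string
	for k := range s {
		if strings.HasPrefix(k, prefix) {
			peers = append(peers, strings.TrimPrefix(k, prefix))
		}
	}
	sort.Strings(peers)
	return peers
}

func main() {
	s := memStore{}
	addProvider(s, "keyA", "peer1")
	addProvider(s, "keyA", "peer2")
	addProvider(s, "keyB", "peer3")
	fmt.Println(providersFor(s, "keyA")) // [peer1 peer2]
}
```

The trade-off in this layout is that adding a provider becomes a single small write, while reading the set for a key becomes a prefix scan instead of a single get.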

whyrusleeping · 2016-06-17T03:51:14Z

@lgierth @Kubuxu @kevina If you guys have thoughts or CR for this, that would be much appreciated

whyrusleeping · 2016-06-20T23:19:33Z

This also has the super cool advantage of persisting provider information across node reboots

whyrusleeping · 2016-06-21T01:41:12Z

@Kubuxu @jbenet @kevina I'm converting keys and peerIDs to hex before passing them to the datastore NewKey constructor. This is a workaround for the go-datastore bug #2601

Before merging this, I think we might want to decide if this is the way we want to do this moving forward, or if we should attempt a fix at the go-datastore level (allowing us to skip the hex encoding)

kevina · 2016-06-21T01:56:40Z

@whyrusleeping #2601 is causing some problems for me in the filestore. In particular, the filestore maintenance commands report the mangled hash, which at best confuses users and can also cause problems when trying to access the block outside of the filestore. I think we should work to fix this once and for all. I imagine any fix will likely introduce a repo change. Let's take this discussion over to #2601.

kevina · 2016-06-21T06:08:33Z

routing/dht/providers/providers.go

@@ -81,7 +81,7 @@ func NewProviderManager(ctx context.Context, local peer.ID, dstore ds.Datastore)
 const providersKeyPrefix = "/providers/"

 func mkProvKey(k key.Key) ds.Key {
-	return ds.NewKey(providersKeyPrefix + hex.EncodeToString([]byte(k)))
+	return ds.NewKey(providersKeyPrefix + base64.StdEncoding.EncodeToString([]byte(k)))
 }


@whyrusleeping I would use RawURLEncoding. (1) StdEncoding includes '/' in its alphabet (see https://tools.ietf.org/html/rfc4648), while URLEncoding does not (hence its name). (2) The raw form does not include the unnecessary '=' padding character.
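
To illustrate the difference described above, here is a small sketch using only Go's standard `encoding/base64` package. The helper names `encodeStd` and `encodeKeyPart` are hypothetical, and the sample bytes were picked so that the output hits the last two symbols of the alphabet and needs padding.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodeStd uses the standard alphabet, which contains '+' and '/' and pads
// the output with '='.
func encodeStd(b []byte) string {
	return base64.StdEncoding.EncodeToString(b)
}

// encodeKeyPart uses the URL-safe alphabet ('-' and '_') without padding, so
// the result never contains '/' or '='.
func encodeKeyPart(b []byte) string {
	return base64.RawURLEncoding.EncodeToString(b)
}

func main() {
	raw := []byte{0xfb, 0xff, 0xfe, 0x01}
	fmt.Println(encodeStd(raw))     // +//+AQ==
	fmt.Println(encodeKeyPart(raw)) // -__-AQ
}
```

With the standard alphabet, the '/' characters in the first output would read as extra path separators once appended to a prefix such as /providers/.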

whyrusleeping · 2016-06-22T19:59:44Z

@kevina @Kubuxu could I get you guys to review this PR? Would be very helpful

Kubuxu · 2016-06-22T21:39:58Z

SGTM but I would like one more pair of eyes on it.

kevina · 2016-06-22T22:07:51Z

@whyrusleeping there is a failed test, should we worry about it?

Kubuxu · 2016-06-22T22:19:34Z

@kevina no it isn't connected

Kubuxu · 2016-06-22T22:22:24Z

@whyrusleeping t0060-daemon.sh compares daemon output in tests. It needs fixing.

whyrusleeping · 2016-06-23T00:20:28Z

@Kubuxu that's fixed in #2891

kevina · 2016-06-23T05:19:17Z

routing/dht/providers/providers.go

+		KeysOnly: true,
+		Prefix:   providersKeyPrefix,
+	})
+


@whyrusleeping, using dstore.Query will work, but it is very slow. The Query mechanism has a lot of overhead: the data is sent to you via an unbuffered channel from a separate goroutine. In my own filestore code I was able to get a modest speedup (2 to 3x) by increasing the channel buffer size used by the query, but it was still slow. By querying the leveldb directly I was able to get a 10-12x speedup when doing a filestore ls. (See ipfs-filestore#10) I do not know how badly it will hurt performance here, but it is something to keep in mind if you notice a slowdown after the code is deployed.

whyrusleeping · 2016-06-23

@kevina even with KeysOnly set to true? Now that you mention this, I think I'll throw in some timers.

kevina · 2016-06-23

@whyrusleeping Yes, in fact that is how I was using Query.

Kubuxu · 2016-06-23

@whyrusleeping would it be possible to improve the perf of that call? I depend on it in the bloom PR (with AllKeysChan); better perf on it means a shorter bloom build time.

kevina · 2016-06-23

As a quick hack you can try increasing the channel buffer size here: https://github.com/ipfs/go-datastore/blob/master/query/query.go#L185.

kevina · 2016-06-23

A more long-term solution might be to rewrite the query interface to use a direct iterator rather than goroutines; however, disk IO may nullify this benefit.

See https://ewencp.org/blog/golang-iterators/ for some performance comparisons.

whyrusleeping · 2016-06-26

Yeah, using a channel as an iterator sucks. If one of you wants to work on improving the perf of query that would be great.

We could change the interface to not use a channel, and have it instead just return the next value directly. Then on top of that we could provide a method for turning the direct query result into a buffered channel for use cases that need it.
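
The shape suggested above could look roughly like the following sketch. It is not the go-datastore API: `Entry`, `Results`, `sliceResults`, and `ToChan` are hypothetical names, and a slice stands in for whatever the backend would actually iterate over.

```go
package main

import "fmt"

// Entry is a hypothetical query result carrying only a key.
type Entry struct{ Key string }

// Results is a hypothetical direct iterator: Next returns the next value
// itself, and ok is false once the results are exhausted.
type Results interface {
	Next() (e Entry, ok bool)
}

// sliceResults is a toy backend that iterates over an in-memory slice.
type sliceResults struct {
	entries []Entry
	pos     int
}

func (r *sliceResults) Next() (Entry, bool) {
	if r.pos >= len(r.entries) {
		return Entry{}, false
	}
	e := r.entries[r.pos]
	r.pos++
	return e, true
}

// ToChan adapts a direct iterator to a buffered channel for callers that
// want one. The channel is closed when the iterator is exhausted.
func ToChan(r Results, buf int) <-chan Entry {
	out := make(chan Entry, buf)
	go func() {
		defer close(out)
		for {
			e, ok := r.Next()
			if !ok {
				return
			}
			out <- e
		}
	}()
	return out
}

func main() {
	keys := []Entry{{Key: "/providers/a"}, {Key: "/providers/b"}}

	// Direct iteration: no goroutine and no channel involved.
	direct := &sliceResults{entries: keys}
	for e, ok := direct.Next(); ok; e, ok = direct.Next() {
		fmt.Println("direct:", e.Key)
	}

	// Channel-based iteration for callers that need it.
	for e := range ToChan(&sliceResults{entries: keys}, 16) {
		fmt.Println("chan:", e.Key)
	}
}
```

In this sketch the adapter's goroutine blocks forever if the consumer stops reading early, so a real version would likely also need a cancellation mechanism.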

jbenet · 2016-08-27

👍 for not using channels there

jbenet · 2016-08-27

Actually, you can avoid using a query entirely if you store the records as ipfs objects properly, and only use leveldb to store key -> hash_of_providers_object, similar to how the pinset is stored

kevina · 2016-06-23T20:05:11Z

SGTM. As far as I can tell I don't see any real problems.


whyrusleeping · 2016-06-28T17:23:49Z

Choo Choo!


whyrusleeping force-pushed the feat/provide-storage branch from d3a2034 to fd96e0c Compare June 17, 2016 01:27

kevina reviewed Jun 21, 2016

whyrusleeping added this to the Ipfs 0.4.3 milestone Jun 21, 2016

kevina reviewed Jun 23, 2016

kevina mentioned this pull request Jun 23, 2016

remove hex encoding from flatfs ipfs/go-datastore#39

Merged

Write providers to disk to avoid memory leaks

8f91069

License: MIT Signed-off-by: Jeromy <[email protected]>

whyrusleeping force-pushed the feat/provide-storage branch from 86540b6 to 8f91069 Compare June 25, 2016 16:54

providers test with multiple peers

959fe64

License: MIT Signed-off-by: Jeromy <[email protected]>

Kubuxu added the RFM label Jun 26, 2016

use no padding encoding

d489f82

License: MIT Signed-off-by: Jeromy <[email protected]>

whyrusleeping merged commit 5592144 into master Jun 28, 2016

whyrusleeping deleted the feat/provide-storage branch June 28, 2016 17:23

whyrusleeping mentioned this pull request Jun 28, 2016

Providers memory leak #2750

Closed

ghost pushed a commit that referenced this pull request Jul 28, 2016

Revert "Merge pull request #2860 from ipfs/feat/provide-storage"

9c4c4cb

This reverts commit 5592144, reversing changes made to 3b2993d. License: MIT Signed-off-by: Lars Gierth <[email protected]>
