IPFS consumes a large amount of network traffic #2917

Closed
cminnoy opened this issue Jun 28, 2016 · 28 comments
Labels
help wanted (Seeking public contribution on this issue), kind/bug (A bug in existing code, including security flaws), status/deferred (Conscious decision to pause or backlog), topic/dht (Topic dht)

Comments

@cminnoy commented Jun 28, 2016

Ubuntu 16.04, Intel x64
ipfs version 0.4.1

Type: bug
Area: DHT?
Priority: high

Description:

Hi,

I'm running two instances of IPFS (on two different machines), each connected to a different ISP.
For a while now I have noticed very high network traffic from those nodes. They run only IPFS,
so there is no network traffic from those machines other than IPFS traffic.
Last week I didn't use IPFS on those machines, but IPFS was still running.
During that one week, the first node consumed 58 gigabytes of network traffic, and the second node
59.9 gigabytes. This traffic was generated entirely by IPFS while IPFS was not being used actively.
Surely this must be a serious bug. When I stop the IPFS daemon service, the network traffic consumption stops immediately.

Cheers,

Chris

@Kubuxu (Member) commented Jun 28, 2016

0.4.1 is a very old version; please try a build from the current master branch, as it contains many performance improvements, though we are still working on it.

@gsf commented Jun 28, 2016

Same behavior on my nodes as well, running on master.

@Stebalien (Member)

If you run a public facing node, it will serve up all blocks it has downloaded on request. Unfortunately, even if you haven't downloaded anything, you'll still be constantly bombarded with wantlists and download requests.

@Kubuxu (Member) commented Jun 30, 2016

Not really download requests, but general chatter (DHT upkeep, which we try to balance) and wantlists, which we will try to optimize. Reducing passive traffic is on our list, and this version includes a few improvements, for example #2817, but the network needs to adopt it before you will see a reduction in traffic.

@Stebalien (Member) commented Jun 30, 2016

It would also be nice if ipfs stats bw gave an accurate bandwidth estimate. For me, actual (passive) bandwidth usage is usually over an order of magnitude greater than that reported by ipfs. Unfortunately, this is probably hard to implement.
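
As a rough cross-check on a Linux host, one way to see how far the ipfs-reported figures drift from reality is to compare them with the kernel's own interface counters. Below is a minimal sketch, not part of ipfs: it assumes Linux, an interface named eth0, and that the IPFS node is the only significant traffic source on that interface (as in the setups described in this issue). Run it alongside ipfs stats bw over the same window and compare the two rates.

// Rough cross-check of IPFS-reported bandwidth against kernel interface counters.
// Assumptions: Linux host, interface "eth0", and IPFS is essentially the only
// traffic source on that interface.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
	"time"
)

// readCounters returns cumulative received/transmitted bytes for iface from /proc/net/dev.
func readCounters(iface string) (rx, tx uint64, err error) {
	f, err := os.Open("/proc/net/dev")
	if err != nil {
		return 0, 0, err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, iface+":") {
			continue
		}
		fields := strings.Fields(strings.TrimPrefix(line, iface+":"))
		rx, _ = strconv.ParseUint(fields[0], 10, 64) // bytes received
		tx, _ = strconv.ParseUint(fields[8], 10, 64) // bytes transmitted
		return rx, tx, nil
	}
	return 0, 0, fmt.Errorf("interface %s not found", iface)
}

func main() {
	rx1, tx1, err := readCounters("eth0")
	if err != nil {
		panic(err)
	}
	time.Sleep(60 * time.Second) // sample over one minute
	rx2, tx2, _ := readCounters("eth0")
	fmt.Printf("os-level: in %.1f kB/s, out %.1f kB/s (compare with `ipfs stats bw`)\n",
		float64(rx2-rx1)/60/1000, float64(tx2-tx1)/60/1000)
}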

@cminnoy (Author) commented Jul 28, 2016

Checked v0.4.3-dev yesterday/today. In 24 hours the DHT consumed 6.2 gigabytes of network traffic (split roughly evenly between received and sent). Hardly a good figure.

@whyrusleeping added the kind/bug and topic/dht labels Jul 28, 2016
@pchiusano commented Jul 28, 2016

Great that this is being worked on... this issue is super important for anyone thinking of running IPFS nodes in the cloud where they will have to pay for bandwidth.

Can you give any sense of what the ideal, expected amount of bandwidth usage would be, assuming that:

  • You aren't requesting files from anyone
  • You have an initially totally empty IPFS node running. Thus there aren't any files you are uniquely holding.

Should it be 0... very close to 0? Or does it depend (on what, exactly)? The system feels like a black box at the moment. I have no idea whether this is just a simple bug to fix or something inherently problematic with the algorithms being used.

I don't really know what I'm talking about :) but it seems like DHT upkeep should be minimal bandwidth, unless other nodes are redundantly polling you. Likewise wantlist traffic seems like it should be low unless there is massive flux in the set of files stored network-wide and/or massive flux in demand for files.

Another general comment / feature request: provide a setting for the amount of bandwidth you want to allocate to IPFS, and have routing, etc., make use of this information somehow.

@whyrusleeping (Member)

@pchiusano Ideally, bandwidth usage would be quite low, but configurable depending on how helpful to the network you want your node to be. On average, I think hitting < 50kbps is a good 'low' goal. DHTs are very chatty, and the DHT employed by ipfs is even more talkative than, for example, BitTorrent's mainline DHT, due to the way we do content routing.

In the short term, we're looking into implementing hard limits on outgoing and incoming bandwidth. This has the potential to cause severe slowdowns, but should keep ipfs running without destroying low bandwidth connections.

On the still-short-but-slightly-longer term we are looking at different options for content routing. That portion of the dht could be swapped out by a 'tracker' or 'federated dht' of supernodes to reduce the bandwidth consumption on any node choosing to use that system. This obviously impairs the decentralization of the system, but will be a good option in many cases. Even with this system, you will still be able to fall back to searching through the dht if you want to.

In the longer-ish term, we're hoping to implement more advanced routing systems, combining 'supernodes' with more exciting DHT algorithms (see Coral's clustered DHTs), as well as optimizing many aspects of how the content routing system works.

If any of this is interesting to anyone, I highly encourage you to get involved and help out. There's always plenty to do :)

@pchiusano

Thanks for the detailed reply. Do you have a formal description of the DHT algorithm being used? And to what extent can that be changed while still being IPFS?

@whyrusleeping (Member)

@pchiusano We don't have a formal writeup of the logic for our current DHT implementation, but it is an implementation of Kademlia with peer routing, a direct value storage layer (for ipns records and public keys), as well as an indirect content storage layer (for storing records of who has which objects). Our K-value, as per Kademlia, is 20.

The majority of the bandwidth problem is the sheer number of provider records we store. Each provider record needs to be stored on the K closest peers to the block it references, which means lots of DHT crawling during large adds. We have a few different ideas for improving this, such as batching outgoing provider storage (to save on the number of RPCs required) and integrating supernodes into the logic to short-circuit certain routines.
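
For illustration only (this is not the go-ipfs code): in Kademlia, "closest" means smallest XOR distance between the block's key and each peer's ID, so a provider record goes to the K peers that minimize that distance. A toy sketch with hypothetical 32-bit IDs; real IPFS keys and peer IDs are 256-bit hashes.

// Toy illustration of Kademlia-style "K closest peers" selection by XOR distance.
// Hypothetical 32-bit IDs for brevity; this is not the go-ipfs implementation.
package main

import (
	"fmt"
	"sort"
)

const k = 20 // the K-value mentioned above

// closest returns up to k peer IDs with the smallest XOR distance to key.
func closest(key uint32, peers []uint32) []uint32 {
	sorted := append([]uint32(nil), peers...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i]^key < sorted[j]^key
	})
	if len(sorted) > k {
		sorted = sorted[:k]
	}
	return sorted
}

func main() {
	peers := []uint32{0x1A2B, 0x90F3, 0x1A00, 0x7777, 0xFFFF}
	blockKey := uint32(0x1A11)
	// With fewer than K peers all of them are returned, ordered by XOR distance;
	// the provider record for blockKey would be stored on these peers.
	fmt.Printf("%#x\n", closest(blockKey, peers))
}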

@Stebalien (Member)

(theoretically, not all nodes need to run a full DHT).

@jbenet (Member) commented Aug 4, 2016

  • not all nodes need a dht
  • resource constraints are much needed, and coming. please help us make them!
  • can even resource constrain per protocol (keeping dht serving low, for example)
  • definitely want to spend way less bw.

We should experiment with some "client only" dht nodes relatively soon. Will mean upgrading the protocol, as some expectations would need to change.

@ulrichard

My boss came to me today and told me that my machine generated a lot of traffic. I found out that it was IPFS. To me that was totally unexpected. Throttling would be highly desirable.

@d10r commented Oct 8, 2016

My idle ipfs node now generates >2GB traffic per hour (I know because that's the default threshold for warning emails at Hetzner).

At the moment, the connected peer count is >300. Would it help to limit the number of connected peers?

@whyrusleeping (Member)

@d10r what does ipfs stats bw --proto=/ipfs/dht report?
Other protocol options to try checking are:

  • /ipfs/kad/1.0.0 <- new dht protocol, should be more efficient
  • /ipfs/bitswap
  • /ipfs/bitswap/1.0.0 <- newer clients

The older /ipfs/dht protocol is still the most widely deployed, but as the network migrates to 0.4.3 (and 0.4.4, which is currently 'master') the bandwidth consumption there should go down.

If you're running a recent build from source, you can disable the dht (almost) entirely by running your daemon with ipfs daemon --routing=dhtclient. This feature is still very experimental, but it should help. Let me know if you try it and notice any odd behaviour.
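
For anyone who wants to watch these per-protocol numbers continuously instead of re-running the command, here is a small sketch that polls the daemon's HTTP API. It assumes the daemon's API is at the default 127.0.0.1:5001 and uses the /api/v0/stats/bw endpoint, the HTTP counterpart of ipfs stats bw --proto=...; treat it as a sketch, not a supported tool.

// Poll per-protocol bandwidth via the go-ipfs HTTP API and print a summary.
// Assumption: daemon API on the default 127.0.0.1:5001, /api/v0/stats/bw endpoint.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

type bwStats struct {
	TotalIn  int64
	TotalOut int64
	RateIn   float64
	RateOut  float64
}

func main() {
	protos := []string{"/ipfs/dht", "/ipfs/kad/1.0.0", "/ipfs/bitswap", "/ipfs/bitswap/1.0.0"}
	for _, p := range protos {
		endpoint := "http://127.0.0.1:5001/api/v0/stats/bw?proto=" + url.QueryEscape(p)
		resp, err := http.Post(endpoint, "application/json", nil)
		if err != nil {
			fmt.Println(p, "error:", err)
			continue
		}
		var s bwStats
		if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
			fmt.Println(p, "decode error:", err)
		} else {
			fmt.Printf("%-22s in=%d B out=%d B rate_in=%.0f B/s rate_out=%.0f B/s\n",
				p, s.TotalIn, s.TotalOut, s.RateIn, s.RateOut)
		}
		resp.Body.Close()
	}
}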

@Kubuxu (Member) commented Oct 9, 2016

The old dht protocol will have a lot more overhead that won't be included in the stats (I would estimate it at about 20-30%).

@gsf commented Oct 10, 2016

Here's a report from my mostly idle node that's been running on master for a couple of days. Thanks for the continued efforts to constrain resource use!

# ipfs stats bw --proto=/ipfs/dht
Bandwidth
TotalIn: 718 MB
TotalOut: 2.2 GB
RateIn: 53 B/s
RateOut: 1.6 kB/s
# ipfs stats bw --proto=/ipfs/kad/1.0.0
Bandwidth
TotalIn: 58 MB
TotalOut: 140 MB
RateIn: 78 B/s
RateOut: 1.5 kB/s
# ipfs stats bw --proto=/ipfs/bitswap
Bandwidth
TotalIn: 9.2 GB
TotalOut: 78 kB
RateIn: 36 kB/s
RateOut: 0 B/s
# ipfs stats bw --proto=/ipfs/bitswap/1.0.0
Bandwidth
TotalIn: 9.7 MB
TotalOut: 357 B
RateIn: 0 B/s
RateOut: 0 B/s
# ps aux | grep ipf[s]
root     12010 11.8 78.9 1351400 399524 pts/2  Sl+  Oct08 443:20 ipfs daemon
# ipfs --version
ipfs version 0.4.4-dev

@d10r commented Oct 13, 2016

My stats now for ~48h, ipfs 0.4.3:

ipfs@ipfs:~$ ipfs stats bw --proto=/ipfs/dht
Bandwidth
TotalIn: 5.1 GB
TotalOut: 5.4 GB
RateIn: 14 kB/s
RateOut: 1.8 kB/s
ipfs@ipfs:~$ ipfs stats bw --proto=/ipfs/bitswap
Bandwidth
TotalIn: 7.1 GB
TotalOut: 137 MB
RateIn: 2.4 kB/s
RateOut: 0 B/s

It's now much lower than before, probably because I've set swarm address filters for non-public IP blocks after having been warned by my hosting provider (same as #1226).

@whyrusleeping (Member)

The bitswap numbers are surprisingly high. How do the bitswap TotalIn numbers compare to the amount of data you've downloaded with ipfs?

@d10r commented Oct 14, 2016

I haven't downloaded anything during that time.
New stats:

Bandwidth
TotalIn: 11 GB
TotalOut: 208 MB
RateIn: 3.0 kB/s
RateOut: 0 B/s

Possibly related: a few weeks ago I tried https://github.com/davidar/ipfs-maps, which resulted in the import of some tile files to my ipfs node.
So there is content on the node which could in theory be fetched by others. I don't believe, however, that this is the case. How can I check?

@gsf commented Oct 14, 2016

Similar stats for my node, restarted a few days ago. Nothing downloaded. The only thing pinned is the quick-start directory (QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG).

# ipfs stats bw --proto=/ipfs/bitswap
Bandwidth
TotalIn: 13 GB
TotalOut: 481 kB
RateIn: 3.6 kB/s
RateOut: 0 B/s

@Kubuxu added the status/ready label Nov 28, 2016
@whyrusleeping (Member) commented Sep 3, 2017

This should be better in the upcoming 0.4.11 release. Bitswap sessions reduce the number of wantlist entries that bitswap broadcasts to everyone.
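
To make the idea concrete, here is a toy sketch of the concept only (not the actual go-ipfs bitswap code): instead of broadcasting every want to every connected peer, a session remembers which peers have already supplied blocks for a request and sends later wants only to those peers.

// Toy sketch of the "session" idea: direct wants at peers that already answered,
// instead of broadcasting every want to every connected peer. Not the go-ipfs code.
package main

import "fmt"

type session struct {
	allPeers    []string
	activePeers map[string]bool // peers that have returned blocks for this session
}

// wantTargets returns the peers a want for cid would be sent to.
func (s *session) wantTargets(cid string) []string {
	if len(s.activePeers) == 0 {
		return s.allPeers // nothing known yet: fall back to broad discovery
	}
	var out []string
	for p := range s.activePeers {
		out = append(out, p)
	}
	return out
}

// blockReceived records that peer answered, shrinking future broadcasts.
func (s *session) blockReceived(peer string) {
	s.activePeers[peer] = true
}

func main() {
	s := &session{
		allPeers:    []string{"peerA", "peerB", "peerC", "peerD"},
		activePeers: map[string]bool{},
	}
	fmt.Println(len(s.wantTargets("Qm...")), "peers before any block arrives")
	s.blockReceived("peerB")
	fmt.Println(len(s.wantTargets("Qm...")), "peer after peerB answers")
}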

@chilarai

It still consumes a lot of bandwidth. When will the 0.4.11 be released?

@bogdanbiv

@chilarai I will also be retesting on 0.4.13 (this is more of a self-reminder).

@Stebalien (Member)

With IPFS 0.4.13 and dhtclient off, I've been seeing about 5 KiB/s (an overestimate) of background traffic. That sums to roughly 3 GiB of traffic per week (5 KiB/s × 604,800 s ≈ 2.9 GiB).

@chilarai commented Dec 3, 2017

Thanks @bogdanbiv @Stebalien

@Stebalien added the status/deferred label and removed the status/ready label Dec 18, 2018
@eingenito (Contributor)

Closing this issue as old/fixed. Please reopen if you are still seeing the same behavior.

@Stebalien (Member)

It's still bad but yeah, this issue isn't directly actionable.
