IPFS consumes a large amount of network traffic #2917
0.4.1 is a very old version; please try using a build from the current master.
Same behavior on my nodes as well, running on master.
If you run a public-facing node, it will serve up all blocks it has downloaded on request. Unfortunately, even if you haven't downloaded anything, you'll still be constantly bombarded with wantlists and download requests.
Not really download requests, but general chatter (DHT upkeep, which we try to balance) and wantlists, which we will try to optimize. Reducing passive traffic is on our list, and this version includes a few improvements, for example #2817, but the network needs to adopt it before you will see a reduction in traffic.
It would also be nice if
Checked v0.4.3-dev yesterday/today. In 24 hours the DHT consumed 6.2 gigabytes of network traffic (split roughly evenly between receive and transmit). Hardly a good figure.
Great that this is being worked on... this issue is super important for anyone thinking of running IPFS nodes in the cloud, where they will have to pay for bandwidth. Can you give any sort of sense of what the ideal, expected amount of bandwidth usage would be, assuming that:

Should it be 0... very close to 0? It depends (on what, exactly)? The system feels like a black box at the moment. I have no idea whether this is just a simple bug to fix or something inherently problematic with the algorithms being used. I don't really know what I'm talking about :) but it seems like DHT upkeep should require minimal bandwidth, unless other nodes are redundantly polling you. Likewise, wantlist traffic seems like it should be low unless there is massive flux in the set of files stored network-wide and/or massive flux in demand for files.

Another general comment / feature request: provide a setting for the amount of bandwidth you want to allocate to IPFS, and have routing, etc., make use of this information somehow.
@pchiusano Ideally, bandwidth usage would be quite low, but configurable depending on how helpful to the network you want your node to be. On average, I think hitting < 50kbps is a good 'low' goal. DHTs are very chatty, and the DHT employed by IPFS is even more talkative than, for example, BitTorrent's mainline DHT, due to the way we do content routing.

In the short term, we're looking into implementing hard limits on outgoing and incoming bandwidth. This has the potential to cause severe slowdowns, but should keep IPFS running without destroying low-bandwidth connections.

In the still-short-but-slightly-longer term, we are looking at different options for content routing. That portion of the DHT could be swapped out for a 'tracker' or 'federated DHT' of supernodes to reduce the bandwidth consumption on any node choosing to use that system. This obviously impairs the decentralization of the system, but will be a good option in many cases. Even with this system, you will still be able to fall back to searching through the DHT if you want to.

In the longer-ish term, we're hoping to implement more advanced routing systems, combining 'supernodes' with more exciting DHT algorithms (see Coral's clustered DHTs), as well as optimizing many aspects of how the content routing system works.

If any of this is interesting to anyone, I highly encourage you to get involved and help out. There's always plenty to do :)
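To make the "hard limits" idea above concrete, here is a minimal sketch (my own illustration, not the go-ipfs implementation) of throttling the read side of a connection with a token-bucket limiter from `golang.org/x/time/rate`. The `bytesPerSec` budget is a made-up parameter; the < 50kbps goal above works out to roughly 6,400 bytes/s.

```go
package throttle

import (
	"context"
	"io"

	"golang.org/x/time/rate"
)

// rateLimitedReader throttles reads from an underlying connection to a
// fixed byte budget per second, sketching a hard incoming bandwidth limit.
type rateLimitedReader struct {
	r     io.Reader
	lim   *rate.Limiter
	burst int
}

func newRateLimitedReader(r io.Reader, bytesPerSec int) *rateLimitedReader {
	// A burst equal to one second's budget keeps latency tolerable.
	return &rateLimitedReader{
		r:     r,
		lim:   rate.NewLimiter(rate.Limit(bytesPerSec), bytesPerSec),
		burst: bytesPerSec,
	}
}

func (l *rateLimitedReader) Read(p []byte) (int, error) {
	// Never read more than one burst at a time so WaitN cannot fail
	// with a request that exceeds the limiter's burst size.
	if len(p) > l.burst {
		p = p[:l.burst]
	}
	n, err := l.r.Read(p)
	if n > 0 {
		// Block until tokens for the bytes just read are available.
		if werr := l.lim.WaitN(context.Background(), n); werr != nil {
			return n, werr
		}
	}
	return n, err
}
```

A symmetric wrapper on the write side would cap outgoing bandwidth the same way. The trade-off, as noted above, is that DHT queries and bitswap transfers simply get slower rather than cheaper.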
Thanks for the detailed reply. Do you have a formal description of the DHT algorithm being used? And to what extent can that be changed while still being IPFS?
@pchiusano We don't have a formal writeup of the logic for our current DHT implementation, but it is an implementation of Kademlia with peer routing, a direct value storage layer (for IPNS records and public keys), as well as an indirect content storage layer (for records of who has which objects). Our K-value, as per Kademlia, is 20.

The majority of the bandwidth problem is the sheer number of provider records we store. Each provider record needs to be stored on the K closest peers to the block it references, which means lots of DHT crawling during large adds. We have a few different ideas for improving this, such as batching outgoing provider storage (to save on the number of RPCs required) and integrating supernodes into the logic to short-circuit certain routines.
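To illustrate the "K closest peers" step, here is a sketch of textbook Kademlia as described above (simplified types, not the actual go-ipfs code):

```go
package kadsketch

import (
	"bytes"
	"crypto/sha256"
	"sort"
)

const k = 20 // replication factor, per the K-value mentioned above

// xorDistance is the Kademlia distance between two 256-bit IDs.
func xorDistance(a, b [32]byte) []byte {
	d := make([]byte, 32)
	for i := range d {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// kClosest ranks known peer IDs by XOR distance to the key's hash and
// returns the k closest: the peers a provider record would be stored on.
func kClosest(key []byte, peers [][32]byte) [][32]byte {
	target := sha256.Sum256(key)
	sorted := append([][32]byte(nil), peers...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(xorDistance(sorted[i], target), xorDistance(sorted[j], target)) < 0
	})
	if len(sorted) > k {
		sorted = sorted[:k]
	}
	return sorted
}
```

This also shows why large adds are expensive: each block hashes to a different point in the ID space, so every provider record triggers a lookup for a different set of 20 peers. Batching outgoing provider storage, as mentioned above, amortizes exactly those RPCs.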
(Theoretically, not all nodes need to run a full DHT.)
We should experiment with some "client only" DHT nodes relatively soon.
My boss came to me today and told me that my machine was generating a lot of traffic. I found out that it was IPFS. To me, that was totally unexpected. Throttling would be highly desirable.
My idle IPFS node now generates >2 GB of traffic per hour (I know because that's the default threshold for warning emails at Hetzner). At the moment, the connected peer count is >300. Would it help to limit the number of connected peers?
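On limiting connected peers, here is a rough, self-contained sketch of a low/high-watermark connection manager (hypothetical code, not the go-ipfs implementation, though later releases grew a real connection manager along these lines). Once the peer count passes the high watermark, the longest-idle connections are closed until the count is back at the low watermark.

```go
package connmgr

import (
	"sort"
	"time"
)

// conn is a hypothetical tracked connection with a last-useful timestamp.
type conn struct {
	peerID   string
	lastSeen time.Time
	close    func()
}

// manager trims connections back to lowWater once highWater is exceeded.
type manager struct {
	lowWater  int
	highWater int
	conns     []*conn
}

// maybeTrim closes the longest-idle connections when over the high watermark.
func (m *manager) maybeTrim() {
	if len(m.conns) <= m.highWater {
		return
	}
	// Sort oldest (least recently useful) first.
	sort.Slice(m.conns, func(i, j int) bool {
		return m.conns[i].lastSeen.Before(m.conns[j].lastSeen)
	})
	for len(m.conns) > m.lowWater {
		m.conns[0].close()
		m.conns = m.conns[1:]
	}
}
```

Fewer connections means fewer peers sending you wantlists and DHT chatter, so this does cut passive traffic, at the cost of slower lookups and a less useful node.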
@d10r what does
If you're running a recent build from source, you can disable the DHT (almost) entirely by running your daemon with
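As an illustration of the routing options on current builds (an assumption on my part; the exact flag isn't quoted above): `ipfs daemon --routing=dhtclient` runs the node as a DHT client only, so it can query the DHT but won't store or serve records for others, while `ipfs daemon --routing=none` disables content routing entirely. Either should remove most of the passive DHT traffic, at the cost of contributing nothing back to the network.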
The old DHT protocol will have a lot more overhead that won't be included in the stats (I would estimate it at about 20-30%).
Here's a report from my mostly idle node that's been running on master for a couple of days. Thanks for the continued efforts to constrain resource use!
My stats now for ~48h, ipfs 0.4.3:
It's now much lower than before, probably because I've set swarm address filters for non-public IP blocks after being warned by my hosting provider (same as #1226).
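For reference, and as an assumption about the config schema rather than something stated in this thread: these filters live under `Swarm.AddrFilters` in the IPFS config and take multiaddr CIDR strings such as `/ip4/10.0.0.0/ipcidr/8`, one per private range, which stops the daemon from dialing LAN addresses that hosting providers flag as abuse.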
The bitswap numbers are surprisingly high. How do the bitswap
I haven't downloaded anything during that time.
Possibly related: a few weeks ago I tried https://github.com/davidar/ipfs-maps, resulting in the import of some tile files to my IPFS node.
Similar stats for my node, restarted a few days ago. Nothing downloaded. The only thing pinned is the quick-start directory (QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG).
This should be better in the upcoming 0.4.11 release. Bitswap sessions reduce the number of wantlist entries that bitswap broadcasts to everyone.
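A conceptual sketch of what sessions change (my illustration of the idea described above, not the actual bitswap code): rather than broadcasting every want to every connected peer, a session remembers which peers have already delivered blocks for the current download and sends follow-up wants only to them.

```go
package bitswapsketch

// Session narrows wantlist broadcast: wants for related blocks go only to
// peers that have proven useful, instead of to every connected peer.
type Session struct {
	activePeers map[string]bool // peers that have sent us blocks in this session
}

func NewSession() *Session {
	return &Session{activePeers: make(map[string]bool)}
}

// PeersToAsk returns the broadcast set for a want: all peers while the
// session is cold, then only the peers that have already delivered blocks.
func (s *Session) PeersToAsk(allPeers []string) []string {
	if len(s.activePeers) == 0 {
		return allPeers // cold start: fall back to a full broadcast
	}
	targets := make([]string, 0, len(s.activePeers))
	for p := range s.activePeers {
		targets = append(targets, p)
	}
	return targets
}

// RecordBlockFrom marks a peer as useful once it delivers a block.
func (s *Session) RecordBlockFrom(peerID string) {
	s.activePeers[peerID] = true
}
```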
It still consumes a lot of bandwidth. When will 0.4.11 be released?
@chilarai I will also be retesting on 0.4.13 (this is more of a self-reminder).
With IPFS 0.4.13 and dhtclient off, I've been seeing about 5 KiB/s (an overestimate) of background traffic. That works out to roughly 3 GiB of traffic per week (5 KiB/s × ~604,800 seconds).
Thanks @bogdanbiv @Stebalien
Closing this issue as old/fixed. Please reopen if you are still seeing the same behavior.
It's still bad, but yeah, this issue isn't directly actionable.
Ubuntu 16.04, Intel x64
ipfs version 0.4.1
Type: bug
Area: DHT?
Priority: high
Description:
Hi,
I'm running two instances of IPFS (on two different machines), each connected to a different ISP. For a while I have noticed very high network traffic from those nodes. Those nodes run only IPFS, so there is no network traffic from those machines other than IPFS traffic.

Last week I didn't use IPFS on those machines, but IPFS was still running. During that one week, the first node consumed 58 gigabytes of network traffic, and the second node 59.9 gigabytes. This traffic was generated entirely by IPFS while IPFS was not being used actively. Surely this must be a serious bug. When I stop the IPFS daemon service, the network traffic consumption stops immediately.
Cheers,
Chris