-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
discv5: decide on proof-of-work for node identities #122
Comments
I'm worried that although you can now model the cost of eclipsing a node as non-zero, this cost will still likely be relatively insubstantial to an attacker if we keep the cost low enough for normal nodes (and even resource constrained devices). It also seems that we would need to keep the PoW fixed rather than dynamically adjusting (like in a PoW consensus) to ensure that an attacker can't make it costly for honest nodes to participate. [It is also unclear at first glance how you would actually do a dynamic PoW in this context..] How do you plan on choosing a suitable value for the difficulty of the PoW? This problem is fundamental to DHTs, right? Are there any other novel solutions that don't use PoW that you have come upon? |
Yes, the idea here is to keep a static PoW target. There is no way to attack that in the same way the blockchain 'difficulty' could be attacked. We'd basically decide on the amount of resources creating a node ID should take and put these parameters into the software. Changing it would be a software update.
Not really. There was a cool idea in Low-Resource Eclipse Attacks on Ethereum’s Peer-to-Peer Network to make the distance measure private to each node (this is Countermeasure 3 in the paper). It breaks routing in a fundamental way, but might be worth going off on that tangent a little longer to see if it leads anywhere. The advantage we have over traditional DHTs is that we're not concerned with content-adressed data storage. It might actually be OK to break routing or make it less efficient because we really only care about random walks of the DHT. |
Actually, let's go on that tangent right now. Most ideas in discv5 (basic wire protocol, ENR, liveness check logic, topic queues) are universal and not specifically tied to the Kademlia structure. Topic ad distribution is, but only in that it builds on the idea of XOR distance, and we're actually still unsure about how to do the distribution right. If we'd break routing, essentially removing Kademlia from the protocol, we'd need something to replace it. That something needs to have:
|
cc @dryajov |
Has there been a decision on this topic? |
No decision has been made so far. I don't like PoW either. |
Simple idea to try from @zsfelfoldi: If we change the node distance calculation to be based on the hash of the node's IP address, it would be much harder to generate chosen IDs. We can actually pull this off with discv5 because There are some downsides to this idea: if routing is completely dependent on IP address distribution, we have to deal with odd cases like many nodes being behind the same NAT address. There would also be logic specific to IPv6 to avoid mining on an address suffix. |
That's an interesting idea. One concern I have is that it would mean if a client is changing IP's frequently (which is common since most home ISP's do not provide static IP's by default), would that mean severe down-time for home users every time their IP changes (since the client needs to restart the whole peer discovery and reputation building from scratch) ? One solution would be, after a first connection with a (new) peer, make it so that all his peers would accept subsequent incoming connections from new IP's, as long as the hash of that IP is signed with the same formerly provided public-key (say, during handshake). Makes sense? So: Another thing to think about is if there is any privacy implications of such scheme. |
This issue attempts to collect our options regarding proof-of-work on node identities.
This isn't a complete solution just yet, more a place where we can dump thoughts and
collect research.
First, let's describe what the problem is:
discv5 is a Kademlia-inspired system. As such, it relies on the XOR distance measure to
establish routing tables, forming an overlay network. All known structured overlay
networks are known to be susceptible to attacks (such as the eclipse attack) based on
attacker-chosen node IDs. Proof-of-work is one solution to this problem. If generating a
new node identity required a certain amount of computational work, attackers would have a
harder time creating chosen node IDs to influence routing tables.
Another solution is to design an overlay network that doesn't use node IDs for routing, or
using an unstructured network topology. If such a design exists, and fulfills the
requirements we have for node discovery, I'd strongly prefer to use it instead of adding
proof-of-work. Unfortunately I don't know of any overlay design with those properties.
How much work?
Creating a new node identity is a one time process for most users, but a repeated task for
the attacker. The kind of attacker we want to rule out is one using many cheap cloud
servers. These are typically memory-constrained. If creating a new node required 4GB of
memory, it'd be harder to parallelize this operation at the cost of raising the barrier of
entry into the network. We are less concerned with CPU time consumption of the
proof-of-work function. Computation times up to ten seconds are no issue for most people,
especially for a one-time task.
Suitable proof-of-work functions
Since all network participants should verify the proof-of-work value if provided, the
function chosen must be asymmetric, i.e. verification should take very little effort. We
also need to consider that the chosen function must be simple to implement in multiple
programming languages. It'd be best to choose something that's already available.
There aren't a lot of functions to choose from here, so we might as well look at all the
current options:
is the verification time. We're already struggling with ethash verification performance
for Ethereum block validation on smaller devices.
was that I couldn't find many implementations.
layer. It's really simple to implement the algorithm and verification is essentially
free. One downside with this function is the proof size. Cuckoo cycle proofs are L * 4
bytes, where L is typically > 20. This makes it unsuitable for inclusion in ENR (see
below).
Implementation Ideas
In discv5, node IDs are derived from node records using an 'identity scheme'. This is good
because any piece of metadata can be attached to the node and relayed in discovery, or
even across multiple discovery systems. This additional metadata can also be used to
derive the overlay node ID.
There are multiple ways to integrate proof-of-work into ENR, depending on how much
flexibility we want. If the proof-of-work value would influence the node ID directly, we'd
need to define a new identity scheme for it. We could also leave the identity scheme as-is
and add an attribute containing a value related to the ID created by any scheme. I am
strongly leaning toward the latter approach, since it allows us to keep improving the PoW
while keeping the ID stable.
If we were to choose Cuckoo Cycle, an additional complication would be transmitting the
proof value in the discovery protocol. I did research this a bit: to make this work,
we'd add an ENR attribute containing an 8-byte PoW nonce. This attribute would simply
signal support for proof-of-work. To actually create the proof, we'd basically try random
nonces until a cuckoo cycle is found for the
sha256(node-id || nonce)
input value. Theproof would have to be transmitted in a new protocol message or during the handshake.
The text was updated successfully, but these errors were encountered: