Increase power table lookback to detect/mitigate power spikes #876
Replies: 5 comments 13 replies
-
I plan to read this in detail when I'm not on my phone, but, in addition to the F3 discussion to which you pointed (where we've gone back and forth on dropping the 900 parameter entirely), there's also related work by @guy-goren on computing EC finality (doc). No specific conflict here, I think, but just making sure the two teams are aware of ongoing work.
-
Current fast finality (F3) proposals will not reduce the power table lookback number. That number should not be named finality because it is not a true deterministic finality either now or after F3. The fast finality proposals aim to make minimal changes to EC for many reasons (the potential risk here being but one). So nothing will change except that honest nodes will not extend sibling chains of truly finalised blocks. There still might be reasonable motivation to extend 900 epochs to 7 days, but that change should be motivated without reference to finality.
-
how? and why is 7 days "ideal"? what does "ideal" mean here?
7 days is almost 4% of a sector's minimum lifetime. the need to store (and potentially prove) sectors for that long looks costly from the miner side.
-
i am not deep enough into the details of the code - but does this have any implications for snapping? especially for sectors that change QAP while snapping data? asking because i see this as the most critical, instant way to facilitate a hostile power takeover on the network. another thing that comes to mind in that direction is termination fees.
-
I have a contrary take on this issue. I do not think that there is a problem worth solving and, in case we can be convinced that there is, I think there are better ways to solve it.
The discussion is framed around an "adversarial" spike in power. I don't believe that the protocol can define such a thing, nor could participants necessarily judge it either. The security of the chain is based on the resources committed to it. It must be secure from those resources; if rapidly adding more resources could be a problem, we should address that more directly in protocol design.
The discussion focuses on the rate of addition of resources, and provides for some advance-warning style of alerting mechanism that could (in theory) trigger human intervention. I acknowledge that appealing to "social consensus" is sometimes the only recourse for some disastrous events, but designing toward making it easier and more likely for external intervention to interfere with the protocol rules runs counter to the decentralised and autonomous goals of a protocol like this.
No problem worth solving
If I understand correctly, the supposed problem might be framed as "what if someone commits lots of resources to the network with the aim of damaging it". This has to be irrational at least to the first order, or the fundamental security model is flawed. Adding more resources must increase a participant's incentive alignment with the network. If a malicious party really does want to pay the high cost to attack it, there's not really anything any network can do. Security is rooted in this incentive alignment, and there is some cost at which any network can be attacked. We can, of course, design the protocol to make that cost of attack high, and this is a problem worth solving.
Filecoin is secured by both economic stake and physical storage commitments. But consider pure proof-of-stake blockchains, such as Ethereum. In these chains, the only cost of consensus power is stake.
To my knowledge, no such chains have a mechanism affording advance notification of future increases in stake, with the purpose of enabling manual operator intervention in the protocol. Why not? Either their teams are also all confident in their incentive security model, or we have discovered something important that none of them have.
Filecoin is naturally in an even better position, because not only does an attacker need to acquire 1/3 of all stake, they also need to provide a significant amount of hardware. The balance of sealing vs storage hardware depends on the rate at which they intend to build power (faster -> more sealing throughput).
Note that in Filecoin and other networks, the "easiest" attack is to acquire power from participants who already have it, say by bribing them. Building it from scratch requires larger total commitments because an attacker needs ⅓ of the resulting stake/hardware, which means they need to add ½ of the starting amount (adding x to a starting total of 1 gives a share of x/(1+x), which equals ⅓ when x = ½). Note also that such a bribing attack would go completely undetected by any warning system based on growth in committed resources anyway.
Solve it differently
The proposal is based around observation of the rate of commitment of resources. It is reasonable to believe that the cost of an attack is related to the duration for which resources are committed. So even given a basically sound model, perhaps permitting a fast rate of power growth makes an attack uncomfortably cheap. If this is the case, we should simply and directly address this in the protocol by limiting the maximum rate of growth. It's not effective to limit individual participants (generally anonymous), but we can limit the network as a whole to a rate which would give other participants ample time to observe and respond to rapid growth, much longer than the 7 days proposed. Such a mechanism would resemble the stake churn limits of PoS protocols. E.g. read about Ethereum's entry queue limits in EIP-7514 and discussion here and here.
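To make the ⅓-versus-½ arithmetic and the idea of a network-wide growth limit concrete, here is a minimal Python sketch. It is not a protocol design: the function names are made up for illustration, the cap is modelled simply as "total power may at most double every N days", and it assumes the attacker consumes the whole growth allowance while nobody else onboards anything. The 60-day value is the doubling period suggested in this comment.

```python
import math
from fractions import Fraction


def power_to_add(target_share: Fraction) -> Fraction:
    """Power an attacker must add, as a fraction of the starting network power,
    to hold `target_share` of the resulting total (solves x / (1 + x) = share)."""
    if not 0 < target_share < 1:
        raise ValueError("target_share must be strictly between 0 and 1")
    return target_share / (1 - target_share)


def min_days_to_reach(target_share: Fraction, doubling_period_days: float) -> float:
    """Shortest build-up time under a hypothetical network-wide cap that lets
    total power at most double every `doubling_period_days`.

    Illustrative assumptions: the attacker uses the entire growth allowance
    and no other participant onboards power in the meantime."""
    # Ratio of resulting total power to starting total power.
    growth_factor = 1 + float(power_to_add(target_share))
    return doubling_period_days * math.log2(growth_factor)


one_third = Fraction(1, 3)
added = power_to_add(one_third)
days = min_days_to_reach(one_third, 60)
print(f"must add {added} of starting power; at least {days:.1f} days under the cap")
```

Under these assumptions the attacker needs to add half of the starting power, and a 60-day doubling cap stretches that build-up over roughly five weeks, during which other participants could respond by onboarding their own power.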
Filecoin power growth could similarly be limited to, say, support doubling in size at most every 60 days. This would give a huge 60-day "advance warning" of a rapidly growing stake, while still exceeding the network baseline function by 6x (so not effectively limiting network growth). With a long window, we could more reasonably hope that participants would respond with their own onboarding, rather than appealing to a centralised network halt. A built-in protocol rule like this could directly prevent exploitation of any perceived rate-related weakness in the incentive security model.
An independent mechanism decoupled from proof-of-storage or consensus rules would be simpler. It would avoid introducing new problems associated with the lookback (e.g. that end-of-life sectors enjoy power with nothing at stake).
As an aside, an economic mechanism like dynamic onboarding fees (#587) could raise the cost of rapid accumulation of power high enough to make an attack irrational again, without the need for any limit.
In summary:
-
We propose to set the power table lookback to 7 days. We invite comments and feedback from the community as early as possible, before proposing a formal FIP with a stabilized value.
Motivation
Currently there is no concerning security issue in the Filecoin network. Nevertheless, in case of a hypothetical severe security issue, we want to be prepared to preserve Filecoin.
In particular, we want to have time to react to an attempted network takeover through adversarial power spikes, without compromising consensus security.
Assuming an adversary can fake power and gets noticed, how can we make it possible to put concrete countermeasures in place to mitigate the issue? The first thing to ensure is that there is enough time to put any countermeasure in place.
Today consensus power is granted at least ChainFinality (=900) epochs after a sector is onboarded onchain via ProveCommit.
Technically, a sector is activated right after the first WindowPoSt, which happens within 24h after ProveCommit is finalized. Nevertheless, power is not granted right after the first WindowPoSt.
Indeed, at each epoch t, the leader election protocol selects SPs proportionally to their quality-adjusted power at epoch t - ChainFinality. This means that the minimum delay in power activation is indeed 900 epochs after ProveCommit (note that this happens if WindowPoSt happens right after ProveCommit; any time elapsed from ProveCommit to the first WindowPoSt defers sector power acquisition accordingly).
This translates into a small window of time (~7.5h at 30-second epochs) to react to any adversarial power spike without compromising consensus security.
Thus, if we want to be sure the time we have to react to a major security issue is long enough, we need to decouple the power table lookback from ChainFinality and set it independently.
We identify power table lookback = 7 days to be the ideal window of time for sector power activation.
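As a rough illustration of the lookback mechanics described above, the following Python sketch computes which epoch's power table an election at epoch t would read, and how long the resulting reaction window is. It is not Lotus code: the function names are hypothetical, and the constants assume 30-second epochs, with ChainFinality = 900 and EpochsInOneWeek taken from this post.

```python
EPOCH_DURATION_SECONDS = 30   # Filecoin epoch length (assumed here)
CHAIN_FINALITY = 900          # current power table lookback, in epochs
EPOCHS_IN_ONE_DAY = 24 * 60 * 60 // EPOCH_DURATION_SECONDS
EPOCHS_IN_ONE_WEEK = 7 * EPOCHS_IN_ONE_DAY   # proposed lookback, in epochs


def election_power_epoch(t: int, lookback: int) -> int:
    """Epoch whose power table is used for leader election at epoch t
    (hypothetical helper; clamps at genesis)."""
    return max(0, t - lookback)


def reaction_window_hours(lookback: int) -> float:
    """Minimum time between power being recorded and that power counting
    in leader election."""
    return lookback * EPOCH_DURATION_SECONDS / 3600


for name, lookback in [("ChainFinality", CHAIN_FINALITY),
                       ("EpochsInOneWeek", EPOCHS_IN_ONE_WEEK)]:
    print(f"{name}: {lookback} epochs, {reaction_window_hours(lookback)} hours")
```

With these assumptions the current lookback gives a window of 7.5 hours, while a one-week lookback (20160 epochs) gives 168 hours.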
Protocol Specification
Change GetWinningPoStSectorSetLookback and set the output value to EpochsInOneWeek
(*Miner).mineOne() logline
Reference to the code here
Impact on Filecoin
Sector power is deferred by 7 days compared to today (where there is a minimum delay of 900 epochs between sector onboarding and the sector acquiring power).
This power is only shifted by 7 days (not lost): after termination, a sector retains its power for 7 days as far as the leader election protocol is concerned.
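The "shifted, not lost" point can be checked with a small sketch. The helper names and the example epochs below are hypothetical; the only thing taken from the proposal is that leader election at epoch t reads the power table at epoch t minus the lookback.

```python
EPOCHS_IN_ONE_WEEK = 7 * 2880   # assumes 30-second epochs (2880 per day)


def counts_in_election(t: int, power_start: int, power_end: int, lookback: int) -> bool:
    """True if a sector present in the power table during the half-open range
    [power_start, power_end) counts toward leader election at epoch t."""
    return power_start <= t - lookback < power_end


def election_window(power_start: int, power_end: int, lookback: int) -> tuple[int, int]:
    """Half-open range of election epochs in which the sector's power counts."""
    return power_start + lookback, power_end + lookback


# Example sector: in the power table from epoch 10_000 until epoch 100_000.
start, end = election_window(10_000, 100_000, EPOCHS_IN_ONE_WEEK)
print(start, end, end - start)
```

The election window has the same length as the sector's time in the power table (90000 epochs in this example); it simply begins and ends one lookback later.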
Sectors need to be proven via WindowPoSt for the initial 7 days even though they won't have power yet. Similarly, they won't need to be proven for the 7 days after termination, while keeping their power. This means that WindowPoSt is required only for the lifetime of the sector (without extra proving overhead).
That said, it is possible that expired sectors still counted in the power allocation are challenged at WinningPoSt. This means that, to be sure of being able to answer WinningPoSt challenges in the first 7 days after expiration, expired sectors should be kept in storage.
We think that this extra storage effort is not a dealbreaker, considering the entire sector lifetime (assuming 3.5y of sector lifetime, we are talking about an additional storage cost of about 0.5% overall). On the other hand, such a change would make the network far more secure than it would be without it.
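For reference, the storage overhead figure can be reproduced with a few lines of Python. The function name is hypothetical; the 3.5-year lifetime is the one assumed above, and the 180-day case is an assumption on my part, chosen because it matches the "almost 4% of minimum lifetime" figure raised in the comments.

```python
def lookback_overhead(lookback_days: float, lifetime_days: float) -> float:
    """Extra storage time after expiration, as a fraction of sector lifetime."""
    return lookback_days / lifetime_days


long_lived = lookback_overhead(7, 3.5 * 365)   # 3.5-year sector
short_lived = lookback_overhead(7, 180)        # assumed 180-day minimum lifetime
print(f"3.5y sector: {long_lived:.2%}, 180-day sector: {short_lived:.2%}")
```

So the relative cost depends strongly on sector lifetime: roughly half a percent for long-lived sectors, but close to 4% for sectors committed for the assumed minimum.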