Finalize deposit before applying it to a state #3689

mkalinin · 2024-04-17T19:49:03Z

Proposal

Extends state.pending_balance_deposits to keep the deposit data and an epoch when a deposit was processed by the consensus layer. The deposit is applied to the state only when that epoch gets finalized and if the activation churn has enough capacity.

Analysis

Reduces engineering complexity of EIP-6110 implementation as an index of a new validator is associated with the pubkey only after the deposit is finalized, so there is no need to support forks in (pubkey, index) cache implementation.
Increases beacon state size as each entry of the deposits queue needs to hold 176 bytes of additional data (pubkey, withdrawal credentials and signature) which is a 12 times increase. For example, with 100_000 validators in the activation queue the pending deposits queue will consume ~18MB vs 1.5MB of just pending deposit balances.
Rate limits signature verifications per epoch by the means of the activation queue, but moves all verification operations to the epoch processing.
Requires top-ups to be finalized before they can be processed.
Invalid deposits will waste churn. Considering a high cost of submitting an invalid deposit, the potential effect should be negligible.
Makes no use for validator.activation_aligibility_epoch which can be repurposed in the future.

dapplion · 2024-04-18T07:34:23Z

The implementation complexity of the fork-aware pubkey cache has not been as bad as I originally imagined in Lodestar and Lighthouse. Would like to know from other teams too.

fradamt

Since deposits are not applied until after finalization, we don't need activation_eligibility_epoch anymore, as its role is replaced by deposit.epoch in a PendingDeposit. Then we can remove is_eligible_for_activation_queue from process_registry_updates, and move its balance check to is_eligible_for_activation:

def is_eligible_for_activation(state: BeaconState, validator: Validator) -> bool:
    """
    Check if ``validator`` is eligible for activation.
    """
    return (
        # Has not yet been activated
        and validator.activation_epoch == FAR_FUTURE_EPOCH
        # Has sufficient effective balance
        and validator.effective_balance >= MIN_ACTIVATION_BALANCE
    )

specs/electra/beacon-chain.md

fradamt · 2024-04-18T07:40:47Z

specs/electra/beacon-chain.md

@@ -1271,12 +1258,14 @@ def process_deposit_receipt(state: BeaconState, deposit_receipt: DepositReceipt)
    if state.deposit_receipts_start_index == UNSET_DEPOSIT_RECEIPTS_START_INDEX:
        state.deposit_receipts_start_index = deposit_receipt.index



What about checking is_valid_deposit_signature here? Unless we specifically want to avoid burning a top-up with incorrect signature

There are two things why do signature check as is:

Some infra might already have top-ups sent with zero signatures (worth checking tho)

Rate limiting a number of deposit signature checks per epoch is really nice, it alleviates the risk of delaying block processing if someone manages to pack a thousand deposits into it

Co-authored-by: fradamt <[email protected]>

StefanBratanov · 2024-04-22T11:01:25Z

The implementation complexity of the fork-aware pubkey cache has not been as bad as I originally imagined in Lodestar and Lighthouse. Would like to know from other teams too.

In Teku, we implemented it in a way, where we cache only the finalized indices and for the rest we do an N loop starting from the last finalized index + 1. It could be a problem for chains which are not finalizing for a long time, but otherwise I don't see a big issue with it, but would prefer if this PR makes it into the specs.

dapplion · 2024-04-22T14:55:16Z

An important consideration w.r.t. the inactivity leak is that this PR prevents top-ups during non-finality. This affects both the ability of leaking validators to maintain their stake share, and non-leaking validators from increasing their share.

mkalinin · 2024-04-23T04:25:44Z

An important consideration w.r.t. the inactivity leak is that this PR prevents top-ups during non-finality. This affects both the ability of leaking validators to maintain their stake share, and non-leaking validators from increasing their share.

This is an important consideration. Do you think there is any risk related to that?

Honest validators’ stake should remain stable as they keep voting — this is at least a desired property, but i don’t know if in practice this property holds in all cases as there might be not enough block space to accommodate all submitted attestations on chain thus honest validators may leak too. Preventing dishonest validators that aren’t voting from increasing their stake share is not strongly desired but a nice property, imo.

etan-status

I like the introduction of the queue, as it makes sure that BLS signature verifications remain bounded (16 / block). This way, the CL hardware requirements don't become a concern on every future gas limit increase.

Finality also simplifies implementations, because it means that validator indices to pubkey mappings are the same across all valid branches. As new validator activations can only happen while finalizing, this shouldn't affect the underlying PoS security.

The one thing that will no longer be possible is topping up a validator while the network is not finalizing / possibly leaking. I don't personally have concerns with that, but it should be documented at least as a semantic change that affects non-finalizing networks.

etan-status · 2024-05-13T09:03:50Z

specs/electra/beacon-chain.md

@@ -155,7 +153,7 @@ The following values are (non-configurable) constants used throughout the specif

 | Name | Value | Unit |
 | - | - | :-: |
-| `PENDING_BALANCE_DEPOSITS_LIMIT` | `uint64(2**27)` (= 134,217,728) | pending balance deposits |
+| `PENDING_DEPOSITS_LIMIT` | `uint64(2**27)` (= 134,217,728) | pending deposits |


2^32 would be the theoretical limit with infinite gas fee and a single block that fills the entire deposit tree

etan-status · 2024-05-13T09:07:35Z

specs/electra/beacon-chain.md

        if processed_amount + deposit.amount > available_for_processing:
            break
-        increase_balance(state, deposit.index, deposit.amount)
+        if deposit.epoch >= state.finalized_checkpoint.epoch:


This should also check for the legacy style deposits import to have caught up with the queue.

Otherwise, deposits from the old style (via eth1_data) and from the new queue can get reordered, possibly messing with decentralized staking pools that rely on a linear deposit order.

terencechain · 2024-05-14T16:06:49Z

The implementation complexity of the fork-aware pubkey cache has not been as bad as I originally imagined in Lodestar and Lighthouse. Would like to know from other teams too.

Same with Prysm. We implemented it the same way as Teku, as mentioned by @StefanBratanov. I'm indifferent to this, but if the sole purpose is to remove the fork-aware pubkey cache, I don't think it's necessary.

mkalinin · 2024-05-15T16:30:41Z

Outcomes of the breakout session on the change proposed by this PR:

Queueing deposits is important to rate limit sig verification operations. There is a desire not only to limit a number of deposits processed per epoch by the churn, but also by a number (the proposed number is 16). The reason for that is that there can be legit blocks with hundreds of deposits in them (e.g. staking pools doing batch depositing) and we want to ensure that processing of such blocks won't be delayed.
We want to benchmark processing a block with ~1000 deposits in it to understand how big of the problem it could be.
Finalizing (pubkey, index) might not only matter for client implementations but also for tooling and infrastructure like block explorers. Though, having this cache fork-independent became a weak argument since most of the teams implemented it, it still can prevent potential bugs in clients and reduces complexity.
To alleviate hash_tree_root complexity of potentially large and slicing queue, we want to explore two queue solution or queue with two pointers solution. The key intuition behind either of such solutions is that any queue is append only and reset to empty, no slicing is possible.
Deposits should be processed in strict order around the transition to prevent frontrunning. Potential solution is to create new validators induced by Eth1Data poll deposits instantly but add synthesized deposits with the entire balance of that validators to the end of the queue so the balance will hit the churn; and prevent new style deposits from being processed until the Eth1Data poll gap is filled.
We agreed that we do accept data complexity (potential state growth) and the fact that no top-ups is possible during period of non-finality. The latter even seems a positive thing.

pawanjay176 · 2024-05-17T12:21:43Z

Benchmarked a block with 1000 deposits using lcli.
Attaching the block and pre-state if anyone wants to run it on their clients.
big-deposit-test.zip

[2024-05-17T11:59:49Z INFO  lcli::transition_blocks] Run 0: 841.606125ms                                                                                                                                              
[2024-05-17T11:59:49Z DEBUG lcli::transition_blocks] Build caches: 1.894041ms                                                                                                                                         
[2024-05-17T11:59:49Z DEBUG lcli::transition_blocks] Initial tree hash: 225.958µs                                                                                                                                     
[2024-05-17T11:59:49Z DEBUG lcli::transition_blocks] Slot processing: 5.209µs                                                                                                                                         
[2024-05-17T11:59:49Z DEBUG lcli::transition_blocks] Build all caches (again): 875ns                                                                                                                                  
[2024-05-17T11:59:49Z DEBUG lcli::transition_blocks] Batch verify block signatures: 4.483875ms                                                                                                                        
[2024-05-17T11:59:50Z DEBUG lcli::transition_blocks] Process block: 838.27275ms                                                                                                                                       
[2024-05-17T11:59:50Z DEBUG lcli::transition_blocks] Post-block tree hash: 1.896833ms

(Note: the Batch verify block signatures time doesn't include deposit signatures)
The block was generated on a local kurtosis network by making bulk deposits.

Even on this relatively small network, the block arrived quite late on the node and took quite a bit to process.

May 17 11:56:13.776 DEBG Delayed head block                      set_as_head_time_ms: 2, imported_time_ms: 92, attestable_delay_ms: 6756, available_delay_ms: 6682, execution_time_ms: 3179, blob_delay_ms: unknown, observed_delay_ms: 3502, total_delay_ms: 6776, slot: 36, proposer_index: 125, block_root: 0x20ab5155dff25ded6c704d97028297a8d33af4b171c93b45c1bd7e8df3fec126, service: beacon

The execution layer processing itself took ~3secs. Consensus block processing is interleaved with the execution processing in lighthouse, so overall it took total_delay - observed_delay ~ 3.2s for processing the block.

mkalinin · 2024-05-20T12:10:49Z

@pawanjay176 thanks a lot for this benchmark!

Even on this relatively small network, the block arrived quite late on the node

I am curious why is that happened? It’s hard to believe that the propagation time took that many especially considering a small network. My guess would be that it wasn’t sent to the wire before fully processed by the proposer’s node, wdyt?

pawanjay176 · 2024-05-28T06:44:27Z

Sorry took a while, was on vacation.

My guess would be that it wasn’t sent to the wire before fully processed by the proposer’s node, wdyt?

Yeah I think that is right, the propagation seems near instant but producing the big block, sending it to the VC across the http api, getting it back and then publishing it finally took longer than for a smaller block.

etan-status · 2024-05-31T14:05:39Z

specs/electra/beacon-chain.md

    """
    return (
-        validator.activation_eligibility_epoch == FAR_FUTURE_EPOCH


Correct to assume that validators can never have an activation_eligibility_epoch > ELECTRA_FORK_EPOCH? e.g., by depositing during the last slot before electra activates.

Those validators are added to the pending deposits queue in upgrade_to_electra

etan-status · 2024-05-31T14:06:50Z

specs/electra/beacon-chain.md

+                epoch=get_current_epoch(state),
+                pubkey=validator.pubkey,
+                withdrawal_credentials=validator.withdrawal_credentials,
+                amount=balance,


excess_balance? Because MIN_ACTIVATION_BALANCE was already credited.

etan-status · 2024-05-31T14:09:16Z

specs/electra/beacon-chain.md

+                pubkey=validator.pubkey,
+                withdrawal_credentials=validator.withdrawal_credentials,
+                amount=balance,
+                signature=BLSSignature(),


For SyncAggregate, G2_POINT_AT_INFINITY is used instead of a zero value when noone signed.

Or, None with EIP-7688 optionals (but I guess zero value or infinity is also fine, as it is not checked anyway).

etan-status · 2024-05-31T14:11:27Z

specs/electra/beacon-chain.md

-            PendingBalanceDeposit(index=index, amount=excess_balance)
+        state.pending_deposits.append(
+            PendingDeposit(
+                epoch=get_current_epoch(state),


Is there an off-by-one here?

When epoch N finalizes, everything up through and including compute_start_slot_at_epoch(N) is finalized.

If activations are processed for deposit.epoch <= finalized_checkpoint.epoch, then that would also activate deposits on the non-finalized slots within N (only the initial slot of N is finalized).

If activations are processed for deposit.epoch < finalized_checkpoint.epoch, then deposits within the very first slot of N are not processed for another ~6 minutes despite already being finalized.

Currently, it seems we are following the second strategy (deposit.epoch >= state.finalized_checkpoint.epoch). Probably fine to have the extra ~6 minutes wait for deposits within the first slot, but could be avoided, so still keeping the comment for awareness.

I personally in favour of strategy (2) because of its simplicity

I thought about this and I changed my mind. The issue is that clients may (and probably will) filter deposits by blocks or db invariants like "finalized", the presence of this off by one creates edge cases that need to be always checked and tested. While it's not likely to create a consensus bug I feel it will be a source of bugs nonetheless

If we set deposit.epoch in the queue to get_current_epoch(state) + 1, will that prevent issues that you have in mind? It can also be renamed to deposit.activation_epoch to better reflect the semantics

I guess storing slot and processing deposits up through deposit_receipt.slot <= compute_start_slot_at_epoch(state.finalized_checkpoint.epoch) is the most straight forward.

etan-status · 2024-05-31T14:14:49Z

specs/electra/beacon-chain.md

+            PendingDeposit(
+                epoch=get_current_epoch(state),
+                pubkey=validator.pubkey,
+                withdrawal_credentials=validator.withdrawal_credentials,


Are withdrawal_credentials necessary to be saved here?

If yes, there could be a race where a BLSToExecutionChange gets processed while the PendingDeposit is in the queue, changing the withdrawal credentials of the validator while the PendingDeposit retains the stale entry.

If no, we could just set this to a zero value (or to None with EIP-7688 optionals).

It isn’t necessary so I think that setting it to zero value should be fine

mkalinin · 2024-06-26T08:17:03Z

Closing in favour of #3818

mkalinin · 2024-06-27T14:04:48Z

Closing in favour of #3818

Finalize deposit before applying it to a state

9aed70b

mkalinin mentioned this pull request Apr 17, 2024

Consensus-layer Call 132 ethereum/pm#1010

Closed

hwwhww added the Electra label Apr 18, 2024

fradamt reviewed Apr 18, 2024

View reviewed changes

mkalinin and others added 2 commits April 18, 2024 17:07

Apply fixes by @fradamt

d772386

Co-authored-by: fradamt <[email protected]>

Get rid of is_eligible_for_activation_queue

3d4fe06

philknows mentioned this pull request May 2, 2024

Electra tracker ChainSafe/lodestar#6341

Open

12 tasks

etan-status approved these changes May 13, 2024

View reviewed changes

james-prysm mentioned this pull request May 15, 2024

EIP-6110: Supply validator deposits on chain prysmaticlabs/prysm#13944

Merged

ralexstokes mentioned this pull request May 30, 2024

Consensus-layer Call 134 ethereum/pm#1050

Closed

etan-status reviewed May 31, 2024

View reviewed changes

mkalinin mentioned this pull request Jun 24, 2024

Handle top-ups to exiting/exited validators #3776

Merged

james-prysm mentioned this pull request Jun 25, 2024

Finalized validator index cache saving prysmaticlabs/prysm#14146

Merged

mkalinin mentioned this pull request Jun 26, 2024

eip6110: Queue deposit requests and apply them during epoch processing #3818

Merged

6 tasks

mkalinin closed this Jun 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Finalize deposit before applying it to a state #3689

Finalize deposit before applying it to a state #3689

mkalinin commented Apr 17, 2024 •

edited

Loading

dapplion commented Apr 18, 2024

fradamt left a comment •

edited

Loading

fradamt Apr 18, 2024

mkalinin Apr 18, 2024

StefanBratanov commented Apr 22, 2024

dapplion commented Apr 22, 2024

mkalinin commented Apr 23, 2024

etan-status left a comment

etan-status May 13, 2024

etan-status May 13, 2024

terencechain commented May 14, 2024

mkalinin commented May 15, 2024

pawanjay176 commented May 17, 2024 •

edited

Loading

mkalinin commented May 20, 2024

pawanjay176 commented May 28, 2024

etan-status May 31, 2024

mkalinin Jun 3, 2024

etan-status May 31, 2024

etan-status May 31, 2024

etan-status May 31, 2024

mkalinin Jun 3, 2024

potuz Jun 6, 2024

mkalinin Jun 6, 2024 •

edited

Loading

etan-status Jun 6, 2024

etan-status May 31, 2024

mkalinin Jun 3, 2024

mkalinin commented Jun 26, 2024

mkalinin commented Jun 27, 2024

		@@ -1271,12 +1258,14 @@ def process_deposit_receipt(state: BeaconState, deposit_receipt: DepositReceipt)
		if state.deposit_receipts_start_index == UNSET_DEPOSIT_RECEIPTS_START_INDEX:
		state.deposit_receipts_start_index = deposit_receipt.index

Finalize deposit before applying it to a state #3689

Finalize deposit before applying it to a state #3689

Conversation

mkalinin commented Apr 17, 2024 • edited Loading

Proposal

Analysis

dapplion commented Apr 18, 2024

fradamt left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

StefanBratanov commented Apr 22, 2024

dapplion commented Apr 22, 2024

mkalinin commented Apr 23, 2024

etan-status left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

terencechain commented May 14, 2024

mkalinin commented May 15, 2024

pawanjay176 commented May 17, 2024 • edited Loading

mkalinin commented May 20, 2024

pawanjay176 commented May 28, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mkalinin Jun 6, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mkalinin commented Jun 26, 2024

mkalinin commented Jun 27, 2024

mkalinin commented Apr 17, 2024 •

edited

Loading

fradamt left a comment •

edited

Loading

pawanjay176 commented May 17, 2024 •

edited

Loading

mkalinin Jun 6, 2024 •

edited

Loading