@jtraglia
Member

jtraglia commented Jun 2, 2025

As discussed on ACDT today, we decided to change this to MAY.

@nalepae
Contributor

nalepae commented Jun 2, 2025

Repost of #4320 (comment):

Let's imagine the following situation:

A big node operator has two running BNs: BN-1 (with validator clients connected) and BN-2 (running, but without any validator clients connected).

BN-1 advertises a high cgc. BN-2 advertises the min cgc.

Now, a new version of the client is released.
In order to minimize downtime, the node operator first upgrades BN-2, then connects validator clients to BN-2.

Now BN-1 advertises the min cgc.
Without backfill, BN-2 won't advertise the correct cgc until 4096 epochs (~18 days) have passed, which is already kind of an issue. (Both BN-1 and BN-2 will advertise the minimum cgc during this period.)
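
For reference, the ~18 days figure can be checked with a quick calculation. This is a minimal sketch that assumes 12-second slots and 32 slots per epoch; those two values and the constant names are assumptions for illustration, not something stated in this thread.

```python
# Assumed timing parameters (not stated in the thread).
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
# Retention period quoted in the comment above.
RETENTION_EPOCHS = 4096

SECONDS_PER_DAY = 24 * 60 * 60


def retention_days(epochs: int = RETENTION_EPOCHS) -> float:
    """Convert a number of epochs into days under the assumed timing."""
    seconds = epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT
    return seconds / SECONDS_PER_DAY


print(f"{retention_days():.1f} days")
```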

If BN switches happen more often than every 4096 epochs, then both BN-1 and BN-2 will always advertise the minimum cgc. That kind of defeats the purpose of validator custody.

==> I think it should be SHOULD or MUST, not MAY.

@jtraglia
Member Author

jtraglia commented Jun 2, 2025

@nalepae but in this situation, the staking operator should configure both BNs to custody all columns regardless of how many validators are connected. Otherwise they're going to "spam" the network with requests every time they update.

@nalepae
Contributor

nalepae commented Jun 2, 2025

> the staking operator should configure both BNs to custody all columns regardless of how many validators are connected.

There is no incentive to do it, and no penalty for not doing it. Actually, there is a slight disincentive, since it consumes more bandwidth and disk space.
Even an honest node operator could simply forget to do it.

A simple way to fix this issue could be to remove the increase/decrease asymmetry:

If, on a decrease, the BN continues to custody and advertise the previous (high) count for (at least) the retention period, then BN-2 will effectively start advertising the high cgc at the moment BN-1 starts advertising the minimum cgc.
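
A toy model of the handover may make this concrete. It is only a sketch under stated assumptions, not client or spec code: all function names are hypothetical, the cgc values 4 and 128 are illustrative, the retention window is shortened to 20 epochs so the example runs quickly, and a node without backfill is assumed to advertise only a count it has custodied for the whole retention window.

```python
RETENTION = 20    # shortened for the demo; the thread discusses 4096 epochs
MIN_CGC = 4       # illustrative minimum cgc
HIGH_CGC = 128    # illustrative cgc with validators attached


def target_at(history, epoch):
    """cgc called for by the attached validators at `epoch`.

    `history` is a list of (start_epoch, cgc) pairs sorted by start_epoch.
    """
    value = history[0][1]
    for start, cgc in history:
        if start <= epoch:
            value = cgc
    return value


def custodied_cgc(history, epoch, hold_on_decrease):
    """cgc the node actually custodies at `epoch`."""
    if not hold_on_decrease:
        return target_at(history, epoch)
    # Proposed rule: keep the previous (high) count for the retention period.
    window = range(max(0, epoch - RETENTION), epoch + 1)
    return max(target_at(history, e) for e in window)


def advertised_cgc(history, epoch, hold_on_decrease):
    """Without backfill, advertise only what was custodied over the whole window."""
    window = range(max(0, epoch - RETENTION), epoch + 1)
    return min(custodied_cgc(history, e, hold_on_decrease) for e in window)


# Validators move from BN-1 to BN-2 at epoch 100.
bn1 = [(0, HIGH_CGC), (100, MIN_CGC)]
bn2 = [(0, MIN_CGC), (100, HIGH_CGC)]

for hold in (False, True):
    for epoch in (99, 100, 119, 120):
        a1 = advertised_cgc(bn1, epoch, hold)
        a2 = advertised_cgc(bn2, epoch, hold)
        print(f"hold_on_decrease={hold} epoch={epoch} BN-1={a1} BN-2={a2}")
```

In this model, with `hold_on_decrease=False` both nodes advertise the minimum for the whole retention window after the switch, while with `hold_on_decrease=True` at least one of the two advertises the high count at every epoch.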

@nalepae
Contributor

nalepae commented Jun 2, 2025

This should not add any complexity in the implementation, since the same code path is already used for the "increase" case.

I'm perfectly aware that, unless we have an in-protocol system to penalize bad custody behavior, we can't force node operators to behave correctly.

But at least we should help honest node operators behave correctly without having to use any extra flag.

@jtraglia
Member Author

jtraglia commented Jun 3, 2025

Closed in favor of the other PR.

jtraglia closed this Jun 3, 2025
jtraglia deleted the may-backfill branch June 3, 2025 14:25