[Staking] Use active era and not current era for bonding logic#8807

Open. Ank4n wants to merge 61 commits into master from ankn/current-era-refactor

@Ank4n Ank4n commented Jun 10, 2025

We are using CurrentEra for operations like payout stakers, unbond, etc. This is incorrect usage of it. CurrentEra only indicates that we are preparing for a new era to be activated. It should be only mutated as part of election logic and nowhere else.

As part of this PR, we should make the current era read/write functions private to the session_rotation::Rotator and ensure it's not accessed or used anywhere else directly (better not to use it in tests either, though that may be acceptable). We should use active_era instead.
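
To illustrate the intended encapsulation, here is a minimal sketch using plain Rust module privacy. It is not the pallet's actual code: the struct fields and the method names `plan_new_era` and `activate_planned_era` are hypothetical, and the real `Rotator` works against pallet storage rather than in-memory fields.

```rust
mod session_rotation {
    pub type EraIndex = u32;

    /// Hypothetical in-memory stand-in for the rotator and its era state.
    pub struct Rotator {
        current_era: EraIndex, // private: only election/rotation logic touches it
        active_era: EraIndex,
    }

    impl Rotator {
        pub fn new() -> Self {
            Self { current_era: 0, active_era: 0 }
        }

        /// Public read access: the era all bonding logic should use.
        pub fn active_era(&self) -> EraIndex {
            self.active_era
        }

        // Private accessors: not callable from outside this module.
        fn current_era(&self) -> EraIndex {
            self.current_era
        }

        fn set_current_era(&mut self, era: EraIndex) {
            self.current_era = era;
        }

        /// Election logic plans the next era (bumps the planned/current era only).
        pub fn plan_new_era(&mut self) {
            let next = self.current_era().saturating_add(1);
            self.set_current_era(next);
        }

        /// Rotation activates the planned era once it actually starts.
        pub fn activate_planned_era(&mut self) {
            self.active_era = self.current_era();
        }
    }
}
```

With this shape, code outside `session_rotation` can only observe `active_era()`; any attempt to call `current_era()` from another module fails to compile.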

We should also ensure this doesn’t break any other pallets that depend on pallet-staking-async (fast-unstake, nomination-pools, delegated-staking), by testing those pallets against pallet-staking-async directly instead of pallet-staking. This is handled by #9016 so let's be sure to merge it before the current PR.

@sigurpol sigurpol force-pushed the ankn/current-era-refactor branch from bde32fe to 44b1fab June 27, 2025 13:28
@Ank4n Ank4n marked this pull request as draft July 7, 2025 09:52
github-merge-queue Bot pushed a commit that referenced this pull request Jul 28, 2025
## Context

The offence handling pipeline has four main stages:

1. **Reporting on RC**: Offences are reported on the Relay Chain (RC)
and exported to Asset Hub (AH) via `RC::AHClient`.
2. **Queueing**: AH staking pallet receives the offence in
`fn on_new_offence`, performs sanity checks, and enqueues it in
`OffenceQueue` and `OffenceQueueEras`.
3. **Processing**: Offences are processed one by one, starting from the
oldest era in the queue. Processed items are stored in
`UnappliedSlashes`.
4. **Application**: Finally, slashes are applied one page per block
after the slash defer duration from the offence era.
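
Stages 2 to 4 can be pictured with a toy in-memory model. This is only a sketch under simplifying assumptions: the `Offence` and `Pipeline` types are hypothetical, the maps merely stand in for the storage items named above, and "one page per block" is reduced to "one item per call".

```rust
use std::collections::{BTreeMap, VecDeque};

type EraIndex = u32;

/// Hypothetical, simplified offence record (not the pallet's real type).
#[derive(Debug, Clone, PartialEq)]
struct Offence {
    offender: u64,
    era: EraIndex,
}

/// Toy model of the queueing, processing and application stages.
#[derive(Default)]
struct Pipeline {
    /// Stage 2: offences queued per offence era.
    queue: BTreeMap<EraIndex, VecDeque<Offence>>,
    /// Stage 3 output: pending slashes keyed by the era in which they apply.
    unapplied: BTreeMap<EraIndex, VecDeque<Offence>>,
    slash_defer_duration: EraIndex,
}

impl Pipeline {
    /// Stage 2: enqueue an incoming offence (sanity checks omitted).
    fn on_new_offence(&mut self, offence: Offence) {
        self.queue.entry(offence.era).or_default().push_back(offence);
    }

    /// Stage 3: process one offence, always from the oldest queued era.
    fn process_one(&mut self) -> Option<Offence> {
        let oldest = *self.queue.keys().next()?;
        let era_queue = self.queue.get_mut(&oldest)?;
        let offence = era_queue.pop_front()?;
        if era_queue.is_empty() {
            self.queue.remove(&oldest);
        }
        let apply_era = offence.era.saturating_add(self.slash_defer_duration);
        self.unapplied.entry(apply_era).or_default().push_back(offence.clone());
        Some(offence)
    }

    /// Stage 4: apply one pending slash scheduled for `active_era`.
    fn apply_one(&mut self, active_era: EraIndex) -> Option<Offence> {
        let pending = self.unapplied.get_mut(&active_era)?;
        let applied = pending.pop_front();
        if pending.is_empty() {
            self.unapplied.remove(&active_era);
        }
        applied
    }
}
```

The `BTreeMap` keeps eras ordered, which is what makes "oldest era first" cheap in this model.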

---

## Problem

While unlikely, a flood of offence reports could slow down processing
enough that some offences remain unhandled even after their bonding
period ends.

This creates a rare corner case: a withdrawal could happen for an era
that still has pending offences, which breaks slashing guarantees.

Also, slash application happens gradually (one page per block). If some
slashes are left unapplied at the end of their application era (due to
chain stalls or similar), they must be manually applied using the
permissionless `apply_slash` call.

Both scenarios are rare, but they expose risks to the integrity of
slashing.

---

## What this PR Changes

### 1. Block withdrawals for eras with unprocessed offences
Withdrawals are now restricted to the **minimum of:**

- The active era, and
- The last fully processed offence era.

This ensures withdrawals don't happen for eras that still have pending
offences.

**Why not block withdrawals per account instead?**  
That would require scanning each page of `ErasStakersPaged` for the
validator the staker is exposed to — which is costly. Since this is an
edge case, blocking at the era level is simpler and sufficient.
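
As a rough sketch of the era-level limit, assuming the last fully processed offence era is already known and is compared directly against each chunk's unlock era (a simplification; `UnlockChunk` and both function names are illustrative, not the pallet's API):

```rust
type EraIndex = u32;

/// Hypothetical unbonding chunk: an amount that unlocks at `era`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct UnlockChunk {
    value: u128,
    era: EraIndex,
}

/// Withdrawals are allowed up to the minimum of the two eras.
fn withdrawal_era_limit(
    active_era: EraIndex,
    last_fully_processed_offence_era: EraIndex,
) -> EraIndex {
    active_era.min(last_fully_processed_offence_era)
}

/// Sum of all chunks whose unlock era is at or below the limit.
fn withdrawable_balance(chunks: &[UnlockChunk], limit: EraIndex) -> u128 {
    chunks
        .iter()
        .filter(|c| c.era <= limit)
        .map(|c| c.value)
        .sum()
}
```

When offence processing has caught up, the limit equals the active era and behaviour is unchanged; it only bites while the queue is behind.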

---

### 2. Block withdrawals if unapplied slashes remain in the previous era
Introduces a new safeguard: withdrawals are blocked if the immediately
concluded era has unapplied slashes. Once that era is cleared,
withdrawals resume as normal. Only the previous era is checked, so if
this is not enough to nudge participants into clearing the unapplied
slashes, withdrawals resume anyway in the next era (provided the current
era does not leave unapplied slashes of its own).

While the block is in effect, attempting to withdraw fails with the error
`UnappliedSlashesInPreviousEra`. Anyone can look up the unapplied
slashes in the previous era through the storage `UnappliedSlashes` and
apply these via the permissionless call `apply_slash`.

This light enforcement should be enough to maintain slashing guarantees
without being too disruptive.
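
A minimal sketch of this check, assuming unapplied slashes are keyed by the era in which they were due to be applied. The map and the function name are hypothetical; only the error name is taken from the description above.

```rust
use std::collections::BTreeMap;

type EraIndex = u32;

#[derive(Debug, PartialEq)]
enum WithdrawError {
    UnappliedSlashesInPreviousEra,
}

/// Block withdrawals only if the era just before `active_era` still has
/// unapplied slashes. Older eras are deliberately ignored.
fn ensure_previous_era_cleared(
    active_era: EraIndex,
    unapplied: &BTreeMap<EraIndex, Vec<u64>>, // era -> pending slash ids (illustrative)
) -> Result<(), WithdrawError> {
    let previous_era = match active_era.checked_sub(1) {
        Some(era) => era,
        None => return Ok(()), // era 0 has no previous era
    };
    match unapplied.get(&previous_era) {
        Some(slashes) if !slashes.is_empty() => {
            Err(WithdrawError::UnappliedSlashesInPreviousEra)
        }
        _ => Ok(()),
    }
}
```

Because only `active_era - 1` is inspected, a stale entry automatically stops blocking withdrawals once the era advances.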

---

### 3. Ensure a full era for applying slashes
Previously, it was possible to receive an offence report at the very end
of the era when its slashes were meant to be applied.

We now reject offences that arrive **after** the end of the era *before*
their application era. An event `OffenceTooOld` is emitted when this
happens to make the behavior visible.

**Open question:**  
We may want to update the `prune_up_to` value sent from AH to RC to
`ActiveEra - SlashDeferDuration + 1` instead of
`ActiveEra - BondingDuration`. This could further guarantee that late
offences never reach the staking pallet.
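
The era arithmetic can be sketched as follows, assuming the application era is `offence_era + SlashDeferDuration` and that the defer duration is at least one era. The enum and function names are illustrative, and the second function only mirrors the arithmetic of the expression in the open question; it makes no claim about how `prune_up_to` is interpreted on the RC side.

```rust
type EraIndex = u32;

#[derive(Debug, PartialEq)]
enum OffenceCheck {
    /// The offence is accepted; its slashes get the whole `application_era`.
    Accepted { application_era: EraIndex },
    /// The offence arrived once its application era had already started.
    TooOld,
}

/// Accept an offence only if it arrives before its application era begins.
fn check_offence(
    offence_era: EraIndex,
    active_era: EraIndex,
    slash_defer_duration: EraIndex,
) -> OffenceCheck {
    let application_era = offence_era.saturating_add(slash_defer_duration);
    if active_era < application_era {
        OffenceCheck::Accepted { application_era }
    } else {
        OffenceCheck::TooOld
    }
}

/// Oldest offence era that `check_offence` still accepts:
/// `active_era - slash_defer_duration + 1`, clamped at zero.
fn oldest_accepted_offence_era(
    active_era: EraIndex,
    slash_defer_duration: EraIndex,
) -> EraIndex {
    active_era
        .saturating_add(1)
        .saturating_sub(slash_defer_duration)
}
```

Under these assumptions the two functions agree: an offence is accepted exactly when its era is at or above `oldest_accepted_offence_era`.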

---

### 4. Unbonding chunks are keyed by active era
We’re moving away from using `CurrentEra` in business logic (except for
elections). This change aligns unbonding with `ActiveEra`. The rest of
the code will be refactored in
[#8807](#8807).
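
To show why the choice of era matters, here is a small sketch assuming a chunk's unlock era is computed as "era at unbond time plus bonding duration". The `Eras` struct and both function names are hypothetical.

```rust
type EraIndex = u32;

/// Hypothetical snapshot of the two era counters.
struct Eras {
    /// The era currently in effect.
    active: EraIndex,
    /// The era being planned by election logic (may be `active + 1`).
    current: EraIndex,
}

/// Unlock era when chunks are keyed by the active era.
fn unlock_era_by_active(eras: &Eras, bonding_duration: EraIndex) -> EraIndex {
    eras.active.saturating_add(bonding_duration)
}

/// Unlock era when chunks are keyed by the planned (current) era, for comparison.
fn unlock_era_by_current(eras: &Eras, bonding_duration: EraIndex) -> EraIndex {
    eras.current.saturating_add(bonding_duration)
}
```

While an election is in progress and `current` is one ahead of `active`, the two keys differ by one era; keying by `active` removes that dependence on election timing.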

---

### 5. More checks on offence pipeline health
Added extra try-state checks to ensure the offence processing state is
healthy.

---

## Notes

This is mostly a defensive improvement. These situations are extremely
rare, but the added safeguards ensure slashing guarantees are upheld
even in these extreme cases.
paritytech-release-backport-bot Bot pushed a commit that referenced this pull request Jul 30, 2025
(cherry picked from commit 204c916)
@Ank4n Ank4n marked this pull request as ready for review July 31, 2025 01:02
fn on_idle_unstake(b: Linear<1, { T::BatchSize::get() }>) {
ErasToCheckPerBlock::<T>::put(1);
// initialise the era.
T::Staking::set_active_era(0);

remove and use from testing utils

@Ank4n Ank4n changed the title [Staking AHM] Only use active era outside election logic [WIP] Only use active era outside election logic Jul 31, 2025
@Ank4n Ank4n changed the title [WIP] Only use active era outside election logic [WIP] Use active era and not current era for bonding logic Aug 7, 2025
@Ank4n Ank4n changed the title [WIP] Use active era and not current era for bonding logic Use active era and not current era for bonding logic Aug 7, 2025
@Ank4n Ank4n changed the title Use active era and not current era for bonding logic [Staking] Use active era and not current era for bonding logic Aug 7, 2025
sigurpol commented Sep 4, 2025

Maybe it's just me, but can we also rename CurrentEra to PlannedEra or something like that in the scope of this task?

alvicsam pushed a commit that referenced this pull request Oct 17, 2025
@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/19711105678
Failed job name: test-linux-stable

Labels: R0-no-crate-publish-required (the change does not require any crates to be re-published), T2-pallets (this PR/issue is related to a particular pallet).