Skip to content

Conversation

@michalkucharczyk
Copy link
Contributor

@michalkucharczyk michalkucharczyk commented Mar 3, 2025

Description

Avoid banning future/stale transactions reported as invalid by the authorship module.

Note for reviewers

When re-org is handled by transaction pool, the view for new fork (Bn') is cloned from the tip of the other existing fork (Bn). The new view is not entirely re-validated during the maintain process (it will be revalidated in the background), so it may happen that it contains transactions that are ready on (Bn) but actually are not ready on (Bn'). All required (which are expected to be in retracted set) transactions are submitted to the new view, but order of txs in ready iterator is not updated.

The proper fix would require to re-build the the iterator - which is not trivial as we do not have tags for transactions for block Bn' yet. We could force retracted txs to be before ready transactions but it also does not feel to be a good solution - it still would be best effort trial.

For now allowing future transactions to re-enter the view immediately is in my opinion a good compromise. This PR is a quick fix and actually brings back behavior of txpool from before merging #6008. The bad thing is that incorrect transactions are detected during block authorship, but this situation to happen requires some specific pre-conditions which should be rare.

If this PR is not merged, some transaction will get included into blocks, only after DEFAULT_BAN_TIME_SECS, which is pretty bad.

@michalkucharczyk michalkucharczyk changed the title fatxpool: report_invalid: do not ban Future/Stale txs from re-entring the view fatxpool: report_invalid: do not ban Future/Stale txs from re-entering the view Mar 4, 2025
@michalkucharczyk michalkucharczyk marked this pull request as ready for review March 13, 2025 11:21
@michalkucharczyk
Copy link
Contributor Author

/cmd prdoc --bump minor --audience node_dev

@michalkucharczyk michalkucharczyk added R0-no-crate-publish-required The change does not require any crates to be re-published. T0-node This PR/Issue is related to the topic “node”. labels Mar 13, 2025
Copy link
Contributor

@iulianbarbu iulianbarbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@michalkucharczyk michalkucharczyk added this pull request to the merge queue Mar 19, 2025
@michalkucharczyk
Copy link
Contributor Author

So I was thinking a bit about selecting the best view to clone from in case of forks - what we discussed in DMs. Leaving it here for the record.

Let's consider following situation:

                  B1[Q1..Qn]   ---  B2[R1..Rn]   
                  vr: R1..Rn        vr: S1..Sn
                /
               /
B0 [P1...Pn] -
vr: Q1..Qn     \
                \
                  B1'[...]   ---  B2'[...]   

Legend
Transactions (Q depends on P):
Pi <- Qi <- Ri <- Si <- Ti

Bn[P] - Block with transaction included
vr: - ready transaction in view for given block

Let's assume we re-org from B2 to B2'.

When selecting B2 as the source for clone, we risk that S1..Sn which are ready in the B2 view, will not reflect the reality of the actual state at B2'. Meaning that in worst case, the block builder will find all the transactions invalid (future), and empty block will be built.

I think, this is quite unlikely to happen. Very specific condition would need to happen: B1' and B2' block would need to NOT contain Q1..Qn + R1..Rn transactions, what means that pools on different nodes are heavily mis-aligned.

When selecting B0 as the source for clone, we would need to revalidate S1..Sn transactions, what may significantly increase maintain duration and may result in building empty block. Light ready_at will not help here because Q1..Qn (which are ready transcation that would be considered in ready_at light) are most probably already included in B1' (or B2').

So when looking at those two options it is still better to choose the tip of the other fork, because the likeliness of having proper (for state B2') is quite high.

And the scenario addressed in this PR is also likely to happen when we have chains of transcations. For independent transactions the issue will not happen - and we will save a lot validations comparing to cloning from B0.

It is also worth mentioning that we most likely have incoming transactions T1..Tn (not depicted) which are kept in the mempool. And once the view for B2' is cloned we also need to submit (what means validate) those transactions from the mempool.

Merged via the queue into master with commit 954d857 Mar 19, 2025
249 of 256 checks passed
@michalkucharczyk michalkucharczyk deleted the mku-fatxpool-report-invalid-fix branch March 19, 2025 16:53
@michalkucharczyk michalkucharczyk added the A4-backport-stable2503 Pull request must be backported to the stable2503 release branch label Apr 1, 2025
@paritytech-release-backport-bot

Created backport PR for stable2503:

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin backport-7777-to-stable2503
git worktree add --checkout .worktree/backport-7777-to-stable2503 backport-7777-to-stable2503
cd .worktree/backport-7777-to-stable2503
git reset --hard HEAD^
git cherry-pick -x 954d8571df26b5cc0d342c6c05d414bd6912f1cd
git push --force-with-lease

michalkucharczyk added a commit that referenced this pull request Apr 1, 2025
…ing the view (#7777)

Avoid banning future/stale transactions reported as invalid by the
authorship module.

When re-org is handled by transaction pool, the view for new fork
(`Bn'`) is cloned from the tip of the other existing fork (`Bn`). The
new view is not entirely re-validated during the maintain process (it
will be revalidated in the background), so it may happen that it
contains transactions that are ready on (`Bn`) but actually are not
ready on (`Bn'`). All required (which are expected to be in retracted
set) transactions are submitted to the new view, but order of txs in
ready iterator is not updated.

The proper fix would require to re-build the the iterator - which is not
trivial as we do not have tags for transactions for block `Bn'` yet. We
could force retracted txs to be before ready transactions but it also
does not feel to be a good solution - it still would be best effort
trial.

For now allowing future transactions to re-enter the view immediately is
in my opinion a good compromise. This PR is a quick fix and actually
brings back behavior of txpool from before merging #6008. The bad thing
is that incorrect transactions are detected during block authorship, but
this situation to happen requires some specific pre-conditions which
should be rare.

If this PR is not merged, some transaction will get included into
blocks, only after
[`DEFAULT_BAN_TIME_SECS`](https://github.com/paritytech/polkadot-sdk/blob/4b39ff00b887039bc3e02a763a29deb09df56833/substrate/client/transaction-pool/src/graph/rotator.rs#L37),
which is pretty bad.

---------

Co-authored-by: cmd[bot] <41898282+github-actions[bot]@users.noreply.github.com>
(cherry picked from commit 954d857)
ordian added a commit that referenced this pull request Apr 28, 2025
* master: (26 commits)
  Snowbridge V2 (#7402)
  [AHM] Revert multi-block election, slashing and staking client pallets (#7939)
  docs: update local ci execution instruction (#8003)
  Upgrade deps to eliminate ancient dependencies (#7999)
  Removed `pallet:getter` from XCM pallets (#7916)
  Add digest processor xcm emulator (#7915)
  Fix: [Referenda Tracks] Resolve representation issues that are breaking PJS apps (#7671)
  Improve XCMP weight metering (#7963)
  bump version of zombienet-sdk (#7964)
  rpc-v2/archive: Rename archive call method result to value (#7885)
  Bump parachains runtime api to 13 (#7981)
  `bp-runtime`: make macro expansion not rely on `sp-std` in scope. (#7978)
  [CI/CD] Refactor backports flow so that it can determine automatically where to do a backport based on labels (#7976)
  Treasury: update expire date on payout (#7958) (#7959)
  `fatxpool`: report_invalid: do not ban Future/Stale txs from re-entering the view (#7777)
  Fix XCM Barrier Rejection Handling to Return Incomplete with Weight (#7843)
  Bump openssl from 0.10.64 to 0.10.70 (#7442)
  runtime-api: remove redundant version checks (#7610)
  Upgrade link-checker cache to v4 (#7874)
  Updating readmes (#7950)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A4-backport-stable2503 Pull request must be backported to the stable2503 release branch R0-no-crate-publish-required The change does not require any crates to be re-published. T0-node This PR/Issue is related to the topic “node”.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fatxpool: transaction reported as invalid during block building shall be removed from the pool

4 participants