Skip to content

SIMD#83 - Allow conflicting entries in the account locking function#1624

Closed
Huzaifa696 wants to merge 20 commits intoanza-xyz:masterfrom
rakurai-io:allow-locking-conflicting-entries
Closed

SIMD#83 - Allow conflicting entries in the account locking function#1624
Huzaifa696 wants to merge 20 commits intoanza-xyz:masterfrom
rakurai-io:allow-locking-conflicting-entries

Conversation

@Huzaifa696
Copy link
Copy Markdown

@Huzaifa696 Huzaifa696 commented Jun 6, 2024

Problem

Under the old assumption that all entries are disjoint (i.e do not contain conflicting txs), the account lock function do not allow acquiring locks for conflicting txs in the same entry. This assumption is now changed with SIMD#83.

Summary of Changes

Changes to enable account locking of self-conflicting entries:

  • It is now allow for a tx account to conflict with another account in the same entry.
  • The locking function will allow locking self-conflicting entries based on a flag that will be set when the relevant feature is active.
  • Introduce a flag to indicate that the entry is self-conflicting or not. This will be used in replay stage to allow flexible re-batching.

Related to:
#1025

Copy link
Copy Markdown

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments on code (not tests); Also please remove the "fixes" from the PR description, since this will not close all items in the issue

Comment thread core/src/banking_stage/consumer.rs Outdated
Comment thread runtime/src/bank.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
loop {
// try to lock the accounts
let batch = bank.prepare_sanitized_batch(transactions);
let (batch, _self_conflicting_batch) =
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just want to clarify, there will be a follow-up PR with adjustments in this replay scheduler in order to use the fact its' a self-conflicting entry.

As PR currently stands it doesn't seem to satisfy the 2nd item of #1025

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right, the PR that handles the conflicting entries on replay side is also in progress.

Comment thread accounts-db/src/accounts.rs Outdated
writable_keys: Vec<&Pubkey>,
readonly_keys: Vec<&Pubkey>,
) -> Result<()> {
entry_accts_lookup: &mut EntryAcctLocks,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: variable name and type isn't consistent. how about entry_account_locks?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo, better as batch_account_locks. Entry is a concept at a higher level (PoH) and really shouldn't be making its' way down here.

We have a batch of transactions, they may be from and entry if this is on block-verification, or they may have chance to become an entry if this is block-production.

Comment thread accounts-db/src/accounts.rs Outdated
pub type PubkeyAccountSlot = (Pubkey, AccountSharedData, Slot);

#[derive(Debug, Default)]
struct EntryAcctLocks {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i prefer EntryAccountLocks, considering this is a close kin to already existing AccountLocks.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
debug!("Writable account in use: {:?}", k);
return (Err(TransactionError::AccountInUse), self_conflicting_tx);
} else {
self_conflicting_tx = true;
Copy link
Copy Markdown
Member

@ryoqun ryoqun Jun 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self_conflicting_tx can be set to true if entry_accts_lookup.writables.contains(k) == false and entry_accts_lookup.readables.contains(k) == false for 2nd batch, which is conflicting with the first batch.

is that semantically correct?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, you are right, this is not semantically correct.

although, self_conflicting_tx will have no effect if the batch locking fails on the replay side in process_entries (which is part of the next PR).
To be clearer, I will return false for every error condition Err(TransactionError::AccountInUse)

Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
}
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

below in AccountLocks, I'm considering we may want to have write_locks also track the number of outstanding write-locks, even if all must be in same batch.

It makes it so that the sanity checks we do when unlocking batches not broken by the new functionality.

@Huzaifa696
Copy link
Copy Markdown
Author

Incorporated the above points except the pre-allocation point in [2f4f969], please review.

Note: I also ran cargo fmt so it also contains some formatting changes.

@apfitzge
Copy link
Copy Markdown

Looking at changes.

Note: I also ran cargo fmt so it also contains some formatting changes.

Please also run clippy; it's one of the first thing CI is going to do - so not worth blocking the queue up if we know it will fail on clippy.

Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
batch_account_locks: &mut BatchAccountLocks,
allow_self_conflicting_entries: bool,
) -> (Result<()>, bool) {
let mut self_conflicting_tx = false;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this indiciates a self-conflicting batch. tx still cannot have duplicate keys so can't conflict with itself.

Comment thread accounts-db/src/accounts.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
Comment thread core/src/banking_stage/consumer.rs Outdated
Comment thread runtime/src/bank.rs Outdated
) -> TransactionBatch<'a, 'b> {
// this lock_results could be: Ok, AccountInUse, WouldExceedBlockMaxLimit or WouldExceedAccountMaxLimit
let tx_account_lock_limit = self.get_transaction_account_lock_limit();
let feature_set: Arc<FeatureSet> = self.feature_set.clone();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not clone here, see other comment.

Comment thread runtime/src/bank.rs Outdated
@Huzaifa696
Copy link
Copy Markdown
Author

Incorporated the latest review points [7cfde0b]. Please review.

My standard practice is to run these checks before pushing:

  • cargo check
  • cargo check --tests
  • cargo fmt
  • cargo clippy
  • cargo test --package solana-accounts-db

Comment thread runtime/src/bank.rs Outdated
Comment thread runtime/src/bank.rs Outdated
Comment thread runtime/src/transaction_batch.rs Outdated
Comment thread sdk/src/feature_set.rs Outdated
Comment thread accounts-db/src/accounts.rs Outdated
@apfitzge
Copy link
Copy Markdown

apfitzge commented Jun 27, 2024

With this sort of change, to something that is critical to Solana's block-verfication, we always need to take some care that we are not making the performance worse.
There are several benchmarks available which will help us gain confidence in this code, here let's just use the blockstore_processor.

Command:

./cargo nightly bench --package solana-ledger --bench blockstore_processor -- --show-output

@70089cce5119c9afaeb2986e2ecaa6d4505ec15d (where this branches off):

test bench_execute_batch_full_batch                        ... bench:     698,738 ns/iter (+/- 26,794)
test bench_execute_batch_full_batch_disable_tx_cost_update ... bench:     710,684 ns/iter (+/- 27,627)
test bench_execute_batch_half_batch                        ... bench:     745,050 ns/iter (+/- 28,821)
test bench_execute_batch_half_batch_disable_tx_cost_update ... bench:     758,186 ns/iter (+/- 30,148)
test bench_execute_batch_unbatched                         ... bench:   1,269,662 ns/iter (+/- 125,881)
test bench_execute_batch_unbatched_disable_tx_cost_update  ... bench:   1,246,133 ns/iter (+/- 74,614)

@7cfde0b44fdfa08b18dbc8368a947f9d4285001f (this PR w/ current state):

test bench_execute_batch_full_batch                        ... bench:     695,840 ns/iter (+/- 41,276)
test bench_execute_batch_full_batch_disable_tx_cost_update ... bench:     713,964 ns/iter (+/- 29,228)
test bench_execute_batch_half_batch                        ... bench:     724,764 ns/iter (+/- 105,778)
test bench_execute_batch_half_batch_disable_tx_cost_update ... bench:     749,462 ns/iter (+/- 26,902)
test bench_execute_batch_unbatched                         ... bench:   1,377,762 ns/iter (+/- 212,255)
test bench_execute_batch_unbatched_disable_tx_cost_update  ... bench:   1,270,989 ns/iter (+/- 122,367)

Times not significantly worse, within uncertainty bounds. However, the uncertainty seems to be increased across the board - my instinct is that this is due to use performing additional allocations for the new hashsets.

If I make the change to use thread_local!:

test bench_execute_batch_full_batch                        ... bench:     677,980 ns/iter (+/- 57,445)
test bench_execute_batch_full_batch_disable_tx_cost_update ... bench:     710,150 ns/iter (+/- 29,212)
test bench_execute_batch_half_batch                        ... bench:     740,624 ns/iter (+/- 25,573)
test bench_execute_batch_half_batch_disable_tx_cost_update ... bench:     748,006 ns/iter (+/- 20,819)
test bench_execute_batch_unbatched                         ... bench:   1,221,819 ns/iter (+/- 45,270)
test bench_execute_batch_unbatched_disable_tx_cost_update  ... bench:   1,259,082 ns/iter (+/- 35,827)

potentially worth a more targetted bench to benchmark just the locking here.

@Huzaifa696 Huzaifa696 force-pushed the allow-locking-conflicting-entries branch from 557dd6b to 7aeab68 Compare July 26, 2024 12:59
@Huzaifa696 Huzaifa696 marked this pull request as ready for review July 30, 2024 08:07
@Huzaifa696 Huzaifa696 force-pushed the allow-locking-conflicting-entries branch from 7aeab68 to 053635a Compare August 6, 2024 10:26
@Huzaifa696
Copy link
Copy Markdown
Author

Rebased again to incorporate the latest API modifications in account_locks.rs file. Unit tests also updated. Even though high level tests for lock_accounts are already written in test_conflicting_accounts_locks, some well directed tests for the new try_lock_accounts_with_conflicting_batches function will be incorporated soon.

@talalim
Copy link
Copy Markdown

talalim commented Aug 8, 2024

added low level tests for try_lock_accounts_with_conflicting_batches

@Huzaifa696 Huzaifa696 force-pushed the allow-locking-conflicting-entries branch 2 times, most recently from e29f2bc to be2ba84 Compare September 4, 2024 05:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants