Optimistic signature verification for commit votes #14643

Merged
merged 63 commits into from
Oct 9, 2024

Conversation

@vusirikala vusirikala commented Sep 15, 2024

Description

This PR implements optimistic signature verification to reduce the time spent verifying commit votes.
When the optimistic signature verification feature flag is enabled, commit votes are not verified up front. Instead, the unverified votes are accumulated, and once their combined voting power exceeds a threshold, all of their signatures are aggregated and the aggregated signature is verified in a single check.
If that check fails, each signature has to be verified individually. The ValidatorVerifier stores the list of authors that submitted bad messages and disables optimistic signature verification for these malicious voters.
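
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Vote`, `Verifier`, `try_aggregate`, `needs_eager_verification`, and `expected_signature` are hypothetical names, and the signature check is a toy arithmetic stand-in for real signature aggregation so that the example runs without a cryptography crate. It also assumes at most one vote per author.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for the real consensus types.
type Author = u32;

#[derive(Clone, Debug)]
struct Vote {
    author: Author,
    signature: u64,
}

// Toy "signing" rule (an assumption for illustration only).
fn expected_signature(author: Author, message: u64) -> u64 {
    u64::from(author).wrapping_mul(1000).wrapping_add(message)
}

struct Verifier {
    voting_power: HashMap<Author, u64>,
    threshold: u64,
    // Authors whose signatures failed verification in the past.
    pessimistic_authors: HashSet<Author>,
}

impl Verifier {
    // Total voting power behind a set of votes (unknown authors count as 0).
    fn power_of(&self, votes: &[Vote]) -> u64 {
        votes.iter().filter_map(|v| self.voting_power.get(&v.author)).sum()
    }

    fn verify_one(&self, vote: &Vote, message: u64) -> bool {
        vote.signature == expected_signature(vote.author, message)
    }

    // Toy aggregate check: one comparison over the combined signatures.
    fn verify_aggregate(&self, votes: &[Vote], message: u64) -> bool {
        let got = votes.iter().fold(0u64, |acc, v| acc.wrapping_add(v.signature));
        let want = votes
            .iter()
            .fold(0u64, |acc, v| acc.wrapping_add(expected_signature(v.author, message)));
        got == want
    }

    // Votes from previously misbehaving authors should be checked up front.
    fn needs_eager_verification(&self, author: Author) -> bool {
        self.pessimistic_authors.contains(&author)
    }

    // Returns the verified votes once enough voting power has accumulated,
    // or None if more votes are still needed.
    fn try_aggregate(&mut self, unverified: &[Vote], message: u64) -> Option<Vec<Vote>> {
        if self.power_of(unverified) < self.threshold {
            return None; // keep accumulating
        }
        if self.verify_aggregate(unverified, message) {
            return Some(unverified.to_vec()); // fast path: a single check
        }
        // Slow path: verify individually and remember the bad authors.
        let mut valid = Vec::new();
        for vote in unverified {
            if self.verify_one(vote, message) {
                valid.push(vote.clone());
            } else {
                self.pessimistic_authors.insert(vote.author);
            }
        }
        if self.power_of(&valid) >= self.threshold {
            Some(valid)
        } else {
            None
        }
    }
}
```

In the common case where every vote is honest, the cost is one aggregate check instead of one check per vote. The per-signature fallback only runs when at least one signature in the batch is bad.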

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented Sep 15, 2024

⏱️ 3h 59m total CI duration on this PR

| Slowest 15 Jobs | Cumulative Duration | Recent Runs |
| --- | --- | --- |
| execution-performance / single-node-performance | 1h 6m | 🟩🟩🟩 |
| forge-compat-test / forge | 33m | 🟩🟩 |
| forge-e2e-test / forge | 28m | 🟩🟩 |
| test-target-determinator | 13m | 🟩🟩🟩 |
| execution-performance / test-target-determinator | 13m | 🟩🟩🟩 |
| check | 11m | 🟩🟩🟩 |
| rust-move-tests | 10m | 🟩 |
| rust-move-tests | 10m | 🟩 |
| rust-move-tests | 10m | 🟩 |
| rust-move-tests | 9m | 🟩 |
| general-lints | 7m | 🟩🟩🟩🟩 |
| rust-cargo-deny | 7m | 🟩🟩🟩🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-doc-tests | 3m | 🟥 |

@vusirikala vusirikala changed the base branch from main to satya/osv_votes_and_order_votes September 15, 2024 01:26
@vusirikala vusirikala requested review from sitalkedia, danielxiangzl and igor-aptos and removed request for gregnazario, JoshLind and sasha8 September 15, 2024 01:26
@vusirikala vusirikala added the CICD:run-e2e-tests label (when this label is present github actions will run all land-blocking e2e tests from the PR) Sep 15, 2024

 match commit_msg.req.verify(&epoch_state_clone.verifier) {
     Ok(_) => {
-        let _ = tx.unbounded_send(commit_msg);
+        let _ = tx.unbounded_send((commit_msg, true));
Contributor

nit: Instead of true and false, it would read better to define an enum with verified and unverified variants and pass that.

Contributor Author

Done.
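
One way the suggested enum could look is sketched below. The names `VerificationStatus`, `CommitMsg`, and `tag_message` are hypothetical, and the function returns the tagged pair instead of sending it over a channel so that the example needs no external crates. The enum actually used in the PR may differ.

```rust
// Hypothetical replacement for the bare `true`/`false` flag.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum VerificationStatus {
    Verified,
    Unverified,
}

// Hypothetical stand-in for the real commit message type.
#[derive(Debug, PartialEq)]
struct CommitMsg(String);

// Tags a message with its verification status.
// With optimistic verification the check is skipped and the message is
// marked Unverified; otherwise it is verified up front and dropped on failure.
fn tag_message(
    msg: CommitMsg,
    optimistic: bool,
    is_valid: impl Fn(&CommitMsg) -> bool,
) -> Option<(CommitMsg, VerificationStatus)> {
    if optimistic {
        Some((msg, VerificationStatus::Unverified))
    } else if is_valid(&msg) {
        Some((msg, VerificationStatus::Verified))
    } else {
        None
    }
}
```

A receiver can then `match` on the status, and a call site reads as `(msg, VerificationStatus::Verified)` rather than `(msg, true)`.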

Base automatically changed from satya/osv_votes_and_order_votes to main October 7, 2024 23:20

for vote in unverified_votes {
    let author = vote.author();
    let sig = vote.signature_with_status();
    if vote.ledger_info() == commit_ledger_info {
Contributor

I don't understand what this condition means. If the ledger info doesn't match, should we just filter out those signatures?

@@ -95,7 +74,7 @@ fn aggregate_commit_proof(
 // we differentiate buffer items at different stages
 // for better code readability
 pub struct OrderedItem {
-    pub unverified_signatures: PartialSignatures,
+    pub unverified_votes: Vec<CommitVote>,
Contributor

Using a vector here loses the guarantee, which partial signatures provide, that each author has only one vote. Why is this not a ledger info with unverified signatures?
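
One possible way to keep the one-vote-per-author guarantee is to key the votes by author, as in the sketch below. The types here are simplified, hypothetical stand-ins (a `String` for the ledger info, a `u64` for the signature), and `add_vote` and `votes_for` are illustrative helpers, not functions from this PR.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the real types.
type Author = u32;

#[derive(Clone, Debug, PartialEq)]
struct CommitVote {
    author: Author,
    ledger_info: String,
    signature: u64,
}

#[derive(Default)]
struct OrderedItem {
    // Keyed by author, so each author holds at most one unverified vote.
    unverified_votes: HashMap<Author, CommitVote>,
}

impl OrderedItem {
    // Stores the vote and returns the vote it replaced, if any.
    fn add_vote(&mut self, vote: CommitVote) -> Option<CommitVote> {
        self.unverified_votes.insert(vote.author, vote)
    }

    // Yields only the votes whose ledger info matches the commit ledger info;
    // votes for any other ledger info are filtered out.
    fn votes_for<'a>(
        &'a self,
        commit_ledger_info: &'a str,
    ) -> impl Iterator<Item = &'a CommitVote> + 'a {
        self.unverified_votes
            .values()
            .filter(move |v| v.ledger_info.as_str() == commit_ledger_info)
    }
}
```

With a map, a second vote from the same author overwrites the first instead of being counted twice, and the ledger-info comparison becomes an explicit filter step.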

@@ -704,6 +698,17 @@ impl BufferManager {
         }
     }

+    fn get_commit_message(commit_vote: CommitVote) -> CommitMessage {
Contributor

nit: functions of this type are typically prefixed with gen or generate, not get.

for vote in unverified_votes.values() {
    let sig = vote.signature_with_status();
    if vote.ledger_info() == commit_ledger_info {
        li_with_sig.add_signature(vote.author(), sig);
Contributor

nit: this function signature looks awkward; it should take ownership of sig instead of taking a reference and cloning inside.
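
The difference between the two signatures can be sketched as follows. `Signature` and `PartialSignatures` are simplified, hypothetical stand-ins here, and `add_signature_by_ref` exists only to show the shape being criticized.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-ins for the real types.
type Author = u32;

#[derive(Clone, Debug, PartialEq)]
struct Signature(Vec<u8>);

#[derive(Default)]
struct PartialSignatures {
    signatures: BTreeMap<Author, Signature>,
}

impl PartialSignatures {
    // Awkward shape: borrows the signature but always clones it internally,
    // so the caller pays for a copy even when it no longer needs the value.
    fn add_signature_by_ref(&mut self, author: Author, signature: &Signature) {
        self.signatures.insert(author, signature.clone());
    }

    // Suggested shape: takes ownership, so the caller decides whether to
    // move the value or to clone it explicitly at the call site.
    fn add_signature(&mut self, author: Author, signature: Signature) {
        self.signatures.insert(author, signature);
    }
}
```

With the owning version, a caller that is done with the signature moves it in at no extra cost, and a caller that still needs it writes `sig.clone()` where the copy is visible.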

@@ -321,7 +321,7 @@ impl Default for ConsensusConfig {
             num_bounded_executor_tasks: 16,
             enable_pre_commit: true,
             max_pending_rounds_in_commit_vote_cache: 100,
-            optimistic_sig_verification: false,
+            optimistic_sig_verification: true,
Contributor

Disable this before landing.

@vusirikala vusirikala enabled auto-merge (squash) October 9, 2024 22:36

Contributor

github-actions bot commented Oct 9, 2024

✅ Forge suite realistic_env_max_load success on a3ba6dc1a71de655aec773a105feeca5552c937f

two traffics test: inner traffic : committed: 13655.62 txn/s, latency: 2916.16 ms, (p50: 2700 ms, p70: 3000, p90: 3000 ms, p99: 6000 ms), latency samples: 5192200
two traffics test : committed: 100.00 txn/s, latency: 2484.54 ms, (p50: 2400 ms, p70: 2500, p90: 2800 ms, p99: 5800 ms), latency samples: 1780
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.251, avg: 0.222", "QsPosToProposal: max: 0.269, avg: 0.247", "ConsensusProposalToOrdered: max: 0.332, avg: 0.301", "ConsensusOrderedToCommit: max: 0.478, avg: 0.455", "ConsensusProposalToCommit: max: 0.775, avg: 0.756"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 1.01s no progress at version 32781 (avg 0.21s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 8.23s no progress at version 2050010 (avg 6.64s) [limit 15].
Test Ok

Contributor

github-actions bot commented Oct 9, 2024

✅ Forge suite framework_upgrade success on 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> a3ba6dc1a71de655aec773a105feeca5552c937f

Compatibility test results for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> a3ba6dc1a71de655aec773a105feeca5552c937f (PR)
Upgrade the nodes to version: a3ba6dc1a71de655aec773a105feeca5552c937f
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1155.63 txn/s, submitted: 1159.15 txn/s, failed submission: 3.51 txn/s, expired: 3.51 txn/s, latency: 2558.38 ms, (p50: 2100 ms, p70: 2600, p90: 4200 ms, p99: 6600 ms), latency samples: 105240
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1129.51 txn/s, submitted: 1131.74 txn/s, failed submission: 2.23 txn/s, expired: 2.23 txn/s, latency: 2654.13 ms, (p50: 2400 ms, p70: 3000, p90: 3900 ms, p99: 5700 ms), latency samples: 101320
5. check swarm health
Compatibility test for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> a3ba6dc1a71de655aec773a105feeca5552c937f passed
Upgrade the remaining nodes to version: a3ba6dc1a71de655aec773a105feeca5552c937f
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 814.19 txn/s, submitted: 815.51 txn/s, failed submission: 1.32 txn/s, expired: 1.32 txn/s, latency: 3591.20 ms, (p50: 2400 ms, p70: 4500, p90: 6500 ms, p99: 15300 ms), latency samples: 73880
Test Ok

Contributor

github-actions bot commented Oct 9, 2024

✅ Forge suite compat success on 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> a3ba6dc1a71de655aec773a105feeca5552c937f

Compatibility test results for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> a3ba6dc1a71de655aec773a105feeca5552c937f (PR)
1. Check liveness of validators at old version: 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775
compatibility::simple-validator-upgrade::liveness-check : committed: 11266.66 txn/s, latency: 2499.51 ms, (p50: 1900 ms, p70: 2000, p90: 2400 ms, p99: 29000 ms), latency samples: 464220
2. Upgrading first Validator to new version: a3ba6dc1a71de655aec773a105feeca5552c937f
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7034.57 txn/s, latency: 3989.63 ms, (p50: 4300 ms, p70: 4700, p90: 4900 ms, p99: 5000 ms), latency samples: 130920
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6910.85 txn/s, latency: 4707.04 ms, (p50: 4900 ms, p70: 5100, p90: 6100 ms, p99: 6500 ms), latency samples: 237660
3. Upgrading rest of first batch to new version: a3ba6dc1a71de655aec773a105feeca5552c937f
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6733.24 txn/s, latency: 4128.30 ms, (p50: 4600 ms, p70: 5000, p90: 5200 ms, p99: 5400 ms), latency samples: 125640
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6932.23 txn/s, latency: 4649.81 ms, (p50: 4800 ms, p70: 5200, p90: 6300 ms, p99: 6800 ms), latency samples: 233680
4. upgrading second batch to new version: a3ba6dc1a71de655aec773a105feeca5552c937f
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 11060.83 txn/s, latency: 2497.95 ms, (p50: 2700 ms, p70: 2900, p90: 3000 ms, p99: 3200 ms), latency samples: 192480
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 11452.23 txn/s, latency: 2742.18 ms, (p50: 2600 ms, p70: 2800, p90: 3300 ms, p99: 4700 ms), latency samples: 370580
5. check swarm health
Compatibility test for 46bf19eb4f132b9d8fc19eff3f3334cdf9aa1775 ==> a3ba6dc1a71de655aec773a105feeca5552c937f passed
Test Ok

@vusirikala vusirikala merged commit 0efb4fc into main Oct 9, 2024
70 of 90 checks passed
@vusirikala vusirikala deleted the satya/osv_commit_votes branch October 9, 2024 23:15
3 participants