use gcp runners for fullnode-sync workflows #14966

Merged Oct 17, 2024 (2 commits)

@aluon aluon commented Oct 15, 2024

Description

This PR changes the runners used by the fullnode-sync workflows. Workflow runs have been failing due to insufficient disk space, so I'm reverting these workflows to the GCP runners while I investigate.

How Has This Been Tested?

I will run some of these workflows manually to test the change.

Key Areas to Review

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

@aluon aluon requested review from JoshLind and a team October 15, 2024 14:38
@aluon aluon requested a review from a team as a code owner October 15, 2024 14:38

trunk-io bot commented Oct 15, 2024

⏱️ 4h 12m total CI duration on this PR
| Slowest 15 Jobs | Cumulative Duration | Recent Runs |
| --- | --- | --- |
| fullnode-fast-testnet-main / fullnode-sync | 57m | 🟩 |
| fullnode-intelligent-devnet-main / fullnode-sync | 49m | 🟥 |
| fullnode-intelligent-devnet-main / fullnode-sync | 43m | 🟥 |
| rust-cargo-deny | 17m | 🟩🟩🟩🟩🟩 (+5 more) |
| check-dynamic-deps | 15m | 🟩🟩🟩🟩🟩 (+5 more) |
| test-target-determinator | 8m | 🟩🟩 |
| execution-performance / test-target-determinator | 8m | 🟩🟩 |
| check | 8m | 🟩🟩 |
| rust-doc-tests | 5m | 🟩 |
| rust-doc-tests | 5m | 🟩 |
| general-lints | 5m | 🟩🟩🟩🟩🟩 (+5 more) |
| semgrep/ci | 4m | 🟩🟩🟩🟩🟩 (+5 more) |
| fetch-last-released-docker-image-tag | 3m | 🟩🟩 |
| check-repo | 2m | 🟩🟩🟩🟩🟩 (+5 more) |
| rust-move-tests | 2m | 🟩 |

🚨 1 job on the last run was significantly faster/slower than expected

| Job | Duration | vs 7d avg | Delta |
| --- | --- | --- | --- |
| execution-performance / single-node-performance | 11s | 20m | -99% |

@JoshLind JoshLind left a comment

Thanks @aluon!

@aluon aluon force-pushed the aluon/fullnode-sync-instance-type branch 6 times, most recently from 5aaed51 to 10bda85 October 16, 2024 19:50
@aluon aluon force-pushed the aluon/fullnode-sync-instance-type branch from 10bda85 to 272e7b9 October 16, 2024 20:28
@aluon aluon changed the title from "update instance type for fullnode-sync workflows" to "use gcp runners for fullnode-sync workflows" Oct 16, 2024
@aluon aluon enabled auto-merge (squash) October 16, 2024 21:51

✅ Forge suite compat success on b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd

Compatibility test results for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd (PR)
1. Check liveness of validators at old version: b29f09f57e898d8d211c8bc3e303f6e50bba2266
compatibility::simple-validator-upgrade::liveness-check : committed: 11359.93 txn/s, latency: 2509.86 ms, (p50: 1900 ms, p70: 2100, p90: 2500 ms, p99: 25100 ms), latency samples: 462020
2. Upgrading first Validator to new version: 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7042.73 txn/s, latency: 3979.95 ms, (p50: 4500 ms, p70: 4800, p90: 5000 ms, p99: 5100 ms), latency samples: 134080
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 6477.59 txn/s, latency: 4888.36 ms, (p50: 5200 ms, p70: 5300, p90: 6800 ms, p99: 7100 ms), latency samples: 212340
3. Upgrading rest of first batch to new version: 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 6440.73 txn/s, latency: 4212.63 ms, (p50: 4600 ms, p70: 4900, p90: 5200 ms, p99: 5800 ms), latency samples: 130540
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 6792.92 txn/s, latency: 4753.06 ms, (p50: 4900 ms, p70: 5300, p90: 6600 ms, p99: 7100 ms), latency samples: 231540
4. upgrading second batch to new version: 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 5144.60 txn/s, latency: 4180.22 ms, (p50: 2700 ms, p70: 5500, p90: 11300 ms, p99: 13300 ms), latency samples: 128500
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 9568.74 txn/s, latency: 3213.62 ms, (p50: 2700 ms, p70: 3400, p90: 6100 ms, p99: 7900 ms), latency samples: 313740
5. check swarm health
Compatibility test for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd passed
Test Ok

✅ Forge suite realistic_env_max_load success on 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd

two traffics test: inner traffic : committed: 13491.02 txn/s, latency: 3020.54 ms, (p50: 2700 ms, p70: 3000, p90: 3300 ms, p99: 5100 ms), latency samples: 5129640
two traffics test : committed: 100.01 txn/s, latency: 2894.34 ms, (p50: 2400 ms, p70: 2700, p90: 3400 ms, p99: 49300 ms), latency samples: 1880
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.234, avg: 0.219", "QsPosToProposal: max: 0.424, avg: 0.331", "ConsensusProposalToOrdered: max: 0.322, avg: 0.302", "ConsensusOrderedToCommit: max: 0.520, avg: 0.464", "ConsensusProposalToCommit: max: 0.824, avg: 0.767"]
Max non-epoch-change gap was: 1 rounds at version 5792566 (avg 0.00) [limit 4], 1.83s no progress at version 5792566 (avg 0.22s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 8.58s no progress at version 2570381 (avg 4.82s) [limit 15].
Test Ok

✅ Forge suite framework_upgrade success on b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd

Compatibility test results for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd (PR)
Upgrade the nodes to version: 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1262.83 txn/s, submitted: 1264.97 txn/s, failed submission: 2.15 txn/s, expired: 2.15 txn/s, latency: 2556.50 ms, (p50: 2400 ms, p70: 2700, p90: 4200 ms, p99: 6000 ms), latency samples: 105880
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1198.84 txn/s, submitted: 1201.12 txn/s, failed submission: 2.28 txn/s, expired: 2.28 txn/s, latency: 2513.42 ms, (p50: 2400 ms, p70: 2700, p90: 4100 ms, p99: 5600 ms), latency samples: 105360
5. check swarm health
Compatibility test for b29f09f57e898d8d211c8bc3e303f6e50bba2266 ==> 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd passed
Upgrade the remaining nodes to version: 6345d3cc9399fd759b9acc95a40cf75d2e5f87fd
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1313.40 txn/s, submitted: 1317.27 txn/s, failed submission: 3.87 txn/s, expired: 3.87 txn/s, latency: 2493.52 ms, (p50: 2100 ms, p70: 2400, p90: 4200 ms, p99: 5400 ms), latency samples: 115520
Test Ok

@aluon aluon merged commit 02380fb into main Oct 17, 2024
46 checks passed
@aluon aluon deleted the aluon/fullnode-sync-instance-type branch October 17, 2024 21:16