validator: Add --enable-bigtable-ledger-upload flag #12040
Conversation
Codecov Report
|          | master | #12040 | +/-   |
|----------|--------|--------|-------|
| Coverage | 82.1%  | 82.1%  | -0.1% |
| Files    | 334    | 336    | +2    |
| Lines    | 78798  | 78856  | +58   |
| Hits     | 64768  | 64777  | +9    |
| Misses   | 14030  | 14079  | +49   |
On `let (bigtable_ledger_storage, _bigtable_ledger_upload_service) =`:
It's slightly awkward that BigTableUploadService is plumbed into RpcService like this. Moving it up a level into Validator would require moving the tokio Runtime and solana_storage_bigtable::LedgerStorage::new as well and I didn't like that code churn in this PR. That's my excuse at least
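As a rough illustration of the shape of such a background service (the names and structure here are assumptions for the sketch, not the actual BigTableUploadService implementation): a thread that periodically checks for newly confirmed blocks and honors a shared exit flag, mirroring the `exit_bigtable_ledger_upload_service` `AtomicBool` above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for the real service: a thread that wakes up
// periodically, would upload any newly confirmed blocks, and stops
// when the shared exit flag is set. Returns the iteration count so the
// caller can observe that the loop actually ran.
fn spawn_upload_service(exit: Arc<AtomicBool>) -> thread::JoinHandle<u64> {
    thread::spawn(move || {
        let mut iterations = 0;
        while !exit.load(Ordering::Relaxed) {
            // ...query the highest confirmed root and upload missing blocks here...
            iterations += 1;
            thread::sleep(Duration::from_millis(10));
        }
        iterations
    })
}

fn main() {
    let exit = Arc::new(AtomicBool::new(false));
    let handle = spawn_upload_service(exit.clone());
    thread::sleep(Duration::from_millis(50));
    exit.store(true, Ordering::Relaxed); // signal shutdown, as the validator would on exit
    let iterations = handle.join().unwrap();
    assert!(iterations >= 1);
}
```

The exit flag is the piece that would need to be plumbed from `Validator` down into `RpcService`, which is the awkwardness described above.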
On `async fn upload(`:
Changes in this file are really just moving code into ledger/
For mainnet-beta, I think we keep the same end-of-epoch upload process going as the last resort. Then additionally add this flag on one or more of the warehouse nodes to get confirmed block data into BigTable within seconds. At that point, we can start making API nodes ephemeral!
Looks good overall. I'll buy your rationalization for the new service in rpc_service, though I agree it's awkward.
I'm wondering about 2 potential races:

- Is this the right code for the end-of-epoch upload? https://github.com/solana-labs/cluster/blob/3614bbf6d36a275f4a5c0cba10d517367825b508/bin/warehouse-upload-to-storage-bucket.sh#L74 It seems like that is checking the entire ledger for gaps in BigTable storage all at once, and won't catch new confirmed blocks that are uploaded by the BigTableUploadService in the meantime. ...though on looking at the lower-level APIs more, maybe this isn't an issue, beyond efficiency.
- Although this isn't merged yet, it will bring the complication of blocktimes and might be worth planning for now: #11955. Because the CacheBlockTimeService is prompted by `handle_votable_bank` and the BigTableUploadService by changes in `highest_confirmed_root`, if both PRs get merged as is, the blocktime "should" get cached before the upload happens. But if the blocktime calculation backs up for some reason, there's no guarantee.
Yes, that's the right code. But it always runs on a ledger from the previous epoch, using the old ledger from the warehouse node. In the meantime, the warehouse node has been restarted from a snapshot and is building a new ledger.

I don't think we need to be this aggressive with uploading the ledger into BigTable. Would it help to trail the highest confirmed root by 100 slots or so?
Aha! That explains it. Thanks for that detail.
Yep, that would definitely help. 100 slots should be more than enough.
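A minimal sketch of the trailing idea discussed above (the constant name matches the later "Add LARGEST_CONFIRMED_ROOT_UPLOAD_DELAY" commit; the function itself is illustrative, not the actual implementation):

```rust
type Slot = u64;

// Trail the highest confirmed root by a fixed number of slots before
// uploading, so slower consumers (e.g. blocktime caching) can catch up.
const LARGEST_CONFIRMED_ROOT_UPLOAD_DELAY: Slot = 100;

/// Highest slot considered safe to upload, or `None` if the confirmed
/// root hasn't yet advanced past the delay.
fn upload_target(highest_confirmed_root: Slot) -> Option<Slot> {
    highest_confirmed_root.checked_sub(LARGEST_CONFIRMED_ROOT_UPLOAD_DELAY)
}

fn main() {
    assert_eq!(upload_target(50), None); // too early to upload anything
    assert_eq!(upload_target(250), Some(150)); // trail the root by 100 slots
}
```

Using `checked_sub` keeps the startup case (root still below the delay) explicit instead of underflowing.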
Cool, I added 3af90dd for this for now. But perhaps CacheBlockTimeService could signal the last slot it's added a block time to somehow, via another field in block_commitment_cache or a blockstore method? This way we remove the possible race entirely.
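If CacheBlockTimeService did expose the last slot with a cached blocktime, the uploader could simply cap its target at that slot. A hypothetical sketch (both names are illustrative; no such signal exists yet):

```rust
type Slot = u64;

/// Never upload past the last slot whose blocktime has been cached,
/// which would eliminate the blocktime race regardless of how far
/// the caching lags behind the confirmed root.
fn safe_upload_limit(highest_confirmed_root: Slot, last_blocktime_slot: Slot) -> Slot {
    highest_confirmed_root.min(last_blocktime_slot)
}

fn main() {
    assert_eq!(safe_upload_limit(200, 180), 180); // caching lags: stop at 180
    assert_eq!(safe_upload_limit(200, 250), 200); // caching ahead: root bounds us
}
```

Unlike the fixed 100-slot delay, this gate adapts automatically when blocktime calculation backs up.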
Maybe once #11955 lands, the loop in here can check if …
…2057)

* Add --enable-bigtable-ledger-upload flag (cherry picked from commit d8e2038)
* Relocate BigTable uploader to ledger/ crate (cherry picked from commit 91a56ca) — Conflicts: ledger/Cargo.toml
* Cargo.lock (cherry picked from commit 2b8a521)
* Add BigTableUploadService (cherry picked from commit bc7731b)
* Add BigTableUploadService (cherry picked from commit bafdcf2)
* Add exit flag for bigtable upload operations (cherry picked from commit d3611f7)
* Remove dead code (cherry picked from commit cd3c134)
* Request correct access (cherry picked from commit 4ba43c2)
* Add LARGEST_CONFIRMED_ROOT_UPLOAD_DELAY (cherry picked from commit b64fb29)
* Resolve merge conflicts

Co-authored-by: Michael Vines <mvines@gmail.com>
When enabled, the `--enable-bigtable-ledger-upload` flag will cause the validator to continually upload new confirmed blocks to BigTable in the background.
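A hypothetical invocation (the ledger path, credentials path, and any other arguments are placeholders; BigTable access is typically authenticated via the `GOOGLE_APPLICATION_CREDENTIALS` environment variable):

```shell
# Placeholders only: substitute your own service-account key and ledger path.
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
solana-validator \
  --ledger /path/to/ledger \
  --enable-bigtable-ledger-upload
```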