Create bootstrap replicaset#10
Conversation
b5d9dfb to
80c7496
Compare
80c7496 to
b9ef043
Compare
b9ef043 to
5c3067d
Compare
9f72e80 to
3351061
Compare
joncinque
left a comment
There was a problem hiding this comment.
Looks great overall! Almost entirely nits
| pub tpu_enable_udp: bool, | ||
| pub tpu_disable_quic: bool, |
There was a problem hiding this comment.
nit: I'm wondering if these should be totally omitted. They were useful during the transition from UDP to QUIC, but since the transition has been complete for awhile, we should never want to enable UDP and never want to disable QUIC
| impl std::fmt::Display for ValidatorConfig { | ||
| fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result { | ||
| let known_validators = match &self.known_validators { | ||
| Some(validators) => validators | ||
| .iter() | ||
| .map(|v| v.to_string()) | ||
| .collect::<Vec<_>>() | ||
| .join(", "), | ||
| None => "None".to_string(), | ||
| }; | ||
| write!( | ||
| f, | ||
| "Runtime Config\n\ | ||
| tpu_enable_udp: {}\n\ | ||
| tpu_disable_quic: {}\n\ | ||
| max_ledger_size: {:?}\n\ | ||
| skip_poh_verify: {}\n\ | ||
| no_snapshot_fetch: {}\n\ | ||
| require_tower: {}\n\ | ||
| enable_full_rpc: {}\n\ | ||
| known_validators: {:?}", | ||
| self.tpu_enable_udp, | ||
| self.tpu_disable_quic, | ||
| self.max_ledger_size, | ||
| self.skip_poh_verify, | ||
| self.no_snapshot_fetch, | ||
| self.require_tower, | ||
| self.enable_full_rpc, | ||
| known_validators, | ||
| ) | ||
| } | ||
| } |
There was a problem hiding this comment.
nit: is this needed? I'm wondering if it would be easier to just #[derive(Debug)] and let people use the debug output if needed
| .to_string(); | ||
|
|
||
| let mut validator_library = Library::default(); | ||
| let mut validator_library = ClusterImages::default(); |
There was a problem hiding this comment.
nit: maybe rename this variable to cluster_images?
| error!( | ||
| "The provided --limit-ledger-size value was too small, the minimum value is {DEFAULT_MIN_MAX_LEDGER_SHREDS}" | ||
| ); | ||
| return; |
There was a problem hiding this comment.
nit: I just found out about this, but a nice option if there's an issue on arg parsing that can't be covered by clap solana-labs/solana-program-library#6550 (comment)
Doesn't need to be done here, of course
| Arg::with_name("limit_ledger_size") | ||
| .long("limit-ledger-size") | ||
| .takes_value(true) | ||
| .help("Validator Config. The `--limit-ledger-size` parameter allows you to specify how many ledger | ||
| shreds your node retains on disk. If you do not | ||
| include this parameter, the validator will keep the entire ledger until it runs | ||
| out of disk space. The default value attempts to keep the ledger disk usage | ||
| under 500GB. More or less disk usage may be requested by adding an argument to | ||
| `--limit-ledger-size` if desired. Check `agave-validator --help` for the | ||
| default limit value used by `--limit-ledger-size`. More information about | ||
| selecting a custom limit value is at : https://github.com/solana-labs/solana/blob/583cec922b6107e0f85c7e14cb5e642bc7dfb340/core/src/ledger_cleanup_service.rs#L15-L26"), |
There was a problem hiding this comment.
nit: how about using a default value of DEFAULT_MAX_LEDGER_SHREDS.to_string()? that way you can just unwrap() fetching this later
| Arg::with_name("memory_requests") | ||
| .long("memory-requests") | ||
| .takes_value(true) | ||
| .default_value("70Gi") // 70 Gigabytes |
There was a problem hiding this comment.
micro-nit
| .default_value("70Gi") // 70 Gigabytes | |
| .default_value("70Gi") // 70 Gibibytes |
There was a problem hiding this comment.
this one always gets me
| } | ||
|
|
||
| #[allow(clippy::too_many_arguments)] | ||
| pub fn create_replica_set( |
There was a problem hiding this comment.
I have no idea if this works, so I'll assume it's all correct 😅
There was a problem hiding this comment.
haha it does. but i could change create_replica_set() to accept a struct instead of all the args.
| namespace: &str, | ||
| label_selector: &BTreeMap<String, String>, | ||
| image_name: &DockerImage, | ||
| environment_variables: Vec<EnvVar>, | ||
| command: &[String], |
There was a problem hiding this comment.
nit: since the refs here all end up getting cloned, you may as well take them by value instead of ref here, and leave the burden of copies on the caller
| requests: vec![ | ||
| ("cpu".to_string(), Quantity(cpu_requests)), | ||
| ("memory".to_string(), Quantity(memory_requests)), | ||
| ] | ||
| .into_iter() | ||
| .collect(), |
There was a problem hiding this comment.
nit: you should be able to use BTreeMap::from, ie
| requests: vec![ | |
| ("cpu".to_string(), Quantity(cpu_requests)), | |
| ("memory".to_string(), Quantity(memory_requests)), | |
| ] | |
| .into_iter() | |
| .collect(), | |
| requests: BTreeMap::from([ | |
| ("cpu".to_string(), Quantity(cpu_requests)), | |
| ("memory".to_string(), Quantity(memory_requests)), | |
| ]) |
| pub no_snapshot_fetch: bool, | ||
| pub require_tower: bool, | ||
| pub enable_full_rpc: bool, | ||
| pub known_validators: Option<Vec<Pubkey>>, |
There was a problem hiding this comment.
nit: The logic later of initializing it in add_known_validator is a bit confusing, could this just be a straight-up Vec?
There was a problem hiding this comment.
yes will fix. i think i got addicted to using Option early on. Will try to avoid when not needed lol
| let mut secrets = HashMap::new(); | ||
| let bootstrap_keypair = read_keypair_file(&identity_key_path) | ||
| .expect("Failed to read bootstrap validator keypair file"); | ||
| self.add_known_validator(bootstrap_keypair.pubkey()); |
There was a problem hiding this comment.
nit: just out of curiosity, do we add this one for the rest of validator scripts 🤔 also, it seems that this action is not quite related to the create_bootstrap_secret
There was a problem hiding this comment.
you are def right that this is in the wrong spot. not related to create_bootstrap_secret. will fix. I added the bootstrap to known_validators because for all other validators deployed, I will pass in --known-validators <bootstrap-identity> to the other validators' startup scripts
* address jon nits from #38 * chido comment: refactor known validator from: #10 * rewrite init-metrics.sh in rust * Create non-bootstrap, voting validator accounts * build and push validator docker image * create and deploy validator secret * add validator selectors * create validator replica sets. need shred_version * add in get shred version from genesis * deploy validator replica set * deploy validator service * refactor buildtype skip. will skip release channel pull/extract as well
Summary of Changes
Libraryis nowClusterImagesand its structure is slightly changed10th PR in a series of PRs that will build out the monogon testing framework for deploying validator clusters on Kubernetes