This repository was archived by the owner on Feb 23, 2026. It is now read-only.

Create bootstrap replicaset #10

Merged
gregcusack merged 8 commits into anza-xyz:main from gregcusack:create-bootstrap-replicaset
Apr 29, 2024

Conversation

@gregcusack
Contributor

@gregcusack gregcusack commented Apr 1, 2024

Summary of Changes

  1. Create bootstrap validator replicaset
  2. Update token-2022 to v1.0.0
  3. Address a few nits from PR #9 (add bootstrap kubernetes labels and validator management structs)
  4. Library is renamed to ClusterImages and its structure is slightly changed

10th PR in a series of PRs that will build out the monogon testing framework for deploying validator clusters on Kubernetes

@gregcusack gregcusack force-pushed the create-bootstrap-replicaset branch from 80c7496 to b9ef043 Compare April 23, 2024 16:55
@gregcusack gregcusack force-pushed the create-bootstrap-replicaset branch from b9ef043 to 5c3067d Compare April 23, 2024 16:59
@gregcusack gregcusack requested review from joncinque and yihau April 23, 2024 20:56
gregcusack added a commit to gregcusack/validator-lab that referenced this pull request Apr 25, 2024
@gregcusack gregcusack force-pushed the create-bootstrap-replicaset branch from 9f72e80 to 3351061 Compare April 25, 2024 23:54
@gregcusack gregcusack requested review from joncinque and yihau and removed request for joncinque and yihau April 25, 2024 23:57

@joncinque joncinque left a comment

Looks great overall! Almost entirely nits

Comment thread src/validator_config.rs Outdated
Comment on lines +4 to +5
pub tpu_enable_udp: bool,
pub tpu_disable_quic: bool,

nit: I'm wondering if these should be totally omitted. They were useful during the transition from UDP to QUIC, but since the transition has been complete for a while, we should never want to enable UDP and never want to disable QUIC

Comment thread src/validator_config.rs Outdated
Comment on lines +14 to +45
impl std::fmt::Display for ValidatorConfig {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        let known_validators = match &self.known_validators {
            Some(validators) => validators
                .iter()
                .map(|v| v.to_string())
                .collect::<Vec<_>>()
                .join(", "),
            None => "None".to_string(),
        };
        write!(
            f,
            "Runtime Config\n\
            tpu_enable_udp: {}\n\
            tpu_disable_quic: {}\n\
            max_ledger_size: {:?}\n\
            skip_poh_verify: {}\n\
            no_snapshot_fetch: {}\n\
            require_tower: {}\n\
            enable_full_rpc: {}\n\
            known_validators: {:?}",
            self.tpu_enable_udp,
            self.tpu_disable_quic,
            self.max_ledger_size,
            self.skip_poh_verify,
            self.no_snapshot_fetch,
            self.require_tower,
            self.enable_full_rpc,
            known_validators,
        )
    }
}

nit: is this needed? I'm wondering if it would be easier to just #[derive(Debug)] and let people use the debug output if needed
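
As a rough illustration of the suggestion above, the hand-written Display impl could be replaced with a derived Debug impl. This is only a sketch: the struct below is a trimmed, hypothetical stand-in for the PR's ValidatorConfig, String replaces Pubkey so that only the standard library is needed, and describe is an illustrative helper that does not appear in the PR.

```rust
// Hypothetical, trimmed-down stand-in for the PR's ValidatorConfig.
// `String` replaces `Pubkey` so the sketch compiles with std alone.
#[derive(Debug, Clone, PartialEq)]
pub struct ValidatorConfig {
    pub max_ledger_size: Option<u64>,
    pub skip_poh_verify: bool,
    pub no_snapshot_fetch: bool,
    pub require_tower: bool,
    pub enable_full_rpc: bool,
    pub known_validators: Vec<String>,
}

/// Illustrative helper (not in the PR): render the config for logs using
/// the derived `Debug` impl. `{:#?}` pretty-prints one field per line,
/// while `{:?}` would keep everything on a single line.
pub fn describe(config: &ValidatorConfig) -> String {
    format!("{:#?}", config)
}
```

The derived output stays in sync with the struct automatically, so adding or removing a field does not require touching a format string.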

Comment thread src/main.rs Outdated
.to_string();

- let mut validator_library = Library::default();
+ let mut validator_library = ClusterImages::default();

nit: maybe rename this variable to cluster_images?

Comment thread src/main.rs Outdated
Comment on lines +353 to +356
error!(
    "The provided --limit-ledger-size value was too small, the minimum value is {DEFAULT_MIN_MAX_LEDGER_SHREDS}"
);
return;

nit: I just found out about this, but there's a nice option if there's an issue on arg parsing that can't be covered by clap: solana-labs/solana-program-library#6550 (comment)

Doesn't need to be done here, of course

Comment thread src/main.rs
Comment on lines +179 to +189
Arg::with_name("limit_ledger_size")
.long("limit-ledger-size")
.takes_value(true)
.help("Validator Config. The `--limit-ledger-size` parameter allows you to specify how many ledger
shreds your node retains on disk. If you do not
include this parameter, the validator will keep the entire ledger until it runs
out of disk space. The default value attempts to keep the ledger disk usage
under 500GB. More or less disk usage may be requested by adding an argument to
`--limit-ledger-size` if desired. Check `agave-validator --help` for the
default limit value used by `--limit-ledger-size`. More information about
selecting a custom limit value is at : https://github.com/solana-labs/solana/blob/583cec922b6107e0f85c7e14cb5e642bc7dfb340/core/src/ledger_cleanup_service.rs#L15-L26"),

nit: how about using a default value of DEFAULT_MAX_LEDGER_SHREDS.to_string()? that way you can just unwrap() fetching this later

Comment thread src/main.rs Outdated
Arg::with_name("memory_requests")
    .long("memory-requests")
    .takes_value(true)
    .default_value("70Gi") // 70 Gigabytes

micro-nit

Suggested change
- .default_value("70Gi") // 70 Gigabytes
+ .default_value("70Gi") // 70 Gibibytes

Contributor Author

this one always gets me
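
For context on why the comment wording matters: Kubernetes resource quantities treat the Gi suffix as a binary prefix (2^30 bytes) and the G suffix as a decimal prefix (10^9 bytes). The sketch below works through the arithmetic; quantity_to_bytes is a hypothetical helper written for this illustration, not a Kubernetes or PR API, and it handles only these two suffixes.

```rust
/// Hypothetical helper: convert a quantity string with a "Gi" (binary)
/// or "G" (decimal) suffix into bytes. Any other suffix, a non-numeric
/// prefix, or an overflow yields `None`.
pub fn quantity_to_bytes(quantity: &str) -> Option<u64> {
    let (digits, bytes_per_unit) = if let Some(d) = quantity.strip_suffix("Gi") {
        (d, 1u64 << 30) // gibibyte: 1_073_741_824 bytes
    } else if let Some(d) = quantity.strip_suffix('G') {
        (d, 1_000_000_000u64) // gigabyte: 10^9 bytes
    } else {
        return None;
    };
    digits.parse::<u64>().ok()?.checked_mul(bytes_per_unit)
}
```

With this helper, "70Gi" comes out to 75,161,927,680 bytes and "70G" to 70,000,000,000 bytes, a gap of roughly 5.2 GB.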

Comment thread src/k8s_helpers.rs
}

#[allow(clippy::too_many_arguments)]
pub fn create_replica_set(

I have no idea if this works, so I'll assume it's all correct 😅

Contributor Author

haha it does. but i could change create_replica_set() to accept a struct instead of all the args.
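
One possible shape for the struct-based alternative mentioned above is sketched here. Everything in it is an assumption for illustration: ReplicaSetConfig and describe_replica_set are hypothetical names, the fields mirror only the five parameters visible in this review, and plain strings stand in for DockerImage and EnvVar so the example needs nothing beyond the standard library. It returns a summary string rather than building a real Kubernetes ReplicaSet.

```rust
use std::collections::BTreeMap;

/// Hypothetical parameter struct bundling the arguments seen in the review.
pub struct ReplicaSetConfig {
    pub namespace: String,
    pub label_selector: BTreeMap<String, String>,
    pub image_name: String,                           // stand-in for DockerImage
    pub environment_variables: Vec<(String, String)>, // stand-in for Vec<EnvVar>
    pub command: Vec<String>,
}

/// Hypothetical function taking one struct instead of many positional
/// arguments. It only summarizes the config; the real function in the PR
/// builds a ReplicaSet object.
pub fn describe_replica_set(config: &ReplicaSetConfig) -> String {
    format!(
        "{}/{}: {}",
        config.namespace,
        config.image_name,
        config.command.join(" ")
    )
}
```

Named fields make call sites self-documenting, and a signature like this would no longer need the clippy::too_many_arguments allowance.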

Comment thread src/k8s_helpers.rs Outdated
Comment on lines +62 to +66
namespace: &str,
label_selector: &BTreeMap<String, String>,
image_name: &DockerImage,
environment_variables: Vec<EnvVar>,
command: &[String],

nit: since the refs here all end up getting cloned, you may as well take them by value instead of ref here, and leave the burden of copies on the caller
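
The trade-off in this comment can be sketched as follows. PodSpec, build_borrowed, and build_owned are hypothetical names invented for the illustration, not items from the PR. The borrowed version clones on every call, while the owned version moves its arguments and lets the caller decide whether a copy is needed.

```rust
use std::collections::BTreeMap;

/// Hypothetical target type that must own its data.
#[derive(Debug, PartialEq)]
pub struct PodSpec {
    pub namespace: String,
    pub labels: BTreeMap<String, String>,
    pub command: Vec<String>,
}

/// Takes references, so it has to clone each argument unconditionally.
pub fn build_borrowed(
    namespace: &str,
    labels: &BTreeMap<String, String>,
    command: &[String],
) -> PodSpec {
    PodSpec {
        namespace: namespace.to_string(),
        labels: labels.clone(),
        command: command.to_vec(),
    }
}

/// Takes ownership, so the arguments are moved in without copying.
/// A caller that still needs its values can clone at the call site.
pub fn build_owned(
    namespace: String,
    labels: BTreeMap<String, String>,
    command: Vec<String>,
) -> PodSpec {
    PodSpec { namespace, labels, command }
}
```

A caller that is done with its values pays nothing extra with build_owned, whereas build_borrowed always allocates fresh copies.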

Comment thread src/kubernetes.rs Outdated
Comment on lines +35 to +40
requests: vec![
    ("cpu".to_string(), Quantity(cpu_requests)),
    ("memory".to_string(), Quantity(memory_requests)),
]
.into_iter()
.collect(),

nit: you should be able to use BTreeMap::from, ie

Suggested change
- requests: vec![
-     ("cpu".to_string(), Quantity(cpu_requests)),
-     ("memory".to_string(), Quantity(memory_requests)),
- ]
- .into_iter()
- .collect(),
+ requests: BTreeMap::from([
+     ("cpu".to_string(), Quantity(cpu_requests)),
+     ("memory".to_string(), Quantity(memory_requests)),
+ ])

https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#impl-From%3C%5B(K,+V);+N%5D%3E-for-BTreeMap%3CK,+V%3E
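
A self-contained version of the suggestion above might look like the following sketch. The Quantity newtype here is a local, hypothetical stand-in for the Kubernetes type used in the PR, and resource_requests is an illustrative function name. BTreeMap::from on a fixed-size array of pairs is part of the standard library.

```rust
use std::collections::BTreeMap;

/// Local stand-in for the Kubernetes `Quantity` type (assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct Quantity(pub String);

/// Build the resource-request map directly from an array of pairs,
/// avoiding the intermediate `vec![..].into_iter().collect()`.
pub fn resource_requests(
    cpu_requests: String,
    memory_requests: String,
) -> BTreeMap<String, Quantity> {
    BTreeMap::from([
        ("cpu".to_string(), Quantity(cpu_requests)),
        ("memory".to_string(), Quantity(memory_requests)),
    ])
}
```

Both forms produce the same map; the array form avoids allocating a temporary Vec and lets the element type drive inference.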

Comment thread src/validator_config.rs Outdated
pub no_snapshot_fetch: bool,
pub require_tower: bool,
pub enable_full_rpc: bool,
pub known_validators: Option<Vec<Pubkey>>,

nit: The logic later of initializing it in add_known_validator is a bit confusing, could this just be a straight-up Vec?

Contributor Author

yes will fix. i think i got addicted to using Option early on. Will try to avoid when not needed lol
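
The difference discussed in this thread can be sketched as below. Both structs are hypothetical and use String in place of Pubkey. The PR's actual add_known_validator body is not shown on this page, so the Option variant is only one plausible way the lazy initialization could look.

```rust
/// Variant with `Option<Vec<_>>`: adding an entry must first make sure
/// the inner vector exists. (Assumed shape; the PR's code is not shown.)
#[derive(Debug, Default)]
pub struct WithOption {
    pub known_validators: Option<Vec<String>>,
}

impl WithOption {
    pub fn add_known_validator(&mut self, pubkey: String) {
        // Lazily create the vector on first use, then push.
        self.known_validators
            .get_or_insert_with(Vec::new)
            .push(pubkey);
    }
}

/// Variant with a plain `Vec<_>`: an empty vector already means
/// "no known validators", so no initialization step is needed.
#[derive(Debug, Default)]
pub struct WithVec {
    pub known_validators: Vec<String>,
}

impl WithVec {
    pub fn add_known_validator(&mut self, pubkey: String) {
        self.known_validators.push(pubkey);
    }
}
```

With the plain Vec, readers also no longer have to distinguish None from Some(vec![]), two states that would otherwise mean the same thing here.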

Comment thread src/kubernetes.rs
let mut secrets = HashMap::new();
let bootstrap_keypair = read_keypair_file(&identity_key_path)
.expect("Failed to read bootstrap validator keypair file");
self.add_known_validator(bootstrap_keypair.pubkey());
Member

nit: just out of curiosity, do we add this one for the rest of validator scripts 🤔 also, it seems that this action is not quite related to the create_bootstrap_secret

Contributor Author

you are def right that this is in the wrong spot. not related to create_bootstrap_secret. will fix. I added the bootstrap to known_validators because for all other validators deployed, I will pass in --known-validators <bootstrap-identity> to the other validators' startup scripts

@gregcusack gregcusack merged commit 576851d into anza-xyz:main Apr 29, 2024
@gregcusack gregcusack deleted the create-bootstrap-replicaset branch April 29, 2024 17:10
gregcusack added a commit to gregcusack/validator-lab that referenced this pull request May 1, 2024
gregcusack added a commit that referenced this pull request May 6, 2024
* address jon nits from #38

* chido comment: refactor known validator from: #10

* rewrite init-metrics.sh in rust

* Create non-bootstrap, voting validator accounts

* build and push validator docker image

* create and deploy validator secret

* add validator selectors

* create validator replica sets. need shred_version

* add in get shred version from genesis

* deploy validator replica set

* deploy validator service

* refactor buildtype skip. will skip release channel pull/extract as well

3 participants