Merged
Changes from 5 commits
21 commits
d13a4ea
update instructions for deploying dev env
internet-diglett Jan 16, 2024
112765e
WIP: background task for synchronizing NAT information for service zones
internet-diglett Jan 16, 2024
085b0db
WIP: ensure service zone nat entries are tracked by RPW
internet-diglett Jan 19, 2024
402f513
wrap up nat TODOs
internet-diglett Jan 19, 2024
56ba5be
revert changes to inventory_collection.rs
internet-diglett Jan 19, 2024
c794b1a
add config for nat sync task
internet-diglett Jan 20, 2024
fe1e3e8
PR fixes
internet-diglett Jan 20, 2024
337bb23
Merge branch 'main' into rpw-for-service-zone-nat
internet-diglett Jan 22, 2024
929ba7c
update schema
internet-diglett Jan 22, 2024
d62b814
add additional documentation
internet-diglett Jan 22, 2024
9eb1111
create tracking issue for TODO
internet-diglett Jan 22, 2024
b92b0bd
set mandatory minimums for service zone nat entries
internet-diglett Jan 23, 2024
1c6e60e
adjust minimum ntp count so job will actually run
internet-diglett Jan 23, 2024
22f36d6
remove comment, issue is WIP
internet-diglett Jan 23, 2024
3ebf9df
adjust minimum count for all services to 1
internet-diglett Jan 23, 2024
5811317
bump schema version
internet-diglett Jan 24, 2024
28f61b0
Merge branch 'main' into rpw-for-service-zone-nat
internet-diglett Jan 24, 2024
d3501dc
BUGFIX: nat entries missing after sled restart
internet-diglett Jan 26, 2024
a193845
Merge branch 'main' into rpw-for-service-zone-nat
internet-diglett Jan 26, 2024
8c6a23e
Merge branch 'main' into rpw-for-service-zone-nat
internet-diglett Jan 26, 2024
6cd49f7
bump schema
internet-diglett Jan 26, 2024
3 changes: 0 additions & 3 deletions nexus/db-queries/src/db/datastore/ip_pool.rs
@@ -137,9 +137,6 @@ impl DataStore {
// join ip_pool to ip_pool_resource and filter

// used in both success and error outcomes
// TODO: Correctness
// we're not propagating the data used to perform the query, which
// makes troubleshooting a lookup failure a lot more time-consuming
let lookup_type = LookupType::ByCompositeId(
"Default pool for current silo".to_string(),
);
9 changes: 7 additions & 2 deletions nexus/db-queries/src/db/datastore/ipv4_nat_entry.rs
@@ -20,6 +20,9 @@ use omicron_common::api::external::ResourceType;
use omicron_common::api::external::Vni;

impl DataStore {
/// Currently used to ensure that a NAT entry exists for an Instance.
/// This SHOULD NOT be directly used to create service zone nat entries,
/// as they are updated via a background task.
pub async fn ensure_ipv4_nat_entry(
&self,
opctx: &OpContext,
@@ -76,10 +79,12 @@ impl DataStore {
Ok(())
}

/// Method for synchronizing service zone nat.
/// Method for synchronizing service zone nat, called by `ServiceZoneNatTracker`
/// background task.
/// Expects a complete set of service zone nat entries.
/// Soft-deletes db entries that are not present in `nat_entries` parameter.
/// Creates missing entries idempotently
/// Creates missing entries idempotently.
///
/// Returns the number of records added.
pub async fn ipv4_nat_sync_service_zones(
&self,
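To make the contract concrete, here is a minimal sketch of how the `ServiceZoneNatTracker` background task might invoke this method, mirroring the call site later in this diff; the surrounding `datastore`, `opctx`, `log`, and collected `ipv4_nat_values` are assumed from the task's activation context:

// Sketch (assumed context): `ipv4_nat_values` must be the complete set of
// service zone NAT entries; any db entry absent from it is soft-deleted.
match self
    .datastore
    .ipv4_nat_sync_service_zones(opctx, &ipv4_nat_values)
    .await
{
    // On success the method returns how many records were newly added;
    // re-running with the same input is idempotent and adds zero.
    Ok(records_added) => {
        info!(&log, "service zone nat records synchronized";
            "records_added" => records_added);
    }
    Err(e) => {
        error!(&log, "failed to synchronize service zone nat records";
            "error" => ?e);
    }
}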
51 changes: 51 additions & 0 deletions nexus/src/app/background/sync_service_zone_nat.rs
@@ -21,6 +21,18 @@ use std::net::{IpAddr, SocketAddr};
use std::num::NonZeroU32;
use std::sync::Arc;

// Minimum number of boundary NTP zones that should be present in a valid
// set of service zone nat configurations.
const MIN_NTP_COUNT: usize = 1;

// Minimum number of nexus zones that should be present in a valid
// set of service zone nat configurations.
const MIN_NEXUS_COUNT: usize = 3;
Collaborator

I'm not sure we should do this -- it's possible that one of the three Nexus instances has gone away, and we'd still want our NAT entries to be up-to-date.

This is a reasonable goal to try to achieve in the "graceful shutdown" case -- if we want three Nexus instances to run, we should provision a 4th one before removing one of the original 3 -- but if a sled is yanked from the rack, we do have two Nexus instances. That's just a truth! We should aspire to have the blueprint creator create a new Nexus service, provision it, and get the NAT entries populated, but it's very possible to run under our redundancy expectations. That's why we use redundancy!

Contributor Author

Would setting all of the minimums to 1 be a reasonable compromise? If we don't have at least 1 Nexus, NTP, and ExternalDns zone that can be found in inventory, I would think we have more serious problems, no?

Collaborator

Chatting with some folks in the update sync today, it sounds like the inventory system -- which gathers the inventory in a "collection", which contains a lot of objects -- may, at any time, simply "not report a sled" within that collection. Theoretically, that means we could see an "inventory collection" that doesn't contain any sleds which contain Nexus, NTP, external DNS, etc.

This has some weird implications for depending on the inventory system as the source-of-truth here, but without a full implementation of blueprints, I acknowledge that there isn't a great alternative yet.

So: "Could we set all the minimums to 1?" That would stop us from propagating the state of the inventory system if we saw a blip that eliminated all of these critical zones. So in that sense, it's arguably better than not doing the check! I also think it avoids "breaking" NAT when we're under-provisioned, as I mentioned in the case below.

If we're on the same page that, eventually, the right source-of-truth is "info from the blueprint, somehow", I think this is a reasonable intermediate step.

Contributor Author

Yeah, it's my understanding that blueprints are the future, inventory is what we're using for now.

Collaborator

(also, I appreciate you tolerating all this churn. Getting RPWs + NAT propagation right is tricky, and having this portion of the system be "not-totally-ready" does make this extra hairy. Thank you for pushing through regardless)

Contributor Author

No worries! I know this is critical so I appreciate the extra eyes and feedback!
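Per the compromise above and commit 3ebf9df ("adjust minimum count for all services to 1"), the nexus minimum shown at the top of this file is presumably relaxed in the later commits, leaving all three constants at 1 -- a sketch of the adjusted values:

// After commit 3ebf9df: a single surviving zone of each type is enough for
// the sync task to run, while an inventory blip that reports zero critical
// zones still aborts NAT propagation for that activation.
const MIN_NTP_COUNT: usize = 1;
const MIN_NEXUS_COUNT: usize = 1;
const MIN_EXTERNAL_DNS_COUNT: usize = 1;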


// Minimum number of external DNS zones that should be present in a valid
// set of service zone nat configurations.
const MIN_EXTERNAL_DNS_COUNT: usize = 1;

/// Background task that ensures service zones have nat entries
/// persisted in the NAT RPW table
pub struct ServiceZoneNatTracker {
@@ -93,6 +105,9 @@ impl BackgroundTask for ServiceZoneNatTracker {
};

let mut ipv4_nat_values: Vec<Ipv4NatValues> = vec![];
let mut ntp_count = 0;
let mut nexus_count = 0;
let mut dns_count = 0;

for (sled_id, zones_found) in collection.omicron_zones {
let (_, sled) = match LookupPath::new(opctx, &self.datastore)
@@ -121,6 +136,7 @@ impl BackgroundTask for ServiceZoneNatTracker {
zones_found.zones;
let zones: Vec<sled_agent_client::types::OmicronZoneConfig> =
zones_config.zones;

for zone in zones {
let zone_type: OmicronZoneType = zone.zone_type;
match zone_type {
@@ -157,6 +173,7 @@

// Append ipv4 nat entry
ipv4_nat_values.push(nat_value);
ntp_count += 1;
}
OmicronZoneType::Nexus { nic, external_ip, .. } => {
let external_ip = match external_ip {
@@ -189,6 +206,7 @@

// Append ipv4 nat entry
ipv4_nat_values.push(nat_value);
nexus_count += 1;
},
OmicronZoneType::ExternalDns { nic, dns_address, .. } => {
let socket_addr: SocketAddr = match dns_address.parse() {
@@ -235,6 +253,7 @@

// Append ipv4 nat entry
ipv4_nat_values.push(nat_value);
dns_count += 1;
},
// we explicitly list all cases instead of using a wildcard,
// that way if someone adds a new type to OmicronZoneType that
@@ -265,6 +284,38 @@
});
}

if dns_count < MIN_EXTERNAL_DNS_COUNT {
error!(
&log,
"generated config for fewer than the minimum allowed number of dns zones";
);
return json!({
"error": "generated config for fewer than the minimum allowed number of dns zones"
});
}

if ntp_count < MIN_NTP_COUNT {
error!(
&log,
"generated config for fewer than the minimum allowed number of ntp zones";
);
return json!({
"error": "generated config for fewer than the minimum allowed number of ntp zones"
});
}

if nexus_count < MIN_NEXUS_COUNT {
Collaborator

Example where this could go awry:

  • Sled pulled (or crashes, or catches fire): we mark it as "Removed"
  • We identify that the services on the sled are not present
  • Inventory reports that two Nexus instances are running
  • 2 < 3
  • So we fail the RPW here, and never update any NAT entries until redundancy is fully restored?

Contributor Author

Yes, that is correct. Will we not attempt to move the service zone to a new sled in this situation?

Collaborator

As soon as we implement service re-provisioning, this would happen, but I think the scope of "new service provisions" is going to start with only Crucible and NTP.

Until that is fully implemented, this check would just stop NAT propagation.

error!(
&log,
"generated config for fewer than the minimum allowed number of nexus zones";
);
return json!({
"error": "generated config for fewer than the minimum allowed number of nexus zones"
});
}

// reconcile service zone nat entries
let result = match self.datastore.ipv4_nat_sync_service_zones(opctx, &ipv4_nat_values).await {
Collaborator

We discussed this in chat, but this whole flow has some implications for our calls to ensure_nat_entry / ensure_ipv4_nat_entry:

  • If you're writing code in Nexus to make an instance with a NAT entry, you must call these functions explicitly. No one else will add your NAT entry to the DB!
  • If you're writing code in Nexus to make a service with a NAT entry, you should not call these functions, and should rely on the "Nexus -> Sled Agent -> Inventory -> Service Zone NAT RPW" pathway to ensure that these entries get populated. If you tried to add the entry to the DB, you'd risk a race condition between "collection via inventory / RPW" and "explicitly inserting the record".

I think that's okay, but we should add documentation around the ensure_nat_entry function to make that distinction extremely clear! I think it's kinda subtle that the same table is treated quite differently depending on the purpose of the NAT entry.
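A minimal sketch of the two pathways described above (hypothetical call sites; only `ensure_ipv4_nat_entry` and `ipv4_nat_sync_service_zones` are names from this diff, and argument shapes are approximated):

// Instance NAT: Nexus inserts the entry itself, e.g. while setting up an
// instance's external connectivity (hypothetical call site and argument).
datastore.ensure_ipv4_nat_entry(opctx, instance_nat_entry).await?;

// Service zone NAT: Nexus must NOT insert directly. Entries are derived
// from inventory by the ServiceZoneNatTracker RPW
// (Nexus -> Sled Agent -> Inventory -> RPW), which reconciles the full set:
datastore
    .ipv4_nat_sync_service_zones(opctx, &entries_from_inventory)
    .await?;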

Collaborator @smklein, Jan 22, 2024

Also, for posterity, I created this drawing to map out the data flow here.

My "TLDR" of the above is that I wanted to ensure we avoided having loops in this data flow graph:

[image: data-flow diagram of NAT entry propagation]

(Source: https://docs.google.com/drawings/d/19MkoKsgZ8vuPng6uKaCF1hG2hiHI9735jv9ThLgajVM/edit?usp=sharing )

Contributor Author

I think this is great feedback, adding some documentation to it now. I think one day we may get to a place where we can lift the NAT logic out of the service zone creation functionality in sled-agent, which could help us move towards a more consistent pattern of interacting with this table.

Ok(num) => num,