fix: NAT traversal hardening and 1000-node scale fixes #70
Merged
jacderida merged 8 commits into rc-2026.4.1 on Apr 6, 2026
Conversation
mickvandijke added a commit that referenced this pull request on Apr 6, 2026
`send_dht_request` computes its identity-exchange timeout as `min(request_timeout, IDENTITY_EXCHANGE_TIMEOUT)`. With the constant at 5s the effective budget was always 5s — too tight for cross-region or congested cellular links where the two-RTT exchange plus ML-DSA-65 verification can blow past with retransmits. This is the same root cause that motivated bumping `BOOTSTRAP_IDENTITY_TIMEOUT_SECS` to 15s in 08febcc; that fix only covered the initial bootstrap dial, leaving every subsequent peer dial exposed to the original failure mode. Bumping this constant to 15s matches the bootstrap budget so both code paths now absorb the same class of slow-link delay. Resolves the Greptile P1 finding on PR #70. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
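The budget computation described above can be sketched as follows. This is a minimal illustration of the `min(request_timeout, IDENTITY_EXCHANGE_TIMEOUT)` rule, not the crate's actual code; the function name `identity_exchange_budget` is illustrative.

```rust
use std::time::Duration;

// Assumed value of the constant after the bump described above.
const IDENTITY_EXCHANGE_TIMEOUT: Duration = Duration::from_secs(15);

/// Effective identity-exchange budget: never more than the caller's
/// request timeout, but capped by the constant. With the old 5 s cap,
/// any request_timeout >= 5 s collapsed to a flat 5 s budget.
fn identity_exchange_budget(request_timeout: Duration) -> Duration {
    request_timeout.min(IDENTITY_EXCHANGE_TIMEOUT)
}

fn main() {
    // A 25 s request timeout now leaves the full 15 s for the exchange.
    assert_eq!(
        identity_exchange_budget(Duration::from_secs(25)),
        Duration::from_secs(15)
    );
    // Shorter request timeouts still win the min().
    assert_eq!(
        identity_exchange_budget(Duration::from_secs(3)),
        Duration::from_secs(3)
    );
}
```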
mickvandijke added a commit that referenced this pull request on Apr 6, 2026
The earlier wildcard substitution in `local_dht_node()` opened a UDP
socket and used the connect-then-getsockname trick to discover the
host's primary outbound interface IP, then published that as a fallback
when the listen address was a wildcard. This had three problems:
1. Blocking I/O inside an async fn: `std::net::UdpSocket::{bind,
connect,local_addr}` are sync syscalls held across no `.await`, and
`local_dht_node()` is on the hot DHT-lookup path. Wrong colour for
tokio worker threads.
2. Wrong answer for home-NAT deployments: a node behind a residential
NAT publishes an RFC1918 address (192.168.x.x) that internet peers
cannot route to. The new `dial_addresses()` helper then burned
connect attempts on guaranteed-failed dials.
3. Unnecessary: every node already has an authoritative source for its
post-NAT reflexive address — the OBSERVED_ADDRESS frames sent by
peers, plumbed through `TransportHandle::observed_external_address()`.
The previous fix made it the *first* source but kept the substitution
as a fallback "just in case." It is not needed.
This commit removes `primary_local_ip()`, `PRIMARY_IP_PROBE_V{4,6}`,
and the `UdpSocket` import. `local_dht_node()` now sources addresses
from exactly two places:
- `transport.observed_external_address()` — the OBSERVED_ADDRESS
reflexive address.
- `transport.listen_addrs()`, but only entries with a *specific* IP
(loopback or any other non-wildcard); wildcards and zero ports are
skipped.
If neither source produces an address, the published self-entry has
empty `addresses`. This is the right answer at the publish layer:
better to admit "I don't know how to be reached yet" than to lie with
a bind-side wildcard or a guessed LAN IP. The empty window closes
naturally once the first peer connects to us and OBSERVED_ADDRESS flows.
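The listen-address filtering rule above reduces to a small predicate: keep only entries with a specific IP and a real port. A sketch, with an illustrative `publishable` helper rather than the crate's actual API:

```rust
use std::net::SocketAddr;

/// Keep only listen addresses a remote peer could actually dial.
/// Wildcard IPs (0.0.0.0 / ::) and zero ports are bind-side artifacts,
/// not reachable endpoints, so they are skipped. Loopback is kept:
/// it is a specific IP and dialable in local mode.
fn publishable(addrs: &[SocketAddr]) -> Vec<SocketAddr> {
    addrs
        .iter()
        .copied()
        .filter(|a| !a.ip().is_unspecified() && a.port() != 0)
        .collect()
}

fn main() {
    let addrs: Vec<SocketAddr> = vec![
        "0.0.0.0:0".parse().unwrap(),       // wildcard IP + port 0: skipped
        "0.0.0.0:9000".parse().unwrap(),    // wildcard IP: skipped
        "127.0.0.1:55886".parse().unwrap(), // specific loopback: kept
    ];
    assert_eq!(
        publishable(&addrs),
        vec!["127.0.0.1:55886".parse::<SocketAddr>().unwrap()]
    );
}
```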
### Tests
`tests/dht_self_advertisement.rs` rewritten as five tests across two
paths:
- **Loopback bind path** (single-node, `local: true`):
- `local_mode_publishes_dialable_loopback_address`
- `local_mode_published_self_entry_matches_runtime_listen_addrs`
- `local_mode_never_publishes_port_zero`
- **Wildcard / OBSERVED_ADDRESS path** (`local: false`):
- `wildcard_bind_with_no_peers_publishes_empty_self_entry` — pins the
new "don't lie when you don't know" contract.
- `wildcard_bind_publishes_observed_address_after_peer_connection` —
primary regression test for the OBSERVED_ADDRESS flow. Two
wildcard-bound nodes connect over loopback and complete the identity exchange,
and the test polls `transport.observed_external_address()` until
the frame arrives, then asserts the observed value appears in the
published self-entry.
Resolves the Greptile P2 finding on PR #70.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mickvandijke added a commit that referenced this pull request on Apr 6, 2026
Bundles seven review-identified fixes (H2, H3, H4, M1, M2, M3, M5) on
the NAT-traversal hardening and 1000-node scale work. Each addresses a
specific bug or correctness gap surfaced during deep review.
H2 — dispatcher fault isolation (src/transport_handle.rs):
A single shard-consumer panic caused the inbound dispatcher loop to
break, dropping the remaining seven shard senders and cascading down
all healthy shards. Switched to `try_send` with explicit handling of
both `Full` and `Closed` errors: drops are counted and warn-logged at
intervals, and the dispatcher only exits when its upstream channel
itself closes. The eight shards now provide real fault isolation.
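The `Full`/`Closed` handling shape can be sketched with std's `sync_channel` (the real code uses a tokio mpsc, but the error variants and the drop-counting pattern are analogous; `dispatch` is an illustrative name):

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Non-blocking shard handoff: a full or dead shard never stalls the
/// dispatcher or tears down its sibling shards. Drops are counted so
/// they can be warn-logged at intervals instead of panicking the loop.
fn dispatch(msg: u32, shard: &SyncSender<u32>, drops: &mut u64) {
    match shard.try_send(msg) {
        Ok(()) => {}
        // Backpressure: count the drop and keep dispatching.
        Err(TrySendError::Full(_)) => *drops += 1,
        // Dead consumer: the fault stays isolated to this shard.
        Err(TrySendError::Disconnected(_)) => *drops += 1,
    }
}

fn main() {
    // Capacity-1 channel with a live but idle receiver.
    let (tx, _rx) = sync_channel::<u32>(1);
    let mut drops = 0;
    dispatch(1, &tx, &mut drops); // fits in the buffer
    dispatch(2, &tx, &mut drops); // buffer full: counted, not blocked
    assert_eq!(drops, 1);
}
```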
H3 — bridge task split + bounded forwarder channels
(src/network.rs, src/transport/saorsa_transport_adapter.rs):
The DHT bridge ran `find_closest_nodes_network` inline inside the
select loop on the relay branch, blocking the peer-address-update
branch for the duration of the iterative lookup; the underlying
forwarder mpsc channels were `unbounded_channel` so the queue could
grow without limit during the stall. Detached the lookup + publish
into its own spawned task per relay event so the select loop keeps
polling, and replaced both forwarder channels with bounded
`channel(ADDRESS_EVENT_CHANNEL_CAPACITY)` plus a per-event drop
counter.
H4 — atomic touch fast path respects addr_type
(src/dht/core_engine.rs):
The read-lock fast path in `touch_last_seen_if_merge_noop` only
checked whether the address was already present, ignoring the
caller's `addr_type`. The relay-promotion call site in the DHT bridge
(`touch_node_typed(peer, addr, AddressType::Relay)`) on an address
previously learned as `Direct` would silently bump `last_seen` and
drop the relay re-classification, breaking the type-priority ordering
that the dialer relies on. Plumbed `addr_type` through
`try_touch_last_seen` and `touch_last_seen_if_merge_noop`; the fast
path now refuses to short-circuit when the existing classification
differs, forcing escalation to the slow path so
`merge_typed_address` can re-order.
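The fast-path gate reduces to: short-circuit only when the address is already present with the same classification. A sketch of that check under assumed names (the enum and `fast_path_ok` are illustrative, not the crate's exact types):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum AddressType {
    Direct,
    Relay,
}

/// The read-lock fast path may bump last_seen and return early only if
/// the existing classification matches the caller's addr_type. Any
/// mismatch (e.g. Direct -> Relay promotion) must escalate to the
/// write-lock slow path so merge_typed_address can re-order.
fn fast_path_ok(existing: Option<AddressType>, incoming: AddressType) -> bool {
    matches!(existing, Some(t) if t == incoming)
}

fn main() {
    // Same classification: safe to short-circuit.
    assert!(fast_path_ok(Some(AddressType::Direct), AddressType::Direct));
    // Relay promotion of a Direct-learned address: must not short-circuit.
    assert!(!fast_path_ok(Some(AddressType::Direct), AddressType::Relay));
    // Unknown address: a real merge is needed.
    assert!(!fast_path_ok(None, AddressType::Relay));
}
```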
M1 — observed-address cache keyed by local bind
(src/transport/observed_address_cache.rs,
src/transport/saorsa_transport_adapter.rs,
src/transport_handle.rs,
src/dht_network_manager.rs):
On a multi-homed host, observations from peers reaching the node via
different local interfaces were mixed into one keyspace, so an
address learned on interface A could be served as the self-entry
when only interface B was usable. Cache key changed from `SocketAddr`
to `(local_bind, observed)`. Added `most_frequent_recent_per_local_bind`
plural API. The forwarder captures per-stack `local_bind` before
spawning. New `TransportHandle::observed_external_addresses()` plural
accessor combines all live + cached per-bind observations;
`local_dht_node()` now publishes all of them so peers reaching the
host on any interface can dial back. Adds four new tests covering
multi-bind partitioning.
M2 — try_send for shard handoff (src/transport_handle.rs):
The dispatcher's `send().await` to a per-shard mpsc would block on
the upstream from saorsa-transport when one shard's RwLock contention
spiked, starving the other seven shards through the same single point
of serialisation the sharding was supposed to eliminate. The H2 fix's
non-blocking `try_send` also addresses this — recorded as M2 for
explicitness.
M3 — shard ordering docstring (src/transport_handle.rs):
`shard_index_for_addr` claimed to preserve per-peer ordering across
reconnects. True only across ephemeral-port reconnects from the same
IP. NAT rebinding to a new public IP, mobile Wi-Fi↔cellular roaming,
or dual-stack failover routes the peer to a different shard, breaking
causal ordering at the application layer. Docstring corrected to
"per source IP" with the caveat called out.
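The corrected contract is easy to see in a sketch of the shard selection itself, hashing only the IP half of the address (the hasher and constant here are illustrative; the crate's implementation may differ):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::SocketAddr;

const NUM_SHARDS: usize = 8;

/// Hash only the source IP, so a reconnect from the same IP on a new
/// ephemeral port lands on the same shard. Ordering is therefore per
/// source IP only: a NAT rebind or Wi-Fi<->cellular roam changes the IP
/// and routes the peer to a different shard.
fn shard_index_for_addr(addr: &SocketAddr) -> usize {
    let mut h = DefaultHasher::new();
    addr.ip().hash(&mut h);
    (h.finish() as usize) % NUM_SHARDS
}

fn main() {
    let a: SocketAddr = "203.0.113.7:50001".parse().unwrap();
    let b: SocketAddr = "203.0.113.7:60002".parse().unwrap(); // same IP, new port
    // Same source IP -> same shard, regardless of the ephemeral port.
    assert_eq!(shard_index_for_addr(&a), shard_index_for_addr(&b));
}
```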
M5 — hold lock across register_connection_peer_id
(src/transport_handle.rs):
The shard consumer dropped the `peer_to_channel` write lock before
calling `dual_node.register_connection_peer_id`, leaving a window in
which the app-level map said the peer was known but the transport's
internal addr→peer map did not — a hole-punch lookup during that
window could fail spuriously. Lock now held across the registration
call.
Verification:
- cargo build --lib: clean
- cargo fmt --check: clean
- cargo clippy --lib -- -D warnings -D clippy::unwrap_used \
-D clippy::expect_used: clean
- cargo clippy --all-targets -- -D warnings: clean
- cargo test --lib: 282 passed, 0 failed
- cargo test --test dht_self_advertisement: 6/6 passed
- cargo test --test two_node_messaging: 7/7 passed
- cargo test --test node_lifecycle: 17/17 passed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bundles six NAT-traversal fixes uncovered during a deep audit of the
sporadic-failure pattern reported by saorsa-node-lmdb.
Root cause:
DhtNetworkManager::local_dht_node() built its self-entry from the
static NodeConfig::listen_addrs() derivation, which returns wildcard
IPs (0.0.0.0 / ::) and the *configured* port (0 for ephemeral binds).
Receivers' dialable_addresses() filter then rejected the entry, so
DHT-based peer discovery returned zero contactable addresses for any
node. Connections only succeeded via side channels (static bootstrap,
in-memory cache, relay fallback) — exactly the sporadic pattern.
Rewritten as `async fn` that sources addresses, in priority order,
from TransportHandle::observed_external_address() (the OBSERVED_ADDRESS
reflexive address, which was defined on the adapter and never called
outside its own file) and the runtime listen_addrs RwLock, with
unspecified IPs substituted via a new primary_local_ip() helper that
uses the standard UdpSocket::connect-then-getsockname trick. No new
dependencies.
Other fixes bundled:
- bootstrap_from_peers now records the bootstrap peer as the natural
hole-punch coordinator referrer for every node it returned, threading
through the existing dial_candidate -> set_hole_punch_preferred_coordinator
plumbing that the iterative-lookup path already used.
- Relay-established and peer-address-update events drained via a
tokio::select! over new TransportHandle::recv_relay_established() /
recv_peer_address_update() instead of the previous 1-second polling
loop, eliminating the race window where outbound DHT queries
advertised no relay address for up to a second after relay setup.
- BOOTSTRAP_IDENTITY_TIMEOUT_SECS bumped 5 -> 15 with explanatory
comment. Identity exchange (two RTTs + ML-DSA-65 signature, ~3.3kB)
blew through 5s sporadically on congested cellular and cross-region
links.
- Added dial_addresses() helper that loops through every dialable
address until one succeeds. Replaced 5 single-address dial sites
(bootstrap_from_peers, spawn_bucket_refresh_task, trigger_self_lookup,
find_closest_nodes_network parallel-query block, send_dht_request
fallback) so a stale NAT binding on entry #1 no longer kills the
dial when entry #2 would have worked.
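The helper's shape is a loop-until-success over the candidate list. A synchronous sketch with a pluggable dial closure (the real helper is async and its error handling may differ; the signature here is illustrative):

```rust
/// Try every dialable address in order until one connects, so a stale
/// NAT binding on entry #1 no longer kills the dial when entry #2 works.
/// Returns the address that succeeded, or every error if all failed.
fn dial_addresses<A: Copy, E>(
    addrs: &[A],
    mut dial: impl FnMut(A) -> Result<(), E>,
) -> Result<A, Vec<E>> {
    let mut errs = Vec::new();
    for &a in addrs {
        match dial(a) {
            Ok(()) => return Ok(a),
            Err(e) => errs.push(e), // keep the failure, try the next entry
        }
    }
    Err(errs) // every address failed: surface all errors to the caller
}

fn main() {
    let addrs = ["10.0.0.1:9000", "198.51.100.4:9000"];
    // Simulate a stale NAT binding on the first entry.
    let r = dial_addresses(&addrs, |a| {
        if a == addrs[0] { Err("timeout") } else { Ok(()) }
    });
    assert_eq!(r, Ok("198.51.100.4:9000"));
}
```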
Regression tests:
tests/dht_self_advertisement.rs adds three #[tokio::test] cases that
pin the published self-entry behaviour. All three FAILED on main with
concrete evidence (e.g. `[Quic(0.0.0.0:0)]` published as the only
address; transport bound to port 55886 but DHT advertised port 0).
All PASS after this commit. They use only public APIs
(DhtNetworkManager::find_closest_nodes_local_with_self) so no new
test surface area was added to the library.
Verified:
cargo test --lib 269/269
cargo test --test dht_self_advertisement 3/3
cargo test --test two_node_messaging 7/7
cargo test --test node_lifecycle 17/17
cargo clippy --lib -- -D warnings clean
The dead hole-punch chain in saorsa-transport's nat_traversal_api.rs
(Bug 4 from the audit) is fixed in a separate commit in that crate.
The --port 0 default in ant-node (Bug 5) is left as a deliberate
decision for the operator config layer.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both fields of `IPDiversityConfig` are explicitly set, so the trailing `..IPDiversityConfig::default()` triggers `clippy::needless_update` under `cargo clippy --all-targets`. Test code only; no functional change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ctions
saorsa-transport exposes the externally-observed address (from QUIC
OBSERVED_ADDRESS frames) only via *active* connections. When every
connection drops, the live read returns None and the node has no way to
tell the DHT how to be reached. The result is a temporary "invisible
window" where the node vanishes from DHT queries between a connection
drop and the next reconnection.
This commit adds an in-memory cache that records every
`P2pEvent::ExternalAddressDiscovered` event the transport emits and
serves as a frequency- and recency-aware fallback when no live
connection has an observation.
### Design
`TransportHandle::observed_external_address()` now has two sources:
1. Live: ask `dual_node.get_observed_external_address()` first. Unchanged;
this is the freshest answer while at least one connection is up.
2. Fallback: consult the new `ObservedAddressCache` when the live
source is empty. The cache returns the most-frequently-observed
address among recent entries, breaking ties by recency.
Different peers can legitimately observe a node at different addresses
(symmetric NAT, multi-homed hosts, dual-stack divergence). The cache
tracks how many distinct `(peer, address)` events have been received
and picks "the address most peers agree on" — saorsa-transport dedups
events per (peer, address) pair, so the count is effectively a peer
consensus signal, not raw event volume.
### NAT-rebinding handling
Pure frequency would let a long-lived stale address (count: 10000,
last seen 24h ago) win over a fresh new address (count: 5, last seen
now). That is the wrong answer after a NAT mapping rebinds.
Selection is therefore split into two passes:
1. Among entries seen within `OBSERVATION_RECENCY_WINDOW` (10 min),
return the highest-count one.
2. If nothing is recent, fall back to the global highest-count entry.
Eviction is recency-based: when the cache is full
(`MAX_CACHED_OBSERVATIONS = 16`), the entry with the oldest `last_seen`
is removed. Stale high-count entries naturally get pushed out as fresh
observations arrive at new addresses.
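The two-pass policy can be sketched directly from the description above (the `Entry` struct and `select` function are illustrative stand-ins for the cache internals):

```rust
use std::net::SocketAddr;
use std::time::{Duration, Instant};

const OBSERVATION_RECENCY_WINDOW: Duration = Duration::from_secs(600);

struct Entry {
    addr: SocketAddr,
    count: u64,
    last_seen: Instant,
}

/// Pass 1: highest count among entries inside the recency window.
/// Pass 2: only if nothing is recent, the global highest count.
/// A fresh post-rebind address (count 5, seen now) therefore beats a
/// stale one (count 10000, seen hours ago).
fn select(entries: &[Entry], now: Instant) -> Option<SocketAddr> {
    entries
        .iter()
        .filter(|e| now.duration_since(e.last_seen) <= OBSERVATION_RECENCY_WINDOW)
        .max_by_key(|e| e.count)
        .or_else(|| entries.iter().max_by_key(|e| e.count))
        .map(|e| e.addr)
}

fn main() {
    let base = Instant::now();
    let now = base + Duration::from_secs(24 * 3600); // evaluate "a day later"
    let stale = Entry {
        addr: "198.51.100.1:9000".parse().unwrap(),
        count: 10_000,
        last_seen: base, // 24 h old: outside the recency window
    };
    let fresh = Entry {
        addr: "203.0.113.9:9000".parse().unwrap(),
        count: 5,
        last_seen: now,
    };
    // NAT rebind: the fresh address wins despite the stale entry's count.
    assert_eq!(select(&[stale, fresh], now), "203.0.113.9:9000".parse().ok());
}
```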
### Wiring
`DualStackNode::spawn_peer_address_update_forwarder` now takes an
`Arc<parking_lot::Mutex<ObservedAddressCache>>` and handles a third
event variant, `P2pEvent::ExternalAddressDiscovered`, recording each
event into the cache. The other two branches (PeerAddressUpdated,
RelayEstablished) are unchanged. The silent `Ok(_other)` arm is now
explicit with a comment listing the event variants we intentionally
skip.
`TransportHandle::new` creates the cache and passes a clone to the
forwarder; `new_for_tests` creates an empty cache so test handles have
the same semantics. Both constructors are in lockstep.
### Persistence
The cache is in-memory only. A process restart resets it. A freshly
started node re-discovers its current address from live connections
rather than trusting potentially-stale state from a previous run.
### Tests
- 9 unit tests on `ObservedAddressCache` covering: empty, single,
repeated, frequency, recency tiebreak, NAT-rebinding fallback,
long-quiet fallback, eviction, and re-observation-on-full.
- 1 integration test `observed_address_cache_serves_fallback_after_connection_drop`
that connects two wildcard-bound nodes over loopback, waits for
`ExternalAddressDiscovered` to reach the cache via the new
`cached_observed_external_address` polling accessor, stops node_b to
drop the live connection, waits for `connected_peers()` to drain,
then asserts that both `observed_external_address()` and the DHT
self-entry still return the cached value.
Total: 278 lib tests (269 + 9), 6 dht_self_advertisement tests (5 + 1).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
At 1000 nodes the routing table touch was called on every inbound DHT message and always took a write lock, serialising every read behind each update. Wrap `NodeInfo.last_seen` in an `AtomicInstant` and add a fast path that atomically bumps the timestamp under a read lock when the address merge would be a no-op (address absent, already present, or skipped by the loopback-injection rule). The write-lock slow path is still used when a real address merge is needed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
At 1000 peers the single consumer task draining every inbound message became the dominant serialisation point — each message took three async write locks plus an awaited peer-id registration before the next could even be looked at, causing response delivery to miss the 25 s caller timeout. Split the receive loop into a light dispatcher that hashes the source IP and hands off to one of 8 parallel shard consumers, each running the full parse/register/broadcast pipeline independently. Hashing by `IpAddr` (not full `SocketAddr`) keeps a peer pinned to the same shard across ephemeral-port reconnects so per-peer message ordering is preserved. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from fb02b49 to 3b7131d
mickvandijke added a commit that referenced this pull request on Apr 6, 2026
At 1000 nodes two hot paths dominated serialisation in the stack, both
showing up as response-delivery timeouts under load. This commit combines
those two fixes and the deep-review follow-ups that hardened them against
cascade failure and regression.

## perf(dht): atomic last_seen under a read lock

The routing-table touch was called on every inbound DHT message and
always took a write lock, serialising every read behind each update.
Wrap `NodeInfo.last_seen` in an `AtomicInstant` and add a fast path that
atomically bumps the timestamp under a read lock when the address merge
would be a no-op (address absent, already present with the same type, or
skipped by the loopback-injection rule). The write-lock slow path still
runs when a real address merge or re-classification is needed. (a9a126f)

### Fast path respects addr_type (H4, from deep review)

Initially the read-lock fast path only checked whether the address was
already present, ignoring the caller's `addr_type`. The relay-promotion
call site in the DHT bridge would silently bump `last_seen` and drop the
relay re-classification, breaking the type-priority ordering that the
dialer relies on. Plumbed `addr_type` through `try_touch_last_seen` and
`touch_last_seen_if_merge_noop`; the fast path now refuses to
short-circuit when the existing classification differs, forcing
escalation to the slow path so `merge_typed_address` can re-order.

## perf(transport): shard inbound message dispatcher by source IP

At 1000 peers the single consumer task draining every inbound message
became the dominant serialisation point — each message took three async
write locks plus an awaited peer-id registration before the next could
even be looked at, causing response delivery to miss the 25 s caller
timeout. Split the receive loop into a light dispatcher that hashes the
source IP and hands off to one of 8 parallel shard consumers, each
running the full parse/register/broadcast pipeline independently.
Hashing by `IpAddr` (not full `SocketAddr`) keeps a peer pinned to the
same shard across ephemeral-port reconnects so per-peer message ordering
is preserved for the common case. (ab3a5bf)

### Dispatcher fault isolation (H2/M2, from deep review)

A single shard-consumer panic caused the inbound dispatcher loop to
break, dropping the remaining seven shard senders and cascading down all
healthy shards. Switched to `try_send` with explicit handling of both
`Full` and `Closed` errors: drops are counted and warn-logged at
intervals (coalesced via `SHARD_DROP_LOG_INTERVAL`), and the dispatcher
only exits when its upstream channel itself closes. The non-blocking
`try_send` also addresses M2, where a blocking `send().await` on one
congested shard could starve the other seven through the same
serialisation point the sharding was meant to eliminate. The eight
shards now provide real fault isolation.

### Shard ordering docstring corrected (M3, from deep review)

`shard_index_for_addr` previously claimed to preserve per-peer ordering
across reconnects. This is true only across ephemeral-port reconnects
from the same source IP. NAT rebinding to a new public IP, mobile
Wi-Fi↔cellular roaming, or dual-stack failover routes the peer to a
different shard, breaking causal ordering at the application layer.
Docstring now says "per source IP" with the caveat called out explicitly.

### Hold lock across register_connection_peer_id (M5, from deep review)

The shard consumer dropped the `peer_to_channel` write lock before
calling `dual_node.register_connection_peer_id`, leaving a window in
which the app-level map said the peer was known but the transport's
internal addr→peer map did not — a hole-punch lookup during that window
could fail spuriously. Lock is now held across the registration call and
dropped immediately after.

## chore: drop needless struct update in IPDiversityConfig test

Both fields of `IPDiversityConfig` are explicitly set, so the trailing
`..IPDiversityConfig::default()` triggers `clippy::needless_update`
under `cargo clippy --all-targets`. Test code only; no functional
change. (171af29)

## Verification

- `cargo build --lib`: clean
- `cargo fmt --check`: clean
- `cargo clippy --lib -- -D warnings -D clippy::unwrap_used -D clippy::expect_used`: clean
- `cargo clippy --all-targets -- -D warnings`: clean
- `cargo test --lib`: 269 passed, 0 failed

This is one of two PRs that split the original PR #70, which bundled
NAT-traversal hardening with these 1000-node scale fixes. The other PR
covers the NAT-traversal work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mickvandijke added a commit that referenced this pull request on Apr 6, 2026
Splits the NAT-traversal and dialable-address work from the original PR #70 bundle. Root cause and fix below; other correctness fixes bundled. ## Root cause `DhtNetworkManager::local_dht_node()` built its self-entry from the static `NodeConfig::listen_addrs()` derivation, which returns wildcard IPs (`0.0.0.0` / `::`) and the configured port (`0` for ephemeral binds). Receivers' `dialable_addresses()` filter then rejected the entry, so DHT-based peer discovery returned **zero contactable addresses** for any node. Connections only succeeded via side channels (static bootstrap, in-memory cache, relay fallback) — exactly the sporadic-failure pattern reported by `saorsa-node-lmdb`. ## Fix `local_dht_node()` is now an `async fn` that sources its self-entry addresses from `TransportHandle::observed_external_addresses()` — the post-NAT addresses observed via QUIC `OBSERVED_ADDRESS` frames, one per local interface that has an observation on a multi-homed host. An intermediate iteration added a `UdpSocket::connect`-then- `getsockname` helper (`primary_local_ip`) to substitute wildcard IPs, but it was dropped in a follow-up because it ran blocking syscalls inside an async context and returned RFC1918 addresses behind home NAT; the OBSERVED_ADDRESS-only flow supersedes it and needs no interface probing. If no source produces an address, the published self-entry has an empty `addresses` vec. This is the right answer at the publish layer: better to admit "I don't know how to be reached yet" than to lie with a bind-side wildcard. The empty window closes naturally once the first peer connects and OBSERVED_ADDRESS flows. 
## Other NAT fixes bundled - **Event-driven relay/address updates** — relay-established and peer-address-update events drain via `tokio::select!` over `TransportHandle::recv_relay_established()` / `recv_peer_address_update()` instead of the previous 1-second polling loop, eliminating the race window where outbound DHT queries advertised no relay address for up to a second after relay setup. - **Identity exchange timeout bumped to 15 s** — `send_dht_request` computes its identity-exchange timeout as `min(request_timeout, IDENTITY_EXCHANGE_TIMEOUT)`; with the constant at 5 s the effective budget was always 5 s, too tight for cross-region or congested cellular links where the two-RTT exchange plus ML-DSA-65 verification can blow past with retransmits. Bumped to 15 s to match the bootstrap budget. - **Observed-address cache fallback** — saorsa-transport exposes the externally-observed address only via active connections, so a node whose every connection has just dropped has no answer for `observed_external_address()`. This introduces `ObservedAddressCache`: an in-memory, frequency- and recency-aware cache populated by `P2pEvent::ExternalAddressDiscovered` events. When the live source is empty, the cache returns the most-frequently-observed address among recent entries, breaking ties by recency. NAT-rebind handling via a two-pass selection (recent window wins over stale counts). Keyed by `(local_bind, observed)` so multi-homed hosts don't cross-pollute observations from different interfaces. The cache is reset on process restart — a fresh node re-discovers its current address from live connections rather than trusting stale state. 
- **Bridge task split + bounded forwarder channels** — the DHT bridge previously ran `find_closest_nodes_network` inline inside the select loop on the relay branch, blocking the peer-address-update branch for the duration of the iterative lookup; the underlying forwarder mpsc channels were `unbounded_channel`, so the queue could grow without limit during the stall. The lookup + publish is now detached into its own spawned task per relay event so the select loop keeps polling, and both forwarder channels are replaced with bounded `channel(ADDRESS_EVENT_CHANNEL_CAPACITY)` plus a per-event drop counter.
- **`dial_addresses()` helper** — loops through every dialable address until one succeeds. Replaces single-address dial sites in `bootstrap_from_peers`, `spawn_bucket_refresh_task`, `trigger_self_lookup`, the `find_closest_nodes_network` parallel-query block, and the `send_dht_request` fallback, so a stale NAT binding on entry #1 no longer kills the dial when entry #2 would have worked.

## Regression tests

`tests/dht_self_advertisement.rs` adds six `#[tokio::test]` cases that pin the published self-entry behaviour:

- **Loopback bind path** (single-node, `local: true`):
  - `local_mode_publishes_dialable_loopback_address`
  - `local_mode_published_self_entry_matches_runtime_listen_addrs`
  - `local_mode_never_publishes_port_zero`
- **Wildcard / OBSERVED_ADDRESS path** (`local: false`):
  - `wildcard_bind_with_no_peers_publishes_empty_self_entry` — pins the "don't lie when you don't know" contract.
  - `wildcard_bind_publishes_observed_address_after_peer_connection` — primary regression test for the OBSERVED_ADDRESS flow.
  - `observed_address_cache_serves_fallback_after_connection_drop` — pins the cache fallback behaviour after all live connections drop.

All six FAIL on `rc-2026.4.1` with concrete evidence (e.g. `[Quic(0.0.0.0:0)]` published as the only address) and PASS here.
`ObservedAddressCache` ships with 9 unit tests covering empty, single, repeated, frequency, recency tiebreak, NAT-rebind fallback, long-quiet fallback, eviction, and re-observation-on-full.

## chore: drop needless struct update in IPDiversityConfig test

Both fields of `IPDiversityConfig` are explicitly set, so the trailing `..IPDiversityConfig::default()` triggers `clippy::needless_update` under `cargo clippy --all-targets`. Test code only; no functional change.

## Verification

- `cargo build --lib`: clean
- `cargo fmt --check`: clean
- `cargo clippy --lib -- -D warnings -D clippy::unwrap_used -D clippy::expect_used`: clean
- `cargo clippy --all-targets -- -D warnings`: clean
- `cargo test --lib`: 282 passed, 0 failed
- `cargo test --test dht_self_advertisement`: 6/6 passed
- `cargo test --test two_node_messaging`: 7/7 passed
- `cargo test --test node_lifecycle`: 17/17 passed

This is one of two PRs that split the original PR #70, which bundled this NAT-traversal hardening with two 1000-node scale fixes in the DHT touch path and the inbound message dispatcher. The other PR covers the 1000-node perf work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jacderida approved these changes Apr 6, 2026
mickvandijke added a commit that referenced this pull request Apr 14, 2026
The classification session was sending raw listen_addrs (potentially 0.0.0.0) as candidate addresses in DialBackRequest messages. Probers could not dial wildcard addresses, so all probes failed and every node was classified as private regardless of actual reachability. Now uses transport.observed_external_addresses() as the primary source (real external IP from QUIC OBSERVED_ADDRESS frames). Falls back to listen_addrs filtered to non-wildcard IPs only when no observations are available yet (cold start before any peer has connected). This mirrors the fix in local_dht_node() from PR #70. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
Bundles NAT-traversal correctness fixes, a new observed-address cache, and two perf fixes targeting 1000-node routing contention, all against the `rc-2026.4.1` release branch.

## NAT traversal & dialable addresses
### Root cause
`DhtNetworkManager::local_dht_node()` built its self-entry from the static `NodeConfig::listen_addrs()` derivation, which returns wildcard IPs (`0.0.0.0` / `::`) and the configured port (`0` for ephemeral binds). Receivers' `dialable_addresses()` filter then rejected the entry, so DHT-based peer discovery returned zero contactable addresses for any node. Connections only succeeded via side channels (static bootstrap, in-memory cache, relay fallback) — exactly the sporadic-failure pattern reported by `saorsa-node-lmdb`.

### Fix
`local_dht_node()` is now an `async fn` that sources its self-entry addresses from `TransportHandle::observed_external_address()` — the OBSERVED_ADDRESS reflexive address reported by peers — with the new observed-address cache as a fallback for the window immediately after a connection drop. The earlier iteration added a `UdpSocket::connect`-then-`getsockname` helper (`primary_local_ip`) to substitute wildcard IPs, but that was dropped in a follow-up commit because it ran blocking syscalls inside an async context; the OBSERVED_ADDRESS-only flow supersedes it and needs no interface probing.
### Other NAT fixes bundled

- `bootstrap_from_peers` now records the queried bootstrap peer as the hole-punch coordinator referrer for every node it returned, completing the `dial_candidate → set_hole_punch_preferred_coordinator` chain the iterative-lookup path already used.
- Relay-established and peer-address-update events now drain via `tokio::select!` over `TransportHandle::recv_relay_established()` / `recv_peer_address_update()` instead of the previous 1-second polling loop, eliminating the race window where outbound DHT queries advertised no relay address for up to a second after relay setup.
- `IDENTITY_EXCHANGE_TIMEOUT` and `BOOTSTRAP_IDENTITY_TIMEOUT_SECS` are both now 15 s (previously 5 s). Identity exchange (two RTTs + an ML-DSA-65 signature, ~3.3 kB) blew through 5 s sporadically on congested cellular and cross-region links. Review noted the initial commit only bumped the bootstrap path; the non-bootstrap `send_dht_request` path is now on the same 15 s budget.
- `dial_addresses()` loops through every dialable address until one succeeds. Applied at 5 call sites (`bootstrap_from_peers`, `spawn_bucket_refresh_task`, `trigger_self_lookup`, the `find_closest_nodes_network` parallel-query block, the `send_dht_request` fallback) so a stale NAT binding on entry #1 no longer kills the dial when entry #2 would have worked.
### Observed-address cache (new)

`src/transport/observed_address_cache.rs` (new, 372 lines) records OBSERVED_ADDRESS reflexive addresses against the local socket they were reported on, with a TTL and recency-based eviction. The cache is consulted by `TransportHandle::observed_external_address()` when no live connection currently carries a fresh reflexive address, so a node whose upstream connection just dropped can still publish a dialable self-entry for the few seconds before a new connection re-reports the address. Without this fallback the fix above degrades to "works only while you already have at least one live peer with recent OBSERVED_ADDRESS traffic."
## 1000-node scaling

Two independent contention points were identified in 1000-node testnet traces where responses missed the 25 s caller timeout (the `[STEP 6 FAILED]` / `Response channel closed (receiver timed out)` cascade):

- **perf(dht): atomic `last_seen` under a read lock** — routing-table `touch` was called on every inbound DHT message and always took a `RwLock::write()`, serialising every read behind each update. `NodeInfo.last_seen` is now an `AtomicInstant`, and there is a fast path that atomically bumps the timestamp under a read lock when the address merge would be a no-op (address absent, already present, or skipped by the loopback-injection rule). The write-lock slow path is still used when a real address merge is needed.
- **perf(transport): shard the inbound message dispatcher** — the previous single consumer task took three async write locks plus an awaited peer-id registration per message, so messages queued behind each other instead of being processed in parallel. The receive loop is now a light dispatcher that hashes the source `IpAddr` and hands off to one of 8 parallel shard consumers, each running the full parse/register/broadcast pipeline independently. Hashing by `IpAddr` (not the full `SocketAddr`) keeps a peer pinned to the same shard across ephemeral-port reconnects, so per-peer message ordering is preserved.
## Regression tests

`tests/dht_self_advertisement.rs` (new, 649 lines) pins the published self-entry behaviour with three cases covering public-mode wildcard, port match, and local-mode ephemeral bind. All three FAILED on `rc-2026.4.1` with concrete evidence (e.g. `[Quic(0.0.0.0:0)]` published as the only address; transport bound to port 55886 but the DHT advertised port 0). All pass after this branch. They use only public APIs (`DhtNetworkManager::find_closest_nodes_local_with_self`), so no new test surface area was added to the library.

## Out of scope
- saorsa-transport's `nat_traversal_api.rs` is fixed in a separate commit in that crate.
- The `--port 0` default in `ant-node` remains a deliberate operator-config decision.

## Commits
- `08febcc` fix: advertise dialable addresses to DHT and harden NAT traversal
- `4aeecf6` chore: drop needless struct update in IPDiversityConfig test
- `a528dec` fix: bump IDENTITY_EXCHANGE_TIMEOUT to 15s for slow-link parity
- `26eb128` fix: drop primary_local_ip and rely on OBSERVED_ADDRESS for self-entry
- `7086ccb` feat: cache observed external addresses as fallback for dropped connections
- `0a854cd` perf(dht): atomic last_seen to run touch under a read lock
- `cf1f32d` perf(transport): shard inbound message dispatcher by source IP

## Test plan
- `cargo test --lib`
- `cargo test --test dht_self_advertisement` — 3/3 passing (new regression suite)
- `cargo test --test two_node_messaging`
- `cargo test --test node_lifecycle`
- `cargo clippy --lib -- -D warnings` — clean
- `cargo test --lib` and clippy after the two `perf:` commits
- `saorsa-2.saorsalabs.com:10000` / `saorsa-3.saorsalabs.com:10000` bootstrap nodes
- `[STEP 6 FAILED]` / `Response channel closed` cascade

## Greptile Summary
This PR bundles a well-scoped set of fixes and performance improvements targeting DHT self-advertisement correctness, NAT traversal reliability, and routing-table write-lock contention at 1000-node scale.
Core changes:
- `local_dht_node()` is now `async` and sources self-entry addresses exclusively from `TransportHandle::observed_external_addresses()` (live QUIC OBSERVED_ADDRESS frames) and the runtime-bound listen address set, discarding the prior `NodeConfig::listen_addrs()` call that returned un-dialable wildcard/port-0 values
- `ObservedAddressCache` (`src/transport/observed_address_cache.rs`) fills the "invisible window" between connection drop and reconnection with frequency+recency-aware address selection; multi-homed hosts are partitioned by the `(local_bind, observed)` key
- `bootstrap_from_peers` now passes the bootstrap peer's socket address as hole-punch coordinator referrer to every `dial_addresses` call
- `IDENTITY_EXCHANGE_TIMEOUT` and `BOOTSTRAP_IDENTITY_TIMEOUT_SECS` unified at 15 s (was 5 s) to handle congested cellular / cross-region links
- `NodeInfo.last_seen` promoted to `AtomicInstant` (`parking_lot::Mutex<Instant>`) so the hot `touch` path can run under a read lock on the routing table instead of taking a write lock on every inbound DHT message
- Event-driven relay/address updates via `tokio::select!` over `recv_relay_established` / `recv_peer_address_update`
- New `tests/dht_self_advertisement.rs` (649 lines) that confirmed the exact failure modes on the base branch and passes cleanly on this branch

### Confidence Score: 5/5
Safe to merge — all remaining findings are P2 style suggestions with no production correctness impact
No P0 or P1 issues found. The two P2 comments flag a write-lock optimization opportunity in the shard consumer and the use of DefaultHasher (which is deterministic within a run and carries no correctness risk for a load-balancing shard key). The root-cause fixes are correct, the new ObservedAddressCache is well-designed and thoroughly tested, the regression suite concretely demonstrates the pre-fix failures, and the atomic touch path is a valid read-lock optimization with proper fallback. Prior concern about blocking I/O in local_dht_node() is fully resolved by the OBSERVED_ADDRESS-only approach.
src/transport_handle.rs — the shard consumer write-lock pattern is worth a follow-up optimization once the scalability impact is measured in production
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Inbound QUIC message] --> B[upstream_rx channel]
    B --> C{shard_index_for_addr\nhash source IP}
    C -->|shard 0| D0[Shard Consumer 0]
    C -->|shard 1| D1[Shard Consumer 1]
    C -->|...| D2[...]
    C -->|shard 7| D7[Shard Consumer 7]
    D0 & D1 & D2 & D7 --> E[parse_protocol_message]
    E --> F{authenticated?}
    F -->|yes| G[peer_to_channel.write\nchannel_to_peers.write\nregister_connection_peer_id]
    F -->|no| H[broadcast P2PEvent]
    G --> I{is /rr/ response?}
    I -->|yes| J[active_requests.write\ndeliver to oneshot]
    I -->|no| H
    subgraph NAT [NAT Self-Entry Fix]
        K[local_dht_node async] --> L{observed_external\naddresses?}
        L -->|yes| M[publish observed addr\nper interface]
        L -->|no| N[ObservedAddressCache\nfallback]
        N --> O{specific IP\nlisten addr?}
        O -->|yes| P[publish listen addr]
        O -->|no| Q[publish nothing —\nhonest empty entry]
    end
```