Skip to content

style: Fix formatting issues to resolve CI failures#1

Closed
devin-ai-integration[bot] wants to merge 5 commits intomainfrom
devin/1757418585-ci-fixes
Closed

style: Fix formatting issues to resolve CI failures#1
devin-ai-integration[bot] wants to merge 5 commits intomainfrom
devin/1757418585-ci-fixes

Conversation

@devin-ai-integration
Copy link
Copy Markdown

style: Fix formatting issues to resolve CI failures

Summary

Applies cargo fmt --all to fix formatting inconsistencies across the codebase that were causing CI failures. The changes include:

  • Import statement reorganization (grouping standard library, third-party, and local imports)
  • Consistent whitespace and line ending fixes
  • Multi-line formatting improvements for better readability
  • Removal of trailing whitespace

No functional changes - this is purely a formatting/style fix to bring the code into compliance with rustfmt standards.

Review & Testing Checklist for Human

  • CI formatting checks pass - Verify that the rustfmt CI job now succeeds
  • Compilation still works - Note that there are existing compilation issues with the external saorsa-rsps-0.2.0 dependency that uses unstable Rust features (unrelated to this formatting fix)
  • No accidental logic changes - Scan the diff to ensure only whitespace/formatting was modified, no actual code logic

Notes

  • This PR addresses the formatting component of CI failures but does not resolve the underlying compilation issues with the saorsa-rsps external dependency
  • The compilation failures are due to unstable let expressions in the external dependency, which cannot be fixed in this repository
  • All formatting changes were generated automatically using cargo fmt --all with Rust 1.85.0

Link to Devin run: https://app.devin.ai/sessions/e2bf1cabed934e31af2ee50beb8371b6
Requested by: David Irvine (@dirvine)

- Fix whitespace and formatting across all source files
- Addresses CI formatting check failures
- Note: compilation issues exist in saorsa-rsps dependency using unstable let expressions

Co-Authored-By: David Irvine <david.irvine@maidsafe.net>
@devin-ai-integration
Copy link
Copy Markdown
Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

devin-ai-integration bot and others added 4 commits September 9, 2025 12:32
…ssions

- Added patch.crates-io pointing to git repository HEAD
- Still contains unstable let expressions, need alternative approach

Co-Authored-By: David Irvine <david.irvine@maidsafe.net>
- Add identity_set_website_root for website root updates
- Add group_member_add/remove for group membership management
- Add group_epoch_bump for group epoch operations
- Remove invalid local patch to /tmp path that breaks CI
- Functions restored from api.rs.backup to fix test compilation

Co-Authored-By: David Irvine <david.irvine@maidsafe.net>
- Add canonical message prefix constant for identity website root updates
- Fixes compilation error in identity_set_website_root function
- Constant value matches backup file implementation

Co-Authored-By: David Irvine <david.irvine@maidsafe.net>
- Add website_root: Option<Key> field to IdentityPacketV1 struct definition
- Initialize website_root: None in register_identity function
- Fix dht_put_bytes call to use &id_key reference
- Prefix unused parameters in group_epoch_bump with underscore
- Resolves compilation errors for identity_set_website_root function

Co-Authored-By: David Irvine <david.irvine@maidsafe.net>
@devin-ai-integration
Copy link
Copy Markdown
Author

Closing due to inactivity for more than 7 days. Configure here.

mickvandijke added a commit that referenced this pull request Apr 6, 2026
Bundles six NAT-traversal fixes uncovered during a deep audit of the
sporadic-failure pattern reported by saorsa-node-lmdb.

Root cause:
  DhtNetworkManager::local_dht_node() built its self-entry from the
  static NodeConfig::listen_addrs() derivation, which returns wildcard
  IPs (0.0.0.0 / ::) and the *configured* port (0 for ephemeral binds).
  Receivers' dialable_addresses() filter then rejected the entry, so
  DHT-based peer discovery returned zero contactable addresses for any
  node. Connections only succeeded via side channels (static bootstrap,
  in-memory cache, relay fallback) — exactly the sporadic pattern.

  Rewritten as `async fn` that sources addresses, in priority order,
  from TransportHandle::observed_external_address() (the OBSERVED_ADDRESS
  reflexive address, which was defined on the adapter and never called
  outside its own file) and the runtime listen_addrs RwLock, with
  unspecified IPs substituted via a new primary_local_ip() helper that
  uses the standard UdpSocket::connect-then-getsockname trick. No new
  dependencies.

Other fixes bundled:
  - bootstrap_from_peers now records the bootstrap peer as the natural
    hole-punch coordinator referrer for every node it returned, threading
    through the existing dial_candidate -> set_hole_punch_preferred_coordinator
    plumbing that the iterative-lookup path already used.
  - Relay-established and peer-address-update events drained via a
    tokio::select! over new TransportHandle::recv_relay_established() /
    recv_peer_address_update() instead of the previous 1-second polling
    loop, eliminating the race window where outbound DHT queries
    advertised no relay address for up to a second after relay setup.
  - BOOTSTRAP_IDENTITY_TIMEOUT_SECS bumped 5 -> 15 with explanatory
    comment. Identity exchange (two RTTs + ML-DSA-65 signature, ~3.3kB)
    blew through 5s sporadically on congested cellular and cross-region
    links.
  - Added dial_addresses() helper that loops through every dialable
    address until one succeeds. Replaced 5 single-address dial sites
    (bootstrap_from_peers, spawn_bucket_refresh_task, trigger_self_lookup,
    find_closest_nodes_network parallel-query block, send_dht_request
    fallback) so a stale NAT binding on entry #1 no longer kills the
    dial when entry #2 would have worked.

Regression tests:
  tests/dht_self_advertisement.rs adds three #[tokio::test] cases that
  pin the published self-entry behaviour. All three FAILED on main with
  concrete evidence (e.g. `[Quic(0.0.0.0:0)]` published as the only
  address; transport bound to port 55886 but DHT advertised port 0).
  All PASS after this commit. They use only public APIs
  (DhtNetworkManager::find_closest_nodes_local_with_self) so no new
  test surface area was added to the library.

Verified:
  cargo test --lib                       269/269
  cargo test --test dht_self_advertisement 3/3
  cargo test --test two_node_messaging   7/7
  cargo test --test node_lifecycle       17/17
  cargo clippy --lib -- -D warnings      clean

The dead hole-punch chain in saorsa-transport's nat_traversal_api.rs
(Bug 4 from the audit) is fixed in a separate commit in that crate.
The --port 0 default in ant-node (Bug 5) is left as a deliberate
decision for the operator config layer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mickvandijke added a commit that referenced this pull request Apr 6, 2026
Bundles six NAT-traversal fixes uncovered during a deep audit of the
sporadic-failure pattern reported by saorsa-node-lmdb.

Root cause:
  DhtNetworkManager::local_dht_node() built its self-entry from the
  static NodeConfig::listen_addrs() derivation, which returns wildcard
  IPs (0.0.0.0 / ::) and the *configured* port (0 for ephemeral binds).
  Receivers' dialable_addresses() filter then rejected the entry, so
  DHT-based peer discovery returned zero contactable addresses for any
  node. Connections only succeeded via side channels (static bootstrap,
  in-memory cache, relay fallback) — exactly the sporadic pattern.

  Rewritten as `async fn` that sources addresses, in priority order,
  from TransportHandle::observed_external_address() (the OBSERVED_ADDRESS
  reflexive address, which was defined on the adapter and never called
  outside its own file) and the runtime listen_addrs RwLock, with
  unspecified IPs substituted via a new primary_local_ip() helper that
  uses the standard UdpSocket::connect-then-getsockname trick. No new
  dependencies.

Other fixes bundled:
  - bootstrap_from_peers now records the bootstrap peer as the natural
    hole-punch coordinator referrer for every node it returned, threading
    through the existing dial_candidate -> set_hole_punch_preferred_coordinator
    plumbing that the iterative-lookup path already used.
  - Relay-established and peer-address-update events drained via a
    tokio::select! over new TransportHandle::recv_relay_established() /
    recv_peer_address_update() instead of the previous 1-second polling
    loop, eliminating the race window where outbound DHT queries
    advertised no relay address for up to a second after relay setup.
  - BOOTSTRAP_IDENTITY_TIMEOUT_SECS bumped 5 -> 15 with explanatory
    comment. Identity exchange (two RTTs + ML-DSA-65 signature, ~3.3kB)
    blew through 5s sporadically on congested cellular and cross-region
    links.
  - Added dial_addresses() helper that loops through every dialable
    address until one succeeds. Replaced 5 single-address dial sites
    (bootstrap_from_peers, spawn_bucket_refresh_task, trigger_self_lookup,
    find_closest_nodes_network parallel-query block, send_dht_request
    fallback) so a stale NAT binding on entry #1 no longer kills the
    dial when entry #2 would have worked.

Regression tests:
  tests/dht_self_advertisement.rs adds three #[tokio::test] cases that
  pin the published self-entry behaviour. All three FAILED on main with
  concrete evidence (e.g. `[Quic(0.0.0.0:0)]` published as the only
  address; transport bound to port 55886 but DHT advertised port 0).
  All PASS after this commit. They use only public APIs
  (DhtNetworkManager::find_closest_nodes_local_with_self) so no new
  test surface area was added to the library.

Verified:
  cargo test --lib                       269/269
  cargo test --test dht_self_advertisement 3/3
  cargo test --test two_node_messaging   7/7
  cargo test --test node_lifecycle       17/17
  cargo clippy --lib -- -D warnings      clean

The dead hole-punch chain in saorsa-transport's nat_traversal_api.rs
(Bug 4 from the audit) is fixed in a separate commit in that crate.
The --port 0 default in ant-node (Bug 5) is left as a deliberate
decision for the operator config layer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mickvandijke added a commit that referenced this pull request Apr 6, 2026
Splits the NAT-traversal and dialable-address work from the original
PR #70 bundle. Root cause and fix below; other correctness fixes
bundled.

## Root cause

`DhtNetworkManager::local_dht_node()` built its self-entry from the
static `NodeConfig::listen_addrs()` derivation, which returns wildcard
IPs (`0.0.0.0` / `::`) and the configured port (`0` for ephemeral
binds). Receivers' `dialable_addresses()` filter then rejected the
entry, so DHT-based peer discovery returned **zero contactable
addresses** for any node. Connections only succeeded via side channels
(static bootstrap, in-memory cache, relay fallback) — exactly the
sporadic-failure pattern reported by `saorsa-node-lmdb`.

## Fix

`local_dht_node()` is now an `async fn` that sources its self-entry
addresses from `TransportHandle::observed_external_addresses()` — the
post-NAT addresses observed via QUIC `OBSERVED_ADDRESS` frames, one
per local interface that has an observation on a multi-homed host.
An intermediate iteration added a `UdpSocket::connect`-then-
`getsockname` helper (`primary_local_ip`) to substitute wildcard IPs,
but it was dropped in a follow-up because it ran blocking syscalls
inside an async context and returned RFC1918 addresses behind home
NAT; the OBSERVED_ADDRESS-only flow supersedes it and needs no
interface probing.

If no source produces an address, the published self-entry has an
empty `addresses` vec. This is the right answer at the publish layer:
better to admit "I don't know how to be reached yet" than to lie
with a bind-side wildcard. The empty window closes naturally once the
first peer connects and OBSERVED_ADDRESS flows.

## Other NAT fixes bundled

- **Event-driven relay/address updates** — relay-established and
  peer-address-update events drain via `tokio::select!` over
  `TransportHandle::recv_relay_established()` /
  `recv_peer_address_update()` instead of the previous 1-second
  polling loop, eliminating the race window where outbound DHT queries
  advertised no relay address for up to a second after relay setup.

- **Identity exchange timeout bumped to 15 s** — `send_dht_request`
  computes its identity-exchange timeout as
  `min(request_timeout, IDENTITY_EXCHANGE_TIMEOUT)`; with the constant
  at 5 s the effective budget was always 5 s, too tight for
  cross-region or congested cellular links where the two-RTT exchange
  plus ML-DSA-65 verification can blow past with retransmits. Bumped
  to 15 s to match the bootstrap budget.

- **Observed-address cache fallback** — saorsa-transport exposes the
  externally-observed address only via active connections, so a node
  whose every connection has just dropped has no answer for
  `observed_external_address()`. This introduces
  `ObservedAddressCache`: an in-memory, frequency- and recency-aware
  cache populated by `P2pEvent::ExternalAddressDiscovered` events.
  When the live source is empty, the cache returns the
  most-frequently-observed address among recent entries, breaking
  ties by recency. NAT-rebind handling via a two-pass selection
  (recent window wins over stale counts). Keyed by
  `(local_bind, observed)` so multi-homed hosts don't cross-pollute
  observations from different interfaces. The cache is reset on
  process restart — a fresh node re-discovers its current address
  from live connections rather than trusting stale state.

- **Bridge task split + bounded forwarder channels** — the DHT bridge
  previously ran `find_closest_nodes_network` inline inside the select
  loop on the relay branch, blocking the peer-address-update branch
  for the duration of the iterative lookup; the underlying forwarder
  mpsc channels were `unbounded_channel` so the queue could grow
  without limit during the stall. Detached the lookup + publish into
  its own spawned task per relay event so the select loop keeps
  polling, and replaced both forwarder channels with bounded
  `channel(ADDRESS_EVENT_CHANNEL_CAPACITY)` plus a per-event drop
  counter.

- **`dial_addresses()` helper** — loops through every dialable
  address until one succeeds. Replaces single-address dial sites in
  `bootstrap_from_peers`, `spawn_bucket_refresh_task`,
  `trigger_self_lookup`, `find_closest_nodes_network` parallel-query
  block, and `send_dht_request` fallback so a stale NAT binding on
  entry #1 no longer kills the dial when entry #2 would have worked.

## Regression tests

`tests/dht_self_advertisement.rs` adds six `#[tokio::test]` cases that
pin the published self-entry behaviour:

- **Loopback bind path** (single-node, `local: true`):
  - `local_mode_publishes_dialable_loopback_address`
  - `local_mode_published_self_entry_matches_runtime_listen_addrs`
  - `local_mode_never_publishes_port_zero`

- **Wildcard / OBSERVED_ADDRESS path** (`local: false`):
  - `wildcard_bind_with_no_peers_publishes_empty_self_entry` — pins
    the "don't lie when you don't know" contract.
  - `wildcard_bind_publishes_observed_address_after_peer_connection` —
    primary regression test for the OBSERVED_ADDRESS flow.
  - `observed_address_cache_serves_fallback_after_connection_drop` —
    pins the cache fallback behaviour after all live connections drop.

All six FAIL on `rc-2026.4.1` with concrete evidence (e.g.
`[Quic(0.0.0.0:0)]` published as the only address) and PASS here.
`ObservedAddressCache` ships with 9 unit tests covering empty, single,
repeated, frequency, recency tiebreak, NAT-rebinding fallback,
long-quiet fallback, eviction, and re-observation-on-full.

## chore: drop needless struct update in IPDiversityConfig test

Both fields of `IPDiversityConfig` are explicitly set, so the trailing
`..IPDiversityConfig::default()` triggers `clippy::needless_update`
under `cargo clippy --all-targets`. Test code only; no functional
change.

## Verification

- `cargo build --lib`: clean
- `cargo fmt --check`: clean
- `cargo clippy --lib -- -D warnings -D clippy::unwrap_used -D clippy::expect_used`: clean
- `cargo clippy --all-targets -- -D warnings`: clean
- `cargo test --lib`: 282 passed, 0 failed
- `cargo test --test dht_self_advertisement`: 6/6 passed
- `cargo test --test two_node_messaging`: 7/7 passed
- `cargo test --test node_lifecycle`: 17/17 passed

This is one of two PRs that split the original PR #70, which bundled
this NAT-traversal hardening with two 1000-node scale fixes in the
DHT touch path and the inbound message dispatcher. The other PR
covers the 1000-node perf work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jacderida added a commit that referenced this pull request Apr 11, 2026
bootstrap_from_peers() was dialing peers as it discovered them from each
bootstrap node's FindNode response. A NAT node discovered from boot-1
would start dialing before boot-3's response provided the coordinator
hints needed for hole-punching — causing DIAL_TIMEOUT failures on upload
#2 while upload #1 (which ran after full discovery) worked fine.

Restructured into three phases:
1. Collect: gather all FindNode responses from all bootstrap peers
2. Merge: extract and set coordinator hints for every discovered node
3. Dial: connect to all nodes with full hint coverage

This ensures every NAT node has its complete coordinator set before any
connection attempt begins, eliminating the race between hint extraction
and dialing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants