Skip to content

feat: best-effort UPnP IGD port mapping#40

Merged
jacderida merged 3 commits into
rc-2026.4.1from
feat/upnp-port-mapping
Apr 6, 2026
Merged

feat: best-effort UPnP IGD port mapping#40
jacderida merged 3 commits into
rc-2026.4.1from
feat/upnp-port-mapping

Conversation

@mickvandijke
Copy link
Copy Markdown
Collaborator

@mickvandijke mickvandijke commented Apr 6, 2026

Summary

Adds optional UPnP IGD port mapping that opportunistically asks the local router to forward the endpoint's UDP port and surfaces the resulting public address as a high-priority NAT traversal candidate. The integration is strictly additive — failure is silent and behaviour matches the pre-UPnP build exactly when no gateway is available.

When the gateway cooperates, peers can dial the port-mapped public address directly and skip hole punching entirely. When it doesn't (no router, UPnP disabled, refused, untrusted external IP), discovery proceeds exactly as before via OBSERVED_ADDRESS and PUNCH_ME_NOW.

What's in the box

  • New `src/upnp.rs` (~520 lines) — `UpnpConfig`, `UpnpState`, `UpnpMappingService`, `UpnpStateRx`. Real backend gated on default-on `upnp` cargo feature; no-op stub when disabled
  • New `CandidateSource::PortMapped` and `DiscoverySourceType::PortMapped` variants threaded through the discovery pipeline
  • Discovery manager publishes the UPnP-mapped address as a `PortMapped` candidate at priority 70_000 (above bound-address 60_000)
  • ICE pair priority gives `PortMapped` a type preference of 110 — between `Host` (126) and `ServerReflexive` (100)
  • `NatTraversalEndpoint` owns the service exclusively; discovery manager only borrows a read-only `UpnpStateRx`. Graceful shutdown moves out the service and issues `DeletePortMapping` with a 500ms budget
  • `--no-upnp` CLI flag on the `saorsa-transport` binary

Best-effort guarantees

  1. `UpnpMappingService::start()` is infallible — never returns `Result`, never blocks. Endpoint construction proceeds while the probe runs in the background.
  2. All failures log at `debug` and transition to a sticky `Unavailable` state. Only success logs at `info`.
  3. Single-shot probe per session — a router that ignored SSDP is left alone.
  4. Lease defaults to 1h, refresh at lease/2. Crash-path safety: 1h bounds the worst-case leak.
  5. Gateway-claimed external IPs validated as plausibly public before use. RFC1918, CGNAT (100.64/10), loopback, link-local, documentation, broadcast and multicast all rejected.
  6. Endpoint owns the service exclusively so shutdown can reclaim it for unmap.

Drive-by

Clippy caught a pre-existing `!target_peer_id.is_some()` at `nat_traversal_api.rs:6144` (introduced in f65af2e) that was blocking warnings-as-errors after a clippy upgrade. Replaced with `.is_none()`.

Test plan

  • `cargo fmt --all -- --check` — clean
  • `cargo clippy --all-targets --features upnp -- -D warnings` — clean
  • `cargo clippy --all-targets --no-default-features --features platform-verifier,network-discovery -- -D warnings` — clean (stub backend)
  • `cargo nextest run --features upnp --lib` — 1468 / 1468 passed
  • New unit tests cover: disabled service path, default config, IPv4 plausibility (RFC1918, CGNAT, loopback, link-local, documentation, broadcast), IPv6 plausibility (link-local, global unicast)
  • New discovery integration tests cover: `Mapped` state surfaces a `PortMapped` candidate at expected priority, `Unavailable` state contributes nothing
  • Manual test on real router with UPnP enabled — not in CI; verify mapping appears in router admin UI and external dial works
  • Manual test on router with UPnP disabled — verify the endpoint behaves identically to a non-UPnP build (no log spam, hole punching still works)

What's deliberately NOT in this PR

Per the design discussion before implementation:

  • No NAT-PMP / PCP support (UPnP-only as requested)
  • No `if-watch` integration for network-change re-probing
  • No metrics dashboard tile
  • No automatic broadcast of `PortMapped` candidates via `ADD_ADDRESS` (covered indirectly because bootstrap nodes will OBSERVED_ADDRESS the same address back when port preservation works — worth a follow-up to close the gap explicitly)

🤖 Generated with Claude Code

Greptile Summary

Adds an optional, default-on UPnP IGD port-mapping layer that surfaces the router-assigned public ip:port as a new PortMapped candidate at priority 70,000 / ICE type-preference 110. The integration is carefully best-effort: UpnpMappingService::start is infallible, all failure paths are silent, the endpoint holds exclusive ownership for graceful unmap on shutdown, and a --no-upnp CLI flag lets users opt out. One minor gap worth closing: the IPv6 branch of is_plausibly_public omits v6.is_documentation() (stable since Rust 1.73), leaving 2001:db8::/32 accepted as plausibly public — and the accompanying unit test inadvertently uses that same documentation prefix as its "global unicast" example.

Confidence Score: 5/5

Safe to merge; only P2 findings remain in a strictly additive best-effort feature

Both findings are P2: a missing is_documentation() guard for IPv6 (no real gateway returns 2001:db8:: as its external IP) and the corresponding misleading test address. Core logic — ownership model, async shutdown, dedup, priority constants, no-op stub path, graceful 500ms unmap — is sound. All 1468 tests pass.

src/upnp.rs — is_plausibly_public IPv6 branch and its unit test

Important Files Changed

Filename Overview
src/upnp.rs New best-effort UPnP service; IPv6 plausibility check omits is_documentation() and unit test uses RFC 3849 address as its global-unicast example
src/candidate_discovery.rs Adds PortMapped source variant and tri-site UPnP candidate injection with correct address-equality dedup
src/nat_traversal_api.rs Integrates UpnpMappingService with correct exclusive ownership, async shutdown (mutex take before await), and ICE type-preference 110 for PortMapped
src/connection/nat_traversal.rs Adds PortMapped to CandidateSource enum and maps it to ServerReflexive in classify_candidate_type
src/unified_config.rs Exposes UpnpConfig on NatConfig flowing through to NatTraversalConfig with correct defaults
src/bin/saorsa-transport.rs Adds --no-upnp CLI flag calling UpnpConfig::disabled()
Cargo.toml Adds optional igd-next 0.17 gated on new default-on upnp feature

Sequence Diagram

sequenceDiagram
    participant E as NatTraversalEndpoint
    participant U as UpnpMappingService
    participant G as IGD Gateway
    participant D as CandidateDiscoveryManager

    E->>U: start(local_port, config)
    U-->>E: service (state=Probing)
    E->>D: set_upnp_state_rx(rx)

    U->>G: search_gateway (SSDP, 2s timeout)
    G-->>U: gateway
    U->>G: get_external_ip()
    G-->>U: external_ip
    U->>U: is_plausibly_public(ip)?
    U->>G: add_port OR add_any_port
    G-->>U: mapped_port
    U-->>D: watch => Mapped {external, lease_expires_at}

    D->>D: upnp_candidate() snapshot
    D->>D: try_publish_upnp_candidate()
    D-->>E: LocalCandidateDiscovered(PortMapped)

    Note over U: Refresh loop at lease/2
    U->>G: add_port(mapped_port, lease)
    G-->>U: renewed

    E->>U: shutdown()
    U->>U: notify + abort background task
    U->>G: remove_port (500ms budget)
    G-->>U: deleted
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/upnp.rs
Line: 289-298

Comment:
**IPv6 documentation range accepted as plausibly public**

The IPv4 path explicitly calls `addr.is_documentation()`, rejecting RFC 5737 prefixes. The IPv6 path omits the equivalent check, so `2001:db8::/32` (RFC 3849 documentation range) passes as public. `Ipv6Addr::is_documentation()` is stable since Rust 1.73.0 — the project minimum of 1.88.0 makes this straightforward to add.

```suggestion
        IpAddr::V6(v6) => {
            // Reject loopback, unspecified, multicast, link-local, and
            // documentation (2001:db8::/32, RFC 3849). Anything else
            // (global unicast, ULA) is acceptable.
            !(v6.is_loopback()
                || v6.is_unspecified()
                || v6.is_multicast()
                || v6.is_documentation()
                || is_ipv6_link_local(v6))
        }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/upnp.rs
Line: 700-706

Comment:
**Test example address is in the RFC 3849 documentation range**

`2001:db8::1` is the IPv6 documentation prefix (RFC 3849), not a globally routable address. Using it as the "global unicast" example means the test does not actually cover a real public address. If `v6.is_documentation()` is added to `is_plausibly_public`, this assertion will fail. Consider replacing with a genuinely routable address such as `2600::1`.

```suggestion
    fn accepts_global_unicast_ipv6_and_rejects_link_local() {
        let global = Ipv6Addr::new(0x2600, 0, 0, 0, 0, 0, 0, 1); // 2600::1
        let link_local = Ipv6Addr::new(0xfe80, 0, 0, 0, 0, 0, 0, 1);
        assert!(is_plausibly_public(IpAddr::V6(global)));
        assert!(!is_plausibly_public(IpAddr::V6(link_local)));
        assert!(!is_plausibly_public(IpAddr::V6(Ipv6Addr::LOCALHOST)));
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat: best-effort UPnP IGD port mapping ..." | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

Adds an optional UPnP IGD client that opportunistically asks the local
router to forward the endpoint's UDP port and surfaces the resulting
public address as a high-priority candidate alongside locally-discovered
and peer-observed candidates.

The integration is strictly additive: failure is silent (no router, UPnP
disabled, refused mapping, untrusted external IP) and behaviour matches
the pre-UPnP build exactly when the gateway is unavailable. The endpoint
construction never blocks on the probe — discovery runs in a background
task with a 2s deadline.

Key design points:

* Single-shot probe per session. A router that did not answer SSDP is
  left alone for the rest of the session — no aggressive re-probing.
* Lease renewed at half the lease duration (default 1h, so refresh every
  30 minutes). Crash-path safety: 1h lease bounds the worst-case leak.
* Gateway-claimed external IPs are validated as plausibly public before
  use. RFC1918, CGNAT (100.64/10), loopback, link-local, documentation,
  broadcast and multicast all rejected so a misbehaving router cannot
  poison candidate discovery.
* Endpoint owns the service exclusively; the discovery manager only
  holds a read-only UpnpStateRx (watch::Receiver). This lets the
  shutdown path move out the service and call DeletePortMapping
  directly with a 500ms budget.
* PortMapped slots into priority calculation between Host (126) and
  ServerReflexive (100) at 110, and gets discovery priority 70_000 so
  it outranks the bound-address promotion (60_000).

Drive-by: clippy caught a pre-existing `!is_some()` in the PUNCH_ME_NOW
coordination path that was blocking warnings-as-errors after a clippy
upgrade — replaced with `.is_none()`.

Tests: 1468 lib tests pass with --features upnp. New unit tests cover
config defaults, the disabled path, the IP plausibility classifier
across IPv4/IPv6 special ranges, and the discovery integration with a
pinned UpnpStateRx.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread src/upnp.rs Outdated
Comment thread src/upnp.rs
mickvandijke and others added 2 commits April 6, 2026 16:00
The IPv4 plausibility check rejects RFC 5737 documentation space
(192.0.2/24, 198.51.100/24, 203.0.113/24) so a misbehaving router
cannot poison NAT traversal candidate discovery by returning a
documentation address as its "external" IP. The IPv6 side of the
classifier was missing the equivalent rejection of RFC 3849
`2001:db8::/32`, leaving a small but real asymmetry in the stated
threat model.

Split `is_plausibly_public` into symmetric v4/v6 helpers, add an
`is_ipv6_documentation` check backed by named constants for the
prefix bytes, and replace the hand-rolled `is_ipv6_link_local` with
stdlib's `Ipv6Addr::is_unicast_link_local()` (stable since 1.76 —
well under the project MSRV of 1.88).

Fixes up the test that would have caught this: the positive
assertion in `accepts_global_unicast_ipv6_and_rejects_link_local`
was using `2001:db8::1` as its "global unicast" example, which
accidentally hid the gap. Swap it for `2606:4700:4700::1111`
(Cloudflare DNS), add `rejects_ipv6_documentation_range` with a
neighbouring `/32` as a positive control, and add
`rejects_ipv6_multicast_and_unspecified` for basic coverage
completeness.

Drive-bys from the deep review of PR #40:

* `CandidateDiscoveryManager::set_upnp_state_rx` downgraded from
  `pub` to `pub(crate)` — it's an internal wiring hook used only
  by the endpoint constructor, not a public API.
* Clarifying comment on `local_socket_for_mapping` explaining why
  the sync `std::net::UdpSocket::bind`/`connect` pair is safe to
  call inside the tokio task (pure kernel route lookup, no wire
  I/O, runs once per session).

Tests: 1470/1470 lib tests pass with `--features upnp`, and the
UPnP module now has 11 passing tests (up from 9) across both the
real and stub backends.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR #41 removed the dead `NatTraversalEndpoint::attempt_hole_punching`
chain (`get_candidate_pairs_for_addr`, `calculate_candidate_pair_priority`,
`attempt_quic_hole_punching`, `create_path_challenge_packet`,
`store_successful_candidate_pair`, `get_successful_candidate_address`)
as unreachable code gated behind `#[allow(dead_code)]` that could
not have worked in production.

PR #40 had added `CandidateSource::PortMapped` handling inside two of
those exact functions (`is_local_side` in `get_candidate_pairs_for_addr`
and a type-preference slot in `calculate_candidate_pair_priority`).
Those additions patched dead code and are safely dropped — the live
production pairing path in `P2pEndpoint::connect_with_fallback_inner`
drives the coordinator-mediated PUNCH_ME_NOW flow and consumes
priorities from `crate::connection::nat_traversal::calculate_candidate_priority`,
which works off `CandidateType`, not `CandidateSource`. The mapping
`CandidateSource::PortMapped -> CandidateType::ServerReflexive` in
`classify_candidate_type` at `src/connection/nat_traversal.rs:307`
(introduced by PR #40 and untouched by PR #41) is what carries the
PortMapped variant through the production pairing formula. No
further plumbing is required.

Extended PR #41's tombstone comment with a note explaining why
PortMapped needs no replacement pairing logic at this site, so the
next person grepping for `PortMapped` in `nat_traversal_api.rs`
lands on the explanation directly.

Validation:
- cargo fmt --all -- --check clean
- cargo clippy --features upnp --all-targets -- -D warnings clean
- cargo clippy --no-default-features --features platform-verifier,network-discovery --all-targets -- -D warnings clean
- cargo nextest run --features upnp --lib: 1470/1470 passed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jacderida jacderida merged commit 3d73241 into rc-2026.4.1 Apr 6, 2026
2 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants