feat(tunnel): Add mTLS authentication tests (T-6.2)#5178
feat(tunnel): Add mTLS authentication tests (T-6.2)#5178obtFusi wants to merge 44 commits intonetbirdio:mainfrom
Conversation
- PR lint workflow (Conventional Commits validation) - Auto-label workflow (Epic/Story/Task + type detection) - Dependabot config (Go, Docker, GitHub Actions) - Issue templates (Bug, Feature, Epic, Story, Task) - PR template with checklist Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ci: add GitHub configuration from network-agent pattern
Implements server-side mTLS authentication infrastructure: - MTLSIdentity extraction from client certificates - SAN DNSName as primary identity (not CN!) - Template OID (v2) and Template Name (v1) parsing - BMPString (UTF-16BE) decoding for AD CS templates - PeerType determination (machine/user/unknown) - Issuer fingerprint via VerifiedChains (strong binding) - gRPC interceptors (unary + stream) with method-based routing Includes: - ADR-001: mTLS Port Strategy - ADR-002: CNG Signer Interface (for T-1.1) - Test certificates for unit tests - Comprehensive test coverage Closes #14 (T-1.2) Closes #15 (T-1.3) Refs #13 (T-1.1 blocked - needs Windows) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Security audit documents should not be committed to public repository. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
T-1.1: CNG crypto.Signer spike (spike/cng-signer/) - Pure Go Windows CNG integration via golang.org/x/sys/windows - Non-exportable private key signing with crypto.Signer interface - Tested on DC01: 1.6ms signing latency, no CGO required - Fixed CertDuplicateCertificateContext bug for context retention T-1.3: SAN/Template parser spike (spike/san-parser/) - Extracts SAN DNSName (primary identity, NOT CN) - Parses AD CS Template OID/Name from extensions - Determines PeerType (machine/user) from template analysis - Tested on DC01: All checks passed Also includes: - scripts/lab/autounattend.xml for Windows VM provisioning Closes #13, #15 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- setup-lab-ca.ps1: Automates AD CS setup, template creation, GPO - verify-lab-ca.ps1: Validates CA configuration (7 checks) - test-client-enrollment.ps1: Tests machine cert enrollment via SYSTEM context Key improvements based on T-2.7 learnings: - Machine cert enrollment requires SYSTEM context (Scheduled Task) - Template created via ADSI with proper flags - RPC port range restriction (5000-5100) for firewall Closes #24 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix regex for CA name parsing (pipe to Out-String) - Cast PropertyValueCollection to int for bitwise ops - Fix GPO link check using Get-ADObject - Fix RPC port range regex Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- RegisterMachinePeer, SyncMachinePeer, GetMachineRoutes, ReportMachineStatus - MachineIdentity, MachineRegisterRequest/Response, MachineSyncRequest/Response - MachineRoutesRequest/Response, MachineStatusRequest/Response - MachineUpdateType enum Refs #27 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements T-3.4: AllowedDomains pro-Account Scoping - Add AccountID and MatchedDomain fields to MTLSIdentity struct - Add MTLSDomainAccountMapping and MTLSAccountAllowedDomains config - Implement getAccountIDFromDomain() for domain-to-account mapping - Implement getAllowedDomainsForAccount() for per-account domain lists - Implement validateDomainForAccount() for cross-tenant prevention - Add checkMultiAccountSpan() for security logging - Update extractMTLSIdentity() to validate against account domains - Add comprehensive unit tests for account mapping Security: Prevents cross-tenant certificate acceptance by validating that certificate SANs match only the mapped account's allowed domains. Fail-safe: No configured domains = reject all. Closes #30 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement gRPC handlers for machine peer registration using mTLS: - RegisterMachinePeer: Register machine peers via certificate auth - SyncMachinePeer: Streaming sync for machine peers (stub) - GetMachineRoutes: Retrieve DC routes for machine peers (stub) - ReportMachineStatus: Machine status reporting Architectural changes: - Create shared/mtls package for Identity type to avoid import cycles - Update mtls_auth.go to use shared Identity via type alias - Remove duplicate GetMTLSIdentity function The handlers extract mTLS identity from context (set by interceptor) and use AccountID from domain-account mapping for multi-tenant isolation. Closes #32 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ll mTLS support Implements all features from Issue #32: 1. validateIssuerCA - CA-Fingerprint validation per account - Added MTLSAccountAllowedIssuers config field - ValidateIssuerCA function in shared/mtls package - Per Security Review: Empty allowlist = DENY (explicit config required) 2. Meta fields for audit trail - Extended PeerSystemMeta with mTLS-specific fields: - PeerType, AuthMethod, CertDNSName, CertDomain - CertIssuerFP, CertSerial, CertTemplate - FirstAuthTime, LastCertAuthTime - extractMachinePeerMeta enriches metadata with mTLS identity 3. Re-registration logic - LoginPeer handles both new and existing peers - Cross-account registration blocked (security check) - mTLS metadata updated on re-registration 4. Security validations - Issuer CA validation in all Machine Tunnel RPCs - Account isolation via MTLSIdentity.AccountID - Fingerprint-based comparison (not DN string matching) 5. Rate-limit/Replay protection: Stubbed for MVP (TODO) Files changed: - config/config.go: Added MTLSAccountAllowedIssuers - mtls_auth.go: Added ValidateIssuerCA, MTLSConfig updated - shared/mtls/identity.go: ValidatorConfig, ValidateIssuerCA - shared/grpc/machine_tunnel.go: Full implementation - server/peer/peer.go: Extended PeerSystemMeta with mTLS fields Closes #32 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements unique DNS label generation for mTLS-authenticated peers to prevent hostname collisions across different domains. Features: - GenerateUniqueDNSLabel: Creates FQDN-hash based labels Example: "win10-pc.customer-a.local" -> "win10-pc-a1b2c3d4" - ValidateDNSLabel: RFC 1123 compliance check - sanitizeForDNS: Hostname sanitization (underscores, spaces -> hyphens) - CheckDNSLabelCollision: Helper for collision detection Technical details: - 32-bit SHA256 hash suffix (8 hex chars) for ~0.001% collision rate - Automatic hostname truncation for labels > 63 chars - Case-insensitive FQDN hashing - Fallback to IP-based label on validation failure Integration: - AddPeer in peer.go now uses hash-based labels for mTLS peers - Detection via peer.Meta.CertDNSName and peer.Meta.CertDomain fields Unit tests: - Uniqueness across domains/hostnames - Truncation for long hostnames - RFC 1123 validation (all edge cases) - Sanitization (underscores, spaces, special chars) Closes #33 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add MTLSServer type with RequireAndVerifyClientCert on port 33074 - Add MTLSPort config option for dedicated mTLS-only server - Integrate mTLS server lifecycle into BaseServer (Start/Stop) - Add GetMTLSServer() for external service registration - Load CA pool from directory (.crt/.pem/.cer) and/or single file - Initialize mTLS validator config with account-issuer mappings - TLS 1.2+ minimum required for mTLS connections Port 33073 (standard): NoClientCert - user auth, setup keys Port 33074 (mTLS): RequireAndVerifyClientCert - machine tunnel only Closes #34 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix duplicate word 'LoginPeer' in comment (machine_tunnel.go) - Convert if-else chains to switch statements (mtls_auth.go, peer.go) - Add nolint directive for deprecated Audience field test (conversion_test.go) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
We had both .github/PULL_REQUEST_TEMPLATE.md (our custom) and .github/pull_request_template.md (upstream). On macOS with its case-insensitive filesystem, this causes git diff failures in CI. Keep the upstream template (lowercase) for compatibility. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
We had both .github/PULL_REQUEST_TEMPLATE.md (our custom) and .github/pull_request_template.md (upstream). On macOS with its case-insensitive filesystem, this causes git diff failures in CI. Keep the upstream template (lowercase) for compatibility. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
feat(server): Complete S-3 Server mTLS Implementation
Adds Dockerfile.multistage that builds the management server binary inside a golang:1.25 container, solving the ar archive issue. Problem: Building with `go build ./management/cmd/` produced an ar archive instead of an ELF executable because cmd/ has `package cmd` (library), not `package main`. Solution: Use `go build ./management/` which contains main.go with `package main` and `func main()`. Benefits: - No cross-compilation issues (builds inside Linux container) - Produces correct ELF binary (~52MB) - Smaller final image (ubuntu:24.04 base) - Build flags: -ldflags="-s -w" for smaller binary Usage: docker build -f management/Dockerfile.multistage -t netbird-fork/management:latest . Relates to: #93 (T-3.9: Deploy Fork to Lab) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…-5.1) Implement Bootstrap() method that supports two-phase authentication: - Phase 1: Setup-Key authentication for initial enrollment (before cert) - Phase 2: mTLS authentication with machine certificate (after AD CS enrollment) Components: - bootstrap.go: Main bootstrap logic with hasMachineCert() check - bootstrap_test.go: Unit tests for all edge cases (15 tests passing) The bootstrap automatically selects the appropriate auth method: - If MachineCertEnabled and valid cert exists: use mTLS via RegisterMachinePeer RPC - Otherwise: fall back to Setup-Key via standard Login/Register RPC Closes #47 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement DC connectivity checks and domain join helpers: - CheckDCConnectivity: Validates LDAP, Kerberos, DNS, SMB, NTP ports - ValidatePreJoinRequirements: Pre-join checklist with all requirements - GenerateDomainJoinScript: Generates PowerShell script for domain join PowerShell bootstrap script (scripts/bootstrap-new-client.ps1): - Full Phase 1 → Domain Join → Cert → Phase 2 workflow - NTP sync with public NTP (pre-tunnel) and DC (pre-join) - DC connectivity verification via tunnel - Certificate enrollment via AD CS (certreq) - Config update for mTLS transition Tests: 18 new tests for DC connectivity and domain join (all passing) Closes #48 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use net.JoinHostPort() for TCP and UDP port checks (IPv6 compatible) - Extract credential prompt to constant with nolint directive - The prompt message is NOT a credential, just UI text Part of T-5.2 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
feat(build): Add multi-stage Dockerfile for management server
…-fallback feat(client): Add machine tunnel bootstrap and domain join (T-5.1, T-5.2)
- Add ValidateMachineCertificate() for machine cert validation - Add GenerateCertEnrollmentScript() for AD CS enrollment via certreq - Add ParseCertificateFile() for cert info extraction - Add NeedsRenewal() for certificate renewal detection - Add WatchCertificateExpiry() for proactive renewal monitoring - Add ExtractIssuerFingerprint() for mTLS issuer verification - 32 tests covering all cert validation scenarios Validates: - SAN DNSNames (not CN!) matching hostname.domain format - Certificate expiry and minimum validity - Renewal threshold (30 days before expiry) - Case-insensitive hostname matching - Certificate chain for issuer fingerprint Closes T-5.3 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive PowerShell scripts for test environment management: - reset-netbird-machine.ps1: Safely reset NetBird Machine Tunnel - Stops and removes service - Removes WireGuard interface - SCOPED NRPT cleanup (only NetBird-Machine-* prefix, not all rules!) - SCOPED firewall rule cleanup - Optional config backup - verify-nrpt-cleanup.ps1: Verify NRPT cleanup - Checks both registry paths (Policy and Dnscache) - Checks PowerShell Get-DnsClientNrptRule - Reports any remaining NetBird rules - reinstall-and-test.ps1: Automated reinstall and test cycle - Full reset -> install -> start -> verify workflow - Waits for tunnel establishment - Basic connectivity tests CRITICAL: Uses Registry-based scoped cleanup to avoid removing other NRPT rules (GPO, VPN, etc.) Closes T-5.5 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
feat(scripts): Add reset and test scripts for Windows VM testing (T-5.5)
feat(tunnel): Add certificate enrollment after domain join (T-5.3)
- Update to v2.0.0 with Smart Cert Selection (no thumbprint needed) - Add REVOKE Setup-Key warning at script end (Step 8) - Redact Setup-Key in logs (show only last 4 chars: ****-****-****-****-XXXX) - Add security documentation in .NOTES: - SHA256 checksum verification instructions - Authenticode signing instructions - Setup-Key handling best practices - Step 7 now uses machine_cert_template_name + machine_cert_san_must_match - Remove hardcoded thumbprint requirement Tested on Windows VM in WhatIf mode - all 8 steps execute correctly. Closes #50 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
feat(scripts): T-5.4 Bootstrap Script v2.0 with Smart Selection
- Add PowerShell test script (Test-TunnelEstablishment.ps1) with: - TC1: Boot + Login tests (Service, Interface, Routes, DC connectivity) - TC2: DNS-SRV Discovery for LDAP - TC3: DNS-SRV Discovery for Kerberos (UDP/TCP) - TC4: UDP Kerberos connectivity validation - Proper CI exit codes and formatted output - Add Go GUID-based interface verification: - FindWireGuardInterface() with priority search (GUID > Description > Name) - VerifyInterface() for status validation - HasRouteToNetwork() for route checking - Windows-specific via winipcfg, stubs for other platforms - Add GitHub Actions workflow (e2e-tunnel.yml): - Manual workflow_dispatch for lab testing - PowerShell syntax validation - Test result documentation Tested on Windows 11 VM (10.0.0.160): - TC1.2-TC1.6: PASS (WireGuard interface, routes, DC connectivity) - TC2.1b: PASS (LDAP SRV via nslookup) - TC4.1-TC4.2: PASS (DC discovery, UDP Kerberos) Closes #54 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes nilerr lint error by renaming inner error variable. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add waitForServer helper that polls the server until it's ready, preventing flaky test failures when the server goroutine hasn't started listening before the test proceeds. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add waitForServerReady helper that polls the SSH server until it's accepting connections, preventing flaky test failures when the server goroutine hasn't started its Accept loop before tests proceed. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Increase dial timeout from 100ms to 500ms - Use exponential backoff (10ms -> 100ms cap) - Increase overall timeout from 5s to 10s for slow Windows runners - Add better error message including last error details Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Go 1.21+ has builtin min function, custom definition shadows it and triggers golangci-lint predeclared error. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The SSH server may need a moment to reset its state after accepting a probe connection that closes without completing the handshake. Add 100ms delay after successful probe to avoid interference. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace TCP probe with actual SSH connection attempt in waitForServerReady. The previous approach only verified TCP listener was ready, but the SSH server needs more time to initialize its internal state. Now we attempt a real SSH handshake (which will fail auth) to ensure the server is fully operational before tests proceed. This fixes flaky tests on FreeBSD and other platforms where the SSH server's Accept loop wasn't ready when tests started connecting.
Changed waitForServerReady to only consider the server ready when it receives an SSH-protocol response (error containing 'ssh:'). This is more reliable than checking for network-level errors like 'connection refused', which may vary across platforms (FreeBSD, Windows, etc.). The server is ready when: - SSH handshake completes (any ssh: error means we talked SSH) - Connection succeeds (unexpected but valid) Added 200ms delay after successful probe to ensure server's Accept loop is stable for subsequent connections.
…ests
The pattern `defer require.NoError(t, server.Stop())` was calling
Stop() immediately because Go evaluates function arguments at defer
time, not when the deferred function runs.
Changed to `defer func() { require.NoError(t, server.Stop()) }()`
which properly defers the entire call until function exit.
Also simplified waitForServerReady to use lightweight SSH banner
check instead of full SSH handshake.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add trust_windows.go with the following functions required by trusttest/main.go: - GetCertFingerprint: SHA-256 fingerprint of certificate - GetCertPin: SPKI pin (sha256//BASE64) from cert file - GetCertPinFromDER: SPKI pin from DER-encoded cert - VerifyServerCert: Verify cert against pin - VerifyServerCertChain: Verify cert chain against pin - InstallCACert: Install CA cert to Windows store (certutil) - RemoveCACert: Remove CA cert from Windows store (certutil) - TrustStoreRoot/TrustStoreCA: Store type constants Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add security_windows.go with DPAPI encryption, ACL hardening, and SecureConfig for encrypted setup key management - Add eventlog_windows.go with Windows Event Log integration - Add nolint:forbidigo comments to CLI test tools (securitytest, trusttest) - Implements T-5.6 security features for Windows pre-login VPN Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Handle LocalFree return values properly in security_windows.go - Convert if-else chain to switch statement in securitytest/main.go Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
feat(e2e): Add tunnel establishment E2E tests (T-6.1)
Add comprehensive mTLS authentication tests including: Unit Tests (mtls_test.go): - TestMTLSCertGeneration: Certificate generation for mTLS - TestMTLSMultiSANValidation: TC22/TC23 Multi-SAN validation logic - TestMTLSIssuerValidation: TC19/TC27 Issuer fingerprint validation E2E Test Tool (cmd/mtlstest): - TC19: Wrong CA rejection test - TC21: No client cert → Unauthenticated test - TC22: Multi-SAN with partial match acceptance - TC23: Multi-SAN with no match rejection - TC27: Issuer fingerprint validation demonstration The mtlstest binary is designed to run on Windows VMs against the mTLS server (port 33074) to validate: - Client cert requirement enforcement - Domain-to-account mapping - Issuer fingerprint validation via VerifiedChains Closes #55 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
|
Opened on wrong repository by mistake. This should go to our fork obtFusi/netbird-fork. |
|
Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughIntroduces comprehensive Windows pre-login VPN machine tunnel with two-phase bootstrap (Setup-Key then mTLS), automatic certificate enrollment, domain join automation, management server mTLS authentication, peer registration/sync RPCs, and supporting infrastructure including git hooks, GitHub workflows, PowerShell scripts, and test certificates. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as NetBird Client
participant Mgmt as Management Server
participant CA as Windows CA
participant DC as Domain Controller
Client->>Client: Phase 1: Setup-Key Bootstrap
activate Client
Client->>Mgmt: Connect with SetupKey
Mgmt->>Mgmt: Validate SetupKey
Mgmt->>Client: Return PeerConfig, NetbirdConfig
Client->>Client: Generate WireGuard Key
Client->>Client: Store in encrypted config
deactivate Client
Client->>Client: Configure Tunnel & Join Domain
activate Client
Client->>DC: Verify DC Connectivity (LDAP/Kerberos/DNS)
DC->>Client: Acknowledge
Client->>CA: Enroll Machine Certificate
CA->>Client: Issue Cert with SAN (hostname.domain)
Client->>Client: Validate & store cert
deactivate Client
Client->>Client: Phase 2: mTLS Bootstrap
activate Client
Client->>Mgmt: RegisterMachinePeer with Client Cert (mTLS)
Mgmt->>Mgmt: Extract Identity from Cert<br/>(DNSName, Domain, Issuer)
Mgmt->>Mgmt: Validate Issuer Fingerprint
Mgmt->>Mgmt: Map Domain to Account
Mgmt->>Client: Return MachineIdentity,<br/>AllowedDCRoutes, DNSConfig
Client->>Client: Update config, remove SetupKey
deactivate Client
Client->>Mgmt: SyncMachinePeer (mTLS stream)
activate Client
activate Mgmt
Mgmt->>Mgmt: Authorize via mTLS Identity
Mgmt->>Client: NetworkMap, DNSConfig
Client->>Client: Apply routes & DNS
deactivate Mgmt
deactivate Client
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
| types: [opened] | ||
|
|
||
| permissions: | ||
| issues: write |
Check notice
Code scanning / SonarCloud
Write permissions should be defined at the job level Low
|
|
||
| permissions: | ||
| issues: write | ||
| pull-requests: write |
Check notice
Code scanning / SonarCloud
Write permissions should be defined at the job level Low




Summary
mtlstestE2E test binary for Windows VMsTest Cases Implemented
Unit Test Results
Infrastructure Verification
E2E Test Tool Usage
Files Changed
client/internal/tunnel/mtls_test.go- Unit testsclient/internal/tunnel/cmd/mtlstest/main.go- E2E test binaryCloses #55
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.