
Fix connection reuse + port pre-flight#6715

Open
Mzack9999 wants to merge 12 commits into dev from feat-6702-http-client

Conversation


@Mzack9999 Mzack9999 commented Dec 18, 2025

Proposed changes

Checklist

  • Pull request is created against the dev branch
  • All checks passed (lint, unit/integration/regression tests etc.) with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Summary by CodeRabbit

  • New Features
    • Preflight port scanning to detect and filter responsive targets before execution
    • Per-host rate limiting for finer-grained throttling
    • Per-host HTTP client pooling to optimize connections
    • HTTP client sharding to distribute load across multiple pools
    • Connection reuse tracking with per-host statistics
    • Automatic HTTP→HTTPS port mismatch detection and correction
    • New CLI options to enable/disable these behaviors


@Mzack9999 Mzack9999 self-assigned this Dec 18, 2025

coderabbitai bot commented Dec 18, 2025

Walkthrough

Adds optional preflight TCP port scanning, per-host HTTP client pooling, sharded HTTP clients, per-host rate limiting, connection-reuse and HTTP→HTTPS port trackers, new CLI flags and Options toggles, and supporting protocolstate and pool implementations.

Changes

Cohort / File(s) Summary
CLI & Configuration
cmd/nuclei/main.go, pkg/types/types.go
New CLI flags and Options fields: preflight-portscan, per-host-client-pool, http-client-shards, per-host-rate-limit.
Preflight Port Scanning
internal/runner/preflight_portscan.go
New preflight TCP connect scanner: derives ports, resolves IPs, runs parallel batched scans, filters the input provider, and reports statistics.
Runner & Execution
internal/runner/runner.go
Integrates optional preflight phase, toggles global rate limiter when per-host rate limiting enabled, records input count in protocolstate, and emits per-host/shard/trackers stats.
Protocol State
pkg/protocols/common/protocolstate/dialers.go, pkg/protocols/common/protocolstate/state.go
Added Dialers fields (PerHostHTTPPool, PerHostRateLimitPool, ConnectionReuseTracker, HTTPToHTTPSPortTracker, ShardedHTTPPool, InputCount) and exported SetInputCount(); InputCount initialized on dialers setup.
HTTP Request Flow
pkg/protocols/http/http.go, pkg/protocols/http/build_request.go, pkg/protocols/http/request.go, pkg/protocols/http/request_fuzz.go
Added ConnectionReusePolicy and AnalyzeConnectionReuse(); policy-driven keep-alive handling; introduced per-host rate limiting calls, per-target client selection, connection reuse tracking via httptrace, and HTTP→HTTPS mismatch detection/recording.
HTTP Client Pooling & Rate Limits
pkg/protocols/http/httpclientpool/clientpool.go
Added public APIs for per-target client retrieval, per-host rate limiter access/recording, connection reuse and HTTP→HTTPS trackers; early tracker initialization; routing to per-host or sharded pools.
Per-Host Client Pool
pkg/protocols/http/httpclientpool/perhost_pool.go
New expirable LRU-backed per-host HTTP client pool with GetOrCreate, eviction, stats, client info, resize, and close operations.
Per-Host Rate Limit Pool
pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go
New expirable LRU-backed per-host rate limiter pool with GetOrCreate, eviction, PPS tracking, stats, and request recording.
Sharded Client Pool
pkg/protocols/http/httpclientpool/sharded_pool.go
New sharded HTTP client pool with host-to-shard hashing, configurable shard count (Default 16, Min 4, Max 256), per-shard stats, and client creation per shard.
Connection Reuse Tracker
pkg/protocols/http/httpclientpool/connection_reuse_tracker.go
New ConnectionReuseTracker with global and per-host reuse/new counts, LRU expirable cache, stats and print helpers, and thread-safe recording.
HTTP-to-HTTPS Port Tracker
pkg/protocols/http/httpclientpool/http_to_https_tracker.go
New tracker marking host:port entries requiring HTTPS, with detection/correction counters and stats reporting.
Utilities & Tests
pkg/protocols/utils/http/requtils.go, lib/tests/sdk_test.go, pkg/protocols/common/protocolstate/memguardian_test.go, pkg/protocols/http/request_test.go
ShouldDisableKeepAlive tightened to require TemplateThreads == 0; goleak ignores added for expirable LRU TTL goroutine in tests.
Dependencies
go.mod
github.com/hashicorp/golang-lru/v2 moved from indirect to direct require (v2.0.7).
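
The host-to-shard hashing described for the sharded pool can be sketched as follows. Only the default/min/max shard counts (16/4/256) come from the summary above; the function name and the FNV-1a choice are illustrative assumptions, not the pool's actual implementation:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a host key to a shard index with FNV-1a hashing.
// Clamp bounds mirror the defaults stated above (default 16, min 4,
// max 256); this is an illustrative sketch, not the real code.
func shardFor(host string, shardCount int) int {
	if shardCount < 4 {
		shardCount = 4
	}
	if shardCount > 256 {
		shardCount = 256
	}
	h := fnv.New32a()
	h.Write([]byte(host))
	return int(h.Sum32() % uint32(shardCount))
}

func main() {
	// Same host always lands on the same shard, spreading distinct
	// hosts across independent client pools.
	fmt.Println(shardFor("example.com:443", 16) == shardFor("example.com:443", 16))
}
```

Deterministic hashing means all requests for one host reuse the same shard's idle connections, which is the point of the design.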

Sequence Diagram(s)

sequenceDiagram
    participant Runner
    participant Preflight as Preflight Scanner
    participant Resolver as DNS Resolver
    participant PortScanner as TCP Port Scanner
    participant InputProv as Input Provider
    participant Stats as Stats Reporter

    Note over Runner,Preflight: Enabled when --preflight-portscan is set

    Runner->>Preflight: start preflight (templates, ports)
    Preflight->>Resolver: resolve hostnames -> IPs (parallel)
    Resolver-->>Preflight: resolved IP lists

    loop per input target
        Preflight->>PortScanner: scan candidate ports on resolved IPs (batched/parallel)
        alt open port found
            PortScanner-->>Preflight: record open port, mark kept
        else none open
            PortScanner-->>Preflight: mark filtered
        end
    end

    Preflight->>InputProv: wrap provider to expose only kept targets
    Preflight->>Stats: emit summary (total, kept, filtered, per-port counts)
    Runner->>InputProv: proceed with filtered inputs
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

"🐰
New pools and trackers hop in tune,
Preflight probes by afternoon,
Per-host limits, shards that bite,
Connections reused through the night,
A carrot-code chorus—hop, delight!"

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 64.71%, below the required 80.00% threshold. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title 'Fix connection reuse + port pre-flight' directly relates to the main changes in the changeset, which introduce connection reuse tracking, preflight port scanning, and related optimizations across multiple files.
✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0de52bb and 6bc5df6.

📒 Files selected for processing (2)
  • go.mod
  • pkg/protocols/common/protocolstate/state.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/protocols/common/protocolstate/state.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Lint
🔇 Additional comments (1)
go.mod (1)

84-84: golang-lru/v2 v2.0.7 is current and free of known vulnerabilities. The dependency is the latest stable release with no CVEs in public databases, making it safe for the HTTP client pooling, rate limiting, and connection-reuse tracking features.



@dwisiswant0 dwisiswant0 added this to the v3.7.0-alpha.1 milestone Dec 23, 2025
@Ice3man543 Ice3man543 marked this pull request as ready for review December 24, 2025 14:07
@auto-assign auto-assign bot requested a review from dogancanbakir December 24, 2025 14:07
@Ice3man543

@coderabbitai review


coderabbitai bot commented Dec 24, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Ice3man543 Ice3man543 self-requested a review December 24, 2025 14:08

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (2)
pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go (1)

211-261: Host normalization expects URLs/host:port, not full HTTP requests

normalizeHostForRateLimit (and extractHostPortFromString) correctly handle URLs, bare hosts, and host:port strings, but will mis-normalize if given a full HTTP request line (e.g., "GET http://example.com/ HTTP/1.1\r\n..."). This leads to odd keys such as "GET:80" and breaks per-host grouping.

The pool itself is fine; just ensure call sites pass URLs or host:port (not Request.String()), as suggested in the request.go comment.

Also applies to: 263-307
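
A minimal sketch of the expected call-site contract: pass a URL or host:port and get back a stable host:port key. The name and the exact normalization rules here are illustrative assumptions, not the code under review:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeKey sketches the contract: URLs and host:port inputs collapse
// to a stable "host:port" key suitable for per-host grouping.
func normalizeKey(input string) string {
	if strings.Contains(input, "://") {
		if u, err := url.Parse(input); err == nil && u.Host != "" {
			port := u.Port()
			if port == "" {
				if u.Scheme == "https" {
					port = "443"
				} else {
					port = "80"
				}
			}
			return u.Hostname() + ":" + port
		}
	}
	if strings.Contains(input, ":") {
		return input // already host:port
	}
	return input + ":80"
}

func main() {
	fmt.Println(normalizeKey("http://example.com/")) // example.com:80
	fmt.Println(normalizeKey("https://example.com")) // example.com:443
	fmt.Println(normalizeKey("example.com:8443"))    // example.com:8443
}
```

A raw request line like "GET http://example.com/ HTTP/1.1" satisfies neither input shape, which is why call sites must pass URLs or host:port values.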

pkg/protocols/http/httpclientpool/sharded_pool.go (1)

276-281: TLS security configuration: InsecureSkipVerify and MinVersion: TLS 1.0.

These settings were already flagged by CodeQL in past reviews. While InsecureSkipVerify: true is typical for security scanning tools that need to test any target, and TLS 1.0 support may be intentional for compatibility, this is worth acknowledging.

For a security scanning tool like Nuclei, these settings are likely intentional to maximize target compatibility during vulnerability assessment.

🧹 Nitpick comments (19)
pkg/protocols/common/protocolstate/memguardian_test.go (1)

103-110: Consider adding the expirable LRU exclusion for consistency.

TestMemGuardianReset has its own goleak.VerifyNone call but doesn't include the github.com/hashicorp/golang-lru/v2/expirable exclusion that was added to TestMemGuardianGoroutineLeak. If this test can trigger code paths that use the expirable LRU cache, it may become flaky.

🔎 Proposed fix
 func TestMemGuardianReset(t *testing.T) {
 	defer goleak.VerifyNone(t,
 		goleak.IgnoreAnyContainingPkg("go.opencensus.io/stats/view"),
 		goleak.IgnoreAnyContainingPkg("github.com/syndtr/goleveldb"),
 		goleak.IgnoreAnyContainingPkg("github.com/go-rod/rod"),
 		goleak.IgnoreAnyContainingPkg("github.com/projectdiscovery/interactsh/pkg/server"),
 		goleak.IgnoreAnyContainingPkg("github.com/projectdiscovery/ratelimit"),
+		// expirable LRU cache creates a background goroutine for TTL expiration that persists
+		// see: https://github.com/hashicorp/golang-lru/blob/770151e9c8cdfae1797826b7b74c33d6f103fbd8/expirable/expirable_lru.go#L79
+		goleak.IgnoreAnyContainingPkg("github.com/hashicorp/golang-lru/v2/expirable"),
 	)
pkg/protocols/common/protocolstate/state.go (1)

212-214: Minor: Redundant initialization.

Setting dialersInstance.InputCount = 0 is redundant since Go zero-initializes struct fields. The comment explains intent, but the explicit assignment adds no value.

🔎 Proposed simplification
-	// Set input count for sharding calculation (will be updated later when input provider is ready)
-	dialersInstance.InputCount = 0
-
+	// InputCount will be updated later when input provider is ready (via SetInputCount)
 	return nil
pkg/protocols/http/request_fuzz.go (1)

184-187: Prefer using generated request URL for per-host rate limiting

Here you use input.MetaInput.Input as the hostname, which may not reflect the final fuzzed request (e.g. when the rule changes host/port or when Input is a raw HTTP request string). Since gr.Request is already the concrete HTTP request, consider deriving the URL/host from it (e.g. its URL) and passing that into rateLimitTake for more accurate per-host limiting, falling back to input.MetaInput.Input only if needed.

internal/runner/runner.go (2)

688-694: Preflight failures currently abort the scan

preflightResolveAndPortScan is treated as mandatory when PreflightPortScan is enabled: any error (e.g., dialers not initialized, DNS resolution setup issues) will fail RunEnumeration. If preflight is meant to be a best-effort optimization rather than a hard requirement, you may want to log and continue instead of returning an error here, or at least document that enabling the flag can cause the whole scan to abort on preflight errors.
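
The log-and-continue alternative, as a standalone sketch; runPreflight is a stand-in for preflightResolveAndPortScan:

```go
package main

import (
	"errors"
	"fmt"
)

// runPreflight stands in for the real preflight call; the sketch shows
// the best-effort option: warn and continue with the unfiltered inputs
// instead of failing the whole enumeration.
func runPreflight() error {
	return errors.New("dialers not initialized")
}

func main() {
	if err := runPreflight(); err != nil {
		fmt.Printf("preflight skipped (best-effort): %v\n", err)
		// fall through: keep the original input provider
	}
	fmt.Println("enumeration proceeds")
}
```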


717-733: Dialer stats printing could reuse a single lookup and optionally lock

At the end of the run you call protocolstate.GetDialersWithId multiple times and then read PerHostHTTPPool, PerHostRateLimitPool, ConnectionReuseTracker, ShardedHTTPPool, and HTTPToHTTPSPortTracker without taking the embedded mutex. In practice this runs after enumeration and is unlikely to race, but to keep it robust:

  • Fetch dialers once into a local variable.
  • Consider taking dialers.Lock() around reads of the pool/tracker fields.

This avoids potential data races if any background worker still touches dialers.

Also applies to: 770-813
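
A hypothetical mirror of the suggestion, with protocolstate.Dialers reduced to two fields for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// Dialers is a reduced stand-in for protocolstate.Dialers, showing the
// shape of the fix: fetch the struct once, copy the pool/tracker fields
// under the embedded mutex, then read the copies outside the lock.
type Dialers struct {
	sync.Mutex
	PerHostHTTPPool any
	ShardedHTTPPool any
}

func printStats(d *Dialers) {
	d.Lock()
	perHost, sharded := d.PerHostHTTPPool, d.ShardedHTTPPool
	d.Unlock()
	// Logging happens outside the lock, on the copied references.
	fmt.Println(perHost != nil, sharded != nil)
}

func main() {
	d := &Dialers{PerHostHTTPPool: struct{}{}}
	printStats(d)
}
```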

pkg/protocols/common/protocolstate/dialers.go (1)

18-23: New dialer fields are fine, but consider concrete types long-term

The added fields (PerHostHTTPPool, PerHostRateLimitPool, ConnectionReuseTracker, HTTPToHTTPSPortTracker, ShardedHTTPPool) are all any, which is consistent with avoiding import cycles but pushes type-safety to call sites via type assertions. If you later introduce a thin local interface for each (e.g. exposing just PrintStats / GetOrCreate), you can keep protocolstate decoupled while still getting compile-time guarantees.

internal/runner/preflight_portscan.go (2)

36-66: Filtering input provider assumes a fixed post-preflight target set

filteringInputProvider wraps the base provider and exposes Count() as a fixed allowCnt while delegating Iterate to the underlying provider and filtering by allowed. This is fine as long as the base provider is not mutated after preflight. If you ever add inputs dynamically later, Count() and the actual number of iterated targets could diverge; documenting that assumption (or computing Count() from allowed on demand) would make the behavior clearer.
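
Computing Count() from the allowed set on demand could look like this sketch, with the provider type reduced to the relevant field:

```go
package main

import "fmt"

// filteringInputProvider keeps only the allowed set here; deriving
// Count() from it means the count cannot diverge from iteration,
// assuming the base provider is not mutated after preflight.
type filteringInputProvider struct {
	allowed map[string]struct{}
}

func (f *filteringInputProvider) Count() int64 {
	return int64(len(f.allowed))
}

func main() {
	p := &filteringInputProvider{allowed: map[string]struct{}{
		"a.example.com": {},
		"b.example.com": {},
	}}
	fmt.Println(p.Count()) // 2
}
```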


71-91: Preflight resolver/scan flow is solid; double-check Fastdialer initialization and unused param

The preflight workflow (template-driven port set, resolve with Fastdialer, then short TCP dials with early stop) is well-structured and bounded. A couple of small points:

  • dialers := protocolstate.GetDialersWithId(r.options.ExecutionId) is checked for nil, but dialers.Fastdialer is assumed non-nil; if preflight can run before dialers are fully initialized, this will panic rather than fail gracefully.
  • preflightOneResolved receives dialers *protocolstate.Dialers but doesn’t use it; you can drop that parameter to simplify the signature unless you plan to use it later.

If Fastdialer is guaranteed to be initialized before preflightResolveAndPortScan runs, consider adding a brief comment to that effect.

Also applies to: 151-188, 335-444

pkg/protocols/http/httpclientpool/http_to_https_tracker.go (2)

55-76: Side effect in read operation: RequiresHTTPS increments counter on every call.

RequiresHTTPS is semantically a query/read operation, but it increments totalCorrections each time it's called and returns true. This means:

  1. Multiple calls for the same host:port will inflate the corrections count
  2. The counter tracks "checks that returned true" rather than actual corrections applied

If the intent is to track actual corrections made (e.g., when a URL is rewritten), consider moving the counter increment to the caller where the correction is applied.

🔎 Proposed refactor: separate query from recording
 // RequiresHTTPS checks if a host:port requires HTTPS
 func (t *HTTPToHTTPSPortTracker) RequiresHTTPS(hostPort string) bool {
 	if hostPort == "" {
 		return false
 	}

 	normalizedHostPort := normalizeHostPortForTracker(hostPort)
 	if normalizedHostPort == "" {
 		return false
 	}

 	requiresHTTPS, ok := t.ports.Get(normalizedHostPort)
 	if !ok {
 		return false
 	}

-	if requiresHTTPS {
-		t.totalCorrections.Add(1)
-	}
-
 	return requiresHTTPS
 }
+
+// RecordCorrection records that a correction was applied
+func (t *HTTPToHTTPSPortTracker) RecordCorrection() {
+	t.totalCorrections.Add(1)
+}

108-203: Code duplication: normalizeHostPortForTracker and extractHostPortFromStringForHTTPS are nearly identical to functions in connection_reuse_tracker.go.

The functions normalizeHostPortForTracker/extractHostPortFromStringForHTTPS here and normalizeHostForConnectionReuse/extractHostPortFromStringForReuse in connection_reuse_tracker.go share almost identical logic. Consider extracting a shared helper to reduce duplication and maintenance burden.

pkg/protocols/http/httpclientpool/connection_reuse_tracker.go (3)

179-214: Double-checked locking uses Peek instead of Get inside the lock.

In getOrCreateEntry, the second check at line 188 uses cache.Peek() while the first check at line 180 uses cache.Get(). This is intentional and correct for LRU caches—Get updates access time (touching the entry), while Peek does not. However, the explicit Store(0) calls at lines 196-205 are redundant since Go zero-initializes atomic values.

🔎 Remove redundant zero-initialization
 	entry := &connectionReuseEntry{
 		host:      normalizedHost,
 		createdAt: time.Now(),
 	}
-	entry.totalConnections.Store(0)
-	entry.totalReused.Store(0)
-	entry.totalNewConnections.Store(0)
-	entry.accessCount.Store(0)
-	entry.totalHTTPConnections.Store(0)
-	entry.totalHTTPSConnections.Store(0)
-	entry.totalHTTPReused.Store(0)
-	entry.totalHTTPSReused.Store(0)
-	entry.totalHTTPNewConnections.Store(0)
-	entry.totalHTTPSNewConnections.Store(0)
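
The double-checked pattern under discussion, reduced to a plain map so it runs standalone; the real code uses the expirable LRU's Get on the fast path (which refreshes recency) and Peek for the in-lock recheck (which does not):

```go
package main

import (
	"fmt"
	"sync"
)

// registry stands in for the LRU-backed tracker; both lookups here are
// plain map reads, with a comment marking where Peek belongs.
type registry struct {
	mu sync.Mutex
	m  map[string]*int
}

func (r *registry) getOrCreate(key string) *int {
	r.mu.Lock()
	defer r.mu.Unlock()
	if v, ok := r.m[key]; ok { // recheck under the lock (Peek in the LRU version)
		return v
	}
	v := new(int) // zero value already; explicit Store(0) calls add nothing
	r.m[key] = v
	return v
}

func main() {
	r := &registry{m: map[string]*int{}}
	a := r.getOrCreate("example.com:443")
	b := r.getOrCreate("example.com:443")
	fmt.Println(a == b, *a) // same entry, zero-initialized
}
```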

207-211: Unused variable evicted is assigned but only used in a no-op statement.

The return value from cache.Add is assigned to evicted but then discarded with _ = evicted. This is dead code. The Add method returns a boolean indicating if an entry was evicted, which could be logged or tracked.

🔎 Simplify or use the eviction indicator
-	evicted := t.cache.Add(normalizedHost, entry)
-	if evicted {
-		_ = evicted
-		// Entry was evicted, but we still return the new entry
-	}
+	_ = t.cache.Add(normalizedHost, entry)

272-377: PrintPerHostStats holds the mutex while performing potentially slow logging operations.

The method acquires t.mu.Lock() at line 277 and holds it throughout the entire iteration and logging (lines 295-376). This blocks other operations like RecordConnection that also need the mutex via getOrCreateEntry. Consider collecting data under the lock and releasing it before logging.

🔎 Proposed refactor to reduce lock contention
 func (t *ConnectionReuseTracker) PrintPerHostStats() {
 	if t.Size() == 0 {
 		return
 	}

+	// Collect stats under lock
 	t.mu.Lock()
-	defer t.mu.Unlock()
-
 	hostStats := []struct {
 		// ... fields ...
 	}{}

 	for _, key := range t.cache.Keys() {
 		// ... collect stats ...
 	}
+	t.mu.Unlock()

 	if len(hostStats) == 0 {
 		return
 	}

+	// Log outside the lock
 	gologger.Info().Msgf("[connection-reuse-tracker] Per-host connection reuse:")
 	for _, stat := range hostStats {
 		// ... logging ...
 	}
 }
pkg/protocols/http/httpclientpool/clientpool.go (1)

41-46: Trackers initialized but return values discarded.

These calls ensure trackers are created early, but discarding the return values means any initialization errors would be silently ignored. Consider logging or returning an error if initialization fails.

pkg/protocols/http/httpclientpool/perhost_pool.go (3)

137-170: Inconsistent normalization: this function returns scheme://host:port while others return host:port.

The normalizeHost function here returns a full URL format (e.g., https://example.com:443), while normalizeHostForConnectionReuse and normalizeHostPortForTracker in other files return just host:port format. This could cause confusion when correlating data across pools and trackers.

Additionally, the conditions at lines 162-167 are redundant: if port != "" at line 158, line 159 returns, so by lines 162-167 port == "" always holds and the checks add nothing.

🔎 Simplify unreachable code
 	port := parsed.Port()
 	if port != "" {
 		return fmt.Sprintf("%s://%s:%s", scheme, parsed.Hostname(), port)
 	}

-	if scheme == "https" && port == "" {
+	if scheme == "https" {
 		return fmt.Sprintf("%s://%s:443", scheme, parsed.Hostname())
 	}
-	if scheme == "http" && port == "" {
-		return fmt.Sprintf("%s://%s:80", scheme, parsed.Hostname())
-	}
-
-	return fmt.Sprintf("%s://%s", scheme, host)
+	return fmt.Sprintf("%s://%s:80", scheme, parsed.Hostname())

237-246: Hit rate calculation is biased downward by the +1 guard in the denominator.

At line 244, the hit rate is calculated as Hits * 100 / (Hits + Misses + 1). The +1 prevents division by zero but artificially lowers the hit rate, especially for small sample sizes. A 100% hit rate becomes ~99.9% for 1000 hits.

🔎 Fix hit rate calculation
 func (p *PerHostClientPool) PrintStats() {
 	stats := p.Stats()
 	if stats.Size == 0 {
 		return
 	}
+	hitRate := float64(0)
+	total := stats.Hits + stats.Misses
+	if total > 0 {
+		hitRate = float64(stats.Hits) * 100 / float64(total)
+	}
 	gologger.Verbose().Msgf("[perhost-pool] Connection reuse stats: Hits=%d Misses=%d HitRate=%.1f%% Hosts=%d",
-		stats.Hits, stats.Misses,
-		float64(stats.Hits)*100/float64(stats.Hits+stats.Misses+1),
-		stats.Size)
+		stats.Hits, stats.Misses, hitRate, stats.Size)
 }

248-249: Empty PrintTransportStats method.

This method has no implementation. If it's intended as a placeholder for future functionality, consider adding a TODO comment or removing it if not needed.

pkg/protocols/http/httpclientpool/sharded_pool.go (2)

98-103: Unused conditional: baseMaxIdleConnsPerHost is set to 500 in both branches.

Lines 99-103 check if baseConfig.Threads == 0 but set baseMaxIdleConnsPerHost = 500 in both cases, making the condition pointless.

🔎 Simplify redundant conditional
-	// Base max idle conns per host (from existing logic: 500 when threading enabled)
-	baseMaxIdleConnsPerHost := 500
-	if baseConfig.Threads == 0 {
-		// If no threading, we still want some pooling for sharding
-		baseMaxIdleConnsPerHost = 500
-	}
+	// Base max idle conns per host for sharding (consistent regardless of threading)
+	baseMaxIdleConnsPerHost := 500

225-391: Significant code duplication: wrappedGetWithCustomMaxIdle duplicates ~150 lines from wrappedGet in clientpool.go.

This function replicates most of the client creation logic from wrappedGet. Changes to TLS configuration, proxy handling, or transport settings would need to be made in both places. Consider refactoring to share the common logic.

🔎 Potential refactoring approach

Extract the common transport and client creation logic into a shared helper function that accepts maxIdleConnsPerHost as a parameter:

func createHTTPClientWithConfig(
    options *types.Options,
    configuration *Configuration,
    maxIdleConnsPerHost int,
    enableCookieJar bool,
) (*retryablehttp.Client, error) {
    // ... shared logic ...
}

Then both wrappedGet and wrappedGetWithCustomMaxIdle can call this helper with their respective parameters.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c80ac99 and 0de52bb.

📒 Files selected for processing (21)
  • cmd/nuclei/main.go
  • go.mod
  • internal/runner/preflight_portscan.go
  • internal/runner/runner.go
  • lib/tests/sdk_test.go
  • pkg/protocols/common/protocolstate/dialers.go
  • pkg/protocols/common/protocolstate/memguardian_test.go
  • pkg/protocols/common/protocolstate/state.go
  • pkg/protocols/http/build_request.go
  • pkg/protocols/http/http.go
  • pkg/protocols/http/httpclientpool/clientpool.go
  • pkg/protocols/http/httpclientpool/connection_reuse_tracker.go
  • pkg/protocols/http/httpclientpool/http_to_https_tracker.go
  • pkg/protocols/http/httpclientpool/perhost_pool.go
  • pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go
  • pkg/protocols/http/httpclientpool/sharded_pool.go
  • pkg/protocols/http/request.go
  • pkg/protocols/http/request_fuzz.go
  • pkg/protocols/http/request_test.go
  • pkg/protocols/utils/http/requtils.go
  • pkg/types/types.go
🧰 Additional context used
📓 Path-based instructions (5)
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.go: Format Go code using go fmt ./...
Run static analysis using go vet ./... on Go code

Files:

  • pkg/protocols/utils/http/requtils.go
  • pkg/protocols/common/protocolstate/memguardian_test.go
  • pkg/protocols/http/build_request.go
  • pkg/protocols/common/protocolstate/state.go
  • pkg/protocols/http/request_fuzz.go
  • pkg/protocols/http/request_test.go
  • cmd/nuclei/main.go
  • pkg/protocols/common/protocolstate/dialers.go
  • internal/runner/runner.go
  • lib/tests/sdk_test.go
  • pkg/protocols/http/httpclientpool/connection_reuse_tracker.go
  • pkg/protocols/http/httpclientpool/clientpool.go
  • pkg/protocols/http/httpclientpool/perhost_pool.go
  • pkg/protocols/http/request.go
  • pkg/protocols/http/httpclientpool/http_to_https_tracker.go
  • pkg/protocols/http/httpclientpool/sharded_pool.go
  • pkg/protocols/http/http.go
  • internal/runner/preflight_portscan.go
  • pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go
  • pkg/types/types.go
pkg/protocols/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

pkg/protocols/**/*.go: Each protocol implementation should implement the Request interface with Compile(), ExecuteWithResults(), Match(), and Extract() methods
Protocol implementations should embed Operators for matching/extraction functionality

Files:

  • pkg/protocols/utils/http/requtils.go
  • pkg/protocols/common/protocolstate/memguardian_test.go
  • pkg/protocols/http/build_request.go
  • pkg/protocols/common/protocolstate/state.go
  • pkg/protocols/http/request_fuzz.go
  • pkg/protocols/http/request_test.go
  • pkg/protocols/common/protocolstate/dialers.go
  • pkg/protocols/http/httpclientpool/connection_reuse_tracker.go
  • pkg/protocols/http/httpclientpool/clientpool.go
  • pkg/protocols/http/httpclientpool/perhost_pool.go
  • pkg/protocols/http/request.go
  • pkg/protocols/http/httpclientpool/http_to_https_tracker.go
  • pkg/protocols/http/httpclientpool/sharded_pool.go
  • pkg/protocols/http/http.go
  • pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go
cmd/nuclei/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Main CLI entry point with flag parsing and configuration should be located in cmd/nuclei

Files:

  • cmd/nuclei/main.go
internal/runner/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Core scanning orchestration logic should be implemented in internal/runner

Files:

  • internal/runner/runner.go
  • internal/runner/preflight_portscan.go
lib/**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

SDK for embedding nuclei as a library should be implemented in lib/

Files:

  • lib/tests/sdk_test.go
🧬 Code graph analysis (12)
pkg/protocols/http/build_request.go (1)
pkg/protocols/http/http.go (2)
  • ReuseUnsafe (573-573)
  • ReuseSafe (571-571)
pkg/protocols/http/request_fuzz.go (1)
pkg/protocols/common/contextargs/metainput.go (1)
  • MetaInput (18-30)
cmd/nuclei/main.go (1)
pkg/protocols/http/httpclientpool/perhost_pool.go (1)
  • PerHostClientPool (15-23)
pkg/protocols/common/protocolstate/dialers.go (3)
pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go (1)
  • PerHostRateLimitPool (20-30)
pkg/protocols/http/httpclientpool/connection_reuse_tracker.go (1)
  • ConnectionReuseTracker (16-32)
pkg/protocols/http/httpclientpool/http_to_https_tracker.go (1)
  • HTTPToHTTPSPortTracker (17-23)
internal/runner/runner.go (5)
pkg/utils/utils.go (1)
  • GetRateLimiter (80-85)
pkg/protocols/common/protocolstate/state.go (2)
  • SetInputCount (219-229)
  • GetDialersWithId (40-46)
pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go (1)
  • PerHostRateLimitPool (20-30)
pkg/protocols/http/httpclientpool/clientpool.go (1)
  • GetConnectionReuseTracker (514-536)
pkg/protocols/http/httpclientpool/http_to_https_tracker.go (1)
  • HTTPToHTTPSPortTracker (17-23)
pkg/protocols/http/httpclientpool/connection_reuse_tracker.go (1)
pkg/protocols/utils/variables.go (4)
  • Scheme (50-50)
  • Hostname (44-44)
  • Host (45-45)
  • Port (46-46)
pkg/protocols/http/httpclientpool/clientpool.go (4)
pkg/protocols/http/httpclientpool/perhost_pool.go (2)
  • PerHostClientPool (15-23)
  • NewPerHostClientPool (31-60)
pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go (2)
  • PerHostRateLimitPool (20-30)
  • NewPerHostRateLimitPool (43-81)
pkg/protocols/http/httpclientpool/connection_reuse_tracker.go (2)
  • ConnectionReuseTracker (16-32)
  • NewConnectionReuseTracker (51-82)
pkg/protocols/http/httpclientpool/http_to_https_tracker.go (2)
  • HTTPToHTTPSPortTracker (17-23)
  • NewHTTPToHTTPSPortTracker (26-30)
pkg/protocols/http/httpclientpool/http_to_https_tracker.go (2)
pkg/protocols/http/httpclientpool/clientpool.go (1)
  • Get (177-187)
pkg/protocols/utils/variables.go (4)
  • Scheme (50-50)
  • Hostname (44-44)
  • Host (45-45)
  • Port (46-46)
pkg/protocols/http/http.go (2)
pkg/protocols/utils/http/requtils.go (1)
  • ShouldDisableKeepAlive (49-52)
pkg/protocols/http/httpclientpool/clientpool.go (2)
  • Configuration (83-98)
  • ConnectionConfiguration (51-59)
internal/runner/preflight_portscan.go (5)
pkg/input/provider/interface.go (1)
  • InputProvider (60-75)
pkg/input/types/probe.go (1)
  • InputLivenessProbe (4-9)
pkg/protocols/common/contextargs/metainput.go (1)
  • MetaInput (18-30)
pkg/protocols/common/protocolstate/state.go (1)
  • GetDialersWithId (40-46)
pkg/protocols/common/protocolstate/dialers.go (1)
  • Dialers (13-29)
pkg/protocols/http/httpclientpool/perhost_ratelimit_pool.go (2)
pkg/utils/utils.go (1)
  • GetRateLimiter (80-85)
pkg/protocols/utils/variables.go (4)
  • Scheme (50-50)
  • Hostname (44-44)
  • Host (45-45)
  • Port (46-46)
pkg/types/types.go (1)
pkg/protocols/http/httpclientpool/perhost_pool.go (1)
  • PerHostClientPool (15-23)
🪛 GitHub Check: CodeQL
pkg/protocols/http/httpclientpool/sharded_pool.go

[failure] 279-279: Disabled TLS certificate check
InsecureSkipVerify should not be used in production code.


[failure] 280-280: Insecure TLS configuration
Using insecure TLS version VersionTLS10 for MinVersion.

🔇 Additional comments (24)
lib/tests/sdk_test.go (1)

22-24: LGTM!

The leak suppression for the expirable LRU cache background goroutine is correctly added with a helpful reference to the source code. This aligns with the same pattern used in other test files in this PR.

pkg/protocols/common/protocolstate/memguardian_test.go (1)

21-23: LGTM!

The leak suppression is correctly added for the expirable LRU background goroutine with the same documentation pattern used elsewhere.

pkg/protocols/common/protocolstate/state.go (1)

218-229: LGTM with minor observation.

The function is thread-safe with proper locking. The silent return when dialers == nil is acceptable for graceful handling, though consider logging a warning if this condition is unexpected during normal operation.

pkg/protocols/http/request_test.go (1)

391-393: LGTM!

The expirable LRU leak suppression is consistently applied here as in other test files, with the same documentation reference.

cmd/nuclei/main.go (2)

409-409: LGTM!

The per-host-rate-limit flag is appropriately placed in the Rate-Limit group with a clear description.


438-440: LGTM. The new optimization flags are well-documented with sensible defaults. The http-client-shards description accurately reflects the implementation—the max 256 limit is enforced via MaxShardCount constant in the sharded client pool, which caps both automatically calculated and explicitly provided shard counts.

pkg/protocols/http/build_request.go (2)

453-474: Well-structured policy-driven connection reuse logic.

The switch statement cleanly handles all three cases:

  • ReuseUnsafe: Forces connection close for safety
  • ReuseSafe: Enables connection pooling by removing close headers
  • Default: Preserves legacy behavior for backward compatibility

The implementation correctly differentiates between setting req.Close = true (which affects the transport layer) and the Connection header (which affects the HTTP protocol layer).


462-468: The connectionReusePolicy field is properly initialized during request compilation in the Compile() method (lines 312-313 of pkg/protocols/http/http.go), where AnalyzeConnectionReuse() is called and its result is assigned to the field before build_request.go is executed. The ReuseSafe case is reachable and will execute correctly when appropriate.

go.mod (1)

83-83: LGTM - Dependency promoted to direct.

Moving golang-lru/v2 from indirect to direct dependency is appropriate since the codebase now explicitly uses the expirable LRU cache features for per-host pools and trackers. Version v2.0.7 is the latest stable release.

internal/runner/runner.go (1)

395-400: Global vs per-host rate-limiter wiring looks consistent

Making the global limiter unlimited when PerHostRateLimit is enabled and falling back to the standard limiter otherwise keeps existing behavior while letting the per-host pool enforce limits. This is a good separation of concerns.

pkg/protocols/http/request.go (2)

74-88: rateLimitTake helper cleanly centralizes rate limiting

Centralizing per-host vs global rate-limiter logic in rateLimitTake keeps call sites simpler and makes it easy to evolve the strategy later. The fallback to the global limiter when the per-host limiter lookup fails is also a good defensive choice.
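A minimal sketch of the per-host-with-global-fallback shape described above, assuming a toy token counter in place of the project's ratelimit package (all names here are illustrative):

```go
package main

import (
	"fmt"
	"net/url"
	"sync"
)

// limiter stands in for the real rate limiter purely for illustration.
type limiter struct{ taken int }

func (l *limiter) Take() { l.taken++ }

type rateLimiters struct {
	mu      sync.Mutex
	perHost map[string]*limiter
	global  *limiter
}

// take prefers the per-host limiter keyed by host:port and falls back to
// the global limiter when no host can be derived from the input.
func (r *rateLimiters) take(rawURL string) *limiter {
	u, err := url.Parse(rawURL)
	if err != nil || u.Host == "" {
		r.global.Take()
		return r.global
	}
	r.mu.Lock()
	l, ok := r.perHost[u.Host]
	if !ok {
		l = &limiter{}
		r.perHost[u.Host] = l
	}
	r.mu.Unlock()
	l.Take()
	return l
}

func main() {
	r := &rateLimiters{perHost: map[string]*limiter{}, global: &limiter{}}
	r.take("http://a.example:8080/path")
	r.take("http://a.example:8080/other")
	r.take("plainstring") // no host: global fallback
	fmt.Println(len(r.perHost), r.global.taken)
}
```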


1058-1073: HTTP→HTTPS 400-body detection is a useful heuristic

Leveraging the specific 400 response body text (“The plain HTTP request was sent to HTTPS port”) to record HTTP→HTTPS port mismatches is a nice enhancement and should help the tracker build a better view of misconfigured targets. Once the URL source is switched to the actual URL (see previous comment), this logic should give accurate stats without impacting normal execution.
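The heuristic itself is small; a sketch of the check (function name is illustrative, the 400 body text is the fixed message nginx and similar servers emit when a plaintext request hits a TLS port):

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeHTTPSPort reports whether a response matches the well-known
// "plain HTTP to HTTPS port" error pattern described above.
func looksLikeHTTPSPort(statusCode int, body string) bool {
	return statusCode == 400 &&
		strings.Contains(body, "The plain HTTP request was sent to HTTPS port")
}

func main() {
	fmt.Println(looksLikeHTTPSPort(400, "<html>400 The plain HTTP request was sent to HTTPS port</html>"))
	fmt.Println(looksLikeHTTPSPort(400, "Bad Request"))
}
```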

pkg/protocols/http/http.go (1)

161-164: Connection reuse policy analysis and wiring into Compile looks sound

The new ConnectionReusePolicy plus AnalyzeConnectionReuse() gives you a clear, extensible way to decide when to disable keep-alive:

  • Explicit Connection: close (raw or headers), time_delay analyzer, and future requiresConnectionClosure cases map to ReuseUnsafe.
  • Everything else defaults to ReuseSafe, enabling pooling/sharding, while ReuseUnknown falls back to the old ShouldDisableKeepAlive behavior.

Storing the policy on the Request and using it to set ConnectionConfiguration.DisableKeepAlive in Compile() keeps this logic centralized and makes it easy to evolve as more patterns are identified.

Also applies to: 311-329, 564-574, 590-633

pkg/protocols/http/httpclientpool/http_to_https_tracker.go (2)

14-30: LGTM! Clean tracker implementation.

The HTTPToHTTPSPortTracker struct is well-designed with atomic counters for statistics and a thread-safe map for port tracking.


78-95: The suggested refactor is incorrect and would not compile.

SyncLockMap from projectdiscovery/utils does not have a Len() method. The existing code comment at line 80 is accurate—this is not misleading. While the "approximate" language could be clarified, using t.ports.Len() as suggested would fail to compile. If retrieving the exact count from SyncLockMap were necessary, alternatives like len(t.ports.GetAll()) would be required, though the current approach of using totalDetections is reasonable given it provides an accurate unique count (duplicates are rejected at line 44-45).

Likely an incorrect or invalid review comment.

pkg/protocols/http/httpclientpool/connection_reuse_tracker.go (1)

51-82: LGTM! Well-structured tracker with sensible defaults.

The constructor properly handles zero values with reasonable defaults (24-hour TTL for scan-duration tracking) and the TTL selection logic is correct.

pkg/protocols/http/httpclientpool/clientpool.go (4)

189-199: LGTM! Cookie jar pointer hash ensures unique clients for multi-threaded cookie scenarios.

The hashWithCookieJar function correctly uses the pointer address of the cookie jar to create unique client keys when multi-threading with cookies is detected, preventing unintended client sharing.


408-434: Potential race: pool is used after lock release without re-checking.

At lines 410-414, dialers.PerHostHTTPPool is assigned under lock, then immediately unlocked. The subsequent type assertion and usage at lines 416-433 happen outside the lock. If another goroutine modifies dialers.PerHostHTTPPool between unlock and line 416, the behavior is undefined.

However, since the pool is only ever set once (nil → new pool) and never reset, this is likely safe in practice. Consider using a sync.Once pattern for clearer thread-safety guarantees.
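The sync.Once suggestion would look roughly like this (type and function names are illustrative, not the PR's):

```go
package main

import (
	"fmt"
	"sync"
)

type perHostPool struct{ id int }

var (
	poolOnce sync.Once
	pool     *perHostPool
)

// getPerHostPool creates the pool exactly once; every caller sees the same
// instance without needing to re-check under a mutex afterwards.
func getPerHostPool() *perHostPool {
	poolOnce.Do(func() {
		pool = &perHostPool{id: 1}
	})
	return pool
}

func main() {
	a, b := getPerHostPool(), getPerHostPool()
	fmt.Println(a == b) // same instance
}
```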


436-458: Sharded pool locking is correct; a secondary return value is silently discarded.

At line 446, if pool creation fails, the code correctly unlocks and returns. On success at lines 449-451, it unlocks after setting the pool, so the locking structure is correct. However, the second return value of GetClientForHost at line 455 is discarded (assigned to _). Since GetClientForHost in sharded_pool.go returns the shard index rather than an error, this is harmless, though the signature invites confusion and is worth noting.


513-536: LGTM! Thread-safe lazy initialization of ConnectionReuseTracker.

The function correctly uses locking to ensure single initialization of the tracker. The pattern of creating the tracker under lock and then doing a type assertion is appropriate.

pkg/protocols/http/httpclientpool/perhost_pool.go (2)

31-60: LGTM! Well-structured pool with proper TTL handling and eviction logging.

The constructor handles zero values with sensible defaults, correctly selects the minimum of idle time and lifetime for TTL, and logs evictions for debugging.


62-102: LGTM! Double-checked locking pattern is correctly implemented.

The fast-path check at line 68 avoids lock acquisition for cache hits. The slow path at line 77 uses Peek to avoid updating access time during the double-check, which is correct for LRU semantics. Hit/miss counting is accurate.

pkg/protocols/http/httpclientpool/sharded_pool.go (2)

58-81: LGTM! Shard count calculation is well-designed.

The formula sqrt(inputSize) * 2 clamped between 4 and 256 provides reasonable scaling. For example: 100 inputs → 20 shards, 10000 inputs → 200 shards.
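The formula and clamping can be sketched directly (constant names are assumptions based on the review's description of MaxShardCount):

```go
package main

import (
	"fmt"
	"math"
)

const (
	minShardCount = 4
	maxShardCount = 256
)

// calcShardCount implements sqrt(inputSize) * 2 clamped to [4, 256],
// as described in the review comment above.
func calcShardCount(inputSize int) int {
	n := int(math.Sqrt(float64(inputSize))) * 2
	if n < minShardCount {
		return minShardCount
	}
	if n > maxShardCount {
		return maxShardCount
	}
	return n
}

func main() {
	// 100 inputs -> 20 shards, 10000 -> 200, tiny and huge inputs clamp.
	fmt.Println(calcShardCount(100), calcShardCount(10000), calcShardCount(1), calcShardCount(1_000_000))
}
```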


460-464: LGTM! Appropriate no-op Close method.

The comment correctly explains that clients are managed by the main HTTPClientPool, so no cleanup is needed here. This maintains interface compatibility.

Comment on lines +136 to +148
// GetClientForHost returns the HTTP client for the given host based on consistent hashing
// Returns the client and the shard index
func (p *ShardedClientPool) GetClientForHost(host string) (*retryablehttp.Client, int) {
shardIndex := p.getShardIndex(host)
shard := p.shards[shardIndex]

p.shardRequests[shardIndex].Add(1)
p.totalRequests.Add(1)
shard.requestCount.Add(1)
shard.lastAccess.Store(time.Now())

return shard.client, shardIndex
}

⚠️ Potential issue | 🟡 Minor

hostCount is never incremented.

The ShardEntry struct has a hostCount field (line 51) intended to track unique hosts per shard, but GetClientForHost only increments requestCount. The hostCount remains zero, making the HostCount stat meaningless.

🔎 Consider tracking unique hosts per shard

Tracking unique hosts would require a set per shard, which adds memory overhead. If not needed, consider removing hostCount from ShardEntry and ShardStat. If needed, consider using a concurrent set or accepting approximate counts.

🤖 Prompt for AI Agents
In pkg/protocols/http/httpclientpool/sharded_pool.go around lines 136 to 148,
GetClientForHost never updates the ShardEntry.hostCount so the HostCount stat is
always zero; either remove hostCount from ShardEntry/ShardStat if you don't need
unique-host tracking, or implement concurrent tracking: add a per-shard
concurrent set/map (e.g., map[string]struct{} protected by shard-level mutex or
sync.Map) and on each GetClientForHost check if host is already present, if not
add it and increment hostCount (atomic.AddInt64) while holding the lock (or
relying on sync.Map semantics), ensuring thread-safety and avoiding
double-counting.

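If unique-host tracking is kept, the fix suggested above could look like this sketch: a per-shard concurrent set where sync.Map.LoadOrStore reports whether the host was already present, so the counter is bumped exactly once per host (field names mirror the review's description, not the actual PR code):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type shardEntry struct {
	hosts        sync.Map // host -> struct{}, the concurrent set
	hostCount    atomic.Int64
	requestCount atomic.Int64
}

// record counts every request but only counts each host once:
// LoadOrStore is atomic, so concurrent callers cannot double-count.
func (s *shardEntry) record(host string) {
	s.requestCount.Add(1)
	if _, loaded := s.hosts.LoadOrStore(host, struct{}{}); !loaded {
		s.hostCount.Add(1)
	}
}

func main() {
	var s shardEntry
	s.record("a:443")
	s.record("a:443")
	s.record("b:443")
	fmt.Println(s.hostCount.Load(), s.requestCount.Load()) // 2 3
}
```

The trade-off the review mentions is visible here: the set grows with the number of unique hosts per shard, which is the memory overhead an approximate counter would avoid.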
Comment on lines +288 to 296
hostname := t.updatedInput.MetaInput.Input
if t.req != nil && t.req.URL() != "" {
hostname = t.req.URL()
} else if t.req != nil && t.req.request != nil && t.req.request.URL != nil {
// Extract from request URL if available
hostname = t.req.request.String()
}
request.rateLimitTake(hostname)
select {

⚠️ Potential issue | 🟠 Major

Avoid passing full HTTP request strings into per-host and tracking helpers

Several new call sites derive the “hostname” or URL from retryablehttp.Request.String() (full HTTP request line + headers) and pass it into helpers that expect a URL or host:port:

  • executeParallelHTTP:
    • hostname = t.req.request.String() when t.req.URL() is empty.
  • executeRequest:
    • targetURL prefers generatedRequest.request.String().
    • hostnameForReuse prefers generatedRequest.request.String().
    • HTTP-to-HTTPS correction and mismatch recording use generatedRequest.request.String() as requestURL.

Helpers like the per-host rate limit pool (normalizeHostForRateLimit), connection reuse tracker, and HTTP-to-HTTPS port tracker generally parse URLs/hostnames. Feeding them a full HTTP request string ("GET http://host/path HTTP/1.1\r\n...") will produce incorrect normalization (e.g., keys like GET:80), which breaks per-host grouping and skews reuse/port-mismatch stats, even though the scan still works.

Better to consistently use the actual URL:

  • generatedRequest.request.URL.String() when request.URL is non-nil.
  • generatedRequest.URL() (your wrapper) where available.
  • Fall back to input.MetaInput.Input only if no URL is available.
Proposed diff to use URL strings instead of full request strings
@@ func (request *Request) executeParallelHTTP(…) {
-				hostname := t.updatedInput.MetaInput.Input
-				if t.req != nil && t.req.URL() != "" {
-					hostname = t.req.URL()
-				} else if t.req != nil && t.req.request != nil && t.req.request.URL != nil {
-					// Extract from request URL if available
-					hostname = t.req.request.String()
-				}
+				hostname := t.updatedInput.MetaInput.Input
+				if t.req != nil && t.req.URL() != "" {
+					hostname = t.req.URL()
+				} else if t.req != nil && t.req.request != nil && t.req.request.URL != nil {
+					// Extract from request URL if available
+					hostname = t.req.request.URL.String()
+				}
 				request.rateLimitTake(hostname)
@@ func (request *Request) executeRequest(…) (err error) {
-			targetURL := input.MetaInput.Input
-			if generatedRequest.request != nil && generatedRequest.request.URL != nil {
-				targetURL = generatedRequest.request.String()
-			} else if generatedRequest.request != nil {
-				targetURL = generatedRequest.request.String()
-			}
+			targetURL := input.MetaInput.Input
+			if generatedRequest.request != nil && generatedRequest.request.URL != nil {
+				targetURL = generatedRequest.request.URL.String()
+			}
@@
-			if generatedRequest.request != nil && generatedRequest.request.URL != nil {
-				tracker := httpclientpool.GetHTTPToHTTPSPortTracker(request.options.Options)
-				if tracker != nil {
-					requestURL := generatedRequest.request.String()
+			if generatedRequest.request != nil && generatedRequest.request.URL != nil {
+				tracker := httpclientpool.GetHTTPToHTTPSPortTracker(request.options.Options)
+				if tracker != nil {
+					requestURL := generatedRequest.request.URL.String()
 					if tracker.RequiresHTTPS(requestURL) {
@@
-				hostnameForReuse := input.MetaInput.Input
-				if generatedRequest.request.URL != nil {
-					// Use the actual request URL - normalization will extract host:port correctly
-					hostnameForReuse = generatedRequest.request.String()
-				} else if generatedRequest.URL() != "" {
-					// Fallback to generated request URL method
-					hostnameForReuse = generatedRequest.URL()
-				} else if targetURL != "" {
-					hostnameForReuse = targetURL
-				}
+				hostnameForReuse := input.MetaInput.Input
+				if generatedRequest.request.URL != nil {
+					hostnameForReuse = generatedRequest.request.URL.String()
+				} else if generatedRequest.URL() != "" {
+					hostnameForReuse = generatedRequest.URL()
+				} else if targetURL != "" {
+					hostnameForReuse = targetURL
+				}
@@
-			var requestURL string
-			if generatedRequest.request != nil && generatedRequest.request.URL != nil {
-				requestURL = generatedRequest.request.String()
+			var requestURL string
+			if generatedRequest.request != nil && generatedRequest.request.URL != nil {
+				requestURL = generatedRequest.request.URL.String()
 			} else if generatedRequest.rawRequest != nil && generatedRequest.rawRequest.FullURL != "" {
 				requestURL = generatedRequest.rawRequest.FullURL
 			} else if respChain.Request() != nil && respChain.Request().URL != nil {
 				requestURL = respChain.Request().URL.String()
 			}

Also applies to: 561-568, 839-915, 1058-1073

Comment on lines 48 to 52
// ShouldDisableKeepAlive depending on scan strategy
func ShouldDisableKeepAlive(options *types.Options) bool {
// with host-spray strategy keep-alive must be enabled
-	return options.ScanStrategy != scanstrategy.HostSpray.String()
+	return options.TemplateThreads == 0 && options.ScanStrategy != scanstrategy.HostSpray.String()
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check how ShouldDisableKeepAlive is used in the codebase
rg -n "ShouldDisableKeepAlive" --type go -C 3

Repository: projectdiscovery/nuclei

Length of output: 190


🏁 Script executed:

# Find build_request.go file
find . -name "build_request.go" -type f

Repository: projectdiscovery/nuclei

Length of output: 104


🏁 Script executed:

# Also check for connectionReusePolicy references
rg -n "connectionReusePolicy" --type go -C 2

Repository: projectdiscovery/nuclei

Length of output: 1550


🏁 Script executed:

# Search for ShouldDisableKeepAlive usage without git context
rg -n "ShouldDisableKeepAlive" --type go

Repository: projectdiscovery/nuclei

Length of output: 496


🏁 Script executed:

# Get the full context around the connectionReusePolicy switch in build_request.go
sed -n '450,480p' ./pkg/protocols/http/build_request.go

Repository: projectdiscovery/nuclei

Length of output: 1227


🏁 Script executed:

# Check the Compile method in http.go to see how connectionReusePolicy and keep-alive interact
sed -n '310,330p' ./pkg/protocols/http/http.go

Repository: projectdiscovery/nuclei

Length of output: 865


🏁 Script executed:

# Look for AnalyzeConnectionReuse method definition
rg -n "AnalyzeConnectionReuse" --type go -A 10

Repository: projectdiscovery/nuclei

Length of output: 2095


🏁 Script executed:

# Check the Request struct definition to understand Threads vs TemplateThreads
rg -n "type Request struct" ./pkg/protocols/http/http.go -A 50 | head -80

Repository: projectdiscovery/nuclei

Length of output: 3558


🏁 Script executed:

# Search for Threads field in http.go
rg -n "^\s+Threads\s" ./pkg/protocols/http/http.go -B 2 -A 2

Repository: projectdiscovery/nuclei

Length of output: 471


🏁 Script executed:

# Check if Threads and TemplateThreads are the same or related
rg -n "TemplateThreads|\.Threads\s*=" ./pkg/protocols/http/http.go

Repository: projectdiscovery/nuclei

Length of output: 272


Address field mismatch in thread count checking.

The condition change uses options.TemplateThreads, but the default case in build_request.go (line 469) checks r.request.Threads instead. These are different fields—template-level vs. per-request threads—and can produce inconsistent keep-alive behavior. When the connection reuse policy is ReuseUnknown, ShouldDisableKeepAlive() decides keep-alive based on template threads during Compile(), but build_request.go may override this decision based on per-request threads. Ensure both checks consistently reference the same thread field.

🤖 Prompt for AI Agents
In pkg/protocols/utils/http/requtils.go around lines 48 to 52, the keep-alive
decision is using options.TemplateThreads which mismatches the per-request
thread check in build_request.go; change the condition to use the same thread
field used elsewhere (options.Threads) so both Compile() and build_request.go
make consistent keep-alive decisions — update the boolean expression to check
options.Threads == 0 (keeping the existing HostSpray exception) and ensure any
related comments reflect that this is a per-request thread check.

Comment on lines +204 to +219
// PreflightPortScan enables a preflight resolve + TCP portscan and filters targets
// before running templates. Disabled by default.
PreflightPortScan bool
// PerHostClientPool enables per-host HTTP client pooling for better connection reuse.
// When enabled, each host gets its own client instance keyed by (host, configuration).
// Disabled by default.
PerHostClientPool bool
// HTTPClientShards enables HTTP client sharding for connection pooling.
// When enabled, hosts are distributed across a fixed number of HTTP client shards (auto-calculated, max 256).
// This provides a balance between connection reuse and memory efficiency.
// Disabled by default.
HTTPClientShards bool
// PerHostRateLimit enables per-host rate limiting for HTTP requests.
// When enabled, each host gets its own rate limiter and global rate limit becomes unlimited.
// Disabled by default.
PerHostRateLimit bool

⚠️ Potential issue | 🟠 Major

Copy the new optimization flags in Options.Copy()

The new booleans (PreflightPortScan, PerHostClientPool, HTTPClientShards, PerHostRateLimit) are added to Options but are not copied in Options.Copy(). Any code using a copied options instance will silently lose these settings, which can break preflight and per‑host behavior in sub-executors (e.g., auth provider flows).

I recommend propagating them in Options.Copy().

Proposed diff for Options.Copy()
 func (options *Options) Copy() *Options {
 	optCopy := &Options{
@@
-		DisableHTTPProbe:               options.DisableHTTPProbe,
-		LeaveDefaultPorts:              options.LeaveDefaultPorts,
+		DisableHTTPProbe:               options.DisableHTTPProbe,
+		// Optimization flags
+		PreflightPortScan:              options.PreflightPortScan,
+		PerHostClientPool:              options.PerHostClientPool,
+		HTTPClientShards:               options.HTTPClientShards,
+		PerHostRateLimit:               options.PerHostRateLimit,
+		LeaveDefaultPorts:              options.LeaveDefaultPorts,
 		AutomaticScan:                  options.AutomaticScan,

Also applies to: 491-573

🤖 Prompt for AI Agents
In pkg/types/types.go around lines 204 to 219 (and similarly in the
Options.Copy() implementation region around lines 491–573), the new boolean
fields PreflightPortScan, PerHostClientPool, HTTPClientShards, and
PerHostRateLimit are not propagated when duplicating an Options instance; update
Options.Copy() to set these four fields on the returned copy from the receiver
so copied options preserve the new flags (ensure you assign
copy.PreflightPortScan = o.PreflightPortScan, copy.PerHostClientPool =
o.PerHostClientPool, copy.HTTPClientShards = o.HTTPClientShards, and
copy.PerHostRateLimit = o.PerHostRateLimit).

@dwisiswant0 dwisiswant0 (Member) left a comment

There's only one concern for correctness; the rest are just nits (perf, bottleneck observations).

Comment on lines +204 to +207
// Disable cookies for sharded clients to avoid concurrent map writes
// cookiejar.Jar is not thread-safe and sharded clients are shared across goroutines
// If cookies are needed, use per-host pooling instead
cfg.DisableCookie = true

I think we could use httpclientpool.PerHostClientPool here for session-dependent scans, since it keeps clients and cookie jars isolated (read: concurrent-safe) per target. Disabling cookies here can break multi-step templates that rely on cookie reuse (e.g. login -> auth check).

tl;dr: session state gets lost once sharding is enabled.

Thoughts?

Comment on lines +83 to +85
func (p *PerHostRateLimitPool) GetOrCreate(
host string,
) (*ratelimit.Limiter, error) {

Heavy read op here. (GetForTarget -> GetOrCreate)

Maybe we should switch from LRU to adaptive W-TinyLFU (via github.com/maypok86/otter/v2, already introduced in #6630) to get lock-free access?

Thus, no more leaks and no growing goleak.Ignore* list.

Comment on lines +394 to +397
if len(entry.requestTimestamps) > 100 {
// Keep only last 100 timestamps
entry.requestTimestamps = entry.requestTimestamps[len(entry.requestTimestamps)-100:]
}

Implicit re-allocations from the sliding window here.

Since this is hard-capped at 100, maybe a fixed-size circular buffer would be a better fit than a growing slice for requestTimestamps?


Unless it's made configurable; but I get that 100 is a heuristic balance for high-precision PPS calculation, not just a magic number.

// This is used to automatically detect and correct cases where HTTP requests
// are sent to HTTPS ports (detected via 400 error with specific message)
type HTTPToHTTPSPortTracker struct {
ports *mapsutil.SyncLockMap[string, bool]

Heavy read op here.

Maybe we can switch to sync.Map here (a single LoadOrStore instead of separate Get + Set) to make it atomic and lock-free, since this is a grow-only discovery cache?
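The suggested shape, sketched with illustrative names: for a grow-only cache, LoadOrStore atomically reports whether the entry already existed, so duplicate detections never bump the counter:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

type httpsPortCache struct {
	ports           sync.Map // "host:port" -> true
	totalDetections atomic.Int64
}

// recordMismatch returns true only for the first detection of a host:port;
// LoadOrStore makes the check-and-insert a single atomic step.
func (c *httpsPortCache) recordMismatch(hostPort string) bool {
	if _, loaded := c.ports.LoadOrStore(hostPort, true); loaded {
		return false // duplicate, already known
	}
	c.totalDetections.Add(1)
	return true
}

func main() {
	var c httpsPortCache
	fmt.Println(c.recordMismatch("example.com:8443")) // true
	fmt.Println(c.recordMismatch("example.com:8443")) // false
	fmt.Println(c.totalDetections.Load())             // 1
}
```

sync.Map is documented as a good fit for exactly this access pattern: keys written once and read many times by disjoint goroutines.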

@dogancanbakir dogancanbakir removed this from the v3.7.0 milestone Jan 28, 2026
Development

Successfully merging this pull request may close these issues.

Per-host HTTP client pool for TemplateSpray: bounded keep-alive + connection reuse (LRU/TTL)

4 participants