
feat: http(s) probing optimization#6511

Merged
Mzack9999 merged 2 commits into projectdiscovery:dev from matejsmycka:http-probing-optimizations
Oct 6, 2025

Conversation

@matejsmycka
Contributor

@matejsmycka matejsmycka commented Oct 6, 2025

Proposed changes

This PR optimizes HTTP probing. Nuclei currently probes every port with HTTPS first, so targets on plain-HTTP ports (80, 8080) need two requests instead of one, which slows probing down.

This PR implements determineSchemeOrder, which tries HTTP first on ports 80 and 8080 and keeps the HTTPS-first order for all other inputs, so normal scenarios are not slowed down.

TL;DR: This PR halves the probing requests needed for ports 80 and 8080.
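
To make the "half the requests" claim concrete, here is a minimal sketch that counts how many schemes are tried before the working one is reached. It assumes a target that answers on plain HTTP only; `schemeOrder` and `attempts` are illustrative names, not the functions from the PR.

```go
package main

import (
	"fmt"
	"net"
)

// schemeOrder mirrors the heuristic described in this PR: HTTP-first for
// ports 80 and 8080, HTTPS-first for everything else.
func schemeOrder(input string) []string {
	if _, port, err := net.SplitHostPort(input); err == nil {
		if port == "80" || port == "8080" {
			return []string{"http", "https"}
		}
	}
	return []string{"https", "http"}
}

// attempts returns how many schemes are tried, in order, before reaching the
// one scheme the (hypothetical) target actually answers on.
func attempts(order []string, served string) int {
	for i, scheme := range order {
		if scheme == served {
			return i + 1
		}
	}
	return len(order)
}

func main() {
	target := "example.com:80" // assumption: serves plain HTTP only
	fmt.Println("HTTPS-first:", attempts([]string{"https", "http"}, "http"))
	fmt.Println("heuristic:  ", attempts(schemeOrder(target), "http"))
}
```

Under that assumption the fixed HTTPS-first order needs two attempts and the heuristic order needs one.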

Checklist

  • Pull request is created against the dev branch
  • All checks passed (lint, unit/integration/regression tests etc.) with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Summary by CodeRabbit

  • New Features

    • Smarter URL probing: automatically prioritizes HTTP or HTTPS based on detected ports (typically preferring HTTPS), reducing timeouts and improving connection success.
    • Backward-compatible behavior preserved: continues trying alternatives on errors and returns no result if none succeed; no configuration changes required.
  • Tests

    • Added comprehensive tests to validate the new probing order across various URL formats and port scenarios.

@auto-assign auto-assign bot requested a review from Mzack9999 October 6, 2025 11:08
@coderabbitai
Contributor

coderabbitai bot commented Oct 6, 2025

Walkthrough

Adds a heuristic to choose HTTP/HTTPS probing order based on input host/port. Implements determineSchemeOrder and updates ProbeURL to iterate over this order. Introduces supporting variables and imports. Adds unit tests validating scheme ordering for inputs with and without explicit ports.
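
The walkthrough can be sketched roughly as follows. This is a simplified illustration, not the merged code: the `probe` callback is a hypothetical stand-in for the real httpx client call, and `probeURL` is a lowercase stand-in for ProbeURL.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// determineSchemeOrder returns HTTP-first for ports 80 and 8080 and
// HTTPS-first otherwise (including inputs without a parsable port).
func determineSchemeOrder(input string) []string {
	if _, port, err := net.SplitHostPort(input); err == nil {
		switch port {
		case "80", "8080":
			return []string{"http", "https"}
		}
	}
	return []string{"https", "http"}
}

// probeURL tries each scheme in order and returns the first URL for which
// probe reports success, or "" if none succeeds. The probe callback is a
// hypothetical stand-in for the real httpx client call.
func probeURL(input string, probe func(url string) bool) string {
	for _, scheme := range determineSchemeOrder(input) {
		url := fmt.Sprintf("%s://%s", scheme, input)
		if probe(url) {
			return url
		}
	}
	return ""
}

func main() {
	var tried []string
	// Simulated target that only answers on plain HTTP.
	plainHTTPOnly := func(url string) bool {
		tried = append(tried, url)
		return strings.HasPrefix(url, "http://")
	}
	fmt.Println(probeURL("example.com:8080", plainHTTPOnly))
	fmt.Println(tried)
}
```

Injecting the probe as a function keeps the ordering logic testable without network access.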

Changes

Cohort / File(s) | Summary
HTTP probe logic update (pkg/utils/http_probe.go) | Adds determineSchemeOrder, new scheme/port variables, and net/sliceutil imports. ProbeURL now derives scheme order via heuristic and iterates accordingly, preserving prior error handling and return behavior. Updated comments.
Unit tests for scheme ordering (pkg/utils/http_probe_test.go) | New tests verifying determineSchemeOrder across inputs (with/without ports), using subtests and require.Equal assertions.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Utils as utils.ProbeURL
  participant Heuristic as determineSchemeOrder
  participant HTTP as Try HTTP
  participant HTTPS as Try HTTPS

  Caller->>Utils: ProbeURL(input)
  Utils->>Heuristic: derive scheme order from input
  Heuristic-->>Utils: []string (e.g., [https, http] or [http, https])

  loop for each scheme in order
    alt scheme == https
      Utils->>HTTPS: attempt probe
      opt on error
        HTTPS-->>Utils: error
        note right of Utils: Continue to next scheme
      end
      opt on success
        HTTPS-->>Utils: reachable
        Utils-->>Caller: return resolved URL
      end
    else scheme == http
      Utils->>HTTP: attempt probe
      opt on error
        HTTP-->>Utils: error
        note right of Utils: Continue to next scheme
      end
      opt on success
        HTTP-->>Utils: reachable
        Utils-->>Caller: return resolved URL
      end
    end
  end

  Utils-->>Caller: "" if none succeed

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I twitch my whiskers, ports in sight,
80 hums, 443 shines bright—
I hop through schemes in ordered pairs,
Heuristic trails past network snares.
A nibble of tests, a sip of cheer,
The right URL appears—oh dear! 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is below the required threshold of 80.00%. | Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check | ✅ Passed | The title “feat: http(s) probing optimization” succinctly captures the main enhancement introduced in the changeset, namely optimizing the HTTP/HTTPS probing order, and it is clear and directly related to the PR’s primary objective of reducing redundant requests on common ports.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
pkg/utils/http_probe.go (3)

13-18: Consider grouping the declarations and expanding the port list.

The variables are never modified. Go does not allow slices to be declared as constants, so the closest option is to group them in a single var block. Additionally, the port list only covers :80 and :8080, but many other ports commonly serve HTTP traffic (e.g., :8000, :8008, :8888, :3000).

Apply this diff to group the declarations and extend the port list:

 var httpPorts = []string{
 	":80",
 	":8080",
+	":8000",
+	":8008",
+	":3000",
 }
-var httpSchemesHttpFirst = []string{"http", "https"}
-var httpSchemesHttpsFirst = []string{"https", "http"}
+var (
+	httpSchemesHttpFirst  = []string{"http", "https"}
+	httpSchemesHttpsFirst = []string{"https", "http"}
+)

36-38: Improve documentation clarity.

The comment "First http scheme tried is selected based on heuristics" is awkward and unclear. Consider describing the actual heuristic used.

Apply this diff:

 // ProbeURL probes the scheme for a URL.
-// First http scheme tried is selected based on heuristics
+// The scheme order is determined by heuristics: HTTP-first for ports 80/8080, HTTPS-first otherwise.
 // If none succeeds, probing is abandoned for such URLs.

40-43: Follow Go naming conventions and remove unnecessary blank line.

Local variable names should use camelCase, not PascalCase. The blank line at line 40 is also unnecessary.

Apply this diff:

 func ProbeURL(input string, httpxclient *httpx.HTTPX) string {
-
-	HttpSchemesOrdered := determineSchemeOrder(input)
-
-	for _, scheme := range HttpSchemesOrdered {
+	schemesOrdered := determineSchemeOrder(input)
+	for _, scheme := range schemesOrdered {
pkg/utils/http_probe_test.go (1)

9-37: Expand test coverage to include edge cases.

The tests cover basic scenarios well but are missing critical edge cases that could expose bugs in the implementation:

  • IPv6 addresses: [::1]:80, [2001:db8::1]:8080
  • URLs with authentication: user:pass@host:80
  • Other common HTTP ports: :8000, :8008, :3000
  • Malformed inputs: :080 (leading zeros), host: (empty port)

Add these test cases to validate robust handling:

 		{
 			input:    "example.com:8080",
 			expected: []string{"http", "https"},
 		},
+		{
+			input:    "[::1]:80",
+			expected: []string{"http", "https"},
+		},
+		{
+			input:    "[2001:db8::1]:8080",
+			expected: []string{"http", "https"},
+		},
+		{
+			input:    "example.com:8000",
+			expected: []string{"http", "https"},
+		},
+		{
+			input:    "user:pass@example.com:80",
+			expected: []string{"http", "https"},
+		},
 	}

Note: Some of these cases will fail with the current implementation and should be fixed as suggested in the review of http_probe.go.
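
To see how the listed inputs behave when the port is extracted with net.SplitHostPort (the approach the later revisions in this thread use), here is a small inspection sketch. `portOf` is a hypothetical helper, not part of the PR. As far as the standard library's behavior goes, an unbracketed host containing a colon is rejected, so the userinfo form does not parse and would fall back to HTTPS-first, while `:080` and `host:` parse but yield ports that do not match `80`.

```go
package main

import (
	"fmt"
	"net"
)

// portOf returns the port that net.SplitHostPort extracts from input and
// whether parsing succeeded.
func portOf(input string) (string, bool) {
	_, port, err := net.SplitHostPort(input)
	return port, err == nil
}

func main() {
	inputs := []string{
		"[::1]:80",
		"[2001:db8::1]:8080",
		"example.com:8000",
		"user:pass@example.com:80", // extra colon in the userinfo part
		":080",                     // leading zero
		"host:",                    // empty port
		"example.com",              // no port
	}
	for _, in := range inputs {
		port, ok := portOf(in)
		fmt.Printf("%-28q port=%q parsed=%t\n", in, port, ok)
	}
}
```

Whether the malformed cases should be normalized or simply left to the HTTPS-first default is a design choice for the heuristic; the sketch only shows what the parser reports.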

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75016d1 and 6a97345.

📒 Files selected for processing (2)
  • pkg/utils/http_probe.go (2 hunks)
  • pkg/utils/http_probe_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.go: Format Go code using go fmt
Run static analysis with go vet

Files:

  • pkg/utils/http_probe.go
  • pkg/utils/http_probe_test.go

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
pkg/utils/http_probe_test.go (1)

32-32: Optional: Remove unnecessary range variable capture.

The tc := tc pattern was required in Go versions prior to 1.22 to capture loop variables for closures, but is now unnecessary if the project targets Go 1.22+. This is purely cosmetic and can be safely deferred.

-		tc := tc // capture range variable
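
For readers unfamiliar with the pattern, here is a minimal sketch of what the explicit copy protects against. It is not taken from the PR's test file; `closuresFor` is an illustrative name. With the copy in place, each closure sees its own value on every Go version; from Go 1.22 onward the loop variable is already scoped per iteration, which is why the line can be dropped.

```go
package main

import "fmt"

// closuresFor returns one closure per name. The explicit copy guarantees each
// closure sees its own value on every Go version; from Go 1.22 onward the
// loop variable is already per-iteration, so the copy is redundant.
func closuresFor(names []string) []func() string {
	var fns []func() string
	for _, name := range names {
		name := name // capture range variable (only needed before Go 1.22)
		fns = append(fns, func() string { return name })
	}
	return fns
}

func main() {
	for _, fn := range closuresFor([]string{"no port", "port 80", "port 8080"}) {
		fmt.Println(fn())
	}
}
```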
pkg/utils/http_probe.go (2)

42-42: Use camelCase for local variable naming.

Go convention prefers camelCase for local variables. The variable name HttpSchemesOrdered should be httpSchemesOrdered to follow idiomatic Go style.

-	HttpSchemesOrdered := determineSchemeOrder(input)
+	httpSchemesOrdered := determineSchemeOrder(input)
 
-	for _, scheme := range HttpSchemesOrdered {
+	for _, scheme := range httpSchemesOrdered {

13-18: Consider more descriptive naming for package-level variables.

While the current names are functional, consider renaming httpPorts to commonHTTPPorts or knownHTTPPorts to better convey that this is a curated list of ports where HTTP is typically served, not an exhaustive list of all possible HTTP ports.

-var httpPorts = []string{
+var commonHTTPPorts = []string{
 	":80",
 	":8080",
 }

Then update the usage in determineSchemeOrder:

-		for _, httpPort := range httpPorts {
+		for _, httpPort := range commonHTTPPorts {
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 6a97345 and f623969.

📒 Files selected for processing (2)
  • pkg/utils/http_probe.go (2 hunks)
  • pkg/utils/http_probe_test.go (1 hunks)
🔇 Additional comments (2)
pkg/utils/http_probe_test.go (1)

9-38: Excellent test coverage for the scheme ordering logic.

The test cases comprehensively validate the heuristic-based probing optimization, including IPv4, IPv6, and various port scenarios. The implementation correctly exercises the determineSchemeOrder function.

pkg/utils/http_probe.go (1)

20-35: Good: IPv6 and authentication handling implemented correctly.

The use of net.SplitHostPort properly addresses the concern raised in the past review about IPv6 addresses (like [::1]:80) and URLs with authentication. This implementation correctly extracts the port component while handling bracket-enclosed IPv6 addresses.

@matejsmycka matejsmycka force-pushed the http-probing-optimizations branch from fc6a41c to 67571b3 Compare October 6, 2025 11:30
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/utils/http_probe.go (1)

13-18: Consider using all-caps for HTTP/HTTPS acronyms.

Go convention typically writes acronyms in all-caps within identifiers. Consider renaming:

  • commonHttpPorts → commonHTTPPorts
  • httpSchemesHttpFirst → httpSchemesHTTPFirst
  • httpSchemesHttpsFirst → httpSchemesHTTPSFirst

Apply this diff to follow Go naming conventions:

-var commonHttpPorts = []string{
+var commonHTTPPorts = []string{
 	":80",
 	":8080",
 }
-var httpSchemesHttpFirst = []string{"http", "https"}
-var httpSchemesHttpsFirst = []string{"https", "http"}
+var httpSchemesHTTPFirst = []string{"http", "https"}
+var httpSchemesHTTPSFirst = []string{"https", "http"}

Then update references in determineSchemeOrder:

 func determineSchemeOrder(input string) []string {
 	// Full urls (with http(s):// prefix) are not probed
 	_, port, err := net.SplitHostPort(input)
 	if err == nil {
 		// Check if port is a known HTTP port
 		portWithColon := ":" + port
-		for _, httpPort := range commonHttpPorts {
+		for _, httpPort := range commonHTTPPorts {
 			if portWithColon == httpPort {
-				return httpSchemesHttpFirst
+				return httpSchemesHTTPFirst
 			}
 		}
 	}
-	return httpSchemesHttpsFirst
+	return httpSchemesHTTPSFirst
 }
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between f623969 and fc6a41c.

📒 Files selected for processing (2)
  • pkg/utils/http_probe.go (2 hunks)
  • pkg/utils/http_probe_test.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/utils/http_probe_test.go
🔇 Additional comments (3)
pkg/utils/http_probe.go (3)

5-5: LGTM!

The net import is correctly added to support net.SplitHostPort in the new port-detection logic.


20-35: Well done addressing the IPv6 concern!

The use of net.SplitHostPort correctly handles:

  • IPv6 addresses with brackets: [::1]:80, [2001:db8::1]:8080
  • Standard host:port: example.com:80
  • Edge cases: The fallback to httpSchemesHttpsFirst when parsing fails is appropriate for inputs without ports or malformed addresses.

The logic is sound and resolves the issue raised in previous reviews.


37-44: LGTM!

The updated comments and implementation correctly integrate the heuristic-based scheme ordering. The flow is clear: determine the scheme order based on the input, then iterate over schemes in that order.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/utils/http_probe.go (1)

42-60: Strip or reject inputs with existing schemes
In ProbeURL (pkg/utils/http_probe.go:42), check if input contains :// and either strip the prefix (e.g. via strings.SplitN(input, "://", 2)[1]) or return an error. Otherwise fmt.Sprintf("%s://%s", scheme, input) will produce malformed URLs like http://http://example.com.
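
A minimal sketch of the stripping option, assuming the caller prefers to normalize the input rather than reject it. `stripScheme` is a hypothetical helper, not part of the PR, and it uses strings.Cut instead of the strings.SplitN form mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// stripScheme drops a leading "<scheme>://" prefix, if any. It is a
// hypothetical helper, not part of the PR.
func stripScheme(input string) string {
	if _, rest, found := strings.Cut(input, "://"); found {
		return rest
	}
	return input
}

func main() {
	input := "http://example.com"
	// Without stripping, the scheme is doubled.
	fmt.Printf("%s://%s\n", "https", input)
	// With stripping, the result is well-formed.
	fmt.Printf("%s://%s\n", "https", stripScheme(input))
}
```

Stripping silently discards the scheme the user supplied, so rejecting such inputs may be the safer choice if that scheme is meaningful to the caller.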

🧹 Nitpick comments (2)
pkg/utils/http_probe.go (2)

27-37: Clarify error handling and heuristic limitations.

The function silently ignores errors from net.SplitHostPort by design, which is correct (falling back to HTTPS-first). However, this implicit behavior could be clearer to future maintainers.

Consider adding a brief comment explaining the intentional error handling:

 // determineSchemeOrder for the input
 func determineSchemeOrder(input string) []string {
 	// if input has port that is commonly used for HTTP, return http then https
+	// if SplitHostPort fails (no port, invalid format), default to https-first
 	if _, port, err := net.SplitHostPort(input); err == nil {
 		if sliceutil.Contains(commonHttpPorts, port) {
 			return httpFirstSchemes
 		}
 	}

 	return defaultHttpSchemes
 }

66-66: Update method comment to differentiate from the function.

The comment is identical to the standalone ProbeURL function above (line 39), which may cause confusion. This method is a wrapper that delegates to the function.

Consider making the comment more specific:

-// ProbeURL probes the scheme for a URL.
+// ProbeURL implements types.InputLivenessProbe by delegating to the ProbeURL function.
 func (i *inputLivenessChecker) ProbeURL(input string) (string, error) {
 	return ProbeURL(input, i.client), nil
 }
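
The delegation pattern this comment describes can be sketched as below. The interface here is a reduced, hypothetical stand-in for types.InputLivenessProbe (the real interface may declare more methods), and `probeURL` fakes the package-level function instead of issuing requests.

```go
package main

import "fmt"

// livenessProbe is a hypothetical, reduced stand-in for the
// types.InputLivenessProbe interface mentioned above.
type livenessProbe interface {
	ProbeURL(input string) (string, error)
}

// probeURL stands in for the package-level ProbeURL function; here it simply
// prefixes https:// instead of issuing real requests.
func probeURL(input string) string {
	return "https://" + input
}

// checker plays the role of inputLivenessChecker.
type checker struct{}

// ProbeURL implements livenessProbe by delegating to probeURL.
func (c *checker) ProbeURL(input string) (string, error) {
	return probeURL(input), nil
}

// Compile-time check that *checker satisfies the interface.
var _ livenessProbe = (*checker)(nil)

func main() {
	var p livenessProbe = &checker{}
	url, err := p.ProbeURL("example.com")
	fmt.Println(url, err)
}
```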
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between fc6a41c and e0d3bb4.

📒 Files selected for processing (2)
  • pkg/utils/http_probe.go (2 hunks)
  • pkg/utils/http_probe_test.go (1 hunks)
🔇 Additional comments (1)
pkg/utils/http_probe_test.go (1)

9-37: LGTM! Comprehensive test coverage.

The test cases cover the key scenarios:

  • No port and uncommon ports default to HTTPS-first
  • Ports 80 and 8080 prioritize HTTP-first
  • IPv4 and IPv6 addresses are handled correctly (including bracket notation)

The table-driven approach makes it easy to add more cases if needed.

@Mzack9999
Member

@CodeRabbit create an issue for this PR

@coderabbitai
Contributor

coderabbitai bot commented Oct 6, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Create a GitHub issue for PR #6511

gh issue create \
  --title "HTTP(S) probing optimization feature" \
  --body "This issue tracks the HTTP(S) probing optimization feature implemented in PR #6511.

**Context:**
The feature introduces an optimization for HTTP probing by changing the order in which HTTP/HTTPS schemes are tried during probing. Currently, Nuclei tries HTTPS first for all ports, which causes redundant requests on common HTTP ports (notably 80 and 8080). The optimization makes probing prefer HTTP on those ports, cutting the number of probing requests for ports 80/8080 approximately in half while maintaining performance in other scenarios.

**Related PR:**
- #6511

**Requested by:** @Mzack9999" \
  --assignee Mzack9999

Length of output: 54


@Mzack9999: I've created issue #6512 to track this PR's HTTP(S) probing optimization feature. You can view it here: #6512

@Mzack9999 Mzack9999 linked an issue Oct 6, 2025 that may be closed by this pull request
@Mzack9999 Mzack9999 merged commit 603a86b into projectdiscovery:dev Oct 6, 2025
20 checks passed