feat: http(s) probing optimization#6511
Conversation
WalkthroughAdds a heuristic to choose HTTP/HTTPS probing order based on input host/port. Implements determineSchemeOrder and updates ProbeURL to iterate over this order. Introduces supporting variables and imports. Adds unit tests validating scheme ordering for inputs with and without explicit ports. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Caller
participant Utils as utils.ProbeURL
participant Heuristic as determineSchemeOrder
participant HTTP as Try HTTP
participant HTTPS as Try HTTPS
Caller->>Utils: ProbeURL(input)
Utils->>Heuristic: derive scheme order from input
Heuristic-->>Utils: []string (e.g., [https, http] or [http, https])
loop for each scheme in order
alt scheme == https
Utils->>HTTPS: attempt probe
opt on error
HTTPS-->>Utils: error
note right of Utils: Continue to next scheme
end
opt on success
HTTPS-->>Utils: reachable
Utils-->>Caller: return resolved URL
end
else scheme == http
Utils->>HTTP: attempt probe
opt on error
HTTP-->>Utils: error
note right of Utils: Continue to next scheme
end
opt on success
HTTP-->>Utils: reachable
Utils-->>Caller: return resolved URL
end
end
end
Utils-->>Caller: "" if none succeed
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
pkg/utils/http_probe.go (3)
13-18: Consider using constants and expanding the port list.The variables are never modified and should be declared as constants. Additionally, the port list only covers
:80and:8080, but many other ports commonly serve HTTP traffic (e.g.,:8000,:8008,:8888,:3000).Apply this diff to use constants:
-var httpPorts = []string{ +var httpPorts = []string{ ":80", ":8080", + ":8000", + ":8008", + ":3000", } -var httpSchemesHttpFirst = []string{"http", "https"} -var httpSchemesHttpsFirst = []string{"https", "http"} +var ( + httpSchemesHttpFirst = []string{"http", "https"} + httpSchemesHttpsFirst = []string{"https", "http"} +)
36-38: Improve documentation clarity.The comment "First http scheme tried is selected based on heuristics" is awkward and unclear. Consider describing the actual heuristic used.
Apply this diff:
// ProbeURL probes the scheme for a URL. -// First http scheme tried is selected based on heuristics +// The scheme order is determined by heuristics: HTTP-first for ports 80/8080, HTTPS-first otherwise. // If none succeeds, probing is abandoned for such URLs.
40-43: Follow Go naming conventions and remove unnecessary blank line.Local variable names should use camelCase, not PascalCase. The blank line at line 40 is also unnecessary.
Apply this diff:
func ProbeURL(input string, httpxclient *httpx.HTTPX) string { - - HttpSchemesOrdered := determineSchemeOrder(input) - - for _, scheme := range HttpSchemesOrdered { + schemesOrdered := determineSchemeOrder(input) + for _, scheme := range schemesOrdered {pkg/utils/http_probe_test.go (1)
9-37: Expand test coverage to include edge cases.The tests cover basic scenarios well but are missing critical edge cases that could expose bugs in the implementation:
- IPv6 addresses:
[::1]:80,[2001:db8::1]:8080- URLs with authentication:
user:pass@host:80- Other common HTTP ports:
:8000,:8008,:3000- Malformed inputs:
:080(leading zeros),host:(empty port)Add these test cases to validate robust handling:
{ input: "example.com:8080", expected: []string{"http", "https"}, }, + { + input: "[::1]:80", + expected: []string{"http", "https"}, + }, + { + input: "[2001:db8::1]:8080", + expected: []string{"http", "https"}, + }, + { + input: "example.com:8000", + expected: []string{"http", "https"}, + }, + { + input: "user:pass@example.com:80", + expected: []string{"http", "https"}, + }, }Note: Some of these cases will fail with the current implementation and should be fixed as suggested in the review of
http_probe.go.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pkg/utils/http_probe.go(2 hunks)pkg/utils/http_probe_test.go(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.go: Format Go code using go fmt
Run static analysis with go vet
Files:
pkg/utils/http_probe.gopkg/utils/http_probe_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
pkg/utils/http_probe_test.go (1)
32-32: Optional: Remove unnecessary range variable capture.The
tc := tcpattern was required in Go versions prior to 1.22 to capture loop variables for closures, but is now unnecessary if the project targets Go 1.22+. This is purely cosmetic and can be safely deferred.- tc := tc // capture range variablepkg/utils/http_probe.go (2)
42-42: Use camelCase for local variable naming.Go convention prefers camelCase for local variables. The variable name
HttpSchemesOrderedshould behttpSchemesOrderedto follow idiomatic Go style.- HttpSchemesOrdered := determineSchemeOrder(input) + httpSchemesOrdered := determineSchemeOrder(input) - for _, scheme := range HttpSchemesOrdered { + for _, scheme := range httpSchemesOrdered {
13-18: Consider more descriptive naming for package-level variables.While the current names are functional, consider renaming
httpPortstocommonHTTPPortsorknownHTTPPortsto better convey that this is a curated list of ports where HTTP is typically served, not an exhaustive list of all possible HTTP ports.-var httpPorts = []string{ +var commonHTTPPorts = []string{ ":80", ":8080", }Then update the usage in
determineSchemeOrder:- for _, httpPort := range httpPorts { + for _, httpPort := range commonHTTPPorts {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pkg/utils/http_probe.go(2 hunks)pkg/utils/http_probe_test.go(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.go: Format Go code using go fmt
Run static analysis with go vet
Files:
pkg/utils/http_probe_test.gopkg/utils/http_probe.go
🔇 Additional comments (2)
pkg/utils/http_probe_test.go (1)
9-38: Excellent test coverage for the scheme ordering logic.The test cases comprehensively validate the heuristic-based probing optimization, including IPv4, IPv6, and various port scenarios. The implementation correctly exercises the
determineSchemeOrderfunction.pkg/utils/http_probe.go (1)
20-35: Good: IPv6 and authentication handling implemented correctly.The use of
net.SplitHostPortproperly addresses the concern raised in the past review about IPv6 addresses (like[::1]:80) and URLs with authentication. This implementation correctly extracts the port component while handling bracket-enclosed IPv6 addresses.
fc6a41c to
67571b3
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/utils/http_probe.go (1)
13-18: Consider using all-caps for HTTP/HTTPS acronyms.Go convention typically writes acronyms in all-caps within identifiers. Consider renaming:
commonHttpPorts→commonHTTPPortshttpSchemesHttpFirst→httpSchemesHTTPFirsthttpSchemesHttpsFirst→httpSchemesHTTPSFirstApply this diff to follow Go naming conventions:
-var commonHttpPorts = []string{ +var commonHTTPPorts = []string{ ":80", ":8080", } -var httpSchemesHttpFirst = []string{"http", "https"} -var httpSchemesHttpsFirst = []string{"https", "http"} +var httpSchemesHTTPFirst = []string{"http", "https"} +var httpSchemesHTTPSFirst = []string{"https", "http"}Then update references in
determineSchemeOrder:func determineSchemeOrder(input string) []string { // Full urls (with http(s):// prefix) are not probed _, port, err := net.SplitHostPort(input) if err == nil { // Check if port is a known HTTP port portWithColon := ":" + port - for _, httpPort := range commonHttpPorts { + for _, httpPort := range commonHTTPPorts { if portWithColon == httpPort { - return httpSchemesHttpFirst + return httpSchemesHTTPFirst } } } - return httpSchemesHttpsFirst + return httpSchemesHTTPSFirst }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pkg/utils/http_probe.go(2 hunks)pkg/utils/http_probe_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/utils/http_probe_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.go: Format Go code using go fmt
Run static analysis with go vet
Files:
pkg/utils/http_probe.go
🔇 Additional comments (3)
pkg/utils/http_probe.go (3)
5-5: LGTM!The
netimport is correctly added to supportnet.SplitHostPortin the new port-detection logic.
20-35: Well done addressing the IPv6 concern!The use of
net.SplitHostPortcorrectly handles:
- IPv6 addresses with brackets:
[::1]:80,[2001:db8::1]:8080- Standard host:port:
example.com:80- Edge cases: The fallback to
httpSchemesHttpsFirstwhen parsing fails is appropriate for inputs without ports or malformed addresses.The logic is sound and resolves the issue raised in previous reviews.
37-44: LGTM!The updated comments and implementation correctly integrate the heuristic-based scheme ordering. The flow is clear: determine the scheme order based on the input, then iterate over schemes in that order.
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/utils/http_probe.go (1)
42-60: Strip or reject inputs with existing schemes
In ProbeURL (pkg/utils/http_probe.go:42), check ifinputcontains://and either strip the prefix (e.g. viastrings.SplitN(input, "://", 2)[1]) or return an error. Otherwisefmt.Sprintf("%s://%s", scheme, input)will produce malformed URLs likehttp://http://example.com.
🧹 Nitpick comments (2)
pkg/utils/http_probe.go (2)
27-37: Clarify error handling and heuristic limitations.The function silently ignores errors from
net.SplitHostPortby design, which is correct (falling back to HTTPS-first). However, this implicit behavior could be clearer to future maintainers.Consider adding a brief comment explaining the intentional error handling:
// determineSchemeOrder for the input func determineSchemeOrder(input string) []string { // if input has port that is commonly used for HTTP, return http then https + // if SplitHostPort fails (no port, invalid format), default to https-first if _, port, err := net.SplitHostPort(input); err == nil { if sliceutil.Contains(commonHttpPorts, port) { return httpFirstSchemes } } return defaultHttpSchemes }
66-66: Update method comment to differentiate from the function.The comment is identical to the standalone
ProbeURLfunction above (line 39), which may cause confusion. This method is a wrapper that delegates to the function.Consider making the comment more specific:
-// ProbeURL probes the scheme for a URL. +// ProbeURL implements types.InputLivenessProbe by delegating to the ProbeURL function. func (i *inputLivenessChecker) ProbeURL(input string) (string, error) { return ProbeURL(input, i.client), nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pkg/utils/http_probe.go(2 hunks)pkg/utils/http_probe_test.go(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.go: Format Go code using go fmt
Run static analysis with go vet
Files:
pkg/utils/http_probe.gopkg/utils/http_probe_test.go
🔇 Additional comments (1)
pkg/utils/http_probe_test.go (1)
9-37: LGTM! Comprehensive test coverage.The test cases cover the key scenarios:
- No port and uncommon ports default to HTTPS-first
- Ports 80 and 8080 prioritize HTTP-first
- IPv4 and IPv6 addresses are handled correctly (including bracket notation)
The table-driven approach makes it easy to add more cases if needed.
|
@CodeRabbit create an issue for this PR |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 54 @Mzack9999: I've created issue #6512 to track this PR's HTTP(S) probing optimization feature. You can view it here: #6512 |
Proposed changes
This PR optimizes HTTP probing. Nuclei tries all ports with HTTPS first; this means twice as many requests have to be made on HTTP (80,8080) ports, making probing slower.
This PR implements
determineSchemeOrder, which makes performance better on 80 and 8080 ports without making it slower on normal scenarios.TLDR: This PR cuts all needed probing requests to port 80/8080 to half.
Checklist
Summary by CodeRabbit
New Features
Tests