Support concurrent Nuclei engines in the same process#6322
Support concurrent Nuclei engines in the same process#6322ehsandeep merged 14 commits intoprojectdiscovery:devfrom
Conversation
WalkthroughThe changes introduce thread-safe handling for local file access permissions using synchronized maps keyed by execution ID, refactor client pool synchronization using concurrency-safe maps and helper functions, and update template parsing logic to avoid in-place mutations of cached templates by working on deep copies. Obsolete methods and direct global state mutations are removed. Minor documentation, test improvements, and logging capture utilities are also added. Changes
Sequence Diagram(s)sequenceDiagram
participant Engine as NucleiEngine
participant Options as EngineOptions
participant Parser as Parser
Engine->>Options: Check for Parser instance
alt Parser provided and valid
Engine->>Parser: Assign Parser to internal fields
else No Parser provided
Engine->>Parser: Create new Parser
Engine->>Parser: Assign new Parser to internal fields
end
sequenceDiagram
participant TemplateCache as CachedTemplate
participant ParseFunc as Parse
participant NewOptions as ExecutorOptions
ParseFunc->>TemplateCache: Retrieve cached template
ParseFunc->>TemplateCache: Deep copy cached template
ParseFunc->>NewOptions: Create ExecutorOptions with preserved fields
ParseFunc->>TemplateCache: Assign new options to copied template
ParseFunc->>TemplateCache: Update requests with copied options
ParseFunc-->>Caller: Return copied template
sequenceDiagram
participant ProtocolState as protocolstate
participant Options as Options
participant AtomicFlag as LfaAllowed
ProtocolState->>Options: IsLfaAllowed(options)
alt options provided
ProtocolState->>Dialer: Check LocalFileAccessAllowed under lock
else
ProtocolState->>AtomicFlag: Return atomic global flag
end
Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
It looks like the test failure yesterday was due to a temporary quota hit:
The current PR should be ready for review |
There was a problem hiding this comment.
Actionable comments posted: 8
🔭 Outside diff range comments (1)
pkg/protocols/common/protocolstate/state.go (1)
181-181: Ensure consistent use of the per-instance LocalFileAccessAllowed flagRight now we have two separate settings for local-file access:
- A global
LfaAllowedinfile.go(with its own mutex)- A per-execution
LocalFileAccessAllowedon eachdialersinstanceBecause
file.gofirst returns the global flag and only falls back to the per-instance value, you can get different answers depending on which API is called. To fix this, drop the global flag entirely and always consult the instance field:• pkg/protocols/common/protocolstate/file.go
– Remove theLfaAllowedvariable, its mutex, andSetLfaAllowed/GetLfaAllowedlogic
– In all exported funcs (e.g.IsLocalFileAccessAllowed), look up thedialersentry byExecutionIdand return itsLocalFileAccessAllowed• pkg/protocols/common/protocolstate/state.go & headless.go
– Continue initializingLocalFileAccessAllowed: options.AllowLocalFileAccesson the per-execution struct
– Delete the call toSetLfaAllowed(options)in state.goThis will guarantee a single source of truth for local-file access.
🧹 Nitpick comments (3)
pkg/protocols/dns/dnsclientpool/clientpool.go (1)
14-18: Consider using consistent mutex declaration style.For consistency with
poolMutex, consider declaringmas a pointer:-var ( - poolMutex *sync.RWMutex - clientPool map[string]*retryabledns.Client - - normalClient *retryabledns.Client - m sync.Mutex -) +var ( + poolMutex *sync.RWMutex + clientPool map[string]*retryabledns.Client + + normalClient *retryabledns.Client + m *sync.Mutex +)And initialize it in the
Initfunction:func Init(options *types.Options) error { + if m == nil { + m = &sync.Mutex{} + } m.Lock() defer m.Unlock()Also, the reordering of
poolMutexandclientPooldeclarations seems unnecessary. Was there a specific reason for this change?pkg/protocols/protocols.go (1)
454-462: Remove commented code.If these template-specific fields are intentionally preserved (not replaced), please add a clear comment explaining why. Otherwise, remove the commented code to improve readability.
- // Keep the template-specific fields, but replace the rest - /* - e.TemplateID = n.TemplateID - e.TemplatePath = n.TemplatePath - e.TemplateInfo = n.TemplateInfo - e.TemplateVerifier = n.TemplateVerifier - e.RawTemplate = n.RawTemplate - e.Variables = n.Variables - e.Constants = n.Constants - */ + // Keep the template-specific fields unchanged while replacing execution-related fieldspkg/templates/compile.go (1)
163-164: Consider enabling debug logging for cache operationsThe commented debug log would be valuable for troubleshooting concurrent execution issues. Consider making it configurable rather than commenting it out.
-// options.Logger.Error().Msgf("returning cached template %s after recompiling %d requests", tplCopy.Options.TemplateID, tplCopy.Requests()) +if options.Options.Debug { + gologger.Debug().Msgf("returning cached template %s after updating %d requests", tplCopy.Options.TemplateID, tplCopy.Requests()) +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
lib/config.go(1 hunks)lib/sdk_private.go(1 hunks)pkg/installer/template.go(1 hunks)pkg/protocols/common/protocolstate/file.go(1 hunks)pkg/protocols/common/protocolstate/headless.go(0 hunks)pkg/protocols/common/protocolstate/memguardian.go(2 hunks)pkg/protocols/common/protocolstate/state.go(1 hunks)pkg/protocols/dns/dnsclientpool/clientpool.go(4 hunks)pkg/protocols/http/httpclientpool/clientpool.go(1 hunks)pkg/protocols/http/request.go(1 hunks)pkg/protocols/http/utils.go(1 hunks)pkg/protocols/protocols.go(2 hunks)pkg/protocols/whois/rdapclientpool/clientpool.go(2 hunks)pkg/templates/compile.go(2 hunks)pkg/templates/parser.go(1 hunks)
💤 Files with no reviewable changes (1)
- pkg/protocols/common/protocolstate/headless.go
🧰 Additional context used
🧠 Learnings (2)
pkg/protocols/http/utils.go (1)
Learnt from: dwisiswant0
PR: projectdiscovery/nuclei#6290
File: pkg/protocols/http/build_request.go:457-464
Timestamp: 2025-06-30T16:34:42.125Z
Learning: In the projectdiscovery/retryablehttp-go package, the Request struct embeds URL fields directly, making req.Scheme, req.Host, and other URL fields accessible directly on the Request object instead of requiring req.URL.Scheme, req.URL.Host, etc.
pkg/protocols/http/request.go (1)
Learnt from: dwisiswant0
PR: projectdiscovery/nuclei#6290
File: pkg/protocols/http/build_request.go:457-464
Timestamp: 2025-06-30T16:34:42.125Z
Learning: In the projectdiscovery/retryablehttp-go package, the Request struct embeds URL fields directly, making req.Scheme, req.Host, and other URL fields accessible directly on the Request object instead of requiring req.URL.Scheme, req.URL.Host, etc.
🧬 Code Graph Analysis (3)
pkg/protocols/common/protocolstate/state.go (1)
pkg/protocols/common/protocolstate/file.go (1)
SetLfaAllowed(38-45)
pkg/protocols/http/request.go (1)
pkg/protocols/http/http.go (1)
Request(35-240)
pkg/templates/parser.go (2)
pkg/loader/parser/parser.go (1)
Parser(7-11)pkg/templates/cache.go (1)
Cache(9-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Tests (windows-latest)
🔇 Additional comments (15)
pkg/protocols/common/protocolstate/memguardian.go (3)
19-19: LGTM: Global mutex properly declared for thread safety.The global mutex declaration is correctly placed and follows Go conventions. This provides the necessary synchronization primitive for protecting global state changes across concurrent operations.
23-24: LGTM: Proper mutex usage for thread-safe guardian startup.The mutex lock with defer unlock pattern correctly protects the global state modifications (
memTimerandcancelFunc). The use ofdeferensures the mutex is always released, even on early returns.
48-50: LGTM: Proper mutex usage for thread-safe guardian shutdown.The mutex lock with defer unlock pattern correctly protects the global state modifications during shutdown. This ensures that start and stop operations are properly serialized and cannot interfere with each other.
lib/config.go (1)
540-540: Good documentation improvement.The comment clarification that
WithLoggerallows setting a "shared" gologger instance is helpful and aligns with the PR's goal of supporting concurrent engines in the same process.pkg/protocols/http/utils.go (1)
15-16: Excellent race condition fix.Cloning the request before dumping prevents race conditions with the HTTP transport when multiple goroutines access the same request object. This is a critical improvement for concurrent engine support.
pkg/protocols/whois/rdapclientpool/clientpool.go (2)
33-37: Good thread-safety encapsulation.The
getNormalClient()function properly encapsulates mutex locking around access to the sharednormalClientvariable, ensuring thread-safe access in concurrent scenarios.
49-49: Proper use of the thread-safe accessor.Using
getNormalClient()instead of direct access ensures consistent thread-safe access to the shared client instance.pkg/installer/template.go (1)
56-56: Correct API usage fix.The change from variadic arguments to a slice parameter aligns with the expected
table.Headermethod signature.pkg/protocols/http/httpclientpool/clientpool.go (1)
157-166: Excellent isolation improvement.Creating a local copy of
rawhttp.DefaultOptionsbefore modification prevents race conditions and side effects when multiple engines run concurrently. This ensures each operation works with its own isolated options instead of mutating shared global state.pkg/protocols/http/request.go (1)
867-870: Good fix for preventing race conditions!Cloning the request before modifying its body prevents potential race conditions with the HTTP transport. This is the correct approach for safely generating the curl command.
pkg/templates/parser.go (1)
57-67: LGTM! Thread-safe accessor methods.The new accessor methods
ParsedCount()andCompiledCount()are properly implemented with appropriate locking to ensure thread safety when accessing the cache maps.pkg/protocols/protocols.go (1)
198-201: Good defensive programming!Initializing
templateCtxStoreif nil prevents potential nil pointer dereferences and ensures the method is safe to call even ifCreateTemplateCtxStore()wasn't called first.pkg/protocols/common/protocolstate/state.go (1)
203-203: Good fix for the race condition.Replacing the direct assignment to the global
LfaAllowedvariable with the thread-safeSetLfaAllowed(options)function call properly addresses the race condition issue mentioned in the PR objectives.pkg/protocols/common/protocolstate/file.go (2)
62-66: Good use of the new thread-safe accessor.The update to use
IsLfaAllowed(nil)instead of directly accessing the globalLfaAllowedvariable is correct and maintains backward compatibility while adding thread safety.
19-36: No action required: Dialers mutex and nil guards are correctThe
Dialersstruct embedssync.Mutex, so callingdialers.Lock()/Unlock()is idiomatic and safe. Additionally,IsLfaAllowedalready checksdialers != nil(along withok) before acquiring the lock. The use of the shareddialersstate here is intentional. Please disregard the previous concerns about synchronization and nil handling.Likely an incorrect or invalid review comment.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
@coderabbitai resolve |
✅ Actions performedComments resolved. |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
Makefile (2)
78-81:-failfastmay mask additional regressions; make it opt-in via env varHalting the test suite on the first failure speeds up CI feedback, but it also prevents spotting independent failures in the same run. CI logs that surface all failing packages in one shot are often more useful for triage.
Consider keeping the flag configurable instead of hard-coded:
- $(GOTEST) $(GOFLAGS) -failfast -p 1 ./... +TEST_EXTRA_FLAGS ?= + $(GOTEST) $(GOFLAGS) $(TEST_EXTRA_FLAGS) ./...Developers can still run
make test TEST_EXTRA_FLAGS="-failfast"locally, while CI can execute the full matrix.
78-81: Serialising packages with-p 1slows CI and hides cross-package racesSetting
-p 1disables the default per-package parallelism that often uncovers race conditions spanning shared resources. Given that this PR’s goal is to make the engine race-free, running tests in parallel packages is actually desirable as a safety net.If the motivation is to reduce flakiness for now, consider gating this with an environment toggle similar to the previous note, or at least add a TODO to remove once outstanding races are fixed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
Makefile(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/templates/compile.go:79-81
Timestamp: 2025-07-16T21:27:14.897Z
Learning: To make the template caching mechanism in pkg/templates/compile.go production-ready, DSLs need to be updated to use runtime options instead of cached variables, rather than restoring the Compile() calls on each request.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/external/customtemplates/github_test.go (1)
24-26: Consider removing commented code if the macOS issue is resolved.The commented skip logic suggests the macOS-specific failure has been addressed. Consider removing this commented code entirely if the issue is permanently resolved.
- // if osutils.IsOSX() { - // t.Skip("skipping on macos due to unknown failure (works locally)") - // }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
pkg/external/customtemplates/github_test.go(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/templates/compile.go:79-81
Timestamp: 2025-07-16T21:27:14.897Z
Learning: To make the template caching mechanism in pkg/templates/compile.go production-ready, DSLs need to be updated to use runtime options instead of cached variables, rather than restoring the Compile() calls on each request.
pkg/external/customtemplates/github_test.go (2)
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/templates/compile.go:79-81
Timestamp: 2025-07-16T21:27:14.897Z
Learning: In pkg/templates/compile.go, the template caching mechanism intentionally skips calling Compile() on copied requests to achieve performance benefits. This is the intended design, not a bug. The current implementation isn't production-ready but represents the desired direction.
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/templates/compile.go:59-78
Timestamp: 2025-07-16T21:28:08.062Z
Learning: The shallow copy behavior (`tplCopy := *value`) in pkg/templates/compile.go is intentional design for the template caching mechanism. The partial-copy approach is part of the performance optimization strategy, not a bug requiring deep copying.
🔇 Additional comments (4)
pkg/external/customtemplates/github_test.go (4)
5-5: LGTM: Import added for stdout writer implementation.The
osimport is correctly added to support the newstdoutWritertype.
10-10: LGTM: Import added for logging level types.The
levelsimport is correctly added to support the enhanced logging functionality.
16-21: LGTM: Clean adapter implementation for stdout logging.The
stdoutWritertype correctly implements the gologger writer interface, providing a simple way to direct log output to stdout for test debugging.
28-29: LGTM: Enhanced logging setup improves test observability.The logger configuration with debug level and stdout writer will provide better visibility into test execution, which is valuable for debugging the GitHub template download functionality.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/utils/capture_writer.go (1)
14-16: Consider thread safety and document the ignored level parameter.The
Writemethod implementation is straightforward, but there are a few considerations:
- Thread safety:
bytes.Bufferis not thread-safe. If this will be used in concurrent scenarios, consider adding synchronization.- Ignored level parameter: The
levelparameter is ignored, which may be intentional for testing but should be documented.- Error handling: While
bytes.Buffer.Writenever returns an error, the method signature suggests it might be implementing an interface that expects error handling.Consider documenting the behavior:
-func (w *CaptureWriter) Write(data []byte, level levels.Level) { +// Write appends data to the buffer, ignoring the log level for testing purposes +func (w *CaptureWriter) Write(data []byte, level levels.Level) { w.Buffer.Write(data) }If thread safety is needed:
+import "sync" + type CaptureWriter struct { Buffer *bytes.Buffer + mu sync.Mutex } func (w *CaptureWriter) Write(data []byte, level levels.Level) { + w.mu.Lock() + defer w.mu.Unlock() w.Buffer.Write(data) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pkg/external/customtemplates/github_test.go(2 hunks)pkg/utils/capture_writer.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/external/customtemplates/github_test.go
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/templates/compile.go:79-81
Timestamp: 2025-07-16T21:27:14.897Z
Learning: To make the template caching mechanism in pkg/templates/compile.go production-ready, DSLs need to be updated to use runtime options instead of cached variables, rather than restoring the Compile() calls on each request.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Tests (windows-latest)
- GitHub Check: Tests (macOS-latest)
- GitHub Check: Tests (ubuntu-latest)
🔇 Additional comments (2)
pkg/utils/capture_writer.go (2)
1-7: Clean package structure and appropriate imports.The package declaration and imports are well-organized. The dependency on
github.com/projectdiscovery/gologger/levelscorrectly reflects the method signature requirements.
9-12: Simple and focused struct design.The
CaptureWriterstruct is well-designed for its testing purpose with a clear comment explaining its intent. Using a pointer tobytes.Bufferis appropriate for efficiency.
|
Follow ups:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
pkg/js/libs/fs/fs.go (1)
3-8: Add fmt import for error handling.The
fmtpackage import is needed for the suggested error handling improvements in the context value extraction.Apply this diff to add the missing import:
import ( "context" + "fmt" "os" "github.com/projectdiscovery/nuclei/v3/pkg/protocols/common/protocolstate" )
♻️ Duplicate comments (1)
pkg/js/libs/fs/fs.go (1)
62-64: Add safe context value extraction to prevent runtime panics.Same issue as in
ListDir- the direct type assertion could panic if the context doesn't contain the expected key or value type.Apply the same fix as suggested for
ListDir:- executionId := ctx.Value("executionId").(string) + executionId, ok := ctx.Value("executionId").(string) + if !ok || executionId == "" { + return nil, fmt.Errorf("missing or invalid executionId in context") + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
pkg/js/libs/fs/fs.go(5 hunks)pkg/protocols/common/protocolstate/file.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/protocols/common/protocolstate/file.go
🧰 Additional context used
🧠 Learnings (1)
pkg/js/libs/fs/fs.go (2)
Learnt from: dwisiswant0
PR: projectdiscovery/nuclei#6290
File: pkg/fuzz/component/path_test.go:43-44
Timestamp: 2025-06-30T16:33:26.746Z
Learning: The user dwisiswant0 declined adding documentation for internal API changes in pkg/fuzz/component/path when the change was from nested URL field access (rebuilt.URL.Path, rebuilt.URL.String()) to direct field access (rebuilt.Path, rebuilt.String()), indicating they don't consider additional documentation necessary for such internal API modifications.
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/protocols/common/protocolstate/file.go:38-55
Timestamp: 2025-07-16T21:31:43.240Z
Learning: In pkg/protocols/common/protocolstate/file.go, the inconsistent behavior between SetLfaAllowed, GetLfaAllowed, and IsLfaAllowed functions is intentional as an interim step before implementing context-specific LFA logic. The current mixed approach of global and per-instance flags is a temporary transitional state, not a design flaw.
🧬 Code Graph Analysis (1)
pkg/js/libs/fs/fs.go (1)
pkg/protocols/common/protocolstate/file.go (1)
NormalizePathWithExecutionId(51-56)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Tests (ubuntu-latest)
- GitHub Check: Tests (macOS-latest)
- GitHub Check: Tests (windows-latest)
🔇 Additional comments (3)
pkg/js/libs/fs/fs.go (3)
80-81: Good delegation pattern for context propagation.The function correctly delegates to
ReadFileand passes the context through, avoiding duplication of the executionId extraction logic.
97-104: Consistent context propagation throughout the call chain.The function correctly propagates the context parameter to both
ListDirandReadFileAsStringcalls, maintaining consistency with the new context-aware API.
1-112: Overall assessment: Good alignment with concurrent execution objectives.The changes successfully transition the fs library to be context-aware, supporting the PR's objective of enabling concurrent Nuclei engines. The consistent pattern of extracting executionId from context and using
NormalizePathWithExecutionIdis well-implemented.The main concern is the unsafe type assertions that could cause runtime panics in production environments. Once the context value extraction is made safer, these changes will provide a solid foundation for execution ID-based path normalization and local file access control.
|
@coderabbitai create github issue for this and link this PR in issue. |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain@ehsandeep, I'll create a GitHub issue for the multi-threading client pools follow-up and link this PR. 🏁 Script executed: Length of output: 662 @ehsandeep, I've successfully created GitHub issue #6334 for the multi-threading client pools follow-up. The issue includes:
Issue URL: #6334 🐇✨ |
* support for concurrent nuclei engines * clarify LfaAllowed race * remove unused mutex * update LfaAllowed logic to prevent races until it can be reworked for per-execution ID * Update pkg/templates/parser.go Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> * debug tests * debug gh action * fixig gh template test * using atomic * using synclockmap * restore tests concurrency * lint * wiring executionId in js fs --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Co-authored-by: Mzack9999 <mzack9999@protonmail.com>
Proposed changes
This PR includes a handful of improvements that make it possible to run multiple Nuclei engines concurrently without data races. This PR also makes some progress on a reusable template cache, but that work isn't quite usable yet, and each engine needs its own parser stil.
One item open is how to handle the LfaAllowed global. This line in protocolstate/state.go still triggers a race unless commented out:
This one is tricky to fix since many things check for this variable without having an executionId handy (javascript
fsmodule, etc.).Checklist
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests