Conversation
- Replace gopkg.in/yaml.v2 with gopkg.in/yaml.v3 across the codebase for better compatibility and features. - Enhance YAML unmarshalling in various components to support new parsing capabilities. - Introduce a new InsertionOrderedStringMap type for maintaining order in YAML mappings. - Add tests for YAML preprocessing and decoding to ensure robustness and correctness.
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis PR upgrades YAML handling to yaml.v3 across the codebase, introduces caching for DSL expressions and regexes, refactors template caching/parsing with pooling and stricter YAML decoding, adds new YAML helpers for string slices, updates map unmarshalling to yaml.Node, and expands tests for caches, YAML preprocessing, decoding, and ordered maps. Changes
Sequence Diagram(s)sequenceDiagram
participant C as Caller
participant EC as exprcache (DSLCache)
participant GV as govaluate
C->>EC: GetCompiledDSLExpression(expr)
alt found in cache
EC-->>C: *EvaluableExpression
else compile
EC->>GV: NewEvaluableExpressionWithFunctions(expr, dslFns)
GV-->>EC: *EvaluableExpression | error
alt success
EC->>EC: write-lock + store
EC-->>C: *EvaluableExpression
else error
EC-->>C: error
end
end
C->>GV: Evaluate(values)
GV-->>C: result | error
sequenceDiagram
participant E as Extractor/Matcher
participant RC as regexcache
participant RE as regexp
E->>RC: GetCompiledRegex(pattern)
alt cached
RC-->>E: *Regexp
else compile
RC->>RE: Compile(pattern)
RE-->>RC: *Regexp | error
alt success
RC->>RC: write-lock + store
RC-->>E: *Regexp
else error
RC-->>E: error
end
end
sequenceDiagram
participant P as Parser
participant TP as templatePool
participant YU as yamlutil.PreProcess
participant YD as yaml.v3 Decoder
participant TC as Templates Cache
P->>TP: Get *Template
note over P: Acquire parsing semaphore
P->>YU: PreProcess(reader)
alt preprocessed data
P->>YD: decodeYAMLFromData(data, tmpl)
else direct read
P->>YD: decodeYAMLFromReader(reader, tmpl)
end
alt success
P->>TC: StoreWithoutRaw(id, tmpl, err=nil)
P-->>TP: Return tmpl to pool on next reuse
P-->>Caller: *Template
else error
P->>TP: Reset + Put tmpl
P-->>Caller: error
end
note over P: Release semaphore
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
pkg/protocols/common/generators/generators_test.go (1)
131-135: Fix panic-prone type assertion in TestParsePayloadsWithAggression after yaml.v3 migrationThe YAML decoder from
gopkg.in/yaml.v3now produces nested maps of typemap[string]interface{}, so the current assertion tomap[interface{}]interface{}will always panic at runtime in this test.Please apply the following changes:
• In pkg/protocols/common/generators/generators_test.go (around lines 131–135 in
TestParsePayloadsWithAggression):- parsed, err := parsePayloadsWithAggression(k, v.(map[interface{}]interface{}), aggression) - require.Nil(t, err, "could not parse payloads with aggression") + inner, ok := v.(map[string]interface{}) + require.True(t, ok, "unexpected payload type for %q: %T", k, v) + parsed, err := parsePayloadsWithAggression(k, inner, aggression) + require.NoError(t, err, "could not parse payloads with aggression")• In pkg/protocols/common/generators/generators.go (line 86): update the function signature to accept string-keyed maps:
- func parsePayloadsWithAggression(name string, v map[interface{}]interface{}, aggression string) (map[string]interface{}, error) { + func parsePayloadsWithAggression(name string, v map[string]interface{}, aggression string) (map[string]interface{}, error) {and adjust its internals (or introduce a small adapter) to work with
map[string]interface{}instead ofmap[interface{}]interface{}.These fixes will align both the tests and implementation with the behavior of
yaml.v3, preventing the panic and ensuring correct parsing.pkg/operators/matchers/match.go (1)
185-189: Prevent unbounded memory growth in DSLCache for per‐request expressionsThe global DSLCache used by
GetCompiledDSLExpressionis backed by a simplemap[string]*govaluate.EvaluableExpressionwith no size limit or eviction policy (see pkg/protocols/common/exprcache/exprcache.go:16–19). Calling it on every uniqueresolvedExpression—which can vary per target or request—will steadily bloat memory over long‐running scans.Locations to address:
- pkg/operators/matchers/match.go (around lines 185–189):
expression, err = exprcache.GetCompiledDSLExpression(resolvedExpression) if err != nil { logExpressionEvaluationFailure(matcher.Name, err) return false }Recommended refactors (pick one):
Compile without global cache
Add a “no-cache” helper inexprcache, e.g.:func CompileDSLExpressionNoCache(expr string) (*govaluate.EvaluableExpression, error) { return govaluate.NewEvaluableExpressionWithFunctions(expr, HelperFunctions) }Then update the call site:
- expression, err = exprcache.GetCompiledDSLExpression(resolvedExpression) + expression, err = exprcache.CompileDSLExpressionNoCache(resolvedExpression)Bounded in‐function LRU for the scan
If you still want caching across this matcher, embed a small LRU (e.g. 128 entries) scoped to the scan:// at start of scan: lru := lru.New(128) … // replace compile call: expression, err = lru.GetOrAdd(resolvedExpression, func() (*govaluate.EvaluableExpression, error) { return exprcache.GetCompiledDSLExpression(resolvedExpression) })Addressing this will prevent the cache from growing without bound. Let me know which approach you’d like to adopt.
pkg/templates/tag_filter.go (1)
432-437: Bug: ExcludeIds population checks the wrong map (uses block tags map).This mistakenly gates excludeIds insertion on filter.block membership rather than filter.excludeIds, causing unexpected coupling between tag blocking and id exclusion.
Apply this diff:
- for _, val := range splitCommaTrim(id) { - if _, ok := filter.block[val]; !ok { - filter.excludeIds[val] = struct{}{} - } - } + for _, val := range splitCommaTrim(id) { + if _, ok := filter.excludeIds[val]; !ok { + filter.excludeIds[val] = struct{}{} + } + }pkg/templates/compile.go (1)
409-425: Use srcOptions.TemplatePath for format detection and unify YAML decoding with KnownFields(strict).The switch currently inspects template.Path, which is unset at this stage. While YAML v3 can parse JSON, using the actual source path is more correct and avoids edge cases. Also, to align with the stricter yaml.v3 migration used in parser.go, consider decoding via yaml.Decoder with KnownFields toggled by srcOptions.Options.Validate (or a similar strictness flag).
- switch config.GetTemplateFormatFromExt(template.Path) { + switch config.GetTemplateFormatFromExt(srcOptions.TemplatePath) { case config.JSON: err = json.Unmarshal(data, template) case config.YAML: - err = yaml.Unmarshal(data, template) + { + dec := yaml.NewDecoder(bytes.NewReader(data)) + // Honor validation/strictness if enabled + if srcOptions != nil && srcOptions.Options != nil && srcOptions.Options.Validate { + dec.KnownFields(true) + } + err = dec.Decode(template) + } default: // assume its yaml - if err = yaml.Unmarshal(data, template); err != nil { + { + dec := yaml.NewDecoder(bytes.NewReader(data)) + if srcOptions != nil && srcOptions.Options != nil && srcOptions.Options.Validate { + dec.KnownFields(true) + } + if err = dec.Decode(template); err != nil { return nil, err - } + } + } }This keeps memory usage unchanged while aligning behavior with the new yaml.v3 strict decoding path.
🧹 Nitpick comments (52)
pkg/protocols/common/generators/generators_test.go (3)
108-111: Enable strict YAML decoding to catch unknown fields early.Since the PR aims for stricter YAML handling, toggle KnownFields on the v3 decoder in the test as well.
- err := yaml.NewDecoder(strings.NewReader(testPayload)).Decode(&payloads) - require.Nil(t, err, "could not unmarshal yaml") + dec := yaml.NewDecoder(strings.NewReader(testPayload)) + dec.KnownFields(true) + err := dec.Decode(&payloads) + require.NoError(t, err, "could not unmarshal yaml")
19-19: Use require.NoError for error checks (clearer intent).Switch from require.Nil(err) to require.NoError(err) for readability and better failure messages.
- require.Nil(t, err, "could not create generator") + require.NoError(t, err, "could not create generator")Apply similarly at Lines 39, 61, 110, and 134.
Also applies to: 39-39, 61-61, 110-110, 134-134
136-139: Strengthen assertions: verify content, not just length.Length-only checks can miss ordering/content regressions. Also prefer require.Len for clarity.
- gotValues := parsed[k].([]interface{}) - require.Equal(t, len(values), len(gotValues), "could not get correct number of values") + gotValues := parsed[k].([]interface{}) + require.Len(t, gotValues, len(values), "could not get correct number of values") + // Optional: verify actual values (cast to []string first) + strVals := make([]string, 0, len(gotValues)) + for _, iv := range gotValues { + strVals = append(strVals, iv.(string)) + } + for _, exp := range values { + require.Contains(t, strVals, exp, "missing expected value for aggression=%s", aggression) + }pkg/model/types/severity/severity_test.go (1)
19-19: Prefer require.NoError for error checks in testsUse require.NoError for clarity and better failure messages in tests.
- require.Nil(t, err, "could not marshal yaml") + require.NoError(t, err, "could not marshal yaml")pkg/model/model_test.go (3)
31-36: Use NoError to assert on error valuesNit: prefer require.NoError for readability.
- require.Nil(t, err) + require.NoError(t, err)
55-77: Brittle YAML string equality in testsAsserting exact YAML bytes can be brittle across emitter versions/whitespace changes. Consider structural comparison by unmarshalling both expected and actual and comparing maps/structs.
Example refactor:
- result, err := yaml.Marshal(&info) - require.Nil(t, err) - - expected := `name: Test Template Name -... (omitted for brevity) ... - string_key: string_value -` - require.Equal(t, expected, string(result)) + result, err := yaml.Marshal(&info) + require.NoError(t, err) + + var got map[string]any + require.NoError(t, yaml.Unmarshal(result, &got)) + + const expectedYAML = `name: Test Template Name +author: +- forgedhallpass +- ice3man +tags: +- cve +- misc +description: Test description +reference: Reference1 +severity: high +metadata: + array_key: + - array_value1 + - array_value2 + map_key: + key1: val1 + string_key: string_value +` + var expected map[string]any + require.NoError(t, yaml.Unmarshal([]byte(expectedYAML), &expected)) + require.Equal(t, expected, got)
55-56: Also switch to NoError hereMirror the earlier nit at this call site.
- require.Nil(t, err) + require.NoError(t, err)pkg/utils/yaml/yaml_decode_wrapper.go (1)
9-10: Enable KnownFields on yaml.v3 decoder to catch unknown config keys.With yaml.v3 we can reject unknown fields to avoid silent misconfigurations. This is low risk, and you can make it toggleable if needed.
import ( "io" "strings" "github.com/go-playground/validator/v10" "github.com/pkg/errors" - "gopkg.in/yaml.v3" + "gopkg.in/yaml.v3" ) @@ func DecodeAndValidate(r io.Reader, v interface{}) error { - if err := yaml.NewDecoder(r).Decode(v); err != nil { + dec := yaml.NewDecoder(r) + dec.KnownFields(true) + if err := dec.Decode(v); err != nil { return err }Also applies to: 16-17
pkg/protocols/common/automaticscan/automaticscan.go (1)
37-37: yaml.v3 import: consider strict decoding and log parse errors for the mapping file.We currently ignore decode errors for the Wappalyzer mapping. Enabling KnownFields helps catch typos; logging the error aids diagnosis and doesn’t change control flow.
- if file, err := os.Open(mappingFile); err == nil { - _ = yaml.NewDecoder(file).Decode(&mappingData) - _ = file.Close() - } + if file, err := os.Open(mappingFile); err == nil { + dec := yaml.NewDecoder(file) + dec.KnownFields(true) + if derr := dec.Decode(&mappingData); derr != nil { + gologger.Verbose().Msgf("Failed to parse %s: %v", mappingFile, derr) + } + _ = file.Close() + }Also applies to: 75-81
pkg/catalog/config/ignorefile.go (1)
8-9: Harden YAML parsing: KnownFields and graceful handling of empty files (EOF).Unknown keys in nuclei-ignore can be silently ignored today. Also, empty files yield io.EOF which we treat as a parse error. Recommend strict fields and treating EOF as empty config.
import ( "os" "runtime/debug" "github.com/projectdiscovery/gologger" "gopkg.in/yaml.v3" + "io" + "errors" ) @@ ignore := IgnoreFile{} - if err := yaml.NewDecoder(file).Decode(&ignore); err != nil { - gologger.Error().Msgf("Could not parse nuclei-ignore file: %s\n", err) - return IgnoreFile{} - } + dec := yaml.NewDecoder(file) + dec.KnownFields(true) + if err := dec.Decode(&ignore); err != nil { + if errors.Is(err, io.EOF) { + return IgnoreFile{} + } + gologger.Error().Msgf("Could not parse nuclei-ignore file: %s\n", err) + return IgnoreFile{} + }Also applies to: 28-33
pkg/utils/yaml/preprocess.go (2)
46-52: Good fix: avoid negative slice; but capture only indentation bytes.The guard prevents panics when no prior newline exists—nice. However, padBytes now includes the newline itself (starting at lastNewLineIndex), which is intentional for subsequent replacement, but it also mixes concerns (newline + indent). Consider splitting into (newline) and (indent) to make downstream logic clearer and to enable clean first-line padding.
- var padBytes []byte - if lastNewLineIndex >= 0 { - padBytes = data[lastNewLineIndex:matchIndex] - } else { - padBytes = data[:matchIndex] - } + var padBytes []byte + if lastNewLineIndex >= 0 { + // keep only indentation (exclude the newline) + padBytes = data[lastNewLineIndex+1 : matchIndex] + } else { + padBytes = data[:matchIndex] + }
70-72: Also indent the first line of the included content (optional).Current logic indents lines after the first by replacing “\n” with “\n”. The first line remains unindented, which can break YAML blocks when “# !include:” appears under indentation. Prefix the first line with the indent as well.
- if len(padBytes) > 0 && !bytes.Equal(padBytes, []byte("\n")) { - includeFileContent = bytes.ReplaceAll(includeFileContent, []byte("\n"), padBytes) - } + if len(padBytes) > 0 { + // indent subsequent lines by replacing newline with newline+indent + includeFileContent = bytes.ReplaceAll(includeFileContent, []byte("\n"), append([]byte("\n"), padBytes...)) + // and indent the first line too + includeFileContent = append(padBytes, includeFileContent...) + }If CRLF normalization matters, consider normalizing to LF before replacement to avoid mixed line endings.
pkg/model/types/stringslice/yamlhelper.go (4)
29-38: Default-trim items to avoid surprising whitespace; normalize after trimming.Without a normalizer, a scalar like "a, b ,c" yields ["a", " b ", "c"]. Trim by default, then apply the optional normalizer for consistent behavior across scalar and sequence forms.
- result := strings.Split(v, ",") - if normalizer != nil { - for i, value := range result { - result[i] = normalizer.Normalize(value) - } - } + result := strings.Split(v, ",") + for i := range result { + token := strings.TrimSpace(result[i]) + if normalizer != nil { + token = normalizer.Normalize(token) + } + result[i] = token + } return result, nil @@ - if normalizer != nil { - v = normalizer.Normalize(v) - } + v = strings.TrimSpace(v) + if normalizer != nil { + v = normalizer.Normalize(v) + } out = append(out, v)Also applies to: 41-52
15-18: Defensive checks for nil nodes and alias targets.Node should be non-nil, but a guard prevents panics if callers misuse this helper. Also ensure alias targets are non-nil.
func UnmarshalYAMLNode(node *yaml.Node, normalizer StringNormalizer) ([]string, error) { + if node == nil { + return nil, fmt.Errorf("stringslice: nil YAML node") + } if node.Kind == yaml.DocumentNode && len(node.Content) > 0 { node = node.Content[0] } @@ case yaml.AliasNode: - return UnmarshalYAMLNode(node.Alias, normalizer) + if node.Alias == nil { + return nil, fmt.Errorf("stringslice: invalid alias with nil target") + } + return UnmarshalYAMLNode(node.Alias, normalizer)Also applies to: 54-55
60-62: More helpful error details for unsupported kinds.Printing Kind as an int is opaque. Include kind and tag to aid debugging.
- return nil, fmt.Errorf("stringslice: expected string or sequence, got %v", node.Kind) + return nil, fmt.Errorf("stringslice: expected scalar or sequence, got kind=%d tag=%q", node.Kind, node.Tag)
22-24: Confirm null semantics (error vs. empty slice).You error out on !!null (scalar) and also when Kind==0. Confirm callers rely on hard-fail rather than treating null as empty slice for backward compatibility. I can align all string-slice unmarshallers accordingly if needed.
Also applies to: 57-58
pkg/operators/common/regexcache/cache.go (2)
15-33: Avoid duplicate compilations under contention; consider singleflight or a double-check.Two goroutines can compile the same pattern concurrently, wasting CPU. Add a second check under the write lock, or integrate singleflight for correctness + throughput.
func GetCompiledRegex(pattern string) (*regexp.Regexp, error) { regexCacheLock.RLock() if compiled, exists := regexCache[pattern]; exists { regexCacheLock.RUnlock() return compiled, nil } regexCacheLock.RUnlock() - compiled, err := regexp.Compile(pattern) - if err != nil { - return nil, err - } - - regexCacheLock.Lock() - regexCache[pattern] = compiled - regexCacheLock.Unlock() + compiled, err := regexp.Compile(pattern) + if err != nil { + return nil, err + } + regexCacheLock.Lock() + if existing, ok := regexCache[pattern]; ok { + regexCacheLock.Unlock() + return existing, nil + } + regexCache[pattern] = compiled + regexCacheLock.Unlock() return compiled, nil }If you prefer zero duplicate compile without holding the lock during compilation, I can wire golang.org/x/sync/singleflight.
35-47: Unbounded cache growth; add eviction and observability.Patterns are user-controlled in some contexts. Consider an LRU with a sane cap and simple metrics (hit/miss, size) to prevent memory bloat and to validate perf wins.
I can drop in a tiny LRU using hashicorp/golang-lru or a lock-free sync.Map+size cap depending on your constraints.
pkg/utils/insertion_ordered_map.go (1)
121-134: Set/ForEach behavior depends on keys/values coherence.With the reset fix applied, Set/ForEach remain correct. Without it, ForEach may fetch nil values for phantom keys. Please add a focused test covering reuse of InsertionOrderedStringMap to lock this invariant.
I can add a unit test demonstrating the bug and verifying the fix.
pkg/utils/yaml/yaml_decode_wrapper_test.go (1)
183-188: Adjust the “invalid validation” test to actually trigger InvalidValidationError.Currently YAML decode fails before validation because target is a non-pointer string. Decode into a string pointer so YAML succeeds and validator sees a non-struct.
- { - name: "invalid validation error test", - yamlInput: `name: "test"`, - target: "invalid_target", // This should cause InvalidValidationError - expectError: true, - errorType: "invalid_validation", - }, + { + name: "invalid validation error test", + yamlInput: `"hello"`, + target: new(string), // YAML decodes fine; validator gets non-struct + expectError: true, + errorType: "invalid_validation", + },pkg/utils/yaml/preprocess_test.go (3)
262-271: Restore StrictSyntax after mutation in this test, too.Unlike TestPreProcess, this test doesn’t save/restore StrictSyntax; although tests are not parallel, restoring state is safer and self-contained.
@@ - StrictSyntax = false - result, err := PreProcess([]byte(input)) + original := StrictSyntax + StrictSyntax = false + defer func() { StrictSyntax = original }() + result, err := PreProcess([]byte(input))
194-207: Rename “large recursive includes” to “large include file” (not recursive).This test creates a single large file and includes it once; the name suggests recursion that isn’t present.
- name: "large recursive includes", + name: "large include file",
381-400: Benchmarks: add b.ReportAllocs and document StrictSyntax state.Report allocations to quantify memory improvements, and set StrictSyntax explicitly to avoid accidental perf skew from global changes.
@@ func BenchmarkPreProcess_NoIncludes(b *testing.B) { - b.ResetTimer() + StrictSyntax = false + b.ReportAllocs() + b.ResetTimer() @@ func BenchmarkPreProcess_WithIncludes(b *testing.B) { - b.ResetTimer() + StrictSyntax = false + b.ReportAllocs() + b.ResetTimer() @@ func BenchmarkPreProcess_MultipleIncludes(b *testing.B) { - b.ResetTimer() + StrictSyntax = false + b.ReportAllocs() + b.ResetTimer() @@ func BenchmarkPreProcess_NestedIncludes(b *testing.B) { - b.ResetTimer() + StrictSyntax = false + b.ReportAllocs() + b.ResetTimer()Also applies to: 423-430, 454-461, 508-515
pkg/utils/insertion_ordered_map_test.go (5)
14-68: YAML mixed-type expectations are tightly coupled to stringification semantics.The test asserts that non-string scalars become strings (e.g., 42 → "42"). If the implementation evolves to preserve native types under yaml.v3, these assertions will fail. Consider clarifying the contract in InsertionOrderedStringMap’s UnmarshalYAML doc or add a separate test that explicitly validates the intended “all scalars become strings” rule.
136-176: JSON tests: expectedKeys is declared but never asserted.Either assert iteration order (if your UnmarshalJSON preserves order) or drop expectedKeys from the JSON-table to avoid confusion.
@@ - expectedKeys []string expectedValues map[string]interface{} @@ - expectedKeys: []string{"b", "a", "c"}, // JSON doesn't preserve order, so we get map iteration order expectedValues: map[string]interface{}{"a": "first", "b": "second", "c": "third"},Also applies to: 188-198
262-282: float32 stringification can be flaky; prefer stable formatting in tests.fmt.Sprint(float32(3.14)) is usually "3.14", but binary rounding can surface on some platforms. To avoid flakes, compare numerically or format with %.2f if you need exact decimals.
- {"float32", float32(3.14), "3.14"}, + {"float32", float32(3.14), "3.14"}, // consider numeric compare if flakes show upIf you want to harden now:
- {"float32", float32(3.14), "3.14"}, + {"float32", float32(3.14), fmt.Sprintf("%.2f", float32(3.14))},(then adapt toString test to use fmt.Sprintf similarly)
440-449: Test name/comment mismatch: this YAML isn’t malformed.The case named “yaml with malformed structure” uses a valid simple mapping. Rename to avoid confusion.
- t.Run("yaml with malformed structure", func(t *testing.T) { + t.Run("yaml simple mapping (parser forgivingness)", func(t *testing.T) { @@ - assert.NoError(t, err) // YAML parser is forgiving + assert.NoError(t, err)
370-379: Benchmarks: add b.ReportAllocs to capture memory impact.Given the PR’s performance/memory goals, tracking allocs here is useful.
@@ func BenchmarkInsertionOrderedStringMap_Set(b *testing.B) { - b.ResetTimer() + b.ReportAllocs() + b.ResetTimer() @@ func BenchmarkInsertionOrderedStringMap_UnmarshalJSON(b *testing.B) { - b.ResetTimer() + b.ReportAllocs() + b.ResetTimer()Also applies to: 427-438
pkg/protocols/common/exprcache/exprcache.go (3)
69-77: Unbounded cache can grow without limit; consider a size cap or LRU.If expressions are template/user-driven, the map can grow indefinitely. Consider an LRU (e.g., simple ring or groupcache/lru) or an optional MaxSize with random/oldest eviction to bound memory.
16-27: Minor: document constructors and clarify functions binding immutability.Add doc comments describing thread-safety, typical usage, and that the functions map is captured at construction time.
57-66: Clear() recreates the map—good. Consider returning previous size for observability.A return value can help callers log/measure effect when clearing hot caches.
-func (ec *ExpressionCache) Clear() { +func (ec *ExpressionCache) Clear() int { ec.cacheLock.Lock() - ec.cache = make(map[string]*govaluate.EvaluableExpression) + prev := len(ec.cache) + ec.cache = make(map[string]*govaluate.EvaluableExpression) ec.cacheLock.Unlock() + return prev }(Adjust call sites/tests accordingly.)
pkg/protocols/common/exprcache/exprcache_test.go (3)
108-129: Concurrency test doesn’t detect duplicate compilations.The current test ensures safety and single entry, but it can’t detect wasted duplicate compilations during races. After applying double-checked locking, consider instrumenting compilation counts by injecting a compile hook (e.g., an optional compileFn in ExpressionCache) and assert that only one compilation occurred under parallel Get calls.
I can propose a minimal refactor to add a test-only compile hook if you want to validate the deduction.
165-176: Benchmarks: include allocation metrics.Add b.ReportAllocs() to quantify memory impact of caching.
@@ func BenchmarkExpressionCache_Get(b *testing.B) { - b.ResetTimer() + b.ReportAllocs() + b.ResetTimer() @@ func BenchmarkExpressionCache_GetParallel(b *testing.B) { - b.ResetTimer() + b.ReportAllocs() + b.ResetTimer()Also applies to: 178-191
131-151: Global caches smoke tests look good. Consider a DSL function assertion if stable.If dsl.HelperFunctions contains stable built-ins, add one small assertion using a known DSL function to ensure functions binding works (optional; only if the function set is stable across versions).
pkg/templates/templates_test.go (1)
12-36: Optional: run this test in parallel.This test is read-only and doesn’t mutate globals; you can add t.Parallel() to speed up the suite.
func TestTemplateStruct(t *testing.T) { + t.Parallel()pkg/operators/matchers/compile.go (1)
46-49: Preserve the underlying compile error for diagnosabilityReturning only the pattern loses the root cause. Wrap the error so operators get actionable details.
Apply this diff:
- compiled, err := regexcache.GetCompiledRegex(regex) - if err != nil { - return fmt.Errorf("could not compile regex: %s", regex) - } + compiled, err := regexcache.GetCompiledRegex(regex) + if err != nil { + return fmt.Errorf("could not compile regex %q: %w", regex, err) + }pkg/protocols/common/expressions/variables.go (1)
118-126: Edge-case: treat nil compiled expressions as non-literalIf GetCompiledDSLExpression returns (nil, nil), the current logic returns true, potentially skipping unresolved-variable detection incorrectly. Safer default is false.
Apply this diff:
- expr, err := exprcache.GetCompiledDSLExpression(data) - if err != nil { - return false - } - if expr != nil { - _, err = expr.Evaluate(nil) - return err == nil - } - return true + expr, err := exprcache.GetCompiledDSLExpression(data) + if err != nil || expr == nil { + return false + } + _, err = expr.Evaluate(nil) + return err == nilIf other callers depend on the previous “nil means true” behavior, let’s document and codify it instead. Otherwise, this change reduces false negatives.
pkg/model/types/stringslice/yamlhelper_test.go (1)
1-200: Solid, thorough coverage of YAML node-based unmarshallingTests exercise scalar, sequence, aliasing, normalization, and error cases. Benchmarks are a nice touch for tracking perf of the new helpers.
Two small gaps you might consider:
- Add an alias case with normalization enabled to ensure alias targets go through the normalizer.
- Add a case with trailing/leading commas in scalar input (e.g., ",one,two,"), asserting trimming/normalization behavior is as intended.
pkg/templates/tag_filter.go (3)
448-453: Wrap include-condition compilation error with context.Returning the raw error makes it harder to pinpoint which condition failed. Wrap with the offending condition and preserve the cause.
Apply this diff within the function and add the import:
@@ - compiled, err := exprcache.GetCompiledDSLExpression(includeCondition) - if err != nil { - return nil, err - } + compiled, err := exprcache.GetCompiledDSLExpression(includeCondition) + if err != nil { + return nil, fmt.Errorf("invalid include-condition %q: %w", includeCondition, err) + } filter.includeConditions[includeCondition] = compiledOutside the selected range, add:
import ( + "fmt" "bufio" "errors" "io"
330-335: Comment vs. behavior mismatch on condition-evaluation errors.The comment says “skip” on evaluation error, but the code returns false, rejecting the template. Either fix the comment or change logic to continue as intended.
If “skip” is correct, use:
- if err != nil { - // Using debug as the failure here might be legitimate (eg. template not having optional metadata fields => missing required fields) - gologger.Debug().Msgf("The expression condition couldn't be evaluated correctly for template \"%s\": %s\n", template.ID, err) - return false - } + if err != nil { + // Using debug as the failure here might be legitimate (e.g., optional metadata missing) + gologger.Debug().Msgf("The expression condition couldn't be evaluated correctly for template %q: %v", template.ID, err) + continue + }If rejecting is desired, update the comment to reflect that includeConditions must all evaluate successfully.
231-235: Close raw HTTP request bodies after reading.Minor resource hygiene: ensure Body is closed after io.ReadAll to avoid leaks in non-trivial readers.
- body, _ := io.ReadAll(rawHttpReq.Body) + body, _ := io.ReadAll(rawHttpReq.Body) + _ = rawHttpReq.Body.Close() bodies = append(bodies, string(body))pkg/model/types/stringslice/stringslice_raw.go (1)
17-23: Node-based UnmarshalYAML is clean; confirm null handling compatibility.The new UnmarshalYAMLNode returns an error on nulls (“stringslice: null values not supported”). Previous behavior may have accepted null/empty as empty slice. Please confirm this won’t break existing templates and update docs/tests accordingly if this is an intentional tightening.
If you want to preserve prior semantics (treat null as empty), consider:
func (rawStringSlice *RawStringSlice) UnmarshalYAML(node *yaml.Node) error { - result, err := UnmarshalYAMLNode(node, rawStringSlice) + if node == nil || node.Kind == 0 || node.Tag == "!!null" { + rawStringSlice.Value = []string{} + return nil + } + result, err := UnmarshalYAMLNode(node, rawStringSlice) if err != nil { return err } rawStringSlice.Value = result return nil }pkg/operators/extractors/compile.go (1)
23-28: Preserve underlying regex compilation error for diagnosability.Wrap the original error; current message omits the cause.
- compiled, err := regexcache.GetCompiledRegex(regex) - if err != nil { - return fmt.Errorf("could not compile regex: %s", regex) - } + compiled, err := regexcache.GetCompiledRegex(regex) + if err != nil { + return fmt.Errorf("could not compile regex %q: %w", regex, err) + }pkg/model/types/stringslice/stringslice.go (1)
115-139: Remove unused helpermarshalStringToSliceThis function is only defined in
pkg/model/types/stringslice/stringslice.go(lines 115–139) and is never invoked elsewhere in the repo. You can safely delete it to eliminate dead code.• File:
pkg/model/types/stringslice/stringslice.go
– Delete themarshalStringToSlicefunction (lines 115–139)pkg/protocols/common/expressions/expressions.go (2)
13-16: Remove commented-out helper.Leftover commented code should be dropped to keep the file clean.
-// func GetCompiledDSLExpression(expression string) (*govaluate.EvaluableExpression, error) { -// return exprcache.GetCompiledDSLExpression(expression) -// }
133-143: isExpression may pollute caches during detection; consider a non-caching parse.The detection path attempts compilation and may cache many transient strings during scanning, increasing memory pressure. If feasible, consider a “peek” parse that doesn’t populate the cache (or a short-lived/size-bounded cache) for detection.
pkg/templates/compile.go (3)
66-71: Prefer current filePath for TemplatePath on cache-hit.You restore TemplatePath from the cached value. For safety and clearer logging (symlinks, relative vs absolute paths), set it to the current filePath.
- template.Options.TemplatePath = value.Options.TemplatePath + template.Options.TemplatePath = filePathIf needed, also ensure template.Path is consistent when returning the cached instance.
106-106: Remove leftover commented log.Dead/commented code adds noise.
- // options.Logger.Error().Msgf("returning cached template %s after recompiling %d requests", tplCopy.Options.TemplateID, tplCopy.Requests())
175-186: Nil-safety for Global Matchers options.Minor: isGlobalMatchersEnabled dereferences template.Options.Options without a nil check. Realistically, Options should be set, but if a future call path changes this assumption, it would panic.
- if !template.Options.Options.EnableGlobalMatchersTemplates { + if template.Options == nil || template.Options.Options == nil || !template.Options.Options.EnableGlobalMatchersTemplates { return false }pkg/templates/cache.go (2)
52-67: Unsafe zero-copy conversion on pooled buffers can corrupt cached raw.conversion.String(raw) is zero-copy. If raw comes from a reusable buffer (e.g., sync.Pool), later writes can mutate the cached string. You currently use StoreWithoutRaw in the parser path, which avoids this, but please document this contract or defensively copy when the source is from a pool.
- if raw != nil { - entry.raw = conversion.String(raw) + if raw != nil { + // If raw originates from a pooled/temporary buffer, copy it to avoid aliasing. + // entry.raw = conversion.String(raw) // zero-copy; use only with immutable backing + entry.raw = string(append([]byte(nil), raw...))Alternatively, keep zero-copy but add a comment stating the precondition: raw must not be reused/mutated.
89-93: Update comment: Purge doesn’t “return all objects to pool.”Purge clears the map; pooled items are handled by GC. Fix the doc to avoid confusion.
-// Purge the cache and return all objects to pool +// Purge clears the cache; pooled objects are not explicitly reclaimed.pkg/templates/parser.go (1)
171-179: Bounded parsing concurrency is a sensible default.The parsingSemaphore guard (size 10) is a pragmatic limit. Consider making it configurable in the future if workloads vary widely, but this is fine for now.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (34)
cmd/integration-test/http.go(1 hunks)go.mod(1 hunks)pkg/catalog/config/ignorefile.go(1 hunks)pkg/fuzz/type.go(2 hunks)pkg/model/model_test.go(1 hunks)pkg/model/types/severity/severity_test.go(1 hunks)pkg/model/types/stringslice/stringslice.go(2 hunks)pkg/model/types/stringslice/stringslice_raw.go(2 hunks)pkg/model/types/stringslice/yamlhelper.go(1 hunks)pkg/model/types/stringslice/yamlhelper_test.go(1 hunks)pkg/operators/common/regexcache/cache.go(1 hunks)pkg/operators/extractors/compile.go(3 hunks)pkg/operators/matchers/compile.go(3 hunks)pkg/operators/matchers/match.go(2 hunks)pkg/protocols/common/automaticscan/automaticscan.go(1 hunks)pkg/protocols/common/exprcache/exprcache.go(1 hunks)pkg/protocols/common/exprcache/exprcache_test.go(1 hunks)pkg/protocols/common/expressions/expressions.go(3 hunks)pkg/protocols/common/expressions/variables.go(2 hunks)pkg/protocols/common/generators/generators_test.go(1 hunks)pkg/protocols/common/variables/variables_test.go(1 hunks)pkg/reporting/reporting.go(1 hunks)pkg/templates/cache.go(2 hunks)pkg/templates/compile.go(2 hunks)pkg/templates/parser.go(6 hunks)pkg/templates/tag_filter.go(2 hunks)pkg/templates/templates.go(2 hunks)pkg/templates/templates_test.go(1 hunks)pkg/utils/insertion_ordered_map.go(2 hunks)pkg/utils/insertion_ordered_map_test.go(1 hunks)pkg/utils/yaml/preprocess.go(2 hunks)pkg/utils/yaml/preprocess_test.go(1 hunks)pkg/utils/yaml/yaml_decode_wrapper.go(1 hunks)pkg/utils/yaml/yaml_decode_wrapper_test.go(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-16T21:27:14.937Z
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/templates/compile.go:79-81
Timestamp: 2025-07-16T21:27:14.937Z
Learning: To make the template caching mechanism in pkg/templates/compile.go production-ready, DSLs need to be updated to use runtime options instead of cached variables, rather than restoring the Compile() calls on each request.
Applied to files:
pkg/operators/matchers/match.gopkg/protocols/common/expressions/expressions.gopkg/operators/matchers/compile.gopkg/templates/tag_filter.gopkg/protocols/common/expressions/variables.gopkg/operators/extractors/compile.gopkg/templates/compile.gopkg/templates/parser.go
📚 Learning: 2025-07-16T21:27:14.937Z
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/templates/compile.go:79-81
Timestamp: 2025-07-16T21:27:14.937Z
Learning: In pkg/templates/compile.go, the template caching mechanism intentionally skips calling Compile() on copied requests to achieve performance benefits. This is the intended design, not a bug. The current implementation isn't production-ready but represents the desired direction.
Applied to files:
pkg/operators/matchers/compile.gopkg/operators/extractors/compile.gopkg/templates/compile.go
📚 Learning: 2025-07-16T21:28:08.073Z
Learnt from: hdm
PR: projectdiscovery/nuclei#6322
File: pkg/templates/compile.go:59-78
Timestamp: 2025-07-16T21:28:08.073Z
Learning: The shallow copy behavior (`tplCopy := *value`) in pkg/templates/compile.go is intentional design for the template caching mechanism. The partial-copy approach is part of the performance optimization strategy, not a bug requiring deep copying.
Applied to files:
pkg/templates/compile.go
🧬 Code graph analysis (18)
pkg/model/types/stringslice/yamlhelper_test.go (3)
pkg/model/types/stringslice/yamlhelper.go (2)
StringNormalizer(11-13)UnmarshalYAMLNode(15-63)pkg/model/types/stringslice/stringslice.go (1)
StringSlice(30-32)pkg/model/types/stringslice/stringslice_raw.go (1)
RawStringSlice(5-7)
pkg/operators/matchers/match.go (1)
pkg/protocols/common/exprcache/exprcache.go (1)
GetCompiledDSLExpression(83-85)
pkg/model/types/stringslice/stringslice_raw.go (1)
pkg/model/types/stringslice/yamlhelper.go (1)
UnmarshalYAMLNode(15-63)
pkg/protocols/common/expressions/expressions.go (2)
pkg/protocols/common/exprcache/exprcache.go (2)
GetCompiledDSLExpression(83-85)GetCompiledExpression(79-81)pkg/operators/common/dsl/dsl.go (1)
FunctionNames(19-19)
pkg/utils/yaml/preprocess_test.go (1)
pkg/utils/yaml/preprocess.go (2)
StrictSyntax(18-18)PreProcess(21-82)
pkg/operators/matchers/compile.go (2)
pkg/operators/common/regexcache/cache.go (1)
GetCompiledRegex(15-33)pkg/protocols/common/exprcache/exprcache.go (1)
GetCompiledDSLExpression(83-85)
pkg/model/types/stringslice/yamlhelper.go (1)
pkg/utils/utils.go (1)
IsBlank(17-19)
pkg/templates/templates.go (1)
pkg/protocols/common/automaticscan/automaticscan.go (1)
New(69-119)
pkg/protocols/common/exprcache/exprcache_test.go (1)
pkg/protocols/common/exprcache/exprcache.go (5)
ExpressionCache(10-14)NewWithFunctions(22-27)New(16-20)GetCompiledExpression(79-81)GetCompiledDSLExpression(83-85)
pkg/templates/tag_filter.go (1)
pkg/protocols/common/exprcache/exprcache.go (1)
GetCompiledDSLExpression(83-85)
pkg/protocols/common/exprcache/exprcache.go (1)
pkg/operators/common/dsl/dsl.go (1)
HelperFunctions(18-18)
pkg/protocols/common/expressions/variables.go (1)
pkg/protocols/common/exprcache/exprcache.go (1)
GetCompiledDSLExpression(83-85)
pkg/operators/extractors/compile.go (2)
pkg/operators/common/regexcache/cache.go (1)
GetCompiledRegex(15-33)pkg/protocols/common/exprcache/exprcache.go (1)
GetCompiledDSLExpression(83-85)
pkg/utils/yaml/yaml_decode_wrapper_test.go (1)
pkg/utils/yaml/yaml_decode_wrapper.go (1)
DecodeAndValidate(15-34)
pkg/utils/insertion_ordered_map_test.go (1)
pkg/utils/insertion_ordered_map.go (3)
InsertionOrderedStringMap(11-14)NewEmptyInsertionOrderedStringMap(16-21)NewInsertionOrderedStringMap(23-30)
pkg/model/types/stringslice/stringslice.go (1)
pkg/model/types/stringslice/yamlhelper.go (1)
UnmarshalYAMLNode(15-63)
pkg/templates/compile.go (2)
pkg/templates/templates.go (1)
Template(36-165)pkg/protocols/protocols.go (1)
ExecutorOptions(61-141)
pkg/templates/parser.go (3)
pkg/loader/parser/parser.go (1)
Parser(7-11)pkg/templates/cache.go (2)
NewCache(19-28)Cache(11-16)pkg/utils/yaml/preprocess.go (1)
PreProcess(21-82)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Lint
🔇 Additional comments (29)
pkg/protocols/common/generators/generators_test.go (1)
8-8: YAML v3 import LGTM; watch for downstream type changes.The import bump to gopkg.in/yaml.v3 is correct. Note that v3 commonly yields map[string]interface{} when decoding into interface{}-typed values; make sure tests (and parse helpers) aren’t still asserting map[interface{}]interface{}.
Run the script in my other comment to confirm the signature and expected map type of parsePayloadsWithAggression.
pkg/model/types/severity/severity_test.go (1)
6-6: YAML v3 migration import looks goodThe switch to gopkg.in/yaml.v3 is correct. No behavior change expected for yaml.Marshal/yaml.Unmarshal used here.
pkg/model/model_test.go (1)
11-11: YAML v3 import LGTMThe import upgrade to gopkg.in/yaml.v3 is consistent with the repo-wide migration.
pkg/protocols/common/variables/variables_test.go (1)
9-9: YAML v3 import LGTMThe import switch to gopkg.in/yaml.v3 aligns with the migration; usage via yaml.Unmarshal remains compatible.
go.mod (1)
343-343: yaml.v2 marked indirect: no remaining v2 imports or outdated UnmarshalYAML signaturesAll checks passed—there are no direct yaml.v2 references or old-style UnmarshalYAML implementations in the codebase.
• No lingering
gopkg.in/yaml.v2imports found in.gofiles.
• Every YAML import points togopkg.in/yaml.v3.
• Nofunc (…) UnmarshalYAML(func(...) error)signatures detected.cmd/integration-test/http.go (1)
17-17: YAML v3 import change is correctThe migration to gopkg.in/yaml.v3 matches the rest of the codebase; the yaml.Unmarshal call sites in this file remain compatible.
pkg/fuzz/type.go (2)
127-129: Alias handling looks good.Recursive resolution is correct for yaml.v3 nodes.
9-9: No legacy UnmarshalYAML(callback) signatures detectedAll
UnmarshalYAMLimplementations now use the node-based signature (func (… ) UnmarshalYAML(node *yaml.Node) error), and no v2-style callback overloads (func(interface{}) error) remain.pkg/utils/yaml/yaml_decode_wrapper_test.go (1)
1-424: Solid coverage and useful benchmarks.Great breadth: valid/invalid cases, nested, slices, multiple reader types, concurrency, and perf. After fixing validator init, the concurrency test should also pass under -race.
pkg/utils/yaml/preprocess_test.go (1)
40-72: Great coverage of include expansion, indentation, and strict-mode behaviors.The table-driven tests validate core scenarios thoroughly, including recursive preprocessing and indentation padding. This will guard regressions around preprocess.go’s padBytes logic.
pkg/templates/templates_test.go (1)
9-10: yaml.v3 migration looks correct.Import switch only; usage remains the same. Good alignment with the broader repo migration.
pkg/operators/matchers/compile.go (2)
9-11: Good move: centralizing compilation via cachesReplacing direct regexp/govaluate compilation with regexcache/exprcache aligns with the PR’s memory/performance goals and reduces duplicate work across templates.
62-69: LGTM: DSL compilation via exprcacheUsing exprcache here ensures per-template expressions are compiled once and reused, which should cut CPU and allocations on hot paths. The error wrapping with dsl.CompilationError is retained.
pkg/operators/matchers/match.go (1)
12-12: Import swap to exprcache is consistent with the new caching layerThis keeps DSL compile paths uniform across the codebase.
pkg/templates/templates.go (1)
31-31: YAML v3 migration acknowledgedImport swap to gopkg.in/yaml.v3 is expected given the broader repo move.
pkg/protocols/common/expressions/variables.go (1)
8-9: Import change to exprcache is consistentCentralizing DSL compilation here removes direct govaluate/dsl coupling from this package.
pkg/templates/tag_filter.go (1)
16-16: Good move: centralize DSL compilation via exprcache.This aligns with our prior learning to avoid per-request Compile() and use cached expressions. Nice step toward the caching-based design.
pkg/model/types/stringslice/stringslice_raw.go (1)
3-4: YAML v3 migration looks correct.Switching to gopkg.in/yaml.v3 is consistent with the repo-wide move.
pkg/operators/extractors/compile.go (2)
9-11: Imports updated to use caches — aligns with perf goals.Using regexcache and exprcache is consistent with the memory/perf objectives and prior guidance to centralize compilation.
46-51: DSL compilation via exprcache is correct, with structured error.Using dsl.CompilationError keeps error context while benefiting from caching.
pkg/model/types/stringslice/stringslice.go (2)
10-10: YAML v3 import — consistent with the migration.
64-71: Nice consolidation: delegate YAML parsing and normalization to UnmarshalYAMLNode.This reduces duplication and keeps normalization consistent across Raw and normalized types.
pkg/protocols/common/expressions/expressions.go (3)
7-7: Switch to exprcache import — good for centralized caching.
19-24: Eval path: cached compilation is correct.This preserves behavior while eliminating redundant compilations.
60-61: Evaluation path updated to cached compilation — OK.Keeps evaluation semantics, improves perf via caching.
pkg/templates/compile.go (1)
58-75: Shallow-copy cache reuse is OK (matches intended design).The shallow-copy of Template on cache-hit is intentional and aligns with the direction captured in prior learnings; avoiding deep copies is the point. The follow-up fixes above ensure safe option rebinding without re-compilation.
pkg/templates/cache.go (2)
10-16: Nice: lightweight parsedTemplate pooling and copy-on-set.Using a pooled parsedTemplate and storing by value avoids extra heap churn while keeping map contention low. Looks good.
44-48: No action needed — template cache callers already handle nil raw (resolved).Quick summary: I audited usages of Cache.Has / parsedTemplatesCache.Has. Call sites either ignore the raw bytes or already check raw != nil — none rely on len(raw)==0.
Locations reviewed:
- pkg/templates/compile.go — uses "_, raw, err := parser.parsedTemplatesCache.Has(filePath)" and checks "err == nil && raw != nil".
- pkg/templates/parser.go — uses "value, _, err := p.parsedTemplatesCache.Has(templatePath)" and only uses the returned template (ignores raw).
- pkg/catalog/loader/loader.go — "template, _, _ := templatesCache.Has(templatePath)" (raw ignored).
- pkg/templates/cache_test.go — tests call "data, _, err := templates.Has(...)" and don't inspect raw.
- pkg/templates/cache.go — current implementation returns nil for absent raw (snippet below).
Relevant snippet (pkg/templates/cache.go):
if value.raw != "" { return value.template, []byte(value.raw), value.err } return value.template, nil, value.err }Conclusion: No callers found that must be changed to raw != nil checks. Marking this review comment resolved.
pkg/templates/parser.go (1)
156-170: Good use of pooled buffers/readers for preprocessing.Buffered io.Copy + yamlutil.PreProcess with sync.Pool is clean and should reduce GC pressure for large templates. LGTM.
| // UnmarshalYAML implements yaml.Unmarshaler interface. | ||
| func (v *SliceOrMapSlice) UnmarshalYAML(callback func(interface{}) error) error { | ||
| // try to unmarshal it as a string and fallback to map | ||
| if err := callback(&v.Value); err == nil { | ||
| return nil | ||
| } | ||
| func (v *SliceOrMapSlice) UnmarshalYAML(node *yaml.Node) error { | ||
| switch node.Kind { | ||
| case yaml.ScalarNode, yaml.SequenceNode: | ||
| return node.Decode(&v.Value) | ||
|
|
There was a problem hiding this comment.
Zero the alternate field to avoid stale state on reuse.
When decoding into Value, ensure KV is cleared. Otherwise pooled/reused structs can carry stale KV and produce ambiguous state.
case yaml.ScalarNode, yaml.SequenceNode:
- return node.Decode(&v.Value)
+ v.KV = nil
+ return node.Decode(&v.Value)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // UnmarshalYAML implements yaml.Unmarshaler interface. | |
| func (v *SliceOrMapSlice) UnmarshalYAML(callback func(interface{}) error) error { | |
| // try to unmarshal it as a string and fallback to map | |
| if err := callback(&v.Value); err == nil { | |
| return nil | |
| } | |
| func (v *SliceOrMapSlice) UnmarshalYAML(node *yaml.Node) error { | |
| switch node.Kind { | |
| case yaml.ScalarNode, yaml.SequenceNode: | |
| return node.Decode(&v.Value) | |
| // UnmarshalYAML implements yaml.Unmarshaler interface. | |
| func (v *SliceOrMapSlice) UnmarshalYAML(node *yaml.Node) error { | |
| switch node.Kind { | |
| case yaml.ScalarNode, yaml.SequenceNode: | |
| // Clear any previously-set KV data to avoid stale state | |
| v.KV = nil | |
| return node.Decode(&v.Value) |
🤖 Prompt for AI Agents
In pkg/fuzz/type.go around lines 93 to 98, the UnmarshalYAML method currently
decodes into v.Value without clearing the alternate field, which can leave stale
KV data when structs are reused; update the method so that before decoding into
Value you set v.KV = nil (and likewise, before decoding into KV clear v.Value)
to zero the alternate field and avoid stale state from pooled/reused structs,
then proceed with the node.Decode call.
| case yaml.MappingNode: | ||
| // Handle as ordered map to preserve order | ||
| tmpx := mapsutil.NewOrderedMap[string, string]() | ||
| // preserve order | ||
| for _, v := range node { | ||
| tmpx.Set(v.Key.(string), v.Value.(string)) | ||
|
|
||
| // Process key-value pairs in order | ||
| for i := 0; i < len(node.Content); i += 2 { | ||
| if i+1 >= len(node.Content) { | ||
| break | ||
| } | ||
|
|
||
| keyNode := node.Content[i] | ||
| valueNode := node.Content[i+1] | ||
|
|
||
| var key string | ||
| if err := keyNode.Decode(&key); err != nil { | ||
| continue | ||
| } | ||
|
|
||
| var value string | ||
| if err := valueNode.Decode(&value); err != nil { | ||
| continue | ||
| } | ||
|
|
||
| tmpx.Set(key, value) | ||
| } | ||
| v.KV = &tmpx | ||
| return nil | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion
Initialize/reset fields and capacity; consider surfacing decode errors instead of silent continue.
- Reset Value when decoding a mapping to avoid stale data.
- Pre-size the map for fewer allocations.
- Swallowing key/value decode errors hides malformed YAML; consider failing fast or collecting errors.
- case yaml.MappingNode:
+ case yaml.MappingNode:
// Handle as ordered map to preserve order
- tmpx := mapsutil.NewOrderedMap[string, string]()
+ tmpx := mapsutil.NewOrderedMap[string, string]()
+ v.Value = nil
@@
- if err := keyNode.Decode(&key); err != nil {
- continue
- }
+ if err := keyNode.Decode(&key); err != nil {
+ return fmt.Errorf("invalid mapping key: %w", err)
+ }
@@
- if err := valueNode.Decode(&value); err != nil {
- continue
- }
+ if err := valueNode.Decode(&value); err != nil {
+ return fmt.Errorf("invalid mapping value for key %q: %w", key, err)
+ }If you want to keep a permissive path, I can wire an option flag to control “continue vs. error”.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| case yaml.MappingNode: | |
| // Handle as ordered map to preserve order | |
| tmpx := mapsutil.NewOrderedMap[string, string]() | |
| // preserve order | |
| for _, v := range node { | |
| tmpx.Set(v.Key.(string), v.Value.(string)) | |
| // Process key-value pairs in order | |
| for i := 0; i < len(node.Content); i += 2 { | |
| if i+1 >= len(node.Content) { | |
| break | |
| } | |
| keyNode := node.Content[i] | |
| valueNode := node.Content[i+1] | |
| var key string | |
| if err := keyNode.Decode(&key); err != nil { | |
| continue | |
| } | |
| var value string | |
| if err := valueNode.Decode(&value); err != nil { | |
| continue | |
| } | |
| tmpx.Set(key, value) | |
| } | |
| v.KV = &tmpx | |
| return nil | |
| case yaml.MappingNode: | |
| // Handle as ordered map to preserve order | |
| tmpx := mapsutil.NewOrderedMap[string, string]() | |
| v.Value = nil | |
| // Process key-value pairs in order | |
| for i := 0; i < len(node.Content); i += 2 { | |
| if i+1 >= len(node.Content) { | |
| break | |
| } | |
| keyNode := node.Content[i] | |
| valueNode := node.Content[i+1] | |
| var key string | |
| if err := keyNode.Decode(&key); err != nil { | |
| return fmt.Errorf("invalid mapping key: %w", err) | |
| } | |
| var value string | |
| if err := valueNode.Decode(&value); err != nil { | |
| return fmt.Errorf("invalid mapping value for key %q: %w", key, err) | |
| } | |
| tmpx.Set(key, value) | |
| } | |
| v.KV = &tmpx | |
| return nil |
🤖 Prompt for AI Agents
In pkg/fuzz/type.go around lines 99 to 126, when decoding a yaml.MappingNode you
must reset v.Value (or relevant field) before populating KV, pre-size the
ordered map to node.Content length/2 to avoid extra allocations, and stop
silently swallowing decode errors: return or accumulate and return the first
decode error (or wire a permissive flag to choose continue vs fail-fast).
Specifically, clear/reset the target value(s) at the start of the mapping
branch, create the ordered map with capacity = len(node.Content)/2, replace the
continue on key/value Decode failures with error propagation (or call a
passed-in option to allow continuing), and set v.KV to the populated map only
after successful processing.
| func (ec *ExpressionCache) Get(expression string) (*govaluate.EvaluableExpression, error) { | ||
| ec.cacheLock.RLock() | ||
| if compiled, exists := ec.cache[expression]; exists { | ||
| ec.cacheLock.RUnlock() | ||
| return compiled, nil | ||
| } | ||
| ec.cacheLock.RUnlock() | ||
|
|
||
| var compiled *govaluate.EvaluableExpression | ||
| var err error | ||
|
|
||
| if ec.functions != nil { | ||
| compiled, err = govaluate.NewEvaluableExpressionWithFunctions(expression, ec.functions) | ||
| } else { | ||
| compiled, err = govaluate.NewEvaluableExpression(expression) | ||
| } | ||
|
|
||
| if err != nil { | ||
| return nil, err | ||
| } | ||
|
|
||
| ec.cacheLock.Lock() | ||
| ec.cache[expression] = compiled | ||
| ec.cacheLock.Unlock() | ||
|
|
||
| return compiled, nil | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Avoid duplicate compilation under contention (double-checked locking).
Two goroutines can race to compile the same expression and one result is discarded. It’s safe but wastes CPU. Add a second check under the write lock.
func (ec *ExpressionCache) Get(expression string) (*govaluate.EvaluableExpression, error) {
- ec.cacheLock.RLock()
- if compiled, exists := ec.cache[expression]; exists {
- ec.cacheLock.RUnlock()
- return compiled, nil
- }
- ec.cacheLock.RUnlock()
-
- var compiled *govaluate.EvaluableExpression
- var err error
-
- if ec.functions != nil {
- compiled, err = govaluate.NewEvaluableExpressionWithFunctions(expression, ec.functions)
- } else {
- compiled, err = govaluate.NewEvaluableExpression(expression)
- }
- if err != nil {
- return nil, err
- }
-
- ec.cacheLock.Lock()
- ec.cache[expression] = compiled
- ec.cacheLock.Unlock()
- return compiled, nil
+ ec.cacheLock.RLock()
+ if compiled, ok := ec.cache[expression]; ok {
+ ec.cacheLock.RUnlock()
+ return compiled, nil
+ }
+ ec.cacheLock.RUnlock()
+
+ // Compile without holding locks
+ var compiledNew *govaluate.EvaluableExpression
+ var err error
+ if ec.functions != nil {
+ compiledNew, err = govaluate.NewEvaluableExpressionWithFunctions(expression, ec.functions)
+ } else {
+ compiledNew, err = govaluate.NewEvaluableExpression(expression)
+ }
+ if err != nil {
+ return nil, err
+ }
+
+ // Second-check under write lock to avoid duplicate inserts
+ ec.cacheLock.Lock()
+ if compiledExisting, ok := ec.cache[expression]; ok {
+ ec.cacheLock.Unlock()
+ return compiledExisting, nil
+ }
+ ec.cache[expression] = compiledNew
+ ec.cacheLock.Unlock()
+ return compiledNew, nil
}🤖 Prompt for AI Agents
In pkg/protocols/common/exprcache/exprcache.go around lines 29 to 55, the
current Get method can compile the same expression twice under contention; after
acquiring the write lock to store a newly compiled expression, add a second
check (double-checked locking) to see if the expression was populated by another
goroutine while this one was compiling; if it exists, discard the local compiled
value and return the cached one, otherwise store and return the compiled
expression. Ensure the read lock/unlock and write lock/unlock semantics remain
correct and that errors from compilation are still returned unchanged.
|
|
||
| "go.uber.org/multierr" | ||
| "gopkg.in/yaml.v2" | ||
| "gopkg.in/yaml.v3" |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
yaml.v3 migration looks good; close the Encoder and set indentation.
yaml.v3’s Encoder buffers output until Close is called. Here we’re not closing it, which can lead to truncated/partial writes on some writers. Also, explicitly setting indent keeps the file formatting stable across versions.
Apply this diff inside CreateConfigIfNotExists():
- err = yaml.NewEncoder(reportingFile).Encode(options)
- return err
+ enc := yaml.NewEncoder(reportingFile)
+ enc.SetIndent(2)
+ if err = enc.Encode(options); err != nil {
+ return err
+ }
+ if cerr := enc.Close(); cerr != nil {
+ return errkit.Append(errkit.New("could not finalize yaml encoding"), cerr)
+ }
+ return nilRun to spot other yaml.NewEncoder usages missing Close:
Also applies to: 236-237
🏁 Script executed:
#!/bin/bash
rg -n --type=go -C2 '\byaml\.NewEncoder\('Length of output: 533
Ensure YAML encoders are closed and indentation is explicitly set
YAML v3’s Encoder buffers output until Close is called. Omitting Close can lead to truncated or partial writes. Explicitly setting the indent also stabilizes formatting across versions. Please apply the following mandatory refactors:
• pkg/reporting/reporting.go (CreateConfigIfNotExists)
– Replace the one-liner encoder with a named encoder.
– Call SetIndent(2) before encoding.
– Handle Encode errors, then call Close() and propagate any Close error (e.g. via errkit).
--- pkg/reporting/reporting.go
@@ -234,4 +234,13 @@ func CreateConfigIfNotExists(...) error {
- err = yaml.NewEncoder(reportingFile).Encode(options)
- return err
+ enc := yaml.NewEncoder(reportingFile)
+ enc.SetIndent(2)
+ if err = enc.Encode(options); err != nil {
+ return err
+ }
+ if cerr := enc.Close(); cerr != nil {
+ return errkit.Append(errkit.New("could not finalize yaml encoding"), cerr)
+ }
+ return nil
}• cmd/tmc/main.go (around lines 386–388)
– indent is already set via yamlIndentSpaces, but Close() is never called.
– After Encode(infoBlock), invoke yamlEncoder.Close() and handle any returned error (wrapping or appending as per existing conventions).
--- cmd/tmc/main.go
@@ -386,3 +386,7 @@ func updateInfoBlock(...) error {
- yamlEncoder := yaml.NewEncoder(&newInfoBlock)
- yamlEncoder.SetIndent(yamlIndentSpaces)
- err = yamlEncoder.Encode(infoBlock)
+ yamlEncoder := yaml.NewEncoder(&newInfoBlock)
+ yamlEncoder.SetIndent(yamlIndentSpaces)
+ if err = yamlEncoder.Encode(infoBlock); err != nil {
+ return fmt.Errorf("yaml encode infoBlock: %w", err)
+ }
+ if cerr := yamlEncoder.Close(); cerr != nil {
+ return fmt.Errorf("error closing yaml encoder: %w", cerr)
+ }These changes ensure that all buffered data is flushed and formatting remains consistent.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In pkg/reporting/reporting.go around line 17 (CreateConfigIfNotExists), replace
the one-liner yaml.NewEncoder with a named encoder variable, call
yamlEncoder.SetIndent(2) before encoding, check and return any error from
yamlEncoder.Encode(...), then call yamlEncoder.Close() and propagate any Close()
error (use errkit wrapping/propagation consistent with project conventions); and
in cmd/tmc/main.go around lines 386–388, after calling Encode(infoBlock) call
yamlEncoder.Close() and handle/return or append any Close() error consistent
with surrounding error handling.
| template.Options.ApplyNewEngineOptions(options) | ||
| if template.CompiledWorkflow != nil { | ||
| template.CompiledWorkflow.Options.ApplyNewEngineOptions(options) | ||
| for _, w := range template.CompiledWorkflow.Workflows { | ||
| for _, ex := range w.Executers { | ||
| ex.Options.ApplyNewEngineOptions(options) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Cache-hit path misses options rebind for non-workflow executer (stale engine options at runtime).
On cache-hit, options are propagated to CompiledWorkflow executers, but the main template.Executer (non-workflow path) is not updated. This can cause reused executers to run with stale Output/RateLimiter/Logger/etc., leading to cross-scan leakage or incorrect behavior.
This does not require deep copies (which we want to avoid per prior design); we only need to rebind the engine options on the executer if it supports it.
Proposed minimal, non-invasive fix (duck-typing; no recompile, no deep copy):
template.Options.ApplyNewEngineOptions(options)
if template.CompiledWorkflow != nil {
template.CompiledWorkflow.Options.ApplyNewEngineOptions(options)
for _, w := range template.CompiledWorkflow.Workflows {
for _, ex := range w.Executers {
ex.Options.ApplyNewEngineOptions(options)
}
}
}
+ // Rebind engine options for non-workflow executer (if present)
+ if template.Executer != nil {
+ // Prefer an ApplyNewEngineOptions method if exposed by the executer
+ if applier, ok := any(template.Executer).(interface {
+ ApplyNewEngineOptions(*protocols.ExecutorOptions)
+ }); ok {
+ applier.ApplyNewEngineOptions(template.Options)
+ } else if setter, ok := any(template.Executer).(interface {
+ SetOptions(*protocols.ExecutorOptions)
+ }); ok {
+ setter.SetOptions(template.Options)
+ }
+ }To be safe, please verify whether the underlying executer type(s) expose either ApplyNewEngineOptions or SetOptions; if not, I can add a small adapter to tmplexec to support rebinding without recompilation.
Run to confirm availability:
🏁 Script executed:
#!/bin/bash
# Inspect executer types for option-rebinding hooks
rg -nP --type=go -C3 '\b(ApplyNewEngineOptions|SetOptions)\s*\('Length of output: 7741
🏁 Script executed:
#!/bin/bash
rg -nP 'type \w*Template\s+struct' -n pkg/templatesLength of output: 240
🏁 Script executed:
#!/bin/bash
# Locate and display the Template struct definition
start_line=$(grep -n "type Template struct" -m1 pkg/templates/templates.go | cut -d: -f1)
if [ -z "$start_line" ]; then
echo "Template struct not found"
exit 1
fi
# Show 50 lines from the struct definition
end_line=$((start_line + 50))
sed -n "${start_line},${end_line}p" pkg/templates/templates.go
# Search specifically for an Executer field in the Template struct
rg -n "Executer" -C2 pkg/templates/templates.goLength of output: 3170
🏁 Script executed:
#!/bin/bash
# Display the Executer interface definition in protocols.go
rg -nP 'type Executer\s+interface' -n pkg/protocols/protocols.go -C5Length of output: 537
Add explicit option-rebinding support for the main template.Executer
We’ve verified that the protocols.Executer held in Template.Executer does not currently expose any method named ApplyNewEngineOptions or SetOptions (so the proposed duck-typing check will never succeed). To ensure the non-workflow executor gets its engine options rebound on cache-hit, please make the following mandatory changes:
• Define a small interface for rebinding, e.g.:
// In pkg/protocols/protocols.go (or a new file under pkg/templates):
type OptionRebinder interface {
UpdateOptions(*ExecutorOptions)
}• Implement it on the types used as Template.Executer (e.g. in pkg/templates/tmplexec.go):
func (e *executor) UpdateOptions(opts *protocols.ExecutorOptions) {
// rebind the same way workflow executers do
e.Options.ApplyNewEngineOptions(opts)
}• In pkg/templates/compile.go, immediately after applying to CompiledWorkflow, add:
// existing workflow-path rebinding...
if template.CompiledWorkflow != nil {
…
}
+ // Rebind engine options on the main executer (cache-hit path)
+ if rebinder, ok := template.Executer.(OptionRebinder); ok {
+ rebinder.UpdateOptions(options)
+ }These changes ensure that, on a cache hit, both the workflow executers and the main template.Executer are updated with the latest ExecutorOptions.
🤖 Prompt for AI Agents
In pkg/templates/compile.go around lines 74 to 82, the code updates engine
options for CompiledWorkflow and its workflow executers but never rebinding the
main Template.Executer; add explicit option-rebinding support by 1) defining a
small interface (e.g. OptionRebinder with
UpdateOptions(*protocols.ExecutorOptions)) in pkg/protocols (or a new file under
pkg/protocols), 2) implement UpdateOptions on the concrete types used as
Template.Executer (e.g. in pkg/templates/tmplexec.go) to call their internal
Options.ApplyNewEngineOptions(opts), and 3) after applying options to
CompiledWorkflow in compile.go, check if template.Executer implements the
OptionRebinder interface and call UpdateOptions with the same options so the
top-level executor is updated on cache-hit.
| func (insertionOrderedStringMap *InsertionOrderedStringMap) UnmarshalYAML(node *yaml.Node) error { | ||
| if node.Kind != yaml.MappingNode { | ||
| return fmt.Errorf("insertion_ordered_map: expected mapping, got %v", node.Kind) | ||
| } | ||
|
|
||
| insertionOrderedStringMap.values = make(map[string]interface{}) | ||
| for _, v := range data { | ||
| if v.Key == nil { | ||
|
|
||
| // Process key-value pairs in order | ||
| for i := 0; i < len(node.Content); i += 2 { | ||
| if i+1 >= len(node.Content) { | ||
| break | ||
| } | ||
|
|
||
| keyNode := node.Content[i] | ||
| valueNode := node.Content[i+1] | ||
|
|
||
| var key string | ||
| if err := keyNode.Decode(&key); err != nil { | ||
| continue | ||
| } | ||
| insertionOrderedStringMap.Set(v.Key.(string), toString(v.Value)) | ||
|
|
||
| var value interface{} | ||
| if err := valueNode.Decode(&value); err != nil { | ||
| continue | ||
| } | ||
|
|
||
| insertionOrderedStringMap.Set(key, toString(value)) | ||
| } | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Stale keys bug: keys slice is not reset during YAML unmarshal.
You reinitialize the values map but not the keys slice. Reused instances will retain old keys; ForEach will emit keys absent from values with nil values. This causes subtle corruption.
func (insertionOrderedStringMap *InsertionOrderedStringMap) UnmarshalYAML(node *yaml.Node) error {
- if node.Kind != yaml.MappingNode {
+ if node.Kind == yaml.AliasNode {
+ return insertionOrderedStringMap.UnmarshalYAML(node.Alias)
+ }
+ if node.Kind != yaml.MappingNode {
return fmt.Errorf("insertion_ordered_map: expected mapping, got %v", node.Kind)
}
- insertionOrderedStringMap.values = make(map[string]interface{})
+ // Reset internal state; pre-size for fewer allocations
+ capacity := len(node.Content) / 2
+ insertionOrderedStringMap.keys = make([]string, 0, capacity)
+ insertionOrderedStringMap.values = make(map[string]interface{}, capacity)
@@
- if err := keyNode.Decode(&key); err != nil {
- continue
- }
+ if err := keyNode.Decode(&key); err != nil {
+ return fmt.Errorf("invalid key: %w", err)
+ }
@@
- if err := valueNode.Decode(&value); err != nil {
- continue
- }
+ if err := valueNode.Decode(&value); err != nil {
+ return fmt.Errorf("invalid value for key %q: %w", key, err)
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func (insertionOrderedStringMap *InsertionOrderedStringMap) UnmarshalYAML(node *yaml.Node) error { | |
| if node.Kind != yaml.MappingNode { | |
| return fmt.Errorf("insertion_ordered_map: expected mapping, got %v", node.Kind) | |
| } | |
| insertionOrderedStringMap.values = make(map[string]interface{}) | |
| for _, v := range data { | |
| if v.Key == nil { | |
| // Process key-value pairs in order | |
| for i := 0; i < len(node.Content); i += 2 { | |
| if i+1 >= len(node.Content) { | |
| break | |
| } | |
| keyNode := node.Content[i] | |
| valueNode := node.Content[i+1] | |
| var key string | |
| if err := keyNode.Decode(&key); err != nil { | |
| continue | |
| } | |
| insertionOrderedStringMap.Set(v.Key.(string), toString(v.Value)) | |
| var value interface{} | |
| if err := valueNode.Decode(&value); err != nil { | |
| continue | |
| } | |
| insertionOrderedStringMap.Set(key, toString(value)) | |
| } | |
| return nil | |
| } | |
| func (insertionOrderedStringMap *InsertionOrderedStringMap) UnmarshalYAML(node *yaml.Node) error { | |
| // Handle YAML aliases (anchors) | |
| if node.Kind == yaml.AliasNode { | |
| return insertionOrderedStringMap.UnmarshalYAML(node.Alias) | |
| } | |
| if node.Kind != yaml.MappingNode { | |
| return fmt.Errorf("insertion_ordered_map: expected mapping, got %v", node.Kind) | |
| } | |
| // Reset internal state; pre-size for fewer allocations | |
| capacity := len(node.Content) / 2 | |
| insertionOrderedStringMap.keys = make([]string, 0, capacity) | |
| insertionOrderedStringMap.values = make(map[string]interface{}, capacity) | |
| // Process key-value pairs in order | |
| for i := 0; i < len(node.Content); i += 2 { | |
| if i+1 >= len(node.Content) { | |
| break | |
| } | |
| keyNode := node.Content[i] | |
| valueNode := node.Content[i+1] | |
| var key string | |
| if err := keyNode.Decode(&key); err != nil { | |
| return fmt.Errorf("invalid key: %w", err) | |
| } | |
| var value interface{} | |
| if err := valueNode.Decode(&value); err != nil { | |
| return fmt.Errorf("invalid value for key %q: %w", key, err) | |
| } | |
| insertionOrderedStringMap.Set(key, toString(value)) | |
| } | |
| return nil | |
| } |
🤖 Prompt for AI Agents
In pkg/utils/insertion_ordered_map.go around lines 36 to 65, the UnmarshalYAML
reinitializes the values map but doesn’t reset the keys slice so reused
instances retain stale keys; before populating from the YAML node, clear or
reassign insertionOrderedStringMap.keys to an empty slice (e.g. nil or
make([]string,0)) so the keys list matches the new values, then proceed to
decode and call Set as before; ensure any early-continue on decode errors does
not leave partial/inconsistent state by only appending keys when a value is
successfully decoded.
| name: "multiple includes with different indentation", | ||
| input: fmt.Sprintf(`level1: | ||
| # !include:%s | ||
| level2: | ||
| # !include:%s`, includeFile1, includeFile2), | ||
| expected: fmt.Sprintf(`level1: | ||
| %s | ||
| level2: | ||
| %s`, | ||
| strings.ReplaceAll(includeContent1, "\n", "\n "), | ||
| strings.ReplaceAll(includeContent2, "\n", "\n ")), | ||
| }, |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Add a case for “same file included at different indentation levels” (exposes a padBytes/positioning bug).
PreProcess computes padBytes using bytes.Index on the first occurrence of the matched substring. If the same include directive appears multiple times with different indentation, padding for later occurrences can be derived from the first match’s indentation, producing mis-indented expansions. Add a test for identical include paths at different depths to lock this down.
Apply this diff to extend the tests:
@@
{
name: "multiple includes with different indentation",
@@
strings.ReplaceAll(includeContent2, "\n", "\n ")),
},
+ {
+ name: "same file included at different indentation",
+ input: fmt.Sprintf(`# top level
+# !include:%[1]s
+parent:
+ child:
+ # !include:%[1]s`, includeFile1),
+ expected: fmt.Sprintf(`top level
+%[1]s
+parent:
+ child:
+ %[2]s`,
+ includeContent1,
+ strings.ReplaceAll(includeContent1, "\n", "\n ")),
+ },Follow-up (in production code pkg/utils/yaml/preprocess.go): switch to FindAllSubmatchIndex (or equivalent) and compute padBytes from each match’s concrete start index instead of bytes.Index on the full buffer. For example:
@@
- importMatches := reImportsPattern.FindAllSubmatch(data, -1)
+ importMatches := reImportsPattern.FindAllSubmatchIndex(data, -1)
@@
- for _, match := range importMatches {
- matchBytes := match[0]
- matchString = string(matchBytes)
- if len(match) > 0 {
- includeFileName = string(match[1])
- }
- matchIndex := bytes.Index(data, matchBytes)
+ for _, loc := range importMatches {
+ // loc: [fullStart, fullEnd, submatch1Start, submatch1End, ...]
+ fullStart, fullEnd := loc[0], loc[1]
+ incStart, incEnd := loc[2], loc[3]
+ matchIndex := fullStart
+ matchBytes := data[fullStart:fullEnd]
+ matchString = string(matchBytes)
+ includeFileName = string(data[incStart:incEnd])
@@
- lastNewLineIndex := bytes.LastIndex(data[:matchIndex], []byte("\n"))
+ lastNewLineIndex := bytes.LastIndex(data[:matchIndex], []byte("\n"))This avoids indentation bleed between identical includes at different offsets. I can send a focused PR patch if helpful.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| name: "multiple includes with different indentation", | |
| input: fmt.Sprintf(`level1: | |
| # !include:%s | |
| level2: | |
| # !include:%s`, includeFile1, includeFile2), | |
| expected: fmt.Sprintf(`level1: | |
| %s | |
| level2: | |
| %s`, | |
| strings.ReplaceAll(includeContent1, "\n", "\n "), | |
| strings.ReplaceAll(includeContent2, "\n", "\n ")), | |
| }, | |
| { | |
| name: "multiple includes with different indentation", | |
| input: fmt.Sprintf(`level1: | |
| # !include:%s | |
| level2: | |
| # !include:%s`, includeFile1, includeFile2), | |
| expected: fmt.Sprintf(`level1: | |
| %s | |
| level2: | |
| %s`, | |
| strings.ReplaceAll(includeContent1, "\n", "\n "), | |
| strings.ReplaceAll(includeContent2, "\n", "\n ")), | |
| }, | |
| { | |
| name: "same file included at different indentation", | |
| input: fmt.Sprintf(`# top level | |
| # !include:%[1]s | |
| parent: | |
| child: | |
| # !include:%[1]s`, includeFile1), | |
| expected: fmt.Sprintf(`top level | |
| %[1]s | |
| parent: | |
| child: | |
| %[2]s`, | |
| includeContent1, | |
| strings.ReplaceAll(includeContent1, "\n", "\n ")), | |
| }, |
| if os.Getuid() == 0 { // Skip if running as root | ||
| t.Skip("Skipping permission test when running as root") | ||
| } | ||
|
|
||
| restrictedFile := filepath.Join(tempDir, "restricted.yaml") | ||
| content := "restricted: content" | ||
| err := os.WriteFile(restrictedFile, []byte(content), 0000) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| return fmt.Sprintf("# !include:%s", restrictedFile), nil | ||
| }, | ||
| expectError: true, |
There was a problem hiding this comment.
Fix: os.Getuid is unix-only; the test won’t build on Windows.
os.Getuid is not available on Windows. Gate the test by OS or use a portable approach. Also restore StrictSyntax after mutation.
Apply this diff to make the test portable:
@@
- if os.Getuid() == 0 { // Skip if running as root
- t.Skip("Skipping permission test when running as root")
- }
+ // Skip on non-Unix OSes and when running as root on Unix
+ if runtime.GOOS == "windows" || runtime.GOOS == "plan9" {
+ t.Skip("Skipping permission test on non-Unix OS")
+ }
+ if os.Geteuid() == 0 {
+ t.Skip("Skipping permission test when running as root")
+ }
@@
- restrictedFile := filepath.Join(tempDir, "restricted.yaml")
+ restrictedFile := filepath.Join(tempDir, "restricted.yaml")
content := "restricted: content"
err := os.WriteFile(restrictedFile, []byte(content), 0000)And at the top of this file:
@@
-import (
+import (
"fmt"
"os"
"path/filepath"
"strings"
"testing"
+ "runtime"Note: os.Geteuid is available on Unix. If you need a stricter cross-platform behavior, consider skipping this case entirely on non-Unix.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if os.Getuid() == 0 { // Skip if running as root | |
| t.Skip("Skipping permission test when running as root") | |
| } | |
| restrictedFile := filepath.Join(tempDir, "restricted.yaml") | |
| content := "restricted: content" | |
| err := os.WriteFile(restrictedFile, []byte(content), 0000) | |
| if err != nil { | |
| return "", err | |
| } | |
| return fmt.Sprintf("# !include:%s", restrictedFile), nil | |
| }, | |
| expectError: true, | |
| diff --git a/pkg/utils/yaml/preprocess_test.go b/pkg/utils/yaml/preprocess_test.go | |
| index abcdef0..1234567 100644 | |
| ++ b/pkg/utils/yaml/preprocess_test.go | |
| @@ -1,6 +1,7 @@ | |
| import ( | |
| "fmt" | |
| "os" | |
| "path/filepath" | |
| "strings" | |
| "testing" | |
| "runtime" | |
| ) | |
| func TestPreprocess(t *testing.T) { | |
| @@ -241,10 +242,14 @@ func TestPreprocess(t *testing.T) { | |
| // Mutate the input to point to a file with no permissions | |
| func() (string, error) { | |
| - if os.Getuid() == 0 { // Skip if running as root | |
| - t.Skip("Skipping permission test when running as root") | |
| // Skip on non-Unix OSes and when running as root on Unix | |
| if runtime.GOOS == "windows" || runtime.GOOS == "plan9" { | |
| t.Skip("Skipping permission test on non-Unix OS") | |
| } | |
| if os.Geteuid() == 0 { | |
| t.Skip("Skipping permission test when running as root") | |
| } | |
| restrictedFile := filepath.Join(tempDir, "restricted.yaml") | |
| content := "restricted: content" | |
| err := os.WriteFile(restrictedFile, []byte(content), 0000) | |
| if err != nil { | |
| return "", err | |
| } | |
| return fmt.Sprintf("# !include:%s", restrictedFile), nil | |
| }, | |
| expectError: true, |
🤖 Prompt for AI Agents
In pkg/utils/yaml/preprocess_test.go around lines 241 to 253, the test uses
os.Getuid() which is Unix-only and prevents building on Windows; change the test
to skip this permission check on non-Unix platforms (e.g., guard with a
runtime.GOOS check or build tags) or replace with a portable approach (skip when
runtime.GOOS == "windows"); also ensure that after mutating StrictSyntax in the
test you restore its original value before returning so subsequent tests are
unaffected.
| func TestDecodeAndValidate_ConcurrentAccess(t *testing.T) { | ||
| yamlData := `name: "John Doe" | ||
| email: "john@example.com" | ||
| age: 30 | ||
| username: "johndoe"` | ||
|
|
||
| // Test concurrent access to ensure validator is thread-safe | ||
| done := make(chan bool, 10) | ||
| for i := 0; i < 10; i++ { | ||
| go func() { | ||
| defer func() { done <- true }() | ||
| for j := 0; j < 100; j++ { | ||
| var target ValidStruct | ||
| reader := strings.NewReader(yamlData) | ||
| err := DecodeAndValidate(reader, &target) | ||
| assert.NoError(t, err) | ||
| } | ||
| }() | ||
| } | ||
|
|
||
| for i := 0; i < 10; i++ { | ||
| <-done | ||
| } | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Concurrent access test may race on lazy validator init; fix production code with sync.Once.
DecodeAndValidate initializes a package-level validator without synchronization. Under -race this will flag a data race. Make initialization once-only.
Proposed change in pkg/utils/yaml/yaml_decode_wrapper.go (outside this diff):
package yaml
import "sync"
// ...
var (
validate *validator.Validate
validateOnce sync.Once
)
func DecodeAndValidate(r io.Reader, v interface{}) error {
if err := yaml.NewDecoder(r).Decode(v); err != nil {
return err
}
validateOnce.Do(func() { validate = validator.New() })
// ...
}🤖 Prompt for AI Agents
In pkg/utils/yaml/yaml_decode_wrapper_test.go around lines 293 to 316, the test
exposes a data race because the package-level validator is lazily initialized
without synchronization; fix the production file
pkg/utils/yaml/yaml_decode_wrapper.go by adding a sync.Once-controlled
initialization: declare a package-level validateOnce sync.Once alongside the
validate *validator.Validate variable, ensure the file imports "sync", and call
validateOnce.Do(func(){ validate = validator.New() }) inside DecodeAndValidate
before using validate so the validator is created exactly once and is safe under
concurrent calls.
| "gopkg.in/yaml.v3" | ||
| ) | ||
|
|
||
| var validate *validator.Validate |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Data race risk on package-global validator initialization.
validate is lazily initialized without synchronization. Concurrent calls can race. Use sync.Once.
import (
"io"
"strings"
"github.com/go-playground/validator/v10"
"github.com/pkg/errors"
"gopkg.in/yaml.v3"
+ "sync"
)
var validate *validator.Validate
+var validateOnce sync.Once
@@
- if validate == nil {
- validate = validator.New()
- }
+ validateOnce.Do(func() {
+ validate = validator.New()
+ })Also applies to: 19-21
🤖 Prompt for AI Agents
In pkg/utils/yaml/yaml_decode_wrapper.go around lines 12 and 19-21, the
package-global variable "validate" is lazily initialized without synchronization
which creates a data race under concurrent calls; fix by introducing a sync.Once
(e.g., validateOnce) and perform the initialization of the validator inside
validateOnce.Do(...) so the validator is created exactly once before use, ensure
all code paths use the initialized validator and remove any unsynchronized nil
checks; update imports to include sync and verify no other unsynchronized
package globals remain.
Proposed changes
Checklist
Summary by CodeRabbit
Performance Improvements
Bug Fixes
Chores