fix(templates): mem leaks in parser cache #6584

Merged
Mzack9999 merged 1 commit into dev from dwisiswant0/fix/templates/mem-leaks-in-parser-cache
Nov 3, 2025

@dwisiswant0 (Member) commented Nov 3, 2025

Fixes duplicate template storage & removes unnecessary raw bytes caching.

Proposed changes

  • Use StoreWithoutRaw() to avoid storing raw bytes.
  • Remove duplicate storage in both caches.
  • Remove ineffective raw bytes retrieval logic.
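
To illustrate the first bullet, here is a minimal sketch of a cache that offers both a raw-retaining store and a parsed-only store. The type `Cache`, its fields, and the method bodies are assumptions made for illustration; only the call shapes (`Has` returning three values, `StoreWithoutRaw` taking a path, a value, and an error) mirror what appears in the listings in this PR, and the real nuclei cache may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// parsedEntry is a hypothetical cache entry, not nuclei's actual type.
type parsedEntry struct {
	template any
	raw      []byte // nil when the entry was stored via StoreWithoutRaw
	err      error
}

// Cache is a simplified, hypothetical stand-in for the parser cache.
type Cache struct {
	mu      sync.RWMutex
	entries map[string]parsedEntry
}

func NewCache() *Cache {
	return &Cache{entries: make(map[string]parsedEntry)}
}

// Store keeps the parsed template and a reference to its raw bytes, so the
// raw buffer stays reachable for as long as the entry lives.
func (c *Cache) Store(id string, tpl any, raw []byte, err error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[id] = parsedEntry{template: tpl, raw: raw, err: err}
}

// StoreWithoutRaw keeps only the parsed template; the raw bytes are not
// referenced by the cache and can be garbage collected.
func (c *Cache) StoreWithoutRaw(id string, tpl any, err error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[id] = parsedEntry{template: tpl, err: err}
}

// Has returns the cached template, its raw bytes (possibly nil), and the
// stored error. All three are nil on a cache miss.
func (c *Cache) Has(id string) (any, []byte, error) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.entries[id]
	if !ok {
		return nil, nil, nil
	}
	return e.template, e.raw, e.err
}

func main() {
	c := NewCache()
	c.Store("a.yaml", "parsed-a", []byte("id: example\n"), nil)
	c.StoreWithoutRaw("b.yaml", "parsed-b", nil)

	_, rawA, _ := c.Has("a.yaml")
	_, rawB, _ := c.Has("b.yaml")
	fmt.Println("raw retained for a.yaml:", rawA != nil)
	fmt.Println("raw retained for b.yaml:", rawB != nil)
}
```

In this sketch the only difference between the two methods is whether the entry holds a reference to the raw buffer, which is what decides whether that buffer can be reclaimed.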

Proof

dev:

$ go tool pprof -list=ParseTemplate -base=heap_0_dev.prof heap_4_dev.prof | grep -1 ROUTINE
Total: 164.44MB
ROUTINE ======================== github.com/projectdiscovery/nuclei/v3/pkg/templates.(*Parser).ParseTemplate in /home/dw1/Development/PD/nuclei/pkg/templates/parser.go
   10.01MB       80MB (flat, cum) 48.65% of Total
--
         .          .    187:		return nil, err
ROUTINE ======================== github.com/projectdiscovery/nuclei/v3/pkg/templates.ParseTemplateFromReader in /home/dw1/Development/PD/nuclei/pkg/templates/compile.go
         0    73.76MB (flat, cum) 44.86% of Total

this patch (+ projectdiscovery/retryablehttp-go#483):

$ grep "replace" go.mod
replace github.com/projectdiscovery/nuclei/v3 => /home/dw1/Development/PD/nuclei
replace github.com/projectdiscovery/retryablehttp-go => /home/dw1/Development/PD/retryablehttp-go
$ go tool pprof -list=ParseTemplate -base=heap_0_patch.prof heap_4_patch.prof | grep -1 ROUTINE
Total: 140.77MB
ROUTINE ======================== github.com/projectdiscovery/nuclei/v3/pkg/templates.(*Parser).ParseTemplate in /home/dw1/Development/PD/nuclei/pkg/templates/parser.go
       1MB     3.16MB (flat, cum)  2.24% of Total
--
         .          .    195:// LoadWorkflow returns true if the workflow is valid and matches the filtering criteria.
ROUTINE ======================== github.com/projectdiscovery/nuclei/v3/pkg/templates.ParseTemplateFromReader in /home/dw1/Development/PD/nuclei/pkg/templates/compile.go
         0    13.99MB (flat, cum)  9.94% of Total

dev vs. this patch:

$ go tool pprof -list=ParseTemplate -base=heap_4_dev.prof heap_4_patch.prof
Total: 402.33MB
ROUTINE ======================== github.com/projectdiscovery/nuclei/v3/pkg/templates.(*Parser).ParseTemplate in /home/dw1/Development/PD/nuclei/pkg/templates/parser.go
   -3.50MB    68.23kB (flat, cum) 0.017% of Total
         .          .    121:			checkOpenFileError(validationWarning)
         .          .    122:			return ret, errkit.Newf("Could not load template %s: %s", templatePath, validationWarning)
         .          .    123:		}
         .          .    124:	}
         .          .    125:	return ret, nil
         .          .    126:}
         .          .    127:
         .          .    128:// ParseTemplate parses a template and returns a *templates.Template structure
         .          .    129:func (p *Parser) ParseTemplate(templatePath string, catalog catalog.Catalog) (any, error) {
         .          .    130:	value, _, err := p.parsedTemplatesCache.Has(templatePath)
         .          .    131:	if value != nil {
         .          .    132:		return value, err
         .          .    133:	}
         .          .    134:
         .          .    135:	reader, err := utils.ReaderFromPathOrURL(templatePath, catalog)
         .          .    136:	if err != nil {
         .          .    137:		return nil, err
         .          .    138:	}
         .          .    139:	defer func() {
         .          .    140:		_ = reader.Close()
         .          .    141:	}()
         .  -528.17kB    142:
         .          .    143:	// For local YAML files, check if preprocessing is needed
         .          .    144:	var data []byte
         .          .    145:	if fileutil.FileExists(templatePath) && config.GetTemplateFormatFromExt(templatePath) == config.YAML {
         .          .    146:		data, err = io.ReadAll(reader)
         .          .    147:		if err != nil {
  -10.01MB   -10.01MB    148:			return nil, err
         .          .    149:		}
         .          .    150:		data, err = yamlutil.PreProcess(data)
         .          .    151:		if err != nil {
         .          .    152:			return nil, err
         .          .    153:		}
         .          .    154:	}
         .          .    155:
    6.50MB     6.50MB    156:	template := &Template{}
         .          .    157:
         .          .    158:	switch config.GetTemplateFormatFromExt(templatePath) {
         .          .    159:	case config.JSON:
         .          .    160:		if data == nil {
         .          .    161:			data, err = io.ReadAll(reader)
         .          .    162:			if err != nil {
         .          .    163:				return nil, err
         .          .    164:			}
         .   -68.89MB    165:		}
         .          .    166:		err = json.Unmarshal(data, template)
         .          .    167:	case config.YAML:
         .          .    168:		if data != nil {
         .          .    169:			// Already read and preprocessed
         .          .    170:			if p.NoStrictSyntax {
         .          .    171:				err = yaml.Unmarshal(data, template)
         .          .    172:			} else {
         .    73.53MB    173:				err = yaml.UnmarshalStrict(data, template)
         .          .    174:			}
         .          .    175:		} else {
         .          .    176:			// Stream directly from reader
         .          .    177:			decoder := yaml.NewDecoder(reader)
         .          .    178:			if !p.NoStrictSyntax {
         .          .    179:				decoder.SetStrict(true)
         .          .    180:			}
         .          .    181:			err = decoder.Decode(template)
         .    -1.10MB    182:		}
         .          .    183:	default:
         .          .    184:		err = fmt.Errorf("failed to identify template format expected JSON or YAML but got %v", templatePath)
         .          .    185:	}
         .          .    186:	if err != nil {
         .          .    187:		return nil, err
         .          .    188:	}
         .          .    189:
         .   565.76kB    190:	p.parsedTemplatesCache.StoreWithoutRaw(templatePath, template, nil)
         .          .    191:
         .          .    192:	return template, nil
         .          .    193:}
         .          .    194:
         .          .    195:// LoadWorkflow returns true if the workflow is valid and matches the filtering criteria.
ROUTINE ======================== github.com/projectdiscovery/nuclei/v3/pkg/templates.ParseTemplateFromReader in /home/dw1/Development/PD/nuclei/pkg/templates/compile.go
         0    14.44MB (flat, cum)  3.59% of Total
         .          .    414:func ParseTemplateFromReader(reader io.Reader, preprocessor Preprocessor, options *protocols.ExecutorOptions) (*Template, error) {
         .    12.70MB    415:	data, err := io.ReadAll(reader)
         .          .    416:	if err != nil {
         .          .    417:		return nil, err
         .          .    418:	}
         .          .    419:
         .          .    420:	// a preprocessor is a variable like
         .          .    421:	// {{randstr}} which is replaced before unmarshalling
         .          .    422:	// as it is known to be a random static value per template
         .          .    423:	hasPreprocessor := false
         .          .    424:	allPreprocessors := getPreprocessors(preprocessor)
         .          .    425:	for _, preprocessor := range allPreprocessors {
         .          .    426:		if preprocessor.Exists(data) {
         .   -13.66MB    427:			hasPreprocessor = true
         .          .    428:			break
         .          .    429:		}
         .          .    430:	}
         .          .    431:
         .          .    432:	if !hasPreprocessor {
         .          .    433:		// if no preprocessors exists parse template and exit
         .    79.03MB    434:		template, err := parseTemplate(data, options)
         .          .    435:		if err != nil {
         .          .    436:			return nil, err
         .          .    437:		}
         .          .    438:		if !template.Verified && len(template.Workflows) == 0 {
         .          .    439:			if config.DefaultConfig.LogAllEvents {
         .          .    440:				gologger.DefaultLogger.Print().Msgf("[%v] Template %s is not signed or tampered\n", aurora.Yellow("WRN").String(), template.ID)
         .          .    441:			}
         .          .    442:		}
         .          .    443:		return template, nil
         .          .    444:	}
         .          .    445:
         .   -62.63MB    446:	// if preprocessor is required / exists in this template
         .          .    447:	// first unmarshal it and check if its verified
         .          .    448:	// persist verified status value and then
         .          .    449:	// expand all preprocessor and reparse template
         .          .    450:
         .          .    451:	// === signature verification before preprocessors ===
         .        1MB    452:	template, err := parseTemplate(data, options)
         .          .    453:	if err != nil {
         .          .    454:		return nil, err
         .          .    455:	}
         .          .    456:	isVerified := template.Verified
         .          .    457:	if !template.Verified && len(template.Workflows) == 0 {
         .          .    458:		// workflows are not signed by default
         .          .    459:		if config.DefaultConfig.LogAllEvents {
         .          .    460:			gologger.DefaultLogger.Print().Msgf("[%v] Template %s is not signed or tampered\n", aurora.Yellow("WRN").String(), template.ID)
         .          .    461:		}
         .          .    462:	}
         .          .    463:
         .  -512.69kB    464:	generatedConstants := map[string]interface{}{}
         .          .    465:	// ==== execute preprocessors ======
         .          .    466:	for _, v := range allPreprocessors {
         .          .    467:		var replaced map[string]interface{}
         .     3.01MB    468:		data, replaced = v.ProcessNReturnData(data)
         .          .    469:		// preprocess kind of act like a constant and are generated while loading
         .          .    470:		// and stay constant for the template lifecycle
         .          .    471:		generatedConstants = generators.MergeMaps(generatedConstants, replaced)
         .          .    472:	}
         .        3MB    473:	reParsed, err := parseTemplate(data, options)
         .          .    474:	if err != nil {
         .          .    475:		return nil, err
         .          .    476:	}
         .          .    477:	// add generated constants to constants map and executer options
         .          .    478:	reParsed.Constants = generators.MergeMaps(reParsed.Constants, generatedConstants)
         .          .    479:	reParsed.Options.Constants = reParsed.Constants
         .    -2.01MB    480:	reParsed.Verified = isVerified
         .          .    481:	return reParsed, nil
         .          .    482:}
         .          .    483:
         .          .    484:// this method does not include any kind of preprocessing
         .       -5MB    485:func parseTemplate(data []byte, srcOptions *protocols.ExecutorOptions) (*Template, error) {
         .          .    486:	// Create a copy of the options specifically for this template
         .          .    487:	options := srcOptions.Copy()
         .          .    488:
         .          .    489:	template := &Template{}
         .  -512.14kB    490:	var err error
         .          .    491:	switch config.GetTemplateFormatFromExt(template.Path) {
         .          .    492:	case config.JSON:
         .          .    493:		err = json.Unmarshal(data, template)
         .          .    494:	case config.YAML:
         .          .    495:		err = yaml.Unmarshal(data, template)

Checklist

  • Pull request is created against the dev branch
  • All checks passed (lint, unit/integration/regression tests etc.) with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Summary by CodeRabbit

  • Refactor
    • Optimized template caching to reduce memory usage by eliminating redundant storage in the caching layer.

Fixes duplicate template storage & removes
unnecessary raw bytes caching.

Mem usage reduced by ~30%.
> 423MB => 299MB heap alloc.

* Use `StoreWithoutRaw()` to avoid storing raw
  bytes.
* Remove duplicate storage in both caches.
* Remove ineffective raw bytes retrieval logic.

Benchmarks show 45% perf improvement with no
regressions.

Signed-off-by: Dwi Siswanto <git@dw1.io>
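
As a quick check of the "~30%" figure in the commit message above, the relative reduction can be computed from the two quoted heap sizes. The helper `reductionPercent` is a hypothetical name introduced for this sketch.

```go
package main

import "fmt"

// reductionPercent returns how much smaller after is than before, in percent.
func reductionPercent(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Figures quoted in the commit message: 423MB => 299MB heap alloc.
	fmt.Printf("heap alloc reduction: %.1f%%\n", reductionPercent(423, 299))
	// prints "heap alloc reduction: 29.3%"
}
```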
@dwisiswant0 dwisiswant0 requested a review from Mzack9999 November 3, 2025 14:04
@coderabbitai (bot, Contributor) commented Nov 3, 2025

Walkthrough

The changes remove in-memory caching of raw template data, requiring the parsing pipeline to always read raw templates directly from disk or URLs. The caching mechanism is simplified to exclude raw content storage via a new StoreWithoutRaw method, and a new compiledTemplatesCache field is added to the Parser.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Template parsing and caching refactor: pkg/templates/compile.go, pkg/templates/parser.go | Removed in-memory raw template caching; updated to always read raw templates from disk/URL via utils.ReaderFromPathOrURL. Replaced Store(...) calls with StoreWithoutRaw(...) to reduce memory footprint. Added compiledTemplatesCache field to Parser with enhanced documentation. Removed unused bytes import from compile.go. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Verify that removal of cached raw template reads does not introduce performance regressions or functional issues
  • Confirm all Store calls have been properly migrated to StoreWithoutRaw for consistency
  • Check compiledTemplatesCache initialization and lifecycle in Parser


Pre-merge checks and finishing touches

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The pull request title 'fix(templates): mem leaks in parser cache' is directly related to the main changes in the changeset. The modifications remove duplicate template storage and eliminate unnecessary raw bytes caching in the templates parser, which directly address memory leaks in the parser cache as stated in the title. The title is concise, specific, and clearly conveys the primary objective of the changes. |
| Description check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@dwisiswant0 dwisiswant0 linked an issue Nov 3, 2025 that may be closed by this pull request
@Mzack9999 Mzack9999 merged commit 5147c72 into dev Nov 3, 2025
20 checks passed
@Mzack9999 Mzack9999 deleted the dwisiswant0/fix/templates/mem-leaks-in-parser-cache branch November 3, 2025 15:28

Development

Successfully merging this pull request may close these issues.

perf: Memory leak in template parser causing excessive heap alloc
