Security: harden regex usage against ReDoS (S5852) #281
Merged
7 tasks
- content-sanitiser: add MAX_SCAN_CHARS=200_000 guard before injection
pattern scan; simplify delete-files regex to remove adjacent optional
whitespace groups that could cause super-linear backtracking
- content-sanitiser tests: add TestReDoSAdversarialInput with 5 regression
tests for long/pathological inputs; all 44 tests pass
- EmailContinueForm: replace unbounded EMAIL_REGEX with bounded-quantifier
version, add MAX_EMAIL_LENGTH=254 constant, maxLength={254} on input,
and explicit length guard in both isEmailValid and handleSubmit
- BlogPostForm: add maxLength={200} to title input; split slug cleanup into
two linear anchored replaces to eliminate alternation flagged by Sonar S5852
- check-naming-conventions.js: add NOSONAR:javascript:S5852 comments with
written justification for both CI-only regex patterns (lines 35, 50)
that run only against trusted repository source
Agent-Logs-Url: https://github.com/NickLetts2/Curvit/sessions/992012cf-81a3-4603-a017-531d943556a4
Co-authored-by: NickLetts2 <90337962+NickLetts2@users.noreply.github.com>
… tests
Agent-Logs-Url: https://github.com/NickLetts2/Curvit/sessions/992012cf-81a3-4603-a017-531d943556a4
Co-authored-by: NickLetts2 <90337962+NickLetts2@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix regex usage to harden against ReDoS vulnerabilities" to "Security: harden regex usage against ReDoS (S5852)" on May 15, 2026
Pull request overview
This PR hardens regex usage flagged by SonarCloud S5852 across the content sanitiser, frontend forms, and a contracts CI script. The main intent is reducing ReDoS risk for untrusted uploaded text while preserving existing prompt-injection and form behavior.
Changes:
- Bounds prompt-injection regex scanning in the content sanitiser and simplifies one risky pattern.
- Adds frontend input bounds and adjusts slug/email regex handling.
- Adds regression tests for adversarial sanitiser inputs and documents CI-only regex suppressions.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| services/content-sanitiser/app/main.py | Adds bounded injection scanning and updates the delete-files regex. |
| services/content-sanitiser/tests/test_injection_patterns.py | Adds delete-files variants and long-input ReDoS regression tests. |
| apps/app-frontend/src/components/auth/EmailContinueForm.tsx | Adds email length limits and bounded validation regex. |
| apps/app-frontend/src/components/admin/blog/BlogPostForm.tsx | Adds title length limit and splits slug trim regex. |
| shared/contracts/scripts/check-naming-conventions.js | Adds Sonar suppression comments for CI-only regex checks. |
Comment on lines +142 to +146
+    scan_text = text[:MAX_SCAN_CHARS]
     detected: list[str] = [
         label
         for pattern, label in _INJECTION_PATTERNS
-        if pattern.search(text)
+        if pattern.search(scan_text)
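The bounded scan in this hunk can be sketched in isolation as follows. This is a minimal illustration, not the service's actual code: only `MAX_SCAN_CHARS` and the comprehension shape come from the PR, while the two patterns and the names `detect_injections` and `scan_seconds` are hypothetical stand-ins.

```python
import re
import time

# Value taken from the PR; everything else below is an assumed stand-in.
MAX_SCAN_CHARS = 200_000

# Hypothetical patterns: the real _INJECTION_PATTERNS in main.py are not shown here.
_INJECTION_PATTERNS: list[tuple[re.Pattern[str], str]] = [
    (re.compile(r"ignore\s+previous\s+instructions", re.IGNORECASE), "ignore-instructions"),
    (re.compile(r"delete\s+all\s+files", re.IGNORECASE), "delete-files"),
]


def detect_injections(text: str) -> list[str]:
    """Return labels of patterns found in the first MAX_SCAN_CHARS characters."""
    # Slice once so regex work is bounded regardless of upload size.
    scan_text = text[:MAX_SCAN_CHARS]
    return [label for pattern, label in _INJECTION_PATTERNS if pattern.search(scan_text)]


def scan_seconds(text: str) -> float:
    """Wall-clock time of one scan, usable in budget-style regression assertions."""
    start = time.perf_counter()
    detect_injections(text)
    return time.perf_counter() - start
```

One consequence of this design is that a trigger phrase located past the scan window is not reported, which is presumably why the PR's regression tests cover phrases both inside and outside the window.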
Comment on lines +35 to +36
+ // NOSONAR: javascript:S5852 — runs only against trusted repository source in
+ // local/CI contract checks, never against attacker-controlled runtime input.
Comment on lines +52 to +53
+ // NOSONAR: javascript:S5852 — runs only against trusted repository source in
+ // local/CI contract checks, never against attacker-controlled runtime input.
NickLetts2 added a commit that referenced this pull request on Jun 1, 2026
…redos Security: harden regex usage against ReDoS (S5852)
SonarCloud flagged five regex hotspots (S5852) across the codebase. The content-sanitiser is the highest-priority target since it scans untrusted uploaded document text; the remainder are public auth input or admin/CI-only contexts.
Content sanitiser (services/content-sanitiser/app/main.py)
- Added MAX_SCAN_CHARS = 200_000; the injection pattern scan now operates on text[:MAX_SCAN_CHARS] rather than the full raw input.
- Simplified the delete-files pattern to eliminate adjacent optional \s+ groups that created ambiguous whitespace ownership.

Content sanitiser tests
- TestReDoSAdversarialInput: 5 regression tests with wall-clock budget assertions against pathological inputs (500k-char word-space strings, trigger phrase inside/outside the scan window, repeated delete words).
- TestPromptInjectionDeleteFiles: 5 additional variant tests confirming the simplified regex still detects all intended phrasings.

EmailContinueForm.tsx
- Replaced the unbounded EMAIL_REGEX with explicit RFC-aligned bounds: /^[^\s@]{1,64}@[^\s@]{1,253}\.[^\s@]{1,63}$/
- Added MAX_EMAIL_LENGTH = 254, maxLength={254} on the input, and length guards in both isEmailValid and handleSubmit.

BlogPostForm.tsx
- Added maxLength={200} on the title input to bound the slug derivation.
- Split .replace(/^-+|-+$/g, '') into two separate anchored replaces to remove the alternation Sonar flags.

check-naming-conventions.js
- Added // NOSONAR: javascript:S5852 with written justification on both flagged lines; these patterns run only against trusted repository source in CI, never against attacker-controlled input.
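
The two frontend changes follow the same idea: check length before matching, and prefer simple anchored patterns. The sketch below is a Python transliteration for illustration only, since the real components are TypeScript and their code is not shown on this page. The regex and `MAX_EMAIL_LENGTH` value come from the PR description; the function names `is_email_valid` and `trim_hyphens` are assumed.

```python
import re

MAX_EMAIL_LENGTH = 254  # value from the PR

# Same bounded pattern as the PR's EMAIL_REGEX, written for Python's re module.
EMAIL_REGEX = re.compile(r"^[^\s@]{1,64}@[^\s@]{1,253}\.[^\s@]{1,63}$")


def is_email_valid(email: str) -> bool:
    """Check length first so the regex only ever sees short input."""
    if len(email) > MAX_EMAIL_LENGTH:
        return False
    # fullmatch avoids Python's "$ matches before a trailing newline" behaviour.
    return EMAIL_REGEX.fullmatch(email) is not None


def trim_hyphens(slug: str) -> str:
    """Strip leading, then trailing, hyphens with two anchored replaces."""
    slug = re.sub(r"^-+", "", slug)   # leading run only
    return re.sub(r"-+\Z", "", slug)  # trailing run only (\Z is strict end of string)
```

For example, `trim_hyphens("--my-post--")` yields `"my-post"`, and an address longer than 254 characters is rejected before the regex runs.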