-
Notifications
You must be signed in to change notification settings - Fork 378
Only run failed tests for snapshot update #5954
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
🎨 Storybook Build Status✅ Build completed successfully! ⏰ Completed at: 10/08/2025, 02:34:24 AM UTC 🔗 Links🎉 Your Storybook is ready for review! |
🎭 Playwright Test Results⏰ Completed at: 10/08/2025, 02:44:39 AM UTC 📈 Summary
📊 Test Reports by Browser
🎉 Click on the links above to view detailed test results for each browser configuration. |
@christian-byrne I manually wrote this PR description to be as helpful as possible, LMKWYT if you have time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline, but for the record:
I can't block this, but I would strongly oppose adding such complexity before even trying simpler solutions like filtering the update runs using --last-failed
with the results we're already storing.
I'm also worried about the mix in here of having: an inlined github-script, an inlined bash script, and a separate typescript script that collectively depend on one another by loose contract.
|
if: ${{ needs.playwright-tests-chromium-sharded.result == 'failure' }} | ||
run: | | ||
set -euo pipefail | ||
pnpm tsx scripts/cicd/build-failed-screenshot-manifest.ts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[quality] medium Priority
Issue: Using pnpm tsx without explicit error handling for TypeScript execution
Context: If the TypeScript script fails to compile or execute, the failure might be silently ignored
Suggestion: Add explicit error handling or use --exit-code to ensure script failures propagate properly
continue | ||
fi | ||
echo "Re-running ${#filtered[@]} tests for project $project" | ||
PLAYWRIGHT_JSON_OUTPUT_NAME=playwright-report/report.json \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[security] high Priority
Issue: Potential command injection vulnerability when processing test file names
Context: Files are read from the manifest and passed directly as command arguments without validation. Malicious test file names could inject commands
Suggestion: Add input validation and sanitization for test file names, or use safer parameter passing mechanisms
} | ||
|
||
const raw = await fsp.readFile(reportPath, 'utf8') | ||
const data = JSON.parse(raw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[quality] medium Priority
Issue: JSON.parse without error handling for malformed JSON
Context: If the Playwright report JSON is malformed or corrupted, the script will crash with an unclear error message
Suggestion: Add try-catch block around JSON.parse with descriptive error message for debugging
last && last.status === 'failed' && hasScreenshotSignal(last) | ||
if (!failedScreenshot) continue | ||
if (!out.has(project)) out.set(project, new Set()) | ||
out.get(project)!.add(loc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[quality] medium Priority
Issue: Non-null assertion operator without proper null checking
Context: Using out.get(project)! assumes the Map entry exists, but there's a potential race condition or logic error where it might not
Suggestion: Use safer access pattern like const set = out.get(project); if (set) set.add(loc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comprehensive PR Review
This review is generated by Claude. It may not always be accurate, as with human reviewers. If you believe that any of the comments are invalid or incorrect, please state why for each. For others, please implement the changes in one way or another.
Review Summary
PR: Only run failed tests for snapshot update (#5954)
Impact: 209 additions, 12 deletions across 3 files
Issue Distribution
- Critical: 0
- High: 1
- Medium: 3
- Low: 0
Category Breakdown
- Architecture: 0 issues
- Security: 1 issues
- Performance: 0 issues
- Code Quality: 3 issues
Key Findings
Architecture & Design
The PR implements a selective test execution pattern that aligns well with CI optimization practices. The manifest-based approach is a sound architectural choice that reduces unnecessary test runs while maintaining test coverage for failures. The separation of concerns between the manifest generation (TypeScript) and consumption (bash) is appropriate.
Security Considerations
Critical Finding: Command injection vulnerability in the bash script processing test file names. File names from the manifest are passed directly as command arguments without validation, potentially allowing malicious test files to inject commands. This needs immediate attention as it could compromise the CI environment.
Performance Impact
The PR significantly improves CI performance by running only failed screenshot tests instead of the full suite. This is a substantial optimization that will reduce CI execution time and resource usage. The selective execution pattern should provide meaningful speed improvements for large test suites.
Integration Points
The integration between the CI workflow and the snapshot update workflow is well-designed. The artifact-based communication pattern using GitHub Actions artifacts is robust and follows best practices. The fallback mechanisms ensure reliability when manifests are unavailable.
Positive Observations
- Excellent bash error handling with set -euo pipefail
- Proper fallback mechanisms when manifests are not available
- Clean TypeScript code with appropriate type annotations
- Good separation of concerns between different workflow steps
- Comprehensive artifact handling and retention policies
Next Steps
- Address critical security issue before merge - sanitize test file names in bash script
- Add error handling for JSON parsing in TypeScript manifest script
- Consider safer non-null assertion patterns for TypeScript code
- Add explicit error handling for TypeScript script execution in workflow
This is a comprehensive automated review. For architectural decisions requiring human judgment, please request additional manual review.
Note for reviewers: Contains a lot of bash scripting
Summary
Before, our snapshot update would run all playwright tests, even the ones that did not fail.
With this PR, we now only run the failed tests for snapshot updates.
Changes
┆Issue is synchronized with this Notion page by Unito