Fix sending large messages to SQS queue #86
base: main
When the message size reaches 80% of the 200KB limit, the "onTimeout" function is executed and the PROGRESS event is returned.
9a97b88 to 5c9be86
radovanjorgic
left a comment
Few nits
src/common/event-size-monitor.ts
Outdated
| import { ErrorRecord } from '../types/common'; | ||
| import { EventData } from '../types/extraction'; | ||
|
|
||
| const MAX_EVENT_SIZE = 200_000; |
Probably it is a good practice to put the unit too, so MAX_EVENT_SIZE_BYTES? Or not?
You are correct.
Fixed.
src/common/event-size-monitor.ts
Outdated
| import { EventData } from '../types/extraction'; | ||
|
|
||
| const MAX_EVENT_SIZE = 200_000; | ||
| const SIZE_LIMIT_THRESHOLD = Math.floor(MAX_EVENT_SIZE * 0.8); // 160_000 bytes |
EVENT_SIZE_THRESHOLD_BYTES
Fixed.
Pull request overview
This PR addresses issues with sending large messages to SQS by implementing size monitoring and limiting event data to 200KB (with an 80% threshold at 160KB). The changes prevent events from exceeding SQS's 250KB limit by proactively detecting when data approaches the threshold and truncating error messages.
Key Changes:
- Added event size monitoring that tracks accumulated artifact size and triggers cleanup at 80% threshold (160KB)
- Implemented automatic error message truncation to 1000 characters before emission
- Added cleanup workflow that emits progress events and executes onTimeout when size limits are reached
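The threshold logic in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `EventSizeTracker`, `add`, and the `onLimit` callback are hypothetical names, and only the 200,000-byte limit and the 80% threshold come from the PR.

```typescript
// Figures taken from the PR: 200,000-byte cap, cleanup at 80% of it.
const MAX_EVENT_SIZE_BYTES = 200_000;
const EVENT_SIZE_THRESHOLD_BYTES = Math.floor(MAX_EVENT_SIZE_BYTES * 0.8); // 160_000

// Hypothetical tracker: accumulates serialized artifact sizes and fires
// a callback once, the first time the threshold is exceeded.
class EventSizeTracker {
  private currentBytes = 0;
  private triggered = false;
  private readonly onLimit: () => void;

  constructor(onLimit: () => void) {
    this.onLimit = onLimit;
  }

  // Adds the UTF-8 size of one serialized artifact to the running total.
  add(artifact: unknown): void {
    const json = JSON.stringify(artifact) ?? '';
    this.currentBytes += new TextEncoder().encode(json).length;
    if (this.currentBytes > EVENT_SIZE_THRESHOLD_BYTES && !this.triggered) {
      this.triggered = true;
      this.onLimit(); // stand-in for "emit progress, then run onTimeout"
    }
  }

  get size(): number {
    return this.currentBytes;
  }
}
```

The `triggered` flag plays a role similar to the `hasWorkerEmitted` check in the reviewed code: the cleanup path should run at most once per task.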
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/common/event-size-monitor.ts | New utility module providing size calculation, threshold checking, error message truncation, and logging functions |
| src/workers/worker-adapter.ts | Added size tracking in onUpload callback and error message pruning in emit method |
| src/workers/process-task.ts | Added post-task cleanup logic to execute onTimeout when size limit is triggered |
| */ | ||
| export function getEventDataSize(data: EventData | undefined): number { | ||
| if (!data) return 0; | ||
| return JSON.stringify(data).length; |
Copilot
AI
Nov 25, 2025
JSON.stringify().length calculates size in UTF-16 code units, not bytes. For accurate byte size calculation (especially for SQS limits), use Buffer.byteLength(JSON.stringify(data), 'utf8') or new TextEncoder().encode(JSON.stringify(data)).length.
| return JSON.stringify(data).length; | |
| return Buffer.byteLength(JSON.stringify(data), 'utf8'); |
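To illustrate why the distinction matters, the sketch below compares the string length with the UTF-8 byte length for a few payloads. The helper name `utf8ByteLength` is hypothetical; it uses `TextEncoder`, which in Node.js should give the same result as `Buffer.byteLength(json, 'utf8')`.

```typescript
// Hypothetical helper: UTF-8 byte length of a value's JSON serialization.
function utf8ByteLength(value: unknown): number {
  const json = JSON.stringify(value) ?? '';
  return new TextEncoder().encode(json).length;
}

// Hypothetical helper: length in UTF-16 code units, as String.length reports.
function utf16Length(value: unknown): number {
  return (JSON.stringify(value) ?? '').length;
}

const asciiPayload = { text: 'hello' };
const accentedPayload = { text: 'héllo' };
const emojiPayload = { text: '😀' };

// For pure ASCII both measures agree; they diverge once non-ASCII text appears.
console.log(utf16Length(asciiPayload), utf8ByteLength(asciiPayload));
console.log(utf16Length(accentedPayload), utf8ByteLength(accentedPayload));
console.log(utf16Length(emojiPayload), utf8ByteLength(emojiPayload));
```

For payloads dominated by non-ASCII text, the byte count can be several times the string length, so a check based on `.length` may let an oversized message through.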
| itemType: repo.itemType, | ||
| ...(shouldNormalize && { normalize: repo.normalize }), | ||
| onUpload: (artifact: Artifact) => { | ||
| const newLength = JSON.stringify(artifact).length; |
Copilot
AI
Nov 25, 2025
JSON.stringify() is called here and the same artifact is likely stringified again later when emitted. Consider caching the stringified result to avoid redundant serialization.
| const newLength = JSON.stringify(artifact).length; | |
| // Cache the stringified artifact to avoid redundant serialization | |
| if (!('_stringified' in artifact)) { | |
| (artifact as any)._stringified = JSON.stringify(artifact); | |
| } | |
| const newLength = (artifact as any)._stringified.length; |
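One caveat with the suggestion above: a property attached to the artifact is enumerable, so a later `JSON.stringify(artifact)` would include `_stringified` in its output. A possible alternative, sketched here under the assumption that artifacts are plain objects that are not mutated between upload and emit, is to keep the cache outside the object in a `WeakMap`. The names `serializedCache` and `serializeOnce` are hypothetical.

```typescript
// Hypothetical side cache: maps an artifact object to its JSON string
// without adding any property to the artifact itself.
const serializedCache = new WeakMap<object, string>();

// Serializes the artifact on first use and reuses the result afterwards.
// Assumes the artifact is not modified after the first call; otherwise
// the cached string would be stale.
function serializeOnce(artifact: object): string {
  let json = serializedCache.get(artifact);
  if (json === undefined) {
    json = JSON.stringify(artifact);
    serializedCache.set(artifact, json);
  }
  return json;
}

const sampleArtifact = { id: '1', itemType: 'issues' };
const firstPass = serializeOnce(sampleArtifact);
const secondPass = serializeOnce(sampleArtifact); // served from the cache
```

Because the `WeakMap` holds its keys weakly, cached strings become collectable together with their artifacts.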
| // Check for size limit (80% of 200KB = 160KB threshold) | ||
| if ( | ||
| this.currentLength > SIZE_LIMIT_THRESHOLD && | ||
| !this.hasWorkerEmitted | ||
| ) { |
Copilot
AI
Nov 25, 2025
The comment states '80% of 200KB = 160KB' but this should be 160,000 bytes. The comment should clarify that SIZE_LIMIT_THRESHOLD is defined elsewhere to avoid confusion about the actual numeric value being compared.
| } | ||
|
|
||
| try { | ||
| // Always prune error messages to 1000 chars before emit |
Copilot
AI
Nov 25, 2025
[nitpick] The comment says 'chars' but should say 'characters' for consistency with the function documentation in event-size-monitor.ts. Also, like Comment 3, this should reference that the 1000 limit is defined in the truncateErrorMessage function.
| // Always prune error messages to 1000 chars before emit | |
| // Always prune error messages to 1000 characters before emit (limit defined in truncateErrorMessage) |
| if (!error) return undefined; | ||
|
|
||
| return { | ||
| message: error.message.substring(0, maxLength), |
Copilot
AI
Nov 25, 2025
Potential error if error.message is undefined or null. Add a null check: message: error.message?.substring(0, maxLength) ?? ''
| message: error.message.substring(0, maxLength), | |
| message: error.message?.substring(0, maxLength) ?? '', |
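A self-contained sketch of the null-safe variant is shown below. The `ErrorRecordLike` type and the default `maxLength` parameter are assumptions made for illustration; the real `ErrorRecord` type and function signature in event-size-monitor.ts may differ.

```typescript
// Hypothetical stand-in for the project's ErrorRecord type.
interface ErrorRecordLike {
  message?: string | null;
}

// Truncates the error message to maxLength characters; a missing or null
// message becomes an empty string instead of throwing.
function truncateErrorMessage(
  error: ErrorRecordLike | undefined,
  maxLength: number = 1000,
): { message: string } | undefined {
  if (!error) return undefined;
  return {
    message: error.message?.substring(0, maxLength) ?? '',
  };
}

const longError = truncateErrorMessage({ message: 'x'.repeat(1500) });
const missingMessage = truncateErrorMessage({});
const noError = truncateErrorMessage(undefined);
```

Note that `substring` counts UTF-16 code units, so the 1000-character cap bounds the string length, not the UTF-8 byte size.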
| ); | ||
| } | ||
|
|
||
| export { MAX_EVENT_SIZE_BYTES as MAX_EVENT_SIZE, EVENT_SIZE_THRESHOLD_BYTES as SIZE_LIMIT_THRESHOLD }; |
Copilot
AI
Nov 25, 2025
[nitpick] Exporting constants with renamed aliases on a separate line from function exports reduces readability. Consider moving these constant exports to the top of the file with the constant declarations or using separate export statements for clarity.
Description
This fix limits the maximum number of bytes sent to SQS to 200kB, which resolves the issue of events being committed to SQS and failing because they exceed the 250kB limit.
Connected Issues
Checklist
- npm run test OR no tests needed.
- npm run test:backwards-compatibility.
- no-docs written: no-docs