@gasperzgonec
Contributor

Description

This fix limits the maximum number of bytes sent to SQS to 200kB, which resolves the issue of events being committed to SQS and failing because they exceed the 250kB limit.

Connected Issues

Checklist

  • Tests added/updated and ran with npm run test OR no tests needed.
  • Ran backwards compatibility tests with npm run test:backwards-compatibility.
  • Tested airdrop-template linked to this PR.
  • Documentation updated and provided a link to PR / new docs OR no-docs written: no-docs

When the message size reaches 80% of the 200KB limit, the worker executes
the "onTimeout" function and returns the PROGRESS event.
Collaborator

@radovanjorgic radovanjorgic left a comment

Few nits

import { ErrorRecord } from '../types/common';
import { EventData } from '../types/extraction';

const MAX_EVENT_SIZE = 200_000;
Collaborator

Probably it is a good practice to put the unit too, so MAX_EVENT_SIZE_BYTES? Or not?

Contributor Author

You are correct.
Fixed.

import { EventData } from '../types/extraction';

const MAX_EVENT_SIZE = 200_000;
const SIZE_LIMIT_THRESHOLD = Math.floor(MAX_EVENT_SIZE * 0.8); // 160_000 bytes
Collaborator

EVENT_SIZE_THRESHOLD_BYTES

Contributor Author

Fixed.

@gasperzgonec gasperzgonec requested a review from Copilot November 25, 2025 08:48

Copilot AI left a comment

Pull request overview

This PR addresses issues with sending large messages to SQS by implementing size monitoring and limiting event data to 200KB (with an 80% threshold at 160KB). The changes prevent events from exceeding SQS's 250KB limit by proactively detecting when data approaches the threshold and truncating error messages.

Key Changes:

  • Added event size monitoring that tracks accumulated artifact size and triggers cleanup at 80% threshold (160KB)
  • Implemented automatic error message truncation to 1000 characters before emission
  • Added cleanup workflow that emits progress events and executes onTimeout when size limits are reached
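The size-monitoring behaviour summarized above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `SizeTracker` and `jsonByteLength` are hypothetical names, and only the 200,000-byte limit and the 80% factor are taken from the PR.

```typescript
// Values from the PR: 200_000-byte cap, threshold at 80% of it.
const MAX_EVENT_SIZE_BYTES = 200_000;
const EVENT_SIZE_THRESHOLD_BYTES = Math.floor(MAX_EVENT_SIZE_BYTES * 0.8); // 160_000

// Hypothetical helper: UTF-8 byte length of a value's JSON form.
function jsonByteLength(value: unknown): number {
  const json = JSON.stringify(value);
  // JSON.stringify yields undefined for inputs such as `undefined` or functions.
  return json === undefined ? 0 : new TextEncoder().encode(json).length;
}

// Hypothetical tracker: accumulates artifact sizes and reports the first
// time the running total crosses the threshold.
class SizeTracker {
  private currentBytes = 0;
  private triggered = false;

  // Returns true exactly once, on the call that first exceeds the threshold.
  add(artifact: unknown): boolean {
    this.currentBytes += jsonByteLength(artifact);
    if (!this.triggered && this.currentBytes > EVENT_SIZE_THRESHOLD_BYTES) {
      this.triggered = true;
      return true;
    }
    return false;
  }

  get bytes(): number {
    return this.currentBytes;
  }
}
```

In a worker, the `true` result would be the point at which a progress event is emitted and the cleanup path runs; how that is wired up is specific to the adapter and is not shown here.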

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/common/event-size-monitor.ts | New utility module providing size calculation, threshold checking, error message truncation, and logging functions |
| src/workers/worker-adapter.ts | Added size tracking in onUpload callback and error message pruning in emit method |
| src/workers/process-task.ts | Added post-task cleanup logic to execute onTimeout when size limit is triggered |

 */
export function getEventDataSize(data: EventData | undefined): number {
  if (!data) return 0;
  return JSON.stringify(data).length;

Copilot AI Nov 25, 2025

JSON.stringify().length calculates size in UTF-16 code units, not bytes. For accurate byte size calculation (especially for SQS limits), use Buffer.byteLength(JSON.stringify(data), 'utf8') or new TextEncoder().encode(JSON.stringify(data)).length.

Suggested change
- return JSON.stringify(data).length;
+ return Buffer.byteLength(JSON.stringify(data), 'utf8');
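A small sketch of the gap between the two measurements, for illustration. `utf8ByteLength` is a hypothetical helper built on `TextEncoder`; in Node.js, `Buffer.byteLength(text, 'utf8')` should give the same count.

```typescript
const encoder = new TextEncoder();

// Hypothetical helper: number of bytes the string occupies when encoded as UTF-8.
function utf8ByteLength(text: string): number {
  return encoder.encode(text).length;
}

// JSON.stringify leaves non-ASCII characters unescaped, so they cost
// more than one byte each once the string is encoded as UTF-8.
const asciiJson = JSON.stringify({ msg: 'hello' });
const accentedJson = JSON.stringify({ msg: 'héllo' });
const emojiJson = JSON.stringify({ msg: '😀' });

for (const json of [asciiJson, accentedJson, emojiJson]) {
  // .length counts UTF-16 code units; utf8ByteLength counts encoded bytes.
  console.log(json, json.length, utf8ByteLength(json));
}
```

For pure ASCII payloads the two numbers match; they diverge as soon as the payload contains accented characters, emoji, or other non-ASCII text, which is where a `.length`-based check can undercount.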

  itemType: repo.itemType,
  ...(shouldNormalize && { normalize: repo.normalize }),
  onUpload: (artifact: Artifact) => {
    const newLength = JSON.stringify(artifact).length;

Copilot AI Nov 25, 2025

JSON.stringify() is called here and the same artifact is likely stringified again later when emitted. Consider caching the stringified result to avoid redundant serialization.

Suggested change
- const newLength = JSON.stringify(artifact).length;
+ // Cache the stringified artifact to avoid redundant serialization
+ if (!('_stringified' in artifact)) {
+   (artifact as any)._stringified = JSON.stringify(artifact);
+ }
+ const newLength = (artifact as any)._stringified.length;
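One caveat with storing the cached string on the artifact itself is that an enumerable `_stringified` property would be included in any later `JSON.stringify(artifact)` call. A possible alternative, sketched here under the assumption that artifacts are plain objects, keeps the cache outside the object in a `WeakMap`. `stringifyOnce` is a hypothetical name and is not part of the PR.

```typescript
// Side cache keyed by object identity; entries are released together with
// the artifact because WeakMap does not keep its keys alive.
const serializedCache = new WeakMap<object, string>();

// Hypothetical helper: serializes an object at most once per identity.
// The cached string goes stale if the object is mutated after the first call.
function stringifyOnce(value: object): string {
  let json = serializedCache.get(value);
  if (json === undefined) {
    json = JSON.stringify(value);
    serializedCache.set(value, json);
  }
  return json;
}
```

Whether caching is worthwhile at all depends on whether the same artifact really is serialized a second time further down the pipeline, which the diff shown here does not confirm.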

Comment on lines +171 to +175
// Check for size limit (80% of 200KB = 160KB threshold)
if (
  this.currentLength > SIZE_LIMIT_THRESHOLD &&
  !this.hasWorkerEmitted
) {

Copilot AI Nov 25, 2025

The comment states '80% of 200KB = 160KB' but this should be 160,000 bytes. The comment should clarify that SIZE_LIMIT_THRESHOLD is defined elsewhere to avoid confusion about the actual numeric value being compared.

}

try {
// Always prune error messages to 1000 chars before emit

Copilot AI Nov 25, 2025

[nitpick] The comment says 'chars' but should say 'characters' for consistency with the function documentation in event-size-monitor.ts. Also, like Comment 3, this should reference that the 1000 limit is defined in the truncateErrorMessage function.

Suggested change
- // Always prune error messages to 1000 chars before emit
+ // Always prune error messages to 1000 characters before emit (limit defined in truncateErrorMessage)

if (!error) return undefined;

return {
  message: error.message.substring(0, maxLength),

Copilot AI Nov 25, 2025

Potential error if error.message is undefined or null. Add a null check: message: error.message?.substring(0, maxLength) ?? ''

Suggested change
- message: error.message.substring(0, maxLength),
+ message: error.message?.substring(0, maxLength) ?? '',
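For context, a self-contained sketch of what a null-safe truncation helper along these lines might look like. The `ErrorRecordLike` type and the exact signature are assumptions made for illustration; the real `ErrorRecord` type lives in `../types/common` and is not shown in this thread.

```typescript
// Hypothetical stand-in for the project's ErrorRecord type.
interface ErrorRecordLike {
  message?: string | null;
}

// The 1000-character limit is the one mentioned in the PR comments.
const DEFAULT_MAX_ERROR_LENGTH = 1000;

// Returns a copy of the error with its message cut to maxLength characters.
// A missing or null message becomes an empty string instead of throwing.
function truncateErrorMessage(
  error: ErrorRecordLike | undefined,
  maxLength: number = DEFAULT_MAX_ERROR_LENGTH,
): { message: string } | undefined {
  if (!error) return undefined;
  return {
    message: error.message?.substring(0, maxLength) ?? '',
  };
}
```

Note that `substring` counts UTF-16 code units, so a 1000-character message can still exceed 1000 bytes when it contains non-ASCII text.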

);
}

export { MAX_EVENT_SIZE_BYTES as MAX_EVENT_SIZE, EVENT_SIZE_THRESHOLD_BYTES as SIZE_LIMIT_THRESHOLD };

Copilot AI Nov 25, 2025

[nitpick] Exporting constants with renamed aliases on a separate line from function exports reduces readability. Consider moving these constant exports to the top of the file with the constant declarations or using separate export statements for clarity.
