Skip to content

fix[notask]: wrap job ID counter at MAX_SAFE_INTEGER to prevent precision loss#1085

Merged
sharmaraju352 merged 2 commits into
mainfrom
fix/whispercpp-parakeet-safe-job-id-counter
Mar 24, 2026
Merged

fix[notask]: wrap job ID counter at MAX_SAFE_INTEGER to prevent precision loss#1085
sharmaraju352 merged 2 commits into
mainfrom
fix/whispercpp-parakeet-safe-job-id-counter

Conversation

@sharmaraju352

Copy link
Copy Markdown
Contributor

Summary

  • Add nextSafeId() helper to both whispercpp (whisper.js) and parakeet (parakeet.js) that wraps the job ID counter back to 1 at Number.MAX_SAFE_INTEGER
  • Replaces all raw this._nextJobId += 1 increments (3 sites in whisper, 2 in parakeet)

Problem

The _nextJobId counter was incremented without bounds. JavaScript's Number.MAX_SAFE_INTEGER is 2^53 - 1. Beyond this, integer precision is lost and job ID collisions become theoretically possible in long-running server processes (e.g., benchmark servers with cached models).

Solution

function nextSafeId (current) {
  return current >= Number.MAX_SAFE_INTEGER ? 1 : current + 1
}

This preserves Number type compatibility for all existing consumers (no BigInt migration needed) while eliminating the precision loss risk.

How was it tested?

  • Whisper unit tests (before & after): 24/24 pass, 90/90 assertions
  • Parakeet unit tests (before & after): 11/11 pass, 55/55 assertions
  • SDK integration: whispercpp-filesystem.ts example produces identical output
  • JS-only change — no native addon rebuild required

Made with Cursor

…sion loss

The _nextJobId counter in WhisperInterface and ParakeetInterface was
incremented without bounds. After 2^53 increments, JavaScript loses
integer precision and job ID collisions become possible.

Replace raw += 1 with nextSafeId() that wraps back to 1 at
Number.MAX_SAFE_INTEGER, preserving Number type compatibility for
existing consumers.

Made-with: Cursor
@sharmaraju352 sharmaraju352 requested review from a team as code owners March 23, 2026 10:47
@github-actions

github-actions Bot commented Mar 23, 2026

Copy link
Copy Markdown
Contributor

Tier-based Approval Status

**PR Tier:** TIER1

**Current Status:** ✅ APPROVED

**Requirements:**
- 1 Team Member approval ✅ (1/1)
- 1 Team Lead OR Management approval ✅ (1/1)



---
*This comment is automatically updated when reviews change.*

@ogad-tether ogad-tether left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. This is a small, low-risk hardening change, and wrapping at is good enough given the one-active-job-per-instance model.

@ogad-tether

Copy link
Copy Markdown
Contributor

Clarifying my approval summary: wrapping the counter at Number.MAX_SAFE_INTEGER is a reasonable hardening step here given the one-active-job-per-instance model.

@sharmaraju352

Copy link
Copy Markdown
Contributor Author

/review

@sharmaraju352 sharmaraju352 merged commit b5b5754 into main Mar 24, 2026
23 of 24 checks passed
@sharmaraju352 sharmaraju352 deleted the fix/whispercpp-parakeet-safe-job-id-counter branch March 24, 2026 10:54
GustavoA1604 added a commit that referenced this pull request Mar 25, 2026
* fix: statically link parakeet prebuilds

Made-with: Cursor

* fix: restore parakeet linux runtime loading

Made-with: Cursor

* fix: address parakeet apple prebuild failures

Made-with: Cursor

* chore: remove parakeet release notes file

Made-with: Cursor

* fix: use static requires for mobile bare-pack bundling

The _resolve() helper used computed require paths that bare-pack
could not statically trace, so the addon modules were missing from
the mobile bundle. Use static string literals for mobile paths
(traced by bare-pack) and variable paths for desktop (skipped by
bare-pack since ../../ doesn't exist in the mobile layout).

Made-with: Cursor

* feat[notask]: add download profiler for registry blob performance diagnostics (#1040)

* feat[notask]: add download profiler for registry blob performance diagnostics

Made-with: Cursor

* fix: move profiler deps from devDependencies to dependencies

Made-with: Cursor

* doc: add profile command and example to client README

Made-with: Cursor

* fix: show full peer keys in profiler output for troubleshooting

Made-with: Cursor

* fix: validate parseInt results for interval and timeout CLI flags

Made-with: Cursor

---------

Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>
Co-authored-by: Simon Iribarren <simon.ig13@gmail.com>

* fix: resolve dependabot alerts for registry-server transitive deps (#1093)

* fix(registry-server): PBKDF2 for passphrase-derived keys (CodeQL #9) (#1065)

* fix(registry-server): derive passphrase keys with PBKDF2

Replace single-pass SHA-256 with PBKDF2-HMAC-SHA256 (310k iterations)
for deterministic test keys; addresses CodeQL js/insufficient-password-hash.

* chore(registry-server): remove passphrase migration note from guide

---------

Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>

* fix[notask]: lazy-load Node builtins in profiler for Bare runtime compatibility (#1096)

* fix[notask]: sanitize SSE output to prevent reflected XSS (#1027)

Co-authored-by: Marco <1369747+elchiapp@users.noreply.github.com>

* [Parakeet] QVAC-13814 feat: add automated benchmarks for parakeet ctc, eou and sortformer models (#991)

* feat: add automated benchmarks for parakeet ctc, eou and sortformer models

Add per-model benchmark config files (config-ctc.yaml, config-eou.yaml,
config-sortformer.yaml) with appropriate defaults for each model type.

Update the CI workflow to support an 'all' option that runs benchmarks
for every model type in a single matrix, and add a weekly schedule
trigger (Sunday 04:00 UTC) for automated regression benchmarking.

Add trigger scripts (trigger-benchmark.sh, trigger-benchmark-all.sh) for
convenient local invocation of benchmark workflows via gh CLI.

Made-with: Cursor

* fix: make prebuilds step non-fatal with npm fallback

When CI prebuilds are not available (no successful prebuilds workflow
run), fall back to installing @qvac/transcription-parakeet from npm
instead of failing the entire benchmark job.

Made-with: Cursor

* fix: use python 3.13 for benchmark client compatibility

Python 3.14 changed Pickler._batch_setitems() signature which breaks
the datasets library. Pin to 3.13 until upstream compatibility is fixed.

Made-with: Cursor

* fix: add named model paths in benchmark server for ctc/eou/sortformer

The addon requires model-type-specific named paths (e.g. ctcModelPath,
eouEncoderPath, sortformerPath) when activating non-TDT models. Add
getNamedPaths() that resolves the correct file paths per model type and
spreads them into the parakeetConfig passed to the addon constructor.

Made-with: Cursor

* fix: spread named paths at config top level, not inside parakeetConfig

The addon reads ctcModelPath/eouEncoderPath/sortformerPath from the
top-level config object (this._config), not from parakeetConfig.

Made-with: Cursor

* fix: use public cgus repo for sortformer model download

The tetherto/sortformer-4spk-v2-onnx HuggingFace repo is gated and
returns an invalid file. Use the public cgus community repo that the
integration tests already rely on.

Made-with: Cursor

* chore: remove redundant trigger-benchmark-all.sh

trigger-benchmark.sh already supports -t all, making the separate
trigger-benchmark-all.sh unnecessary.

Made-with: Cursor

* chore: remove scheduled cron trigger from benchmark workflow

Per review feedback — "automated" means triggered via workflow_dispatch,
not periodic autonomous runs.

Made-with: Cursor

* fix: correct workflow fallback default and remove dead code in trigger script

- Change MODEL_TYPE fallback from 'all' to 'tdt' to match the
  workflow_dispatch UI default
- Replace unreachable $? check (dead code under set -e) with proper
  if-not construct in trigger-benchmark.sh

Made-with: Cursor

---------

Co-authored-by: Raju <raju.sharma>

* fix[notask]: replace global streaming state with per-instance map in whispercpp (#1079)

The streaming processor used three process-global variables (g_streamingMtx,
g_streamingInstance, g_streamingProcessor) which limited the entire process
to a single streaming session and risked dangling-pointer access if the
owning AddonJs instance was destroyed without cleanup.

Replace with an unordered_map keyed by AddonJs* so each addon instance
independently owns its streaming session, eliminating the race condition
and enabling concurrent streaming across multiple instances.

Made-with: Cursor

Co-authored-by: Raju <raju.sharma>

* chore[notask]: replace deprecated istanbul with nyc in decoder-audio (#1082)

* chore[notask]: replace deprecated istanbul with nyc in decoder-audio

The istanbul package has been deprecated since 2016 and carries known
vulnerable transitive dependencies (minimatch ReDoS, uglify-js ReDoS).
Replace with nyc ^17.1.0 (the actively maintained successor) and update
coverage scripts to use nyc CLI syntax.

Made-with: Cursor

* fix[notask]: fix nyc coverage report command to use .nyc_output directory

The nyc report command expects coverage data in .nyc_output/ rather
than reading from --temp-dir directly. Copy brittle's coverage-final.json
into .nyc_output/ before running nyc report so the HTML report generates
cleanly without format warnings.

Made-with: Cursor

---------

Co-authored-by: Raju <raju.sharma>

* Updated dependencies with android-arm64 fix (#1095)

Co-authored-by: gianni <gianfranco.cordella@tether.io>

* fix[notask]: sanitize error messages to prevent filesystem path leakage (#1084)

Error messages in whispercpp and parakeet validateModelFiles() included
full filesystem paths (e.g. "Model file doesn't exist: /home/user/...").
When surfaced via API responses this reveals internal server layout.

Log the full path at debug/error level for operators, but throw generic
messages without paths to callers.

Made-with: Cursor

Co-authored-by: Raju <raju.sharma>

* fix[notask]: wrap job ID counter at MAX_SAFE_INTEGER to prevent precision loss (#1085)

The _nextJobId counter in WhisperInterface and ParakeetInterface was
incremented without bounds. After 2^53 increments, JavaScript loses
integer precision and job ID collisions become possible.

Replace raw += 1 with nextSafeId() that wraps back to 1 at
Number.MAX_SAFE_INTEGER, preserving Number type compatibility for
existing consumers.

Made-with: Cursor

Co-authored-by: Raju <raju.sharma>

* fix: catch unhandled rejections in mobile integration runtime

Register Bare.on('unhandledRejection') and Bare.on('uncaughtException')
handlers to prevent the runtime from aborting (SIGABRT) when network
errors escape the promise chain during model downloads.

Made-with: Cursor

* fix: bundle audio samples and resolve asset paths for mobile tests

Add sample-16k.wav, French.raw, and croatian.raw to testAssets so
integration tests can run transcription on mobile without downloading.
Update getTestPaths to resolve samplesDir from the bundled asset
manifest on mobile instead of a non-existent writableRoot/samples path.

Made-with: Cursor

* chore: bump parakeet to 0.2.4

Made-with: Cursor

* chore: bump parakeet to 0.2.5

Made-with: Cursor

---------

Co-authored-by: Raju <raju.sharma>
Co-authored-by: Yury Samarin <yuri.a.samarin@gmail.com>
Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>
Co-authored-by: Simon Iribarren <simon.ig13@gmail.com>
Co-authored-by: Marco <1369747+elchiapp@users.noreply.github.com>
Co-authored-by: Raju Sharma <sharmaraju352@gmail.com>
Co-authored-by: Juan Pablo Garibotti Arias <juan.arias@bitfinex.com>
Co-authored-by: gianni <gianfranco.cordella@tether.io>
Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>
Proletter pushed a commit that referenced this pull request May 24, 2026
…sion loss (#1085)

The _nextJobId counter in WhisperInterface and ParakeetInterface was
incremented without bounds. After 2^53 increments, JavaScript loses
integer precision and job ID collisions become possible.

Replace raw += 1 with nextSafeId() that wraps back to 1 at
Number.MAX_SAFE_INTEGER, preserving Number type compatibility for
existing consumers.

Made-with: Cursor

Co-authored-by: Raju <raju.sharma>
Proletter added a commit that referenced this pull request May 24, 2026
* fix: statically link parakeet prebuilds

Made-with: Cursor

* fix: restore parakeet linux runtime loading

Made-with: Cursor

* fix: address parakeet apple prebuild failures

Made-with: Cursor

* chore: remove parakeet release notes file

Made-with: Cursor

* fix: use static requires for mobile bare-pack bundling

The _resolve() helper used computed require paths that bare-pack
could not statically trace, so the addon modules were missing from
the mobile bundle. Use static string literals for mobile paths
(traced by bare-pack) and variable paths for desktop (skipped by
bare-pack since ../../ doesn't exist in the mobile layout).

Made-with: Cursor

* feat[notask]: add download profiler for registry blob performance diagnostics (#1040)

* feat[notask]: add download profiler for registry blob performance diagnostics

Made-with: Cursor

* fix: move profiler deps from devDependencies to dependencies

Made-with: Cursor

* doc: add profile command and example to client README

Made-with: Cursor

* fix: show full peer keys in profiler output for troubleshooting

Made-with: Cursor

* fix: validate parseInt results for interval and timeout CLI flags

Made-with: Cursor

---------

Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>
Co-authored-by: Simon Iribarren <simon.ig13@gmail.com>

* fix: resolve dependabot alerts for registry-server transitive deps (#1093)

* fix(registry-server): PBKDF2 for passphrase-derived keys (CodeQL #9) (#1065)

* fix(registry-server): derive passphrase keys with PBKDF2

Replace single-pass SHA-256 with PBKDF2-HMAC-SHA256 (310k iterations)
for deterministic test keys; addresses CodeQL js/insufficient-password-hash.

* chore(registry-server): remove passphrase migration note from guide

---------

Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>

* fix[notask]: lazy-load Node builtins in profiler for Bare runtime compatibility (#1096)

* fix[notask]: sanitize SSE output to prevent reflected XSS (#1027)

Co-authored-by: Marco <1369747+elchiapp@users.noreply.github.com>

* [Parakeet] QVAC-13814 feat: add automated benchmarks for parakeet ctc, eou and sortformer models (#991)

* feat: add automated benchmarks for parakeet ctc, eou and sortformer models

Add per-model benchmark config files (config-ctc.yaml, config-eou.yaml,
config-sortformer.yaml) with appropriate defaults for each model type.

Update the CI workflow to support an 'all' option that runs benchmarks
for every model type in a single matrix, and add a weekly schedule
trigger (Sunday 04:00 UTC) for automated regression benchmarking.

Add trigger scripts (trigger-benchmark.sh, trigger-benchmark-all.sh) for
convenient local invocation of benchmark workflows via gh CLI.

Made-with: Cursor

* fix: make prebuilds step non-fatal with npm fallback

When CI prebuilds are not available (no successful prebuilds workflow
run), fall back to installing @qvac/transcription-parakeet from npm
instead of failing the entire benchmark job.

Made-with: Cursor

* fix: use python 3.13 for benchmark client compatibility

Python 3.14 changed Pickler._batch_setitems() signature which breaks
the datasets library. Pin to 3.13 until upstream compatibility is fixed.

Made-with: Cursor

* fix: add named model paths in benchmark server for ctc/eou/sortformer

The addon requires model-type-specific named paths (e.g. ctcModelPath,
eouEncoderPath, sortformerPath) when activating non-TDT models. Add
getNamedPaths() that resolves the correct file paths per model type and
spreads them into the parakeetConfig passed to the addon constructor.

Made-with: Cursor

* fix: spread named paths at config top level, not inside parakeetConfig

The addon reads ctcModelPath/eouEncoderPath/sortformerPath from the
top-level config object (this._config), not from parakeetConfig.

Made-with: Cursor

* fix: use public cgus repo for sortformer model download

The tetherto/sortformer-4spk-v2-onnx HuggingFace repo is gated and
returns an invalid file. Use the public cgus community repo that the
integration tests already rely on.

Made-with: Cursor

* chore: remove redundant trigger-benchmark-all.sh

trigger-benchmark.sh already supports -t all, making the separate
trigger-benchmark-all.sh unnecessary.

Made-with: Cursor

* chore: remove scheduled cron trigger from benchmark workflow

Per review feedback — "automated" means triggered via workflow_dispatch,
not periodic autonomous runs.

Made-with: Cursor

* fix: correct workflow fallback default and remove dead code in trigger script

- Change MODEL_TYPE fallback from 'all' to 'tdt' to match the
  workflow_dispatch UI default
- Replace unreachable $? check (dead code under set -e) with proper
  if-not construct in trigger-benchmark.sh

Made-with: Cursor

---------

Co-authored-by: Raju <raju.sharma>

* fix[notask]: replace global streaming state with per-instance map in whispercpp (#1079)

The streaming processor used three process-global variables (g_streamingMtx,
g_streamingInstance, g_streamingProcessor) which limited the entire process
to a single streaming session and risked dangling-pointer access if the
owning AddonJs instance was destroyed without cleanup.

Replace with an unordered_map keyed by AddonJs* so each addon instance
independently owns its streaming session, eliminating the race condition
and enabling concurrent streaming across multiple instances.

Made-with: Cursor

Co-authored-by: Raju <raju.sharma>

* chore[notask]: replace deprecated istanbul with nyc in decoder-audio (#1082)

* chore[notask]: replace deprecated istanbul with nyc in decoder-audio

The istanbul package has been deprecated since 2016 and carries known
vulnerable transitive dependencies (minimatch ReDoS, uglify-js ReDoS).
Replace with nyc ^17.1.0 (the actively maintained successor) and update
coverage scripts to use nyc CLI syntax.

Made-with: Cursor

* fix[notask]: fix nyc coverage report command to use .nyc_output directory

The nyc report command expects coverage data in .nyc_output/ rather
than reading from --temp-dir directly. Copy brittle's coverage-final.json
into .nyc_output/ before running nyc report so the HTML report generates
cleanly without format warnings.

Made-with: Cursor

---------

Co-authored-by: Raju <raju.sharma>

* Updated dependencies with android-arm64 fix (#1095)

Co-authored-by: gianni <gianfranco.cordella@tether.io>

* fix[notask]: sanitize error messages to prevent filesystem path leakage (#1084)

Error messages in whispercpp and parakeet validateModelFiles() included
full filesystem paths (e.g. "Model file doesn't exist: /home/user/...").
When surfaced via API responses this reveals internal server layout.

Log the full path at debug/error level for operators, but throw generic
messages without paths to callers.

Made-with: Cursor

Co-authored-by: Raju <raju.sharma>

* fix[notask]: wrap job ID counter at MAX_SAFE_INTEGER to prevent precision loss (#1085)

The _nextJobId counter in WhisperInterface and ParakeetInterface was
incremented without bounds. After 2^53 increments, JavaScript loses
integer precision and job ID collisions become possible.

Replace raw += 1 with nextSafeId() that wraps back to 1 at
Number.MAX_SAFE_INTEGER, preserving Number type compatibility for
existing consumers.

Made-with: Cursor

Co-authored-by: Raju <raju.sharma>

* fix: catch unhandled rejections in mobile integration runtime

Register Bare.on('unhandledRejection') and Bare.on('uncaughtException')
handlers to prevent the runtime from aborting (SIGABRT) when network
errors escape the promise chain during model downloads.

Made-with: Cursor

* fix: bundle audio samples and resolve asset paths for mobile tests

Add sample-16k.wav, French.raw, and croatian.raw to testAssets so
integration tests can run transcription on mobile without downloading.
Update getTestPaths to resolve samplesDir from the bundled asset
manifest on mobile instead of a non-existent writableRoot/samples path.

Made-with: Cursor

* chore: bump parakeet to 0.2.4

Made-with: Cursor

* chore: bump parakeet to 0.2.5

Made-with: Cursor

---------

Co-authored-by: Raju <raju.sharma>
Co-authored-by: Yury Samarin <yuri.a.samarin@gmail.com>
Co-authored-by: Proletter <40578159+Proletter@users.noreply.github.com>
Co-authored-by: Simon Iribarren <simon.ig13@gmail.com>
Co-authored-by: Marco <1369747+elchiapp@users.noreply.github.com>
Co-authored-by: Raju Sharma <sharmaraju352@gmail.com>
Co-authored-by: Juan Pablo Garibotti Arias <juan.arias@bitfinex.com>
Co-authored-by: gianni <gianfranco.cordella@tether.io>
Co-authored-by: GustavoA1604 <54457676+GustavoA1604@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants