fix(cli): retry + cache prebuild downloads in build-dist; add docker test#4051
Conversation
…test
cli-v0.2.4 darwin-arm64 release build failed with `curl: (56) ... 502`
fetching the better-sqlite3 prebuild from GitHub Releases. The curl in
build-dist.ts had no retry, so a single transient 502 brick the entire
matrix run.
Two changes:
1. Wrap all curl invocations in `curlDownload()` with
`--retry 6 --retry-delay 2 --retry-all-errors` (GitHub Releases 5xx
counts as a retryable error only with this flag), tight connect/max
timeouts, and a `.partial → rename` atomic write so a half-finished
download can't masquerade as a cache hit on the next run.
Also cache the better-sqlite3 prebuild on disk like the Node tarball
already is, keyed by version + ABI + target — so re-running the build
after a partial failure is free.
2. Add `packages/cli/scripts/build-dist-linux-docker.sh` to reproduce
the GH Actions Linux build flow (matches `.github/workflows/build-cli.yml`)
inside `oven/bun:<.bun-version>`, with both linux-x64 and linux-arm64
targets. Runs the same install + npm rebuild + build-dist + smoke-test
sequence CI runs, so we can validate workflow changes locally instead
of burning release tag pushes.
Verified locally on darwin-arm64 — full build + smoke test
(`require('better-sqlite3')` etc.) passes; cache hit confirmed on
second run.
📝 WalkthroughWalkthroughAdds Docker-based Linux CLI build automation and improves download/caching infrastructure in the TypeScript build script. The Docker script reproduces Linux builds ( ChangesBuild Infrastructure: Download Helpers & Caching
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.Comment |
Greptile SummaryThis PR adds retry + atomic-write logic to Confidence Score: 4/5Safe to merge; changes are well-contained build-script improvements with no product-code impact. Only P2 findings present. The core retry + caching logic in build-dist.ts is correct — atomic partial write, cache-key includes version + ABI + target, and --retry-all-errors is the right flag for 5xx coverage. The Docker script has minor inconsistencies (hardcoded Node version, unretried curl) but neither blocks the intended goal. No files require special attention.
|
| Filename | Overview |
|---|---|
| packages/cli/scripts/build-dist.ts | Adds curlDownload() helper with 6-retry logic, atomic .partial → rename writes, and disk-caching for the better-sqlite3 prebuild tarball. Both changes are well-scoped and correct. |
| packages/cli/scripts/build-dist-linux-docker.sh | New local Docker harness that mirrors CI. NODE_VERSION for the build-tool container (22.22.2) is intentionally different from the bundled runtime (22.13.0); ABI compatibility is maintained within Node 22.x, but the lack of a comment explaining this and the unretried curl |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[build-dist.ts starts] --> B{Node tarball cached?}
B -- No --> C[curlDownload to archivePath.partial\nretry 6, delay 2s, --retry-all-errors]
C --> D[renameSync .partial → archivePath]
D --> E[tar -xzf archivePath]
B -- Yes --> E
E --> F[Copy node binary to dist]
F --> G{better-sqlite3 tarball cached?}
G -- No --> H[curlDownload to cachedTarball.partial\nretry 6, delay 2s, --retry-all-errors]
H --> I[renameSync .partial → cachedTarball]
I --> J[tar -xzf cachedTarball to tmp dir]
G -- Yes --> J
J --> K[cpSync .node binary to dist]
K --> L[Build complete]
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
packages/cli/scripts/build-dist-linux-docker.sh:26
**`NODE_VERSION` differs from the bundled runtime**
The Docker script hardcodes `NODE_VERSION="22.22.2"` for the container's build tooling, while `build-dist.ts` defines `NODE_VERSION = "22.13.0"` as the Node runtime bundled into the distribution. These intentionally serve different roles, but they're both Node 22.x (ABI 127) so native modules compiled against 22.22.2 in the container will still be compatible with the bundled 22.13.0. However, if the CI workflow's *build tool* Node version ever diverges from 22.x (e.g., bumped to v24), the ABI assumption breaks and the smoke test would pass locally while CI fails. Consider extracting the build-tool Node version from `.nvmrc` / `build-cli.yml` or at least adding a comment linking the two constants.
### Issue 2 of 2
packages/cli/scripts/build-dist-linux-docker.sh:48-49
**Single-shot `curl | tar` in the container setup**
The whole motivation for this PR is that a transient 502 from GitHub/nodejs.org kills the pipeline. The Node.js installation step inside the container (`curl -fsSL … | tar -xJ`) uses bare curl with no retry flags, so a transient failure here still kills the docker run. Since this is a developer-local script (not CI), the impact is low, but it's inconsistent with the fix being applied to `build-dist.ts`. Adding `--retry 3 --retry-delay 2 --retry-all-errors` to this curl call would make it consistent.
Reviews (1): Last reviewed commit: "fix(cli): retry + cache prebuild downloa..." | Re-trigger Greptile
| BUN_VERSION="$(cat "$REPO_ROOT/.bun-version")" | ||
| NODE_VERSION="22.22.2" | ||
|
|
||
| echo "[docker-build] target=$TARGET platform=$PLATFORM bun=$BUN_VERSION node=$NODE_VERSION" |
There was a problem hiding this comment.
NODE_VERSION differs from the bundled runtime
The Docker script hardcodes NODE_VERSION="22.22.2" for the container's build tooling, while build-dist.ts defines NODE_VERSION = "22.13.0" as the Node runtime bundled into the distribution. These intentionally serve different roles, but they're both Node 22.x (ABI 127) so native modules compiled against 22.22.2 in the container will still be compatible with the bundled 22.13.0. However, if the CI workflow's build tool Node version ever diverges from 22.x (e.g., bumped to v24), the ABI assumption breaks and the smoke test would pass locally while CI fails. Consider extracting the build-tool Node version from .nvmrc / build-cli.yml or at least adding a comment linking the two constants.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/cli/scripts/build-dist-linux-docker.sh
Line: 26
Comment:
**`NODE_VERSION` differs from the bundled runtime**
The Docker script hardcodes `NODE_VERSION="22.22.2"` for the container's build tooling, while `build-dist.ts` defines `NODE_VERSION = "22.13.0"` as the Node runtime bundled into the distribution. These intentionally serve different roles, but they're both Node 22.x (ABI 127) so native modules compiled against 22.22.2 in the container will still be compatible with the bundled 22.13.0. However, if the CI workflow's *build tool* Node version ever diverges from 22.x (e.g., bumped to v24), the ABI assumption breaks and the smoke test would pass locally while CI fails. Consider extracting the build-tool Node version from `.nvmrc` / `build-cli.yml` or at least adding a comment linking the two constants.
How can I resolve this? If you propose a fix, please make it concise.| bun --version | ||
|
|
There was a problem hiding this comment.
Single-shot
curl | tar in the container setup
The whole motivation for this PR is that a transient 502 from GitHub/nodejs.org kills the pipeline. The Node.js installation step inside the container (curl -fsSL … | tar -xJ) uses bare curl with no retry flags, so a transient failure here still kills the docker run. Since this is a developer-local script (not CI), the impact is low, but it's inconsistent with the fix being applied to build-dist.ts. Adding --retry 3 --retry-delay 2 --retry-all-errors to this curl call would make it consistent.
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/cli/scripts/build-dist-linux-docker.sh
Line: 48-49
Comment:
**Single-shot `curl | tar` in the container setup**
The whole motivation for this PR is that a transient 502 from GitHub/nodejs.org kills the pipeline. The Node.js installation step inside the container (`curl -fsSL … | tar -xJ`) uses bare curl with no retry flags, so a transient failure here still kills the docker run. Since this is a developer-local script (not CI), the impact is low, but it's inconsistent with the fix being applied to `build-dist.ts`. Adding `--retry 3 --retry-delay 2 --retry-all-errors` to this curl call would make it consistent.
How can I resolve this? If you propose a fix, please make it concise.
🧹 Preview Cleanup CompleteThe following preview resources have been cleaned up:
Thank you for your contribution! 🎉 |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/cli/scripts/build-dist-linux-docker.sh (1)
45-46: ⚡ Quick winNode.js download inside the container is a single-shot
curlwith no retry flags.The whole motivation of this PR is resilient downloads, but the container's Node tarball fetch is a plain pipe without
--retry,--connect-timeout, or--max-time. A transient 5xx or connection reset here fails the entire Docker run.♻️ Proposed fix
- curl -fsSL "https://nodejs.org/dist/v${NODE_VERSION}/node-v${NODE_VERSION}-linux-${NODE_ARCH}.tar.xz" \ - | tar -xJ -C /usr/local --strip-components=1 + curl -fsSL --retry 6 --retry-delay 2 --retry-all-errors \ + --connect-timeout 15 --max-time 180 \ + -o /tmp/node.tar.xz \ + "https://nodejs.org/dist/v${NODE_VERSION}/node-v${NODE_VERSION}-linux-${NODE_ARCH}.tar.xz" + tar -xJ -C /usr/local --strip-components=1 -f /tmp/node.tar.xz + rm /tmp/node.tar.xzSplitting the download and extract also means a partial download is never fed to
tar.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/cli/scripts/build-dist-linux-docker.sh` around lines 45 - 46, Replace the one-shot piped curl+tar with a resilient two-step download and extract: use curl with --fail plus retry flags (e.g., --retry, --retry-connrefused, --retry-delay), connection and overall timeouts (e.g., --connect-timeout, --max-time) to download the Node tarball to a temporary file (referencing NODE_VERSION and NODE_ARCH from the script), verify the download succeeded (non-empty and curl exit code), then run tar -xJ -C /usr/local --strip-components=1 against that temp file, and finally remove the temp file; update the block that currently contains the piped curl/tar so it uses the temp file approach and cleans up on error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/cli/scripts/build-dist-linux-docker.sh`:
- Around line 58-59: The PTY_DIR assignment uses `ls -d` which can produce
multiple matches and break `cd "$PTY_DIR"`; replace that with shell glob
expansion into an array (matching
node_modules/.bun/node-pty@*/node_modules/node-pty), validate the array
cardinality (if zero -> exit with a clear error, if >1 -> exit with a clear
error instructing to clean the Bun store), then set PTY_DIR to the single
matched path and run the `(cd "$PTY_DIR" && npx --yes node-gyp rebuild)` step;
reference the PTY_DIR variable, the node-pty glob, and the node-gyp rebuild
invocation so reviewers can find and apply the change.
---
Nitpick comments:
In `@packages/cli/scripts/build-dist-linux-docker.sh`:
- Around line 45-46: Replace the one-shot piped curl+tar with a resilient
two-step download and extract: use curl with --fail plus retry flags (e.g.,
--retry, --retry-connrefused, --retry-delay), connection and overall timeouts
(e.g., --connect-timeout, --max-time) to download the Node tarball to a
temporary file (referencing NODE_VERSION and NODE_ARCH from the script), verify
the download succeeded (non-empty and curl exit code), then run tar -xJ -C
/usr/local --strip-components=1 against that temp file, and finally remove the
temp file; update the block that currently contains the piped curl/tar so it
uses the temp file approach and cleans up on error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 45918a7f-ff95-4e62-87d1-50bdcc50f8f6
📒 Files selected for processing (2)
packages/cli/scripts/build-dist-linux-docker.shpackages/cli/scripts/build-dist.ts
| PTY_DIR=$(ls -d node_modules/.bun/node-pty@*/node_modules/node-pty) | ||
| (cd "$PTY_DIR" && npx --yes node-gyp rebuild) |
There was a problem hiding this comment.
ls -d glob for node-pty directory is fragile with multiple-match results.
If the Bun store contains more than one node-pty@* entry (e.g., after a version bump without a clean install), ls returns multiple lines, PTY_DIR becomes a multi-line string, and cd "$PTY_DIR" fails with "too many arguments" — a confusing error that doesn't point back to this line. Use a glob with explicit cardinality validation instead.
♻️ Proposed fix
- PTY_DIR=$(ls -d node_modules/.bun/node-pty@*/node_modules/node-pty)
- (cd "$PTY_DIR" && npx --yes node-gyp rebuild)
+ mapfile -t PTY_DIRS < <(ls -d node_modules/.bun/node-pty@*/node_modules/node-pty 2>/dev/null || true)
+ if [[ ${`#PTY_DIRS`[@]} -ne 1 ]]; then
+ echo "Expected exactly 1 node-pty in Bun store, found ${`#PTY_DIRS`[@]}" >&2; exit 1
+ fi
+ (cd "${PTY_DIRS[0]}" && npx --yes node-gyp rebuild)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| PTY_DIR=$(ls -d node_modules/.bun/node-pty@*/node_modules/node-pty) | |
| (cd "$PTY_DIR" && npx --yes node-gyp rebuild) | |
| mapfile -t PTY_DIRS < <(ls -d node_modules/.bun/node-pty@*/node_modules/node-pty 2>/dev/null || true) | |
| if [[ ${`#PTY_DIRS`[@]} -ne 1 ]]; then | |
| echo "Expected exactly 1 node-pty in Bun store, found ${`#PTY_DIRS`[@]}" >&2; exit 1 | |
| fi | |
| (cd "${PTY_DIRS[0]}" && npx --yes node-gyp rebuild) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/cli/scripts/build-dist-linux-docker.sh` around lines 58 - 59, The
PTY_DIR assignment uses `ls -d` which can produce multiple matches and break `cd
"$PTY_DIR"`; replace that with shell glob expansion into an array (matching
node_modules/.bun/node-pty@*/node_modules/node-pty), validate the array
cardinality (if zero -> exit with a clear error, if >1 -> exit with a clear
error instructing to clean the Bun store), then set PTY_DIR to the single
matched path and run the `(cd "$PTY_DIR" && npx --yes node-gyp rebuild)` step;
reference the PTY_DIR variable, the node-pty glob, and the node-gyp rebuild
invocation so reviewers can find and apply the change.
…test (superset-sh#4051) cli-v0.2.4 darwin-arm64 release build failed with `curl: (56) ... 502` fetching the better-sqlite3 prebuild from GitHub Releases. The curl in build-dist.ts had no retry, so a single transient 502 brick the entire matrix run. Two changes: 1. Wrap all curl invocations in `curlDownload()` with `--retry 6 --retry-delay 2 --retry-all-errors` (GitHub Releases 5xx counts as a retryable error only with this flag), tight connect/max timeouts, and a `.partial → rename` atomic write so a half-finished download can't masquerade as a cache hit on the next run. Also cache the better-sqlite3 prebuild on disk like the Node tarball already is, keyed by version + ABI + target — so re-running the build after a partial failure is free. 2. Add `packages/cli/scripts/build-dist-linux-docker.sh` to reproduce the GH Actions Linux build flow (matches `.github/workflows/build-cli.yml`) inside `oven/bun:<.bun-version>`, with both linux-x64 and linux-arm64 targets. Runs the same install + npm rebuild + build-dist + smoke-test sequence CI runs, so we can validate workflow changes locally instead of burning release tag pushes. Verified locally on darwin-arm64 — full build + smoke test (`require('better-sqlite3')` etc.) passes; cache hit confirmed on second run.
Why
After landing #4049,
cli-v0.2.4was re-tagged. Linux now passes (run) — but darwin-arm64 just failed insidebuild-dist.ts:A single transient GitHub Releases 502 → full matrix is dead. Same single-shot curl was used for the Node download too, so this is a strictly weaker sibling of the same failure mode.
What changes
1. Resilient prebuild downloads (
packages/cli/scripts/build-dist.ts)curlDownload(url, dest)with:--retry 6 --retry-delay 2 --retry-all-errors—--retry-all-errorsis required for curl to retry on 5xx (default retry only covers transport errors).--connect-timeout 15 --max-time 180${dest}.partial, thenrenameSync— a half-finished file can't masquerade as a cache hit on the next run.version + ABI + target. Re-running build-dist after a transient failure is now free.2. Local Docker harness (
packages/cli/scripts/build-dist-linux-docker.sh)Reproduces the GH Actions Linux flow exactly (matches
.github/workflows/build-cli.yml) inside theoven/bun:<.bun-version>image, for eitherlinux-x64orlinux-arm64. Runs the samebun install --frozen --ignore-scripts → npm rebuild → build-dist → require()smoke test the CI workflow runs.Usage:
We can now validate
build-cli.yml/build-dist.tschanges locally without burning release tag pushes.Verification
bun run build:dist --target=darwin-arm64runs locally end-to-end against the patchedbuild-dist.ts; smoke test (require('better-sqlite3'),node-pty,@parcel/watcher,libsqlfrom the bundled Node) all OK.[build-dist] using cached better-sqlite3: ...instead of re-fetching.linux-arm64while writing this PR; will note the result in a follow-up comment.Test plan
cli-v0.2.4(delete + re-push the tag at the new main HEAD) to verify all three matrix targets land artifacts and the draft release +cli-latestpointer are createdpackages/cli/scripts/build-dist-linux-docker.sh linux-arm64from a contributor's machine runs cleanly end-to-endSummary by cubic
Make
build-distdownloads resilient with retries, timeouts, and atomic writes, and cache thebetter-sqlite3prebuild to prevent flaky 5xx failures. Add a Docker harness to reproduce the Linux CI build locally forlinux-x64andlinux-arm64.Bug Fixes
curlDownload()with--retry 6 --retry-delay 2 --retry-all-errors, tight timeouts, and atomic.partial→ rename writes.better-sqlite3tarball by version/ABI/target in~/.superset-build-cache; applies the retry wrapper to Node andbetter-sqlite3downloads.New Features
packages/cli/scripts/build-dist-linux-docker.shto mirror the GitHub Actions Linux flow inoven/bun:<.bun-version>.linux-x64andlinux-arm64; runs install, rebuildsnode-ptyand@parcel/watcher, builds, and smoke-tests (require('better-sqlite3'),node-pty,@parcel/watcher,libsql).Written for commit 2217de4. Summary will update on new commits.
Summary by CodeRabbit