tools(git): push-with-retry.sh — DST-decided retry for genuinely external GitHub 5xx#169
… investigation

Aaron pushed back on a premature retry wrapper (PR #169): retries are a non-determinism smell; DST holds throughout except at genuinely external uncontrollable boundaries (after explicit investigation).

Investigation via GIT_TRACE + GIT_CURL_VERBOSE confirmed:
- Local git config clean (no trailing slash)
- On-wire URL is `/Lucent-Financial-Group/Zeta.git/git-upload-pack` (correct per spec, not a bug)
- HTTP 500 returns from GitHub itself, intermittent

Decision: retry IS legitimate (DST-exception case) but the landing order was wrong. PR #169 updated with full investigation chain + paper trail showing the correction.

Per-user memory filed: feedback_retries_are_non_determinism_smell_DST_holds_investigate_first_2026_04_23.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0c616c760a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Pull request overview
Adds a small Bash wrapper around git push to automatically retry on transient GitHub 5xx failures, with backoff and env overrides.
Changes:
- Introduces `tools/git/push-with-retry.sh` to retry `git push` up to N times on matched 5xx text.
- Adds exponential backoff and environment-variable knobs for attempts/backoff.
- Documents the investigation rationale and usage in the script header.
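The backoff and environment knobs summarized above can be sketched as follows. This is a minimal illustration, not the script's actual code: the helper names `is_uint`, `backoff_delay` and `backoff_schedule` are hypothetical, and doubling the wait on each attempt is an assumption (the PR only says "exponential backoff starting at 2s"). Only `GIT_PUSH_MAX_ATTEMPTS` and `GIT_PUSH_BACKOFF_S` come from the PR.

```bash
#!/usr/bin/env bash
# Sketch only. is_uint, backoff_delay and backoff_schedule are hypothetical
# helper names, not functions taken from tools/git/push-with-retry.sh.

# Succeed only when $1 is a non-negative decimal integer. Leading zeros are
# rejected because shell arithmetic would read them as octal.
is_uint() {
  case "$1" in
    ''|*[!0-9]*|0?*) return 1 ;;
    *) return 0 ;;
  esac
}

# Print the wait in seconds after failed attempt $1 (1-based) for base $2.
# Doubling per attempt is an assumption; the PR only says "exponential".
backoff_delay() {
  echo $(( $2 * (1 << ($1 - 1)) ))
}

# Print the wait schedule implied by the two environment knobs.
# Returns 2 on invalid input, mirroring the env-validation exit code.
backoff_schedule() {
  local max base attempt
  max="${GIT_PUSH_MAX_ATTEMPTS:-3}"
  base="${GIT_PUSH_BACKOFF_S:-2}"
  if ! is_uint "$max" || ! is_uint "$base"; then
    echo "error: knobs must be non-negative integers" >&2
    return 2
  fi
  attempt=1
  # No wait is scheduled after the final attempt (an assumption).
  while [ "$attempt" -lt "$max" ]; do
    echo "after attempt $attempt: wait $(backoff_delay "$attempt" "$base")s"
    attempt=$(( attempt + 1 ))
  done
}
```

With the defaults (3 attempts, 2s base), `backoff_schedule` prints a 2s wait after the first attempt and a 4s wait after the second.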
Aaron 2026-04-23 noted recurring Internal Server Error 500s from GitHub during autonomous-loop tick-close commits. A retry of the same `git push` has succeeded on next attempt within seconds each time.

Script: retry up to 3 times (env GIT_PUSH_MAX_ATTEMPTS) on matched 5xx error text (500/502/503/504/Internal Server Error/Bad Gateway/Service Unavailable/Gateway Timeout) with exponential backoff starting at 2s (env GIT_PUSH_BACKOFF_S). Non-transient errors (auth / protected branch / hook / divergence) propagate immediately without retry.

Thin-wrapper-over-existing-CLI exemption from the bun+TS default per docs/POST-SETUP-SCRIPT-STACK.md Q3.

Open root-cause question noted in the header: Aaron spotted `Zeta.git/` trailing slash in the error URL. Local git config has the canonical `.git` form with no trailing slash, so the trailing slash may be an artifact of git's URL-error formatter. If root cause turns out to be URL-construction, a repo-level fix would supersede this wrapper.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Aaron 2026-04-23 pushback on the prior commit: retries are a non-determinism smell; DST (Deterministic Simulation Theory) holds throughout the factory except when explicitly decided-against for real external uncontrollable reasons; investigate before reaching for retry.

Investigation this tick:
- Local git config clean (no trailing slash on remote URL)
- GIT_TRACE=1 + GIT_CURL_VERBOSE=1 on git ls-remote origin shows on-wire URL is `/Lucent-Financial-Group/Zeta.git/git-upload-pack`; the `.git/` in git's error messages is `.git/git-upload-pack` truncated by git's error formatter (correct URL path, not a client-side bug)
- The HTTP 500 returns from GitHub itself, reproduces intermittently across commands
- Conclusion: genuinely external GitHub transient, which IS the DST-exception case; retry legitimate after investigation

Script header updated with the full investigation chain so the paper trail records what was checked and why retry was decided-against-DST for this specific boundary.

Memory filed per-user: feedback_retries_are_non_determinism_smell_DST_holds_investigate_first_2026_04_23.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
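The two investigation steps in the commit message above might be reproduced roughly like this. It is a sketch under assumptions: the function names are hypothetical, the remote defaults to `origin`, and the `grep` filter is only a loose guess at which trace lines are worth reading. The `GIT_TRACE` and `GIT_CURL_VERBOSE` variables and the `git ls-remote origin` command are the ones named in the investigation.

```bash
#!/usr/bin/env bash
# Sketch only. url_has_trailing_slash, check_remote_url and trace_remote are
# hypothetical names, not part of tools/git/push-with-retry.sh.

# Succeed when the URL in $1 ends with a slash.
url_has_trailing_slash() {
  case "$1" in
    */) return 0 ;;
    *)  return 1 ;;
  esac
}

# Step 1: check the configured remote URL for a stray trailing slash.
# $1 = remote name, defaulting to origin.
check_remote_url() {
  local url
  url="$(git config --get "remote.${1:-origin}.url")" || return 1
  if url_has_trailing_slash "$url"; then
    echo "trailing slash in config: $url"
  else
    echo "config clean: $url"
  fi
}

# Step 2: trace the HTTP exchange. Git writes the trace to stderr, so stderr
# goes to the pipe and the ref listing on stdout is discarded. The filter
# keeps lines that mention the service path or an HTTP status line.
trace_remote() {
  GIT_TRACE=1 GIT_CURL_VERBOSE=1 git ls-remote "${1:-origin}" 2>&1 >/dev/null |
    grep -E 'git-upload-pack|HTTP/[0-9.]+ [0-9]{3}'
}
```

Running `check_remote_url` and then `trace_remote` inside the repository should show whether a trailing slash comes from local configuration or only appears in git's formatted error text.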
Seven Copilot findings addressed:
P0 bugs (real):
- `exit_code=$?` after `if git push; then ... fi` block was
capturing the if-compound's 0-return, not git push's exit
code. Switched to `set +e` around the push, capture $? on
the next line, then `set -e`. Now correctly propagates
git push's exit code for non-transient errors.
- tmp_stderr cleanup only on normal paths; SIGTERM/SIGINT
could leak temp files. Moved mktemp outside the loop;
added `trap cleanup EXIT`.
- Env vars used in arithmetic contexts without validation.
Added `int_re` regex check before arithmetic fires; exits
with code 2 on bad env (not set -e crash).
Discipline:
- Personal-name attribution ("Aaron's DST discipline") →
role-refs ("the DST discipline" / "the maintainer's
trailing-slash hypothesis"). Matches BP rule on name-
attribution in code/docs/skills.
- Header exit-code documentation updated: 0 success / 1 all-
retries-exhausted / 2 env-validation-failed / N
propagated-from-git-push.
Xref:
- Per-user memory reference shortened to remove the exact
filename path (which lives at ~/.claude/projects/<slug>/
memory/; not in-repo).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
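The first two P0 fixes in the commit message above can be illustrated with a small, network-free sketch. All function names here are hypothetical, and `fake_push` is a stand-in for `git push` that always fails with status 128; the real script's structure may differ.

```bash
#!/usr/bin/env bash
# Sketch only. fake_push, capture_buggy, capture_fixed and with_tmp_stderr are
# hypothetical names; fake_push stands in for `git push` so nothing touches
# the network.

fake_push() {
  echo "remote: rejected" >&2
  return 128
}

# Buggy shape: when the condition fails and there is no else branch, the
# if-compound itself returns 0, so $? after `fi` is 0, not 128.
capture_buggy() {
  if fake_push 2>/dev/null; then
    return 0
  fi
  echo "$?"
}

# Fixed shape: suspend errexit, run the command, read $? on the next line.
# Assumes the caller runs under `set -e`, which is restored afterwards.
capture_fixed() {
  local rc
  set +e
  fake_push 2>/dev/null
  rc=$?
  set -e
  echo "$rc"
}

# Temp-file handling: create the file once and register the cleanup before
# any work starts. The parentheses run the body in a subshell, so the trap
# fires when that subshell exits. Prints the temp path for inspection.
with_tmp_stderr() (
  tmp_stderr="$(mktemp)"
  trap 'rm -f "$tmp_stderr"' EXIT
  echo "$tmp_stderr"
  "$@" 2>"$tmp_stderr"
)
```

Calling `capture_buggy` prints 0 while `capture_fixed` prints 128, which is the difference between swallowing and propagating the push's real exit code.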
0c616c7 to d44f8a1
…; attribution discipline absorbed

Milestone: 100th autonomous-loop tick this session. Cron 20c92390 held across all 100; no re-arm needed.

PR #164 (Overlay A #5 semiring-parameterized-zeta queue close) MERGED; 16 session PRs.

PR #169 unblocked: 2 real bash bugs (set -e exit-code capture + SIGTERM tmp leak), 5 discipline fixes. Rebased on advanced main; pushed.

Aaron attribution-on-everything + papers-publishing: filed cross-cutting discipline memory. Every named-agent contribution attributed to the agent; default-loop agent attributes explicitly when no persona is worn. Load-bearing for paper authorship going forward.

Copilot-as-pair-reviewer caught TWO real P0 bash bugs: continued value.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d44f8a1420
fi

# Only retry on transient 5xx errors from the remote.
if grep -qE "(500|502|503|504|Internal Server Error|Bad Gateway|Service Unavailable|Gateway Timeout)" "$tmp_stderr"; then
Restrict 5xx matching to actual HTTP status text
The retry classifier currently treats any occurrence of 500|502|503|504 in stderr as a transient GitHub outage, which causes false positives on ordinary non-transient failures (for example, a rejected ref like feature/502-refactor). In that case this loop retries unnecessarily and eventually returns 1 instead of propagating the original git push exit code, so callers lose the real failure signal. Matching should require explicit HTTP/remote 5xx context (or anchored tokens) rather than bare numbers.
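One way to act on this suggestion is to accept a status code only when it follows explicit HTTP context. The sketch below is an assumption-laden illustration: `is_transient_5xx` is a hypothetical name, and the prefixes `HTTP ` and `returned error: ` reflect typical git/curl wording rather than anything confirmed in this PR. The status phrases are kept from the original pattern; because they contain spaces, they cannot collide with ref names.

```bash
#!/usr/bin/env bash
# Sketch only. is_transient_5xx is a hypothetical name, and the message
# prefixes below are assumptions about typical git/curl error wording.

# Succeed when the text in $1 looks like an HTTP 5xx failure.
# A bare number such as the 502 in "feature/502-refactor" does not match,
# because the code must directly follow "HTTP " or "returned error: " and
# must not be followed by another digit.
is_transient_5xx() {
  printf '%s\n' "$1" |
    grep -qE '(HTTP |returned error: )(500|502|503|504)([^0-9]|$)|Internal Server Error|Bad Gateway|Service Unavailable|Gateway Timeout'
}
```

A caller could then retry only when `is_transient_5xx "$(cat "$tmp_stderr")"` succeeds and otherwise propagate the original `git push` exit code unchanged.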
Summary
Thin `git push` retry wrapper for transient GitHub 5xx errors, landing after a root-cause investigation per Aaron's DST discipline (retries are a non-determinism smell; investigate first).
Investigation (2026-04-23)
Aaron pushed back on an earlier version of this PR that reached for retry without investigation. Investigation chain:
- The error URL showed `Zeta.git/` with a trailing slash: possible client-side URL-construction bug.
- `remote.origin.url = https://github.com/Lucent-Financial-Group/Zeta.git`: no trailing slash.
- `GIT_TRACE=1 GIT_CURL_VERBOSE=1 git ls-remote origin` shows the actual URL is `/Lucent-Financial-Group/Zeta.git/git-upload-pack`. The `.git/` in error messages is `.git/git-upload-pack` truncated by git's error formatter; this is the correct URL path per the Git-over-HTTPS spec, not a bug.

Conclusion: this is a genuinely external GitHub transient (the DST-exception case: "real external reasons we can't control"). Retry is legitimate here after investigation, not as a default reach.
The script
- `git push` with up to 3 retries (env `GIT_PUSH_MAX_ATTEMPTS`) on matched 5xx text (500/502/503/504/Internal Server Error/Bad Gateway/Service Unavailable/Gateway Timeout).
- Exponential backoff starting at 2s (env `GIT_PUSH_BACKOFF_S`).
- Thin-wrapper-over-existing-CLI exemption from the bun+TS default per `docs/POST-SETUP-SCRIPT-STACK.md` Q3.
Paper trail
- `feedback_retries_are_non_determinism_smell_DST_holds_investigate_first_2026_04_23.md` captures the DST discipline + the investigation-before-retry order
If the 500-rate escalates
Revisit — the retry wrapper is a mitigation for the current observed rate. If the rate rises or a new root cause surfaces, this wrapper should be replaced by the actual fix.
🤖 Generated with Claude Code