fix: add exponential backoff retry to setup-api-client npm install#253
Merged
Port the retry-with-backoff logic from the Workflows repo to Counter_Risk's local copy of setup-api-client. The guard check on PR #249 failed with a transient npm registry 403 on safe-buffer because the old code only had a single --legacy-peer-deps fallback with no backoff.

Changes:
- 3 retry attempts with exponential backoff (5s, 10s)
- --legacy-peer-deps fallback on first failure
- Log stderr from all failed attempts for diagnosability
- Pin lru-cache@10.4.3 (was ^10.0.0) for consistency with Workflows

https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
Contributor
Pull request overview
Ports the Workflows repo’s retry-with-backoff npm install logic into Counter_Risk’s local .github/actions/setup-api-client composite action to reduce transient npm registry failures during dependency setup.
Changes:
- Add up to 3 install attempts with exponential backoff and a first-failure `--legacy-peer-deps` fallback.
- Capture and emit stderr from failed install attempts for easier diagnosis.
- Pin `lru-cache` to `10.4.3` for consistent hoisting.
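The retry behavior summarized above can be sketched in Python. This is an illustration of the control flow only, not the composite action's actual script: the function names `backoff_delays` and `install_with_retry`, the injectable `run` and `sleep` parameters, and the reading of "first-failure fallback" as "add `--legacy-peer-deps` on every attempt after the first" are all assumptions.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def backoff_delays(attempts: int = 3, base_seconds: float = 5.0) -> List[float]:
    """Delays slept between attempts; 5s then 10s for three attempts."""
    return [base_seconds * (2 ** i) for i in range(attempts - 1)]


def install_with_retry(
    base_cmd: Sequence[str],
    run: Callable[[List[str]], "subprocess.CompletedProcess[str]"],
    sleep: Callable[[float], None] = time.sleep,
    attempts: int = 3,
    base_seconds: float = 5.0,
) -> bool:
    """Run base_cmd up to `attempts` times, doubling the wait after each failure."""
    delays = backoff_delays(attempts, base_seconds)
    for attempt in range(1, attempts + 1):
        cmd = list(base_cmd)
        if attempt > 1:
            # Assumption: once the first attempt has failed, retry with the fallback flag.
            cmd.append("--legacy-peer-deps")
        result = run(cmd)
        if result.returncode == 0:
            return True
        # Keep stderr from every failed attempt so the failure can be diagnosed.
        print(f"attempt {attempt}/{attempts} failed:\n{result.stderr}")
        if attempt < attempts:
            sleep(delays[attempt - 1])
    return False


def run_npm(cmd: List[str]) -> "subprocess.CompletedProcess[str]":
    """Real runner: execute the command and capture its output as text."""
    return subprocess.run(cmd, capture_output=True, text=True)
```

Calling `install_with_retry(["npm", "install"], run_npm)` would exercise the real command; passing a fake `run` and `sleep` keeps the logic testable without touching the registry.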
Contributor
🤖 Keepalive Loop Status
PR #253 | Agent: Codex | Iteration 0/5
Current State
🔍 Failure Classification
| Error type | infrastructure |
GitHub Actions ::warning:: commands truncate/mangle multi-line content. Emit a short annotation message and print full npm stderr in a collapsible ::group:: instead, so logs stay readable. https://claude.ai/code/session_01JhCWWDJG8PqwaSbVPCGfm6
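A minimal sketch of that logging pattern, written in Python for illustration (the action itself may well do this in shell). It relies only on the `::warning::`, `::group::`, and `::endgroup::` workflow commands; the function name `format_failure_output` and the message wording are hypothetical.

```python
from typing import List


def format_failure_output(attempt: int, stderr: str) -> List[str]:
    """Build log lines: a one-line annotation, then full stderr in a collapsible group."""
    lines = [
        # Keep the annotation on a single line so it is not truncated or mangled.
        f"::warning::npm install attempt {attempt} failed; full stderr is in the log group below"
    ]
    lines.append(f"::group::npm stderr (attempt {attempt})")
    lines.extend(stderr.splitlines())
    lines.append("::endgroup::")
    return lines


if __name__ == "__main__":
    # Hypothetical stderr text, used only to show the shape of the output.
    for line in format_failure_output(1, "npm ERR! code E403\nnpm ERR! 403 Forbidden"):
        print(line)
```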
Automated Status Summary
Scope
PR #228 (issue #227) was merged with all 42 task checkboxes marked complete and the in-process verifier returning PASS. However, post-merge verification by both OpenAI (gpt-5.2, 83% confidence) and Anthropic (claude-sonnet-4-5, 95% confidence) returned FAIL, identifying concrete gaps in the implementation. This follow-up addresses the remaining unmet acceptance criteria.
Root Cause Analysis
The keepalive loop logs reveal a cascading failure across multiple systems:
- Codex agent checked PR body checkboxes aggressively. The `.agents/issue-227-ledger.yml` file shows only `task-01` (of 42) was marked `done` in the structured ledger. Yet the PR body went from 42 unchecked → 0 unchecked over 14 iterations. The agent edited the PR body checkboxes directly without updating the ledger, bypassing the structured tracking that was designed to prevent exactly this.
- `autoReconcileTasks` amplified the problem. Each keepalive iteration ran an LLM-based auto-reconciliation step that auto-checked ~1 additional task per run based on loose commit-to-task matching. Over 14 iterations this added up. The reconciler used "high-confidence" matching that was insufficiently discriminating; for example, a commit touching `mapping_diff.py` matched multiple task descriptions simultaneously.
- `cascadeParentCheckboxes` may have inflated counts. The keepalive loop's cascade logic automatically checks all indented child checkboxes when a parent is checked. If the agent or reconciler checked a section-level parent, all sub-tasks under it would cascade to checked.
- Codex-as-verifier was self-grading. Iteration 14 ran a `verify-acceptance` prompt using the same Codex agent that did the implementation work. Despite the prompt explicitly saying "Do NOT trust checkbox states as evidence of completion," the verifier returned `success`. This is a known LLM bias: the agent that produced the work is predisposed to confirm its own output.
- The keepalive loop trusted checkbox state as the primary completion signal. When all 42 boxes were checked AND the verifier returned success AND the CI gate was green, the loop issued `stop (tasks-complete)`. There was no independent mechanical verification of the acceptance criteria.
- Scope violation was not mechanically enforced. The acceptance criterion "PR diff contains only files matching these patterns" was a text checkbox, not an automated check. The agent could (and did) check it off despite the diff containing 15+ out-of-scope files.
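The mismatch between checked boxes and the ledger described above is the kind of signal a mechanical check could surface. The sketch below is an assumption-heavy illustration: the real ledger schema is not shown in this summary, so the ledger is modeled as a plain mapping of task id to status, and `count_checkboxes` and `checkbox_ledger_drift` are hypothetical names.

```python
import re
from typing import Dict, Tuple

# Matches markdown task-list items such as "- [x] done" or "  - [ ] todo".
CHECKBOX = re.compile(r"^[ \t]*[-*][ \t]+\[([ xX])\]", re.MULTILINE)


def count_checkboxes(pr_body: str) -> Tuple[int, int]:
    """Return (checked, total) task-list checkboxes found in a PR body."""
    marks = CHECKBOX.findall(pr_body)
    checked = sum(1 for mark in marks if mark in ("x", "X"))
    return checked, len(marks)


def checkbox_ledger_drift(pr_body: str, ledger_status: Dict[str, str]) -> int:
    """Checked boxes minus ledger tasks marked 'done'; positive means over-checking."""
    checked, _ = count_checkboxes(pr_body)
    done = sum(1 for status in ledger_status.values() if status == "done")
    return checked - done
```

With 42 boxes checked and a single ledger task marked `done`, this would report a drift of 41, which a loop could treat as a reason not to stop.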
Context for Agent
Related Issues/PRs
Tasks
Registry-First Resolution for Clearing House
- `resolve_clearing_house()` function in `src/counter_risk/normalize.py` that returns `NameResolution` with `source` field using registry-first lookup before `_CLEARING_HOUSE_FALLBACK_MAPPINGS`
- `normalize_clearing_house()` in `src/counter_risk/normalize.py` to delegate internally to `resolve_clearing_house()`
- `tests/test_normalization_registry_first.py` verifying `resolve_clearing_house()` returns `source='registry'` when name exists in registry
- `tests/test_normalization_registry_first.py` verifying `resolve_clearing_house()` returns `source='fallback'` when name is not in registry
- `tests/test_normalization_registry_first.py` verifying `resolve_clearing_house()` consults registry before checking fallback mappings
- `tests/test_normalization_registry_first.py` verifying `resolve_clearing_house()` handles missing or empty registry files without raising exceptions

Public API Source Attribution
- `normalize_counterparty_with_source()` function in `src/counter_risk/normalize.py` that wraps `resolve_counterparty()` and returns `NameResolution`
- `normalize_counterparty_with_source()` documenting the `source` field in returned `NameResolution` object
- `pipeline/run.py` reconciliation logic to call `normalize_counterparty_with_source()` instead of `resolve_counterparty()` directly
- `tests/test_normalization_registry_first.py` verifying `normalize_counterparty_with_source()` returns object with accessible `.source` attribute

Missing Config File
- `config/name_registry.yml` with minimal valid registry containing at least the entries used by existing test fixtures
- `tests/test_mapping_diff_report_cli.py` that imports `mapping_diff_report` CLI and runs it against `config/name_registry.yml` without raising exceptions

Testing Gaps - CLI Parameter Passing
- `tests/test_mapping_diff_report_cli.py` that mocks `generate_mapping_diff_report` and verifies CLI forwards correct `registry_path` parameter
- `tests/test_mapping_diff_report_cli.py` that mocks `generate_mapping_diff_report` and verifies CLI forwards correct `output_format` parameter

Testing Gaps - End-to-End Report Sections
- `tests/fixtures/unmapped_names.csv` with known unmapped counterparty names
- `tests/fixtures/fallback_mapped_names.csv` with known fallback-mapped counterparty names
- `tests/test_mapping_diff_report_cli.py` that runs `mapping_diff_report` CLI with fixtures and captures stdout
- `UNMAPPED` section header appears in captured output
- `FALLBACK_MAPPED` section header appears in captured output
- `SUGGESTIONS` section header appears in captured output

Testing Gaps - Pipeline Integration
- `tests/fixtures/name_registry_before.yml` representing initial registry state for pipeline testing
- `tests/fixtures/name_registry_after.yml` representing updated registry state with additional mappings
- `tests/test_normalization_registry_first.py` that loads pipeline with `name_registry_before.yml`
- `NameResolution.source` values
- `NameResolution.source` values when run with `name_registry_after.yml`

Acceptance criteria
Registry-First Resolution
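For orientation, the registry-first behavior this section asks for might look like the sketch below. It reuses the names from the task list as stand-ins but is not the repository's code: the `NameResolution` fields, the lower-cased lookup key, the explicit `registry` and `fallback` arguments, and the `"unmapped"` source value are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class NameResolution:
    """Illustrative stand-in; the real class may carry different fields."""
    raw_name: str
    canonical_name: str
    source: str  # "registry" or "fallback" ("unmapped" is an assumption of this sketch)


def resolve_clearing_house(
    name: str,
    registry: Optional[Mapping[str, str]],
    fallback: Mapping[str, str],
) -> NameResolution:
    """Consult the registry first, then the fallback mappings."""
    key = name.strip().lower()
    # A missing or empty registry is skipped rather than raising.
    if registry and key in registry:
        return NameResolution(name, registry[key], "registry")
    if key in fallback:
        return NameResolution(name, fallback[key], "fallback")
    return NameResolution(name, name.strip(), "unmapped")


def normalize_clearing_house(
    name: str,
    registry: Optional[Mapping[str, str]],
    fallback: Mapping[str, str],
) -> str:
    """String-returning wrapper that delegates to resolve_clearing_house()."""
    return resolve_clearing_house(name, registry, fallback).canonical_name
```

Keeping the string-returning function as a thin wrapper means existing callers are unaffected while new callers can inspect `.source`.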
- `normalize_clearing_house()` consults the name registry before `_CLEARING_HOUSE_FALLBACK_MAPPINGS` and the resolution path records `source` as `registry` or `fallback`
- `resolve_clearing_house()` exists in `src/counter_risk/normalize.py` and returns a `NameResolution` object with `source` field set to `registry` or `fallback`
- `normalize_counterparty_with_source().source` succeeds without `AttributeError`

Config File
- `config/name_registry.yml` exists in repository root
- `config/name_registry.yml` parses without YAML syntax errors
- `load_name_registry('config/name_registry.yml')` executes without raising exceptions

Testing - CLI
- `mapping_diff_report --registry config/name_registry.yml` completes with exit code 0
- `tests/test_mapping_diff_report_cli.py` mocks `generate_mapping_diff_report` and asserts it receives expected parameters
- `tests/test_mapping_diff_report_cli.py` pass

Testing - Report Sections
- `mapping_diff_report` CLI with fixtures and captures output to string
- `UNMAPPED` substring
- `FALLBACK_MAPPED` substring
- `SUGGESTIONS` substring

Testing - Pipeline Integration
- `tests/test_normalization_registry_first.py` runs pipeline reconciliation with two different registry files
- `NameResolution.source` value differs between the two pipeline runs
- `tests/test_normalization_registry_first.py` pass

Scope Constraint
- `src/counter_risk/normalize.py`, `src/counter_risk/cli/*`, `src/counter_risk/reports/*`, `config/name_registry.yml`, `tests/test_*registry*.py`, `tests/test_*mapping_diff*.py`, `tests/fixtures/*.yml`, `tests/fixtures/*.csv`, `pyproject.toml`, `README.md`, `docs/*.md`

Overall
Head SHA: 08a6c4e
Latest Runs: ❔ in progress — Gate
Required: gate: ❔ in progress
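The Scope Constraint patterns listed in the acceptance criteria could be checked mechanically instead of via a text checkbox. The sketch below is one possible approach using Python's `fnmatch`; the function name `out_of_scope` is hypothetical, and the list of changed files is assumed to come from something like `git diff --name-only`.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

# Patterns copied from the Scope Constraint criterion.
ALLOWED_PATTERNS = [
    "src/counter_risk/normalize.py",
    "src/counter_risk/cli/*",
    "src/counter_risk/reports/*",
    "config/name_registry.yml",
    "tests/test_*registry*.py",
    "tests/test_*mapping_diff*.py",
    "tests/fixtures/*.yml",
    "tests/fixtures/*.csv",
    "pyproject.toml",
    "README.md",
    "docs/*.md",
]


def out_of_scope(
    changed_files: Iterable[str],
    patterns: Sequence[str] = ALLOWED_PATTERNS,
) -> List[str]:
    """Return the changed paths that match none of the allowed patterns."""
    # Note: fnmatch's "*" also matches "/", so "docs/*.md" accepts nested paths too.
    return [
        path
        for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in patterns)
    ]
```

A gate step could fail the run whenever `out_of_scope(...)` returns a non-empty list, so the criterion no longer depends on an agent ticking a box.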