fix(contributors): handle GitHub secondary rate limit #18073
Merged
The fetch-github-contributors task was hitting GitHub's secondary rate limit (returned as 403 with Retry-After) on parallel commits API requests, but the existing 403 handler only recognized the primary rate limit (X-RateLimit-Remaining=0 + X-RateLimit-Reset). Secondary-limit 403s fell through to a console.warn and an empty-array return, which was then written to blob storage as a missing entry; consumers fell back to new Date(0) and shipped "Page last update: January 1, 1970" plus an empty contributors modal in production.

Drops BATCH_SIZE from 20 to 2 (GitHub's guidance is serial-per-token; 2 keeps us comfortably under both the 100-concurrent ceiling and the 900-points/min budget) and bumps BATCH_DELAY_MS from 50 to 200.

Adds readRateLimitWait that detects primary limits, secondary limits via Retry-After, and the body-message fallback when the header is omitted.

Adds a bounded retry loop (3 attempts, 5-minute cap per wait) and replaces the silent empty-array return on unexpected non-OK statuses with a thrown error so a real failure surfaces as a failed task run rather than corrupting the blob.

Refs #18070.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: wackerow <54227730+wackerow@users.noreply.github.com>
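The wait detection and bounded retry described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `rateLimitWaitMs`, `fetchWithRateLimitRetry`, `HeaderReader`, and `MinimalResponse` are hypothetical names, the cap is assumed to apply to every wait, "3 attempts" is read as three requests in total, and the header lookup is assumed to be case-insensitive (as the standard `Headers` object is).

```typescript
// Hypothetical sketch. Names, signatures, and the order of checks are
// assumptions based on the commit message, not the PR's actual code.
interface HeaderReader {
  get(name: string): string | null
}

interface MinimalResponse {
  ok: boolean
  status: number
  headers: HeaderReader
  text(): Promise<string>
  json(): Promise<unknown>
}

const MAX_ATTEMPTS = 3 // assumed: three requests in total
const MAX_WAIT_MS = 5 * 60 * 1000 // 5-minute cap per wait
const DEFAULT_SECONDARY_WAIT_MS = 60 * 1000 // used when Retry-After is omitted

// Returns the number of milliseconds to wait, or null if the response
// does not look like a rate limit.
function rateLimitWaitMs(
  status: number,
  headers: HeaderReader,
  bodyText: string,
  nowMs: number = Date.now()
): number | null {
  if (status !== 403 && status !== 429) return null

  // Primary limit: quota exhausted, reset time given in epoch seconds.
  const remaining = headers.get("x-ratelimit-remaining")
  const reset = headers.get("x-ratelimit-reset")
  if (remaining === "0" && reset !== null && /^\d+$/.test(reset.trim())) {
    const waitMs = Number(reset) * 1000 - nowMs
    return Math.min(Math.max(waitMs, 0), MAX_WAIT_MS)
  }

  // Secondary limit: Retry-After as delta-seconds or as an HTTP-date.
  const retryAfter = headers.get("retry-after")
  if (retryAfter !== null) {
    const trimmed = retryAfter.trim()
    if (/^\d+$/.test(trimmed)) {
      return Math.min(Number(trimmed) * 1000, MAX_WAIT_MS)
    }
    const dateMs = Date.parse(trimmed)
    if (!Number.isNaN(dateMs)) {
      return Math.min(Math.max(dateMs - nowMs, 0), MAX_WAIT_MS)
    }
  }

  // Secondary limit without a usable Retry-After: match the body message.
  if (/secondary rate limit|abuse/i.test(bodyText)) {
    return DEFAULT_SECONDARY_WAIT_MS
  }
  return null
}

// Bounded retry: waits and retries on rate limits, returns [] on 404,
// and throws on anything else so the failure is visible to the caller.
async function fetchWithRateLimitRetry(
  doFetch: () => Promise<MinimalResponse>,
  sleep: (ms: number) => Promise<void>,
  attempt = 1
): Promise<unknown[]> {
  const response = await doFetch()
  if (response.ok) return (await response.json()) as unknown[]
  if (response.status === 404) return [] // expected for legacy paths

  const body = await response.text()
  const waitMs = rateLimitWaitMs(response.status, response.headers, body)
  if (waitMs !== null && attempt < MAX_ATTEMPTS) {
    await sleep(waitMs)
    return fetchWithRateLimitRetry(doFetch, sleep, attempt + 1)
  }
  throw new Error(
    `GitHub request failed with status ${response.status} after ${attempt} attempt(s)`
  )
}
```

Passing `doFetch` and `sleep` in as parameters keeps the sketch independent of any particular HTTP client and makes the retry path easy to exercise without real delays.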

Summary
The `fetch-github-contributors` Trigger.dev task was silently failing on a subset of paths, leaving `appPages[pagePath]` missing or empty in Netlify Blobs. Downstream, `getAppPageLastCommitDate([])` reduces to `new Date(0)`, which is why production pages like https://ethereum.org/what-is-ethereum/ have been showing "Page last update: January 1, 1970" plus an empty "See contributors" modal (issue #18070).

This PR fixes the data side. A separate, smaller PR can add UI defenses around empty contributor arrays.
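To illustrate why an empty contributor list surfaces as January 1, 1970, here is a minimal sketch of a "latest date" reduction seeded with the Unix epoch. The `Contributor` shape and both function names are hypothetical; the real `getAppPageLastCommitDate` may be written differently. The second function shows one possible form of the UI-side defense mentioned above, offered only as an assumption about what such a follow-up could look like.

```typescript
// Hypothetical shape: only the commit date matters for this sketch.
interface Contributor {
  login: string
  date: string // ISO 8601 commit date
}

// Reduces to the most recent commit date. With an empty array the seed
// value is returned unchanged, which is the Unix epoch.
function latestCommitDate(contributors: Contributor[]): Date {
  return contributors.reduce((latest, c) => {
    const d = new Date(c.date)
    return d.getTime() > latest.getTime() ? d : latest
  }, new Date(0))
}

// One possible defense: report "unknown" instead of the epoch so the UI
// can hide the "Page last update" line when no data is available.
function latestCommitDateOrNull(contributors: Contributor[]): Date | null {
  return contributors.length === 0 ? null : latestCommitDate(contributors)
}
```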
Why -- research notes
The task is authenticated and was not hitting 429s, so the failure mode initially looked unlike rate limiting. It is. GitHub's REST API has two distinct rate limits:

- Primary: `429` (or sometimes `403`) with `X-RateLimit-Remaining: 0` and `X-RateLimit-Reset`. The old code handled this case.
- Secondary: `403` with a `Retry-After` header and no `X-RateLimit-*` headers. Triggers include too many concurrent requests and too many requests per minute to a single endpoint.

A captured 403 response from the failing task confirms this exact shape: status 403, `retry-after: 60`, no `X-RateLimit-Remaining`. The old 403 handler keyed on `X-RateLimit-Remaining === "0"`, so secondary-limit 403s fell through to a `console.warn` + `return []`. The empty array was then written to blob storage, and the fetcher's own `if (contributors.length > 0)` filter made "fetch failed" indistinguishable from "no contributors."

The likely trigger was concurrency: `BATCH_SIZE = 20` parallel requests against `/repos/.../commits`, multiplied by ~6 historical paths per app page from `getAllHistoricalPaths` (most of which 404 for the legacy `src/pages/...` and `_`-prefixed variants), put us over GitHub's secondary thresholds well before primary quota was exhausted.

Changes
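As a rough illustration of the concurrency bound discussed here, the sketch below runs work in small batches, one batch at a time, with a pause between batches. `chunk` and `runInSerialBatches` are hypothetical stand-ins (the PR refers to an existing `parallelBatch` helper whose implementation is not shown), and the constants mirror the new `BATCH_SIZE` and `BATCH_DELAY_MS` values.

```typescript
// Values taken from this PR; everything else here is an assumption.
const BATCH_SIZE = 2
const BATCH_DELAY_MS = 200

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Splits items into consecutive groups of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1")
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}

// Runs one batch at a time. Requests inside a batch run concurrently, so
// the number of in-flight requests never exceeds `batchSize`.
async function runInSerialBatches<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  batchSize: number = BATCH_SIZE,
  delayMs: number = BATCH_DELAY_MS
): Promise<R[]> {
  const results: R[] = []
  const batches = chunk(items, batchSize)
  for (let i = 0; i < batches.length; i++) {
    const batchResults = await Promise.all(batches[i].map(worker))
    results.push(...batchResults)
    // Pause between batches, but not after the last one.
    if (i < batches.length - 1) await sleep(delayMs)
  }
  return results
}
```

With these values, the ~6 historical paths for one app page become 3 batches of 2, so concurrency stays at 2 regardless of how many paths are queued.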
- Scoped to `src/data-layer/fetchers/fetchGitHubContributors.ts` only. `fetchRetry.ts` and other fetchers are untouched, so the blast radius is limited to this task.
- `BATCH_SIZE` 20 -> 2, `BATCH_DELAY_MS` 50 -> 200. With `parallelBatch` running batches serially, the task makes at most 2 concurrent requests across its full run, well under GitHub's 100-concurrent ceiling and 900 points/min budget. A weekly task does not need 20-wide parallelism.
- `readRateLimitWait(response)` returns ms-to-wait or `null`. Recognizes:
  - Primary limit: `X-RateLimit-Remaining: 0` + `X-RateLimit-Reset`.
  - Secondary limit: `Retry-After` header (delta-seconds or HTTP-date), capped at 5 minutes.
  - Body-message fallback (`/secondary rate limit|abuse/i`) when the header is omitted, defaulting to 60 s per GitHub's guidance.
- `fetchCommitsForPath` takes an `attempt` counter, retries rate-limited responses up to 3 times, then throws. This avoids runaway loops on persistent limits.
- Unexpected non-OK statuses now throw instead of `console.warn` + `return []`, which silently corrupted blob storage. 404 still returns `[]` (expected for legacy paths from `getAllHistoricalPaths`).
- `transformCommits` now works with a typed `GitHubCommit` interface, and the repeated `Authorization` + `Accept` headers are pulled into a `githubApiHeaders()` helper used by both `fetchCommitsForPath` and `discoverPathsFromTree`.

What this does NOT change
- `fetchRetry.ts`: intentionally untouched. All 403 handling is inline in this fetcher.
- `getAllHistoricalPaths` fan-out: the speculative legacy paths (`src/pages/...`, `_`-prefixed variants) remain. They exist to capture pre-App-Router commit history for files that no longer exist at HEAD, so filtering them against the current git tree would lose contributor data. At `BATCH_SIZE=2` the 404 cost is negligible.
- `fetchNameLookup`'s graceful empty-Map fallback: not on the rate-limit hot path.

Test plan
- Run the `fetch-github-contributors` task manually in Trigger.dev preview/staging.
- Confirm the resulting `appPages` map is complete (no missing keys for current App Router pages).
- Confirm there are no `Rate limited on ...` retries during a normal run, or if retries occur they recover within bounds.

Related issues

Refs #18070.
Generated by Claude Opus 4.7