
Github Repository Search resulting in Secondary Rate Limit #1842

Closed
vikas-maersk opened this issue May 2, 2024 · 7 comments · Fixed by #1895


vikas-maersk commented May 2, 2024

Describe the bug
Getting the error below when searching for repositories with a keyword that matches a large number of repositories. My use case would be satisfied by just the top 50 results, say, but I see no way to limit the results on the server side; I can only truncate them after fetching.

org.kohsuke.github.HttpException: {
  "documentation_url": "https://docs.github.com/free-pro-team@latest/rest/overview/rate-limits-for-the-rest-api#about-secondary-rate-limits",
  "message": "You have exceeded a secondary rate limit. Please wait a few minutes before you try again. If you reach out to GitHub Support for help, please include the request ID ...."
}

I was searching for repositories within my org, and keywords that matched even around 850 repositories were hitting this rate limit. The GitHub documentation quotes a much larger number.

I even tried in:name and in:displayName within the query, but even that did not seem to help.

To Reproduce
Steps to reproduce the behavior:

new GitHubBuilder().withAppInstallationToken(token).build()
                .searchRepositories()
                .org(orgName)
                .q(repoName)
                .list();

Just pass a repo name like "test", or anything that matches a lot of repositories, into the code above.

Expected Behaviour
It should return the repositories, or else provide a clear reason for the error. Alternatively, there should be a way to limit the search results on the GitHub side itself.

Additional Info
Version: org.kohsuke:github-api:1.316

@vikas-maersk
Author

I was able to achieve this using:

new GitHubBuilder().withAppInstallationToken(token).build()
                .searchRepositories()
                .org(orgName)
                .q(query)
                .list()
                .withPageSize(appConfig.getGithubRepositorySearchLimit());

And then, instead of calling toList on the PagedSearchIterable, I iterated the result using nextPage, like this:

repositoriesByNameQuery.iterator().nextPage();
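The same "stop after N results" idea can be expressed as a small generic helper. This is a hypothetical sketch, not part of the github-api library; it works with any Iterator, including the one returned by the PagedSearchIterable above, and stops consuming before further (rate-limited) pages would be requested:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class IteratorUtil {
    /**
     * Collects at most {@code limit} elements from the iterator and stops,
     * so no further pages are fetched from the underlying paged source.
     */
    public static <T> List<T> takeFirst(Iterator<T> iterator, int limit) {
        List<T> results = new ArrayList<>();
        while (iterator.hasNext() && results.size() < limit) {
            results.add(iterator.next());
        }
        return results;
    }
}
```

Applied to `repositoriesByNameQuery.iterator()`, this caps the walk at `limit` repositories instead of draining every page of the search result.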

@bitwiseman
Member

Reopening as this is an interesting case for Secondary rate limits.

@bitwiseman bitwiseman reopened this May 18, 2024
@bitwiseman bitwiseman added the bug label Jun 6, 2024
@holly-cummins
Contributor

See also #1805

@bitwiseman
Member

This docs page describes secondary rate limit behavior:
https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#about-secondary-rate-limits

As of this writing, it says:

You may encounter a secondary rate limit if you:

  • Make too many concurrent requests. No more than 100 concurrent requests are allowed. This limit is shared across the REST API and GraphQL API.
  • Make too many requests to a single endpoint per minute. No more than 900 points per minute are allowed for REST API endpoints, and no more than 2,000 points per minute are allowed for the GraphQL API endpoint. For more information about points, see "Calculating points for the secondary rate limit."
  • Make too many requests per minute. No more than 90 seconds of CPU time per 60 seconds of real time is allowed. No more than 60 seconds of this CPU time may be for the GraphQL API. You can roughly estimate the CPU time by measuring the total response time for your API requests.
  • Create too much content on GitHub in a short amount of time. In general, no more than 80 content-generating requests per minute and no more than 500 content-generating requests per hour are allowed. Some endpoints have lower content creation limits. Content creation limits include actions taken on the GitHub web interface as well as via the REST API and GraphQL API.

These secondary rate limits are subject to change without notice. You may also encounter a secondary rate limit for undisclosed reasons.

These are very loosely defined guidelines, and you cannot query them ahead of time. 👎 It looks like we need to take the path some users have suggested and make rate limiting much more resilient, potentially allowing users to write their own rate limit strategies for handling secondary rate limits.
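A minimal sketch of what such a pluggable strategy could look like. All names here are hypothetical illustrations, not the library's actual API: the strategy decides how long to back off after the n-th secondary-rate-limit hit, and an exponential-backoff implementation is one obvious default.

```java
import java.time.Duration;

/** Hypothetical strategy interface: decides how long to wait after the n-th hit. */
interface SecondaryRateLimitStrategy {
    Duration waitFor(int attempt);
}

/** Exponential backoff: base * 2^attempt, capped at a maximum wait. */
class ExponentialBackoffStrategy implements SecondaryRateLimitStrategy {
    private final Duration base;
    private final Duration max;

    ExponentialBackoffStrategy(Duration base, Duration max) {
        this.base = base;
        this.max = max;
    }

    @Override
    public Duration waitFor(int attempt) {
        // Cap the shift so the doubling cannot overflow a long.
        long millis = base.toMillis() << Math.min(attempt, 20);
        return millis > max.toMillis() ? max : Duration.ofMillis(millis);
    }
}
```

A library hook could then consult the strategy each time a secondary-limit response is detected, leaving the "how slow is slow enough" judgment to the user.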

@holly-cummins
Contributor

From my perspective, I'd be happy with a default strategy which just detects the error and sleeps. In the worst case, detection could be based on pattern matching against the "message": "You have exceeded a secondary rate limit. Please wait a few minutes before you try again." text in the response. For the sleep, a minute's sleep, perhaps with an exponential backoff, seems reasonable.

That's more or less what I'll have to implement in the client, but it would be convenient if it were implemented centrally. A customisable strategy might be a nice-to-have on top of that, but I think "go slow, but make the request succeed" is probably what most clients would want. Or maybe that's an assumption I'm making?
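The "detect and sleep" default described above could be sketched roughly like this. This is a hypothetical helper, not the library's implementation, and it assumes the secondary-limit error surfaces as an exception whose message contains the phrase from the response body:

```java
import java.util.concurrent.Callable;

public class SecondaryLimitRetry {
    /**
     * Runs the task, sleeping with exponential backoff (starting at baseMillis)
     * whenever the failure message looks like a secondary rate limit.
     * Rethrows immediately on any other failure, or once maxAttempts is reached.
     */
    public static <T> T callWithRetry(Callable<T> task, int maxAttempts, long baseMillis)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                String msg = e.getMessage();
                boolean secondary = msg != null && msg.contains("secondary rate limit");
                if (!secondary || attempt + 1 >= maxAttempts) {
                    throw e; // not a secondary limit, or out of retries
                }
                Thread.sleep(baseMillis << attempt); // crude exponential backoff
            }
        }
    }
}
```

Pattern matching on the message text is brittle if GitHub rewords the response, which is one argument for having the detection live centrally in the library rather than in every client.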

@bitwiseman
Member

bitwiseman commented Jun 21, 2024

@holly-cummins
The waiting should work as of the latest release v1.322 via the fix to AbuseLimit. Please give it a try.

Unfortunately, GitHub has previously made it clear that it is the responsibility of clients to be well-behaved. Triggering the secondary rate limit too often may eventually get accounts or apps flagged for "abuse", resulting in even longer waits.

The problem is that "good behavior" is poorly defined and subject to change without notice. "Go slow," sure, but how slow? They have a metric, but they expect the client to track it rather than reporting it the way they do with the primary rate limit.

@holly-cummins
Contributor

holly-cummins commented Jun 21, 2024

@holly-cummins The waiting should work as of the latest release v1.322 via the fix to AbuseLimit. Please give it a try.

I was really excited to try 1.322, but am still seeing fairly regular failures because of the rate limit even with 1.322.

I have some Node.js code that uses the GitHub API, and I did manage to work around the secondary errors with promise-retry and an exponential backoff. It doesn't translate directly, because it's JavaScript, but it gives me confidence in the principle. I agree that exponential backoff is pretty crude compared to the clear "retry after this time" guidance you get when the main rate limiter is triggered, and it does carry the risk of waiting too long, or of not waiting long enough and having to wait even longer.
