
Too many requests to Bitbucket, scans are really slow (JENKINS-55071) #187


Closed
3 of 4 tasks
haidaraM opened this issue Apr 17, 2019 · 24 comments

Comments

@haidaraM

haidaraM commented Apr 17, 2019

Your checklist for this issue

  • Jenkins version

  • Plugin version

  • Bitbucket cloud

  • Bitbucket server and version

Description

Hello, first thanks for this plugin :-)

We have a Jenkins instance that periodically (every minute) scans five repositories on Bitbucket Cloud, using a technical account that has read access to these repositories. All the repositories are part of a single team, and the jobs are multibranch pipelines.

We noticed that after a few minutes the scans become very slow, as described in
https://issues.jenkins-ci.org/browse/JENKINS-55071. In our case, it takes 2 minutes to scan a repository with 2 branches and a size of 249.7 KB.

After checking the logs (see the attached files), it looks like Bitbucket is throttling the requests, which makes me think that Jenkins or the plugin is making far too many requests, since the Bitbucket API rate limit is quite high: 60,000 requests per hour (https://confluence.atlassian.com/bitbucket/rate-limits-668173227.html). Is it possible to log the requests made to Bitbucket? I only get logs about the API rate limit.

Also, the cache does work for the team repositories list when creating a new multibranch pipeline.

Jenkins: 2.172
Plugin version: 2.4.4

[Attached screenshots: screencapture-jenkins-emp-istefr-fr-log-BitbucketPlugin-2019-04-17-10_55_09, log_config]

@jetersen
Member

Have you tried to enable the cache?

@jetersen
Member

Periodic scans ought to be every 8 hours or more. You ought to rely on events.

@haidaraM
Author

The cache only seems to work for the team repositories list: the dropdown list when creating a new multibranch pipeline gets quickly populated. The scans are still slow.

The statistics show that there is one repository in the cache.
[Screenshot: cache statistics]

@jetersen
Member

jetersen commented Apr 17, 2019

Right, we don't have a cache for branch indexing 😆

You should enable/set up "Manage hooks" and change the periodic scans to every 8 hours or more.

@haidaraM
Author

Exactly.
The problem is that our Jenkins is not accessible from the internet. To trigger the builds, we need Jenkins to scan Bitbucket and run our pipelines if changes have occurred.

@jetersen
Member

Surely you can do some webhook magic: some web service with a leg on each side of the firewall 😆 Hence why #165 was added 🤣

@haidaraM
Author

If we can avoid setting a custom URL for the Bitbucket hook by limiting the requests, that would be nice too 😄

@jetersen
Member

#165 was added specifically to allow "Manage hooks" to work with setups that don't have internet access.

@haidaraM
Author

Yeah, I just saw #165. I have changed the scan frequency to 2 minutes. No more throttling for the moment, but the proper way would be to use hooks as you said. 2 minutes should be enough for our use case ☺️

However, do you have an idea why so many requests are sent to Bitbucket?

@jetersen
Member

jetersen commented Apr 18, 2019

It is simple math :)
The number of Bitbucket projects multiplied by the number of branches multiplied by your scan frequency = API request rate.

A higher scan frequency will often trigger rate limiting because of that.
Best to rely on events.

You could probably ask if you could open a tiny hole in the firewall to Bitbucket Cloud to make your life easier with Jenkins.
https://confluence.atlassian.com/bitbucket/what-are-the-bitbucket-cloud-ip-addresses-i-should-use-to-configure-my-corporate-firewall-343343385.html
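
To put rough numbers on that for the setup described above (the per-scan call counts here are an assumption for illustration, not measured values): 5 repositories scanned every minute is 5 × 60 = 300 scans per hour, and if each scan issues around 4 API calls per branch (branch list, pull requests, commit metadata, build status), 2 branches per repository already gives roughly 300 × 2 × 4 ≈ 2,400 requests per hour, well above the 1,000 requests/hour limit on repository endpoints mentioned later in this thread.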

@stale

stale bot commented Jun 17, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 17, 2019
@vikoalucard

Hi, we should keep this one active.

@stale stale bot removed the stale label Jun 17, 2019
@stale

stale bot commented Aug 16, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 16, 2019
@stale stale bot closed this as completed Aug 23, 2019
@MikeKroell

I am also experiencing this issue. Is it possible to reopen?

@mjeffrey

mjeffrey commented Dec 8, 2020

We have the same issue.

It came up when we switched from the Git plugin to the Bitbucket plugin. Since these plugins should be doing something similar, it suggests to me that the Bitbucket plugin is making many more calls than the Git plugin (or is calling with a different mechanism that is throttled differently).

I think this should be addressed in the plugin itself.

@bitwiseman bitwiseman reopened this Dec 9, 2020
@stale stale bot removed the stale label Dec 9, 2020
@bitwiseman
Contributor

It is likely that there is not much to be done in this plugin. As jetersen said:

The number of Bitbucket projects multiplied by the number of branches multiplied by your scan frequency = API request rate. A higher scan frequency will often trigger rate limiting because of that. Best to rely on events.

The rate limit that is being hit is probably the "repository" one, which is only 1,000 requests per hour.
https://support.atlassian.com/bitbucket-cloud/docs/api-request-limits/

However, if you can find places where the number of requests can be optimized, PRs are welcome.

Another possible solution would be to make this plugin use an HTTP cache. It looks like this plugin implements its own caching, but I would expect the lower-level caching to be more effective. Some guidance for that is provided here. The github-branch-source plugin added this functionality via OkHttp a while back and it has been very effective (though also a source of some issues).

@bitwiseman
Contributor

I think it should just work. Replace all the builders with cache client builders: https://github.com/jenkinsci/bitbucket-branch-source-plugin/search?q=HttpClientBuilder

HttpClientBuilder.create() -> CachingHttpClients.custom()

This will produce "memory bound" caches, good for easy testing: https://github.com/apache/httpcomponents-client/blob/25c124917bb3ea83c2ff00a6c076e12b260d455a/httpclient5-cache/src/main/java/org/apache/hc/client5/http/impl/cache/CachingHttpClients.java#L58-L60
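
A minimal sketch of what that swap could look like, assuming the Apache HttpClient 4.x cache module (httpclient-cache) is on the plugin's classpath; the factory class name and cache sizes below are illustrative, not the plugin's actual code:

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.cache.CacheConfig;
import org.apache.http.impl.client.cache.CachingHttpClients;

public class CachingClientFactory {
    // Builds a client backed by a small in-memory ("memory bound") cache.
    static CloseableHttpClient newCachingClient() {
        CacheConfig cacheConfig = CacheConfig.custom()
                .setMaxCacheEntries(1000)       // cap the number of cached responses
                .setMaxObjectSize(256 * 1024)   // skip caching very large response bodies
                .build();
        // Before: HttpClientBuilder.create()...build()
        // After:  CachingHttpClients.custom() returns a builder with the same API,
        //         so the rest of the existing configuration chain can stay unchanged.
        //         Cached responses are revalidated with conditional requests
        //         (ETag / Last-Modified) when the server supplies those headers,
        //         so unchanged resources no longer cost a full download.
        return CachingHttpClients.custom()
                .setCacheConfig(cacheConfig)
                .build();
    }
}

Note that a conditional revalidation still counts as an API request, so this mainly saves bandwidth and latency unless the responses also carry freshness lifetimes (Cache-Control max-age) that let the cache answer without contacting Bitbucket at all.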

@MateusVMachado

Hi, I'm having this problem too, but it started a few days ago and there was no change in Jenkins, nor the addition of new branches.
I contacted Atlassian and they sent me a file with all the endpoints called in the last 3 days.
The result is strange: it shows that a commit endpoint has been called more than 2k times.

[Screenshot: endpoint call counts provided by Atlassian]

After restarting Jenkins, I noticed that the problem had stopped, and Atlassian support confirmed this. But after a few hours the problem returned.

Is there a log that can be activated to track these requests?
We are unable to work due to this problem.

@marianobilli

Yeah, I just saw #165. I have changed the scan frequency to 2 minutes. No more throttling for the moment, but the proper way would be to use hooks as you said. 2 minutes should be enough for our use case ☺️

However, do you have an idea why so many requests are sent to Bitbucket?

@haidaraM Where did you change the frequency?

@haidaraM
Author

@marianobilli When you create a multibranch pipeline, you can set the frequency at which Jenkins will check if there is a change.
[Screenshot 2021-06-29 at 17:25:20: the periodic scan setting in the multibranch pipeline configuration]

@lifeofguenter
Contributor

I am going to close this for now. I don't think this is something this plugin can fix. Scanning is a very expensive process, irrespective of whether you are using Bitbucket Cloud or even Bitbucket Server. It just does not scale by design, no matter how we would try to mitigate the issue.

I would recommend either building a proxy or allowing specific webhook endpoints from Bitbucket's IP ranges (https://support.atlassian.com/organization-administration/docs/ip-addresses-and-domains-for-atlassian-cloud-products/#Outgoing-Connections).

@pbecotte

pbecotte commented Mar 1, 2022

Just as a drive-by comment... Jenkins still scans when using the events API, doesn't it?

@gonchik

gonchik commented Mar 25, 2022

It does. :(

@MartinHerren

I understand the issue is now closed and that not much can be done about it on Jenkins' side.

Our CI/CD has been unusable since our move from a self-hosted Bitbucket Server to Bitbucket Cloud.

Polling a big repository with many branches every minute was not a problem when self-hosting. It is impossible now with the cloud.

There are 2 reasons to prefer polling over webhook triggering:

  • not exposing your Jenkins to the internet. We solved that by exposing only the trigger endpoint through a reverse proxy (nginx) on the internet. Additionally, we'll whitelist Atlassian's IP ranges.
  • webhook triggering unfortunately doesn't trigger a rebuild of open PRs when the target branch gets updated (either through the merge of a PR or a direct push). This is a no-go for us.
    The fault clearly seems to lie on Bitbucket's side, which does not trigger a rebuild of the impacted PRs. A workaround could probably be done on Jenkins' side with a plugin that detects target branch changes and retriggers the impacted open PRs itself. No such plugin seems to exist for multibranch pipelines.

In the meantime we are experimenting with a hybrid solution: webhook triggering for pushes/merges, and polling at a much lower rate (currently every 15 minutes) to retrigger PRs that need to be rebuilt. This is only an ugly workaround, as it creates a 'merge window' of up to 15 minutes after every merge during which any PR can be merged untested...
