OAuth seems to be leaking Apache HTTP client I/O Dispatcher threads #918

Closed
ryanrupp opened this issue Nov 21, 2024 · 3 comments

ryanrupp commented Nov 21, 2024

Jenkins and plugins versions report

Jenkins: 2.479.1
OS: Linux - 5.15.0-124-generic
Java: 17.0.12 - Ubuntu (OpenJDK 64-Bit Server VM)

// basic plugin info (omitted details)
apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
apache-httpcomponents-client-5-api:5.4-124.v31e2987e48f4

cloudbees-bitbucket-branch-source:912.v3b_f74026941c

After upgrading cloudbees-bitbucket-branch-source from 906 to 912.v3b_f74026941c, Jenkins would run for some time and then start throwing OutOfMemoryErrors with:

Found unhandled java.lang.OutOfMemoryError exception:
unable to create native thread: possibly out of memory or process/resource limits reached

Digging into this, I found tens of thousands of I/O dispatcher <id> threads and thousands of pool-<id>-thread-<number> threads, with stacks such as:

"I/O dispatcher 14952" #110761 prio=5 os_prio=0 cpu=13.24ms elapsed=474.83s tid=0x00007f2526ca2ee0 nid=0x330f52 runnable  [0x00007f203cbfe000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.EPoll.wait(java.base@17.0.12/Native Method)
	at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17.0.12/EPollSelectorImpl.java:118)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17.0.12/SelectorImpl.java:129)
	- locked <0x000000077cff65c0> (a sun.nio.ch.Util$2)
	- locked <0x000000077cff6520> (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(java.base@17.0.12/SelectorImpl.java:141)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
	at java.lang.Thread.run(java.base@17.0.12/Thread.java:840)

and

"pool-6709-thread-1" #110757 prio=5 os_prio=0 cpu=18.16ms elapsed=474.84s tid=0x0000564707352dc0 nid=0x330f4e runnable  [0x00007f204fcfe000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.EPoll.wait(java.base@17.0.12/Native Method)
	at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17.0.12/EPollSelectorImpl.java:118)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17.0.12/SelectorImpl.java:129)
	- locked <0x000000077cea0ca0> (a sun.nio.ch.Util$2)
	- locked <0x000000077cea0bf8> (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(java.base@17.0.12/SelectorImpl.java:141)
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:343)
	at org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221)
	at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64)
	at java.lang.Thread.run(java.base@17.0.12/Thread.java:840)
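
For anyone trying to confirm a similar leak without reading a full thread dump, the counting described above can be sketched in plain Java. This is only an illustrative sketch: the class name ThreadCensus and its methods are hypothetical, and it assumes the code runs inside the affected JVM (for example adapted for the Jenkins script console). It collapses numeric ids so that names like "I/O dispatcher 14952" and "pool-6709-thread-1" fall into one family each.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Jenkins or the Bitbucket plugin.
public class ThreadCensus {

    /** Replaces every run of digits with '#', so numbered threads share one family name. */
    public static String family(String threadName) {
        return threadName.replaceAll("\\d+", "#");
    }

    /** Counts live threads in this JVM, grouped by family name. */
    public static Map<String, Integer> census() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(family(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // One line per thread family: count, then the collapsed name.
        census().forEach((name, n) -> System.out.println(n + "\t" + name));
    }
}
```

Running the census before and after a few builds and comparing the counts for the "I/O dispatcher #" and "pool-#-thread-#" families should show whether they keep growing.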

I narrowed it down to the start/stop of builds. I put a breakpoint in the AbstractMultiworkerIOReactor constructor with remote debugging on our Jenkins instance, and it looks like this is coming from BitbucketOAuthAuthenticator, possibly related to this or this, I think. The code path seems to decide it needs to establish a new HTTP client every time, but that client then never gets closed. At one point in the commits the client is passed in as httpClient, but later this moves to httpClientConfig.

Here's the callstack when hitting that constructor:

<init>:123, AbstractMultiworkerIOReactor (org.apache.http.impl.nio.reactor)
<init>:82, DefaultConnectingIOReactor (org.apache.http.impl.nio.reactor)
create:43, IOReactorUtils (org.apache.http.impl.nio.client)
build:686, HttpAsyncClientBuilder (org.apache.http.impl.nio.client)
<init>:37, ApacheHttpClient (com.github.scribejava.httpclient.apache)
<init>:33, ApacheHttpClient (com.github.scribejava.httpclient.apache)
createClient:12, ApacheProvider (com.github.scribejava.httpclient.apache)
getClient:46, OAuthService (com.github.scribejava.core.oauth)
<init>:40, OAuthService (com.github.scribejava.core.oauth)
<init>:37, OAuth20Service (com.github.scribejava.core.oauth)
createService:114, DefaultApi20 (com.github.scribejava.core.builder.api)
build:127, ServiceBuilder (com.github.scribejava.core.builder)
<init>:38, BitbucketOAuthAuthenticator (com.cloudbees.jenkins.plugins.bitbucket.api.credentials)
convert:37, BitbucketOAuthAuthenticatorSource (com.cloudbees.jenkins.plugins.bitbucket.api.credentials)
convert:16, BitbucketOAuthAuthenticatorSource (com.cloudbees.jenkins.plugins.bitbucket.api.credentials)
convert:148, AuthenticationTokens (jenkins.authentication.tokens.api)
build:156, BitbucketSCMFileSystem$BuilderImpl (com.cloudbees.jenkins.plugins.bitbucket.filesystem)
of:316, SCMFileSystem (jenkins.scm.api)
create:109, SCMBinder (org.jenkinsci.plugins.workflow.multibranch)
run:311, WorkflowRun (org.jenkinsci.plugins.workflow.job)
execute:101, ResourceController (hudson.model)
run:445, Executor (hudson.model)

What Operating System are you using (both controller, and any agents involved in the problem)?

Ubuntu 22.04

Reproduction steps

Run builds where Bitbucket OAuth is configured

Expected Results

No threads are leaked between builds

Actual Results

HTTP client threads are leaked between builds

Anything else?

No response

Are you interested in contributing a fix?

No response


ryanrupp commented Nov 21, 2024

Putting the service in a try-with-resources block would fix this (it closes the HTTP client), though I'm not sure whether the goal is to reuse the client between usages. If the service is only used for a one-off request, some additional changes may be worthwhile so that it doesn't create a thread pool at all.
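
The difference between the two code paths can be sketched with a stand-in class. Everything here is hypothetical: ThreadOwningService is not the scribejava service, it only imitates a closeable object whose constructor starts a worker thread, and the OPEN counter exists purely to make the leak visible.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; names are assumptions, not plugin or scribejava code.
public class CloseDemo {

    /** Stand-in for a service whose HTTP client owns worker threads. */
    public static final class ThreadOwningService implements Closeable {
        /** Number of instances created but not yet closed. */
        public static final AtomicInteger OPEN = new AtomicInteger();

        // Daemon thread so the demo JVM can still exit when an instance leaks.
        private final ExecutorService reactor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "demo-reactor");
            t.setDaemon(true);
            return t;
        });

        public ThreadOwningService() {
            reactor.submit(() -> { }); // forces the worker thread to start
            OPEN.incrementAndGet();
        }

        public String fetchToken() {
            return "token";
        }

        @Override
        public void close() {
            reactor.shutdownNow(); // releases the worker thread
            OPEN.decrementAndGet();
        }
    }

    /** Leaky pattern: the service is built per call and never closed. */
    public static String leaky() {
        ThreadOwningService service = new ThreadOwningService();
        return service.fetchToken();
    }

    /** Fixed pattern: try-with-resources closes the service on every exit path. */
    public static String closed() {
        try (ThreadOwningService service = new ThreadOwningService()) {
            return service.fetchToken();
        }
    }

    public static void main(String[] args) {
        closed();
        System.out.println("open after closed(): " + ThreadOwningService.OPEN.get());
        leaky();
        System.out.println("open after leaky(): " + ThreadOwningService.OPEN.get());
    }
}
```

In this sketch the counter stays at zero for the try-with-resources path and grows by one for every call of the leaky path, which mirrors one leaked reactor per build. Whether closing per call or sharing a single client is the better fix depends on how often the authenticator is instantiated.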

nfalco79 self-assigned this Nov 21, 2024
@nfalco79
Member

I'm on this

@nfalco79
Member

Thank you very much for digging into the issue; it always helps a lot and makes the fix faster ;)
Comparing the code before and after the update from scribe to scribejava, and looking into the old implementation, I found that it uses the JDK's HttpURLConnection. When I updated to scribejava I thought the client implementation in use was Apache HTTP (my fault).
I have been thinking about whether the Apache HTTP client would give me any benefit over the JDK client, but...

  • any change impacts a lot of people (it's like moving an elephant in a china cabinet)
  • the Authenticator is instantiated per build, when opening the config page, and on re-index scans

...I do not see any benefit in using a client with a pooled connection manager (bound to the execution thread) that is created and destroyed each time. I could share it, but then I would have to manage it myself in the authenticator instance (it is not closed when the client is closed), so I will change the Apache HTTP client provider to the JDK HTTP client provider.
This will also fix issue #903.
