Bazel 4.2.1 doesn't respect --remote_max_connections #14178

Closed
Vertexwahn opened this issue Oct 27, 2021 · 4 comments
Labels: P1, team-Remote-Exec, type: bug

Comments

@Vertexwahn
Contributor

Description of the problem:

There is a problem with Bazel 4.2.1. For some reason, it does not respect the --remote_max_connections flag. Builds that run with this version of Bazel open up thousands of sockets when they should in fact be restricted to 100.

We do not see this behavior with Bazel 4.0.0.

Measured with socketstat (every build here uses a remote cache, and the system has around 240 sockets open before Bazel starts), the maximum number of open sockets increases sharply with 4.2.1:

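| Run ID | Bazel version | Max sockets |
|--------|---------------|-------------|
| 1      | 4.0.0         | 341         |
| 2      | 4.2.1         | 1241        |
| 3      | 4.0.0         | 332         |
| 4      | 4.2.1         | 1804        |
| 5      | 4.0.0         | 360         |
| 6      | 4.2.1         | 1813        |
| 7      | 4.2.1         | 2643        |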
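
To compare the two versions at a glance, the measurements can be averaged per Bazel version. This is a minimal sketch, not part of the original report: the run data and the baseline of roughly 240 sockets are copied from the description above, and the function name `summarize` is only illustrative.

```python
from collections import defaultdict
from statistics import mean

# (run id, Bazel version, max sockets), copied from the measurements above.
RUNS = [
    (1, "4.0.0", 341),
    (2, "4.2.1", 1241),
    (3, "4.0.0", 332),
    (4, "4.2.1", 1804),
    (5, "4.0.0", 360),
    (6, "4.2.1", 1813),
    (7, "4.2.1", 2643),
]

# Approximate number of sockets open before Bazel starts (from the report).
BASELINE_SOCKETS = 240


def summarize(runs, baseline=BASELINE_SOCKETS):
    """Group runs by Bazel version and average the max socket counts."""
    by_version = defaultdict(list)
    for _run_id, version, max_sockets in runs:
        by_version[version].append(max_sockets)
    return {
        version: {
            "mean_max": mean(values),
            "mean_above_baseline": mean(values) - baseline,
        }
        for version, values in by_version.items()
    }


if __name__ == "__main__":
    for version, stats in summarize(RUNS).items():
        print(version, stats)
```

With these numbers, 4.0.0 averages roughly 104 sockets above the baseline, while 4.2.1 averages roughly 1635 above it. The baseline is an approximation, so the differences are indicative only.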

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

TODO (we are working on it)

What operating system are you running Bazel on?

Ubuntu 20.04 LTS

What's the output of bazel info release?

release 4.0.0 or release 4.2.1

Have you found anything relevant by searching the web?

Nope.

Further notes

We are using Buildbarn

@limdor limdor mentioned this issue Oct 27, 2021
9 tasks
@katre
Member

katre commented Oct 27, 2021

Is this present in Bazel at head? We've just cut the 5.0.0 release, and are preparing to test and verify it, so this could be addressed in that.

@coeuvre
Member

coeuvre commented Oct 28, 2021

Not sure why, but --remote_max_connections is only used for the HTTP cache. You can verify this with 4.0.0 by setting it to a lower number, e.g. 10.

The difference between 4.0.0 and 4.2.1 for the gRPC cache is caused by the dynamic connection pool, which we introduced after 4.0.0: #11801 (comment)

That said, I do think we should provide a way to set the max connections for gRPC remote cache/execution. Reusing --remote_max_connections would be an incompatible change, since its default value is 100 while people expect the number of gRPC connections to be unlimited by default.
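
The check suggested above (lowering the limit to 10 and watching the socket count) could be scripted roughly as follows. This is only a sketch: the helper `bazel_command`, the target `//...`, and the cache URL `http://cache.example.com:8080` are hypothetical placeholders, while `--remote_cache` and `--remote_max_connections` are the real Bazel flags discussed in this issue. The script only builds and prints the command line; it does not run Bazel.

```python
import shlex


def bazel_command(target, remote_cache, max_connections):
    """Return an argv list for a Bazel build with a connection limit set.

    `target` and `remote_cache` are placeholders supplied by the caller.
    """
    return [
        "bazel",
        "build",
        f"--remote_cache={remote_cache}",
        f"--remote_max_connections={max_connections}",
        target,
    ]


if __name__ == "__main__":
    # Hypothetical HTTP cache endpoint, used purely for illustration.
    argv = bazel_command("//...", "http://cache.example.com:8080", 10)
    print(shlex.join(argv))
```

Running the printed command once against an HTTP cache and once against a gRPC cache, while recording the socket count, would show whether the limit is applied in each case.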

@coeuvre coeuvre self-assigned this Nov 1, 2021
@coeuvre coeuvre added the team-Remote-Exec, type: bug, and P1 labels Nov 1, 2021
@brentleyjones
Contributor

@Wyverald Can this get the release blocker milestone instead of #14202, and can it be reopened until the cherry-pick is in?

@Wyverald Wyverald added this to the Bazel 5.0 Release Blockers milestone Nov 16, 2021
Wyverald pushed a commit that referenced this issue Nov 16, 2021
…ons.

`--remote_max_connections` is only applied to HTTP remote cache. This PR makes it apply to gRPC cache/executor as well.

Note that `--remote_max_connections` limits the number of concurrent connections. For HTTP remote cache, one connection could handle one request at one time. For gRPC remote cache/executor, one connection could handle 100+ concurrent requests. So the default value `100` means we could make up to `100` concurrent requests for HTTP remote cache or `10000+` concurrent requests for gRPC remote cache/executor.

Fixes: #14178.

Closes #14202.

PiperOrigin-RevId: 410249542
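
The arithmetic in the commit message above can be written out explicitly. This is a sketch under the figures stated there: one request per HTTP connection, and at least 100 concurrent requests per gRPC connection (the message says "100+", so the gRPC result is a lower bound). The constant and function names are illustrative only.

```python
# Figures taken from the commit message; the gRPC value is a lower bound.
HTTP_REQUESTS_PER_CONNECTION = 1
GRPC_REQUESTS_PER_CONNECTION = 100
DEFAULT_MAX_CONNECTIONS = 100  # default of --remote_max_connections


def max_concurrent_requests(max_connections, requests_per_connection):
    """Upper bound on in-flight requests for a given connection limit."""
    return max_connections * requests_per_connection


if __name__ == "__main__":
    http_limit = max_concurrent_requests(
        DEFAULT_MAX_CONNECTIONS, HTTP_REQUESTS_PER_CONNECTION
    )
    grpc_limit = max_concurrent_requests(
        DEFAULT_MAX_CONNECTIONS, GRPC_REQUESTS_PER_CONNECTION
    )
    print(f"HTTP cache: up to {http_limit} concurrent requests")
    print(f"gRPC cache/executor: at least {grpc_limit} concurrent requests")
```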
@Wyverald
Member

Cherrypicked \o/

limdor pushed a commit to limdor/bazel that referenced this issue Nov 24, 2021
…ons.

(cherry picked from commit 8d5973d)
meteorcloudy pushed a commit that referenced this issue Nov 24, 2021
…ons. (#14318)

(cherry picked from commit 8d5973d)

Co-authored-by: Chi Wang