bbs goes to 100% CPU after a period of time and won't accept requests #597

## Summary
In our largest foundation, the active BBS instance intermittently goes to 100% CPU and stops accepting requests. From the user side, the symptoms are stager errors such as "Runner is unavailable:" when pushing, restarting, or staging apps. In the bbs log, errors like `2021/06/23 15:32:42 http: TLS handshake error from 10.10.17.188:43720: EOF` start to fill the log.
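To put numbers on the handshake errors, something like the following rough Go sketch could tally them per minute and per source IP. It assumes the error lines look exactly like the one quoted above; the names `Tally`, `NewTally`, and `Add` are made up for illustration and are not part of bbs.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"regexp"
	"sort"
)

// handshakeRe matches lines shaped like the one quoted in this issue
// (assumed format):
//   2021/06/23 15:32:42 http: TLS handshake error from 10.10.17.188:43720: EOF
// Group 1 is the timestamp truncated to the minute, group 2 is host:port.
var handshakeRe = regexp.MustCompile(
	`^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2} http: TLS handshake error from (\S+): `)

// Tally is a hypothetical helper holding counts per minute and per remote host.
type Tally struct {
	PerMinute map[string]int
	PerHost   map[string]int
}

// NewTally returns an empty Tally.
func NewTally() *Tally {
	return &Tally{PerMinute: map[string]int{}, PerHost: map[string]int{}}
}

// Add counts one log line and reports whether it was a handshake error.
func (t *Tally) Add(line string) bool {
	m := handshakeRe.FindStringSubmatch(line)
	if m == nil {
		return false
	}
	host, _, err := net.SplitHostPort(m[2])
	if err != nil {
		host = m[2] // keep the raw address if it is not host:port
	}
	t.PerMinute[m[1]]++
	t.PerHost[host]++
	return true
}

// printSorted prints the counts in key order so runs are comparable.
func printSorted(title string, counts map[string]int) {
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	fmt.Println(title)
	for _, k := range keys {
		fmt.Printf("  %s %d\n", k, counts[k])
	}
}

func main() {
	t := NewTally()
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // allow long log lines
	for sc.Scan() {
		t.Add(sc.Text())
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "read error:", err)
		os.Exit(1)
	}
	printSorted("errors per minute:", t.PerMinute)
	printSorted("errors per source IP:", t.PerHost)
}
```

The log would be piped to the program on stdin. If the errors come from only a few source IPs, that points at particular clients. If they are spread evenly, that points at bbs itself.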

Restarting bbs (which moves all traffic to the other diego-api instance) fixes the problem. I suspect a resource exhaustion issue, because the error doesn't come up if we restart bbs about twice a week.

We're running diego-release v2.49.0 (with a planned upgrade at the end of the month).

## Steps to Reproduce
We don't have a way to reproduce this. However, since this may be a resource exhaustion issue, it's worth mentioning that there are around 39,000 events per minute in `bbs.stdout.log` on a normal day. We have about 10,000 apps in this foundation.

## Diego repo

https://github.com/cloudfoundry/bbs

## Environment Details 

Versions in use:
- cf-deployment v16.14.0
- diego-release v2.49.0
- bionic stemcell v1.1

## Possible Causes or Fixes (optional)

I suspect a resource exhaustion issue, but I have no further insight into what might be happening.

## Additional Text Output, Screenshots, contextual information (optional)

I realize this is a vague report. I am happy to collect more info if someone can guide me on what data is needed and how to get it.
