Coturn helm chart: Increase liveness timeout #3218

Merged
jschaul merged 1 commit into develop from coturn-liveness-timeout
Apr 5, 2023

jschaul (Member) commented Apr 5, 2023

Increase the default liveness/readiness probe settings and make them configurable.

Under high load, the defaults of failureThreshold=3 and timeoutSeconds=1 can cause unnecessary restarts of the coturn pods when the HTTP port is temporarily starved of CPU. This change should make such restarts less frequent and improve call stability.
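For illustration, a minimal sketch of where these settings sit on a Kubernetes probe. The field names (`periodSeconds`, `timeoutSeconds`, `failureThreshold`) are standard Kubernetes probe fields and the first block shows their Kubernetes defaults. The probe path, the port name, and the raised numbers in the second block are assumptions made for this example, not the values merged in this PR.

```yaml
# Hypothetical excerpt of a coturn container spec (illustrative only).

# With Kubernetes defaults: a reply slower than 1 s counts as a failure,
# and 3 consecutive failures (roughly 30 s at a 10 s period) restart the container.
livenessProbe:
  httpGet:
    path: /               # assumed probe path on coturn's HTTP port
    port: status-http     # hypothetical named container port
  periodSeconds: 10
  timeoutSeconds: 1
  failureThreshold: 3

# With more tolerant settings (example numbers, not taken from the PR):
# each probe may take up to 5 s, and 10 consecutive failures are needed.
# livenessProbe:
#   httpGet:
#     path: /
#     port: status-http
#   periodSeconds: 10
#   timeoutSeconds: 5
#   failureThreshold: 10
```

A failing liveness probe restarts the container, while a failing readiness probe only marks the pod as not ready, so the two may reasonably be tuned separately.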

Checklist

  • Add a new entry in an appropriate subdirectory of changelog.d
  • Read and follow the PR guidelines

zebot added the ok-to-test label (Approved for running tests in CI; overrides not-ok-to-test if both labels exist) Apr 5, 2023
supersven (Contributor) commented Apr 5, 2023

@jschaul do you know if a not quickly answering coturn may cause any kind of other troubles? I mean: What happens on the client's side in this case? 🤔

jschaul (Member, Author) commented Apr 5, 2023

> @jschaul do you know if a not quickly answering coturn may cause any kind of other troubles? I mean: What happens on the client's side in this case? 🤔

As far as I'm aware, clients attempt to contact multiple TURN servers at once and go with the server that answers first. So that shouldn't be an issue.

Also, the metrics/HTTP endpoint is separate from the UDP/TCP/TLS endpoints used for actual TURN traffic. So a slow metrics endpoint doesn't necessarily mean that actual TURN connections are slow, too.

supersven (Contributor) left a comment

Let's try this. 👍
(Better try things than wait in a half-broken state.)

jschaul merged commit 6f1cbab into develop Apr 5, 2023
jschaul deleted the coturn-liveness-timeout branch April 5, 2023 14:52