-
Notifications
You must be signed in to change notification settings - Fork 8.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allowing routes to specify the idle socket timeout #72362
Conversation
hey @kobelb. Any chance to move this forward? let us know how we can help |
I was waiting to catch-up with @joshdover, who returns from PTO shortly, to discuss testing strategies. I'm hopeful we'll be able to move this forward shortly. |
#73103 merged prior to this PR, causing conflicts... 👀 |
@restrry The work that was done as part of https://github.com/elastic/kibana/pull/73103/files conflicts with the approach in this PR. I saw some comments about you hesitating to allow route definitions to specify the "payload timeout", "response timeout" and the "idle socket timeout". However, Fleet would now like to specify the "idle socket timeout". This will cause a conflict with the workaround implemented here to deal with what I consider a bug in hapi. If the user specifies either the "payload timeout" and the "response timeout", we can increment the "idle socket timeout" by a millisecond to work around this. However, if they specify the "idle socket timeout" in addition to these settings, we're going to potentially violate Hapi's rules and leak the fact that we're using Hapi to consumers. The only other option I've thought of is to leave it up to the consumer to ensure these timeouts are configured without violating Hapi's rules, but this seems even worse. |
My point was that we are okay to support this when necessary, so it's not a problem to extend the definition of the route config.
We can add this validation on the platform level as a temporary measurement to buy some time to contribute to Hapi to fix the problem. Another option is to call
Introducing the payload timeout option in #73103 allows us to reduce this time. We struggle with the Hapi logic, but:
Is there an issue to understand the use-case? |
💔 Build Failed
Failed CI Steps
Test FailuresKibana Pipeline / kibana-oss-agent / Chrome UI Functional Tests.test/functional/apps/context/_date_nanos·js.context app context view for date_nanos displays predessors - anchor - successors in right orderStandard Out
Stack Trace
Kibana Pipeline / kibana-xpack-agent / X-Pack API Integration Tests.x-pack/test/api_integration/apis/endpoint/resolver·ts.apis Endpoint plugin Resolver related alerts route endpoint events should return details for the root nodeStandard Out
Stack Trace
Kibana Pipeline / x-pack-intake-agent / X-Pack Jest Tests.x-pack/plugins/uptime/public/components/common/charts/__tests__.PingHistogram component renders the component without errorsStandard Out
Stack Trace
and 1 more failures, only showing the first 3. Build metrics
History
To update your PR or re-run it, just comment with: |
@restrry the use case is that Elastic Agent connects to Kibana and does long-polling . It does long-polling because it means Elastic Agent can react to user-initiated changes immediately and at the same time, we reduce the number of requests that Kibana would need to respond to if we were doing regular polling. Increasing the socket timeout greatly increases the number of Elastic Agents each Kibana instance can handle. Can we help move this forward somehow? |
@restrry @kobelb aiming for 7.10 for this, right? |
This will allow Fleet to configure their routes to have an idle socket timeout that is longer than the global HTTP server's idle socket timeout.
Testing this became quite the challenge...
This PR only allows the route definition to override the "idle socket timeout", it does not allow the route definition to override the "payload timeout" or the "response timeout" as we haven't had any requests for this behavior. This interacts interestingly with the bug in hapi because the idle socket timeout must be larger than the "payload timeout" which defaults to 10 seconds. Until this bug is resolved, that means that the "idle socket timeout" must be larger than 10 seconds, or Hapi will throw an error when the route is defined. I'd prefer to not have to add our own validation for the idle socket timeout, as this will have to be removed when the hapi bug is resolved; however, it's somewhat confusing for the consumers because we leak details about hapi.