[extension/httpforwarder] Shutdown should wait server exit#37735
Conversation
| err := h.server.Close() | ||
| h.shutdownWG.Wait() | ||
| return err |
There was a problem hiding this comment.
The use of Shutdown vs Close is orthogonal to the issue being fixed in this PR: deterministic release of the port used by the component. In both cases there is still the need to wait the call to Serve(listener) to return.
I tend to prefer Close since, in certain cases Shutdown can take much longer given that it needs to wait existing connections to finish gracefully. That said the component owners will have more context if they want to change the current behavior in that regard.
There was a problem hiding this comment.
I may miss something:
// Shutdown gracefully shuts down the server without interrupting any
// active connections. Shutdown works by first closing all open
// listeners, then closing all idle connections, and then waiting
// indefinitely for connections to return to idle and then shut down.
// If the provided context expires before the shutdown is complete,
// Shutdown returns the context's error, otherwise it returns any
// error returned from closing the [Server]'s underlying Listener(s).
A.k.a because it returns the error from the listener that function that you wait for it must be done.
There was a problem hiding this comment.
Take into account the possible interleaves of the calls to Serve and Shutdown/Close. The ownership of listener is transferred to the http.Server when the call to Serve(listener) happens. However, since that happens in a goroutine it is possible that Shutdown is called before the http.Server took the ownership of listener. In this case the listener is not closed until the call to Serve happens. Serve adds a deferred call to listener.Close, even if Shutdown was already called, which ensures that the port is released by the time the method returns.
While this may be rather improbable in practice, it is better for the component to behave deterministically with the shutdown ensuring no left over goroutine and/or port.
dmitryax
left a comment
There was a problem hiding this comment.
LGTM as a solution tofor the flaky test. The Close vs Shutdown discussion can be solved in a separate PR if needed
* main: (55 commits) [chore] Update core dependencies (open-telemetry#38124) Add kafka topics observer implementation (open-telemetry#38060) [exporter/splunk_hec] Mute errors from draining the response body (open-telemetry#38118) [chore] [exporter/splunk_hec] Remove dead code (open-telemetry#38113) Add support for JUnit test results (open-telemetry#37941) [chore] amend changelog for prometheus receiver change (open-telemetry#38109) [chore] Fix dead links in issue-triaging.md (open-telemetry#38105) [chore] fix deprecation (open-telemetry#38107) [exporter/coralogix] Add new batch options to Coralogix exporter (open-telemetry#38082) [chore][exporter/datadog] fix integration test (open-telemetry#38091) [chore] Update otel to unblock contrib test in core repo (open-telemetry#38100) [chore] Bump go-version match to 1.23 (open-telemetry#38099) [exporter/elasticsearch] Add _metric_names_hash to avoid metric rejections (open-telemetry#37511) elasticsearchexporter: refactor encoding; drop metrics support from raw/none/bodymap mapping modes (open-telemetry#37928) [exporter/stefexporter] Fix incorrectly implemented STEF exporter zstd compression option (open-telemetry#38089) [exporter/clickhouse] Add client info for identifying exporter in `system.query_log` (open-telemetry#37146) [chore] Prepare release 0.120.1 (open-telemetry#38055) [extension/httpforwarder] Shutdown should wait server exit (open-telemetry#37735) receiver/prometheusremotewrite: Add two fields timestamp and value. (open-telemetry#37895) [reciver/sqlqueryreceiver] Add support for SapASE (sybase) (open-telemetry#37773) ...
Description
Ensures that the
Shutdownwaits for the server to release the listen port and finish the corresponding go routine.Although a bug on the component it doesn't seem to deserve a changelog since it was detected via tests and there is no issue around the shutdown of the component.
Link to tracking issue
Fixes #37716
Testing
Repro the issue locally and validated no repro with the changes.
Documentation
N/A