Skip to content

Conversation

@connorsmacd
Copy link
Collaborator

@connorsmacd connorsmacd commented Dec 4, 2025

Reference

CDRIVER-5815
CDRIVER-6177

Summary

This PR makes it possible to disable socket timeouts via socketTimeoutMS.

According to the spec, socketTimeoutMS=0 shall be used to disable socket timeouts. However, the C Driver does not comply with this. Since URI option values of 0 are interpreted as unset (see CDRIVER-3730), the default timeout value is used instead. Normally, this would be okay since the spec dictates that the default timeout should be "no timeout", but the C Driver is yet again not spec-compliant as it uses a default of 5 minutes.

To avoid a breaking change, this PR introduces an alternate string option "inf" that can be used to disable the socket timeout. This is a stopgap measure that should be reevaluated for the next major release.

The C Driver also treated timeout values of 0 as "immediate timeouts" just like the POSIX system call poll. Since this is not aligned with the spec, the C Driver was changed to internally treat timeout values of 0 as infinite timeouts. However, there are some parts of the C Driver that were intentionally using immediate timeouts for non-blocking behavior, so a new sentinel MONGOC_SOCKET_TIMEOUT_IMMEDIATE is introduced, which currently is set to the value of INT32_MIN.

Caution

This makes specifying special timeout values very confusing since there are different interpretations depending on the context and representation:

URI Spec socketTimeoutMS C Driver URI socketTimeoutMS C Driver Internal INT32 Representation POSIX INT32 Representation (e.g., the `timeout` arg for `poll`)
Immediate timeout cannot specify cannot specify -2147483648 (INT32_MIN) 0
Infinite/disabled timeout 0 "inf" 0 normally, but technically [-2147483647, 0] due to CDRIVER-4781 <0

@connorsmacd connorsmacd changed the title CDRIVER-5815 treat socketTimeoutMS=0 as infinite timeout CDRIVER-5815 support disabling socket timeouts Dec 17, 2025
@connorsmacd connorsmacd marked this pull request as ready for review December 17, 2025 20:33
@connorsmacd connorsmacd requested a review from a team as a code owner December 17, 2025 20:33
@kevinAlbs kevinAlbs requested a review from eramongodb December 17, 2025 20:35
Copy link
Contributor

@eramongodb eramongodb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am excited to see the further reduction of explicit timeout arithmetic throughout the codebase with the new _mongoc_stream_tls_timer_* helpers.

Just noting an observation:

int64_t timeout_ms = 0; // -> inf
timeout_ms = _mongoc_stream_tls_timer_to_timeout_msec(_mongoc_stream_tls_timer_from_timeout_msec(timeout_ms));
ASSERT_CMPINT32(timeout_ms, ==, MONGOC_SOCKET_TIMEOUT_INFINITE);

Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The explanation for the various timeout representations is much appreciated. Posting initial comments.

} else if (timeout_msec == 0) {
if (timeout_msec == MONGOC_SOCKET_TIMEOUT_IMMEDIATE) {
return 0;
} else if (timeout_msec <= 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect this may change how 0 is interpreted when directly using the mongoc_stream_t API. The mongoc_stream_t API is (unfortunately) public and has known users (e.g. userver here). Example:

mongoc_socket_t *sock = mongoc_socket_new(AF_INET, SOCK_STREAM, 0);

// Build address struct for localhost:27017:
struct sockaddr_in server_addr = {0};
server_addr.sin_family = AF_INET;
server_addr.sin_port = htons(27017);
server_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

ssize_t ok = mongoc_socket_connect(sock, (struct sockaddr *)&server_addr, sizeof(server_addr), -1);
ASSERT_CMPSSIZE_T(ok, ==, 0);

mongoc_stream_t *stream = mongoc_stream_socket_new(sock);
mongoc_iovec_t iov = {.iov_base = "foo", .iov_len = 3};
ok = mongoc_stream_writev(stream, &iov, 1, 0 /* timeout */); // Want to be non-blocking.
ASSERT_CMPSSIZE_T(ok, ==, 3);

mongoc_stream_destroy(stream);

Suggest reverting to preserve the meaning of 0 as non-blocking in the mongoc_stream_t API.

Copy link
Collaborator Author

@connorsmacd connorsmacd Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is quite unfortunate. While I could use the POSIX convention just for the public mongoc_stream_t API, I do not think this is the best approach.

Take mongoc_stream_read with OpenSSL enabled for example:

  • When mongoc_stream_read is called, we convert timeout_msec from the POSIX convention to the spec convention, then pass it to stream->readv.
  • stream->readv is _mongoc_stream_tls_openssl_readv in this case, which eventually calls mongoc_stream_tls_openssl_bio_read.
  • In mongoc_stream_tls_openssl_bio_read, we call mongoc_stream_read again, so we convert the timeout_msec back to the POSIX convention.

If we continue to mix conventions like this, then we are just asking for errors. In my opinion, it would be best to use the POSIX convention as much as possible internally and use the spec convention only where needed for spec compliance. In the case of socketTimeoutMS, we can use the spec convention within the URI, but once we extract the timeout from the URI, we convert to the POSIX convention then stick with it. This would have the added benefit of being more intuitive and less error-prone when dealing with POSIX system calls that accept a timeout.

That said, I recognize that being unaligned with the spec in this regard is less than ideal. This also goes against the recommendation in CDRIVER-5815:

IMO, libmongoc should be changed to have 0 actually mean "unlimited"

Thoughts?

Copy link
Contributor

@eramongodb eramongodb Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mongoc_socket_t and mongoc_stream_t API have been an unfortunate source of codebase and behavior inconsistencies that is difficult to address due to being public API. They are also primarily responsible for blocking further int32 vs. int64 correctness improvements (CDRIVER-4596 and CDRIVER-4599) which were were unable to be scoped for the v2 release, as noted in #1475:

Additionally, an attempt is made to document the unspecified/inconsistent behavior of non-positive timeout values for the URI option documentation and for functions that have a timeout parameter (i.e. mongoc_stream_*()). Scanning through how the timeout values are both represented and interpreted within libmongoc, it does not seem desirable to continue to support non-positive URI timeout values (even 0) despite historical (implicit) support.

Concerning "spec" vs. "POSIX" conventions, I am slightly in favor of limiting the POSIX convention to the mongoc-stream-socket.c component (where get_expiration() converts from spec to POSIX) + implementing specific workarounds for the stream/socket API, since in most contexts except for the socket/stream API, we use the spec convention. If we want to use the POSIX convention throughout the codebase, I expect there will be comparatively more changes needed for that than what this PR currently proposes (e.g. consider also how mlib_timer API treats < 0 as immediate timeout, whereas POSIX convention treats < 0 as infinite timeout).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the important context.

Upon reflection, I think I can avoid some of the mess I was previously concerned about by adding new private mongoc_stream_t functions that mirror the public API, but use the expected timeout convention. All internal uses of mongoc_stream_t can use the private API without changing timeout conventions. The public API can be reimplemented to delegate to the private API by converting timeouts to the correct convention.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further look, this might be further complicated by streams wrapping streams. For example, a TLS stream might pass a timeout to an underlying base stream:

mongoc_stream_t *base_stream = get_stream();

// Wrap in a TLS stream:
mongoc_ssl_opt_t ssl_opt = {0};
mongoc_stream_t *tls_stream = mongoc_stream_tls_new_with_hostname(base_stream, "localhost", &ssl_opt, 1 /* client */);
ASSERT(tls_stream);

int32_t timeout_ms = 0; // stream convention: 0 means immediate.
size_t ret = mongoc_stream_write(tls_stream, "foo", 3, timeout_ms);
// mongoc_stream_write converts to socket convention: 0 => MONGOC_SOCKET_TIMEOUT_IMMEDIATE
// mongoc_stream_write is called on base_stream with socket convention (but should be stream convention)

It might be confusing to track if timeout_msec is in "socket convention" or "stream convention". This might point to a need for a larger refactor of how timeout representation (a new internal mongoc_socket_timeout_t type?). But for the scope of "socketTimeoutMS=inf", I would rather err towards a simpler change.

Admittedly, I do not see an obvious simplification. Possibly: can the internal representation of the timeout remain the same? Instead, add internal-only stream API for infinite timeout:

// mongoc_stream_writev is public API.
ssize_t
mongoc_stream_writev(mongoc_stream_t *stream, mongoc_iovec_t *iov, size_t iovcnt, int32_t timeout_msec)
{
   // CDRIVER-4781: for backward compatibility.
   if (timeout_msec < 0) {
      timeout_msec = MONGOC_DEFAULT_TIMEOUT_MSEC;
   }
   return mongoc_stream_writev_impl(stream, iov, iovcnt, timeout_msec);
}

// mongoc_stream_writev_notimeout is internal-only.
ssize_t
mongoc_stream_writev_notimeout(mongoc_stream_t *stream, mongoc_iovec_t *iov, size_t iovcnt)
{
   int32_t timeout_msec = -1; // Infinite.
   return mongoc_stream_writev_impl(stream, iov, iovcnt, timeout_msec);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

infinite timeouts will not work on existing stream wrappers unless they are updated

Good point. Though userver appears to expect negative values mean infinite, I would not expect existing stream implementations to handle negative values. "socketTimeoutMS=inf" is a new option. If this only results in negative values being passed into the stream functions when "inf" is set, I think that is OK. That seems like an opt-in.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly silly alternative: instead of "inf", support "max". Translate to INT32_MAX (~600 hours). I expect that is large enough for the vast majority users wanting to set an infinite timeout. Though this strays further from the original intent, "inf" is already out-of-spec. "max" may still provide a convenience for users that want to set a timeout they expect not to hit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so I went with the change to use the POSIX convention internally. I considered using INT32_MAX like you suggested, but I felt less inclined to add yet another quirk to this. That said, if we end up finding more issues with my latest set of changes, then I may just end up using the silly alternative after all.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extending debug_stream_t can test how socketTimeoutMS is passed through to streams set with mongoc_client_set_stream_initiator. See this test for an example.

socketTimeoutMS=INT32_MIN may be a problem. It changes to 0 resulting in immediate timeout:

ASSERT_OR_PRINT(ok, error);                               // After PR: error!
ASSERT_CMPINT32(stats.last_timeout_readv, ==, INT32_MIN); // After PR: 0
ASSERT_CMPINT32(stats.last_timeout_writev, ==, 3600000);  // After PR: 0

IMO: I think "inf" is a convenience more than a necessity. I would rather avoid adding much complexity and avoid any possibly breaking changes.

If "inf" still seems worth pursuing, consider separating refactors into a new PR (e.g. adding a socket timeout convention + changing the TLS timeout handling). That might help separate what changes are needed for "inf" from changes that are intended as improvements to existing timeout handling.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO: I think "inf" is a convenience more than a necessity.

I'm inclined to agree. We are deviating even further from the spec by adding this special string value for socketTimeoutMS and making stream timeout handling more confusing and error-prone. For how much trouble this has been, we already have a reasonable workaround, which is to simply use socketTimeoutMS=2147483647. The only real benefit of these changes is that we can have "truly" infinite timeouts, which has little value in practice.

It's also worth remembering that socketTimeoutMS is deprecated in favor of timeoutMS. CSOT may provide us with a better opportunity to improve timeout handling when (if?) we begin working on it.

} else if (timeout_msec == 0) {
if (timeout_msec == MONGOC_SOCKET_TIMEOUT_IMMEDIATE) {
return 0;
} else if (timeout_msec <= 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further look, this might be further complicated by streams wrapping streams. For example, a TLS stream might pass a timeout to an underlying base stream:

mongoc_stream_t *base_stream = get_stream();

// Wrap in a TLS stream:
mongoc_ssl_opt_t ssl_opt = {0};
mongoc_stream_t *tls_stream = mongoc_stream_tls_new_with_hostname(base_stream, "localhost", &ssl_opt, 1 /* client */);
ASSERT(tls_stream);

int32_t timeout_ms = 0; // stream convention: 0 means immediate.
size_t ret = mongoc_stream_write(tls_stream, "foo", 3, timeout_ms);
// mongoc_stream_write converts to socket convention: 0 => MONGOC_SOCKET_TIMEOUT_IMMEDIATE
// mongoc_stream_write is called on base_stream with socket convention (but should be stream convention)

It might be confusing to track if timeout_msec is in "socket convention" or "stream convention". This might point to a need for a larger refactor of how timeout representation (a new internal mongoc_socket_timeout_t type?). But for the scope of "socketTimeoutMS=inf", I would rather err towards a simpler change.

Admittedly, I do not see an obvious simplification. Possibly: can the internal representation of the timeout remain the same? Instead, add internal-only stream API for infinite timeout:

// mongoc_stream_writev is public API.
ssize_t
mongoc_stream_writev(mongoc_stream_t *stream, mongoc_iovec_t *iov, size_t iovcnt, int32_t timeout_msec)
{
   // CDRIVER-4781: for backward compatibility.
   if (timeout_msec < 0) {
      timeout_msec = MONGOC_DEFAULT_TIMEOUT_MSEC;
   }
   return mongoc_stream_writev_impl(stream, iov, iovcnt, timeout_msec);
}

// mongoc_stream_writev_notimeout is internal-only.
ssize_t
mongoc_stream_writev_notimeout(mongoc_stream_t *stream, mongoc_iovec_t *iov, size_t iovcnt)
{
   int32_t timeout_msec = -1; // Infinite.
   return mongoc_stream_writev_impl(stream, iov, iovcnt, timeout_msec);
}

} else if (timeout_msec == 0) {
if (timeout_msec == MONGOC_SOCKET_TIMEOUT_IMMEDIATE) {
return 0;
} else if (timeout_msec <= 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extending debug_stream_t can test how socketTimeoutMS is passed through to streams set with mongoc_client_set_stream_initiator. See this test for an example.

socketTimeoutMS=INT32_MIN may be a problem. It changes to 0 resulting in immediate timeout:

ASSERT_OR_PRINT(ok, error);                               // After PR: error!
ASSERT_CMPINT32(stats.last_timeout_readv, ==, INT32_MIN); // After PR: 0
ASSERT_CMPINT32(stats.last_timeout_writev, ==, 3600000);  // After PR: 0

IMO: I think "inf" is a convenience more than a necessity. I would rather avoid adding much complexity and avoid any possibly breaking changes.

If "inf" still seems worth pursuing, consider separating refactors into a new PR (e.g. adding a socket timeout convention + changing the TLS timeout handling). That might help separate what changes are needed for "inf" from changes that are intended as improvements to existing timeout handling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants