Route synchronous pycares AresError through the returned future #245

Merged: bdraco merged 7 commits into master from fix/issue-231-orphaned-future on May 16, 2026

@bdraco
Member

@bdraco bdraco commented May 16, 2026

Summary

Home Assistant was about to merge a workaround for this (home-assistant/core#170048), so I figured we should fix it here instead.

Fixes #231. Query methods could raise pycares.AresError synchronously (for example, on a malformed hostname), leaving the internally created future orphaned with an unretrieved exception and producing a noisy "Future exception was never retrieved" warning.

Each pycares call is now wrapped in a _capture_ares_error context manager that sets the error on the future, so callers can rely on await to raise just like for asynchronous c-ares errors.
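
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `AresError`, `DNSError`, `fake_channel_query`, and the errno value are stand-ins defined here so the example runs without pycares installed.

```python
import asyncio
import contextlib
from collections.abc import Iterator


class AresError(Exception):
    """Stand-in for pycares.AresError (assumed args: errno, message)."""


class DNSError(Exception):
    """Stand-in for aiodns.error.DNSError."""


EBADNAME = 8  # illustrative errno value for this sketch


@contextlib.contextmanager
def capture_ares_error(fut: "asyncio.Future[object]") -> Iterator[None]:
    """Move a synchronously raised AresError onto ``fut``."""
    try:
        yield
    except AresError as exc:
        # Swallow the synchronous raise; the caller sees it via ``await``.
        if not fut.done():
            fut.set_exception(DNSError(*exc.args))


def fake_channel_query(name: str) -> None:
    """Hypothetical backend call that rejects bad names synchronously."""
    if ".." in name:
        raise AresError(EBADNAME, "Misformatted domain name")


def query(name: str) -> "asyncio.Future[object]":
    """Always hand the future back, even when the backend raises."""
    fut = asyncio.get_running_loop().create_future()
    with capture_ares_error(fut):
        fake_channel_query(name)
    return fut


async def demo() -> tuple:
    """Await a malformed query and return the args of the raised error."""
    try:
        await query("bad..name")
    except DNSError as exc:
        return exc.args
    raise AssertionError("expected DNSError")
```

Because the future is always returned, the exception is retrieved by whoever awaits it, which is what avoids the "never retrieved" warning in this sketch.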

Test plan

  • New tests in tests/test_aiodns.py cover query, query_dns, and the exact example from the issue
  • Full test suite passes locally (89 passed)

When pycares raises AresError synchronously (e.g. ARES_EBADNAME for a
malformed hostname), the future was created inside the resolver method
but never returned to the caller, leaving an orphaned future with an
unretrieved exception and producing a noisy "Future exception was never
retrieved" warning.

Wrap each pycares call with a `_capture_ares_error` context manager
that sets the error on the future and lets it surface from `await`,
matching the behavior of asynchronous c-ares errors.

Fixes #231
@codecov

codecov Bot commented May 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.42%. Comparing base (ddef062) to head (0d0e9d8).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #245      +/-   ##
==========================================
+ Coverage   98.36%   98.42%   +0.06%     
==========================================
  Files           5        5              
  Lines        1345     1399      +54     
  Branches       70       73       +3     
==========================================
+ Hits         1323     1377      +54     
  Misses         16       16              
  Partials        6        6              


bdraco added 2 commits May 15, 2026 17:32
…ated

The `@contextlib.contextmanager` decorator combined with
`asyncio.Future[Any]` made the wrapped callable's type contain Any,
which mypy's disallow_any_decorated rejects. Parameterize the future
with the existing _T TypeVar to keep the contract polymorphic over
the future's payload type without leaking Any into the signature.

Split the original parametrize into two focused integration tests
sharing a small assertion helper, and add a direct unit test for
_capture_ares_error that exercises the `if fut.done(): return` path
by driving the context manager with __enter__/__exit__ on an already
resolved future.
@bdraco bdraco marked this pull request as ready for review May 16, 2026 00:40
@bdraco bdraco requested a review from Copilot May 16, 2026 00:40
@bdraco
Member Author

bdraco commented May 16, 2026

@bluetoothbot review

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes an aiodns behavior where pycares.AresError could be raised synchronously (e.g., malformed hostnames), leaving an internally created Future orphaned and emitting a "Future exception was never retrieved" warning. The change ensures synchronous c-ares errors are routed into the returned Future, so callers consistently see failures via await.

Changes:

  • Add a _capture_ares_error() context manager to convert synchronous pycares.AresError into aiodns.error.DNSError on the created future.
  • Wrap pycares.Channel calls (query, query_dns, getaddrinfo, etc.) so synchronous errors complete the future instead of bubbling synchronously.
  • Add regression tests covering malformed-name behavior and a done-future edge case.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
--- | ---
aiodns/__init__.py | Wraps synchronous pycares calls to ensure errors are surfaced via the returned future; adjusts callbacks to bail out on fut.done().
tests/test_aiodns.py | Adds regression tests to ensure malformed hostnames raise through await and that done futures aren’t overridden.

Comment thread aiodns/__init__.py Outdated
Comment thread tests/test_aiodns.py Outdated
bdraco added 2 commits May 15, 2026 17:54
Address Copilot review feedback on #245:

* Handle pycares.AresError instances with an unexpected args shape:
  fall through quietly when args is empty and derive the message from
  pycares.errno.strerror(errno) instead of trusting args[1], matching
  the behavior of _callback / _query_callback.
* Drive the done-future unit test through a real `with` block, so the
  scenario being exercised mirrors how the context manager is actually
  used in production code.

If the `if fut.done(): return` short-circuit doesn't fire, the context
manager body would call set_exception() on a done future, raising
InvalidStateError. Just running the `with raise exc` block to
completion is therefore a sufficient assertion, and it keeps the test
free of the extra try/except scaffolding that was only there to avoid
the unreachable-statement warning.
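
The reasoning in the commit message above rests on standard asyncio behavior: calling `set_exception()` on a future that is already done raises `InvalidStateError`. A small standalone check, independent of aiodns (the helper name is made up for this example):

```python
import asyncio


def set_exception_on_done_future() -> str:
    """Report what happens when a done future gets a second outcome."""
    loop = asyncio.new_event_loop()
    try:
        fut = loop.create_future()
        fut.set_result(None)  # the future is now done
        try:
            fut.set_exception(RuntimeError("late error"))
        except asyncio.InvalidStateError:
            return "InvalidStateError"
        return "no error"
    finally:
        loop.close()
```

So a test body that would reach `set_exception()` on a done future fails loudly on its own, which is why completing the `with` block can serve as the assertion.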
@bluetoothbot

bluetoothbot commented May 16, 2026

PR Review — Route synchronous pycares AresError through the returned future

Correct, narrowly-scoped fix for #231. The _capture_ares_error context manager properly routes synchronous pycares.AresError through the returned future, and the matching fut.cancelled() -> fut.done() guard updates in the three callbacks correctly handle the case where the context manager set the exception before pycares' own callback fires. No blocking issues. The suggestions are: (1) the not exc.args early-return silently hangs the future — extremely unlikely in practice but a poor failure mode; (2) the fut.done() guards are non-obvious without a comment tying them back to _capture_ares_error; (3) the done-future test should assert post-state explicitly rather than relying on "would have raised"; (4) the four other wrapped entry points (gethostbyname, getaddrinfo, getnameinfo, gethostbyaddr) have no regression coverage. Note: the two existing @Copilot comments look stale — the current code only indexes exc.args[0] (not [1]) and the done-future test already uses a real with block, so neither needs action.


🟢 Suggestions

1. Empty AresError.args silently hangs the future (`aiodns/__init__.py`, L229-234)

When exc.args is empty, _capture_ares_error returns silently without resolving the future. The caller will then await a future that is never set, hanging indefinitely. In practice pycares always constructs AresError with at least (errno, message), so this is unlikely — but the symptom (silent hang) is much worse than the alternative (a generic DNSError surfaced through await). Consider falling back to a sentinel errno such as pycares.errno.ARES_EFORMERR or pycares.errno.ARES_ENOTFOUND, or re-raising synchronously, so the future never gets stuck.

Something like:

except pycares.AresError as exc:
    if fut.done():
        return
    errno = exc.args[0] if exc.args else pycares.errno.ARES_EFORMERR
    fut.set_exception(
        error.DNSError(errno, pycares.errno.strerror(errno))
    )

Code under review:

except pycares.AresError as exc:
    if fut.done() or not exc.args:
        return
    errno = exc.args[0]
    fut.set_exception(
        error.DNSError(errno, pycares.errno.strerror(errno))
    )
2. Comment why _callback now guards on fut.done() instead of fut.cancelled() (`aiodns/__init__.py`, L161-195)

The switch from fut.cancelled() to fut.done() in _callback, _query_callback, and _gethostbyname_callback is load-bearing: it prevents InvalidStateError when _capture_ares_error already set the exception synchronously and pycares subsequently invokes the callback anyway (the scenario explicitly called out in the new context manager docstring). The change is non-obvious in isolation. A one-line comment such as `# may already be done if pycares raised synchronously; see _capture_ares_error` next to each guard would make the coupling between these three callbacks and the context manager explicit, so future readers don't tighten it back to cancelled() and re-break #231.

        if fut.done():
            return
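
To illustrate why the guard needs `done()` rather than `cancelled()`, here is a hedged sketch with a hypothetical `callback` function (its signature and the `DNSError` stand-in are assumptions for this example, not the aiodns callbacks themselves). A future that already holds an exception is done but not cancelled, so only the `done()` guard stops a late callback from touching it.

```python
import asyncio


class DNSError(Exception):
    """Stand-in for aiodns.error.DNSError."""


def callback(fut, result, errorno):
    """Hypothetical completion callback in the style discussed above."""
    # May already be done if the backend raised synchronously.
    if fut.done():
        return
    if errorno is not None:
        fut.set_exception(DNSError(errorno))
    else:
        fut.set_result(result)


def late_callback_state() -> tuple:
    """Return (done, cancelled, original exception kept) after a late callback."""
    loop = asyncio.new_event_loop()
    try:
        fut = loop.create_future()
        original = DNSError(8)
        fut.set_exception(original)  # error routed synchronously first
        callback(fut, None, 8)       # late callback must be a no-op
        return (fut.done(), fut.cancelled(), fut.exception() is original)
    finally:
        loop.close()
```

With a `fut.cancelled()` guard instead, the late call in this sketch would fall through to `set_exception()` and raise `InvalidStateError`.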
3. Make the done-future test assertion explicit (`tests/test_aiodns.py`, L1438-1450)

test_capture_ares_error_leaves_done_future_untouched relies on "reaching the body would raise InvalidStateError" as an implicit assertion — the docstring explains this but no actual assertion is made. A future refactor that, say, replaces fut.done() with fut.cancelled() again would silently make this test still pass on the cancelled path while regressing the originally-set-to-result case. Two small hardenings would lock the behavior in:

  1. Assert post-state explicitly: assert fut.result() is None after the with block, proving the existing result wasn't overwritten.
  2. Optionally add a second variant where fut.set_exception(SomeError()) first, then verify fut.exception() is still the original — covers the realistic race (the callback finished the future with an error before the synchronous raise propagated).
        with resolver._capture_ares_error(fut):
            raise pycares.AresError(
                aiodns.error.ARES_EBADNAME, 'Misformatted domain name'
            )
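
The two hardenings suggested above could look roughly like this. It is a sketch only: `capture_ares_error`, `AresError`, `DNSError`, and the errno constants are simplified stand-ins defined here, not the project's test code.

```python
import asyncio
import contextlib


class AresError(Exception):
    """Stand-in for pycares.AresError."""


class DNSError(Exception):
    """Stand-in for aiodns.error.DNSError."""


EFORMERR, EBADNAME = 2, 8  # illustrative errno values


@contextlib.contextmanager
def capture_ares_error(fut):
    """Simplified version of the guard: leave done futures alone."""
    try:
        yield
    except AresError as exc:
        if fut.done():
            return
        fut.set_exception(DNSError(*exc.args))


def result_is_preserved() -> bool:
    """Variant 1: an existing result must survive the synchronous raise."""
    loop = asyncio.new_event_loop()
    try:
        fut = loop.create_future()
        fut.set_result(None)
        with capture_ares_error(fut):
            raise AresError(EBADNAME, "Misformatted domain name")
        return fut.result() is None
    finally:
        loop.close()


def exception_is_preserved() -> bool:
    """Variant 2: an existing exception must not be replaced."""
    loop = asyncio.new_event_loop()
    try:
        fut = loop.create_future()
        original = DNSError(EFORMERR, "earlier failure")
        fut.set_exception(original)
        with capture_ares_error(fut):
            raise AresError(EBADNAME, "Misformatted domain name")
        return fut.exception() is original
    finally:
        loop.close()
```

Both helpers check observable state after the `with` block instead of relying on the absence of an `InvalidStateError`.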
4. Add coverage for the other wrapped entry points (`tests/test_aiodns.py`, L1409-1424)

query and query_dns are exercised with a malformed name, but the same _capture_ares_error wrapper now guards gethostbyname, getaddrinfo, getnameinfo, and gethostbyaddr. None of those paths have a regression test. Even a single parametrized test feeding a malformed name to getaddrinfo (the modern replacement for gethostbyname) would prevent a future refactor from quietly removing the with self._capture_ares_error(fut): wrapper from one of the less-used methods without anyone noticing.


Checklist

  • Synchronous error surfaces via await
  • Done-future guards prevent InvalidStateError
  • Defensive against malformed AresError.args — suggestion #1
  • Non-obvious coupling documented — suggestion #2
  • Tests assert observable state — suggestion #3
  • Coverage for all wrapped entry points — suggestion #4
  • No new resource leaks
  • No security concerns (no injection / unsafe deserialization / secrets)
  • Python 3.11+ compatible


Automated review by Kōan
c5567de
50c8f44
dadc120
66a6c67
fbf02b0

bdraco added 2 commits May 15, 2026 18:39
* _capture_ares_error: never silently leave the future unset; fall
  back to ARES_EFORMERR when AresError carries no args so callers
  always see a DNSError on `await` instead of an indefinite hang.
* Document the load-bearing fut.done() guard in the three callbacks
  so a future cleanup does not regress them back to fut.cancelled().
* Cover the remaining wrapped entry points (getaddrinfo, getnameinfo,
  gethostbyaddr) by injecting a synchronous AresError via mock; pycares
  itself never raises here today, so a real-input regression test is
  not possible, but the mocked test prevents the wrapper from being
  silently removed in a future refactor.

The trailing AssertionError branch was unreachable given the parametrize
values; switch the final check to an `assert channel_method ==
'gethostbyaddr'` so the dispatch stays exhaustive without dead code.
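
The mock-injection approach described in the first commit message above can be sketched with `unittest.mock`. `MiniResolver` and its methods are hypothetical and exist only for this example; they are not `aiodns.DNSResolver`, and the real tests may be structured differently.

```python
import asyncio
import contextlib
from unittest import mock


class AresError(Exception):
    """Stand-in for pycares.AresError."""


class DNSError(Exception):
    """Stand-in for aiodns.error.DNSError."""


class MiniResolver:
    """Hypothetical resolver that forwards calls to a channel object."""

    def __init__(self, channel):
        self._channel = channel

    @contextlib.contextmanager
    def _capture(self, fut):
        try:
            yield
        except AresError as exc:
            if not fut.done():
                fut.set_exception(DNSError(*exc.args))

    def _call(self, method, *args):
        fut = asyncio.get_running_loop().create_future()
        with self._capture(fut):
            getattr(self._channel, method)(*args)
        return fut

    def getaddrinfo(self, host):
        return self._call("getaddrinfo", host)

    def getnameinfo(self, host):
        return self._call("getnameinfo", host)

    def gethostbyaddr(self, host):
        return self._call("gethostbyaddr", host)


async def sync_error_surfaces(method: str) -> bool:
    """Inject a synchronous AresError and check it arrives via await."""
    channel = mock.MagicMock()
    getattr(channel, method).side_effect = AresError(2, "injected")
    resolver = MiniResolver(channel)
    fut = getattr(resolver, method)("example.com")  # must not raise here
    try:
        await fut
    except DNSError:
        return True
    return False
```

If the wrapper were dropped from one method in this sketch, the injected error would escape synchronously from the resolver call and the check would fail with `AresError` instead of returning `True`.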
@bdraco bdraco merged commit 7d436b2 into master May 16, 2026
24 checks passed
@bdraco bdraco deleted the fix/issue-231-orphaned-future branch May 16, 2026 01:55
@bdraco bdraco mentioned this pull request May 16, 2026

Development

Successfully merging this pull request may close these issues.

"Future exception was never retrieved" on invalid query_dns

3 participants