Fix a mismatched new/delete ASAN error #563

Merged
merged 1 commit into main from fix_asan_error_in_any_sender_destruction on Sep 29, 2023

Conversation

ispeters
Contributor


We discovered a bug in the tear-down of `any_sender_of<>` that ASAN
reports as a mismatched new/delete. I'm not certain what's required to
repro the problem, but this is sort of what our repro looks like:
```
struct CustomScheduler { /* ... */ };

template <typename... T>
using any_scheduled_sender_of = unifex::with_receiver_queries<
    unifex::overload<CustomScheduler(const unifex::this_&) noexcept>(
            unifex::get_scheduler)>::any_sender_of<T...>;

struct Container;

any_scheduled_sender_of<Container> Class::getContainer() noexcept {
  // member_->getContents() returns an any_scheduled_sender_of<Contents>
  return member_->getContents() |
      unifex::then([](auto&& contents) noexcept {
           return Container{contents};
         });
}

void Class::getContainerAsync(Callback callback) noexcept {
  getContainer() |
      unifex::then(
          [callback](auto&& container) noexcept {
            callback(container);
          }) |
      unifex::with_query_value(
          unifex::get_scheduler, scheduler_) |
      unifex::spawn_detached(scope_);
}
```

I don't know whether the `get_scheduler` customization is necessary, or
if the problem would repro with a standard `any_sender_of<...>`.

The *Sender* constructed (and spawned) by the above code looks like
this:
```
v2::async_scope::nest_sender
  with_query_value
    then (1)
      any_scheduled_sender_of<Container>
        then (2)
          any_scheduled_sender_of<Contents>
            ...
```
When we destroy the resulting operation state, the outer `then`
op state (numbered 1, above) tries to destroy the op state for the
`any_scheduled_sender_of<Container>`, which, through a few layers of
indirection, correctly invokes the `tag_invoke(_deallocate_cpo{}, ...)`
implementation of `_deallocate_cpo`[1]. However, the `tag_invoke`
overload that's found is the forwarding implementation on
`any_sender_of<>`'s `_op_for`[2], which is *intended* to only forward
receiver queries. Since the receiver in question doesn't customize
`_deallocate_cpo`, we end up invoking `delete &receiver`, which is,
coincidentally, stored at the beginning of the operation state for
`then (2)`, leading ASAN to believe we're deleting a pointer that was
new'd but with the wrong type. In fact, we're deleting an object that
was never new'd because we've selected the wrong `tag_invoke` overload.
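
To make the reported symptom concrete, here is a deliberately broken,
self-contained sketch (not libunifex code; all names are made up) of
what the bad `delete` amounts to: the receiver is the first member of a
larger heap-allocated operation state, so `delete &receiver` hands
`delete` a pointer whose allocation was made as a different, larger
type, which is exactly the situation ASAN flags.
```
struct Receiver {
  // Stands in for the type-erased receiver; it does not customize
  // deallocation, so the fallback behaviour is `delete &receiver`.
  void* state = nullptr;
};

struct ThenOpState {
  // The receiver happens to be stored first, so &receiver aliases the
  // start of the allocation made for the whole operation state.
  Receiver receiver;
  unsigned char rest_of_op_state[64] = {};
};

int main() {
  auto* op = new ThenOpState{};
  // The wrongly selected tag_invoke overload effectively reduces to
  // this: deleting a Receiver* that was actually allocated as a
  // ThenOpState. ASAN reports this as the new/delete mismatch
  // described above.
  delete &op->receiver;
}
```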

I believe that `_deallocate_cpo` is expecting that the only
customizations will be the one provided in `any_unique.hpp`[3] but,
because `_deallocate_cpo` matches the `is_receiver_query_cpo_v`
predicate, and because we've somehow goofed up how we build the
`tag_invoke` overload set for `any_sender_of<>`'s operation state, we
select a different one that matches better.

This diff fixes the problem by explicitly defining
`is_receiver_query_cpo_v<_deallocate_cpo>` to `false`, thus making the
erroneous `tag_invoke` overload a non-match. Once the undesired overload
is excluded, overload resolution finds the correct one and we delete the
correct object.
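
For illustration, here is a self-contained model of the shape of the
fix (made-up names, not the actual unifex definitions): the predicate
treats every CPO as a receiver query by default, the deallocation CPO
is explicitly opted out, and the forwarding overload constrained on the
predicate therefore stops being a candidate for it.
```
#include <cstdio>
#include <type_traits>

struct get_scheduler_t {};  // a genuine receiver query
struct deallocate_t {};     // must not be treated as a receiver query

// Over-broad default: every CPO counts as a receiver query.
template <typename CPO>
inline constexpr bool is_a_receiver_query = true;

// The fix, in spirit: an explicit opt-out for the deallocation CPO.
template <>
inline constexpr bool is_a_receiver_query<deallocate_t> = false;

struct OpState {
  // Forwarding overload intended only for receiver queries; it is
  // constrained on the predicate above.
  template <typename CPO,
            std::enable_if_t<is_a_receiver_query<CPO>, int> = 0>
  void handle(CPO) const {
    std::puts("forwarded to the receiver (receiver queries only)");
  }

  // The deallocation path that must be selected for deallocate_t.
  void handle(deallocate_t) const {
    std::puts("deallocate via the type-erased implementation");
  }
};

int main() {
  OpState op;
  op.handle(get_scheduler_t{});  // only the forwarding overload is viable
  op.handle(deallocate_t{});     // the forwarding overload SFINAEs away,
                                 // so the deallocation overload is chosen
}
```
In the actual change, as described above, the exclusion is expressed by
making `is_receiver_query_cpo_v<_deallocate_cpo>` false; the model only
mirrors the effect that has on overload resolution.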

In the future, we should probably make "is this CPO a receiver query?"
an explicit opt-in somehow, rather than answering "yes" to every CPO
that is not one of `set_value`, `set_error`, `set_done`, or `connect`.
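
Reusing the names from the model above, one possible shape of such an
opt-in (a sketch only, not a proposal for the actual unifex API) is to
flip the default and have genuine receiver queries opt in explicitly:
```
// Default: a CPO is NOT a receiver query unless it opts in.
template <typename CPO>
inline constexpr bool is_a_receiver_query = false;

// Genuine receiver queries opt in one by one.
template <>
inline constexpr bool is_a_receiver_query<get_scheduler_t> = true;
```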

1. https://github.com/facebookexperimental/libunifex/blob/main/include/unifex/any_unique.hpp#L44
2. https://github.com/facebookexperimental/libunifex/blob/main/include/unifex/any_sender_of.hpp#L187
3. https://github.com/facebookexperimental/libunifex/blob/main/include/unifex/any_unique.hpp#L64
@facebook-github-bot added the CLA Signed label on Sep 29, 2023
@jesswong merged commit 98d4e99 into main on Sep 29, 2023
@ispeters deleted the fix_asan_error_in_any_sender_destruction branch on October 2, 2023