Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin the future on the heap in tests to prevent stack overflows #5136

Closed
wants to merge 1 commit into from

Conversation

Hocuri
Copy link

@Hocuri Hocuri commented Oct 27, 2022

Motivation

When we switched from async_std to tokio, some tests started overflowing their stack, which we fixed by setting opt_level = 1 in [profile.dev] back then. This, however, increased the compilation times a lot, so it would be nice to be able to set opt_level = 0 at least for tests. See #2055 for an old issue about this, #4009 for another PR for this issue, and deltachat/deltachat-core-rust#3666 for a PR in our repo that would fix this by not using the macro and instead writing out the code that would be generated.

I don't know why this didn't happen with async_std.

Solution

The first thing I tried was adding lots of #[inline(always)] as outlined by @blasrodri in #2055 (comment), and boxing the future in more places similarly to https://github.com/tokio-rs/tokio/pull/4009/files. Both is in Hocuri@14cad1a, but that didn't fix our tests.

Directly boxing the future, as done in this PR here, does fix our tests, however.

What do you think?

@blasrodri
Copy link
Contributor

Nice! I remember the suggestion at some point was to box ASAP. I tried but never occurred to me to do it so early.

@carllerche
Copy link
Member

I don't think boxing here unconditionally is the right answer. As a workaround, you can always box yourselves. Instead, could you provide a failing test and we can look more into it?

@Noah-Kennedy
Copy link
Contributor

I'm in agreement with Carl here. We should focus on why we are blowing the stack.

@Darksonn Darksonn added A-tokio Area: The main tokio crate M-task Module: tokio/task labels Oct 27, 2022
@Hocuri
Copy link
Author

Hocuri commented Oct 28, 2022

I have no idea how to build a failing test, though. I mean,

    #[tokio::test(flavor = "multi_thread", worker_threads = 2)]
    async fn test_tt() {
        let data = [0; 110_224];
        tokio::time::sleep(std::time::Duration::from_millis(1)).await;
        println!("{}", data[100_000]);
    }

(adapted from #2055) does fail, but it also fails with async_std. What's similar to the failing tests in our deltachat crate is that both tests are fixed by boxing the future.

So, I don't have a failing test except:

git clone https://github.com/deltachat/deltachat-core-rust.git
cd deltachat-core-rust
# Remove the `opt-level = 1` line from Cargo.toml
cargo test test_secure_join
cargo test test_setup_contact

Also, obviously there are some "XY mentioned this issue" in #2055 in case one of them makes it easier to find out what's the problem.

@Hocuri
Copy link
Author

Hocuri commented Oct 28, 2022

As a workaround, you can always box yourselves.

Yes, and yet another workaround is to increase the stack size.

@Noah-Kennedy
Copy link
Contributor

My view of this is that this situation represents a bug, either in tokio, or in user code. Whenever futures are that big, there is a problem.

Either way, I'd say that the best next step is to figure out why futures are that big.

@Darksonn
Copy link
Contributor

The problem here is most likely that we pass it through too many stack frames. I believe that a better solution would be to have every method other than the one called directly by the user take Pin<&mut F> so that only a pointer is passed between the stack frames.

@Darksonn Darksonn closed this Oct 29, 2022
@Noah-Kennedy
Copy link
Contributor

That's my guess as well here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-task Module: tokio/task
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants