-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stackless coroutines #1823
Stackless coroutines #1823
Conversation
} | ||
} | ||
|
||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d like to see some elaboration on how dropping happens in case a panic inside a coroutine occurs. Especially interesting to my currently sleep deprived brain is such case:
yield a;
drop(b);
panic!("woaw");
drop(c);
return d;
where all of a
, b
, c
and d
are moved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The drop semantics are identical to enum variant drop semantics. Variables are then just lifted up into a variant, as in:
enum State {
bb1 { x: usize },
bb2 { x: usize, y: usize }
...
}
loop {
match state {
bb1 { x } => { ... state = bb2 { x, y } }
bb2 { x, y } => { ... }
}
}
A drop in bb1
naturally drops on the variable in scope. This is what I do in stateful and the semantics all seem to work out quite well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nagisa a way of handling it would be an empty "Panic" variant. Just after each yield point, the locals are restored from the state enum, and the enum is set to this "Panic" variant.
If an actual panic happens the locals are dropped as usual, and the state enum is empty, so everything is fine. Otherwise the new state is stored at the next yield. Sort of like the classic option dance.
Invoking a coroutine which has been left in the Panic variant just panics again.
|
||
invalid: { | ||
... | ||
std::rt::begin_panic(const "invalid state!") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MIR-speak for this is Assert
terminator with false
for cond.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also seems to me that a somewhat builder-like pattern could be used to avoid this failure case. Namely, if coroutine instead was:
enum CoResult<Y,R,C> {
Yield<Y, C>,
Return<R>
}
and the coroutine closure was required to be passed around via value, then you’d evade the issue with cleanups on panic within a coroutine and also would never be able to “call” coroutine again after it has already returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same thing applies to Iterator
s, which could've returned Option<(Self, Self::Item)>
instead of Option<Self::Item>
. I'm not sure if that's why we didn't do it, but two drawbacks are that it's less ergonomic to use manually, and that it precludes object safety.
Here, I think a third drawback would be that, as a consequence, it would preclude a blanket Iterator
impl
.
(I think it would have been a better idea to have done this for Iterator
as well in the first place, but...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I regret this not being done. I think this would have been the killer usecase for #1736 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we had &move
, we could avoid the object safety issues..... :/
Can a regular function be a coroutine as well? For example if the async I/O example was much larger, I'd like to extract some code from the closure and put it in a function. Maybe this can be solved by making that function return a coroutine itself, although that would be confusing. Other than that I've seen an alternative proposal on IRC which has some advantages and drawbacks compared to this one. |
With the caveat that I'm massively multitasking, a few things:
|
I like this proposal, but I would like to explore what happens pushing it even further: I believe we could just merge the |
``` | ||
|
||
### Fn* traits | ||
Coroutines shall implement the `FnMut` trait: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This information is quite crucial and I think it would be easier to read the RFC if somewhere near the beginning there was a sentence like “Coroutine is compiled to a closure which implements FnMut(Args...) -> CoResult<Y, R>
trait. For example, the coro1
coroutine is an impl FnMut() -> CoResult<i32, ()>
”.
@tomaka, @camlorn, @ranma42: Regular functions cannot be merged with coroutines. Coroutines need to keep their state somewhere (the "coroutine environment"), so a regular function pointer cannot point to coroutine. Coroutines can be merged with closures, and this is exactly what I am proposing. In fact, to the outside observer a coroutine is not distinguishable from a "normal" closure that just happens to return And yes, regular functions can return coroutines: see all the examples at the end. |
@camlorn
Python's To do this in Rust, we'd have to add a third variant into CoResult: enum CoResult<'a, Y: 'a, R> {
Yield(Y),
YieldFrom(&'a Iterator<Item=Y>),
Return(R)
} and callers would have to deal with this new variant explicitly. IMO, this isn't worth it. Inlining should take care of the trivial inner loops. |
My point about yield from was more along the lines of the compiler just writing the inner loop, though I do understand that this can be done with a macro. The interesting question is whether such a construct opens up optimization opportunities, for example letting us flatten multiple coroutines into one, and if those opportunities are meaningful. |
@vadimcn I'm confused, |
This is a very important observation and should be the centerpiece of the proposal IMO. I'm slightly worried the ergonomics of generators might suffer with this model, but I like it. I'm also wary about a typing ambiguity with tuples, for example one |
Not sure how it works in ES6.
I was referring to this. I remember reading a longer piece about optimizing nested yields several years ago (not even sure it was about Python), but all my searches come back empty so far. :( Maybe I'll find it later. Edit: If |
@vadimcn Yeah, that's an optimization for the fact that there is no optimizing compiler. |
The former one would be typed |
@vadimcn Huh, have a I seen a draft that used |
@vadimcn: Nice! This is very much along the lines of what I was exploring in stateful, which never was supposed to be a long term solution to this problem. MIR is definitely the way to go. A couple observations:
macro_rules! await {
($e:expr) => ({
let mut future = $e;
loop {
match future.poll() {
Ok(Async::Ready(r)) => break Ok(r),
Ok(Async::NotReady) => yield,
Err(e) => break Err(e.into()),
}
}
})
} In order to let the coroutine to handle errors if it so chooses.
enum CoResult<Y, R> {
Yield(Y),
Return(R),
}
trait Coroutine {
type Yield;
type Return;
fn resume(&mut self) -> CoResult<Self::Yield, Self::Return>;
}
trait CoroutineRef {
type Yield;
type Return;
fn resume<'a>(&'a mut self) -> CoResult<&'a Self::Yield, &'a Self::Return>;
}
trait CoroutineRefMut {
type Yield;
type Return;
fn resume<'a>(&'a mut self) -> CoResult<&'a mut Self::Yield, &'a mut Self::Return>;
}
|
I thought I did? here
Good idea!
Yes, I sorta implicitly assumed we'd do that (unless there are parsing problems).
I am afraid you lost me here...
I'm thinking this should be doable with syntax extensions:
...But do we really need that? Feels a bit like parroting C# and Python.
What would these buy us over FnMut? |
@vadimcn sorry, when I wrote "function" in the previous message I actually meant "closures". You mention that to an outside observer they will be the same. I am suggesting that we might also try to make them look the same from the point of view of the compiler. |
Something like this I guess? fn main() {
some_lib::do_something(|| -> CoResult<u8, String> {
yield 12;
foo();
"hello world".to_owned()
});
}
fn foo() -> impl FnOnce() -> CoResult<u8, ()> {
|| {
yield 54;
}
} I don't understand how that would work. Also I don't think this syntax would obtain the usability seal of approval. |
@tomaka Instead of |
@vadimcn What I had in mind was async I/O. A more real-world example would be this: let database = open_database();
server.set_callback_on_request(move |request| {
move || {
if request.url() == "/" {
yield* home_page(&database)
} else if request.url() == "/foo" {
yield* foo_route(&database)
} else {
error_404()
}
}
});
fn home_page(database: &Database) -> impl Coroutine<(), Response> {
move || {
let news = await!(database.query_news_list());
templates::build_home_page(&news)
}
}
fn foo_route(database: &Database) -> impl Coroutine<(), Response> {
move || {
let foo = await!(database.query_foos());
let file_content = await!(read_file_async("/foo"));
templates::build_foo(&foo, &file_content)
}
}
fn error_404() -> Response {
templates::build_404()
} Usually I'm not the first one to complain about an ugly syntax, but here it looks ugly even to me. I'm also not sure how async I/O would work in practice. In your example there's no way to wake up a coroutine when an async I/O has finished. If the idea is to yield objects that implement I saw a proposal on IRC the other day where you were able to pass a "hidden" parameter (in the sense that it doesn't appear in the arguments list) when executing a coroutine, and you could await on a value |
@vadimcn @eddyb |
@petrochenkov As far as I understand, suspend-up refers to stackless coroutines while suspend-down refers to stackful coroutines. |
While I like the overall design of the RFC, I'm worried that the presence of a yield statement changes the behaviour of return statements, and the return type of the closure. Compare for example a fn once<T>(value: T) -> impl Iterator<Item = T> {
move || {
yield value;
return ();
}
}
fn empty<T>() -> impl Iterator<Item = T> {
|| {
return CoResult::Return(());
}
} Here the empty iterator needs to return Essentially, I think the distinction should be explicit, since a coroutine which happens to never yield should be valid. Something like |
@plietar Proposal: Come up with and implement coroutine traits, then make a shorthand for constraining a type parameter to be a coroutine. Introduce the I like @erickt |
Does anyone think we should not have immovable generators based immovable types? Immovable types RFC An alternative to immovable types is to allocate generators on the heap (which is C++'s solution). This turns out to be quite messy however, especially when dealing with allocation failure. It doesn't guarantee 1-allocation per task (in futures-rs terms) like immovable types does, but most allocations can be optimized away. Immovable generators would allow references to local variables to live across suspend points. This almost recovers normal rules for references. If the resumption of generators can take We could also restrict argument to generators to |
Also, this raises an interesting point: With closures, it seems that the divide is how |
So, the discussion on this thread is still fairly active, but my feeling is that we're probably not ready to accept this RFC yet. I tend to think it should be postponed, with conversation moving onto internals thread, and perhaps with some nice summaries capturing the highlights of this conversation and others in the related area. That said, I'm basically way behind on this thread right now, and perhaps I am wrong. Does anyone think I am totally off-base in wanting to postpone? |
@mark-i-m You can have immovable and movable generators which both implement the same An more flexible and complex alternative is to have immovable generators return a movable value which implements a |
This means that all types of generators must have the same method signatures, though... right? In other words, having different traits means that you can have different method signatures, just as with closures. I don't know if this is the right approach for generators, though... I don't really understand how we could get the same effect with
👍 This seems very much like an idea presented way earlier in the thread (but I cannot find it) by someone else: have each yield produce both a yield value and the next value of the generator itself. Personally, I find this more elegant than having a generator with mutable state hanging around that may or may not move. @nikomatsakis This thread has a lot of interesting discussion, but I don't think is really conclusive in any direction. It is a big discussion of different design alternatives from syntax, to underlying implementation, to immovability, to typing. I think I have kept up with the thread more or less, and I would also really love a(nother) summary of the different points. |
+1 I guess that a proof of concept of the feature can be implemented as a procedural macro when Macros 1.2 is ready. |
@ayosec A proof of concept can be created with a nightly compiler plugin already. |
Can you please explain in more detail, how this would work? That returned value would still need to contain both the borrowed object and the reference, so how would it stay movable? @nikomatsakis: I'll try to come up with a summary soon. But yeah, there's a lot of people unconvinced about various aspects of this design. I think it will take a proof-of-concept implementation to show that this is viable. |
Would it? You could conceive of a system in which the generator-generator does not contains self-references but only metadata. When the generator is later constructed, the metadata is consumed to produce an immovable generator (immovable because it now actually contains self-references), which can then be consumed to produce the next generator-generator and a value. It does sound like a rather elaborate system though... |
Well, yes. That's why I'd like to hear more. |
@mark-i-m There isn't a need for different signatures for movable and immovable generators. A single trait suffices. However having generators returning a new generator when resumed is incompatible with immovable generators. @vadimcn The generator-generator would only need to contain the upvars (like regular closures) and wouldn't have a storage for values crossing suspend points. So it will remain movable until we construct the proper immovable generator which does have this storage. An issue I ran into when playing with an implementation of generators is figuring out when an generator implements OIBITs. For example, our generator can't implement An related issue is the layout of generators. The same set of values decides the size of the generator. We really want to run MIR optimizations on generators before committing to a layout. If we allow |
@rfcbot fcp postpone Based on the last few comments, I am going to move that we postpone this RFC. It seems clear that there isn't quite enough consensus to adopt anything just yet, and more experimentation is needed. I think it would be very helpful if someone could try to produce a summary on internals of the many points raised in this thread, and we can continue the conversation there. |
Team member @nikomatsakis has proposed to postpone this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
```rust | ||
impl<T> [T] { | ||
fn iter(&'a self) -> impl DoubleEndedIterator<T> + 'a { | ||
|which_end: mut IterEnd| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: this should be |mut which_end: IterEnd|
, right?
So, let me try to cap off the discussion by enumerating outstanding issues and where I stand on them. Asynchronous StreamsIMO, the biggest issue with this RFC at the moment, is handling of async streams from futures-rs. Some people (@eddyb was first, I think) had proposed adding a third variant, I agree that this would work nicely for async streams. However... if we abstract ourselves from the needs of futures-rs, this approach leaves a feeling of incompleteness. One might ask, for instance: Why only one extra variant? Why not make this extensible to an arbitrary number of "extra" cases? So, after some head-scratching, I think I've found a way to implement both futures and async stream generators using the same await!() macro. It does require an extra macro to "yield" a future, but otherwise works out quite nicely, IMO. The Self-Borrowing Problemi.e. the inability to hold references to coroutines's "local" variables across a This restriction comes about because borrowing of local variables that are hoisted into the coroutine closure is equivalent to having a regular structure, which stores a reference to another part of self. Obviously, such a structure cannot be moved, because that would invalidate the references! This problem merits a bigger discussion and perhaps a separate RFC, but briefly, I think there are two possible ways we can go here: Ignore the problem (for now).I would say that the "Self-Borrowing Problem" is not as acute as it might appear: in most cases the borrowed value will be external to the coroutine and thus will not cause self-borrowing. Make self-borrowing coroutines immovableWe could allow self-borrowing, but then we must make sure that such coroutines do not ever get moved. Well, "ever" is perhaps too strict: we'd still want to be able to encapsulate creation of a coroutine in a function, and thus we must be able to return them. A reasonable compromise would be that a self-borrowing coroutine becomes pinned after its first invocation. Here's an RFC by @Zoxc, that proposes a possible mechanism for doing that. Btw, here's another approach to making data immovable - without introducing any new features into the language. The downside is that this will conflict with mutable borrowing needed to invoke a Anyhow, let's continue discussion of self-borrowing on Discourse forums (perhaps in this thread ?) TraitsSome folks feel that
I am leaning towards 👎 on putting this in the language, for the following reasons:
Coroutine Declaration SyntaxSome people feel that inference of "coroutine-ness" from the presence of However, coming up with a pleasant and concise syntax proved to be not so easy, mainly because prefixing any unreserved keyword to a closure is ambiguous with I've come to agree that we need some way of forcing a closure into being a coroutine. If nothing else, this is needed for writing reliable macros, otherwise macro writers would have to perform considerable gymnastics to handle the "degenerate coroutine" (i.e. one that never yields) case. That said, I also think it is fine to make such declarations optional. The experience of other languages (C# and Python) shows that inferring "coroutine-ness" from presence of A very basic solution for the "macro problem", proposed earlier in this thread, might be this:
Top-Level Generator FunctionsSome people would like to have syntax sugar for declaring a top-level function returning a coroutine, e.g.
instead of
Again, I am skeptical that we need this, because:
"Coroutines"Some people took issue with the name "stackless coroutines" (and wanted to call these "generators"?). Sheesh.
I rest my case. impl Clone for coroutinesWith respect to cloning, coroutines are not any different than regular closures. When (if) Rust implements cloning for closures, coroutines will get it for free. So - not in scope for this RFC. |
And here's IRLO thread I've created in case anyone has left anything to say. Thanks for participation in this discussion, everyone! |
match $e { | ||
Ok(r) => break Ok(r), | ||
Err(ref e) if e.kind() == ::std::io::ErrorKind::WouldBlock => yield, | ||
Err(e) => break Err(e.into(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small typo (no closing parenthesis)
@withoutboats still waiting on your FCP comment here: #1823 (comment) @vadimcn thanks for that great summary! |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
The final comment period is now complete. |
I'm closing as postponed, per previous discussion. Be sure to check out the final summary and continue discussion on internals. Thanks @vadimcn! |
Support for resumable functions in Clang (as of n4649) and on LLVM 4.0 itself seems fine; it can also do really nice optimizations on chained coroutines (see this presentation by Gor Nishanov). Are you considering using these? These semantics worked really nice on C++, and for sure they would be even better in Rust! |
Resumable functions have to use unsafe pointer tricks to pass values in and out. |
Given resurgence of interest in async IO libraries, I would like to re-introduce my old RFC for stackless coroutine support in Rust, updated to keep up with the times.
The proposed syntax is intentionally bare-bones and does not introduce keywords specific to particular applications (e.g.
await
for async IO). It is my belief that such uses would be better covered with macros and/or syntax extensions.Rendered