Allow for and while loop to return value (without discussion of keyword names) #1767

Closed
JelteF opened this issue Oct 6, 2016 · 40 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@JelteF

JelteF commented Oct 6, 2016

Introduction

I noticed #961 didn't seem to go anywhere because of the discussion about the name of the new block. That's why I just started working on an RFC with the new block name that IMHO seemed to have the fewest issues, namely the !break one. However, I found out that a lot of behavioural details were not actually decided on or discussed.

What this discussion is about (spoiler: not names of keywords)

This is why I want to start a new thread to discuss these details and not the names of keywords. The names used in this issue are not meant as final keywords for an RFC. They are simply meant to allow a discussion without confusion. After the issues here are settled, the names for the keywords can be discussed again in #961 or in a new issue.

Problems

The new proposal would allow for and while to return a value with the break statement, just as proposed for loop in #1624. We need extra behaviour because for and while can exit without a break. However, this can happen in two different ways. Either the loop body is never executed (for with an empty iterator or while with a false condition at the start), or the body is executed at least once, reaches the end of the loop body or a continue statement, and then the loop stops (e.g. because it's out of elements or the condition has become false).

Solutions

For the case where the body is never executed, another block is required that can return a value; this block is from here on called noloop. For the cases where the loop body is executed at least once, I see three main solutions. They are listed below, with the short names used in the rest of this issue in parentheses:

  1. Use the same noloop block as is used in the other case. ("noloop")
  2. Use a second new block, from here on called nobreak. ("nobreak")
  3. Use the value from the last statement that is executed, which can be continue. ("last-statement")

Then there are also two possible combinations of the options above:

  1. Make the nobreak block optional and "noloop" when it is not added. ("nobreak-noloop")
  2. Make the nobreak block optional and "last-statement" when it is not added. ("nobreak-last-statement")

My opinion on these solutions

The "noloop" option seems like the worst choice, as it does not allow differentiating between the two cases. The "nobreak" option seems better in this regard, as it allows different behaviour for the two cases, but it is more verbose when the behaviour should be the same. The "nobreak-noloop" solution allows different behaviour for the two cases, but is concise when this is not needed.

The "last-statement" option has a big usability advantage over the previous ones, because it allows returning a value used in the last iteration. However, this comes with the disadvantage that, when this is not needed, all possible endpoints need to return the same value, i.e. the end of the loop and all continue statements. With the "nobreak-last-statement" solution you can work around this disadvantage by using the nobreak block.

This reasoning is why I think the solutions can be ordered from most desirable to least in the following order:

  1. "nobreak-last-statement"
  2. "last-statement"
  3. "nobreak-noloop"
  4. "nobreak"
  5. "noloop"

Please comment if you have other ideas that could be used, or if you have a different opinion on the proposed solutions.

@vitiral

vitiral commented Oct 6, 2016

If the loop finishes without its body being entered, then no break is ever reached, which means that the !break block makes complete sense. I would not special-case this.

There are no other cases where a loop can exit without breaking -- except for return or break 'location, which are irrelevant because the value of the break is never used.

I'm a little confused about this. Other than the name of the !break block, there isn't much confusion (in my mind) about the syntax or structure. It would simply be something like:

let x = while i < 10 {
    if i == 6 {
        break "found";
    }
    if i == 3 {
        return Err("I was odd <= 3");
    }
    i += 2;
} !break {
    "not found"
};

There is no need for a noloop in that case (because !break covers it -- if it didn't enter the loop it also didn't encounter the break so would go to the !break)
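A minimal sketch of how the same control flow can be written on stable Rust today, assuming Rust 1.65 or later (where labeled blocks can break with a value). The function name `search_from` and the label `'search` are hypothetical and only serve the illustration; the early `return` from the example above is left out to keep it short.

```rust
/// Emulates the proposed `while ... !break { ... }` form.
/// `search_from` is a hypothetical name used only for this sketch.
fn search_from(start: i32) -> &'static str {
    let mut i = start;
    // A labeled block can be exited early with a value.
    let outcome = 'search: {
        while i < 10 {
            if i == 6 {
                break 'search "found";
            }
            i += 2;
        }
        // Reached only when no `break 'search` ran, including the case
        // where the loop body was never entered. This is the role the
        // proposed `!break` block would play.
        "not found"
    };
    outcome
}
```

Calling `search_from(0)` reaches `i == 6` and yields "found", while `search_from(10)` never enters the loop body and falls through to the trailing expression.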

@JelteF
Author

JelteF commented Oct 6, 2016

You might be right. The main reason why I thought special casing it would be nice, was so you could use the last iterator value as the return value in case nothing was found. Like the following:

let x = for i in [1, 4, 3, 2] {
    if i == 6 {
        break 6;
    }
    i + 10
} noloop {
    0
}

However, I'm not able to quickly think of a real use case for this. And it would also mean that the last statement needs to be executed each loop execution although it is only used the last time.

@vitiral

vitiral commented Oct 6, 2016

That is the precise reason that !break needs to exist though. Your example requires !break, because that is what handles not encountering any break statements. The noloop would be separate and could not have anything to do with "the last iterator value" -- if the loop never executed then there would not be a "last iterator value"!

You are proposing adding an additional branch in the case that the loop is never entered (noloop). However, this case is already covered by !break -- since if it didn't enter the loop it also didn't encounter break -- so I don't think it would be a good idea.

The "last loop value" is a non-starter anyway IMO -- it would be extremely confusing to work with and unintuitive.

@nrc nrc added the T-lang Relevant to the language team, which will review and decide on the RFC. label Oct 6, 2016
@burdges

burdges commented Oct 6, 2016

Ick! I strongly prefer doing this sort of thing with return via closures or nested function in languages with those, way more clear and declarative than some strange keyword soup. If I saw one of our grad students using this sort of language feature, then I'd maybe make them use a closure instead.

I suppose #1624 seems okay because loop already denotes an unfamiliar loop, so everyone expects strange flow control, but even there I'd hope tail call optimizations eventually make people lose interest in this loop return value business.

That said, if one wants this sort of feature, there are two reasonable options that avoid needing any keywords :

If you like expressions everywhere, then just make for and while return an enum that describes what happened. I'd suggest just an Option<T> where T is the type of the final expression in the body, and the type supplied by breaks and continues, so None always means the loop never ran, and or_else(), etc. all work. You might consider something more complex, but.. why? You need a value from continue anyways. It'd interact with say #2974 but so does if now. Appears Option<T> gets messy here, but an enum the compiler handled special could avoid breaking existing code.

If you're less dedicated to expressions, and happy to ask for more breaks, then #1624 could be tweaked to allow :

let x = 'a { 
    for i in ... {
        ...
        if foo { break 'a bar; }
        ...
    }
    baz
};

In both these cases, there is much less ambiguity for someone reading the code who comes from another language that may do other things.
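The closure approach mentioned at the top of this comment can be sketched as follows. This is only an illustration under assumed names (`first_even_or`, `find`); it is not taken from any RFC text.

```rust
/// Returns the first even value in `values`, or `fallback` if there is none.
/// `first_even_or` is a hypothetical helper used only for this sketch.
fn first_even_or(values: &[i32], fallback: i32) -> i32 {
    // Inside the closure, `return` plays the role that `break value`
    // would play in a value-producing `for` loop.
    let find = || {
        for &v in values {
            if v % 2 == 0 {
                return v;
            }
        }
        // Reached when the loop ran out of elements or never ran at all.
        fallback
    };
    find()
}
```

Here the trailing `fallback` expression covers both the "never entered" and the "ran to completion" cases, with no new keyword involved.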

@vitiral

vitiral commented Oct 6, 2016

@burdges what do you think of this:

let x = if y > 10 {
    "big"
} else {
    "small"
};

This is "expressions return values" and is a core principle of Rust control flow. These RFCs aim to make it even more clear.

The enum doesn't work, as has been discussed in #961

@notriddle
Contributor

I have to disagree pretty heavily with anyone proposing last statement.

First of all: it's not backwards compatible.

Second: it can't borrow anything that the conditional also borrows. For example, we should be able to do this:

let k = for i in &mut v {
    if i == x { break i }
} else {
    x
}

This isn't possible to duplicate with the last statement, because the value in the last statement needs to be held in a temporary variable while the condition is evaluated (i borrows the iterator, preventing next() from being called on it until it's gone). So while the version I have would work, it wouldn't work for this:

let k = for i in &mut v {
    if i == x { break }
    i
} noloop {
    x
}

The big problem is that it desugars to this:

let vi = v.iter_mut();
let mut k;
let mut ran = false;
while let Some(i) = vi.next() {
    if i == x { break }
    k = i;
    ran = true;
}
if !ran { k = x }

Except the compiler knows k always gets initialized.

@JelteF
Author

JelteF commented Oct 7, 2016

It could easily be made backwards compatible. It would only be enabled when break returns a value or when the noloop block would be added.

Your second statement is a very good reason not to do that though.

@nagisa
Member

nagisa commented Oct 7, 2016

One thing to note (wrt keywords) is that since the last RFC we’ve added our first contextual keyword. Although I wouldn’t recommend adding another, it is still an option to consider.

As far as behaviour goes, @notriddle’s point is fair and has no solution (you oughtn’t work around the borrow checker here). Remembering the result of the last expression in every iteration is probably something that will not happen for this reason. This makes result-ful for and some while loops significantly less useful as well.

Probably the only viable thing to consider is running that extra block if the loop wasn’t exited through break EXPR.

@vitiral

vitiral commented Oct 7, 2016

Yes, the only viable syntax (with all possible complications) is:

let x = while i < 10 {
    if i == 6 {
        break "found";  // x == "found"
    }
    if i == 3 {
        return Err("i was odd <= 3"); // x will never be set
    }
    if i == -1 {
        panic!("negative numbers!"); // x will never be set
    }
    i += 2;
} !break {
    // only run if `break` never returned a value in above
    // loop (including if the loop was never entered)
    "not found" // x == "not found"
};

The only real discussion is whether there must be a ; after breaks with values -- which I think there should be (just like there is for return).

@Ericson2314
Contributor

Ericson2314 commented Oct 7, 2016

return .. and panic!(..) don't need a semicolon in your example, though.

@vitiral

vitiral commented Oct 7, 2016

#1624 (comment) discussed this, and you are right -- ; is not necessary for return, therefore it shouldn't be necessary for break.

I had been of the opinion that it was necessary, but I think it is just conventional.

@taralx

taralx commented Oct 8, 2016

I believe with the current compiler architecture it would be possible to discriminate between loops whose values are used (and thus require type unification) and those whose values are not used (e.g. due to use of ;). It seems to me that this would provide the better ergonomics.

@joshtriplett
Member

Nominating this for @rust-lang/lang discussion, since it seems to have gotten a bit lost.

Also: would this conflict with potential ways to use for/while as iterators, rather than just returning a single value?

@nikomatsakis
Contributor

Personally I'm just inclined to leave this as it is. I feel it's "enough" for loop to allow a value to be returned.

@porky11

porky11 commented Feb 24, 2017

I'd prefer for/while/etc. {} else {}, with else only required if there is a break in the loop and the return value is used.
In the case of loop it may also be useful to add variables that are only in scope of the loop; otherwise most return values seem useless, since you have to declare them before the loop anyway.

loop results = Vec::new(), other_var = 1 {
    //fill vector
    if cond { break results }
}

@nrc
Member

nrc commented Mar 9, 2017

I also feel like we shouldn't do anything here - it seems like there is no nice solution and the use case is not strong enough for something complex.

@JelteF
Author

JelteF commented Mar 10, 2017

I think this discussion was basically over, because everyone was agreeing on the way that @vitiral proposed. @nrc I'm not sure why you deem that something complex.

@nikomatsakis
Contributor

nikomatsakis commented Mar 15, 2017

I am strongly disinclined to include else for loops or anything beyond if. I think the meaning of this is very unclear. As evidence, I submit this survey from PyCon (Python includes this feature):

survey results

Note that fully 75% of respondents either did not know or guessed the wrong semantics (the correct answer being "the else only executes if there was no break in the while loop").

@JelteF
Author

JelteF commented Mar 15, 2017

@nikomatsakis, that's why !break was suggested instead in the other thread, which is very clear in my opinion.

@nikomatsakis
Contributor

@JelteF I guess my feeling remains that the need to "produce" values from for loops (or while loops) is really an edge case; almost every time I want it, I realize I could phrase what I want with an iterator. The fact that we're being forced to reach for confusing (else) or unfamiliar and highly unconventional (!break) keywords/syntax in service of an edge goal seems like it's not worth it.

At minimum, I would want to wait until loop { break } is stabilized and in widespread use. That would give us more data on whether we commonly want similar things from a for loop. (Even stabilizing loop/break is currently under dispute precisely because it sees so little use, perhaps partly because it is unstable.)
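As an illustration of the "phrase it with an iterator" point, here is a minimal sketch. The helper name `describe_first_above` and its strings are assumptions made up for the example; only `Iterator::find`, `Option::map` and `Option::unwrap_or_else` are real standard-library calls.

```rust
/// Describes the first value in `values` that is greater than `limit`.
/// `describe_first_above` is a hypothetical helper used only for this sketch.
fn describe_first_above(values: &[i32], limit: i32) -> String {
    values
        .iter()
        // `find` stops at the first match, like `break value` would.
        .find(|&&v| v > limit)
        .map(|v| format!("found {}", v))
        // The fallback covers "no element matched", including an empty slice.
        .unwrap_or_else(|| "not found".to_string())
}
```

The `Option` returned by `find` already distinguishes "a match stopped the search" from "the search ran out of elements", which is the distinction the extra block is meant to express.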

@vitiral

vitiral commented Mar 15, 2017

In my opinion, if for and while loops cannot return values, then none should. Complexity always has a cost, but non-uniform/special-case complexity is the worst

@solson
Member

solson commented Mar 15, 2017

But loop is inherently different from for and while because

  1. loop's body is guaranteed to be entered.
  2. loop can only be terminated with a break, so it's obvious where its value comes from.

The language even treats loop {} foo(); and while true {} foo(); differently, where the former gets an unreachable statement warning and the latter does not.

So there's nothing particularly arbitrary about stabilizing loop { break value; } alone. The non-uniform complexity in this situation is due to the pre-existing differences between these loops. This choice falls out naturally as a result.
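A small sketch of why the value of loop is unambiguous: every exit goes through a break that names the value. The helper name `power_of_two_at_least` is hypothetical, and the sketch assumes `n` is small enough that doubling `p` does not overflow a `u32`.

```rust
/// Smallest power of two that is greater than or equal to `n`.
/// Hypothetical helper; assumes `n` is small enough that `p` never overflows.
fn power_of_two_at_least(n: u32) -> u32 {
    let mut p: u32 = 1;
    // `loop` can only finish through `break`, so its value always
    // comes from the `break p` below.
    loop {
        if p >= n {
            break p;
        }
        p *= 2;
    }
}
```

There is no "fell off the end" path here, which is exactly the case a for or while loop would need an extra block (or an `Option`) to cover.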

@withoutboats
Contributor

I don't agree at all that it's not uniform. for and while are sugar which expands to a loop containing breaks that evaluate to (). This issue basically proposes a syntax for injecting an expression into those sugared-over break statements to allow non-unit breaks to unify with them.
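The desugaring can be sketched roughly as below. This is a simplified illustration, not the compiler's exact expansion (the real one goes through `IntoIterator::into_iter`); the function names are hypothetical.

```rust
/// Sums a slice with an ordinary `for` loop.
fn sum_sugared(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

/// The same loop written out by hand (simplified sketch of the expansion).
fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = values.iter();
    // The whole loop has type `()` because its only exit is a bare `break`.
    let _unit: () = loop {
        match iter.next() {
            Some(v) => total += *v,
            // The hidden `break` that the proposals want to attach a value to.
            None => break,
        }
    };
    total
}
```

Seen this way, a trailing block on for or while would supply the expression for the `None => break` arm.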

@solson
Member

solson commented Mar 15, 2017

@withoutboats That's the compiler writer's (or language designer's) POV, but it seems clear from previous discussion that people disagree about what the syntax should be and what it should do for while/for. I'm worried that not enough people have the same understanding of the desugaring.

@glaebhoerl
Contributor

There were multiple other possibilities for syntax discussed in the other thread fwiw, many of which are less "highly unconventional" than !break.

@Boscop

Boscop commented Mar 15, 2017

Since loop has to break eventually (or loop forever), it makes sense to enable it to evaluate to a value when breaking. But for and while don't have to break.
If we had "break with value" for for and while, it wouldn't be clear what the value should be if they don't break. One possible solution is with else:

let r = while cond() {
    if cond2() { break 42 }
    foo();
} else {
    52
};

Edit: Thinking about it more, I think it would make sense to add "loop break value" for for and while (for consistency) but only if no new keywords are introduced for this small feature. So I think we should use else like above. I know it looks weird at first when seeing an else without an if, but for newcomers it's not more confusing than if let syntax, or nested ifs like this:

if if cond1() {
    foo();
    cond11()
} else {
    cond2()
} {
    bar();
} else {
    baz();
}

Or:

if {
    foo();
    cond()
} {
    bar();
} else {
    baz();
}

Or:

if match x {
    42 => true,
    _ => false,
} {
    bar();
} else {
    baz();
}

Or:

if let Some(x) = if cond() {
    foo();
    a
} else {
    b
} {
    bar();
} else {
    baz();
}

Or:

while {
    let r = foo();
    bar(&r);
    cond(&r)
} {
    baz();
}

Which occur often enough in real-world code that tries to minimize mutable state.

@nrc nrc removed the I-nominated label Apr 6, 2017
@withoutboats
Contributor

We decided at the lang team meeting to close this issue. We're not inclined to allow for or while loops to evaluate to anything but () for now. We were all very much in agreement with Niko's earlier comment that evaluating these loops is an edge case and all of the proposed solutions have too great a downside.

@JelteF
Author

JelteF commented Apr 8, 2017 via email

@exprosic

exprosic commented Feb 8, 2020

I would really love to have this feature, so I try to make my proposal:

  • be useful in everyday situations
  • have a clear and intuitive formal semantics and typing rule
  • avoid the semantical confusion of else

Long story short, the while loop is followed by a then clause:

while BOOL {BLOCK1} then {BLOCK2}

This should be desugared to, and therefore have the same type and semantics as:

loop {if (BOOL) {BLOCK1} else {break {BLOCK2}}}

just as the usual while loop

while BOOL {BLOCK1} // then {}

has always been desugared to

loop {if (BOOL) {BLOCK1} else {break {}}}

It requires a bit more care for for but the story remains basically the same.

Note that the break in the then clause is harmless but redundant, since it will be desugared to break (break ...).

The choice of then over else or final is explained in #961

I would suggest then instead of final, since in all currently popular languages where it exists, final(ly) means the exact opposite of getting executed only when not being break-ed before, which is getting executed whatsoever. then would avoid the sort of naming tragedy like return in the Haskell community.

then also avoids the semantical confusion brought by else, since it naturally has a sequential meaning (I eat, then I walk) in parallel with its role in the conditional combination (if/then). In places where it joins two blocks ({ ... } then { ... }) instead of a boolean and a block (x<y then { ... }), the sequential semantics prevails intuitively.

This syntax can be used wherever the loop is meant to find something instead of to do something. Without this feature, we usually do the finding and then put the result somewhere, which is a clumsy emulation of simply finding something.

For example:

while l<=r {
  let m = (l+r)/2;
  if a[m] < v {
    l = m+1
  } else if a[m] > v {
    r = m-1
  } else {
    break Some(m)
  }
} then {
  println!("Not found");
  None
}

which means:

loop {
  if (l<=r) {
    let m = (l+r)/2;
    if a[m] < v {
      l = m+1
    } else if a[m] > v {
      r = m-1
    } else {
      break Some(m)
    }
  } else {
    break {
      println!("Not found");
      None
    }
  }
}

Even this desugared version is cleaner than something like

{
  let mut result = None;
  while l<=r {
    let m = (l+r)/2;
    if a[m]<v {
      l = m+1
    } else if a[m]>v {
      r = m-1
    } else {
      result = Some(m);
      break
    }
  }
  if result.is_none() {
    println!("Not found");
  }
  result
}
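For reference, the desugared loop form above already compiles on stable Rust once it is wrapped in a function. The sketch below is one way to do that; the name `binary_search_loop` is hypothetical, and it deliberately uses a half-open range `[lo, hi)` instead of the inclusive `l..=r` bounds above so that unsigned indices cannot underflow. The `println!` is dropped to keep the function free of side effects.

```rust
/// Binary search over a sorted slice, written in the desugared
/// `loop { if cond { body } else { break then_value } }` shape.
/// `binary_search_loop` is a hypothetical name used only for this sketch.
fn binary_search_loop(a: &[i32], v: i32) -> Option<usize> {
    // Half-open search range [lo, hi); an assumption of this sketch.
    let (mut lo, mut hi) = (0usize, a.len());
    loop {
        if lo < hi {
            let m = lo + (hi - lo) / 2;
            if a[m] < v {
                lo = m + 1;
            } else if a[m] > v {
                hi = m;
            } else {
                break Some(m);
            }
        } else {
            // This arm is what the proposed `then` block would hold.
            break None;
        }
    }
}
```

Both exits are `break` expressions of the same type, which is why the whole loop can be used directly as the function's value.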

@scooby

scooby commented Dec 1, 2020

What if the clause is next to the conditional?

let var = while BOOL else EXPR { BLOCK }

You're normally reading a while loop as:

  1. Evaluate the conditional.
  2. If true, execute the block.
  3. Else...

This way, else is in proximity to the conditional and the assignment, which I think helps a reader make that association.

And break breaks out of a while loop, so it never hits the conditional. So if you have the intuition that the conditional either goes into the block or stops and returns the else clause, then it makes more sense that the break must bypass else entirely.

@JohnDowson

The type of a valued for loop would have to be Either<(), T>.
This means that there could be a keyword that would desugar to something like:

let foo = for {} or else_value;
let foo = match for {} {
    Left(()) => else_value,
    Right(for_result) => for_result,
};

This makes sense to me, because we are logically ORing together a value of a for loop with some other expression.

@tbagrel1

Hello,
I just read the discussion thread, and I'm currently not involved at all in Rust development, but I'm really interested in the evolution of the language design (and BTW I just came across a case where I needed a for loop with a return value today at work).

I really don't know if this is ridiculous or not, but maybe for and while loops could return an Option<T> value, where T is the type of the value returned via the break statement. With this solution, we avoid the need for a new keyword or the need to reuse an existing one (which would be weird/unclear, as others pointed out), and we would rely on the omnipresent Option type to indicate specifically that a value might or might not be produced by the loop.

P.S.: After rereading @JohnDowson's last post, his Either<(), T> is kinda equivalent to Option<T>. Using the already established Option<T> would remove the need for another either-like sum type.

I'm really eager to learn, so if my idea is completely stupid, I would really be happy to get some (honest) comments on it :)

@porky11

porky11 commented Jul 20, 2021

@tbagrel1 That also came into my mind. But it would be a breaking change.

Currently this is possible:

let x = for _ in 0..5 {};
assert_eq!(x, ());

It's stupid, and no one would do this, but in a case like this, it would break real code:

fn test() {
    for _ in 0..5 {}
}

Currently this compiles, after the change it would break.

But in a new edition, I think, this might be a good idea.
It probably only affects the second case in real code, which shouldn't be too difficult to upgrade automatically using cargo fix --edition.
This change would also still allow for-else loops, in case this feature will also be added later.
And if this change will happen, we might also change a single case if to return an option.

It might have been a good idea in the first place to let simple expressions, which always return the empty tuple, return an unusable value instead, which would force people to add semicolons after these expressions. I'm not sure, but that's what the never type (!) is meant for, right?

@tbagrel1

@porky11
In my original idea, only for loops containing at least one break <expr> would return Option; for loops without break, or with "empty" break statements, would still return (). But I don't know if it is possible and/or a good idea.

I'm not sure about unusable values; AFAIK I think it would make the closure syntax (when one returns ()) a bit too heavy, for little benefit

@porky11

porky11 commented Jul 21, 2021

@tbagrel1 In this case, it should work. But I would assume break without an expression to be equivalent to break ().
This might be confusing.

But I also got a similar idea, which is more general and should not break anything. The else case could always return the default value.

  • break 5 => implicit else case is 0
  • break Some(...) => implicit else case is None
  • break vec![1, 2, 3] => implicit else case is the empty Vec

And this should be generalized to ifs as well.

I often have code like this:

let value = if cond {
    Some(value)
} else {
    None
};

let vec = if cond {
    vec
} else {
    Vec::new()
};

Maybe that's worth a new RFC

@tbagrel1

for loops with break statements without value could return Option<()> instead of () (it might be more consistent, but I'm not sure), but that would be a breaking change, because

fn main() {
    for _ in 0..5 {
        break
    }
} 

would no longer be valid.
I don't think that relying on default value is a good idea though. It would restrict break statements to types implementing the Default trait, and would be confusing. For example, the famous find_index function couldn't use break statements, because when the requested value is not found, the returned index would be 0 (instead of the -1 value usually used when an element is not found, or way better, the None value).

Moreover, let my_vec = if cond { vec![1,2,3] }, with an implicit empty vec value when the condition is false does not make a lot of sense in my mind when I read it.
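The find_index concern can be made concrete with a short sketch. `find_index` here is a hypothetical helper built on the real `Iterator::position`; the point is only to show what an implicit default would hide.

```rust
/// Index of the first element equal to `target`, if any.
/// `find_index` is a hypothetical helper used only for this sketch.
fn find_index(values: &[i32], target: i32) -> Option<usize> {
    values.iter().position(|&v| v == target)
}

/// What an implicit `Default` fallback would amount to: a missing
/// element and a match at index 0 both come out as `0`.
fn find_index_or_default(values: &[i32], target: i32) -> usize {
    find_index(values, target).unwrap_or_default()
}
```

With `[4, 5, 6]`, searching for 4 and searching for 9 both give `0` through `find_index_or_default`, while `find_index` keeps them apart as `Some(0)` and `None`.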

@porky11

porky11 commented Jul 24, 2021

fn main() {
    for _ in 0..5 {
        break
    }
} 

would no longer be valid.

That's what I was expecting. Your first suggestion, that break without value always returns nothing (()) would not break anything.

I don't think that relying on default value is a good idea though.

You could still always return Some(...).
I just think, implicitly returning an option is not such a good idea.
I know, relying on default, is also not perfect, but it's more flexible.

Especially if you want to add an else case, you would also have to change the if case, which now has to make Some explicit.

Moreover, let my_vec = if cond { vec![1,2,3] }, with an implicit empty vec value when the condition is false does not make a lot of sense in my mind when I read it.

This might be useful when parsing a file, which defines values for different keywords.

You could do something like this:

let x = if file_data.contains_keyword("x") {
    file_data.parse_values_as_vec("x")
}

@tbagrel1

We just disagree on this point. I think Option is really more flexible and explicit than your solution. In addition, I'm not a fan of the if assignment with the implicit else case (using default); it doesn't make sense in my head when I read it (it seems really counterintuitive to me).
Implicits are rarely a good idea.

@truppelito

truppelito commented Jul 23, 2022

I love that Rust makes many code constructs into expressions. I really like the fact that I can write let result = loop { do_stuff(); ... break a_value; };. And I've also run into cases where I would've liked to do it with for and while loops (in fact, the first time I tried it I was actually surprised I couldn't do it). I came across this RFC so I would like to offer my 2 cents, based on a first principles and minimum design effort approach...

Since if, match and loop can all be used as expressions, I think we should fill in the gap for for and while.

Consider (where the breaks are somewhere in the loop block):

let x = loop {
    ...
    break val_1;
    ...
    break val_2;
    ...
};

Since the only way to exit the loop is through the breaks, we can be sure x is initialized before being used. This works the same way as a function/closure:

let x = || -> Type {
    ...
    return val_1;
    ...
    return val_2;
    ...
}();

Now consider a while loop (a for loop works exactly the same way, whenever I talk about while from now on, consider for as well):

let x = while condition {
    ...
    break val_1;
    ...
    break val_2;
    ...
};

The only difference I can see here is that, unlike loop, while is not guaranteed to run a break statement. This can happen for multiple different reasons (never enters the while loop, enters the loop but condition is false before a break is executed, etc). However, I don't care about the exact reason. The fact is: either a break runs, or the loop exits without running a break. This is a clear binary option. So the obvious way to deal with this is that the type of x is Option<T> (where type of val_i is T).

I suspect that

let x: Option<f64> = while condition {
    ...
    break 1.0;
    ...
};

would not surprise anyone who writes Rust code. I think it would in fact feel very familiar.

Also, if you then want to run code based on the break not running (!break mentioned above), just match on the None case of the x value. No need for new keywords.
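Until something like this exists, the same shape can be approximated by moving the loop into a function that returns `Option<T>`. The sketch below is only an approximation under assumed names (`first_multiple_at_or_below`, `describe_search`), and it assumes `divisor` is non-zero.

```rust
/// Counts down from `start` and returns the first value divisible by
/// `divisor`. Hypothetical helper; assumes `divisor != 0`.
fn first_multiple_at_or_below(start: u32, divisor: u32) -> Option<u32> {
    let mut n = start;
    while n > 0 {
        if n % divisor == 0 {
            // Stands in for `break n` in the proposed form.
            return Some(n);
        }
        n -= 1;
    }
    // The loop ended (or never ran) without hitting the "break".
    None
}

/// Matching on the result replaces a dedicated `!break` block.
fn describe_search(start: u32, divisor: u32) -> String {
    match first_multiple_at_or_below(start, divisor) {
        Some(n) => format!("hit the break with {}", n),
        None => "no break ran".to_string(),
    }
}
```

Because the result is an ordinary `Option`, the usual combinators apply, for example `first_multiple_at_or_below(3, 4).unwrap_or_default()`.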

But what about a break that doesn't return a value? Well, I think that while c { break; } returning Option<()> would be cumbersome. Sometimes we don't care about whether the break ran or not. So to me this makes sense:

// No explicit value in break
let x: () = loop { break; };
let x: () = while c { break; };

// With explicit value in break
let x: () = loop { break (); };
let x: Option<()> = while c { break (); };
let x: f64 = loop { break 1.0; };
let x: Option<f64> = while c { break 1.0; };

That is, if I explicitly state a value to be returned, then the type of the loop/while expressions are T/Option<T>. If I just type break;, then the type is (). This seems very consistent to me, and also allows cases in while where we don't actually want to return a value, but want to know whether we hit the break or not. In that case, we would have break ();, return type would be Option<()> and we would match on Some(()) (hit the break) or None (didn't hit the break). Admittedly, this sounds like a niche case to me, however, it is possible, and clear.

Importantly, all of this is opt-in, explicit, consistent and (I think) backwards compatible:

  • Want to break out of a loop? Write break;.
  • Want to break out of a loop, but know when you hit the break or not? Write break ();.
  • Want to break out of a loop with a value? Write break value;.

The middle case doesn't make much sense for loop, so both break; and break (); do the same thing [1] (which actually to me is the sign of a good design), but otherwise all three cases apply in a consistent manner to all three loop types.

But all this can be done with iterators!

I understand where you're coming from: I'm also not in favor of needless options and complexity. But I don't like this argument in this case. This is not a needless option, and it does not add substantial complexity.

IMO this is a feature that, as stated, has a very low cognitive effort to learn and use. It does not require any new keywords, it fills a gap in the syntax of the language in a (IMO) logical way (also it does not add any major complexity to the language to fill this gap), it is composable with other constructs of the language (because it uses Option)...

Also iterators, as much as I love them, and a substantial part of my code is iterators, I would not use them in all cases. After all, we still have for loops. Personally, I find that the iterator syntax is a bit heavier and sometimes the control/data flow is easier to understand when using loops. And while prototyping I can just write a loop directly, adding stuff to it as I see fit (and once the code is finished, maybe I leave it as is, or I convert to iterators), whereas for iterators I always have to make a game plan in my head before I start writing the first line of code.

Plus, with this suggestion, we can have break and return inside the same loop (break "returns" to the variable, and return goes out of the function), something that while I'm sure you can always make it work with iterators, is just much easier with loops.

In the end, I think the cost of adding some overlap in language functionality is more than compensated by the fact that it does so with almost no complexity or cognitive load, and it does allow some functionality for people who (at least in some cases) prefer loops to iterators, and also has the advantage of break and return inside the same loop.

PS: Above it was mentioned using default as the return value. This could be done simply with let x: f64 = while c { break 1.0; }.unwrap_or_default();. Because we are using Option, this feature is composable with other common language constructs.

Footnotes

  1. Perhaps a lint could be added to convert break (); to break; in this case.

@truppelito

truppelito commented Jul 23, 2022

An extra note: I saw this argument against having a distinction between break; and break ();. I propose the following reframing:

let x: () = (|| { return; })();
let x: () = loop { break; };
let x: () = while c { break; };

let x: () = (|| { return (); })();
let x: () = loop { break (); };
let x: Option<()> = while c { break (); };

let x: f64 = (|| { return 1.0; })();
let x: f64 = loop { break 1.0; };
let x: Option<f64> = while c { break 1.0; };

In cases where return/break must be called to advance control flow, break/return; and break/return (); are equivalent. Otherwise, they are different.
