Add lint for never type regressions #68350

Closed
wants to merge 25 commits into from

Member

@Aaron1011 Aaron1011 commented Jan 18, 2020

Fixes #67225
Fixes #66173

TL;DR: This PR introduces a lint that detects the 'bad' never-type fallback in objc (which results in undefined behavior), while allowing safe fallback to occur without a warning.

See https://hackmd.io/@FohtAO04T92wF-8_TVATYQ/SJ0vcjyWL for some background on never-type fallback.

The problem

We want to reject "bad" code like this:

fn unconstrained_return<T>() -> Result<T, String> {
    let ffi: fn() -> T = transmute(some_pointer);
    Ok(ffi())
}
fn foo() {
    match unconstrained_return::<_>() {
        Ok(x) => x,  // `x` has type `_`, which is unconstrained
        Err(s) => panic!(s),  // … except for unifying with the type of `panic!()`
        // so that both `match` arms have the same type.
        // Therefore `_` resolves to `!` and we "return" an `Ok(!)` value.
    };
}

in which enabling never-type fallback can cause undefined behavior.

However, we want to allow "good" code like this:

struct E;
impl From<!> for E {
    fn from(x: !) -> E { x }
}
fn foo(never: !) {
    <E as From<_>>::from(never);
}

fn main() {}

which relies on never-type fallback being enabled, but is perfectly safe.

The solution

The key difference between these two examples lies in how the result of never-type fallback is used. In the first example, we end up inferring the generic parameter of unconstrained_return to be !. In the second example, we still infer a generic parameter to be ! (Box::<!>::new(!)), but we also pass an uninhabited parameter to the function.

Another way of looking at this is that the call to unconstrained_return is potentially live - none of its arguments are uninhabited, so we might (and in fact, do) end up actually executing the call at runtime.

In the second example, Box::new() has an uninhabited argument (the ! type). This means that this call is definitely dead - since the ! type can never be instantiated, it's impossible for the call to ever be executed.

This forms the basis for the check. For each method call, we check the following:

  1. Did the generic arguments have unconstrained type variables prior to fallback?
  2. Did any of the generic arguments become uninhabited after fallback?
  3. Are all of the method arguments inhabited?

If the answer to all of these is yes, we emit an error. I've left extensive comments in the code describing how this is accomplished.

These conditions ensure that we don't error on the Box and From<!> examples, while still erroring on the bad objc code.
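The three conditions above can be modelled outside the compiler. The following is only a minimal sketch under stated assumptions: `ToyTy`, `GenericArg`, `Call` and `should_report` are hypothetical names invented for illustration and are not the types used in this PR. Conditions 1 and 2 are folded into a single per-argument test, which is one possible reading of the list.

```rust
/// Toy stand-in for a type after fallback (hypothetical, not rustc's `Ty`).
#[derive(Debug)]
enum ToyTy {
    Never,
    Unit,
    Int,
    Boxed(Box<ToyTy>),
}

impl ToyTy {
    /// Rough inhabitedness check: `!` is uninhabited, and so is a box of it.
    fn is_inhabited(&self) -> bool {
        match self {
            ToyTy::Never => false,
            ToyTy::Boxed(inner) => inner.is_inhabited(),
            ToyTy::Unit | ToyTy::Int => true,
        }
    }
}

/// One generic argument of a call, as the hypothetical check sees it.
struct GenericArg {
    /// Condition 1: did it contain an unconstrained variable before fallback?
    had_unconstrained_var: bool,
    /// The type of the argument once fallback has run.
    after_fallback: ToyTy,
}

struct Call {
    generic_args: Vec<GenericArg>,
    value_args: Vec<ToyTy>,
}

/// Returns true when all three conditions hold for `call`.
fn should_report(call: &Call) -> bool {
    // Conditions 1 and 2: an unconstrained generic argument became uninhabited.
    let bad_fallback = call
        .generic_args
        .iter()
        .any(|g| g.had_unconstrained_var && !g.after_fallback.is_inhabited());
    // Condition 3: every value argument is inhabited (vacuously true with no
    // arguments), so the call is potentially live.
    let potentially_live = call.value_args.iter().all(ToyTy::is_inhabited);
    bad_fallback && potentially_live
}

fn main() {
    // Shaped like `unconstrained_return::<_>()`: `T` fell back to `!`, no arguments.
    let live = Call {
        generic_args: vec![GenericArg { had_unconstrained_var: true, after_fallback: ToyTy::Never }],
        value_args: vec![],
    };
    // Shaped like `<E as From<_>>::from(never)`: same fallback, but the argument is `!`.
    let dead = Call {
        generic_args: vec![GenericArg { had_unconstrained_var: true, after_fallback: ToyTy::Never }],
        value_args: vec![ToyTy::Never],
    };
    // A parameter that fell back to an inhabited type is never reported.
    let unit = Call {
        generic_args: vec![GenericArg { had_unconstrained_var: true, after_fallback: ToyTy::Unit }],
        value_args: vec![ToyTy::Int, ToyTy::Boxed(Box::new(ToyTy::Unit))],
    };
    assert!(should_report(&live));
    assert!(!should_report(&dead));
    assert!(!should_report(&unit));
}
```

In this model the only thing separating the "bad" call from the "good" one is the uninhabited value argument, which matches the potentially-live versus definitely-dead distinction described above.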

Further notes

You can test out this branch with the original bad objc code as follows:

  1. Clone https://github.com/Aaron1011/rust-objc
  2. Checkout the bad-fallback branch.
  3. With a local rustc toolchain built from this branch, run cargo build --example example
  4. Note that you get an error due to an unconstrained return type

Unresolved questions

  1. This lint only checks method calls. I believe this is sufficient to catch any undefined behavior caused by fallback changes. Since the introduced undefined behavior relies on actually 'producing' a ! type instance, the user must be doing something 'weird' (calling transmute or some other intrinsic). I don't think it's possible to trigger this without some kind of intrinsic call - however, I'm not 100% certain.

  2. This lint requires us to perform extra work during the type-checking of every single method. This is not ideal - however, changing this would have required significant refactoring to method type-checking. It would be a good idea to do a perf run to see what kind of impact this has, and whether another approach will be required.

  3. This 'lint' is currently a hard error. I believe it should always be possible to fix this error by adding explicit type annotations somewhere (though in the objc case, this may be in the caller of a macro). Unfortunately, I think actually emitting any kind of suggestion to the user will be extremely difficult. Hopefully, this error is so rare that the lack of suggestion isn't a problem. If users are running into this with any frequency, I think we'll need a different approach.

  4. If this PR is accepted, I see two ways of rolling this out:

     1. If the bad objc crate is the only crate known to be affected, we could potentially go from no warning/lint to a hard error in a single release (coupled with enabling never-type fallback).

     2. If we're worried that this could break a lot of crates, we could make this into a future compatibility lint. At some point in the future, we could enable never-type fallback while simultaneously making this a hard error.

What we should not do is make the never-type fallback changes without making this lint (or whatever lint ends up getting accepted) into a hard error. A lint, even a deny-by-default one, would be insufficient, as we would run a serious risk of introducing undefined behavior without any kind of explicit acknowledgment from the user.

@rust-highfive
Collaborator

r? @varkor

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 18, 2020
@Centril
Contributor

Centril commented Jan 18, 2020

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion

@bors
Contributor

bors commented Jan 18, 2020

⌛ Trying commit d4bd422 with merge 873f3b0...

bors added a commit that referenced this pull request Jan 18, 2020
Add lint for never type regressions

Comment on lines +632 to +672
/// Stores additional data about a generic path
/// containing inference variables (e.g. `my_method::<_, u8>(bar)`).
/// This is used by `NeverCompatHandler` to inspect
/// all method calls that contain inference variables.
///
/// This struct is a little strange, in that its data
/// is filled in from two different places in typecheck.
/// The two `Option` fields are written by `check_argument_types`
/// and `instantiate_value_path`, since neither method
/// has all of the information it needs.
#[derive(Clone, Debug)]
struct InferredPath<'tcx> {
    /// The span of the corresponding expression.
    span: Span,
    /// The type of this path. For method calls,
    /// this is a `ty::FnDef`
    ty: Option<Ty<'tcx>>,
    /// The types of the arguments (*not* generic substs)
    /// provided to this path, if it represents a method
    /// call. For example, `foo(true, 25)` would have
    /// types `[bool, i32]`. If this path does not
    /// correspond to a method, then this will be `None`
    ///
    /// This is a `Cow` rather than a `Vec` or slice
    /// to accommodate `check_argument_types`, which may
    /// be called with either an interned slice or a Vec.
    /// A `Cow` lets us avoid unnecessary interning
    /// and Vec construction, since we just need to iterate
    /// over this
    args: Option<Cow<'tcx, [Ty<'tcx>]>>,
    /// The unresolved inference variables for each
    /// generic substs. Each entry in the outer vec
    /// corresponds to a generic substs in the function.
    ///
    /// For example, suppose we have the function
    /// `fn foo<T, V>() { ... }`.
    ///
    /// The method call `foo::<MyStruct<_#0t, _#1t>, bool>()`
    /// will have an `unresolved_vars` of `[[_#0t, _#1t], []]`
    unresolved_vars: Vec<Vec<Ty<'tcx>>>,
}

Would be good to move this into never_compat to keep all the logic required for that more self-contained. This file is also already huge and I would prefer not to add anything more miscellaneous to it.
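For readers unfamiliar with the `Cow` choice described in the doc comment on `args`, here is a minimal standalone sketch of the idea. It assumes plain `u32` elements in place of `Ty<'tcx>`, and `store_args` is a hypothetical helper, not a function from this PR.

```rust
use std::borrow::Cow;

/// Accepts either a borrowed slice or an owned `Vec` without copying:
/// both convert into `Cow<[u32]>` through the standard `From` impls.
fn store_args<'a>(args: impl Into<Cow<'a, [u32]>>) -> Cow<'a, [u32]> {
    args.into()
}

fn main() {
    // Stand-in for an already-interned slice: stored by reference.
    let interned: &[u32] = &[1, 2];
    let borrowed = store_args(interned);
    // Stand-in for a freshly built `Vec`: stored by value.
    let owned = store_args(vec![3, 4, 5]);

    assert!(matches!(borrowed, Cow::Borrowed(_)));
    assert!(matches!(owned, Cow::Owned(_)));

    // Either variant can be iterated as an ordinary slice.
    let total: u32 = borrowed.iter().chain(owned.iter()).sum();
    println!("{}", total); // prints 15
}
```

Because both variants dereference to a slice, code that only iterates over the stored arguments does not need to know which form the caller supplied.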

Comment on lines +3798 to +3822
let fn_inputs = fn_inputs.into();
debug!("check_argument_types: storing arguments for expr {:?}", expr);
// We now have the argument types available for this method call,
// so store them in the `inferred_paths` entry for this method call.
// We set `ty` as `None` if we are the first to access the entry
// for this method, and leave it untouched otherwise.
match self.inferred_paths.borrow_mut().entry(expr.hir_id) {
    Entry::Vacant(e) => {
        debug!("check_argument_types: making new entry for types {:?}", fn_inputs);
        e.insert(InferredPath {
            span: sp,
            ty: None,
            args: Some(fn_inputs.clone()),
            unresolved_vars: vec![],
        });
    }
    Entry::Occupied(mut e) => {
        debug!(
            "check_argument_types: modifying existing {:?} with types {:?}",
            e.get(),
            fn_inputs
        );
        e.get_mut().args = Some(fn_inputs.clone());
    }
}

Ditto with this bit. Let's move the logic into a new function inside never_compat.
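The pattern in the quoted hunks, where two call sites each fill in one half of a shared map entry, can be shown in isolation. This is a sketch under assumptions: `PathInfo`, `record_args` and `record_ty` are hypothetical names, with `u32` keys and `String` payloads standing in for `HirId` and `Ty<'tcx>`.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Toy version of a record whose two fields are written from different places.
#[derive(Debug, Default, PartialEq)]
struct PathInfo {
    ty: Option<String>,
    args: Option<Vec<String>>,
}

/// Writer 1: knows the argument types but not the path type.
fn record_args(map: &mut HashMap<u32, PathInfo>, id: u32, args: Vec<String>) {
    match map.entry(id) {
        // First writer for this id: create the record with `ty` unset.
        Entry::Vacant(e) => {
            e.insert(PathInfo { ty: None, args: Some(args) });
        }
        // The other writer got here first: only fill in our field.
        Entry::Occupied(mut e) => {
            e.get_mut().args = Some(args);
        }
    }
}

/// Writer 2: knows the path type but not the argument types.
fn record_ty(map: &mut HashMap<u32, PathInfo>, id: u32, ty: String) {
    match map.entry(id) {
        Entry::Vacant(e) => {
            e.insert(PathInfo { ty: Some(ty), args: None });
        }
        Entry::Occupied(mut e) => {
            e.get_mut().ty = Some(ty);
        }
    }
}

fn main() {
    // The two writers may run in either order; the final record is the same.
    let mut a = HashMap::new();
    record_args(&mut a, 7, vec!["bool".to_string(), "i32".to_string()]);
    record_ty(&mut a, 7, "fn(bool, i32)".to_string());

    let mut b = HashMap::new();
    record_ty(&mut b, 7, "fn(bool, i32)".to_string());
    record_args(&mut b, 7, vec!["bool".to_string(), "i32".to_string()]);

    assert_eq!(a, b);
}
```

When the value type implements `Default`, as this toy one does, `map.entry(id).or_default().args = Some(args)` would be a shorter equivalent. The explicit `match` mirrors the shape of the hunks, which log differently in each arm.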

Comment on lines +5510 to +5541
if ty_substituted.has_infer_types() {
    debug!(
        "instantiate_value_path: saving path with infer: ({:?}, {:?})",
        span, ty_substituted
    );
    let parent_id = tcx.hir().get_parent_node(hir_id);
    let parent = tcx.hir().get(parent_id);
    match parent {
        Node::Expr(hir::Expr { span: p_span, kind: ExprKind::Call(..), .. })
        | Node::Expr(hir::Expr { span: p_span, kind: ExprKind::MethodCall(..), .. }) => {
            // Fill in the type for our parent expression. This might not be
            // a method call - if it is, the arguments will be filled in by
            // `check_argument_types`
            match self.inferred_paths.borrow_mut().entry(parent_id) {
                Entry::Vacant(e) => {
                    debug!("instantiate_value_path: inserting new path");
                    e.insert(InferredPath {
                        span: *p_span,
                        ty: Some(ty_substituted),
                        args: None,
                        unresolved_vars: vec![],
                    });
                }
                Entry::Occupied(mut e) => {
                    debug!("instantiate_value_path: updating existing path {:?}", e.get());
                    e.get_mut().ty = Some(ty_substituted);
                }
            }
        }
        _ => {}
    }
}

..and let's move this part as well into a new function.

use rustc_data_structures::fx::FxHashMap;
use rustc_hir::HirId;

/// Code to detect cases where using `!` (never-type) fallback instead of `()` fallback

Shouldn't this be a module doc comment (//!)?


let mut err = tcx
.sess
.struct_span_err(span, "Fallback to `!` may introduce undefined behavior");

Suggested change
- .struct_span_err(span, "Fallback to `!` may introduce undefined behavior");
+ .struct_span_warn(span, "fallback to `!` may introduce undefined behavior");


I'm happy with keeping this an error for the purposes of a crater run, but this needs to become a warning (or a lint, depending on your opinions re. --cap-lints) before the PR can land.

@Centril Centril added I-nominated T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Jan 18, 2020
@Centril
Contributor

Centril commented Jan 18, 2020

r? @nikomatsakis

@rust-highfive rust-highfive assigned nikomatsakis and unassigned varkor Jan 18, 2020
@Centril
Contributor

Centril commented Jan 18, 2020

I'll schedule a crater run as well when the try builder is done. Please avoid pushing to the branch meanwhile.

@rust-highfive
Collaborator

The job x86_64-gnu-llvm-7 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2020-01-18T18:32:30.5294787Z ========================== Starting Command Output ===========================
2020-01-18T18:32:30.5298097Z [command]/bin/bash --noprofile --norc /home/vsts/work/_temp/81e89bee-3806-4203-895b-3409a34b3328.sh
2020-01-18T18:32:30.5298258Z 
2020-01-18T18:32:30.5301482Z ##[section]Finishing: Disable git automatic line ending conversion
2020-01-18T18:32:30.5355179Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/68350/merge to s
2020-01-18T18:32:30.5356877Z Task         : Get sources
2020-01-18T18:32:30.5356905Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-01-18T18:32:30.5356933Z Version      : 1.0.0
2020-01-18T18:32:30.5356967Z Author       : Microsoft
---
2020-01-18T18:32:32.3332223Z ##[command]git remote add origin https://github.com/rust-lang/rust
2020-01-18T18:32:32.3341798Z ##[command]git config gc.auto 0
2020-01-18T18:32:32.3343750Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2020-01-18T18:32:32.3345451Z ##[command]git config --get-all http.proxy
2020-01-18T18:32:32.3350647Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/68350/merge:refs/remotes/pull/68350/merge
---
2020-01-18T19:32:21.9046852Z ---- [ui] ui/never_type/diverging-fallback-control-flow.rs stdout ----
2020-01-18T19:32:21.9047153Z 
2020-01-18T19:32:21.9047624Z error: test compilation failed although it shouldn't!
2020-01-18T19:32:21.9047919Z status: exit code: 1
2020-01-18T19:32:21.9049135Z command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/never_type/diverging-fallback-control-flow.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/never_type/diverging-fallback-control-flow/a" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/never_type/diverging-fallback-control-flow/auxiliary"
2020-01-18T19:32:21.9049960Z ------------------------------------------
2020-01-18T19:32:21.9050244Z 
2020-01-18T19:32:21.9050679Z ------------------------------------------
2020-01-18T19:32:21.9050936Z stderr:
---
2020-01-18T19:32:21.9098781Z 
2020-01-18T19:32:21.9098822Z error: Fallback to `!` may introduce undefined behavior
2020-01-18T19:32:21.9099484Z   --> /checkout/src/test/ui/never_type/diverging-fallback-control-flow.rs:69:23
2020-01-18T19:32:21.9099548Z    |
2020-01-18T19:32:21.9099592Z LL |     let _x = match Ok(BadDefault::default()) {
2020-01-18T19:32:21.9099689Z    |
2020-01-18T19:32:21.9099749Z note: the type parameter Self here was inferred to `!`
2020-01-18T19:32:21.9099995Z   --> /checkout/src/test/ui/never_type/diverging-fallback-control-flow.rs:69:23
2020-01-18T19:32:21.9100040Z    |
2020-01-18T19:32:21.9100111Z LL |     let _x = match Ok(BadDefault::default()) {
2020-01-18T19:32:21.9100200Z note: (type parameter defined here)
2020-01-18T19:32:21.9100459Z   --> /checkout/src/test/ui/never_type/diverging-fallback-control-flow.rs:16:1
2020-01-18T19:32:21.9100504Z    |
2020-01-18T19:32:21.9100542Z LL | / trait BadDefault {
2020-01-18T19:32:21.9100750Z LL | |     fn default() -> Self;
2020-01-18T19:32:21.9100971Z LL | | }
2020-01-18T19:32:21.9101009Z    | |_^
2020-01-18T19:32:21.9101052Z note: ... due to this expression evaluating to `!`
2020-01-18T19:32:21.9101320Z   --> /checkout/src/test/ui/never_type/diverging-fallback-control-flow.rs:69:14
2020-01-18T19:32:21.9101377Z    |
2020-01-18T19:32:21.9101420Z LL |       let _x = match Ok(BadDefault::default()) {
2020-01-18T19:32:21.9101524Z LL | |         Ok(v) => v,
2020-01-18T19:32:21.9101733Z LL | |         Err(()) => return,
2020-01-18T19:32:21.9101812Z LL | |     };
2020-01-18T19:32:21.9101854Z    | |_____^
2020-01-18T19:32:21.9101902Z    = note: If you want the `!` type to be used here, add explicit type annotations
2020-01-18T19:32:21.9101936Z 
2020-01-18T19:32:21.9101999Z error: Fallback to `!` may introduce undefined behavior
2020-01-18T19:32:21.9102299Z   --> /checkout/src/test/ui/never_type/diverging-fallback-control-flow.rs:76:23
2020-01-18T19:32:21.9102346Z    |
2020-01-18T19:32:21.9102408Z LL |     let _x = match Ok(BadDefault::default()) {
2020-01-18T19:32:21.9102497Z    |
2020-01-18T19:32:21.9102559Z note: the type parameter Self here was inferred to `!`
2020-01-18T19:32:21.9102944Z   --> /checkout/src/test/ui/never_type/diverging-fallback-control-flow.rs:76:23
2020-01-18T19:32:21.9102992Z    |
2020-01-18T19:32:21.9103221Z LL |     let _x = match Ok(BadDefault::default()) {
2020-01-18T19:32:21.9103333Z note: (type parameter defined here)
2020-01-18T19:32:21.9103570Z   --> /checkout/src/test/ui/never_type/diverging-fallback-control-flow.rs:16:1
2020-01-18T19:32:21.9103629Z    |
2020-01-18T19:32:21.9103667Z LL | / trait BadDefault {
---
2020-01-18T19:32:21.9109860Z test result: FAILED. 9493 passed; 1 failed; 50 ignored; 0 measured; 0 filtered out
2020-01-18T19:32:21.9109902Z 
2020-01-18T19:32:21.9109925Z 
2020-01-18T19:32:21.9109948Z 
2020-01-18T19:32:21.9111403Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/ui" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--mode" "ui" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-7/bin/FileCheck" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "7.0.0\n" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--llvm-components" "" "--llvm-cxxflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"
2020-01-18T19:32:21.9111629Z 
2020-01-18T19:32:21.9111671Z 
2020-01-18T19:32:21.9111912Z thread 'main' panicked at 'Some tests failed', src/tools/compiletest/src/main.rs:387:22
2020-01-18T19:32:21.9111964Z note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2020-01-18T19:32:21.9112031Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
2020-01-18T19:32:21.9112075Z Build completed unsuccessfully in 0:54:55
2020-01-18T19:32:21.9137734Z == clock drift check ==
2020-01-18T19:32:21.9153682Z   local time: Sat Jan 18 19:32:21 UTC 2020
2020-01-18T19:32:22.1909843Z   network time: Sat, 18 Jan 2020 19:32:22 GMT
2020-01-18T19:32:22.1921375Z == end clock drift check ==
2020-01-18T19:32:22.6815454Z 
2020-01-18T19:32:22.6913124Z ##[error]Bash exited with code '1'.
2020-01-18T19:32:22.6923880Z ##[section]Finishing: Run build
2020-01-18T19:32:22.6942059Z ##[section]Starting: Checkout rust-lang/rust@refs/pull/68350/merge to s
2020-01-18T19:32:22.6943678Z Task         : Get sources
2020-01-18T19:32:22.6943720Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
2020-01-18T19:32:22.6943777Z Version      : 1.0.0
2020-01-18T19:32:22.6943813Z Author       : Microsoft
2020-01-18T19:32:22.6943854Z Help         : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
2020-01-18T19:32:22.6943897Z ==============================================================================
2020-01-18T19:32:23.0957737Z Cleaning any cached credential from repository: rust-lang/rust (GitHub)
2020-01-18T19:32:23.0998208Z ##[section]Finishing: Checkout rust-lang/rust@refs/pull/68350/merge to s
2020-01-18T19:32:23.1098465Z Cleaning up task key
2020-01-18T19:32:23.1099230Z Start cleaning up orphan processes.
2020-01-18T19:32:23.1215398Z Terminate orphan process: pid (3374) (python)
2020-01-18T19:32:23.1476452Z ##[section]Finishing: Finalize Job

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@Centril
Contributor

Centril commented Jan 23, 2020

We discussed this in the T-Lang meeting today but did not reach a conclusion re. rollout. We're awaiting crater data for further evaluation next week.

@Aaron1011
Member Author

@Centril: Are the meeting minutes publicly available?

@Centril
Contributor

Centril commented Jan 23, 2020

I think @nikomatsakis will publish them soon in the lang team repo and we also should have a recording up soon on YouTube.

@nikomatsakis
Contributor

I am particularly curious to see what sorts of false errors we see.

One alternative I wanted to note that might be worth considering:

If we wanted to detect only cases where the "type variables which fell back to ! appear in live code" (TV_fall), that doesn't have to be as complex as you suggested. The typeck code is already detecting dead code for the purposes of issuing warnings, so what I imagined doing was collecting a set of HirId for "dead expressions" -- I don't imagine these sets should be so big -- and then, during writeback, checking whether the variables in TV_fall appear in a type associated with some HirId that is not in that set.

This has the advantage of naturally avoiding, I think, all of the false errors that we've seen? I'm not sure if it has its own set of false errors, though.
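The check described above can be sketched as a toy model. This is only an illustration, not rustc's writeback code: `HirId` and `TyVar` are plain integers here, and `live_fallback_uses` and its map of "type variables mentioned by each expression's type" are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

// Toy stand-ins: in rustc these would be real HIR ids and inference variables.
type HirId = u32;
type TyVar = u32;

/// Returns the ids of expressions that are NOT in the dead set and whose
/// type mentions at least one variable that fell back to `!` (TV_fall).
fn live_fallback_uses(
    expr_ty_vars: &HashMap<HirId, Vec<TyVar>>,
    dead: &HashSet<HirId>,
    tv_fall: &HashSet<TyVar>,
) -> Vec<HirId> {
    let mut hits: Vec<HirId> = expr_ty_vars
        .iter()
        .filter(|(id, _)| !dead.contains(*id))
        .filter(|(_, vars)| vars.iter().any(|v| tv_fall.contains(v)))
        .map(|(id, _)| *id)
        .collect();
    // HashMap iteration order is unspecified, so sort for stable output.
    hits.sort_unstable();
    hits
}
```

Only expressions that are both live and typed with a fallback variable are reported; a dead expression with the same type is skipped.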

@nikomatsakis
Contributor

@Aaron1011 the minutes are here, they include a link to the recording

@Aaron1011
Member Author

Aaron1011 commented Jan 24, 2020

The typeck code is already detecting dead code for the purposes of issuing warnings

I thought about doing something like that. However, the current "unreachable code" detection is fairly coarse-grained - it uses a single diverges flag. It wasn't obvious to me that this would work around complex control flow (e.g. nested match expressions, assignment expressions with a block as the LHS, etc).

Additionally, the diverges-based dead-code detection has (in some ways) a different goal from a never-type fallback lint:

The dead-code detection needs to detect dead code with no false positives (so that we don't lint code that's actually live). False negatives (failing to detect dead code) are fine, since this lint is best effort.

A never-type fallback lint needs to detect live code with as few false positives as possible (so we avoid linting code that never actually executes). It might be possible to achieve both of these goals with the same infrastructure, but I think it would be difficult.

This has the advantage of naturally avoiding, I think, all of the false errors that we've seen? I'm not sure if it has its own set of false errors, though.

That wouldn't avoid the let vec = Vec::new(); vec.push(return); case, since the relevant code (Vec::<!>::new()) is actually live. However, it could conceivably avoid the false-positives that I've found and fixed, but with less code.
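For reference, a minimal sketch of that case. The function name `count_pushed` and the early-return value are made up for illustration. The element type of `vec` is constrained only by the diverging `return` expression, so it is chosen by fallback (`()` under the old rules, `!` with never-type fallback enabled), yet the `Vec::new()` call itself is live code.

```rust
// The element type of `vec` is never pinned down by anything except the
// diverging `return` expression, so inference has to fall back.
fn count_pushed(bail_out: bool) -> usize {
    let mut vec = Vec::new(); // live code: always executed
    if bail_out {
        // `return 99` has type `!`; the `push` call itself never runs.
        vec.push(return 99);
    }
    vec.len()
}
```

No element is ever pushed, so the choice of element type cannot be observed here, which is why this case is hard to separate from genuinely problematic code by liveness alone.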

@nikomatsakis
Contributor

That wouldn't avoid the let vec = Vec::new(); vec.push(return); case, since the relevant code (Vec::<!>::new()) is actually live. However, it could conceivably avoid the false-positives that I've found and fixed, but with less code.

I suppose. I intended, I think, to look for live expressions that yielded the type variable ?X directly -- so creating a Vec<?X> would not be considered lintable.

Ah, this reminds me -- in your current code, are you looking for cases where the function returns "some type that mentions the type variable"? Is there a reason not to look for "some function that returns the type variable itself"?

Well, I guess you could have a case like:

let x: Foo<_> = make_foo();
x.method();

where

struct Foo<T> {
  the_function: fn() -> T,
}

impl<T> Foo<T> {
  ...
  fn method(&self) -> T { (self.the_function)() }
}

i.e., the code that produces the T here (which is ultimately derived via fallback) is not in the function where the fallback occurs.
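A self-contained version of this shape might look as follows. The `make_foo` constructor and its `Default` bound are assumptions added only so the sketch compiles, and `caller` pins `T` through its return type rather than through fallback.

```rust
struct Foo<T> {
    the_function: fn() -> T,
}

impl<T> Foo<T> {
    // The `T` value is produced here, far from where `T` was inferred.
    fn method(&self) -> T {
        (self.the_function)()
    }
}

// Hypothetical constructor: stores a function that can produce a `T`.
fn make_foo<T: Default>() -> Foo<T> {
    Foo { the_function: T::default }
}

fn caller() -> u32 {
    // `_` is resolved by inference in this function, not in `method`.
    let x: Foo<_> = make_foo();
    x.method()
}
```

If the `_` were instead resolved by fallback, a check that only looks at the body of `caller` would not see where the `T` value actually comes from.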

In general, the idea of special-casing type constructions is ok, but could be generalized I think to be a test on the return type. Basically if you have a type Foo<T>, it's ok as long as you can't construct a Foo<T> without having a T value...

Hmm, I guess you could also have some kind of function that takes a type parameter T and tries to create it somehow:

fn foo<T>(...) {
    let x: T = ...;
}

and the caller could somehow determine T via fallback. In that case, changing the fallback from () might be wrong. But it's hard to imagine a realistic example of this that would (a) not be dramatically wrong in some other way and (b) compile.

It'd be good to extend my write-up with all of these cases, I think. I'm not sure which fraction of them we need to make sure to lint.

@Aaron1011
Member Author

Aaron1011 commented Jan 24, 2020

Hmm, I guess you could also have some kind of function that takes a type parameter T and tries to create it somehow:

Yeah, that was my concern as well. I wanted to err on the side of linting more code, and seeing if the number of false positives was too high.

If the Crater results come back with more than a few errors that would be solved by checking the return type, I think we should definitely reconsider the current implementation. However, if Crater shows few to no cases where this happens, then I think a small risk of "questionable false positives" (in code not covered by Crater) is worth the increased coverage of potential UB.

@Aaron1011
Member Author

Aaron1011 commented Jan 24, 2020

I suppose. I intended, I think, to look for live expressions that yielded the type variable ?X directly -- so creating a Vec<?X> would not be considered lintable.

Do you mean returning it directly (e.g. fn foo<R>() -> R) without wrapping it in anything else?

My concern is that an objc-like function could be reasonably extended to wrap the return type. I'm imagining something like:

struct FFIReturn<T> {
    val: T,
    timestamp: usize
}

Given that a caller-chosen type parameter is already being returned, it seems perfectly reasonable for such a function to add 'extra' information in some way.
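As a rough sketch of such an extension: the `send_message` signature below is hypothetical, and it takes an ordinary function pointer instead of transmuting one, so the example stays safe. The return type mentions `R` only inside a wrapper, but the function still has to produce an `R` value.

```rust
struct FfiReturn<T> {
    val: T,
    timestamp: usize,
}

// Hypothetical objc-like entry point: the caller chooses `R`, and the
// function returns it together with some extra bookkeeping.
fn send_message<R>(callee: fn() -> R, timestamp: usize) -> FfiReturn<R> {
    FfiReturn {
        val: callee(),
        timestamp,
    }
}
```

A lint that only flags functions returning the bare type variable would not look at `send_message`, even though it yields an `R` just as directly as `fn foo<R>() -> R` would.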

@Aaron1011
Member Author

In general, the idea of special-casing type constructions is ok, but could be generalized I think to be a test on the return type.

My rationale for the special-casing was less to do with the return type, and more to do with such functions being guaranteed to be "sane". In general, we can't know whether the function being called does something "reasonable" or not - except for when it's compiler generated.

@craterbot
Collaborator

🎉 Experiment pr-68350 is completed!
📊 274 regressed and 3 fixed (88601 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Jan 26, 2020
@Aaron1011
Member Author

The lint is triggering on calls to this raise function:

#[inline]
fn raise_e(m: impl Into<String>) -> PErr {
  PErr::Etc { msg: m.into() }
}

#[inline]
fn raise<T>(m: impl Into<String>) -> Result<T, PErr> {
  Err(raise_e(m))
}

...

raise("no sprites")?;

Unfortunately, raise is indistinguishable from the bad objc code from the outside. A call is made with all arguments inhabited, and a type parameter (used in the return type) is inferred to ! due to fallback.
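A self-contained reduction of the pattern, for illustration only: the `PErr` definition and the `load` function are assumptions filled in around the snippet quoted above.

```rust
#[derive(Debug, PartialEq)]
enum PErr {
    Etc { msg: String },
}

fn raise_e(m: impl Into<String>) -> PErr {
    PErr::Etc { msg: m.into() }
}

// Only ever builds the `Err` variant, so no `T` value is produced.
fn raise<T>(m: impl Into<String>) -> Result<T, PErr> {
    Err(raise_e(m))
}

// Hypothetical caller: `T` in `raise::<T>` is unconstrained here, so it is
// chosen by fallback once `?` has been desugared.
fn load(count: usize) -> Result<usize, PErr> {
    if count == 0 {
        raise("no sprites")?;
    }
    Ok(count)
}
```

The call is perfectly safe whichever type `T` falls back to, because the `Ok` variant is never constructed.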

@Aaron1011
Member Author

Report from https://github.com/Mark-Simulacrum/crater-generate-report/:

root: bson - 2 (2 gh, 0 crates.io) detected crates which regressed due to this

root: cis_client - 5 (5 gh, 0 crates.io) detected crates which regressed due to this

  • fiji-flo/dino-park-fence: start v. end
  • fiji-flo/dino-park-fossil: start v. end
  • mozilla-iam/dino-park-evo: start v. end
  • mozilla-iam/dino-park-lookout: start v. end
  • mozilla-iam/dino-park-whoami: start v. end

root: jni - 11 (7 gh, 4 crates.io) detected crates which regressed due to this

root: libimagentrytag - 4 (0 gh, 4 crates.io) detected crates which regressed due to this

root: oauth2 - 3 (2 gh, 1 crates.io) detected crates which regressed due to this

root: proptest-derive - 31 (1 gh, 30 crates.io) detected crates which regressed due to this

  • async-compression-0.2.0: start v. end
  • bitmaps-2.0.0: start v. end
  • im-14.1.0: start v. end
  • im-rc-14.1.0: start v. end
  • metadata-backup-0.1.0: start v. end
  • proptest-derive-0.1.2: start v. end
  • ring-channel-0.8.1: start v. end
  • sized-chunks-0.5.1: start v. end
  • solana-librapay-api-0.20.1: start v. end
  • solana-move-loader-api-0.20.1: start v. end
  • solana-move-loader-program-0.20.1: start v. end
  • solana_libra_bytecode_verifier-0.0.1-sol4: start v. end
  • solana_libra_canonical_serialization-0.0.1-sol4: start v. end
  • solana_libra_config-0.0.1-sol4: start v. end
  • solana_libra_crypto-0.0.1-sol4: start v. end
  • solana_libra_invalid_mutations-0.0.1-sol4: start v. end
  • solana_libra_ir_to_bytecode-0.0.1-sol4: start v. end
  • solana_libra_ir_to_bytecode_syntax-0.0.1-sol4: start v. end
  • solana_libra_language_e2e_tests-0.0.1-sol4: start v. end
  • solana_libra_proptest_helpers-0.0.1-sol4: start v. end
  • solana_libra_proto_conv-0.0.0: start v. end
  • solana_libra_state_view-0.0.1-sol4: start v. end
  • solana_libra_stdlib-0.0.1-sol4: start v. end
  • solana_libra_transaction_builder-0.0.1-sol4: start v. end
  • solana_libra_types-0.0.1-sol4: start v. end
  • solana_libra_vm-0.0.1-sol4: start v. end
  • solana_libra_vm_genesis-0.0.1-sol4: start v. end
  • solana_libra_vm_runtime-0.0.1-sol4: start v. end
  • solana_libra_vm_runtime_types-0.0.1-sol4: start v. end
  • un_algebra-0.9.0: start v. end
  • trollaklass/troll-of-fame-rust: start v. end

root: redis - 32 (17 gh, 15 crates.io) detected crates which regressed due to this

  • root: r2d2_redis_cluster-0.1.5: start v. end
  • root: reredis-0.1.0-alpha.2: start v. end
  • root: rust-qt/generator-example: start v. end

root: rlua - 25 (14 gh, 11 crates.io) detected crates which regressed due to this

root: warp - 73 (53 gh, 20 crates.io) detected crates which regressed due to this

  • root: nstoddard/webgl-gui-demo: start v. end
  • root: xtp-0.1.0-alpha.3: start v. end

root: unknown causes - 70 (46 gh, 24 crates.io) detected crates which regressed due to this

@Aaron1011
Member Author

There seem to be four types of regressions here:

  1. Spurious regressions caused by the fact that impl<T> From<!> for T is a reservation impl, and Infallible is still a separate enum (not a type alias for !). Hopefully, all of these will be fixed by adding the proper impls. These issues aren't directly related to this PR, though.
  2. False positives that occur due to how this lint is designed. For example, sg-sprite and proptest both contain code of the form:

     pub fn bad<T>(something: SomeType) -> Wrapper<T> { ... }

     which, while completely safe (there are no unsafe calls going on), triggers the lint.
  3. Genuinely questionable code caught by this lint. For example, the conquer-once crate is now creating a MaybeUninit::<!>::uninit() instead of a MaybeUninit::<()>::uninit(). While it looks like it won't cause any issues with this crate, I think it makes sense that we're linting about this.
  4. ICEs caused by a bug in my lint. These appear to be hiding actual errors that would have otherwise been reported - I'll investigate further after I fix the bug.

@Aaron1011
Member Author

To summarize:

True positives:
codechain-agent-hub (though a type error would have prevented it from compiling even without the lint)
ps-util (same as above)

False positives:
conquer-once
ffi-support
proptest-derive
bishop-cli
futures
cube-engine
dqcsim
sg-sprite
metatape
lox_rs
mot

All of the false positives seem to have the same underlying cause: A Result (or Result-like type) is produced from a function with a signature like:

pub fn bad<T>(something: SomeType) -> Wrapper<T> { ... }

Only one variant of the enum is ever constructed (the one not involving the type parameter), though there may be several method calls between the call to bad and the construction of the enum.

@Aaron1011
Member Author

I think it might be possible to refine the check as follows:

  1. Keep the existing logic. However, instead of emitting a lint, store the (caller_def_id, callee_def_id) pair in a side table (where caller_def_id is the method containing the 'suspicious' call, and callee_def_id is the function being called by that 'suspicious' call).
  2. During mono item collection (specifically, collect_crate_mono_items), do a check whenever we visit a TerminatorKind::Call. If the call matches a (caller_def_id, callee_def_id) pair, then enter a 'refined checking' mode.
  3. In 'refined checking' mode, check for any uninhabited types in visit_ty in MirNeighborCollector. If we encounter an uninhabited type, see if it's present in the un-monomorphized MIR for this function (e.g. the function with identity substs). If not, then emit the original warning (the warning currently emitted during typecheck).

I believe that this check will catch any cases where an uninhabited type is 'produced' (e.g. a live expression actually gets an uninhabited type) due to never-type fallback. It should exclude all of the current false-positives, since the uninhabited type parameter never winds up being produced in any way.
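The side table from step 1 and the lookup from step 2 could be modelled roughly like this. It is a toy sketch, not rustc code: `DefId` is a plain newtype here, and `SuspiciousCalls`, `record`, and `needs_refined_check` are hypothetical names.

```rust
use std::collections::HashSet;

// Toy stand-in for rustc's DefId.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DefId(u32);

/// Side table filled during type checking instead of emitting the lint.
#[derive(Default)]
struct SuspiciousCalls {
    pairs: HashSet<(DefId, DefId)>,
}

impl SuspiciousCalls {
    /// Step 1: remember that `caller` contains a suspicious call to `callee`.
    fn record(&mut self, caller: DefId, callee: DefId) {
        self.pairs.insert((caller, callee));
    }

    /// Step 2: consulted for each call seen during mono item collection
    /// to decide whether to enter 'refined checking' mode.
    fn needs_refined_check(&self, caller: DefId, callee: DefId) -> bool {
        self.pairs.contains(&(caller, callee))
    }
}
```

The lookup is keyed on the exact pair, so the same callee reached from an unrelated caller does not trigger the refined check.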

However, it has some significant downsides:

  1. It makes an already complicated check even more complicated.
  2. It emits warnings after type-checking, which we don't currently do.
  3. It relies on mono item collection running. (I think this would require us to do extra work during a cargo check build to ensure that warnings are displayed.)
  4. It might have significantly more overhead than the current approach, since we need to perform at least one check (if the (caller_def_id, callee_def_id) matches) for every single instantiation of a function.

I think we have a few options:

  1. Keep the current implementation. This would require submitting PRs to fix the current false-positives, and accepting a substantial risk of additional false positives in crates not covered by crater.
  2. Implement the MIR-based 'refinement' described above.
  3. Give up on this approach entirely.

@nikomatsakis
Contributor

@Aaron1011 thanks for the detailed examination! Super useful.

I am definitely nervous about the idea of having the lint be non-modular (i.e., examining things across functions).

(Honestly, this is making me wonder if the idea of changing fallback from () is right on the merits -- it just seems more fragile.)

There is of course another alternative. Rather than give up on this approach entirely, we could refine the lint to just look for return types that are uninhabited (and which involve a "fallback" variable) and thus permit -> Wrapper<T>. It is, after all, a lint, and it's not meant to give static guarantees, but just to help people find problems in practice. I believe that every actual problem that we know of in practice manifests as a -> T return type somewhere within the same function that caused the fallback to occur..?

@nikomatsakis
Contributor

Well, what I just wrote might not be quite true -- I guess that the case of

let x = try!(Deserialize::deserialize(...))

wouldn't be caught by the lint as I described, but would require us to be testing the type of "live expressions".

@Aaron1011
Member Author

Aaron1011 commented Jan 27, 2020

There is of course another alternative. Rather than give up on this approach entirely, we could refine the lint to just look for return types that are uninhabited (and which involve a "fallback" variable) and thus permit -> Wrapper<T>.

Unfortunately, that would fail to catch the original objc example, which returns a Result<R, MessageError>.

@Aaron1011
Member Author

Aaron1011 commented Jan 28, 2020

I'm starting to think that the core premise of this lint is flawed. My original idea was that we could use a 'top-down' approach: we start from the function where fallback occurred, and see if control flow reaches a potentially 'questionable' function.

However, the Crater run shows that the following pattern is used by many different crates:

fn make_it<T>(param: Something) -> Wrapper<T> { ... }

Wrapper is often an enum like Result, with a generic parameter (or parameters) used only by one variant. This means that it's possible to use any type for that generic parameter simply by constructing a different variant (e.g. Result::<!, MyErr>::Err(make_my_err())).
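To make the point concrete, here is a small sketch with a made-up `Wrapper` enum. It uses `std::convert::Infallible` as a stable stand-in for `!`, and `describe` is a hypothetical helper.

```rust
use std::convert::Infallible;

enum Wrapper<T> {
    Value(T),
    Failed(String),
}

// Works for every `T`, including uninhabited ones, because it only ever
// constructs the variant that does not contain a `T`.
fn make_it<T>(param: &str) -> Wrapper<T> {
    Wrapper::Failed(format!("could not build a value from {param}"))
}

fn describe<T>(w: &Wrapper<T>) -> &'static str {
    match w {
        Wrapper::Value(_) => "value",
        Wrapper::Failed(_) => "failed",
    }
}

fn demo() -> &'static str {
    // An uninhabited parameter is fine: no `Infallible` value is created.
    let w: Wrapper<Infallible> = make_it("input");
    describe(&w)
}
```

From the signature alone, `make_it` looks the same as a function that really does conjure a `T`, which is the difficulty described here.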

I would have hoped that the combination of the make_it pattern and never-type fallback would be rare. Sadly, this does not seem to be the case, as the ? operator actually causes fallback due to the return expression in the desugaring.
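As a reminder of why `?` is involved, the following sketch compares `?` with a simplified hand-written expansion. The real desugaring goes through the `Try` trait, so this is only an approximation, and the function names are invented.

```rust
use std::num::ParseIntError;

fn double_with_question_mark(s: &str) -> Result<u32, ParseIntError> {
    let n = s.parse::<u32>()?;
    Ok(n * 2)
}

// Simplified expansion: the error arm is a `return` expression of type `!`,
// which must unify with the type of the `Ok` arm.
fn double_desugared(s: &str) -> Result<u32, ParseIntError> {
    let n = match s.parse::<u32>() {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    Ok(n * 2)
}
```

Here the `Ok` arm is pinned to `u32`, so no fallback happens. In a statement like `raise("no sprites")?;` nothing else constrains the value type, so the diverging `return` arm is the only thing it unifies with and fallback decides it.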

From the 'outside' (e.g. without inspecting function bodies), there is fundamentally no difference between a 'good' function like make_it and a 'bad' function like send_message. The difference is entirely due to how the fallback-affected type parameter is used, often several nested calls away from the original function.

I'm starting to think that a 'bottom-up' approach might be more effective. The immediate cause of the problem is a call to transmute which constructs a type 'containing' an uninhabited type (in the case of objc, a function pointer that returns !). As far as I know, this is the only way that never-type fallback can cause unsoundness - transmute is the only 'escape hatch' that lets you claim to produce an uninhabited type without actually diverging (panicking, looping forever, aborting, etc.), other than raw pointer dereferences.

My idea is to use a greatly simplified version of the MIR-based check I previously proposed. There are three steps:

  1. During type inference, we record whether or not fallback occurred at all when type-checking a function - we don't do any complicated checking of method calls.
  2. During monomorphization (in collect_crate_mono_items), we look for any calls to transmute that contain any uninhabited types due to generic parameters. That is, we see if the output type of transmute has an uninhabited type when the calling function is monomorphized, and if it still has an uninhabited type when the calling function is not monomorphized (identity substs).
  3. If the above check succeeds, we 'walk up the call stack', and look for any function where fallback occurred. If we find one, emit a (possibly more detailed) lint. I think the implementation of collect_items_rec would allow this to be done very efficiently (if starting_point is affected by fallback, pass it down as a parameter to the recursive call).
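The three steps can be sketched on a toy call graph. Nothing here is rustc API: `FnInfo`, `walk`, and `find_suspicious` are hypothetical, functions are identified by their index in a slice, and the two boolean flags stand in for the results of steps 1 and 2.

```rust
use std::collections::HashSet;

struct FnInfo {
    fallback_occurred: bool,    // step 1: recorded during type inference
    suspicious_transmute: bool, // step 2: found during monomorphization
    callees: Vec<usize>,
}

// Step 3: carry the earliest fallback-affected caller down the recursion
// instead of walking back up the stack afterwards.
fn walk(
    fns: &[FnInfo],
    current: usize,
    origin: Option<usize>,
    seen: &mut HashSet<(usize, Option<usize>)>,
    findings: &mut Vec<(usize, usize)>,
) {
    let origin = match origin {
        Some(o) => Some(o),
        None if fns[current].fallback_occurred => Some(current),
        None => None,
    };
    if !seen.insert((current, origin)) {
        return; // already visited in this context; also guards against cycles
    }
    if fns[current].suspicious_transmute {
        if let Some(o) = origin {
            findings.push((o, current));
        }
    }
    for &callee in &fns[current].callees {
        walk(fns, callee, origin, seen, findings);
    }
}

/// Returns (fallback-affected function, function containing the transmute).
fn find_suspicious(fns: &[FnInfo], roots: &[usize]) -> Vec<(usize, usize)> {
    let mut seen = HashSet::new();
    let mut findings = Vec::new();
    for &root in roots {
        walk(fns, root, None, &mut seen, &mut findings);
    }
    findings
}
```

A transmute reached only through callers where no fallback occurred produces no finding, which is how this design is meant to keep false positives down.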

This approach still has all of the drawbacks of a MIR-based lint (non-modularity, interaction with cargo check, etc). However, I think it has some significant advantages:

  1. It's much simpler. The core logic is checking the call to transmute, and finding the earliest 'fallback affected' function. I think this is much easier to understand than this PR, which involves inspecting all function calls for a particular pattern (fallback-affected subst, inhabited arguments).
  2. It should be much less prone to false positives, under the following assumptions:
  • Calls to transmute are relatively rare (compared to the code hit by this PR, such as the ? operator).
  • Calls to transmute involving uninhabited types are even rarer (as uses of uninhabited types are rare in general).
  • Calls to transmute in a function with generic args are even rarer.
  • Fallback is generally 'self-contained', and doesn't propagate too far from the original function.

However, this additional complexity might not be worth the cost. If we don't anticipate any affected crates other than objc, we might want to reconsider the idea of a lint. Conversely, if we anticipate many affected crates, then the risk of breaking crates might not be worth the benefits of never-type fallback (though I think this would be extremely unfortunate).

@nikomatsakis
Contributor

nikomatsakis commented Jan 28, 2020

@Aaron1011

Regarding the objc case of returning Result, I agree that's a problem, which is why I wrote in this comment that we'd have to switch to checking the result types of all live expressions instead (i.e., because the Result<!, ..> is matched within the function). I'm sure you can make examples that this wouldn't catch, I just don't know that we have any evidence of those examples occurring in practice.

Regarding the transmute proposal, a few brief thoughts:

  • Is transmute really the core problem? What about MaybeUninit, can you produce a ! that way by constructing it and then accessing the field? (Though that is likely not used in practice, at least I don't recall any examples)
  • I remain very hesitant about any sort of non-modular analysis, but I'll have to ponder the specific description you gave. I admit it sounds plausible at first read.

@Aaron1011
Member Author

Aaron1011 commented Jan 28, 2020

What about MaybeUninit, can you produce a ! that way by constructing it and then accessing the field?

MaybeUninit::assume_init will panic if you try to use it to instantiate an uninhabited type.

The only non-transmute way that I know of would be dereferencing a raw pointer. I think we could detect that in the same way as detecting 'bad' transmute calls. However, I think it's very unlikely that any code is dereferencing a *const/mut () as a result of fallback.

@bors
Contributor

bors commented Mar 30, 2020

☔ The latest upstream changes (presumably #70536) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis
Contributor

I'm going to go ahead and close this PR. I think at this point we've decided that this is not the approach we want to take. Let's continue the discussion on #66173, I just posted a follow-up there yesterday.

@crlf0710 crlf0710 added the F-never_type `#![feature(never_type)]` label Jul 23, 2020
Labels
F-never_type `#![feature(never_type)]` S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet