Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

can we in some cases have more limited forms of "undefined behavior"? #6

Closed
nikomatsakis opened this issue Aug 28, 2018 · 15 comments
Closed
Labels
A-abstract-machine Topic: concerning the abstract machine in general (as opposed to any specific part of it)

Comments

@nikomatsakis
Copy link
Contributor

In #5, @gereeter raised the point that defining all manner of errors as yielding "undefined behavior" is an awfully strong statement. It theoretically permits the compiler to "change the past" and may not mesh so well with the way that users think. On the other hand, weaker definitions may not permit the kinds of optimizations we want and may not be supported by LLVM.

In a sense, this is a cross-cutting concern: we need to figure out what's allowed and not allowed, but separately, we should consider if there are cases where we can contain the repercussions.

I'm not sure the best way to handle this, but I'm opening this up as its own potential discussion topic for the future.

@nikomatsakis nikomatsakis changed the title can we in some cases have more limited forms of "undefiend behavior"? can we in some cases have more limited forms of "undefined behavior"? Aug 28, 2018
@gnzlbg
Copy link
Contributor

gnzlbg commented Aug 28, 2018

I agree with @rkruppe that in many cases UB is the safest choice since it allow us to define the behavior later.

I am not sure I fully understood @gereeter argument (maybe it can elaborate), but it appears that it seems to be stating that there are other options beyond defining and not defining some program behavior.

AFAICT, we can not define the behavior of something, e.g., if a null pointer is dereferenced, the program is illegal (the guidelines do not speak about the behavior of illegal programs). Anything else we do, makes it defined behavior:

  • if a null pointer is dereferenced, the program panic!s
  • if a null pointer is dereferenced, the program can either panic, or load zeroed memory, or ...

Maybe @gereeter point is that we should prefer defined behavior to undefined behavior ? If so, I fully agree, but in many cases, like in the pointer dereference case, we can't and that's ok (e.g., those who want to avoid it can stick to safe Rust, those using raw pointers are explicitly opting into that for "reasons").

Also, because dereferencing a null pointer is undefined behavior, we could have a "fortified" build mode that panic!s on null pointer dereference, miri can offer a stack trace, etc. I'm with @RalfJung here that no matter where set the bar for UB, it should be possible to instrument a binary such that undefined behavior is never invoked.

@RalfJung
Copy link
Member

RalfJung commented Aug 30, 2018

From the other thread

there are lots of other notions of "incorrect": undef/varying value, poison, frozen poison/arbitrary value, some sort of sticky poison that on creation also marks some memory locations as permanently inaccessible, etc.

It seems likely that LLVM will move to just one kind of "incorrect value", and that is poison. The "floating" undef has lead to no ends of problems, and many optimizations are unsound, some of them leading to real miscompilations.

So, I think we should restrict ourselves to two possibilities when we want to declare an operation illegal:

  • It is UB.
  • It returns poison (which miri unfortunately calls Undef).

LLVM basically uses the latter whenever it can get away with that, as that leads to easier code motion optimizations. However, when specifying a surface language, that is not our concern, and hence it is usually preferrable to declare UB -- the optimizer then still has the choice of actually "just" considering this as returning poison. The only reason we even have Undef in our "abstract machine" at all is that we have to handle reading uninitialized memory, and for practical reasons we cannot make that UB.


@gereeter proposes another form that's somewhat "in the middle": Weak UB taints the "future" of the execution, but cannot "travel back in time" like C's Strong UB can. Adding another kind of UB complicates the specification (and constraints optimizations and codegen to LLVM, because LLVM does not have anything resembling weak UB), so I'd like to see strong reasons for why the two kinds we have are not enough.

Now, @gereeter provides some good arguments, but I think those goals are fairly unachievable. For example:

Security. If secrets like passwords and keys are stored in some volatile memory and properly cleared after use, it would be useful to know that some bug in unrelated code long after the sensitive computation is complete can't expose the secrets.

It is an open problem, as far as I know, how to specify a language that can be both optimized and provide reasonable guarantees for "cleaning" memory. Our work here is hard enough without adding additional unsolved problems to it, so I am worried that we are adding too many things on our plate here.

Also, this as well as the other arguments only work out if all UB is weak.
So, I don't think this adds another choice per illegal operation. It just adds one global option to our semantics: Whether "UB" means strong UB or weak UB.

If we restrict ourselves to strong UB now, we can always offer a weak UB mode later once we actually have codegen that can support this. That is something which might be possible cranelift codegen backend for debug builds, but as long as we use LLVM, I think that is ways off.

@nikomatsakis
Copy link
Contributor Author

@RalfJung can you say a bit more about what "poison" means in LLVM (vs "floating undef" or whatever the alternatives are)?

@RalfJung
Copy link
Member

RalfJung commented Sep 1, 2018

"poison" is like the Uninit in my blog post: Any byte (or maybe bit, that detail is not determined yet) is either some "normal" value, or "poison". Copying data with "poison" in it around is fine, storing it to memory and loading it from memory happens as expected and preserves "being poison". (It is an open question what happens when you load an i32 and some bytes/bits are poison -- does the entire i32 become poison, or does it preserve byte/bit-level precision?) All other operations "propagate" poison where possible: poison * 0 is poison, posion + 5 is poison, and so on. If poison is used for a conditional jump, we have UB.
Together with poison comes an operation freeze which takes in any value, and returns a non-poison value. It does this by non-deterministically choosing anything when the input is poison. So while if poison is UB, if freeze(poison) is non-deterministic. freeze is a way to "decontaminate" a possibly poisoned value (with no side-effect on data that was not poisoned to begin with). This is very useful for some optimizations that turn arithmetic into conditional jumps, and must not introduce UB. It is exactly the ingredient that LLVM was missing, and that lead to it having undef.

"floating undef" (the undef of LLVM) is a whole different beast. LLVM with "floating undef" actually has an even more complicated notion of a byte or value: A value is a set of options. "Normal" values are singletons sets, but undef is the set of all values. undef * 2 is the set of all even values. undef == undef is the set {0, 1}. Basically, for all operations, the result is the set of values that can be obtained by performing the operation with any of the possible choices from either operand.
Note that what I said so far is fully deterministic! At no point are we making a choice. Instead, we track all possible choices in one execution. Non-determinism enters the picture once you do a conditional jump on a non-singleton value: Then you non-deterministically pick an element from the set and use that.
This model has a host of problems and some optimizations that LLVM performs are unsound for it. It is also unclear what happens when you store a multi-byte non-singleton value into memory: Do we somehow track the correlation between those bytes (say the value is {0^32, 1^32}, i.e. it is either all-0 or all-1)? None of these questions have ever been answered in a consistent way, to my knowledge. See this paper for more details. It has long been proposed to remove undef from LLVM, and there seems to be general agreement except some of the questions around posion that I mentioned above still need to be worked out.

@gnzlbg
Copy link
Contributor

gnzlbg commented Sep 3, 2018

@RalfJung thank you for writing this. When talking about freeze, you mention:

This is very useful for some optimizations that turn arithmetic into conditional jumps, [...] It does this by non-deterministically choosing anything when the input is poison

Could you maybe elaborate a bit more on how freeze works? I wonder whether it just "forgets" poison, whether it preserves attributes, what guarantees I get on the result value, etc. For example, say I have a nonnull poison pointer and I launder it with freeze, do I just get a non-poison pointer back, or does it preserve the nonnull attribute so that I get a nonnull pointer back? If I get a nonnull pointer back, am I guaranteed that dereferencing this pointer won't introduce undefined-behavior of the type null-pointer dereference? That is, does freeze make sure that the bit-pattern of the non-poison pointer is not null ? Also, I wonder whether it is possible to "hook" freeze to e.g. panic if the value passed to it was poison.

@hanna-kruppe
Copy link

Poison is compile time fiction, it is used in the language semantics to justify some code transformations, but it is not a distinct value at run time (unless one deliberately builds a VM like miri that detects UB rather than exploit it). I wouldn't go as far as saying the choice between poison, undef, and instant UB has no impact on the compiled machine code at all, but it certainly isn't a real real value the program could detect and handle specially.

For example, say I have a nonnull poison pointer and I launder it with freeze, do I just get a non-poison pointer back, or does it preserve the nonnull attribute so that I get a nonnull pointer back

It is not immediately clear to me whether it is or should be instant UB if a value annotated as nonnull (or as being in a certain range, or not aliasing other pointers, or other such properties) turns out to be poison, but placing restrictions on the possible outcomes of freeze based on attributes added to it elsewhere seems nonsensical to me.

Attributes such as nonnull are intended to enable optimizations, not hinder them, but this would force worse codegen (e.g. for non-null, the code generator couldn't just map the freeze value to any register or stack slot without initialization, it would have to emit runtime code to ensure that location contains a non-zero value). Furthermore, these attributes can freely be dropped or ignored or moved around by the compiler (because they're only intended to enable optimizations), so "a nonnull pointer" is not even something the compiler can or even wants to reliably detect in general.

Another angle is, if you handle poison values, you're already hovering one centimeter over the pit of lava that is UB, and the only reason you don't fall in is you promised you won't try to get too smart with the garbage value. If this clashes with wanting to assume the value is well-behaved in certain ways, well, maybe stop assuming that, or don't handle garbage values, or manually fix up the result of freeze to conform to the assumption (e.g., see if it's null and change it if it is).

@RalfJung
Copy link
Member

RalfJung commented Sep 6, 2018

Not sure what a "nonnull poison value is". Attributes are attached to variables/function arguments, not values -- and poison is a value.

The semantics of freeze(poison) is to non-deterministically pick a non-poison bit pattern (the size of which is determined by the type of this operation). Notably, this is the only non-determinism in this scheme: There is no non-determinism (in the abstract machine!) when allocating new memory (it is deterministically initialized to be all-poison), and there is no non-determinism when performing an arithmetic operation on poison (the result is deterministicially posion).

Wrt. attributes, I do not know whether it is UB to pass "posion" for an argument with the nonnull attribute. This is something the LLVM devs would have to answer. But my guess is "yes" -- basically, I would expect that "posion" satisfies no attribute, and @rkruppe spelled out all the reasons why :)

@RalfJung RalfJung added the A-abstract-machine Topic: concerning the abstract machine in general (as opposed to any specific part of it) label Sep 18, 2019
@JakobDegen
Copy link
Contributor

Discussed in backlog bonanza, closing. The answer to the question in the title is probably "yes," but there's no super clear compelling motivation. People are welcome to open new issues if there are specific problems they are facing

@thomcc
Copy link
Member

thomcc commented May 23, 2023

Note that C is proposing it: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3128.pdf

@digama0
Copy link

digama0 commented May 23, 2023

We discussed that in the meeting (not the C proposal itself but the rust equivalent), there seems to be some disagreement about to what extent UB honors observable behaviors. (In particular, if one annotates a syscall or other external function call with a mustreturn style annotation, it is possible that UB after it may propagate back past the IO.) My own interpretation would favor what is called "concrete UB" there: any IO events prior to UB are part of the observable behaviors and will be preserved, it is only the last stretch of unobservable computations prior to UB that is compromised. The answer to this question is tied up in our nondeterminism model. It is probably best to use a new issue.

@RalfJung
Copy link
Member

RalfJung commented May 24, 2023

That C proposal seems rather limiting, e.g. one can no longer move memory accesses up across print calls that definitely return. I am curious if C compilers will really implement that.

For Rust it's probably less bad since references give us stronger guarantees and earlier UB. But anyway I agree with Mario, we shouldn't have different kinds of "UB" but we might want to have an issue tracking whether our UB can exhibit "time travel" or not.

@JakobDegen
Copy link
Contributor

Ah, it wasn't clear to me that there was actual disagreement here. My impression is that we can't actually consider anything other than time-travel UB because LLVM won't let us (although I'm having trouble getting it to do even obviously correct optimizations, so who knows)

@digama0
Copy link

digama0 commented May 24, 2023

Well, UB "time-travel" is partially a result of colloquial misinterpretation of the execution semantics. UB time-travels past unobservable computations because those computations are not temporally sequenced in the first place, they are more like nodes in a data dependence graph. The kind of time-travel that is actually observable would be traveling past IO, and I would say that for the most part we don't allow that, except in specific (TBD) enumerated circumstances. If you call an opaque function and then do a UB the function call will definitely happen, because it might just abort the program.

@chorman0773
Copy link
Contributor

Right, but not all observable behaviour is entirely opaque - and as discussed, if the thing being implemented is the abstract machine (and not an emulator of the abstract machine, so something closer to miri), then it knows exactly when the program terminates or reaches a UB state (and the AM doesn't necessarily track the prior observable behaviour and may simply discard it when it halts with an error).

@RalfJung
Copy link
Member

I opened an issue for time-traveling UB: #407

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-abstract-machine Topic: concerning the abstract machine in general (as opposed to any specific part of it)
Projects
None yet
Development

No branches or pull requests

8 participants