allocation functions and return-position noalias #196

hanna-kruppe · 2019-08-23T07:53:03Z

@comex brought this up in rust-lang/rust#63787 (comment) and IIUC believes there's unsoundness lurking there? Let's continue discussion here since it's not directly relevant to the specific problems of Ref.


RalfJung · 2019-08-23T07:55:36Z

Could you summarize what the unsoundness is? I have seen this example but I think that's an LLVM bug -- specifically, this is https://bugs.llvm.org/show_bug.cgi?id=35229.

hanna-kruppe · 2019-08-23T08:19:56Z

I also think there's no unsoundness other than the GVN bug, I just want to talk it out somewhere other than the Ref issue.

@comex (re: rust-lang/rust#63787 (comment))

> In 4.6 it discusses the problems with propagating pointer equalities in GVN – that is, in a code path where a == b must have previously evaluated to true, replacing a with b. This is the same optimization I mentioned in my proof-of-concept for the Ref issue earlier in this thread.

This conflict of integer GVN vs pointer provenance causes miscompiles even without involving custom allocators or noalias (return position or otherwise), so this optimization has to be disabled anyway (unless you want much larger-scale changes to e.g. give integers provenance, with all the downsides that entails).

Both of your examples seem to crucially rely on this GVN bug. Please correct me if I'm wrong (I've not fully digested either yet), but if that's true, I don't understand what either of them have to do with allocation returning noalias pointers. It seems just yet another variation of pointing out that pointers to equal addresses can be non-interchangeable for actual memory accesses, just with a lot of unnecessary extra complications.
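To make the "equal addresses, non-interchangeable pointers" point concrete, here is a minimal sketch. It is not either of the examples discussed above; `store_if_same_address` is a hypothetical helper, and whether the two arrays end up adjacent depends on the stack layout the compiler picks.

```rust
/// Hypothetical helper: stores `val` through `q` only when `p_end` and `q`
/// have the same address. Returns whether the store happened.
///
/// # Safety
/// `q` must be valid for a one-byte write.
unsafe fn store_if_same_address(p_end: *const u8, q: *mut u8, val: u8) -> bool {
    if p_end as usize == q as usize {
        // Fine: `q` has provenance for the allocation it points into.
        // Writing through `p_end` instead is not equivalent when `p_end` is a
        // one-past-the-end pointer of a different allocation, even though the
        // addresses match. Replacing `q` with `p_end` in this branch is the
        // kind of rewrite that equality-propagating GVN performs.
        unsafe { *q = val };
        true
    } else {
        false
    }
}

fn main() {
    let x = [0u8; 1];
    let mut y = [0u8; 1];
    // One past the end of `x`; it may or may not equal the address of `y`.
    let p_end = x.as_ptr().wrapping_add(1);
    let stored = unsafe { store_if_same_address(p_end, y.as_mut_ptr(), 10) };
    println!("stored = {stored}, y[0] = {}", y[0]);
}
```

The store goes through `q` on purpose: the program is well-defined either way, and the interesting question is only whether an optimizer is allowed to swap in `p_end` after seeing the comparison succeed.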

The discussion in section 4.8 is a little bit related in that it's about allocators reusing addresses. But all of the approaches presented there are perfectly satisfactory in that they can be implemented soundly and just differ in which optimizations they are compatible with (moving a free up above malloc vs moving a pointer comparison after a deallocation). I don't see how it's essential to your example programs -- if anything, the UB or non-determinism of comparing the pointer after it's been deallocated seems more like an obstacle to you (in that it can only make your examples UB or open up extra possibilities for the GVN bug to be rendered harmless).

All that said, it is likely true that when you stop treating alloc/dealloc functions as built-ins all the time and start to (selectively) look at the C or Rust code that implements them through the lens of the language's general memory model, you'll run into trouble. But that's not at all surprising to me (those functions are deliberately treated as magic after all), and it's not related to the examples you've given, or to noalias annotation and address reuse in particular. For example, if you inline only the free part of a malloc-free pair and the allocator implementation stores metadata inline, then the code from free will appear to perform OOB accesses when it pokes the allocation metadata.
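The inline-metadata case can be sketched with a toy arena. This is only an illustration under assumptions: `ToyArena`, `HEADER`, and the 8-byte size header are made up here and do not describe any real allocator. Offsets into a `Vec<u8>` stand in for pointers so the sketch stays in safe code.

```rust
/// Assumed header size: each block stores its payload size in 8 bytes
/// placed directly in front of the payload.
const HEADER: usize = 8;

/// Hypothetical bump arena; "pointers" are offsets into `buf`.
struct ToyArena {
    buf: Vec<u8>,
    next: usize,
}

impl ToyArena {
    fn new(capacity: usize) -> Self {
        ToyArena { buf: vec![0; capacity], next: 0 }
    }

    /// Returns the offset of the payload, or `None` if the arena is full.
    fn alloc(&mut self, size: usize) -> Option<usize> {
        let start = self.next;
        let payload = start.checked_add(HEADER)?;
        let end = payload.checked_add(size)?;
        if end > self.buf.len() {
            return None;
        }
        // Write the size header just before the payload.
        self.buf[start..payload].copy_from_slice(&(size as u64).to_le_bytes());
        self.next = end;
        Some(payload)
    }

    /// Reads the header that sits *before* `payload` and returns the size.
    /// From the point of view of the `size` bytes handed out by `alloc`,
    /// this read is outside the allocation.
    fn free(&mut self, payload: usize) -> usize {
        let header = payload - HEADER;
        let mut raw = [0u8; HEADER];
        raw.copy_from_slice(&self.buf[header..payload]);
        // A real allocator would now put the block on a free list.
        u64::from_le_bytes(raw) as usize
    }
}

fn main() {
    let mut arena = ToyArena::new(64);
    let p = arena.alloc(16).expect("arena has room");
    println!("payload at {p}, freed block size {}", arena.free(p));
}
```

If only `free` were inlined into a caller that believes `p` points to exactly 16 bytes, the header read at `payload - HEADER` is the access that would look out of bounds.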

comex · 2019-08-23T23:12:35Z

> Both of your examples seem to crucially rely on this GVN bug. Please correct me if I'm wrong (I've not fully digested either yet), but if that's true, I don't understand what either of them have to do with allocation returning noalias pointers. It seems just yet another variation of pointing out that pointers to equal addresses can be non-interchangeable for actual memory accesses, just with a lot of unnecessary extra complications.

Pretty much.

I just wouldn't characterize it as "yet another variation". If anything, it's one of two variations. The twin allocation paper creates pointers with the same address but different provenance by using two allocations that are adjacent in space (but exist at the same time). My second example does so using two allocations that are separated in time (but at the same address in space).
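A rough sketch of the "separated in time" setup, for readers who have not seen the original example: this is not that example, just the bare address-reuse pattern, and `addresses_across_reuse` is a hypothetical name. Addresses are captured as integers so that nothing is ever accessed through the stale pointer.

```rust
/// Allocates, frees, and allocates again, returning both addresses as
/// integers. The allocator is free to hand out the same address twice,
/// but it is not required to.
fn addresses_across_reuse() -> (usize, usize) {
    let first = Box::new(1u64);
    let old_addr = &*first as *const u64 as usize;
    drop(first); // the first allocation ends here

    let second = Box::new(2u64);
    let new_addr = &*second as *const u64 as usize;
    (old_addr, new_addr)
}

fn main() {
    let (old_addr, new_addr) = addresses_across_reuse();
    // Often true with common allocators, but not guaranteed.
    println!("same address reused: {}", old_addr == new_addr);
}
```

When the addresses do coincide, a pointer to the first box and a pointer to the second box are equal as integers but belong to different allocations, which is the temporal counterpart of the adjacent-in-space case.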

My first example with (Ref) uses two pointers that derive from the same allocation and thus have the same provenance. It does rely on GVN to produce an actual miscompilation, but GVN in this particular case would be sound according to the conditions in the twin-allocation paper, and other optimizations besides GVN could theoretically cause problems too. I think we're all in agreement that noalias is incorrect for Ref.

But going back to the other example – I tried to make it clear that the problem in that example is likely LLVM's fault, not Rust's. noalias is likely fine on Box and on the allocator's return value. However, I wanted to leave the door open for other conclusions, because I have no evidence that LLVM will actually implement the GVN limitations proposed in the twin allocation paper.

For one thing, they don't seem to be in much of a hurry to implement any sort of formal model.

For another, the model from the twin allocation paper is only one possibility among others. Even if I assumed that LLVM would end up making its optimizations sound with respect to some formal model, that wouldn't automatically justify basing Rust's noalias decisions on the twin allocation model in particular.

My chain of reasoning in the other thread was a bit muddled, but the part about "escaping" is essentially a sketch of an alternative solution for temporal aliasing, compared to how the twin allocation model solves it. Namely, the allocator is modeled such that when it reuses an address, the pointer it returns is 'based on' the argument previously passed to free. The benefit of this model is that it makes unrestricted GVN on pointers sound with respect to temporal aliasing (only). Therefore, it would be potentially useful:

- As part of a formal model, in combination with some other solution to spatial aliasing that also allows unrestricted or less-restricted GVN compared to the twin allocation model.
  - I don't know what that solution would look like, exactly, and it would probably have other drawbacks. But the question of what model to use is up to LLVM anyway. I'm just exploring possibilities.

But also:

- As a conservative approach for the time being, in order to mitigate some of the miscompilations caused by LLVM's current unrestricted GVN.

I don't know how useful that would actually be. It wouldn't fix all miscompilations, so perhaps it would just be a pointless band-aid, especially because GVN-based miscompilations tend to only occur in contrived examples anyway.

But for the record, under that model, Box-typed arguments could still be noalias, but the allocator's return value could not be, nor could arguments of certain other smart pointer types such as my hypothetical DetachedRc.
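The difference between the two ways of modeling address reuse can be sketched as a tiny abstract machine. Everything here is an assumption made for illustration: `AbstractPtr`, `ReuseModel`, and `ModelAllocator` are hypothetical, and "based on" is collapsed into "carries the same provenance id", which is much coarser than what a real model would need.

```rust
/// Abstract pointer: an address plus a provenance id.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct AbstractPtr {
    addr: usize,
    prov: u32,
}

/// How the modeled allocator treats a reused address.
#[derive(Clone, Copy)]
enum ReuseModel {
    /// Every allocation gets fresh provenance (twin-allocation style).
    FreshProvenance,
    /// A reused address yields a pointer "based on" the freed one.
    BasedOnFreed,
}

struct ModelAllocator {
    model: ReuseModel,
    next_prov: u32,
    last_freed: Option<AbstractPtr>,
}

impl ModelAllocator {
    fn new(model: ReuseModel) -> Self {
        ModelAllocator { model, next_prov: 0, last_freed: None }
    }

    fn fresh_prov(&mut self) -> u32 {
        let prov = self.next_prov;
        self.next_prov += 1;
        prov
    }

    /// Reuses the most recently freed address if there is one.
    fn alloc(&mut self) -> AbstractPtr {
        let model = self.model;
        match self.last_freed.take() {
            Some(old) => {
                let prov = match model {
                    ReuseModel::FreshProvenance => self.fresh_prov(),
                    ReuseModel::BasedOnFreed => old.prov,
                };
                AbstractPtr { addr: old.addr, prov }
            }
            None => {
                let prov = self.fresh_prov();
                AbstractPtr { addr: 0x1000 + 0x10 * prov as usize, prov }
            }
        }
    }

    fn free(&mut self, p: AbstractPtr) {
        self.last_freed = Some(p);
    }
}

/// In this toy model, replacing one pointer by the other is only justified
/// when both address and provenance agree.
fn gvn_substitution_ok(a: AbstractPtr, b: AbstractPtr) -> bool {
    a.addr == b.addr && a.prov == b.prov
}

/// Returns (addresses equal, substitution justified) for alloc/free/alloc.
fn reuse_then_compare(model: ReuseModel) -> (bool, bool) {
    let mut allocator = ModelAllocator::new(model);
    let old = allocator.alloc();
    allocator.free(old);
    let new = allocator.alloc();
    (old.addr == new.addr, gvn_substitution_ok(old, new))
}

fn main() {
    println!("{:?}", reuse_then_compare(ReuseModel::FreshProvenance));
    println!("{:?}", reuse_then_compare(ReuseModel::BasedOnFreed));
}
```

In the fresh-provenance variant the addresses match but substitution is not justified; in the based-on variant it is. The price is that the allocator's return value is related to an earlier pointer, which is why it could not be noalias under that model.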

Finally:

> All that said, it is likely true that when you stop treating alloc/dealloc functions as built-ins all the time and start to (selectively) look at the C or Rust code that implements them through the lens of the language's general memory model, you'll run into trouble. But that's not at all surprising to me (those functions are deliberately treated as magic after all), and it's not related to the examples you've given, or to noalias annotation and address reuse in particular. For example, if you inline only the free part of a malloc-free pair and the allocator implementation stores metadata inline, then the code from free will appear to perform OOB accesses when it pokes the allocation metadata.

LLVM can, in fact, inline an alloc/dealloc function under the right circumstances. So the potential for "trouble" exists, mitigated by LLVM's lack of sophistication. If you say this is off topic – sure. I originally only mentioned it in passing.

RalfJung · 2019-08-25T20:52:54Z

> The twin allocation paper creates pointers with the same address but different provenance by using two allocations that are adjacent in space (but exist at the same time). My second example does so using two allocations that are separated in time (but at the same address in space).

Agreed. In terms of the formal memory model presented in said paper, that is the same situation, so it is not surprising that they lead to the same kinds of miscompilation.

> My chain of reasoning in the other thread was a bit muddled, but the part about "escaping" is essentially a sketch of an alternative solution for temporal aliasing, compared to how the twin allocation model solves it. Namely, the allocator is modeled such that when it reuses an address, the pointer it returns is 'based on' the argument previously passed to free.

Interesting! That does seem worth exploring.

JakobDegen · 2023-08-08T17:00:23Z

Closing as a duplicate of #385

RalfJung added the labels C-open-question (Category: An open question that we should revisit) and A-provenance (Topic: Related to when which values have which provenance (but not which alias restrictions follow)) on Aug 23, 2019

hanna-kruppe mentioned this issue Aug 23, 2019

Ref parameter incorrectly decorated with noalias attribute rust-lang/rust#63787

Closed

JakobDegen closed this as completed Aug 8, 2023

pnkfelix mentioned this issue Aug 8, 2023

do not add noalias in return position rust-lang/rust#106371

Merged
