Support opaque pointers #177

Closed
RyanGlScott opened this issue Oct 27, 2021 · 9 comments · Fixed by #221

Comments

@RyanGlScott
Contributor

LLVM is in the process of migrating all of its pointer types from i8*, i32*, etc. to an opaque ptr type, as described here. LLVM 13 takes an important step in that direction, as it is the first LLVM release to include ptr in the LLVM AST. See llvm/llvm-project@2155dc5.

Note, however, that ordinary clang invocations do not yet produce LLVM ASTs that use ptr. Because ptr is still an experimental feature, one must explicitly opt in to it. Here is one way to do this:

// wat.c
void f(int *x) {
  *x += 1;
}
$ ~/Software/clang+llvm-13.0.0-x86_64-linux-gnu-ubuntu-20.04/bin/clang -c -emit-llvm wat.c -o wat.bc
$ ~/Software/clang+llvm-13.0.0-x86_64-linux-gnu-ubuntu-20.04/bin/llvm-dis --force-opaque-pointers wat.bc -o wat.ll
$ cat wat.ll
; ModuleID = 'wat.bc'
source_filename = "wat.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @f(ptr %0) #0 {
  %2 = alloca ptr, align 8
  store ptr %0, ptr %2, align 8
  %3 = load ptr, ptr %2, align 8
  %4 = load i32, ptr %3, align 4
  %5 = add nsw i32 %4, 1
  store i32 %5, ptr %3, align 4
  ret void
}

attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{i32 7, !"frame-pointer", i32 2}
!3 = !{!"clang version 13.0.0 (https://github.com/llvm/llvm-project/ 24c8eaec9467b2aaf70b0db33a4e4dd415139a50)"}

llvm-pretty-bc-parser is currently not equipped to handle this, unsurprisingly:

$ ~/Software/clang+llvm-13.0.0-x86_64-linux-gnu-ubuntu-20.04/bin/llvm-as wat.ll -o wat.bc
$ cabal repl
...
λ> parseBitCodeFromFile "wat.bc"
Left (Error {errContext = ["TYPE_BLOCK","type symbol table","MODULE_BLOCK","Bitstream"], errMessage = "Unknown type code  25\nAre you sure you're using a supported compiler?\nCheck here: https://github.com/GaloisInc/llvm-pretty-bc-parser\n"})

While opaque pointers are still a work in progress, we may want to add basic support for parsing them in the meantime.
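
For illustration, here is a minimal sketch of what basic parsing support could look like. It uses a toy `Type` and a hypothetical `decodeType` helper rather than llvm-pretty-bc-parser's real parser. The only facts taken from LLVM are the TYPE_BLOCK record codes, where 25 is `TYPE_CODE_OPAQUE_POINTER` (the code that the error message above complains about).

```haskell
import Data.Word (Word64)

-- Toy stand-ins for the real llvm-pretty types (hypothetical, simplified).
data Type
  = IntTy Word64   -- iN
  | PtrTo Type     -- <ty>*, a non-opaque pointer
  | PtrOpaque      -- ptr, an opaque pointer
  deriving (Eq, Show)

-- Record codes from LLVM's TYPE_BLOCK, as listed in LLVMBitCodes.h.
typeCodeInteger, typeCodePointer, typeCodeOpaquePointer :: Word64
typeCodeInteger       = 7   -- INTEGER: [width]
typeCodePointer       = 8   -- POINTER: [pointee type, address space]
typeCodeOpaquePointer = 25  -- OPAQUE_POINTER: [address space]

-- | Decode a single type record. The first argument resolves the ids of
-- types that were parsed earlier in the type table.
decodeType :: (Word64 -> Maybe Type) -> Word64 -> [Word64] -> Either String Type
decodeType lookupTy code fields
  | code == typeCodeInteger, [w] <- fields = Right (IntTy w)
  | code == typeCodePointer, (tyId : _) <- fields =
      maybe (Left ("Unknown type id " ++ show tyId)) (Right . PtrTo) (lookupTy tyId)
  -- The address space is ignored in this sketch.
  | code == typeCodeOpaquePointer, [_addrSpace] <- fields = Right PtrOpaque
  | otherwise = Left ("Unknown type code " ++ show code)
```

With this in place, `decodeType (const Nothing) 25 [0]` evaluates to `Right PtrOpaque` rather than an "Unknown type code" error.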

@RyanGlScott
Contributor Author

I did some more research into opaque pointers recently. Here are some highlights:

  1. As of LLVM 15, Clang has switched to using opaque pointers by default, although they can be disabled with a compiler flag. By LLVM 17, support for non-opaque pointers will have been completely removed from the compiler. See here for more details.
  2. Adding support for ptr (the opaque pointer type) itself is actually quite straightforward. The tricky part is updating all of the instructions that make use of pointers. These include (but are not limited to) load, store, alloca, call, and callbr. llvm-pretty-bc-parser currently handles these instructions by assuming that they are always given a PtrTo type (i.e., a non-opaque pointer), and in many cases, it extracts the pointee type from the PtrTo. This will not be possible with opaque pointers, so we will need to update the handling for these instructions so that they work uniformly for both opaque and non-opaque pointers.
  3. While investigating the call and callbr instructions in particular, I discovered that the way llvm-pretty-bc-parser types these functions is subtly incorrect, as llvm-pretty-bc-parser gives them the type Ptr <function type> rather than <function type>, the latter of which is what LLVM expects. See call/callbr's function type should not be a pointer type #189 (comment) for how we should go about addressing this issue.
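
To make point 2 more concrete, here is a rough sketch of what "working uniformly" might mean for load. Everything here is hypothetical: `Type` is a toy type, and `loadResultType` assumes the instruction record may carry an explicit result type (modelled as a `Maybe`), falling back to the pointee type only when the record does not.

```haskell
-- Toy stand-in for the real llvm-pretty Type (hypothetical, simplified).
data Type
  = IntTy Int   -- iN
  | PtrTo Type  -- <ty>*, a non-opaque pointer
  | PtrOpaque   -- ptr, an opaque pointer
  deriving (Eq, Show)

isPointer :: Type -> Bool
isPointer (PtrTo _) = True
isPointer PtrOpaque = True
isPointer _         = False

-- | Compute the result type of a load from the (optional) explicit type
-- stored in the record and the type of the pointer operand.
loadResultType :: Maybe Type -> Type -> Either String Type
loadResultType _ ptrTy
  | not (isPointer ptrTy) = Left ("load from non-pointer type " ++ show ptrTy)
-- Prefer the explicit type: it is the only option for opaque pointers.
loadResultType (Just ty) _         = Right ty
-- No explicit type: only a non-opaque pointer can tell us the answer.
loadResultType Nothing   (PtrTo t) = Right t
loadResultType Nothing   _         =
  Left "load from an opaque pointer without an explicit result type"
```

The design point is that the pointee type is only consulted as a fallback, so the same code path serves both pointer representations.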

@RyanGlScott
Contributor Author

Continuing from #177 (comment) ...

  1. Because so many LLVM instructions need to be tweaked to support opaque pointers, it is unlikely that we will get opaque pointers 100% correct on the first try. As a result, I think we will need to add support for opaque pointers gradually. I propose that for LLVM 13 support, we add the bare minimum changes needed to support the program in Support opaque pointers #177 (comment), and then we can iterate on supporting more instructions afterwards.

RyanGlScott added a commit that referenced this issue Mar 20, 2023
See `Note [Typing function applications]` for the rationale.

This fixes #189. This is also an important step towards being able to support
opaque pointers, as described in #177.
@eddywestbrook

Given the complexity, is there any way to test the opaque pointer support?

@RyanGlScott
Contributor Author

I'm not quite sure I understand your question, but we can test instructions that use opaque pointers just like any other instruction. We can test the parsing aspects of opaque pointers in llvm-pretty-bc-parser itself, and that alone will probably suffice to catch a number of pointer-related issues like the one in #189 (comment). In order to properly test the semantics of these instructions, however, we will likely need more test cases in something like crucible-llvm.

@eddywestbrook

I just meant that it sounds like you are changing a fair bit of code for this, so I was prompting you to think about how to test all that new code.

I think you're right, the most important part is testing the semantics, and we need some test cases in crucible-llvm.

@RyanGlScott
Contributor Author

The best thing to do would be to figure out all of the instructions that intersect with opaque pointers in some way, such as call, and add a test case for each one where all pointers are made opaque. It might be difficult to do this without scouring the LLVM commit history, however. The next best thing will be to simply run our test suites with LLVM 15 or later, as they use opaque pointers for everything by default. That will give us a much better picture of everything that needs to change, and we can do so gradually, since we can disable opaque pointers if we encounter a particular opaque pointer–related instruction that we don't yet support.

RyanGlScott added a commit that referenced this issue Mar 21, 2023
See `Note [Typing function applications]` for the rationale.

This fixes #189. This is also an important step towards being able to support
opaque pointers, as described in #177.
RyanGlScott added a commit that referenced this issue Apr 3, 2023
See `Note [Typing function applications]` for the rationale. I have added a
`T189.ll` test case to ensure that we actually practice what we preach in that
Note.

This fixes #189. This is also an important step towards being able to support
opaque pointers, as described in #177.
@RyanGlScott
Contributor Author

Another challenge that I've encountered while trying to make a simple test case involving opaque pointers work: how do we handle mixing opaque pointers with non-opaque pointers? At the LLVM level, this is forbidden: a bitcode file can either have solely opaque pointers or solely non-opaque pointers, but it is an error to combine the two. It's not so straightforward to enforce this in llvm-pretty-bc-parser, however, for a couple of reasons:

  1. It is unlikely that we will want to drop support for non-opaque pointers any time soon, if ever. Not only is this important for backwards-compatibility reasons, but there are some downstream users of llvm-pretty-bc-parser that make key use of the pointee types in non-opaque pointers.

  2. For practical reasons, llvm-pretty-bc-parser sometimes infers non-opaque pointer types. For example, we always give alloca instructions a return type of PtrTo <ty>:

    ity = if explicitType then PtrTo instty else instty

    Similarly, we always give function references (e.g., @foo) the type PtrTo <fun-ty>:

    let ty = case funTy of
               PtrTo _ -> funTy
               _       -> PtrTo funTy

There are sometimes situations where llvm-pretty-bc-parser tries to compare an opaque pointer to a non-opaque pointer. For instance, consider this code:

define void @f() { ... }

define void @g() {
  %1 = alloca ptr, align 8
  store ptr @f, ptr %1, align 8
  ...
}

The type of %1 will be PtrTo PtrOpaque for the reasons described above in (2), and similarly, the type of @f will be PtrTo (FunTy (PrimType Void) []). When we parse the store instruction, we reach this sanity check, which checks that the pointee type of the pointer being written to matches the type of the value being stored. In our example, the pointee type is PtrOpaque, but the type of the value is PtrTo (FunTy (PrimType Void) []). Because these are syntactically different types, this check rejects the example above. Ack!

After discussing this with others, the approach that I am going to use to work around this issue is to define a coarser notion of type equality. This will behave like the Eq Type instance, but with the exception that PtrOpaque is always equal to PtrTo <ty> for any <ty>. Off of the top of my head, I believe we will need to use this version of type equality in the following places:

There are also some trickier cases to consider, such as this code, which uses PValMd as a Map key. Because PValMd is defined in terms of Type, PValMd's Eq instance inherits the properties of Type's Eq instance. It is unclear if we need to use the coarser version of equality here as well, but perhaps we can get away without it.
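
As a rough sketch of the coarser equality described above: the name `eqTypeModuloOpaquePtrs` is the one used in the later commits referenced in this thread, but the toy `Type` and the function body below are only an illustration, not the llvm-pretty implementation.

```haskell
-- Toy stand-in for the real llvm-pretty Type (hypothetical, simplified).
data Type
  = IntTy Int
  | Void
  | FunTy Type [Type]  -- return type and argument types
  | PtrTo Type         -- <ty>*, a non-opaque pointer
  | PtrOpaque          -- ptr, an opaque pointer
  deriving (Eq, Show)

-- | Like (==) on Type, except that PtrOpaque is considered equal to
-- PtrTo <ty> for any <ty>.
eqTypeModuloOpaquePtrs :: Type -> Type -> Bool
eqTypeModuloOpaquePtrs PtrOpaque (PtrTo _) = True
eqTypeModuloOpaquePtrs (PtrTo _) PtrOpaque = True
eqTypeModuloOpaquePtrs (PtrTo a) (PtrTo b) = eqTypeModuloOpaquePtrs a b
eqTypeModuloOpaquePtrs (FunTy r args) (FunTy r' args') =
  eqTypeModuloOpaquePtrs r r'
    && length args == length args'
    && and (zipWith eqTypeModuloOpaquePtrs args args')
-- All remaining cases (including PtrOpaque vs. PtrOpaque) are syntactic.
eqTypeModuloOpaquePtrs a b = a == b
```

On the store example above, `eqTypeModuloOpaquePtrs PtrOpaque (PtrTo (FunTy Void []))` is `True`, so the sanity check would pass. Note that a relation like this is not transitive (`PtrTo (IntTy 8)` and `PtrTo (IntTy 32)` both relate to `PtrOpaque` but not to each other), which is one reason it cannot simply replace the `Eq` instance used for Map keys.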

RyanGlScott added a commit that referenced this issue Apr 7, 2023
TODO RGS: Add test case

This is necessary in order to `load` from an opaque pointer. See #177.
RyanGlScott added a commit that referenced this issue Apr 7, 2023
TODO RGS: Test cases

This is necessary in order to use `getelementptr` on an opaque pointer.
See #177.
@RyanGlScott
Contributor Author

In the same vein as GaloisInc/llvm-pretty#102 (comment), I did a quick audit of the code in llvm-pretty-bc-parser to find opaque pointer "violations". Here is what I found:

  • The code for parsing atomicrmw instructions inspects the pointee type of the pointer argument:

    parseAtomicRMW :: Bool -> ValueTable -> Record -> PartialDefine -> Parse PartialDefine
    parseAtomicRMW old t r d = do
      -- TODO: parse sync scope (ssid)
      (ptr, ix0) <- getValueTypePair t r 0
      (val, ix1) <- case typedType ptr of
        PtrTo ty@(PrimType prim) -> do
          -- Catch pointers of the wrong type
          when (case prim of
                  Integer _   -> False
                  FloatType _ -> False
                  _           -> True) $
            fail $ "Expected pointer to integer or float, found " ++ show ty

    There's no good reason to do this, however, as we can just as well use the type of val for this check.

  • The code for parsing invoke, call, and callbr instructions assumes in each case that the function being invoked has a non-opaque pointer type:

    All of these instructions store the function type separately, so this can easily be avoided.
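
To illustrate the atomicrmw point from the first bullet, here is a hedged sketch of the same sanity check phrased in terms of the value operand's type instead of the pointee type. The `Type` definition and the `checkAtomicRMWOperands` name are hypothetical and do not come from llvm-pretty-bc-parser; the check simply mirrors the "integer or float" condition in the excerpt above.

```haskell
-- Toy stand-in for the real llvm-pretty Type (hypothetical, simplified).
data Type
  = IntTy Int   -- iN
  | FloatTy     -- any floating-point type
  | Void
  | PtrTo Type  -- <ty>*, a non-opaque pointer
  | PtrOpaque   -- ptr, an opaque pointer
  deriving (Eq, Show)

isPointer :: Type -> Bool
isPointer (PtrTo _) = True
isPointer PtrOpaque = True
isPointer _         = False

isIntOrFloat :: Type -> Bool
isIntOrFloat (IntTy _) = True
isIntOrFloat FloatTy   = True
isIntOrFloat _         = False

-- | Check the operands of atomicrmw without looking inside the pointer:
-- the pointer operand only has to be *some* pointer, and the
-- integer-or-float requirement is checked on the value operand.
checkAtomicRMWOperands :: Type -> Type -> Either String ()
checkAtomicRMWOperands ptrTy valTy
  | not (isPointer ptrTy) = Left ("Expected pointer, found " ++ show ptrTy)
  | not (isIntOrFloat valTy) =
      Left ("Expected integer or float value, found " ++ show valTy)
  | otherwise = Right ()
```

Because the pointer operand is never pattern-matched as `PtrTo`, the check gives the same answer for `PtrOpaque` and `PtrTo (IntTy 32)`.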

@RyanGlScott
Contributor Author

One surprising thing that I recently discovered about mixing opaque and non-opaque pointers is that LLVM will often optimize opaque-pointer bitcode in a way that would make it ill-typed in a non-opaque-pointer setting. Here is an example of this in action:

#include <stdint.h>
#include <stdio.h>

void __attribute__((noinline)) f(uint8_t x[2][2]) {
  x[1][0] = 42;
}

int main(void) {
  uint8_t arr[2][2] = {{0, 0}, {0, 0}};
  f(arr);
  printf("%d\n", arr[1][0]);
  return 0;
}

If you compile this with clang-15 -O1 -emit-llvm -S, you will get:

@.str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1

; Function Attrs: argmemonly mustprogress nofree noinline norecurse nosync nounwind willreturn writeonly uwtable
define dso_local void @f(ptr nocapture noundef writeonly %0) local_unnamed_addr #0 {
  %2 = getelementptr inbounds [2 x i8], ptr %0, i64 1
  store i8 42, ptr %2, align 1, !tbaa !5
  ret void
}

; Function Attrs: nofree nounwind uwtable
define dso_local i32 @main() local_unnamed_addr #1 {
  %1 = alloca [2 x [2 x i8]], align 4
  call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %1) #4
  store i32 0, ptr %1, align 4
  call void @f(ptr noundef nonnull %1)
  %2 = getelementptr inbounds [2 x [2 x i8]], ptr %1, i64 0, i64 1
  %3 = load i8, ptr %2, align 2, !tbaa !5
  %4 = zext i8 %3 to i32
  %5 = tail call i32 (ptr, ...) @printf(ptr noundef nonnull @.str, i32 noundef %4)
  call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %1) #4
  ret i32 0
}

This bitcode is fine on its own, but it interacts strangely with how llvm-pretty-bc-parser typechecks it. First, let's look at the first lines of @main:

  %1 = alloca [2 x [2 x i8]], align 4
  ...
  store i32 0, ptr %1, align 4

The call to store is fine under the current typing rule for store, which simply states that the second argument to store must be a ptr. Before opaque pointers, however, the previous typing rule for store was:

store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void

That is, the second argument's pointee type had to match the type of the first argument. llvm-pretty-bc-parser still includes this typing rule as a sanity check here.

Now let's look at the preceding call to alloca. The current typing rule for alloca says that it always returns a ptr. But the previous typing rule for alloca says that alloca <ty> returns something of type <ty>*. llvm-pretty-bc-parser still implements the previous typing rule for alloca here.

While prototyping a fix for #177, I kept these old typing rules in place, thinking that opaque and non-opaque pointers could fully coexist without issues. But this is not quite true, and the example above proves it. Let's walk through how this example would be typechecked under the previous typing rules:

  • %1 = alloca [2 x [2 x i8]], align 4: Here, llvm-pretty-bc-parser would give %1 the type [2 x [2 x i8]]*.
  • store i32 0, ptr %1, align 4: Due to how llvm-pretty-bc-parser parses values, it will parse the argument %1 as a forward reference and give it the type [2 x [2 x i8]]*, as chosen in the previous step. (This is a different type than ptr!)
  • llvm-pretty-bc-parser will now reject this store instruction, as the pointee type [2 x [2 x i8]] does not match i32, the type of store's first argument.

Why didn't this issue arise before opaque pointers were a thing? To see why, compare the bitcode when the example above is compiled with clang-14:

@.str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1

; Function Attrs: mustprogress nofree noinline norecurse nosync nounwind uwtable willreturn writeonly
define dso_local void @f([2 x i8]* nocapture noundef writeonly %0) local_unnamed_addr #0 {
  %2 = getelementptr inbounds [2 x i8], [2 x i8]* %0, i64 1, i64 0
  store i8 42, i8* %2, align 1, !tbaa !3
  ret void
}

; Function Attrs: nofree nounwind uwtable
define dso_local i32 @main() local_unnamed_addr #1 {
  %1 = alloca i32, align 4
  %2 = bitcast i32* %1 to [2 x [2 x i8]]*
  %3 = bitcast i32* %1 to i8*
  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %3) #4
  store i32 0, i32* %1, align 4
  %4 = bitcast i32* %1 to [2 x i8]*
  call void @f([2 x i8]* noundef nonnull %4)
  %5 = getelementptr inbounds [2 x [2 x i8]], [2 x [2 x i8]]* %2, i64 0, i64 1, i64 0
  %6 = load i8, i8* %5, align 2, !tbaa !3
  %7 = zext i8 %6 to i32
  %8 = call i32 (i8*, ...) @printf(i8* noundef nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 noundef %7)
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %3) #4
  ret i32 0
}

In particular, the first lines of @main are now:

  %1 = alloca i32, align 4
  %2 = bitcast i32* %1 to [2 x [2 x i8]]*
  %3 = bitcast i32* %1 to i8*
  ...
  store i32 0, i32* %1, align 4

This time, LLVM is very careful to have alloca return something of type i32* to match the following store instruction. Later instructions require using %1 at different types, however, so LLVM makes extensive use of bitcast instructions to cast %1 to different pointer types. With clang-15, however, there is only a single ptr type, so no bitcasts are necessary to reinterpret pointers at different types. (Indeed, this page cites the complexity of inserting bitcasts everywhere as a key motivation for switching to opaque pointers in the first place.)

The moral of the story: pointee types are a convenient fiction, and with opaque pointers, LLVM is free to optimize code in a way that doesn't make sense in a non-opaque-pointer setting.


What should we do about this? As noted above, we don't want to outright drop support for non-opaque pointers for various reasons. Yet this issue exposes a rather fundamental tension that arises when mixing opaque and non-opaque pointers.

My initial reaction is that the most straightforward solution would be to relax some of the non-opaque-pointer checks that llvm-pretty-bc-parser performs. In particular, I think we will need to drop this check in the store implementation, as this check cannot possibly work on the example above. In fact, I think we will need to get rid of the ptrTo function entirely, as we cannot be sure if any pointee types in a <ty>* type are actually correct in an opaque pointer world.
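
As a small model of the situation, the following sketch replays the alloca/store example from earlier in this comment against the old typing rules and against a relaxed store check. The types and function names (`allocaTypeOld`, `storeCheckOld`, `storeCheckRelaxed`) are made up for illustration and are much simpler than the real parser code.

```haskell
-- Toy stand-in for the real llvm-pretty Type (hypothetical, simplified).
data Type
  = IntTy Int         -- iN
  | ArrayTy Int Type  -- [n x <ty>]
  | PtrTo Type        -- <ty>*, a non-opaque pointer
  | PtrOpaque         -- ptr, an opaque pointer
  deriving (Eq, Show)

-- | Old typing rule for alloca: `alloca <ty>` returns a `<ty>*`.
allocaTypeOld :: Type -> Type
allocaTypeOld = PtrTo

-- | Old store check: the pointee type must match the stored value's type.
storeCheckOld :: Type -> Type -> Either String ()
storeCheckOld valTy (PtrTo pointee)
  | pointee == valTy = Right ()
  | otherwise =
      Left ("store: pointee type " ++ show pointee
            ++ " does not match value type " ++ show valTy)
storeCheckOld _ PtrOpaque = Right ()  -- nothing to compare against
storeCheckOld _ ty = Left ("store: expected pointer, found " ++ show ty)

-- | Relaxed store check: only require that the destination is a pointer.
storeCheckRelaxed :: Type -> Type -> Either String ()
storeCheckRelaxed _ (PtrTo _) = Right ()
storeCheckRelaxed _ PtrOpaque = Right ()
storeCheckRelaxed _ ty = Left ("store: expected pointer, found " ++ show ty)

-- | The type inferred for %1 in @main under the old alloca rule:
-- [2 x [2 x i8]]*
exampleDest :: Type
exampleDest = allocaTypeOld (ArrayTy 2 (ArrayTy 2 (IntTy 8)))
```

Here `storeCheckOld (IntTy 32) exampleDest` yields a `Left` (the rejection walked through above), whereas `storeCheckRelaxed (IntTy 32) exampleDest` yields `Right ()`.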

RyanGlScott added a commit that referenced this issue Apr 23, 2023
This bumps the `llvm-pretty` submodule to bring in the changes to the `Load`
data constructor from GaloisInc/llvm-pretty#110 and adapts the code in
`llvm-pretty-bc-parser` accordingly.

This is necessary in order to `load` from an opaque pointer. See #177. A test
case will be added in a subsequent commit.
RyanGlScott added a commit that referenced this issue Apr 23, 2023
This bumps the `llvm-pretty` submodule to bring in the changes to the
`GEP`/`ConstGEP` data constructors from GaloisInc/llvm-pretty#110 and adapts the
code in `llvm-pretty-bc-parser` accordingly.

Because `ConstGEP` now stores the basis type for calculations explicitly, I
needed to fix #218 in order to ensure that the basis type is always parsed
properly. In the process of fixing this issue, I refactored `parseCeGep` to
make the code clearer and more closely mirror the structure of LLVM's own
bitcode parser.

This is necessary in order to use `getelementptr` on an opaque pointer.
See #177. A test case will be added in a subsequent commit.
RyanGlScott added a commit that referenced this issue Apr 23, 2023
This adds the bare minimum needed to parse `PtrOpaque` (opaque pointer) types.
See #177. Other instructions will need to be tweaked in order to account for
the possibility of opaque pointer arguments, but this will happen in subsequent
commits.
RyanGlScott added a commit that referenced this issue Apr 23, 2023
As explained in the new `Note [Pointers and pointee types]`, we cannot inspect
`PtrTo` pointee types if we simultaneously support opaque pointers. The `ptrTo`
and `baseType` functions fundamentally rely on this, and as such, they have
been removed. They are ultimately used in service of implementing assertions,
so removing them is fairly straightforward. See #177.

The `elimPtrTo` and `elimPtrTo_` functions also inspect pointee types, but they
are required to support old versions of LLVM that do not store the necessary
type information in the instructions that need them. In subsequent commits, I
will ensure that all uses of `elimPtrTo`/`elimPtrTo_` are appropriately guarded
such that they will not be used on modern versions of LLVM bitcode.
RyanGlScott added a commit that referenced this issue Apr 23, 2023
See `Note [Pointers and pointee types]` for the rationale. See also #177.
RyanGlScott added a commit that referenced this issue Apr 23, 2023
Recent versions of the `FUNC_CODE_INST_ATOMICRMW` instruction code directly
store the type corresponding to the pointer argument, which avoids the need to
pattern-match on the pointer type.  This is required to support opaque
pointers. See #177.

Older versions of the instruction (`FUNC_CODE_INST_ATOMICRMW_OLD`) do not store
this type directly, so we have no choice but to inspect the pointee
type using `elimPtrTo`.
RyanGlScott added a commit that referenced this issue Apr 23, 2023
Because `llvm-pretty` permits opaque and non-opaque pointers to coexist, it is
possible for the first argument of `cmpxchg` to be an opaque pointer and the
second argument to be a non-opaque pointer (or vice versa). We don't want to
reject such scenarios, so we compare the types of the arguments using
`eqTypeModuloOpaquePtrs`, a special form of type equality that treats opaque
and non-opaque pointers as being the same. See #177.

This requires bumping the `llvm-pretty` submodule to bring in the corresponding
changes from GaloisInc/llvm-pretty#110.
RyanGlScott added a commit that referenced this issue Apr 23, 2023
…ointers

This bumps the `llvm-pretty` submodule to bring in the `fixupOpaquePtrs`
function from GaloisInc/llvm-pretty#110 and use it in the `disasm-test` test
suite. This is needed because we must give pretty-printed `llvm-pretty` ASTs to
`llvm-as`, which strictly forbids mixing opaque and non-opaque pointers. See #177.
RyanGlScott added a commit that referenced this issue May 3, 2023
Recent versions of the `FUNC_CODE_INST_ATOMICRMW` instruction code directly
store the type corresponding to the pointer argument, which avoids the need to
pattern-match on the pointer type.  This is required to support opaque
pointers. See #177.

Older versions of the instruction (`FUNC_CODE_INST_ATOMICRMW_OLD`) do not store
this type directly, so we have no choice but to inspect the pointee
type using `elimPtrTo`.
RyanGlScott added a commit that referenced this issue May 3, 2023
Because `llvm-pretty` permits opaque and non-opaque pointers to coexist, it is
possible for the first argument of `cmpxchg` to be an opaque pointer and the
second argument to be a non-opaque pointer (or vice versa). We don't want to
reject such scenarios, so we compare the types of the arguments using
`eqTypeModuloOpaquePtrs`, a special form of type equality that treats opaque
and non-opaque pointers as being the same. See #177.

This requires bumping the `llvm-pretty` submodule to bring in the corresponding
changes from GaloisInc/llvm-pretty#110.
RyanGlScott added a commit that referenced this issue May 3, 2023
…ointers

This bumps the `llvm-pretty` submodule to bring in the `fixupOpaquePtrs`
function from GaloisInc/llvm-pretty#110 and use it in the `disasm-test` test
suite. This is needed because we must give pretty-printed `llvm-pretty` ASTs to
`llvm-as`, which strictly forbids mixing opaque and non-opaque pointers. See #177.
RyanGlScott added a commit that referenced this issue May 3, 2023
RyanGlScott added a commit that referenced this issue May 4, 2023
As explained in the new `Note [Pointers and pointee types]`, we cannot inspect
`PtrTo` pointee types if we simultaneously support opaque pointers. The `ptrTo`
and `baseType` functions fundamentally rely on this, and as such, they have
been removed. They are ultimately used in service of implementing assertions,
so removing them is fairly straightforward. See #177.

The `elimPtrTo` and `elimPtrTo_` functions also inspect pointee types, but they
are required to support old versions of LLVM that do not store the necessary
type information in the instructions that need them. In subsequent commits, I
will ensure that all uses of `elimPtrTo`/`elimPtrTo_` are appropriately guarded
such that they will not be used on modern versions of LLVM bitcode.
RyanGlScott added a commit that referenced this issue May 4, 2023
See `Note [Pointers and pointee types]` for the rationale. See also #177.
RyanGlScott added a commit that referenced this issue May 4, 2023
Recent versions of the `FUNC_CODE_INST_ATOMICRMW` instruction code directly
store the type corresponding to the pointer argument, which avoids the need to
pattern-match on the pointer type.  This is required to support opaque
pointers. See #177.

Older versions of the instruction (`FUNC_CODE_INST_ATOMICRMW_OLD`) do not store
this type directly, so we have no choice but to inspect the pointee
type using `elimPtrTo`.
RyanGlScott added a commit that referenced this issue May 4, 2023
Because `llvm-pretty` permits opaque and non-opaque pointers to coexist, it is
possible for the first argument of `cmpxchg` to be an opaque pointer and the
second argument to be a non-opaque pointer (or vice versa). We don't want to
reject such scenarios, so we compare the types of the arguments using
`eqTypeModuloOpaquePtrs`, a special form of type equality that treats opaque
and non-opaque pointers as being the same. See #177.

This requires bumping the `llvm-pretty` submodule to bring in the corresponding
changes from GaloisInc/llvm-pretty#110.
RyanGlScott added a commit that referenced this issue May 4, 2023
…ointers

This bumps the `llvm-pretty` submodule to bring in the `fixupOpaquePtrs`
function from GaloisInc/llvm-pretty#110 and use it in the `disasm-test` test
suite. This is needed because we must give pretty-printed `llvm-pretty` ASTs to
`llvm-as`, which strictly forbids mixing opaque and non-opaque pointers. See #177.
RyanGlScott added a commit that referenced this issue May 4, 2023
RyanGlScott added a commit that referenced this issue May 30, 2023
This bumps the `llvm-pretty` submodule to bring in the changes to the `Load`
data constructor from GaloisInc/llvm-pretty#110 and adapts the code in
`llvm-pretty-bc-parser` accordingly.

This is necessary in order to `load` from an opaque pointer. See #177. A test
case will be added in a subsequent commit.
RyanGlScott added a commit that referenced this issue May 30, 2023
This bumps the `llvm-pretty` submodule to bring in the changes to the
`GEP`/`ConstGEP` data constructors from GaloisInc/llvm-pretty#110 and adapts the
code in `llvm-pretty-bc-parser` accordingly.

Because `ConstGEP` now stores the basis type for calculations explicitly, I
needed to fix #218 in order to ensure that the basis type is always parsed
properly. In the process of fixing this issue, I refactored `parseCeGep` to
make the code clearer and more closely mirror the structure of LLVM's own
bitcode parser.

This is necessary in order to use `getelementptr` on an opaque pointer.
See #177. A test case will be added in a subsequent commit.
RyanGlScott added a commit that referenced this issue May 30, 2023
This adds the bare minimum needed to parse `PtrOpaque` (opaque pointer) types.
See #177. Other instructions will need to be tweaked in order to account for
the possibility of opaque pointer arguments, but this will happen in subsequent
commits.
RyanGlScott added a commit that referenced this issue May 30, 2023
As explained in the new `Note [Pointers and pointee types]`, we cannot inspect
`PtrTo` pointee types if we simultaneously support opaque pointers. The `ptrTo`
and `baseType` functions fundamentally rely on this, and as such, they have
been removed. They are ultimately used in service of implementing assertions,
so removing them is fairly straightforward. See #177.

The `elimPtrTo` and `elimPtrTo_` functions also inspect pointee types, but they
are required to support old versions of LLVM that do not store the necessary
type information in the instructions that need them. In subsequent commits, I
will ensure that all uses of `elimPtrTo`/`elimPtrTo_` are appropriately guarded
such that they will not be used on modern versions of LLVM bitcode.
RyanGlScott added a commit that referenced this issue May 30, 2023
See `Note [Pointers and pointee types]` for the rationale. See also #177.
RyanGlScott added a commit that referenced this issue May 30, 2023
Recent versions of the `FUNC_CODE_INST_ATOMICRMW` instruction code directly
store the type corresponding to the pointer argument, which avoids the need to
pattern-match on the pointer type.  This is required to support opaque
pointers. See #177.

Older versions of the instruction (`FUNC_CODE_INST_ATOMICRMW_OLD`) do not store
this type directly, so we have no choice but to inspect the pointee
type using `elimPtrTo`.
RyanGlScott added a commit that referenced this issue May 30, 2023
Because `llvm-pretty` permits opaque and non-opaque pointers to coexist, it is
possible for the first argument of `cmpxchg` to be an opaque pointer and the
second argument to be a non-opaque pointer (or vice versa). We don't want to
reject such scenarios, so we compare the types of the arguments using
`eqTypeModuloOpaquePtrs`, a special form of type equality that treats opaque
and non-opaque pointers as being the same. See #177.

This requires bumping the `llvm-pretty` submodule to bring in the corresponding
changes from GaloisInc/llvm-pretty#110.
RyanGlScott added a commit that referenced this issue May 30, 2023
…ointers

This bumps the `llvm-pretty` submodule to bring in the `fixupOpaquePtrs`
function from GaloisInc/llvm-pretty#110 and use it in the `disasm-test` test
suite. This is needed because we must give pretty-printed `llvm-pretty` ASTs to
`llvm-as`, which strictly forbids mixing opaque and non-opaque pointers. See #177.
RyanGlScott added a commit that referenced this issue May 30, 2023