Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More complete support for opaque pointers (required for LLVM 15+) #221

Merged
merged 14 commits into from
May 30, 2023

Conversation

RyanGlScott
Copy link
Contributor

@RyanGlScott RyanGlScott commented May 3, 2023

This is a collection of various tweaks to llvm-pretty-bc-parser that allow it to fully support opaque pointers, an alternative representation of pointer types that is enabled by default in LLVM 15 or later. This patch should fix #177 insofar as I have tested these changes on a reasonable subset of LLVM 15/16 bitcode, and this is enough to make everything that I have thrown at llvm-pretty-bc-parser work.

I have split this patch up into self-contained commits, and for reviewing purposes, I'd recommend viewing each patch in isolation. The general themes of these patches are:

Adapting to llvm-pretty API changes (from GaloisInc/llvm-pretty#110)

  • Adapt to Load gaining an explicit load type
  • Adapt to GEP/ConstGEP gaining explicit base types

Parsing opaque pointers

  • Basic support for parsing opaque pointer types

Making other instructions work with opaque pointers

  • Remove ptrTo and baseType
  • Only use elimPtrTo on old LLVMs for invoke/call/callbr
  • parseAtomicRMW: Don't inspect pointee type of argument on recent LLVMs
  • cmpxchg: Compare argument types using eqTypeModuloOpaquePtrs

CI/documentation updates

  • README: Indicate support for LLVM 15 and 16
  • llvm-quick-fuzz: Add LLVM 15 and 16 configurations

Test suite updates

  • disasm-test: Avoid llvm-as from crashing on mixed opaque/non-opaque pointers
  • disasm-test: Rename callbr.ll to callbr.pre-llvm15.ll
  • disasm-test: Add callbr test output for LLVM 15+
  • disasm-test: Add opaque pointer test cases

Copy link
Member

@kquick kquick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some documentation related comments, but overall looks good.

src/Data/LLVM/BitCode/Assert.hs Outdated Show resolved Hide resolved
The elimPtrTo and elimPtrTo_ functions go against this rule, as they retrieve
the pointee type in a PtrTo. These functions are primarily used for supporting
old versions of LLVM which do not store the necessary type information in the
instruction itself.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a recommendation for what to use instead for opaque pointers and more modern approaches?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elimPtrTo is typically used in situations in which an explicit Type is needed, but the only way to retrieve the Type is by inspecting a pointer type. Because opaque pointers don't have pointee types, there is no direct elimPtrTo counterpart for opaque pointers. The modern approach would be to store the relevant Type information nearby so that it can always be retrieved independently of the pointer. There isn't an exact template that you can follow for this, but in the case of LLVM instructions, this is usually tantamount to having an extra Type field alongside the Value representing the pointer.

src/Data/LLVM/BitCode/Assert.hs Show resolved Hide resolved
src/Data/LLVM/BitCode/IR/Function.hs Show resolved Hide resolved
src/Data/LLVM/BitCode/Assert.hs Show resolved Hide resolved
@kquick
Copy link
Member

kquick commented May 3, 2023

Dang, I missed that the category feature for user warnings was a new item.

I did a scan of most of our code, and the only calls I can find to elimPtrTo are only in llvm-pretty and llvm-pretty-bc-parser, so I'm much less concerned about not having warnings (or even migration documentation) at this point.

RyanGlScott and others added 14 commits May 30, 2023 08:06
This bumps the `llvm-pretty` submodule to bring in the changes to the `Load`
data constructor from GaloisInc/llvm-pretty#110 and adapts the code in
`llvm-pretty-bc-parser` accordingly.

This is necessary in order to `load` from an opaque pointer. See #177. A test
case will be added in a subsequent commit.
This bumps the `llvm-pretty` submodule to bring in the changes to the
`GEP`/`ConstGEP` data constructors from GaloisInc/llvm-pretty#110 and adapts the
code in `llvm-pretty-bc-parser` accordingly.

Because `ConstGEP` now stores the basis type for calculations explicitly, I
needed to fix #218 in order to ensure that the basis type is always parsed
properly. In the process of fixing this issue, I refactored the `parseCeGep` to
make the code clearer and more closely mirror the structure of LLVM's own
bitcode parser.

This is necessary in order to use `getelementptr` on an opaque pointer.
See #177. A test case will be added in a subsequent commit.
This adds the bare minimum needed to parse `PtrOpaque` (opaque pointer) types.
See #177. Other instructions will need to be tweaked in order to account for
the possibility of opaque pointer arguments, but this will happen in subsequent
commits.
As explained in the new `Note [Pointers and pointee types]`, we cannot inspect
`PtrTo` pointee types if we simultaneously support opaque pointers. The `ptrTo`
and `baseType` functions fundamentally rely on this, and as such, they have
been removed. They are ultimately used in service of implementing assertions,
so removing them is fairly straightforward. See #177.

The `elimPtrTo` and `elimPtrTo_` functions also inspect pointee types, but they
are required to support old versions of LLVM that do not store the necessary
type information in the instructions that need them. In subsequent commits, I
will ensure that all uses of `elimPtrTo`/`elimPtrTo_` are appropriately guarded
such that they will not be used on modern versions of LLVM bitcode.
See `Note [Pointers and pointee types]` for the rationale. See also #177.
Recent versions of the `FUNC_CODE_INST_ATOMICRMW` instruction code directly
store the type corresponding to the pointer argument, which avoids the need to
pattern-match on the pointer type.  This is required to support opaque
pointers. See #177.

Older versions of the instruction (`FUNC_CODE_INST_ATOMICRMW_OLD`) do not store
this type directly, so there were have no choice but to inspect the pointee
type using `elimPtrTo`.
Because `llvm-pretty` permits opaque and non-opaque pointers to coexist, it is
possible for the first argument of `cmpxchg` to be an opaque pointer and the
second argument to be a non-opaque pointer (or vice versa). We don't want to
reject such scenarios, so we compare the types of the argument using
`eqTypeModuloOpaquePtrs`, a special form of type equality that treats opaque
and non-opaque pointers as being the same. See #177.

This requires bumping the `llvm-pretty` submodule to bring in the corresponding
changes from GaloisInc/llvm-pretty#110.
…ointers

This bumps the `llvm-pretty` submodule to bring in the `fixupOpaquePtrs`
function from GaloisInc/llvm-pretty#110 and use it in the `disasm-test` test
suite. This is needed because we must give pretty-printed `llvm-pretty` ASTs to
`llvm-as`, which strictly forbids mixing opaque and non-opaque pointers. See #177.
The syntax for inline assembly changed slightly in LLVM 15+, so let's
pre-emptively move this test case to `pre-llvm15`. In a subsequent commit, we
will add a newer `.ll` file that supports LLVM 15+.
Note that `versions` uses SemVer rather than the PVP, and as a result, it
sometimes bumps the secondary version number even for non-breaking changes. As
a result, I am guarding against the major version number rather than the
secondary one.
@RyanGlScott RyanGlScott marked this pull request as ready for review May 30, 2023 12:09
@RyanGlScott RyanGlScott merged commit 65be0b0 into master May 30, 2023
@RyanGlScott RyanGlScott deleted the llvm-15-take-two branch May 30, 2023 12:21
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request May 31, 2023
This commit bumps the `llvm-pretty` submodule to bring in a commit from
GaloisInc/llvm-pretty#110 that adds additional fields to `ConstGEP` to represent
the basis type and expression to use for offset calculations. This also bumps
the `llvm-pretty-bc-parser` and `crucible` submodule to bring in corresponding
changes from GaloisInc/llvm-pretty-bc-parser#221 and GaloisInc/crucible#1085,
respectively.

This change affects one use site in `heapster-saw`, which is easily adapted.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request May 31, 2023
This patch adds support for LLVM 15 and 16 by adding support for opaque
pointers, which are described in https://llvm.org/docs/OpaquePointers.html.  I
have also added a test case involving LLVM bitcode using opaque pointers to
kick the tires and ensure that the basics work as expected.

For the most part, this is a straightforward process, as most uses of pointer
types in SAW already do not care about pointee types. There are some
exceptions, however:

* The `typeOfSetupValue` function, as well as several places that use this
  function, scrutinize pointee types of pointers, which would appear to fly in
  the face of opaque pointers. I attempt to explain in #1876 which this is
  actually OK for now (although a bit awkward).
* The `llvm_boilerplate`/skeleton machinery does not support opaque pointers
  at all. See #1877.

This patch also bumps the following submodules to bring in support for opaque
pointers:

* `llvm-pretty`: GaloisInc/llvm-pretty#110
* `llvm-pretty-bc-parser`: GaloisInc/llvm-pretty-bc-parser#221
* `crucible`: GaloisInc/crucible#1085

This also bumps the `what4` submodule to bring in the changes from
GaloisInc/what4#234. This isn't necessary to support opaque pointers, but it
_is_ necessary to support a build plan involving `tasty-sugar-2.2.*`, which
`llvm-pretty-bc-parser`'s test suite now requires.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request Jun 1, 2023
This patch adds support for LLVM 15 and 16 by adding support for opaque
pointers, which are described in https://llvm.org/docs/OpaquePointers.html.  I
have also added a test case involving LLVM bitcode using opaque pointers to
kick the tires and ensure that the basics work as expected.

For the most part, this is a straightforward process, as most uses of pointer
types in SAW already do not care about pointee types. There are some
exceptions, however:

* The `typeOfSetupValue` function, as well as several places that use this
  function, scrutinize pointee types of pointers, which would appear to fly in
  the face of opaque pointers. I attempt to explain in #1876 which this is
  actually OK for now (although a bit awkward).
* The `llvm_boilerplate`/skeleton machinery does not support opaque pointers
  at all. See #1877.
* The `llvm_fresh_expanded_val` command does not support opaque pointers at
  all. See #1879.

This patch also bumps the following submodules to bring in support for opaque
pointers:

* `llvm-pretty`: GaloisInc/llvm-pretty#110
* `llvm-pretty-bc-parser`: GaloisInc/llvm-pretty-bc-parser#221
* `crucible`: GaloisInc/crucible#1085

This also bumps the `what4` submodule to bring in the changes from
GaloisInc/what4#234. This isn't necessary to support opaque pointers, but it
_is_ necessary to support a build plan involving `tasty-sugar-2.2.*`, which
`llvm-pretty-bc-parser`'s test suite now requires.
yav pushed a commit to GaloisInc/saw-script that referenced this pull request Jun 16, 2023
This commit bumps the `llvm-pretty` submodule to bring in a commit from
GaloisInc/llvm-pretty#110 that adds additional fields to `ConstGEP` to represent
the basis type and expression to use for offset calculations. This also bumps
the `llvm-pretty-bc-parser` and `crucible` submodule to bring in corresponding
changes from GaloisInc/llvm-pretty-bc-parser#221 and GaloisInc/crucible#1085,
respectively.

This change affects one use site in `heapster-saw`, which is easily adapted.
yav pushed a commit to GaloisInc/saw-script that referenced this pull request Jun 16, 2023
This patch adds support for LLVM 15 and 16 by adding support for opaque
pointers, which are described in https://llvm.org/docs/OpaquePointers.html.  I
have also added a test case involving LLVM bitcode using opaque pointers to
kick the tires and ensure that the basics work as expected.

For the most part, this is a straightforward process, as most uses of pointer
types in SAW already do not care about pointee types. There are some
exceptions, however:

* The `typeOfSetupValue` function, as well as several places that use this
  function, scrutinize pointee types of pointers, which would appear to fly in
  the face of opaque pointers. I attempt to explain in #1876 which this is
  actually OK for now (although a bit awkward).
* The `llvm_boilerplate`/skeleton machinery does not support opaque pointers
  at all. See #1877.
* The `llvm_fresh_expanded_val` command does not support opaque pointers at
  all. See #1879.

This patch also bumps the following submodules to bring in support for opaque
pointers:

* `llvm-pretty`: GaloisInc/llvm-pretty#110
* `llvm-pretty-bc-parser`: GaloisInc/llvm-pretty-bc-parser#221
* `crucible`: GaloisInc/crucible#1085

This also bumps the `what4` submodule to bring in the changes from
GaloisInc/what4#234. This isn't necessary to support opaque pointers, but it
_is_ necessary to support a build plan involving `tasty-sugar-2.2.*`, which
`llvm-pretty-bc-parser`'s test suite now requires.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support opaque pointers
2 participants