Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect parsing of CST_CODE_CE_GEP_WITH_INRANGE_INDEX's type #218

Closed
RyanGlScott opened this issue Apr 6, 2023 · 0 comments
Closed

Incorrect parsing of CST_CODE_CE_GEP_WITH_INRANGE_INDEX's type #218

RyanGlScott opened this issue Apr 6, 2023 · 0 comments

Comments

@RyanGlScott
Copy link
Contributor

While investigating the code for parsing ConstGEP, I noticed this suspicious-looking TODO:

parseCeGep :: Bool -> Maybe Word64 -> ValueTable -> Record -> Parse PValue
parseCeGep isInbounds mInrangeIdx t r = do
let isExplicit = odd (length (recordFields r)) -- TODO: is this right for INRANGE_INDEX?
firstIdx = if isJust mInrangeIdx then 2 else if isExplicit then 1 else 0
field = parseField r
loop n = do
ty <- getType =<< field n numeric
elt <- field (n+1) numeric
rest <- loop (n+2) `mplus` return []
cxt <- getContext
return (Typed ty (typedValue (forwardRef cxt elt t)) : rest)
mPointeeType <-
if isExplicit
then Just <$> (getType =<< field 0 numeric)
else pure Nothing
args <- loop firstIdx
return $! ValConstExpr (ConstGEP isInbounds mInrangeIdx mPointeeType args)

And indeed, this code isn't right for CST_CODE_CE_GEP_WITH_INRANGE_INDEX. As proof, if you take this .ll file (taken from here):

define i8** @constexpr() {
  ret i8** getelementptr inbounds ({ [4 x i8*], [4 x i8*] }, { [4 x i8*], [4 x i8*] }* null, i32 0, inrange i32 1, i32 2)
}

Convert it to a .bc file:

$ llvm-as test.ll -o test.bc

And then load it with llvm-pretty-bc-parser, you will get:

λ> parseBitCodeFromFile "test.bc"
Right (Module {modSourceName = Just "test.ll", modTriple = TargetTriple {ttArch = UnknownArch, ttSubArch = NoSubArch, ttVendor = UnknownVendor, ttOS = UnknownOS, ttEnv = UnknownEnvironment, ttObjFmt = UnknownObjectFormat}, modDataLayout = [], modTypes = [], modNamedMd = [], modUnnamedMd = [], modComdat = fromList [], modGlobals = [], modDeclares = [], modDefines = [Define {defLinkage = Nothing, defVisibility = Just DefaultVisibility, defRetType = PtrTo (PtrTo (PrimType (Integer 8))), defName = Symbol "constexpr", defArgs = [], defVarArgs = False, defAttrs = [], defSection = Nothing, defGC = Nothing, defBody = [BasicBlock {bbLabel = Just (Anon 0), bbStmts = [Effect (Ret (Typed {typedType = PtrTo (PtrTo (PrimType (Integer 8))), typedValue = ValConstExpr (ConstGEP True (Just 1) Nothing [Typed {typedType = PtrTo (Struct [Array 4 (PtrTo (PrimType (Integer 8))),Array 4 (PtrTo (PrimType (Integer 8)))]), typedValue = ValNull},Typed {typedType = PrimType (Integer 32), typedValue = ValInteger 0},Typed {typedType = PrimType (Integer 32), typedValue = ValInteger 1},Typed {typedType = PrimType (Integer 32), typedValue = ValInteger 2}])})) []]}], defMetadata = fromList [], defComdat = Nothing}], modInlineAsm = [], modAliases = []})

The incorrect part is the Nothing in ConstGEP True (Just 1) Nothing ..., indicating that there is no explicit type to use as a basis for calculations. This is incorrect because CST_CODE_CE_GEP_WITH_INRANGE_INDEX always includes an explicit type, as indicated here. The correct thing to do would be to mirror how LLVM's own bicode reader works here.

RyanGlScott added a commit that referenced this issue Apr 23, 2023
This bumps the `llvm-pretty` submodule to bring in the changes to the
`GEP`/`ConstGEP` data constructors from GaloisInc/llvm-pretty#110 and adapts the
code in `llvm-pretty-bc-parser` accordingly.

Because `ConstGEP` now stores the basis type for calculations explicitly, I
needed to fix #218 in order to ensure that the basis type is always parsed
properly. In the process of fixing this issue, I refactored the `parseCeGep` to
make the code clearer and more closely mirror the structure of LLVM's own
bitcode parser.

This is necessary in order to use `getelementptr` on an opaque pointer.
See #177. A test case will be added in a subsequent commit.
RyanGlScott added a commit that referenced this issue Apr 23, 2023
This bumps the `llvm-pretty` submodule to bring in the changes to the
`GEP`/`ConstGEP` data constructors from GaloisInc/llvm-pretty#110 and adapts the
code in `llvm-pretty-bc-parser` accordingly.

Because `ConstGEP` now stores the basis type for calculations explicitly, I
needed to fix #218 in order to ensure that the basis type is always parsed
properly. In the process of fixing this issue, I refactored the `parseCeGep` to
make the code clearer and more closely mirror the structure of LLVM's own
bitcode parser.

This is necessary in order to use `getelementptr` on an opaque pointer.
See #177. A test case will be added in a subsequent commit.
RyanGlScott added a commit that referenced this issue May 3, 2023
This bumps the `llvm-pretty` submodule to bring in the changes to the
`GEP`/`ConstGEP` data constructors from GaloisInc/llvm-pretty#110 and adapts the
code in `llvm-pretty-bc-parser` accordingly.

Because `ConstGEP` now stores the basis type for calculations explicitly, I
needed to fix #218 in order to ensure that the basis type is always parsed
properly. In the process of fixing this issue, I refactored the `parseCeGep` to
make the code clearer and more closely mirror the structure of LLVM's own
bitcode parser.

This is necessary in order to use `getelementptr` on an opaque pointer.
See #177. A test case will be added in a subsequent commit.
RyanGlScott added a commit that referenced this issue May 30, 2023
This bumps the `llvm-pretty` submodule to bring in the changes to the
`GEP`/`ConstGEP` data constructors from GaloisInc/llvm-pretty#110 and adapts the
code in `llvm-pretty-bc-parser` accordingly.

Because `ConstGEP` now stores the basis type for calculations explicitly, I
needed to fix #218 in order to ensure that the basis type is always parsed
properly. In the process of fixing this issue, I refactored the `parseCeGep` to
make the code clearer and more closely mirror the structure of LLVM's own
bitcode parser.

This is necessary in order to use `getelementptr` on an opaque pointer.
See #177. A test case will be added in a subsequent commit.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant