Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbolic execution fails for infinite sequences in some circumstances #1703

Closed
ddfreeman opened this issue Jul 8, 2022 · 5 comments
Closed
Labels
type: bug Issues reporting bugs or unexpected/unwanted behavior

Comments

@ddfreeman
Copy link

For reference, the below uses Cryptol 2.13.0.99 and SAW 0.9.0.99. Using the SAW Python Module, I was attempting to prove equivalence between a C function and a Cryptol one:

uint32_t weird(uint32_t cv[8], uint8_t i) {
    return cv[i % 8];
}
weird : [8][32] -> [8] -> [32]
weird cv i = k
    where
        k = cvs @ i
        cvs = cv # cvs

The idea is that I can simulate cvs infinitely repeating cv in the Cryptol implementation by taking the modulus of the index in the C implementation, since C does not support infinite sequences.

I used the following Python script to do so:

class Contract_weird(Contract):
    def specification(self) -> None:
        cv = self.fresh_var(array_ty(8, i32), "cv")
        cv_p = self.alloc(array_ty(8, i32), points_to = cv, read_only=True)
        i = self.fresh_var(i8, "i")

        self.execute_func(cv_p, i)
        self.returns(cry_f("weird {cv} {i}"))

However, this contract failed to verify:

[12:09:18.830] Subgoal failed: weird internal: error: in _SAW_verify_prestate SAWServer
Literal equality postcondition

[12:09:18.830] SolverStats {solverStatsSolvers = fromList ["W4 ->z3"], solverStatsGoalSize = 990}
[12:09:18.831] ----------Counterexample----------
[12:09:18.831]   cv0: [4294967295, 0, 0, 0, 0, 0, 0, 0]
[12:09:18.831]   i0: 128
[12:09:18.831] ----------------------------------

So, I put that input through C and Cryptol:

int main() {
    uint32_t cv[8] = {4294967295, 0, 0, 0, 0, 0, 0, 0};
    uint32_t i = 128;
    printf("0x%x\n", weird(cv, i));
}
clang weird.c -o weird.out ; ./weird.out
0xffffffff

weird> weird [4294967295, 0, 0, 0, 0, 0, 0, 0] 128
0xffffffff

The counterexample, erm, does not seem to be a counterexample. After some playing around, I discovered a few interesting things. First, if I hard-code a value for i, the proof succeeds:

class Contract_weird(Contract):
    def specification(self) -> None:
        cv = self.fresh_var(array_ty(8, i32), "cv")
        cv_p = self.alloc(array_ty(8, i32), points_to = cv, read_only=True)
        i = self.fresh_var(i8, "i")

        self.execute_func(cv_p, cry_f("(128 : [8])"))
        self.returns(cry_f("weird {cv} 128"))
Proof succeeded! weird
✅  Verified: lemma_Contract_weird (defined at weird.py:36)

I'm not really an expert on symbolic execution or formal verification, but this seems to suggest that something is amiss with the symbolic execution.

Further, if i is a palindrome, the proof also succeeds:

class Contract_weird(Contract):
    def specification(self) -> None:
        cv = self.fresh_var(array_ty(8, i32), "cv")
        cv_p = self.alloc(array_ty(8, i32), points_to = cv, read_only=True)
        i = self.fresh_var(i8, "i")

        self.precondition_f("{i} == reverse {i}")

        self.execute_func(cv_p, i)
        self.returns(cry_f("weird {cv} {i}"))
Proof succeeded! weird
✅  Verified: lemma_Contract_weird (defined at weird.py:36)

I'm not sure what to make of this one, but it was interesting, so I felt like I should include it.

@weaversa
Copy link
Contributor

weaversa commented Jul 8, 2022

If we change weird to:

weird : [8][32] -> [8] -> [32]
weird cv i = k
    where
        k = cv @ (i % 8)

the proof passes as well. So this looks like something going wrong with, as you say, infinite sequences. The palindrome thing is very strange.

@RyanGlScott
Copy link
Contributor

It appears that using What4 is important to triggering this bug. If I change from a What4-based Z3 to an SBV-based one:

-    weird_ov = llvm_verify(mod, "weird", Contract_weird(), script=ProofScript([UseProver(Z3([]))]))
+    weird_ov = llvm_verify(mod, "weird", Contract_weird(), script=ProofScript([UseProver(Z3_SBV([]))]))

Then the proof succeeds.

@RyanGlScott RyanGlScott added the type: bug Issues reporting bugs or unexpected/unwanted behavior label Jul 8, 2022
@robdockins
Copy link
Contributor

The handling of infinite sequences is one of the few places where the code paths for the SAW symbolic simulators differ between What4 and SBV, so that's probably where to start looking.

RyanGlScott added a commit that referenced this issue Dec 3, 2022
SAWCore assumes the invariant that when indexing into an infinite stream using
a symbolic index, the mux tree constructed over the index will test each bit in
big-endian order. This is the responsibility of the `selectV` function found in
`saw-core`, `saw-core-sbv`, and `saw-core-what4`. All of these functions save
for `saw-core-what4`'s `selectV` were upholding this invariant, as
`saw-core-what4`'s was testing each bit in little-endian order, resulting in
the oddities observed in #1703.

This corrects the mistake in `saw-core-what4'`s implementation, fixing #1703.
It also leaves some more documentation to make the fact that each `selectV`
should be proceeding in big-endian order more apparent.
@RyanGlScott
Copy link
Contributor

I happened to figure out what is going on here while investigating #1768. When indexing into an infinite SAWCore bitvector using a symbolic index, SAW will create a giant mux of all possible bits. The topmost ite expression in the mux tree will check the value of the most significant bit, the ite expressions below the topmost one will check the value of the second-most significant bit, and so on. This MSB-first muxing is implemented in the selectV family of functions, which can be found in saw-core:

-- @selectV mux maxValue valueFn v@ treats the vector @v@ as an
-- index, represented as a big-endian list of bits. It does a binary
-- lookup, using @mux@ as an if-then-else operator. If the index is
-- greater than @maxValue@, then it returns @valueFn maxValue@.
selectV :: (b -> a -> a -> a) -> Int -> (Int -> a) -> Vector b -> a
selectV mux maxValue valueFn v = impl len 0
where
len = V.length v
err = panic "selectV: impossible"
impl _ x | x > maxValue || x < 0 = valueFn maxValue
impl 0 x = valueFn x
impl i x = mux (vecIdx err v (len - i)) (impl j (x `setBit` j)) (impl j x) where j = i - 1

In saw-core-sbv:

-- selectV merger maxValue valueFn index returns valueFn v when index has value v
-- if index is greater than maxValue, it returns valueFn maxValue. Use the ite op from merger.
selectV :: (SBool -> b -> b -> b) -> Natural -> (Natural -> b) -> SWord -> b
selectV merger maxValue valueFn vx =
case svAsInteger vx of
Just i
| i >= 0 -> valueFn (fromInteger i)
| otherwise -> panic "selectV" ["expected nonnegative integer", show i]
Nothing -> impl (intSizeOf vx) 0
where
impl _ x | x > maxValue || x < 0 = valueFn maxValue
impl 0 y = valueFn y
impl i y = merger (svTestBit vx j) (impl j (y `setBit` j)) (impl j y) where j = i - 1

And in saw-core-what4:

-- selectV merger maxValue valueFn index returns valueFn v when index has value v
-- if index is greater than maxValue, it returns valueFn maxValue. Use the ite op from merger.
selectV :: forall sym b.
Sym sym =>
sym ->
(SBool sym -> IO b -> IO b -> IO b) -> Natural -> (Natural -> IO b) -> SWord sym -> IO b
selectV sym merger maxValue valueFn vx =
case SW.bvAsUnsignedInteger vx of
Just i -> valueFn (fromInteger i :: Natural)
Nothing -> impl (swBvWidth vx) 0
where
impl :: Int -> Natural -> IO b
impl _ x | x > maxValue || x < 0 = valueFn maxValue
impl 0 y = valueFn y
impl i y = do
p <- SW.bvAtBE sym vx (toInteger j)
merger p (impl j (y `setBit` j)) (impl j y) where j = i - 1

While comparing these functions, I realized that saw-core-what4's selectV is the odd one out in that it is actually constructing the mux tree LSB first, not MSB first! The value of i in impl starts at the bit width and then counts down with each iteration of impl. In each iteration, i - 1 is passed to bvAtBE, which uses the convention that index 0 corresponds to the MSB. As a result, impl's first iteration will compute the least significant bit, impl's second iteration will compute the second-least significant bit, and so on.

This is the opposite order of the other two selectVs. saw-core's selectV also assumes a big-endian order when looking up bits, but it counts up in each iteration of impl rather than counting down. saw-core-sbv's selectV counts down in each iteration of impl, but it assumes a little-endian order when looking up bits, so it effectively builds the mux tree in MSB-first order.

I believe this also explains why your example happened to work when the index is a palindrome, as those are precisely the conditions under which MSB-first order would be equivalent to LSB-first order.

A change that fixes the issue is therefore:

diff --git a/saw-core-what4/src/Verifier/SAW/Simulator/What4.hs b/saw-core-what4/src/Verifier/SAW/Simulator/What4.hs
index e5789d85..b4de3b96 100644
--- a/saw-core-what4/src/Verifier/SAW/Simulator/What4.hs
+++ b/saw-core-what4/src/Verifier/SAW/Simulator/What4.hs
@@ -595,7 +595,7 @@ selectV sym merger maxValue valueFn vx =
     impl _ x | x > maxValue || x < 0 = valueFn maxValue
     impl 0 y = valueFn y
     impl i y = do
-      p <- SW.bvAtBE sym vx (toInteger j)
+      p <- SW.bvAtLE sym vx (toInteger j)
       merger p (impl j (y `setBit` j)) (impl j y) where j = i - 1
 
 instance Show (SArray sym) where

I've included this in #1773, along with some documentation on the selectV functions to make it clearer how they work.

RyanGlScott added a commit that referenced this issue Dec 6, 2022
SAWCore assumes the invariant that when indexing into an infinite stream using
a symbolic index, the mux tree constructed over the index will test each bit in
big-endian order. This is the responsibility of the `selectV` function found in
`saw-core`, `saw-core-sbv`, and `saw-core-what4`. All of these functions save
for `saw-core-what4`'s `selectV` were upholding this invariant, as
`saw-core-what4`'s was testing each bit in little-endian order, resulting in
the oddities observed in #1703.

This corrects the mistake in `saw-core-what4'`s implementation, fixing #1703.
It also leaves some more documentation to make the fact that each `selectV`
should be proceeding in big-endian order more apparent.
@RyanGlScott
Copy link
Contributor

Fixed in #1773.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Issues reporting bugs or unexpected/unwanted behavior
Projects
None yet
Development

No branches or pull requests

4 participants