Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

w4_abc_aiger proves False #1938

Closed
qsctr opened this issue Sep 15, 2023 · 3 comments · Fixed by #1943
Closed

w4_abc_aiger proves False #1938

qsctr opened this issue Sep 15, 2023 · 3 comments · Fixed by #1943
Assignees
Labels
type: bug Issues reporting bugs or unexpected/unwanted behavior unsoundness Issues that can lead to unsoundness or false verification

Comments

@qsctr
Copy link
Contributor

qsctr commented Sep 15, 2023

For the following C program

void test(int *x) {
    *x = 2;
}

and the following SAW script

m <- llvm_load_module "test4.bc";

let setup = do {
  x <- llvm_alloc_readonly (llvm_int 32);
  llvm_execute_func [x];
};

llvm_verify m "test" [] true setup abc;

an error should be reported because the code writes to read-only memory. However SAW instead outputs that the proof is successful:

[01:33:31.665] Loading file "/home/bretton/Documents/github/saw-script/test/test4.saw"
[01:33:31.671] Verifying test ...
[01:33:31.683] Simulating test ...
[01:33:31.687] Checking proof obligations test ...
[01:33:31.738] Proof succeeded! test

This issue seems to only occur with the abc/w4_abc_aiger backend. The z3, w4_abc_smtlib2, w4_abc_verilog, and sbv_abc backends correctly report an error. @andreistefanescu suggests that the problem is in the translation to AIGER.

@qsctr qsctr added the type: bug Issues reporting bugs or unexpected/unwanted behavior label Sep 15, 2023
@RyanGlScott
Copy link
Contributor

RyanGlScott commented Sep 15, 2023

If you print out the proof goal that is discharged to the solver:

llvm_verify m "test" [] true setup (do { print_goal; abc; });

You'll get:

$ ~/Software/saw-1.0/bin/saw test.saw



[11:48:06.151] Loading file "/home/ryanglscott/Documents/Hacking/SAW/test.saw"
[11:48:06.154] Verifying test ...
[11:48:06.164] Simulating test ...
[11:48:06.165] Checking proof obligations test ...
[11:48:06.165] Goal test (goal number 0): safety assertion
at test.c:2:8 in test
test.c:2:8: error: in test
Memory store failed
Details:
  The region wasn't allocated, wasn't large enough, or was marked as readonly

EqTrue (implies True (implies True False))

[11:48:06.187] Proof succeeded! test

implies True (implies True False) definitely shouldn't be provable, as this is equivalent to False. In fact, if we try to prove False directly with abc:

prove_print abc {{ False }};

It succeeds without any errors:

$ ~/Software/saw-1.0/bin/saw wat.saw



[11:49:39.231] Loading file "/home/ryanglscott/Documents/Hacking/SAW/wat.saw"

Proving False with z3 instead of abc, on the other hand, fails as expected:

$ ~/Software/saw-1.0/bin/saw wat.saw



[11:50:17.964] Loading file "/home/ryanglscott/Documents/Hacking/SAW/wat.saw"
[11:50:18.000] Stack trace:
"prove_print" (/home/ryanglscott/Documents/Hacking/SAW/wat.saw:1:1-1:12)
prove: 1 unsolved subgoal(s)
Invalid: []

abc seems quite broken in this regard.

@qsctr qsctr changed the title w4_abc_aiger backend not reporting readonly violation of LLVM code w4_abc_aiger proves False Sep 15, 2023
@qsctr qsctr added the unsoundness Issues that can lead to unsoundness or false verification label Sep 15, 2023
@RyanGlScott
Copy link
Contributor

I think I see what is going on.

When abc produces a counter-example to a property, it writes it to a file named something like foo.cex. For example, abc is able to find a counterexample to the property \(x : [8]) -> ~ (x == x), and when it does, it prints the following to foo.cex:

11111111

Where each 1 represents a single bit of x. SAW will invoke abc and then check foo.cex to see if it contains a counterexample. Straightforward enough, it seems.

Things get weird when the property doesn't have any variables, however. abc actually does find a counterexample for the False property, but it proceeds to write an empty file to foo.cex. With the way SAW is currently set up, it cannot distinguish between an empty file resulting from a valid property versus an empty file resulting from an invalid property's counterexample. As such, it assumes that an empty file always means the property is valid, which definitely isn't the case for False!

I can see two possible ways of rectifying this:

  1. Make sure that foo.cex does not exist prior to invoking abc, and after running abc, check to see if foo.cex was created as a side effect of running abc. If foo.cex doesn't exist, then conclude that the property is valid, as no counterexample was written. If foo.cex does exist, then conclude that the property is invalid, and further conclude that an empty foo.cex file means that the counterexample involves no variables.
  2. Check the standard output from abc after invoking it, which includes a SATISFIABLE or UNSATISFIABLE message to indicate the results. (Note that in this context, UNSATISFIABLE means that the proof is valid, since SAW sends the logical negation of a SATQuery to an abc-based proof.)

Option (1) sounds slightly fiddly, so I'm inclined to try option (2) first.

@RyanGlScott
Copy link
Contributor

w4_abc_verilog also has trouble coping with the False property for similar reasons:

prove_print w4_abc_verilog {{ False }};
$ ~/Software/saw-1.0/bin/saw bug.saw



[17:38:44.024] Loading file "/home/ryanscott/Documents/Hacking/Haskell/saw-script/bug.saw"
[17:38:44.083] Stack trace:
"prove_print" (/home/ryanscott/Documents/Hacking/Haskell/saw-script/bug.saw:1:1-1:12)
"w4_abc_verilog" (/home/ryanscott/Documents/Hacking/Haskell/saw-script/bug.saw:1:13-1:27)
invalid counterexample line:   1 

The invalid counterexample line: 1 error arises due to the output line not listing any values for bits (which there are none). What's more, the output from the particular abc commands that w4_abc_verilog uses does not use the UNSATISFIABLE/SATISFIABLE convention. Instead, it prints something like:

$ abc -q "%read wat.v; %blast; &sweep -C 5000; &syn4; &cec -m; write_aiger_cex wat.cex;"
Assuming the current network is a single-output miter.
Networks are NOT EQUIVALENT. Output 0 trivially differs (different phase).  Time =     0.00 sec

Or:

$ abc -q "%read wat.v; %blast; &sweep -C 5000; &syn4; &cec -m; write_aiger_cex wat.cex;"
Assuming the current network is a single-output miter.
Networks are equivalent.  Time =     0.00 sec
There is no current CEX.

This makes me less enthusiastic about the idea of parsing abc's standard output, i.e., option (2). Luckily, option (1) is still viable.

RyanGlScott added a commit that referenced this issue Sep 19, 2023
Previously, any proofs involving the `w4_abc_aiger` (a.k.a., `abc`) or
`w4_abc_verilog` proof scripts would succeed if they did not involve any
variables, even false properties (e.g., `False`). This happened for a very
silly reason: the counterexamples that the `abc` would generate contained a
blank output (since there are no variables to describe), and SAW was
misinterpreting this as a successful proof. Oops!

With this patch, SAW now properly distinguishes between an successful proof (in
which case no counterexample file will be generated) and a unsuccessful proof
involving no variables (in which case a blank counterexample file will be
generated). This is admittedly a bit fiddly, as it requires making some
assumptions about the format of the counterexample files that `abc` produces.
Nevertheless, this does work on all the examples that I have tried.

Fixes #1938.
@RyanGlScott RyanGlScott self-assigned this Sep 19, 2023
@mergify mergify bot closed this as completed in #1943 Sep 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Issues reporting bugs or unexpected/unwanted behavior unsoundness Issues that can lead to unsoundness or false verification
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants