`w4_abc_aiger` proves `False` #1938

qsctr · 2023-09-15T01:48:36Z

For the following C program

void test(int *x) {
    *x = 2;
}

and the following SAW script

m <- llvm_load_module "test4.bc";

let setup = do {
  x <- llvm_alloc_readonly (llvm_int 32);
  llvm_execute_func [x];
};

llvm_verify m "test" [] true setup abc;

an error should be reported because the code writes to read-only memory. However SAW instead outputs that the proof is successful:

[01:33:31.665] Loading file "/home/bretton/Documents/github/saw-script/test/test4.saw"
[01:33:31.671] Verifying test ...
[01:33:31.683] Simulating test ...
[01:33:31.687] Checking proof obligations test ...
[01:33:31.738] Proof succeeded! test

This issue seems to only occur with the abc/w4_abc_aiger backend. The z3, w4_abc_smtlib2, w4_abc_verilog, and sbv_abc backends correctly report an error. @andreistefanescu suggests that the problem is in the translation to AIGER.

The text was updated successfully, but these errors were encountered:

RyanGlScott · 2023-09-15T11:52:19Z

If you print out the proof goal that is discharged to the solver:

llvm_verify m "test" [] true setup (do { print_goal; abc; });

You'll get:

$ ~/Software/saw-1.0/bin/saw test.saw



[11:48:06.151] Loading file "/home/ryanglscott/Documents/Hacking/SAW/test.saw"
[11:48:06.154] Verifying test ...
[11:48:06.164] Simulating test ...
[11:48:06.165] Checking proof obligations test ...
[11:48:06.165] Goal test (goal number 0): safety assertion
at test.c:2:8 in test
test.c:2:8: error: in test
Memory store failed
Details:
  The region wasn't allocated, wasn't large enough, or was marked as readonly

EqTrue (implies True (implies True False))

[11:48:06.187] Proof succeeded! test

implies True (implies True False) definitely shouldn't be provable, as this is equivalent to False. In fact, if we try to prove False directly with abc:

prove_print abc {{ False }};

It succeeds without any errors:

$ ~/Software/saw-1.0/bin/saw wat.saw



[11:49:39.231] Loading file "/home/ryanglscott/Documents/Hacking/SAW/wat.saw"

Proving False with z3 instead of abc, on the other hand, fails as expected:

$ ~/Software/saw-1.0/bin/saw wat.saw



[11:50:17.964] Loading file "/home/ryanglscott/Documents/Hacking/SAW/wat.saw"
[11:50:18.000] Stack trace:
"prove_print" (/home/ryanglscott/Documents/Hacking/SAW/wat.saw:1:1-1:12)
prove: 1 unsolved subgoal(s)
Invalid: []

abc seems quite broken in this regard.

RyanGlScott · 2023-09-18T19:57:39Z

I think I see what is going on.

When abc produces a counter-example to a property, it writes it to a file named something like foo.cex. For example, abc is able to find a counterexample to the property \(x : [8]) -> ~ (x == x), and when it does, it prints the following to foo.cex:

11111111

Where each 1 represents a single bit of x. SAW will invoke abc and then check foo.cex to see if it contains a counterexample. Straightforward enough, it seems.

Things get weird when the property doesn't have any variables, however. abc actually does find a counterexample for the False property, but it proceeds to write an empty file to foo.cex. With the way SAW is currently set up, it cannot distinguish between an empty file resulting from a valid property versus an empty file resulting from an invalid property's counterexample. As such, it assumes that an empty file always means the property is valid, which definitely isn't the case for False!

I can see two possible ways of rectifying this:

Make sure that foo.cex does not exist prior to invoking abc, and after running abc, check to see if foo.cex was created as a side effect of running abc. If foo.cex doesn't exist, then conclude that the property is valid, as no counterexample was written. If foo.cex does exist, then conclude that the property is invalid, and further conclude that an empty foo.cex file means that the counterexample involves no variables.
Check the standard output from abc after invoking it, which includes a SATISFIABLE or UNSATISFIABLE message to indicate the results. (Note that in this context, UNSATISFIABLE means that the proof is valid, since SAW sends the logical negation of a SATQuery to an abc-based proof.)

Option (1) sounds slightly fiddly, so I'm inclined to try option (2) first.

RyanGlScott · 2023-09-19T17:42:18Z

w4_abc_verilog also has trouble coping with the False property for similar reasons:

prove_print w4_abc_verilog {{ False }};

$ ~/Software/saw-1.0/bin/saw bug.saw



[17:38:44.024] Loading file "/home/ryanscott/Documents/Hacking/Haskell/saw-script/bug.saw"
[17:38:44.083] Stack trace:
"prove_print" (/home/ryanscott/Documents/Hacking/Haskell/saw-script/bug.saw:1:1-1:12)
"w4_abc_verilog" (/home/ryanscott/Documents/Hacking/Haskell/saw-script/bug.saw:1:13-1:27)
invalid counterexample line:   1

The invalid counterexample line: 1 error arises due to the output line not listing any values for bits (which there are none). What's more, the output from the particular abc commands that w4_abc_verilog uses does not use the UNSATISFIABLE/SATISFIABLE convention. Instead, it prints something like:

$ abc -q "%read wat.v; %blast; &sweep -C 5000; &syn4; &cec -m; write_aiger_cex wat.cex;"
Assuming the current network is a single-output miter.
Networks are NOT EQUIVALENT. Output 0 trivially differs (different phase).  Time =     0.00 sec

Or:

$ abc -q "%read wat.v; %blast; &sweep -C 5000; &syn4; &cec -m; write_aiger_cex wat.cex;"
Assuming the current network is a single-output miter.
Networks are equivalent.  Time =     0.00 sec
There is no current CEX.

This makes me less enthusiastic about the idea of parsing abc's standard output, i.e., option (2). Luckily, option (1) is still viable.

Previously, any proofs involving the `w4_abc_aiger` (a.k.a., `abc`) or `w4_abc_verilog` proof scripts would succeed if they did not involve any variables, even false properties (e.g., `False`). This happened for a very silly reason: the counterexamples that the `abc` would generate contained a blank output (since there are no variables to describe), and SAW was misinterpreting this as a successful proof. Oops! With this patch, SAW now properly distinguishes between an successful proof (in which case no counterexample file will be generated) and a unsuccessful proof involving no variables (in which case a blank counterexample file will be generated). This is admittedly a bit fiddly, as it requires making some assumptions about the format of the counterexample files that `abc` produces. Nevertheless, this does work on all the examples that I have tried. Fixes #1938.

qsctr added the type: bug Issues reporting bugs or unexpected/unwanted behavior label Sep 15, 2023

qsctr changed the title ~~w4_abc_aiger backend not reporting readonly violation of LLVM code~~ w4_abc_aiger proves False Sep 15, 2023

qsctr added the unsoundness Issues that can lead to unsoundness or false verification label Sep 15, 2023

RyanGlScott self-assigned this Sep 19, 2023

RyanGlScott mentioned this issue Sep 19, 2023

w4_abc_{aiger,verilog}: Handle variable-less properties correctly #1943

Merged

mergify bot closed this as completed in #1943 Sep 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`w4_abc_aiger` proves `False` #1938

`w4_abc_aiger` proves `False` #1938

qsctr commented Sep 15, 2023

RyanGlScott commented Sep 15, 2023 •

edited

Loading

RyanGlScott commented Sep 18, 2023

RyanGlScott commented Sep 19, 2023

w4_abc_aiger proves False #1938

w4_abc_aiger proves False #1938

Comments

qsctr commented Sep 15, 2023

RyanGlScott commented Sep 15, 2023 • edited Loading

RyanGlScott commented Sep 18, 2023

RyanGlScott commented Sep 19, 2023

`w4_abc_aiger` proves `False` #1938

`w4_abc_aiger` proves `False` #1938

RyanGlScott commented Sep 15, 2023 •

edited

Loading