Skip to content

Commit 2ef3145

Browse files
committed
Add llvm_points_to_bitfield and enable_lax_loads_and_stores
This adds support for writing specifications that talk about bitfields in LLVM code by way of the new `llvm_points_to_bitfield` command. Broadly speaking, `llvm_points_to_bitfield ptr fieldName rhs` is like `llvm_points_to (llvm_field ptr fieldName) rhs`, except that `fieldName` is required to be the name of a field within a bitfield. The salient details are: * LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's debug metadata does. Support for retrieving bitfield-related metadata was added to `llvm-pretty` in GaloisInc/llvm-pretty#90, so this patch bumps the `llvm-pretty` submodule to incorporate it. This patch also updates the `crucible` submodule to incorporate corresponding changes in GaloisInc/crucible#936. * The `LLVMPointsTo` data type now has a new `LLVMPointsToBitfield` data constructor that stores all of the necessary information related to the `llvm_points_to_bitfield` command. As a result, the changes in this patch are fairly insulated from the rest of SAW, as most of the new code involves adding additional cases to handle `LLVMPointsToBitfield`. * Two of the key new functions are `storePointsToBitfieldValue` and `matchPointsToBitfieldValue`, which implement the behavior of `llvm_points_to_bitfield` in pre- and post-conditions. These functions implement the necessary bit-twiddling to store values in and retrieve values out of bitfield. I have left extensive comments in each function describing how all of this works. * Accompanying `llvm_points_to_bitfield` is a new set of `{enable,disable}_lax_loads_and_stores` command, which toggles the Crucible-side option of the same name. When `enable_lax_loads_and_stores` is on, reading from uninitialized memory will return a symbolic value rather than failing outright. This is essential to be able to deal with LLVM bitcode involving bitfields, as reading a field from a bitfield involves reading the entire bitfield at once, which may include parts of the struct that have not been initialized yet. * There are various `test_bitfield_*` test cases under `intTests` to test examples of bitfield-related specifications that should and should not verify. * I have also updated `saw-remote-api` and `saw-client` to handle bitfields as well, along with a Python-specific test case. Fixes #1461.
1 parent a07c01c commit 2ef3145

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

51 files changed

+1615
-98
lines changed

CHANGES.md

+8
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,14 @@
1717
* New command `w4_unint_z3_using` like `w4_unint_z3`, but use the given Z3
1818
tactic.
1919

20+
* A new `llvm_points_to_bitfield` command has been introduced, providing a
21+
version of `llvm_points_to` that is specifically tailored for structs
22+
containing bitfields. In order to use `llvm_points_to_bitfield`, one must
23+
also use the new `enable_lax_loads_and_stores` command, which relaxes some
24+
of Crucible's assumptions about reading from uninitialized memory. (This
25+
command also comes with a corresponding `disable_lax_loads_and_stores`
26+
command.) For more details on how each of these commands should be used,
27+
consult the "Bitfields" section of the SAW manual.
2028

2129
# Version 0.9
2230

deps/llvm-pretty

doc/manual/manual.md

+97
Original file line numberDiff line numberDiff line change
@@ -1212,6 +1212,31 @@ Ultimately, we plan to implement a more generic tactic that leaves
12121212
certain constants uninterpreted in whatever prover is ultimately used
12131213
(provided that uninterpreted functions are expressible in the prover).
12141214

1215+
Note that each of the `unint_*` tactics have variants that are prefixed
1216+
with `sbv_` and `w4_`. The `sbv_`-prefixed tactics make use of the SBV
1217+
library to represent and solve SMT queries:
1218+
1219+
* `sbv_unint_cvc4 : [String] -> ProofScript ()`
1220+
1221+
* `sbv_unint_yices : [String] -> ProofScript ()`
1222+
1223+
* `sbv_unint_z3 : [String] -> ProofScript ()`
1224+
1225+
The `w4_`-prefixed tactics make use of the What4 library instead of SBV:
1226+
1227+
* `w4_unint_cvc4 : [String] -> ProofScript ()`
1228+
1229+
* `w4_unint_yices : [String] -> ProofScript ()`
1230+
1231+
* `w4_unint_z3 : [String] -> ProofScript ()`
1232+
1233+
In most specifications, the choice of SBV versus What4 is not important, as
1234+
both libraries are broadly compatible in terms of functionality. There are some
1235+
situations where one library may outpeform the other, however, due to
1236+
differences in how each library represents certain SMT queries. There are also
1237+
some experimental features that are only supported with What4 at the moment,
1238+
such as `enable_lax_loads_and_stores`.
1239+
12151240
## Other External Provers
12161241

12171242
In addition to the built-in automated provers already discussed, SAW
@@ -2218,6 +2243,78 @@ the value of an array element.
22182243
* `jvm_field_is : JVMValue -> String -> JVMValue -> JVMSetup ()`
22192244
specifies the name of an object field.
22202245

2246+
### Bitfields
2247+
2248+
SAW has experimental support for specifying `struct`s with bitfields, such as
2249+
in the following example:
2250+
2251+
~~~~ .c
2252+
struct s {
2253+
uint8_t x:1;
2254+
uint8_t y:1;
2255+
};
2256+
~~~~
2257+
2258+
Normally, a `struct` with two `uint8_t` fields would have an overall size of
2259+
two bytes. However, because the `x` and `y` fields are declared with bitfield
2260+
syntax, they are instead packed together into a single byte.
2261+
2262+
Because bitfields have somewhat unusual memory representations in LLVM, some
2263+
special care is required to write SAW specifications involving bitfields. For
2264+
this reason, there is a dedicated `llvm_points_to_bitfield` function for this
2265+
purpose:
2266+
2267+
* `llvm_points_to_bitfield : SetupValue -> String -> SetupValue -> CrucibleSetup ()`
2268+
2269+
The type of `llvm_points_to_bitfield` is similar that of `llvm_points_to`,
2270+
except that it takes the name of a field within a bitfield as an additional
2271+
argument. For example, here is how to assert that the `y` field in the `struct`
2272+
example above should be `0`:
2273+
2274+
~~~~
2275+
ss <- llvm_alloc (llvm_alias "struct.s");
2276+
llvm_points_to_bitfield ss "y" (llvm_term {{ 0 : [1] }});
2277+
~~~~
2278+
2279+
Note that the type of the right-hand side value (`0`, in this example) must
2280+
be a bitvector whose length is equal to the size of the field within the
2281+
bitfield. In this example, the `y` field was declared as `y:1`, so `y`'s value
2282+
must be of type `[1]`.
2283+
2284+
Note that the following specification is _not_ equivalent to the one above:
2285+
2286+
~~~~
2287+
ss <- llvm_alloc (llvm_alias "struct.s");
2288+
llvm_points_to (llvm_field ss "y") (llvm_term {{ 0 : [1] }});
2289+
~~~~
2290+
2291+
`llvm_points_to` works quite differently from `llvm_points_to_bitfield` under
2292+
the hood, so using `llvm_points_to` on bitfields will almost certainly not work
2293+
as expected.
2294+
2295+
In order to use `llvm_points_to_bitfield`, one must also use the
2296+
`enable_lax_loads_and_stores` command:
2297+
2298+
* `enable_lax_loads_and_stores: TopLevel ()`
2299+
2300+
Both `llvm_points_to_bitfield` and `enable_lax_loads_and_stores` are
2301+
experimental commands, so these also require using `enable_experimental` before
2302+
they can be used.
2303+
2304+
The `enable_lax_loads_and_stores` command relaxes some
2305+
of SAW's assumptions about uninitialized memory, which is necessary to make
2306+
`llvm_points_to_bitfield` work under the hood. For example, reading from
2307+
uninitialized memory normally results in an error in SAW, but with
2308+
`enable_lax_loads_and_stores`, such a read will instead return a symbolic
2309+
value. At present, `enable_lax_loads_and_stores` only works with What4-based
2310+
tactics (e.g., `w4_unint_z3`); using it with SBV-based tactics
2311+
(e.g., `sbv_unint_z3`) will result in an error.
2312+
2313+
Note that SAW relies on LLVM debug metadata in order to determine which struct
2314+
fields reside within a bitfield. As a result, you must pass `-g` to `clang` when
2315+
compiling code involving bitfields in order for SAW to be able to reason about
2316+
them.
2317+
22212318
## Global variables
22222319

22232320
Mutable global variables that are accessed in a function must first be allocated

intTests/test_bitfield_basic/Makefile

+11
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
CC = clang
2+
CFLAGS = -g -emit-llvm -frecord-command-line -O0
3+
4+
all: test.bc
5+
6+
test.bc: test.c
7+
$(CC) $(CFLAGS) -c $< -o $@
8+
9+
.PHONY: clean
10+
clean:
11+
rm -f test.bc

intTests/test_bitfield_basic/test.bc

4.72 KB
Binary file not shown.

intTests/test_bitfield_basic/test.c

+34
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
#include <stdbool.h>
2+
#include <stdint.h>
3+
4+
struct s {
5+
int32_t w;
6+
uint8_t x1:1;
7+
uint8_t x2:2;
8+
uint8_t y:1;
9+
int32_t z;
10+
};
11+
12+
uint8_t get_x2(struct s *ss) {
13+
return ss->x2;
14+
}
15+
16+
bool get_y(struct s *ss) {
17+
return ss->y;
18+
}
19+
20+
void set_x2(struct s *ss, uint8_t x2) {
21+
ss->x2 = x2;
22+
}
23+
24+
void set_y(struct s *ss, bool y) {
25+
ss->y = y;
26+
}
27+
28+
void set_x2_alt(struct s *ss, uint8_t x2) {
29+
set_x2(ss, x2);
30+
}
31+
32+
void set_y_alt(struct s *ss, bool y) {
33+
set_y(ss, y);
34+
}

intTests/test_bitfield_basic/test.saw

+55
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
enable_experimental;
2+
enable_lax_loads_and_stores;
3+
4+
let get_x2_spec = do {
5+
ss <- llvm_alloc (llvm_alias "struct.s");
6+
z <- llvm_fresh_var "z" (llvm_int 2);
7+
llvm_points_to_bitfield ss "x2" (llvm_term z);
8+
llvm_execute_func [ss];
9+
llvm_return (llvm_term {{ zext z : [8] }});
10+
};
11+
12+
let get_y_spec = do {
13+
ss <- llvm_alloc (llvm_alias "struct.s");
14+
z <- llvm_fresh_var "z" (llvm_int 1);
15+
llvm_points_to_bitfield ss "y" (llvm_term z);
16+
llvm_execute_func [ss];
17+
llvm_return (llvm_term z);
18+
};
19+
20+
let set_x2_spec = do {
21+
ss <- llvm_alloc (llvm_alias "struct.s");
22+
z <- llvm_fresh_var "z" (llvm_int 8);
23+
llvm_execute_func [ss, llvm_term z];
24+
llvm_points_to_bitfield ss "x2" (llvm_term {{ drop z : [2] }});
25+
};
26+
27+
let set_x2_alt_spec = do {
28+
ss <- llvm_alloc (llvm_alias "struct.s");
29+
z <- llvm_fresh_var "z" (llvm_int 2);
30+
llvm_execute_func [ss, llvm_term {{ zext z : [8] }}];
31+
llvm_points_to_bitfield ss "x2" (llvm_term z);
32+
};
33+
34+
let set_y_spec = do {
35+
ss <- llvm_alloc (llvm_alias "struct.s");
36+
z <- llvm_fresh_var "z" (llvm_int 1);
37+
llvm_execute_func [ss, llvm_term z];
38+
llvm_points_to_bitfield ss "y" (llvm_term z);
39+
};
40+
41+
let set_y_alt_spec = set_y_spec;
42+
43+
m <- llvm_load_module "test.bc";
44+
45+
llvm_verify m "get_x2" [] false get_x2_spec (w4_unint_z3 []);
46+
llvm_verify m "get_y" [] false get_y_spec (w4_unint_z3 []);
47+
llvm_verify m "set_x2" [] false set_x2_spec (w4_unint_z3 []);
48+
llvm_verify m "set_x2_alt" [] false set_x2_alt_spec (w4_unint_z3 []);
49+
llvm_verify m "set_y" [] false set_y_spec (w4_unint_z3 []);
50+
llvm_verify m "set_y_alt" [] false set_y_alt_spec (w4_unint_z3 []);
51+
52+
set_x2_ov <- llvm_unsafe_assume_spec m "set_x2" set_x2_spec;
53+
llvm_verify m "set_x2_alt" [set_x2_ov] false set_x2_alt_spec (w4_unint_z3 []);
54+
set_y_ov <- llvm_unsafe_assume_spec m "set_y" set_y_spec;
55+
llvm_verify m "set_y_alt" [set_y_ov] false set_y_alt_spec (w4_unint_z3 []);

intTests/test_bitfield_basic/test.sh

+5
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
#!/usr/bin/env bash
2+
3+
set -e
4+
5+
$SAW test.saw
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
CC = clang
2+
CFLAGS = -g -emit-llvm -frecord-command-line -O0
3+
4+
all: test.bc
5+
6+
test.bc: test.c
7+
$(CC) $(CFLAGS) -c $< -o $@
8+
9+
.PHONY: clean
10+
clean:
11+
rm -f test.bc
Binary file not shown.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
#include <stdbool.h>
2+
#include <stdint.h>
3+
4+
struct s {
5+
int32_t w;
6+
uint8_t x1:1;
7+
uint8_t x2:2;
8+
uint8_t y:1;
9+
int32_t z;
10+
};
11+
12+
bool get_y(struct s *ss) {
13+
return ss->y;
14+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
enable_experimental;
2+
enable_lax_loads_and_stores;
3+
4+
let get_y_spec = do {
5+
ss <- llvm_alloc (llvm_alias "struct.s");
6+
z <- llvm_fresh_var "z" (llvm_int 1);
7+
// Duplicate llvm_points_to_bitfield statements involving `y`
8+
llvm_points_to_bitfield ss "y" (llvm_term z);
9+
llvm_points_to_bitfield ss "y" (llvm_term z);
10+
llvm_execute_func [ss];
11+
llvm_return (llvm_term z);
12+
};
13+
14+
m <- llvm_load_module "test.bc";
15+
fails (llvm_verify m "get_y" [] false get_y_spec (w4_unint_z3 []));
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
#!/usr/bin/env bash
2+
3+
set -e
4+
5+
$SAW test.saw
+11
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
CC = clang
2+
CFLAGS = -g -emit-llvm -frecord-command-line -O0
3+
4+
all: test.bc
5+
6+
test.bc: test.c
7+
$(CC) $(CFLAGS) -c $< -o $@
8+
9+
.PHONY: clean
10+
clean:
11+
rm -f test.bc
4.73 KB
Binary file not shown.
+34
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
1+
#include <stdbool.h>
2+
#include <stdint.h>
3+
4+
struct s {
5+
int32_t w;
6+
uint8_t x1:1;
7+
uint8_t x2:2;
8+
uint8_t y:1;
9+
int32_t z;
10+
};
11+
12+
uint8_t get_x2(struct s *ss) {
13+
return ss->x2;
14+
}
15+
16+
bool get_y(struct s *ss) {
17+
return ss->y;
18+
}
19+
20+
void set_x2(struct s *ss, uint8_t x2) {
21+
ss->x2 = x2;
22+
}
23+
24+
void set_y(struct s *ss, bool y) {
25+
ss->y = y;
26+
}
27+
28+
void set_x2_alt(struct s *ss, uint8_t x2) {
29+
set_x2(ss, x2);
30+
}
31+
32+
void set_y_alt(struct s *ss, bool y) {
33+
set_y(ss, y);
34+
}
+55
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,55 @@
1+
enable_experimental;
2+
enable_lax_loads_and_stores;
3+
enable_smt_array_memory_model;
4+
5+
let get_x2_spec = do {
6+
ss <- llvm_alloc (llvm_alias "struct.s");
7+
z <- llvm_fresh_var "z" (llvm_int 2);
8+
llvm_points_to_bitfield ss "x2" (llvm_term z);
9+
llvm_execute_func [ss];
10+
llvm_return (llvm_term {{ zext z : [8] }});
11+
};
12+
13+
let get_y_spec = do {
14+
ss <- llvm_alloc (llvm_alias "struct.s");
15+
z <- llvm_fresh_var "z" (llvm_int 1);
16+
llvm_points_to_bitfield ss "y" (llvm_term z);
17+
llvm_execute_func [ss];
18+
llvm_return (llvm_term z);
19+
};
20+
21+
let set_x2_spec = do {
22+
ss <- llvm_alloc (llvm_alias "struct.s");
23+
z <- llvm_fresh_var "z" (llvm_int 8);
24+
llvm_execute_func [ss, llvm_term z];
25+
llvm_points_to_bitfield ss "x2" (llvm_term {{ drop z : [2] }});
26+
};
27+
28+
let set_x2_alt_spec = do {
29+
ss <- llvm_alloc (llvm_alias "struct.s");
30+
z <- llvm_fresh_var "z" (llvm_int 2);
31+
llvm_execute_func [ss, llvm_term {{ zext z : [8] }}];
32+
llvm_points_to_bitfield ss "x2" (llvm_term z);
33+
};
34+
35+
let set_y_spec = do {
36+
ss <- llvm_alloc (llvm_alias "struct.s");
37+
z <- llvm_fresh_var "z" (llvm_int 1);
38+
llvm_execute_func [ss, llvm_term z];
39+
llvm_points_to_bitfield ss "y" (llvm_term z);
40+
};
41+
42+
let set_y_alt_spec = set_y_spec;
43+
44+
m <- llvm_load_module "test.bc";
45+
46+
llvm_verify m "get_x2" [] false get_x2_spec (w4_unint_z3 []);
47+
llvm_verify m "get_y" [] false get_y_spec (w4_unint_z3 []);
48+
llvm_verify m "set_x2" [] false set_x2_spec (w4_unint_z3 []);
49+
llvm_verify m "set_x2_alt" [] false set_x2_alt_spec (w4_unint_z3 []);
50+
llvm_verify m "set_y" [] false set_y_spec (w4_unint_z3 []);
51+
52+
set_x2_ov <- llvm_unsafe_assume_spec m "set_x2" set_x2_spec;
53+
llvm_verify m "set_x2_alt" [set_x2_ov] false set_x2_alt_spec (w4_unint_z3 []);
54+
set_y_ov <- llvm_unsafe_assume_spec m "set_y" set_y_spec;
55+
llvm_verify m "set_y_alt" [set_y_ov] false set_y_alt_spec (w4_unint_z3 []);

0 commit comments

Comments
 (0)