Skip to content

Commit

Permalink
Draft: Add llvm_points_to_bitfield and enable_lax_loads_and_stores
Browse files Browse the repository at this point in the history
This adds support for writing specifications that talk about bitfields in LLVM
code by way of the new `llvm_points_to_bitfield` command. Broadly speaking,
`llvm_points_to_bitfield ptr fieldName rhs` is like
`llvm_points_to (llvm_field ptr fieldName) rhs`, except that `fieldName` is
required to be the name of a field within a bitfield. The salient details are:

* LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's
  debug metadata does. Support for retrieving bitfield-related metadata was
  added to `llvm-pretty` in GaloisInc/llvm-pretty#90, so this patch bumps the
  `llvm-pretty` submodule to incorporate it. This patch also updates the
  `crucible` submodule to incorporate corresponding changes in
  GaloisInc/crucible#936.

* The `LLVMPointsTo` data type now has a new `LLVMPointsToBitfield` data
  constructor that stores all of the necessary information related to the
  `llvm_points_to_bitfield` command. As a result, the changes in this patch
  are fairly insulated from the rest of SAW, as most of the new code involves
  adding additional cases to handle `LLVMPointsToBitfield`.

* Two of the key new functions are `storePointsToBitfieldValue` and
  `matchPointsToBitfieldValue`, which implement the behavior of
  `llvm_points_to_bitfield` in pre- and post-conditions. These functions
  implement the necessary bit-twiddling to store values in and retrieve values
  out of bitfield. I have left extensive comments in each function describing
  how all of this works.

* Accompanying `llvm_points_to_bitfield` is a new set of
  `{enable,disable}_lax_loads_and_stores` command, which toggles the
  Crucible-side option of the same name. When `enable_lax_loads_and_stores` is
  on, reading from uninitialized memory will return a symbolic value rather
  than failing outright. This is essential to be able to deal with LLVM bitcode
  involving bitfields, as reading a field from a bitfield involves reading the
  entire bitfield at once, which may include parts of the struct that have not
  been initialized yet.

* There are various `test_bitfield_*` test cases under `intTests` to test
  examples of bitfield-related specifications that should and should not
  verify.

* I have also updated `saw-remote-api` and `saw-client` to handle bitfields as
  well, along with a Python-specific test case.

Fixes #1461.
  • Loading branch information
RyanGlScott committed Dec 8, 2021
1 parent f0cc92d commit 3b66a82
Show file tree
Hide file tree
Showing 51 changed files with 1,571 additions and 86 deletions.
8 changes: 8 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,14 @@
* New command `w4_unint_z3_using` like `w4_unint_z3`, but use the given Z3
tactic.

* A new `llvm_points_to_bitfield` command has been introduced, providing a
version of `llvm_points_to` that is specifically tailored for structs
containing bitfields. In order to use `llvm_points_to_bitfield`, one must
also use the new `enable_lax_loads_and_stores` command, which relaxes some
of Crucible's assumptions about reading from uninitialized memory. (This
command also comes with a corresponding `disable_lax_loads_and_stores`
command.) For more details on how each of these commands should be used,
consult the "Bitfields" section of the SAW manual.

# Version 0.9

Expand Down
2 changes: 1 addition & 1 deletion deps/llvm-pretty
97 changes: 97 additions & 0 deletions doc/manual/manual.md
Original file line number Diff line number Diff line change
Expand Up @@ -1212,6 +1212,31 @@ Ultimately, we plan to implement a more generic tactic that leaves
certain constants uninterpreted in whatever prover is ultimately used
(provided that uninterpreted functions are expressible in the prover).

Note that each of the `unint_*` tactics have variants that are prefixed
with `sbv_` and `w4_`. The `sbv_`-prefixed tactics make use of the SBV
library to represent and solve SMT queries:

* `sbv_unint_cvc4 : [String] -> ProofScript ()`

* `sbv_unint_yices : [String] -> ProofScript ()`

* `sbv_unint_z3 : [String] -> ProofScript ()`

The `w4_`-prefixed tactics make use of the What4 library instead of SBV:

* `w4_unint_cvc4 : [String] -> ProofScript ()`

* `w4_unint_yices : [String] -> ProofScript ()`

* `w4_unint_z3 : [String] -> ProofScript ()`

In most specifications, the choice of SBV versus What4 is not important, as
both libraries are broadly compatible in terms of functionality. There are some
situations where one library may outpeform the other, however, due to
differences in how each library represents certain SMT queries. There are also
some experimental features that are only supported with What4 at the moment,
such as `enable_lax_loads_and_stores`.

## Other External Provers

In addition to the built-in automated provers already discussed, SAW
Expand Down Expand Up @@ -2218,6 +2243,78 @@ the value of an array element.
* `jvm_field_is : JVMValue -> String -> JVMValue -> JVMSetup ()`
specifies the name of an object field.

### Bitfields

SAW has experimental support for specifying `struct`s with bitfields, such as
in the following example:

~~~~ .c
struct s {
uint8_t x:1;
uint8_t y:1;
};
~~~~

Normally, a `struct` with two `uint8_t` fields would have an overall size of
two bytes. However, because the `x` and `y` fields are declared with bitfield
syntax, they are instead packed together into a single byte.

Because bitfields have somewhat unusual memory representations in LLVM, some
special care is required to write SAW specifications involving bitfields. For
this reason, there is a dedicated `llvm_points_to_bitfield` function for this
purpose:

* `llvm_points_to_bitfield : SetupValue -> String -> SetupValue -> CrucibleSetup ()`

The type of `llvm_points_to_bitfield` is similar that of `llvm_points_to`,
except that it takes the name of a field within a bitfield as an additional
argument. For example, here is how to assert that the `y` field in the `struct`
example above should be `0`:

~~~~
ss <- llvm_alloc (llvm_alias "struct.s");
llvm_points_to_bitfield ss "y" (llvm_term {{ 0 : [1] }});
~~~~

Note that the type of the right-hand side value (`0`, in this example) must
be a bitvector whose length is equal to the size of the field within the
bitfield. In this example, the `y` field was declared as `y:1`, so `y`'s value
must be of type `[1]`.

Note that the following specification is _not_ equivalent to the one above:

~~~~
ss <- llvm_alloc (llvm_alias "struct.s");
llvm_points_to (llvm_field ss "y") (llvm_term {{ 0 : [1] }});
~~~~

`llvm_points_to` works quite differently from `llvm_points_to_bitfield` under
the hood, so using `llvm_points_to` on bitfields will almost certainly not work
as expected.

In order to use `llvm_points_to_bitfield`, one must also use the
`enable_lax_loads_and_stores` command:

* `enable_lax_loads_and_stores: TopLevel ()`

Both `llvm_points_to_bitfield` and `enable_lax_loads_and_stores` are
experimental commands, so these also require using `enable_experimental` before
they can be used.

The `enable_lax_loads_and_stores` command relaxes some
of SAW's assumptions about uninitialized memory, which is necessary to make
`llvm_points_to_bitfield` work under the hood. For example, reading from
uninitialized memory normally results in an error in SAW, but with
`enable_lax_loads_and_stores`, such a read will instead return a symbolic
value. At present, `enable_lax_loads_and_stores` only works with What4-based
tactics (e.g., `w4_unint_z3`); using it with SBV-based tactics
(e.g., `sbv_unint_z3`) will result in an error.

Note that SAW relies on LLVM debug metadata in order to determine which struct
fields reside within a bitfield. As a result, you must pass `-g` to `clang` when
compiling code involving bitfields in order for SAW to be able to reason about
them.

## Global variables

Mutable global variables that are accessed in a function must first be allocated
Expand Down
11 changes: 11 additions & 0 deletions intTests/test_bitfield_basic/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
CC = clang
CFLAGS = -g -emit-llvm -frecord-command-line -O0

all: test.bc

test.bc: test.c
$(CC) $(CFLAGS) -c $< -o $@

.PHONY: clean
clean:
rm -f test.bc
Binary file added intTests/test_bitfield_basic/test.bc
Binary file not shown.
34 changes: 34 additions & 0 deletions intTests/test_bitfield_basic/test.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
#include <stdbool.h>
#include <stdint.h>

struct s {
int32_t w;
uint8_t x1:1;
uint8_t x2:2;
uint8_t y:1;
int32_t z;
};

uint8_t get_x2(struct s *ss) {
return ss->x2;
}

bool get_y(struct s *ss) {
return ss->y;
}

void set_x2(struct s *ss, uint8_t x2) {
ss->x2 = x2;
}

void set_y(struct s *ss, bool y) {
ss->y = y;
}

void set_x2_alt(struct s *ss, uint8_t x2) {
set_x2(ss, x2);
}

void set_y_alt(struct s *ss, bool y) {
set_y(ss, y);
}
55 changes: 55 additions & 0 deletions intTests/test_bitfield_basic/test.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
enable_experimental;
enable_lax_loads_and_stores;

let get_x2_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 2);
llvm_points_to_bitfield ss "x2" (llvm_term z);
llvm_execute_func [ss];
llvm_return (llvm_term {{ zext z : [8] }});
};

let get_y_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 1);
llvm_points_to_bitfield ss "y" (llvm_term z);
llvm_execute_func [ss];
llvm_return (llvm_term z);
};

let set_x2_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 8);
llvm_execute_func [ss, llvm_term z];
llvm_points_to_bitfield ss "x2" (llvm_term {{ drop z : [2] }});
};

let set_x2_alt_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 2);
llvm_execute_func [ss, llvm_term {{ zext z : [8] }}];
llvm_points_to_bitfield ss "x2" (llvm_term z);
};

let set_y_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 1);
llvm_execute_func [ss, llvm_term z];
llvm_points_to_bitfield ss "y" (llvm_term z);
};

let set_y_alt_spec = set_y_spec;

m <- llvm_load_module "test.bc";

llvm_verify m "get_x2" [] false get_x2_spec (w4_unint_z3 []);
llvm_verify m "get_y" [] false get_y_spec (w4_unint_z3 []);
llvm_verify m "set_x2" [] false set_x2_spec (w4_unint_z3 []);
llvm_verify m "set_x2_alt" [] false set_x2_alt_spec (w4_unint_z3 []);
llvm_verify m "set_y" [] false set_y_spec (w4_unint_z3 []);
llvm_verify m "set_y_alt" [] false set_y_alt_spec (w4_unint_z3 []);

set_x2_ov <- llvm_unsafe_assume_spec m "set_x2" set_x2_spec;
llvm_verify m "set_x2_alt" [set_x2_ov] false set_x2_alt_spec (w4_unint_z3 []);
set_y_ov <- llvm_unsafe_assume_spec m "set_y" set_y_spec;
llvm_verify m "set_y_alt" [set_y_ov] false set_y_alt_spec (w4_unint_z3 []);
5 changes: 5 additions & 0 deletions intTests/test_bitfield_basic/test.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
#!/usr/bin/env bash

set -e

$SAW test.saw
11 changes: 11 additions & 0 deletions intTests/test_bitfield_duplicate_points_to/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
CC = clang
CFLAGS = -g -emit-llvm -frecord-command-line -O0

all: test.bc

test.bc: test.c
$(CC) $(CFLAGS) -c $< -o $@

.PHONY: clean
clean:
rm -f test.bc
Binary file not shown.
14 changes: 14 additions & 0 deletions intTests/test_bitfield_duplicate_points_to/test.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
#include <stdbool.h>
#include <stdint.h>

struct s {
int32_t w;
uint8_t x1:1;
uint8_t x2:2;
uint8_t y:1;
int32_t z;
};

bool get_y(struct s *ss) {
return ss->y;
}
15 changes: 15 additions & 0 deletions intTests/test_bitfield_duplicate_points_to/test.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
enable_experimental;
enable_lax_loads_and_stores;

let get_y_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 1);
// Duplicate llvm_points_to_bitfield statements involving `y`
llvm_points_to_bitfield ss "y" (llvm_term z);
llvm_points_to_bitfield ss "y" (llvm_term z);
llvm_execute_func [ss];
llvm_return (llvm_term z);
};

m <- llvm_load_module "test.bc";
fails (llvm_verify m "get_y" [] false get_y_spec (w4_unint_z3 []));
5 changes: 5 additions & 0 deletions intTests/test_bitfield_duplicate_points_to/test.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
#!/usr/bin/env bash

set -e

$SAW test.saw
11 changes: 11 additions & 0 deletions intTests/test_bitfield_smt_array/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
CC = clang
CFLAGS = -g -emit-llvm -frecord-command-line -O0

all: test.bc

test.bc: test.c
$(CC) $(CFLAGS) -c $< -o $@

.PHONY: clean
clean:
rm -f test.bc
Binary file added intTests/test_bitfield_smt_array/test.bc
Binary file not shown.
34 changes: 34 additions & 0 deletions intTests/test_bitfield_smt_array/test.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
#include <stdbool.h>
#include <stdint.h>

struct s {
int32_t w;
uint8_t x1:1;
uint8_t x2:2;
uint8_t y:1;
int32_t z;
};

uint8_t get_x2(struct s *ss) {
return ss->x2;
}

bool get_y(struct s *ss) {
return ss->y;
}

void set_x2(struct s *ss, uint8_t x2) {
ss->x2 = x2;
}

void set_y(struct s *ss, bool y) {
ss->y = y;
}

void set_x2_alt(struct s *ss, uint8_t x2) {
set_x2(ss, x2);
}

void set_y_alt(struct s *ss, bool y) {
set_y(ss, y);
}
55 changes: 55 additions & 0 deletions intTests/test_bitfield_smt_array/test.saw
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
enable_experimental;
enable_lax_loads_and_stores;
enable_smt_array_memory_model;

let get_x2_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 2);
llvm_points_to_bitfield ss "x2" (llvm_term z);
llvm_execute_func [ss];
llvm_return (llvm_term {{ zext z : [8] }});
};

let get_y_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 1);
llvm_points_to_bitfield ss "y" (llvm_term z);
llvm_execute_func [ss];
llvm_return (llvm_term z);
};

let set_x2_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 8);
llvm_execute_func [ss, llvm_term z];
llvm_points_to_bitfield ss "x2" (llvm_term {{ drop z : [2] }});
};

let set_x2_alt_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 2);
llvm_execute_func [ss, llvm_term {{ zext z : [8] }}];
llvm_points_to_bitfield ss "x2" (llvm_term z);
};

let set_y_spec = do {
ss <- llvm_alloc (llvm_alias "struct.s");
z <- llvm_fresh_var "z" (llvm_int 1);
llvm_execute_func [ss, llvm_term z];
llvm_points_to_bitfield ss "y" (llvm_term z);
};

let set_y_alt_spec = set_y_spec;

m <- llvm_load_module "test.bc";

llvm_verify m "get_x2" [] false get_x2_spec (w4_unint_z3 []);
llvm_verify m "get_y" [] false get_y_spec (w4_unint_z3 []);
llvm_verify m "set_x2" [] false set_x2_spec (w4_unint_z3 []);
llvm_verify m "set_x2_alt" [] false set_x2_alt_spec (w4_unint_z3 []);
llvm_verify m "set_y" [] false set_y_spec (w4_unint_z3 []);

set_x2_ov <- llvm_unsafe_assume_spec m "set_x2" set_x2_spec;
llvm_verify m "set_x2_alt" [set_x2_ov] false set_x2_alt_spec (w4_unint_z3 []);
set_y_ov <- llvm_unsafe_assume_spec m "set_y" set_y_spec;
llvm_verify m "set_y_alt" [set_y_ov] false set_y_alt_spec (w4_unint_z3 []);
5 changes: 5 additions & 0 deletions intTests/test_bitfield_smt_array/test.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
#!/usr/bin/env bash

set -e

! $SAW test.saw
Loading

0 comments on commit 3b66a82

Please sign in to comment.