-
Notifications
You must be signed in to change notification settings - Fork 63
/
Copy pathmanual.md
4947 lines (3866 loc) · 186 KB
/
manual.md
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
## Overview
The Software Analysis Workbench (SAW) is a tool for constructing
mathematical models of the computational behavior of software,
transforming these models, and proving properties about them.
SAW can currently construct models of a subset of programs written in
Cryptol, LLVM (and therefore C), and JVM (and therefore Java). SAW also has
experimental, incomplete support for MIR (and therefore Rust). The
models take the form of typed functional programs, so in a sense SAW can
be considered a translator from imperative programs to their functional
equivalents. Various external proof tools, including a variety of SAT
and SMT solvers, can be used to prove properties about the functional
models. SAW can construct models from arbitrary Cryptol programs, and
from C and Java programs that have fixed-size inputs and outputs and
that terminate after a fixed number of iterations of any loop (or a
fixed number of recursive calls). One common use case is to verify that
an algorithm specification in Cryptol is equivalent to an algorithm
implementation in C or Java.
The process of extracting models from programs, manipulating them,
forming queries about them, and sending them to external provers is
orchestrated using a special purpose language called SAWScript.
SAWScript is a typed functional language with support for sequencing of
imperative commands.
The rest of this document first describes how to use the SAW tool,
`saw`, and outlines the structure of the SAWScript language and its
relationship to Cryptol. It then presents the SAWScript commands that
transform functional models and prove properties about them. Finally, it
describes the specific commands available for constructing models from
imperative programs.
# Invoking SAW
The primary mechanism for interacting with SAW is through the `saw`
executable included as part of the standard binary distribution. With no
arguments, `saw` starts a read-evaluate-print loop (REPL) that allows
the user to interactively evaluate commands in the SAWScript language.
With one file name argument, it executes the specified file as a SAWScript
program.
In addition to a file name, the `saw` executable accepts several
command-line options:
`-h, -?, --help`
~ Print a help message.
`-V, --version`
~ Show the version of the SAWScript interpreter.
`-c path, --classpath=path`
~ Specify a colon-delimited list of paths to search for Java classes.
`-i path, --import-path=path`
~ Specify a colon-delimited list of paths to search for imports.
`-t, --extra-type-checking`
~ Perform extra type checking of intermediate values.
`-I, --interactive`
~ Run interactively (with a REPL). This is the default if no other
arguments are specified.
`-j path, --jars=path`
~ Specify a colon-delimited list of paths to `.jar` files to search
for Java classes.
`-b path, --java-bin-dirs`
~ Specify a colon-delimited list of paths to search for a Java
executable.
`-d num, --sim-verbose=num`
~ Set the verbosity level of the Java and LLVM simulators.
`-v num, --verbose=num`
~ Set the verbosity level of the SAWScript interpreter.
`--clean-mismatched-versions-solver-cache[=path]`
~ Run the `clean_mismatched_versions_solver_cache` command on the solver
cache at the given path, or if no path is given, the solver cache at the
value of the `SAW_SOLVER_CACHE_PATH` environment variable, then exit. See
the section **Caching Solver Results** for a description of the
`clean_mismatched_versions_solver_cache` command and the solver caching
feature in general.
SAW also uses several environment variables for configuration:
`CRYPTOLPATH`
~ Specify a colon-delimited list of directory paths to search for Cryptol
imports (including the Cryptol prelude).
`PATH`
~ If the `--java-bin-dirs` option is not set, then the `PATH` will be
searched to find a Java executable.
`SAW_IMPORT_PATH`
~ Specify a colon-delimited list of directory paths to search for imports.
`SAW_JDK_JAR`
~ Specify the path of the `.jar` file containing the core Java
libraries. Note that that is not necessary if the `--java-bin-dirs` option
or the `PATH` environment variable is used, as SAW can use this information
to determine the location of the core Java libraries' `.jar` file.
`SAW_SOLVER_CACHE_PATH`
~ Specify a path at which to keep a cache of solver results obtained during
calls to certain tactics. A cache is not created at this path until it is
needed. See the section **Caching Solver Results** for more detail about this
feature.
On Windows, semicolon-delimited lists are used instead of colon-delimited
lists.
# Structure of SAWScript
A SAWScript program consists, at the top level, of a sequence of
commands to be executed in order. Each command is terminated with a
semicolon. For example, the `print` command displays a textual
representation of its argument. Suppose the following text is stored in
the file `print.saw`:
~~~~
print 3;
~~~~
The command `saw print.saw` will then yield output similar to the
following:
~~~~
Loading module Cryptol
Loading file "print.saw"
3
~~~~
The same code can be run from the interactive REPL:
~~~~
sawscript> print 3;
3
~~~~
At the REPL, terminating semicolons can be omitted:
~~~~
sawscript> print 3
3
~~~~
To make common use cases simpler, bare values at the REPL are treated as
if they were arguments to `print`:
~~~~
sawscript> 3
3
~~~~
One SAWScript file can be included in another using the `include`
command, which takes the name of the file to be included as an argument.
For example:
~~~~
sawscript> include "print.saw"
Loading file "print.saw"
3
~~~~
Typically, included files are used to import definitions, not perform
side effects like printing. However, as you can see, if any commands
with side effects occur at the top level of the imported file, those
side effects will occur during import.
## Syntax
The syntax of SAWScript is reminiscent of functional languages such as
Cryptol, Haskell and ML. In particular, functions are applied by
writing them next to their arguments rather than by using parentheses
and commas. Rather than writing `f(x, y)`, write `f x y`.
Comments are written as in C, Java, and Rust (among many other languages). All
text from `//` until the end of a line is ignored. Additionally, all
text between `/*` and `*/` is ignored, regardless of whether the line
ends.
## Basic Types and Values
All values in SAWScript have types, and these types are determined and
checked before a program runs (that is, SAWScript is statically typed).
The basic types available are similar to those in many other languages.
* The `Int` type represents unbounded mathematical integers. Integer
constants can be written in decimal notation (e.g., `42`), hexadecimal
notation (`0x2a`), and binary (`0b00101010`). However, unlike many
languages, integers in SAWScript are used primarily as constants.
Arithmetic is usually encoded in Cryptol, as discussed in the next
section.
* The Boolean type, `Bool`, contains the values `true` and `false`, like
in many other languages. As with integers, computations on Boolean
values usually occur in Cryptol.
* Values of any type can be aggregated into tuples. For example, the
value `(true, 10)` has the type `(Bool, Int)`.
* Values of any type can also be aggregated into records, which are
exactly like tuples except that their components have names. For
example, the value `{ b = true, n = 10 }` has the type `{ b : Bool,
n : Int }`.
* A sequence of values of the same type can be stored in a list. For
example, the value `[true, false, true]` has the type `[Bool]`.
* Strings of textual characters can be represented in the `String` type.
For example, the value `"example"` has type `String`.
* The "unit" type, written `()`, is essentially a placeholder, similar
to `void` in languages like C and Java. It has only one value, also
written `()`. Values of type `()` convey no information. We will show
in later sections several cases where this is useful.
* Functions are given types that indicate what type they consume and
what type they produce. For example, the type `Int -> Bool` indicates
a function that takes an `Int` as input and produces a `Bool` as
output. Functions with multiple arguments use multiple arrows. For
example, the type `Int -> String -> Bool` indicates a function in
which the first argument is an `Int`, the second is a `String`, and
the result is a `Bool`. It is possible, but not necessary, to group
arguments in tuples, as well, so the type `(Int, String) -> Bool`
describes a function that takes one argument, a pair of an `Int` and a
`String`, and returns a `Bool`.
SAWScript also includes some more specialized types that do not have
straightforward counterparts in most other languages. These will appear
in later sections.
## Basic Expression Forms
One of the key forms of top-level command in SAWScript is a *binding*,
introduced with the `let` keyword, which gives a name to a value. For
example:
~~~~
sawscript> let x = 5
sawscript> x
5
~~~~
Bindings can have parameters, in which case they define functions. For
instance, the following function takes one parameter and constructs a
list containing that parameter as its single element.
~~~~
sawscript> let f x = [x]
sawscript> f "text"
["text"]
~~~~
Functions themselves are values and have types. The type of a function
that takes an argument of type `a` and returns a result of type `b` is
`a -> b`.
Function types are typically inferred, as in the example `f`
above. In this case, because `f` only creates a list with the given
argument, and because it is possible to create a list of any element
type, `f` can be applied to an argument of any type. We say, therefore,
that `f` is *polymorphic*. Concretely, we write the type of `f` as `{a}
a -> [a]`, meaning it takes a value of any type (denoted `a`) and
returns a list containing elements of that same type. This means we can
also apply `f` to `10`:
~~~~
sawscript> f 10
[10]
~~~~
However, we may want to specify that a function has a more
specific type. In this case, we could restrict `f` to operate only on
`Int` parameters.
~~~~
sawscript> let f (x : Int) = [x]
~~~~
This will work identically to the original `f` on an `Int` parameter:
~~~~
sawscript> f 10
[10]
~~~~
However, it will fail for a `String` parameter:
~~~~
sawscript> f "text"
type mismatch: String -> t.0 and Int -> [Int]
at "_" (REPL)
mismatched type constructors: String and Int
~~~~
Type annotations can be applied to any expression. The notation `(e :
t)` indicates that expression `e` is expected to have type `t` and
that it is an error for `e` to have a different type. Most types in
SAWScript are inferred automatically, but specifying them explicitly can
sometimes enhance readability.
Because functions are values, functions can return other functions. We
make use of this feature when writing functions of multiple arguments.
Consider the function `g`, similar to `f` but with two arguments:
~~~~
sawscript> let g x y = [x, y]
~~~~
Like `f`, `g` is polymorphic. Its type is `{a} a -> a -> [a]`. This
means it takes an argument of type `a` and returns a *function* that
takes an argument of the same type `a` and returns a list of `a` values.
We can therefore apply `g` to any two arguments of the same type:
~~~~
sawscript> g 2 3
[2,3]
sawscript> g true false
[true,false]
~~~~
But type checking will fail if we apply it to two values of different types:
~~~~
sawscript> g 2 false
type mismatch: Bool -> t.0 and Int -> [Int]
at "_" (REPL)
mismatched type constructors: Bool and Int
~~~~
So far we have used two related terms, *function* and *command*, and we
take these to mean slightly different things. A function is any value
with a function type (e.g., `Int -> [Int]`). A command is any value with
a special command type (e.g. `TopLevel ()`, as shown below). These
special types allow us to restrict command usage to specific contexts,
and are also *parameterized* (like the list type). Most but not all
commands are also functions.
The most important command type is the `TopLevel` type, indicating a
command that can run at the top level (directly at the REPL, or as one
of the top level commands in a script file). The `print` command
has the type `{a} a -> TopLevel ()`, where `TopLevel ()` means that it
is a command that runs in the `TopLevel` context and returns a value of
type `()` (that is, no useful information). In other words, it has a
side effect (printing some text to the screen) but doesn't produce any
information to use in the rest of the SAWScript program. This is the
primary usage of the `()` type.
It can sometimes be useful to bind a sequence of commands together. This
can be accomplished with the `do { ... }` construct. For example:
~~~~
sawscript> let print_two = do { print "first"; print "second"; }
sawscript> print_two
first
second
~~~~
The bound value, `print_two`, has type `TopLevel ()`, since that is the
type of its last command.
Note that in the previous example the printing doesn't occur until
`print_two` directly appears at the REPL. The `let` expression does not
cause those commands to run. The construct that *runs* a command is
written using the `<-` operator. This operator works like `let` except
that it says to run the command listed on the right hand side and bind
the result, rather than binding the variable to the command itself.
Using `<-` instead of `let` in the previous example yields:
~~~~
sawscript> print_two <- do { print "first"; print "second"; }
first
second
sawscript> print print_two
()
~~~~
Here, the `print` commands run first, and then `print_two` gets the
value returned by the second `print` command, namely `()`. Any command
run without using `<-` at either the top level of a script or within a
`do` block discards its result. However, the REPL prints the result of
any command run without using the `<-` operator.
In some cases it can be useful to have more control over the value
returned by a `do` block. The `return` command allows us to do this. For
example, say we wanted to write a function that would print a message
before and after running some arbitrary command and then return the
result of that command. We could write:
~~~~
let run_with_message msg c =
do {
print "Starting.";
print msg;
res <- c;
print "Done.";
return res;
};
x <- run_with_message "Hello" (return 3);
print x;
~~~~
If we put this script in `run.saw` and run it with `saw`, we get
something like:
~~~~
Loading module Cryptol
Loading file "run.saw"
Starting.
Hello
Done.
3
~~~~
Note that it ran the first `print` command, then the caller-specified
command, then the second `print` command. The result stored in `x` at
the end is the result of the `return` command passed in as an argument.
## Other Basic Functions
Aside from the functions we have listed so far, there are a number of other
operations for working with basic data structures and interacting
with the operating system.
The following functions work on lists:
* `concat : {a} [a] -> [a] -> [a]` takes two lists and returns the
concatenation of the two.
* `head : {a} [a] -> a` returns the first element of a list.
* `tail : {a} [a] -> [a]` returns everything except the first element.
* `length : {a} [a] -> Int` counts the number of elements in a list.
* `null : {a} [a] -> Bool` indicates whether a list is empty (has zero
elements).
* `nth : {a} [a] -> Int -> a` returns the element at the given position,
with `nth l 0` being equivalent to `head l`.
* `for : {m, a, b} [a] -> (a -> m b) -> m [b]` takes a list and a
function that runs in some command context. The passed command will be
called once for every element of the list, in order. Returns a list of
all of the results produced by the command.
For interacting with the operating system, we have:
* `get_opt : Int -> String` returns the command-line argument to `saw`
at the given index. Argument 0 is always the name of the `saw`
executable itself, and higher indices represent later arguments.
* `exec : String -> [String] -> String -> TopLevel String` runs an
external program given, respectively, an executable name, a list of
arguments, and a string to send to the standard input of the program.
The `exec` command returns the standard output from the program it
executes and prints standard error to the screen.
* `exit : Int -> TopLevel ()` stops execution of the current script and
returns the given exit code to the operating system.
Finally, there are a few miscellaneous functions and commands:
* `show : {a} a -> String` computes the textual representation of its
argument in the same way as `print`, but instead of displaying the value
it returns it as a `String` value for later use in the program. This can
be useful for constructing more detailed messages later.
* `str_concat : String -> String -> String` concatenates two `String`
values, and can also be useful with `show`.
* `time : {a} TopLevel a -> TopLevel a` runs any other `TopLevel`
command and prints out the time it took to execute.
* `with_time : {a} TopLevel a -> TopLevel (Int, a)` returns both the
original result of the timed command and the time taken to execute it
(in milliseconds), without printing anything in the process.
# The Term Type
Perhaps the most important type in SAWScript, and the one most unlike
the built-in types of most other languages, is the `Term` type.
Essentially, a value of type `Term` precisely describes all
possible computations performed by some program. In particular, if
two `Term` values are *equivalent*, then the programs that they
represent will always compute the same results given the same inputs. We
will say more later about exactly what it means for two terms to be
equivalent, and how to determine whether two terms are equivalent.
Before exploring the `Term` type more deeply, it is important to
understand the role of the Cryptol language in SAW.
# Cryptol and its Role in SAW
Cryptol is a domain-specific language originally designed for the
high-level specification of cryptographic algorithms. It is general
enough, however, to describe a wide variety of programs, and is
particularly applicable to describing computations that operate on
streams of data of some fixed size.
In addition to being integrated into SAW, Cryptol is a standalone
language with its own manual:
~~~~
http://cryptol.net/files/ProgrammingCryptol.pdf
~~~~
SAW includes deep support for Cryptol, and in fact requires the use of
Cryptol for most non-trivial tasks. To fully understand the rest of
this manual and to effectively use SAW, you will need to develop at least
a rudimentary understanding of Cryptol.
The primary use of Cryptol within SAWScript is to construct values of type
`Term`. Although `Term` values can be constructed from various sources,
inline Cryptol expressions are the most direct and convenient way to create
them.
Specifically, a Cryptol expression can be placed inside double curly
braces (`{{` and `}}`), resulting in a value of type `Term`. As a very
simple example, there is no built-in integer addition operation in
SAWScript. However, we can use Cryptol's built-in integer addition operator
within SAWScript as follows:
~~~~
sawscript> let t = {{ 0x22 + 0x33 }}
sawscript> print t
85
sawscript> :type t
Term
~~~~
Although it printed out in the same way as an `Int`, it is important to
note that `t` actually has type `Term`. We can see how this term is
represented internally, before being evaluated, with the `print_term`
function.
~~~~
sawscript> print_term t
let { x@1 = Prelude.Vec 8 Prelude.Bool
x@2 = Cryptol.TCNum 8
x@3 = Cryptol.PLiteralSeqBool x@2
}
in Cryptol.ecPlus x@1 (Cryptol.PArithSeqBool x@2)
(Cryptol.ecNumber (Cryptol.TCNum 34) x@1 x@3)
(Cryptol.ecNumber (Cryptol.TCNum 51) x@1 x@3)
~~~~
For the moment, it's not important to understand what this output means.
We show it only to clarify that `Term` values have their own internal
structure that goes beyond what exists in SAWScript. The internal
representation of `Term` values is in a language called SAWCore. The
full semantics of SAWCore are beyond the scope of this manual.
The text constructed by `print_term` can also be accessed
programmatically (instead of printing to the screen) using the
`show_term` function, which returns a `String`. The `show_term` function
is not a command, so it executes directly and does not need `<-` to bind
its result. Therefore, the following will have the same result as the
`print_term` command above:
~~~~
sawscript> let s = show_term t
sawscript> :type s
String
sawscript> print s
<same as above>
~~~~
Numbers are printed in decimal notation by default when printing terms,
but the following two commands can change that behavior.
* `set_ascii : Bool -> TopLevel ()`, when passed `true`, makes
subsequent `print_term` or `show_term` commands print sequences of bytes
as ASCII strings (and doesn't affect printing of anything else).
* `set_base : Int -> TopLevel ()` prints all bit vectors in the given
base, which can be between 2 and 36 (inclusive).
A `Term` that represents an integer (any bit vector, as affected by
`set_base`) can be translated into a SAWScript `Int` using the
`eval_int : Term -> Int` function. This function returns an
`Int` if the `Term` can be represented as one, and fails at runtime
otherwise.
~~~~
sawscript> print (eval_int t)
85
sawscript> print (eval_int {{ True }})
"eval_int" (<stdin>:1:1):
eval_int: argument is not a finite bitvector
sawscript> print (eval_int {{ [True] }})
1
~~~~
Similarly, values of type `Bit` in Cryptol can be translated into values
of type `Bool` in SAWScript using the `eval_bool : Term -> Bool` function:
~~~~
sawscript> let b = {{ True }}
sawscript> print_term b
Prelude.True
sawscript> print (eval_bool b)
true
~~~~
Anything with sequence type in Cryptol can be translated into a list of
`Term` values in SAWScript using the `eval_list : Term -> [Term]` function.
~~~~
sawscript> let l = {{ [0x01, 0x02, 0x03] }}
sawscript> print_term l
let { x@1 = Prelude.Vec 8 Prelude.Bool
x@2 = Cryptol.PLiteralSeqBool (Cryptol.TCNum 8)
}
in [Cryptol.ecNumber (Cryptol.TCNum 1) x@1 x@2
,Cryptol.ecNumber (Cryptol.TCNum 2) x@1 x@2
,Cryptol.ecNumber (Cryptol.TCNum 3) x@1 x@2]
sawscript> print (eval_list l)
[Cryptol.ecNumber (Cryptol.TCNum 1) (Prelude.Vec 8 Prelude.Bool)
(Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
,Cryptol.ecNumber (Cryptol.TCNum 2) (Prelude.Vec 8 Prelude.Bool)
(Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
,Cryptol.ecNumber (Cryptol.TCNum 3) (Prelude.Vec 8 Prelude.Bool)
(Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))]
~~~~
Finally, a list of `Term` values in SAWScript can be collapsed into a single
`Term` with sequence type using the `list_term : [Term] -> Term` function,
which is the inverse of `eval_list`.
~~~~
sawscript> let ts = eval_list l
sawscript> let l = list_term ts
sawscript> print_term l
let { x@1 = Prelude.Vec 8 Prelude.Bool
x@2 = Cryptol.PLiteralSeqBool (Cryptol.TCNum 8)
}
in [Cryptol.ecNumber (Cryptol.TCNum 1) x@1 x@2
,Cryptol.ecNumber (Cryptol.TCNum 2) x@1 x@2
,Cryptol.ecNumber (Cryptol.TCNum 3) x@1 x@2]
~~~~
In addition to being able to extract integer and Boolean values from
Cryptol expressions, `Term` values can be injected into Cryptol
expressions. When SAWScript evaluates a Cryptol expression between `{{`
and `}}` delimiters, it does so with several extra bindings in scope:
* Any variable in scope that has SAWScript type `Bool` is visible in
Cryptol expressions as a value of type `Bit`.
* Any variable in scope that has SAWScript type `Int` is visible in
Cryptol expressions as a *type variable*. Type variables can be
demoted to numeric bit vector values using the backtick (`` ` ``)
operator.
* Any variable in scope that has SAWScript type `Term` is visible in
Cryptol expressions as a value with the Cryptol type corresponding to
the internal type of the term. The power of this conversion is that
the `Term` does not need to have originally been derived from a
Cryptol expression.
In addition to these rules, bindings created at the Cryptol level,
either from included files or inside Cryptol quoting brackets, are
visible only to later Cryptol expressions, and not as SAWScript
variables.
To make these rules more concrete, consider the following examples. If
we bind a SAWScript `Int`, we can use it as a Cryptol type variable. If
we create a `Term` variable that internally has function type, we can
apply it to an argument within a Cryptol expression, but not at the
SAWScript level:
~~~~
sawscript> let n = 8
sawscript> :type n
Int
sawscript> let {{ f (x : [n]) = x + 1 }}
sawscript> :type {{ f }}
Term
sawscript> :type f
<stdin>:1:1-1:2: unbound variable: "f" (<stdin>:1:1-1:2)
sawscript> print {{ f 2 }}
3
~~~~
If `f` was a binding of a SAWScript variable to a `Term` of function
type, we would get a different error:
~~~~
sawscript> let f = {{ \(x : [n]) -> x + 1 }}
sawscript> :type {{ f }}
Term
sawscript> :type f
Term
sawscript> print {{ f 2 }}
3
sawscript> print (f 2)
type mismatch: Int -> t.0 and Term
at "_" (REPL)
mismatched type constructors: (->) and Term
~~~~
One subtlety of dealing with `Term`s constructed from Cryptol is that
because the Cryptol expressions themselves are type checked by the
Cryptol type checker, and because they may make use of other `Term`
values already in scope, they are not type checked until the Cryptol
brackets are evaluated. So type errors at the Cryptol level may occur at
runtime from the SAWScript perspective (though they occur before the
Cryptol expressions are run).
So far, we have talked about using Cryptol *value* expressions. However,
SAWScript can also work with Cryptol *types*. The most direct way to
refer to a Cryptol type is to use type brackets: `{|` and `|}`. Any
Cryptol type written between these brackets becomes a `Type` value in
SAWScript. Some types in Cryptol are *numeric* (also known as *size*)
types, and correspond to non-negative integers. These can be translated
into SAWScript integers with the `eval_size` function. For example:
~~~~
sawscript> let {{ type n = 16 }}
sawscript> eval_size {| n |}
16
sawscript> eval_size {| 16 |}
16
~~~~
For non-numeric types, `eval_size` fails at runtime:
~~~~
sawscript> eval_size {| [16] |}
"eval_size" (<stdin>:1:1):
eval_size: not a numeric type
~~~~
In addition to the use of brackets to write Cryptol expressions inline,
several built-in functions can extract `Term` values from Cryptol files
in other ways. The `import` command at the top level imports all
top-level definitions from a Cryptol file and places them in scope
within later bracketed expressions. This includes [Cryptol `foreign`
declarations](https://galoisinc.github.io/cryptol/master/FFI.html). If a
[Cryptol implementation of a foreign
function](https://galoisinc.github.io/cryptol/master/FFI.html#cryptol-implementation-of-foreign-functions)
is present, then it will be used as the definition when reasoning about
the function. Otherwise, the function will be imported as an opaque
constant with no definition.
The `cryptol_load` command behaves similarly, but returns a
`CryptolModule` instead. If any `CryptolModule` is in scope, its
contents are available qualified with the name of the `CryptolModule`
variable. A specific definition can be explicitly extracted from a
`CryptolModule` using the `cryptol_extract` command:
* `cryptol_extract : CryptolModule -> String -> TopLevel Term`
# Transforming Term Values
The three primary functions of SAW are *extracting* models (`Term`
values) from programs, *transforming* those models, and *proving*
properties about models using external provers. So far we've shown how
to construct `Term` values from Cryptol programs; later sections
will describe how to extract them from other programs. Now we show how
to use the various term transformation features available in SAW.
## Rewriting
Rewriting a `Term` consists of applying one or more *rewrite rules* to
it, resulting in a new `Term`. A rewrite rule in SAW can be specified in
multiple ways:
1. as the definition of a function that can be unfolded,
1. as a term of Boolean type (or a function returning a Boolean) that
is an equality statement, and
1. as a term of _equality type_ with a body that encodes a proof that
the equality in the type is valid.
In each case the term logically consists of two sides and describes a
way to transform the left side into the right side. Each side may
contain variables (bound by enclosing lambda expressions) and is
therefore a *pattern* which can match any term in which each variable
represents an arbitrary sub-term. The left-hand pattern describes a term
to match (which may be a sub-term of the full term being rewritten), and
the right-hand pattern describes a term to replace it with. Any variable
in the right-hand pattern must also appear in the left-hand pattern and
will be instantiated with whatever sub-term matched that variable in the
original term.
For example, say we have the following Cryptol function:
~~~~
\(x:[8]) -> (x * 2) + 1
~~~~
We might for some reason want to replace multiplication by a power of
two with a shift. We can describe this replacement using an equality
statement in Cryptol (a rule of form 2 above):
~~~~
\(y:[8]) -> (y * 2) == (y << 1)
~~~~
Interpreting this as a rewrite rule, it says that for any 8-bit vector
(call it `y` for now), we can replace `y * 2` with `y << 1`. Using this
rule to rewrite the earlier expression would then yield:
~~~~
\(x:[8]) -> (x << 1) + 1
~~~~
The general philosophy of rewriting is that the left and right patterns,
while syntactically different, should be semantically equivalent.
Therefore, applying a set of rewrite rules should not change the
fundamental meaning of the term being rewritten. SAW is particularly
focused on the task of proving that some logical statement expressed as
a `Term` is always true. If that is in fact the case, then the entire
term can be replaced by the term `True` without changing its meaning. The
rewriting process can in some cases, by repeatedly applying rules that
themselves are known to be valid, reduce a complex term entirely to
`True`, which constitutes a proof of the original statement. In other
cases, rewriting can simplify terms before sending them to external
automated provers that can then finish the job. Sometimes this
simplification can help the automated provers run more quickly, and
sometimes it can help them prove things they would otherwise be unable
to prove by applying reasoning steps (rewrite rules) that are not
available to the automated provers.
In practical use, rewrite rules can be aggregated into `Simpset`
values in SAWScript. A few pre-defined `Simpset` values exist:
* `empty_ss : Simpset` is the empty set of rules. Rewriting with it
should have no effect, but it is useful as an argument to some of the
functions that construct larger `Simpset` values.
* `basic_ss : Simpset` is a collection of rules that are useful in most
proof scripts.
* `cryptol_ss : () -> Simpset` includes a collection of Cryptol-specific
rules. Some of these simplify away the abstractions introduced in the
translation from Cryptol to SAWCore, which can be useful when proving
equivalence between Cryptol and non-Cryptol code. Leaving these
abstractions in place is appropriate when comparing only Cryptol code,
however, so `cryptol_ss` is not included in `basic_ss`.
The next set of functions can extend or apply a `Simpset`:
* `addsimp' : Term -> Simpset -> Simpset` adds a single `Term` to an
existing `Simpset.
* `addsimps' : [Term] -> Simpset -> Simpset` adds a list of `Term`s to
an existing `Simpset`.
* `rewrite : Simpset -> Term -> Term` applies a `Simpset` to an existing
`Term` to produce a new `Term`.
To make this more concrete, we examine how the rewriting example
sketched above, to convert multiplication into shift, can work in
practice. We simplify everything with `cryptol_ss` as we go along so
that the `Term`s don't get too cluttered. First, we declare the term
to be transformed:
~~~~
sawscript> let term = rewrite (cryptol_ss ()) {{ \(x:[8]) -> (x * 2) + 1 }}
sawscript> print_term term
\(x : Prelude.Vec 8 Prelude.Bool) ->
Prelude.bvAdd 8 (Prelude.bvMul 8 x (Prelude.bvNat 8 2))
(Prelude.bvNat 8 1)
~~~~
Next, we declare the rewrite rule:
~~~~
sawscript> let rule = rewrite (cryptol_ss ()) {{ \(y:[8]) -> (y * 2) == (y << 1) }}
sawscript> print_term rule
let { x@1 = Prelude.Vec 8 Prelude.Bool
}
in \(y : x@1) ->
Cryptol.ecEq x@1 (Cryptol.PCmpWord 8)
(Prelude.bvMul 8 y (Prelude.bvNat 8 2))
(Prelude.bvShiftL 8 Prelude.Bool 1 Prelude.False y
(Prelude.bvNat 1 1))
~~~~
Finally, we apply the rule to the target term:
~~~~
sawscript> let result = rewrite (addsimp' rule empty_ss) term
sawscript> print_term result
\(x : Prelude.Vec 8 Prelude.Bool) ->
Prelude.bvAdd 8
(Prelude.bvShiftL 8 Prelude.Bool 1 Prelude.False x
(Prelude.bvNat 1 1))
(Prelude.bvNat 8 1)
~~~~
Note that `addsimp'` and `addsimps'` take a `Term` or list of `Term`s;
these could in principle be anything, and are not necessarily terms
representing logically valid equalities. They have `'` suffixes because
they are not intended to be the primary interface to rewriting. When using
these functions, the soundness of the proof process depends on the
correctness of these rules as a side condition.
The primary interface to rewriting uses the `Theorem` type instead of
the `Term` type, as shown in the signatures for `addsimp` and
`addsimps`.
* `addsimp : Theorem -> Simpset -> Simpset` adds a single `Theorem` to a
`Simpset`.
* `addsimps : [Theorem] -> Simpset -> Simpset` adds several `Theorem`
values to a `Simpset`.
A `Theorem` is essentially a `Term` that is proven correct in some way.
In general, a `Theorem` can be any statement, and may not be useful as a
rewrite rule. However, if it has an appropriate shape it can be used for
rewriting. In the ["Proofs about Terms"](#proofs-about-terms) section,
we'll describe how to construct `Theorem` values from `Term` values.
In the absence of user-constructed `Theorem` values, there are some
additional built-in rules that are not included in either `basic_ss` and
`cryptol_ss` because they are not always beneficial, but that can
sometimes be helpful or essential. The `cryptol_ss` simpset includes
rewrite rules to unfold all definitions in the `Cryptol` SAWCore module,
but does not include any of the terms of equality type.
* `add_cryptol_defs : `[String] -> Simpset -> Simpset` adds unfolding
rules for functions with the given names from the SAWCore `Cryptol` module
to the given `Simpset`.
* `add_cryptol_eqs : [String] -> Simpset -> Simpset` adds the terms of
equality type with the given names from the SAWCore `Cryptol` module to
the given `Simpset`.
* `add_prelude_defs : [String] -> Simpset -> Simpset` adds unfolding
rules from the SAWCore `Prelude` module to a `Simpset`.
* `add_prelude_eqs : [String] -> Simpset -> Simpset` adds equality-typed
terms from the SAWCore `Prelude` module to a `Simpset`.
Finally, it's possible to construct a theorem from an arbitrary SAWCore
expression (rather than a Cryptol expression), using the `core_axiom`
function.
* `core_axiom : String -> Theorem` creates a `Theorem` from a `String`
in SAWCore syntax. Any `Theorem` introduced by this function is assumed
to be correct, so use it with caution.
## Folding and Unfolding
A SAWCore term can be given a name using the `define` function, and is
then by default printed as that name alone. A named subterm can be
"unfolded" so that the original definition appears again.
* `define : String -> Term -> TopLevel Term`
* `unfold_term : [String] -> Term -> Term`
For example:
~~~~
sawscript> let t = {{ 0x22 }}
sawscript> print_term t
Cryptol.ecNumber (Cryptol.TCNum 34) (Prelude.Vec 8 Prelude.Bool)
(Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
sawscript> t' <- define "t" t
sawscript> print_term t'
t
sawscript> let t'' = unfold_term ["t"] t'
sawscript> print_term t''
Cryptol.ecNumber (Cryptol.TCNum 34) (Prelude.Vec 8 Prelude.Bool)
(Cryptol.PLiteralSeqBool (Cryptol.TCNum 8))
~~~~
This process of folding and unfolding is useful both to make large terms
easier for humans to work with and to make automated proofs more