In this assignment you'll implement a small language called Boa, which has
B-inary O-perators A-nd conditionals. You'll use an A-Normal Form
conversion to make binary operator compilation easy, and compile if via jmp
instructions.
-
Use the fork link from the class website to create your private clone of the starter code.
-
Do
git clone https://github.com/ucsd-cse131-fa18/00-warmup-XXX
whereXXX
is your private repo. -
Link your clone to the "upstream" to get any updates
$ make upstream
after this you can get "updates" (in case we modify the starter code), with
$ make update
- Save (and submit) your work with:
$ git commit -a -m MESSAGE
$ git push
As usual, there are a few pieces that go into defining a language for us to compile.
-
A description of the concrete syntax – the text the programmer writes
-
A description of the abstract syntax – how to express what the programmer wrote in a data structure our compiler uses.
-
A description of the semantics — or behavior —of the abstract syntax, so our compiler knows what the code it generates should evaluate.
The concrete syntax of Boa is:
<expr> :=
| let <bindings> in <expr>
| if <expr>: <expr> else: <expr>
| <binop-expr>
<binop-expr> :=
| <number>
| <identifier>
| add1(<expr>)
| sub1(<expr>)
| <expr> + <expr>
| <expr> - <expr>
| <expr> * <expr>
| ( <expr> )
<bindings> :=
| <identifier> = <expr>
| <identifier> = <expr>, <bindings>
As in adder
, a let
expression can have one or more bindings.
The abstract syntax is defined by the data type Expr
:
data Expr a
= Number !Integer a
| Id !Id a
| Prim1 !Prim1 !(Expr a) a
| Prim2 !Prim2 !(Expr a) !(Expr a) a
| Let !(Bind a) !(Expr a) !(Expr a) a
| If !(Expr a) !(Expr a) !(Expr a) a
data Bind a
= Bind !Id a
An immediate expression is a
- constant
Number n
, or - identifier
Id x
.
{-@ measure isImm @-}
isImm :: Expr a -> Bool
isImm (Number _ _) = True
isImm (Id _ _) = True
isImm _ = False
{-@ type ImmExpr a = {v:Expr a | isImm v} @-}
An A-Normal Form (ANF) expression is one where every argument
for a Prim1
or Prim2
or conditional (in an If
) is immediate
{-@ measure isAnf @-}
isAnf :: Expr a -> Bool
isAnf (Number _ _) = True
isAnf (Id _ _) = True
isAnf (Prim1 _ e _) = isImm e
isAnf (Prim2 _ e e' _) = isImm e && isImm e'
isAnf (If c t e _) = isImm c && isAnf t && isAnf e
isAnf (Let _ e e' _) = isAnf e && isAnf e'
{-@ type AnfExpr a = {v:Expr a| isAnf v} @-}
We use the a
parameter to capture different kinds of meta-data about the
expressions. For example, the parser returns values where each sub-expression
is labeled by its SourceSpan
(to enable proper error messages).
type Bare = Expr SourceSpan
Finally, the compilation (code generation) works with AST
nodes labeled by Tag
. We will ensure each sub-expression
has a distinct tag; consequently we can use the tags
to generate distinct control-flow labels for branches.
type Tag = (SourceSpan, Int)
type AExp = AnfExpr Tag
type IExp = ImmExpr Tag
Numbers, unary operators, let-bindings, and ids have the same semantics as before. Binary operator expressions evaluate their arguments and combine them based on the operator.
If
expressions behave similarly to if statements in C
:
- The conditional (first part) is evaluated.
- If the conditional is
0
, the else branch is evaluated; - Otherwise (if the conditional is non-zero), the then branch is evaluated.
For example,
sub1(5) -- ===> 4
if 5 - 5: 6 else: 8 -- ===> 8
let x = 10
, y = 9
in
if (x - y) * 2:
x
else:
y -- ===> 10
We have introduced a new type to represent control-flow labels:
data Label
= BranchTrue Tag
| BranchDone Tag
here Tag
is an alias for Int
; the above lets us define "true" and "done"
labels for each Tag
.
The new Instruction
are:
-
IMul Arg Arg
— Multiply the left argument by the right argument, and store in the left argument (typically the left argument iseax
for us) -
ICmp Arg Arg
— Compares the two arguments for equality. Set the condition code in the machine to track if the arguments were equal, or if the left was greater than or less than the right. This information is used byjne
and other conditional jump commands.Example:
cmp [esp-4], 0
Example:
cmp eax, [esp-8]
-
ILabel Label
— Create a location in the code that can be jumped to withjmp
,jne
, and other jump commands. -
IJne Label
— If the condition code says that the last comparison (cmp
) was given equal arguments, do nothing. If it says that the last comparison was not equal, immediately start executing instructions from the givenLabel
(by changing the program counter). -
IJe Label
— LikeIJne
but with the jump/no jump cases reversed -
IJmp Label
— Unconditionally start executing instructions from the given label (by changing the program counter)
To flex your muscles regarding the above, fill in the implementation of:
-- lib/Language/Boa/Asm.hs
instrAsm (ICmp a1 a2) = error "TBD:instrAsm:cmp"
instrAsm (ILabel l) = error "TBD:instrAsm:label"
instrAsm (IJe l) = error "TBD:instrAsm:je"
instrAsm (IJne l) = error "TBD:instrAsm:jne"
instrAsm (IJmp l) = error "TBD:instrAsm:jmp"
As usual, full summaries of the instructions we use are at this assembly guide.
When compiling an if expression, we need to execute exactly one of the branches (and not accidentally evaluate both!). A typical structure for doing this is to have two labels: one for the then case and one for the end of the if expression. So the compiled shape may look like:
cmp eax, 0 ; check if eax is equal to 0
jne label_true
; commands for "else" branch go here
jmp label_done
label_true:
; commands for "then" branch go here
label_done:
Note that if we did not put jmp label_done
after the commands for the else branch, control
would continue and evaluate the then branch as well.
So we need to make sure we skip over the else branch
by unconditionally jumping to label_done
.
We can't simply use the same label names over and over.
The assembler will get confused if we have multiple nested
if
expressions, because it won't know which label_done
to jmp
to, for example. So we need some way of generating
distinct, non-conflicting names.
Thus, in your code, you will use
BranchTrue i
andBranchdone i
where i
will be derived from the meta-data
tag on the If
expression to generate suitable
distinct labels for each branch.
Specifically, your task will be to fill in the implementations of:
-- lib/Language/Boa/Compiler.hs
immArg env e@(Id x _) = error "TBD:immArg:Id"
compilePrim1 :: Tag -> Env -> Prim1 -> IExp -> [Instruction]
compilePrim1 l env Add1 v = error "TBD:compilePrim1:Add1"
compilePrim1 l env Sub1 v = error "TBD:compilePrim1:Sub1"
compilePrim2 :: Tag -> Env -> Prim2 -> IExp -> IExp -> [Instruction]
compilePrim2 l env Plus v1 v2 = error "TBD:compilePrim2:Plus"
compilePrim2 l env Minus v1 v2 = error "TBD:compilePrim2:Minus"
compilePrim2 l env Times v1 v2 = error "TBD:compilePrim2:Times"
Aside from conditionals, the other major thing you need to do in the
implementation of boa
is add an implementation of ANF to convert the
user-facing syntax to the ANF syntax the compiler uses.
Study the cases that are filled in to figure out how the other cases should go.
You can also see this detailed write up as guide
Specifically, your task is to fill in the code for:
-- Normalizer.hs
anf :: Int -> Expr a -> (Int, AnfExpr a)
anf i (Let x e b l) = error "TBD:anf:let"
anf i (Prim2 o e1 e2 l) = error "TBD:anf:prim2"
imm :: Int -> AnfExpr a -> (Int, Binds a, ImmExpr a)
imm i (Number n l) = error "TBD:imm:Number"
imm i (Id x l) = error "TBD:imm:Id"
imm i (Prim2 o e1 e2 l) = error "TBD:imm:prim2"
For this assignment, you can assume that all variables have different names.
That means in particular you don't need to worry about nested instances of
variables with the same name, duplicates within a list, etc. The combination
of these with anf
causes some complications that go beyond our goals for
this assignment.
Finally, add new tests by filling new tests in tests/
.
Specifically, you should add new entries in
-
anf.json
to test your ANF conversion; this takes string representations of anExpr
and anAnfExpr
and checks that theanf
function converts the former to the latter. You can use this to test your ANF directly, to make sure it's producing the constrained expression you expect, before you try compiling it. -
boa.json
to test the compiler, both for outputs and error messages as before (inadder
). You can "test" yourcompile
function without implementing anf by using the commented outcompiler
definition (that just omits the call toanormal
.) This is useful if you want to test binary operators and branches before you are confident that your ANF transformation works.
Here's an order in which you could consider tackling the implementation:
-
Write some tests for the input and output of
anf
for nestedPrim1
expressions to understand what the ANF transformation looks like on those examples. For example,anf.json
intests/
contains sample Boa programs and their ANF transformations. -
Fill in the
immArg
case andcompilePrim1
inCompiler.hs
. This will let you run things likesub1(sub1(5))
and similarly nested expressions. -
Finish the
If
case forcompileEnv
(with immediate conditions) so you can run simple programs withif
. This includes the pretty-printing for the comparison and jump instructions inAsm.hs
, etc. -
Write some tests for the input and output of performing
anf
onIf
expressions, again to get a feel for what the transformation looks like. -
Work through both the
anf
implementation and the compiler implementationcompilePrim2
.
Write tests as you go. -
Work through both the
anf
implementation and the compiler implementation ofLet
.
Write tests as you go.
First, make sure that your code works with the provided test cases by running the command
make test
When you're ready, run the following command to submit your homework:
make turnin
Before submitting your code, you have to fill this form to register your groups. You need to fill it in for each assignment.
We will use this form to match your github account username to your student id, so you must fill it even if you have worked on the assignment individually.
To submit your code, just do:
$ make turnin
This will simply do a git commit
followed by a git push
to send us your code.
We will use the most recent commit of your code (on master
branch) as your submission.