This is an implementation for a small, strongly-typed functional language with parametric polymorphism, user-defined datatypes, pattern matching, and general recursion. The implementation uses no dependencies beyond the OCaml standard library.
Done:
- Definition of the syntax. See
syntax.ml
. - Evaluation of terms, including pattern-matching and recursion, and top-level
declarations. See
eval.ml
. - Pretty-printing of terms. See
pretty.ml
. - Unification. See
unify.ml
. - Typechecking. See
typecheck.ml
. - Some example programs. See
examples/ski.ev
. - External syntax. This is the syntax recognized by the parser.
- Scopechecking to translate external syntax to internal syntax. This pass ensures that all references are valid (constructors, other functions, types, variables, etc.)
- Parsing. The parser is a handwritten recursive descent parser using combinators. The error messages aren't great!
- Bytecode compilation. The bytecode is described in
bytecode.ml
, for a two-stack virtual machine (one for locals/parameters and one for the call stack)
Todo:
- tail-call optimization
- optimized pattern-matching compilation (current strategy is linear / naive)
- efficient interpreter implemented in C++ or Rust
- recursive local bindings (only top-level recursive functions work at the moment)
Syntax
: definitions of different ASTssynext.ml
external syntax: this is close to what the user writessynint.ml
internal syntax: what we typechecksynclo.ml
closed syntax: an intermediate representation generated by closure conversion.
Loc
: source locationsPretty
: pretty-printing of internal syntax, closed syntax, and bytecode instructions.Parser
: hand-written parser combinators that generate external syntax from a string.Scopecheck
: converts external syntax to internal syntax, disambiguates names (locals vs references), scope-checks all identifiers.TMVar
: unification variables, aka "type metavariables" (TMVars); type substitutionsUnify
: unification of typesTypecheck
: Hindley-Milner type inference using unificationEval
: evaluator for internal syntax, turns internal syntax into a value.Close
: closure conversion hoists function bodies occurring within functions to the top-level and replaces them with anMkClo
node.Bytecode
: description of the bytecode instructions emitted by compilation.Compile
: compiles closed syntax into bytecode instructions.Interpret
: bytecode interpreter.
The main flow of data is: text -> Parser
-> Scopecheck
-> Typecheck
-> Close
Currently, applications of constructors, statically known functions, and primitive operations are automatically eta-expanded, but that eta-expansion introduces new abstractions without appropriately adjusting the indices of bound variables, so it produces bogus code. Manually eta-expand everything as a workaround for now.