Skip to content
Open
Show file tree
Hide file tree
Changes from 22 commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
4a35d21
Add high level strata file
keyboardDrummer Oct 3, 2025
552bb66
Save
keyboardDrummer Oct 3, 2025
1fdbb8f
Added isPure: bool argument, but now I get an error about option
keyboardDrummer Oct 3, 2025
c5df937
Revert "Added isPure: bool argument, but now I get an error about opt…
keyboardDrummer Oct 3, 2025
24b0597
Added comments
keyboardDrummer Oct 4, 2025
6dac5d7
Save
keyboardDrummer Oct 6, 2025
d753511
Save before changing EvalResult
keyboardDrummer Oct 6, 2025
efa41bb
Add missing cases with panics
keyboardDrummer Oct 6, 2025
d6338eb
Rename files
keyboardDrummer Oct 6, 2025
530ee22
Merge branch 'main' into highStrata
keyboardDrummer Oct 6, 2025
d38ef71
Remove label from loop, and simplify jumps
keyboardDrummer Oct 6, 2025
fbfa165
Save before introducing stacks
keyboardDrummer Oct 6, 2025
9f44691
Tweaks
keyboardDrummer Oct 6, 2025
29d4883
Merge branch 'highStrata' of github.com:keyboardDrummer/Strata into h…
keyboardDrummer Oct 6, 2025
7bc8026
Update
keyboardDrummer Oct 6, 2025
c472e19
Update design choices section
keyboardDrummer Oct 7, 2025
252320f
Fix typo
keyboardDrummer Oct 7, 2025
51fdf85
Fix typo
keyboardDrummer Oct 7, 2025
6123382
Additions
keyboardDrummer Oct 8, 2025
6d5ff0c
Merge branch 'main' into highStrata
atomb Oct 21, 2025
70ab1d4
Rename file and small updates to comment
keyboardDrummer Oct 22, 2025
57a20bd
Merge branch 'highStrata' of github.com:keyboardDrummer/Strata into h…
keyboardDrummer Oct 22, 2025
6cfc4d0
Add examples
keyboardDrummer Oct 27, 2025
095adc8
Remove dynamic parts
keyboardDrummer Oct 27, 2025
971a857
Remove nullable
keyboardDrummer Oct 27, 2025
9c7acc0
Update comment
keyboardDrummer Oct 27, 2025
6a123d6
Add another example
keyboardDrummer Oct 27, 2025
0d7fbd5
Add constrained types example
keyboardDrummer Oct 27, 2025
a838e2d
Remove reference to dynamic
keyboardDrummer Oct 27, 2025
cffd116
Remove flow typing references
keyboardDrummer Oct 27, 2025
1def107
Merge branch 'main' into highStrata
keyboardDrummer Oct 27, 2025
c9cde82
Add comment
keyboardDrummer Oct 27, 2025
ad75a53
Tweaks to constructors
keyboardDrummer Oct 27, 2025
98aa2dd
Merge branch 'highStrata' of github.com:keyboardDrummer/Strata into h…
keyboardDrummer Oct 27, 2025
b8f088d
Renames
keyboardDrummer Oct 28, 2025
b1600a5
Undo rename
keyboardDrummer Oct 28, 2025
0fdbe1d
Add copyright header
keyboardDrummer Oct 28, 2025
403040d
Merge branch 'main' into highStrata
keyboardDrummer Oct 28, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
290 changes: 290 additions & 0 deletions Strata/DL/HighStrata/HighStrata.lean
Original file line number Diff line number Diff line change
@@ -0,0 +1,290 @@
/-
The high strata language is supposed to serve as an intermediate verification language for at least Java, Python, JavaScript.

It enables doing various forms of verification:
- Deductive verification
- Property based testing
- Data-flow analysis

Features currently not present:
- Type inference. The source program needs to specify enough types so that no inference is needed.
- Type checking. We assume types have already been checked before.
- Namespaces. All definition and reference names consist of a single Identifier

Design choices:
- Pure contracts: contracts may only contain pure code. Pure code does not modify the heap, neither by modifying existing objects are creating new ones.
- Callables: instead of functions and methods we have a single more general concept called a 'callable'.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered "callables" being thunks? This is something I was exploring that was working out quite nicely. There are instructions for both the caller and callee to create/update/read-from the activation record separate from the invocation. This would then support the various kinds of call semantics that each language has (e.g., TypeScript has two with function and => functions, Python has lambdas, bound methods, unbound methods).

Copy link
Author

@keyboardDrummer keyboardDrummer Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closures/thunks are supported through the dynamic dispatch mechanism, see the discussion I had with Aaron on this: https://github.com/strata-org/Strata/pull/152/files#r2468899688 and the InstanceCall constructor of StmtExpr

Basically, you need a mechanism to group behavior and data for handling everything you mention above. However, a more general mechanism is to also allow grouping data and several named behaviors, since that supports the former but also supports doing dynamic dispatch of virtual methods in languages like Java.

But also, the InstanceCall that I've proposed here will for sure be useful for Java, Kotlin and C#. If there is another high-level concept that would work well for multiple other languages, that we can consider adding that as well.

- Purity: Callables can be marked as pure or impure. Pure callables have a reads clause while impure ones have a modifies clause.
A reads clause is currently not useful for impure callables, since reads clauses are used to determine when the output changes, but impure callables can be non-determinismic so the output can always change.
- Opacity: callables can have a body that's transparant or opaque. Only an opaque body may declare a postcondition. A transparant callable must be pure.
- StmtExpr: Statements and expressions are part of the same type. This reduces duplication since the same concepts are needed in both, such as conditions and variable declarations.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree! This is great.

- Loops: The only loop is a while, but this can be used to compile do-while and for loops to as well.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, as long as we have some way to recover the higher-level loop (e.g.., to recognize a "foreach" loop that can be vectorized), but I take this as more the job the Strata framework than an individual dialect.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is only intended as a first version. In particular, the current set of constructs is what's needed to support a JVerify deductive verification use-case. Even if it would be impossible to recover the higher-level loop, we can still add one later so the recovery is not needed.

- Jumps: Instead of break and continue statements, there is a labelled block that can be exited from using an exit statement inside of it.
This can be used to model break statements and continue statements for both while and for loops.
- Pattern matching: there is no match-case construct, but there are type tests with pattern matching that enable the same functionality but more generally. `if (x is <Type> <Binding>) { access to <Binding> }`

- User defined types consist of two categories: composite types and constrained types.
- Composite types have fields and callables, and may extend other composite types.
- Fields state whether they are mutable, which impacts what permissions are needed to access them
- Fields state their type, which is needed to know the resulting type when reading a field.
- Constrained types are defined by a base type and a constraint over that type.
- Algebriac datatypes do not exist directly but can be encoded using composited and constrained types.

- The base type for all composite types is dynamic, which is a type that can be type tested.
For all primitive types there is an implicit composite type that wraps around the primitive, so primitives can be boxed to become the Dynamic type. They can be unboxed using a type test. This is useful for source languages such as JavaScript. The operators that work on primitives also work on the dynamic type, although they can error if the types do not align.
- WIP: There is no concept of constructors, but each composite type has a partial variant that represents an object of that type whose fields
are not yet assigned and whose type invariants might not hold.
A partial type can be completed to a full type once all fields are assigned and the type invariants are known to hold.

-/

abbrev Identifier := String /- Potentially this could be an Int to save resources. -/

mutual
structure Callable: Type where
name : Identifier
typeParameters : List TypeParameter
inputs : List Parameter
output : HighType
precondition : StmtExpr
decreases : StmtExpr
purity : Purity
body : Body

structure Parameter where
name : Identifier
type : HighType

inductive HighType : Type where
| TVoid
| TBool
| TInt
| TReal
| TFloat64 /- Required for JavaScript (number). Used by Python (float) and Java (double) as well -/
/- A value of type `Dynamic` is a tuple consisting of a type and a value.
The value can be a primitive type or a map (that can be used with the field select and field assign expressions)
`var o: Dynamic = {}; o.age = 13;`
Values are automatically casted to and from `Dynamic`
Example: `var x: Dynamic = 3; return x is Int` returns `True`
-/
| Dynamic
| UserDefined (name: Identifier)
| Applied (base : HighType) (typeArguments : List HighType)
/- Partial represents a composite type with unassigned fields and whose type invariants might not hold.
Can be represented as a Map that contains a subset of the fields of the composite type -/
| Partial (base : HighType)
/- A nullable type is implicitly cast to its base when required,
such as in a member access or when assigned to a variable of the base type -/
| Nullable (base : HighType)
/- Java has implicit intersection types.
Example: `<cond> ? RustanLeino : AndersHejlsberg` could be typed as `Scientist & Scandinavian`-/
| Intersection (types : List HighType)
deriving Repr

structure TypeParameter where
name : Identifier
bounds : List HighType /- Java has bounded type parameters -/

inductive Purity: Type where
/-
Since a reads clause is used to determine when the result of a call changes,
a reads clause is only useful for deterministic callables.
-/
| Pure (reads : StmtExpr)
| Impure (modifies : StmtExpr)

/- No support for something like function-by-method yet -/
inductive Body where
| Transparent (body : StmtExpr)
/- Without an implementation, the postcondition is assumed -/
| Opaque (postcondition : StmtExpr) (implementation : Option StmtExpr)
/- An abstract body is useful for types that are extending.
A type containing any members with abstract bodies can not be instantiated. -/
| Abstract (postcondition : StmtExpr)

/- We will support these operations for dynamic types as well -/
/- The 'truthy' concept from JavaScript should be implemented using a library function -/
inductive Operation: Type where
/- Works on Bool -/
/- Equality on composite types uses reference equality for impure types, and structural equality for pure ones -/
| Eq | Neq
| And | Or | Not
/- Works on Int/Real/Float64 -/
| Neg | Add | Sub | Mul | Div | Mod
| Lt | Leq | Gt | Geq

/-
A StmtExpr contains both constructs that we typically find in statements and those in expressions.
By using a single datatype we prevent duplication of constructs that can be used in both contexts,
such a conditionals and variable declarations
-/
/-
It would be nice to replace `Type` on the next line with `(isPure: Bool) -> Type`,
so that we can prevent certain constructors from being used for pure StmtExpr's,
but when trying that Lean complained about parameters being used in a nested inductive,
for example in `Option (StmtExpr isPure)`
-/
inductive StmtExpr : Type where
/- Statement like -/
| IfThenElse (cond : StmtExpr) (thenBranch : StmtExpr) (elseBranch : Option StmtExpr)
| Block (statements : List StmtExpr) (label : Option Identifier)
/- The initializer must be set if this StmtExpr is pure -/
| LocalVariable (name : Identifier) (type : HighType) (initializer : Option StmtExpr)
/- While is only allowed in an impure context
The invariant and decreases are always pure
-/
| While (cond : StmtExpr) (invariant : Option StmtExpr) (decreases: Option StmtExpr) (body : StmtExpr)
| Exit (target: Identifier)
| Return (value : Option StmtExpr)
/- Expression like -/
| LiteralInt (value: Int)
| LiteralBool (value: Bool)
-- Next line is commented out since this needs MathLib
-- | LiteralReal {Rat} (value: Rat)
| Identifier (name : Identifier)
/- Assign is only allowed in an impure context -/
| Assign (target : StmtExpr) (value : StmtExpr)
| StaticFieldSelect (target : StmtExpr) (fieldName : Identifier)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name StaticFieldSelect suggests it applies to static fields, but it also includes a target parameter. Do you mean the term "static" to refer to the fact that the field name is a constant?

Copy link
Author

@keyboardDrummer keyboardDrummer Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the field can be statically resolved because the type of the target and the name of the field are statically known.

I often use the term "static" as opposed to dynamic, to refer to compile time things. But yes you're right I'm overloading the term. We should figure out how to eliminate the overloading, by renaming either of the usages. I don't have any immediate ideas for that though.

I've now renamed this to FieldSelect since there are no other field selects and I don't think we will need one since for JavaScript's <target>.<member> we would probably translate that to StaticCall("JSRuntimeLibrary.PrototypeBasedMemberRead", [<target>, <member>]) and similarly for other languages.

/- PureFieldUpdate is the only way to assign values to fields of pure types -/
| PureFieldUpdate (target : StmtExpr) (fieldName : Identifier) (newValue : StmtExpr)
| StaticInvocation (callee : Identifier) (arguments : List StmtExpr)
| PrimitiveOp (operator: Operation) (arguments : List StmtExpr)
/- Instance related -/
| This
/- IsType works both with KnowsType and Composite types
The newBinding parameter allows bringing a new variable into scope that has the checked type
The scope where newBinding becomes available depends on where IsType is used
Example 1: `x is <SomeType> newVar && newVar.someField` here the variable `newVar` became in scope in the RHS of the &&
Example 2: `if x is <SomeType> newVar then newVar.someField else ...` here the variable `newVar` became in scope in the then branch

Together with IfThenElse, IsType replaces the need for other pattern matching constructs
-/
| IsType (target : StmtExpr) (type: HighType) (newBinding : Option Identifier)
| InstanceInvocation (target : StmtExpr) (callee : Identifier) (arguments : List StmtExpr)

/- Related to creation of objects -/
/- Create returns a partial type, whose fields are still unassigned and whose type invariants are not guaranteed to hold. -/
/- In a pure context, may only create pure types -/
| Create (type : Identifier)
/- Takes an expression of a partial type, checks that all its fields are assigned and its type invariants hold,
and return the complete type.
In case the partial type contained members with partial values, then those are completed as well, recursively.
This way, you can safely construct cyclic object graphs. -/
| Complete (value : StmtExpr)

/- Related to dynamic language features -/
| DynamicInvocation (callable : StmtExpr) (arguments : List StmtExpr)
/- alternatively, we could have a closure that takes a CompositeType, like Java's inner classes
This would be more powerful but slightly more of a hassle to use when creating callable closures -/
| Closure (callable: Callable)
/- The next two could be defined using a library -/
| DynamicFieldAccess (target : StmtExpr) (fieldName : StmtExpr)
| DynamicFieldUpdate (target : StmtExpr) (fieldName : StmtExpr) (newValue : StmtExpr)

/- Verification specific -/
-- TODO: Add forall and exists
| Assigned (name : StmtExpr)
| Old (value : StmtExpr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful for this to be able to refer to the state at any label, not just a distinguished "old" state?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably but the Java verification I'm working on doesn't use it yet.

/- Fresh may only target impure composite types -/
| Fresh(value : StmtExpr)

/- Related to proofs -/
| Assert (condition: StmtExpr)
| Assume (condition: StmtExpr)
/-
ProveBy allows writing proof trees. Its semantics are the same as that of the given `value`,
but the `proof` is used to help prove any assertions in `value`.
Example:
ProveBy(
someCall(arg1, arg2),
ProveBy(
aLemmaToHelpProveThePreconditionOfTheCall(),
anotherLemmaToProveThePreconditionOfTheFirstLemma()
)
)
-/
| ProveBy (value: StmtExpr) (proof: StmtExpr)

-- ContractOf allows extracting the contract of a function
| ContractOf (type: ContractType) (function: StmtExpr)
/-
Abstract can be used as the root expr in a contract for reads/modifies/precondition/postcondition. For example: `reads(abstract)`
It can only be used for instance callables and it makes the containing type abstract, meaning it can not be instantiated.
An extending type can become concrete by redefining any callables that had abstracts contracts and providing non-abstract contracts.
-/
| Abstract
| All -- All refers to all objects in the heap. Can be used in a reads or modifies clause
/- Hole has a dynamic type and is useful when programs are only partially available -/
| Hole

inductive ContractType where
| Reads | Modifies | Precondition | PostCondition
end

partial def highEq (a: HighType) (b: HighType) : Bool := match a, b with
| HighType.TVoid, HighType.TVoid => true
| HighType.TBool, HighType.TBool => true
| HighType.TInt, HighType.TInt => true
| HighType.TReal, HighType.TReal => true
| HighType.TFloat64, HighType.TFloat64 => true
| HighType.Dynamic, HighType.Dynamic => true
| HighType.UserDefined n1, HighType.UserDefined n2 => n1 == n2
| HighType.Applied b1 args1, HighType.Applied b2 args2 =>
highEq b1 b2 && args1.length == args2.length && (args1.zip args2 |>.all (fun (a1, a2) => highEq a1 a2))
| HighType.Partial b1, HighType.Partial b2 => highEq b1 b2
| HighType.Nullable b1, HighType.Nullable b2 => highEq b1 b2
| HighType.Intersection ts1, HighType.Intersection ts2 =>
ts1.length == ts2.length && (ts1.zip ts2 |>.all (fun (t1, t2) => highEq t1 t2))
| _, _ => false

instance : BEq HighType where
beq := highEq

def HighType.isDynamic : HighType → Bool
| Dynamic => true
| _ => false

def HighType.isBool : HighType → Bool
| TBool => true
| _ => false

structure Field where
name : Identifier
isMutable : Bool
type : HighType

structure CompositeType where
name : Identifier
typeParameters : List TypeParameter
extending : List Identifier
fields : List Field
isPure : Bool /- A pure type may not have mutable fields, and does not support reference equality -/
instanceCallables : List Callable

structure ConstrainedType where
Copy link
Author

@keyboardDrummer keyboardDrummer Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering whether we could get rid of the need of a ConstrainedType, by adding instead an invariant: StmtExpr field to CompositeType, but the issue is then the composite type would always wrap whatever type you want to add a constraint to, so it doesn't seem like it would work as a replacement

name : Identifier
base : HighType
valueName : Identifier
constraint : StmtExpr
witness : StmtExpr

/-
Note that there are no explicit 'inductive datatypes'. Typed unions are created by
creating a CompositeType for each constructor, and a ConstrainedType for their union.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there were algebraic data types, would that be a convenient alternative?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For verification from Java, I don't see a use for algebraic datatypes. There's nothing in Java that would map to them. The concept of a ADT in Java can be modeled using a sealed interface and several implementing records, but that can not be mapped to an actual ADT since in the Java version, each implementing record is a type on its own, which can occur in for example a type test or be the type of a local variable, while in an ADT the constructors are not types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wouldn't expect them to directly model Java types. I was more wondering whether you were using things like Option for internal purposes. But if it's just the representation of an Option type that exists at the Java level, then I agree.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think this answers my question about composite types above. :) If you don't have named type, you can't create a constrained version of it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more wondering whether you were using things like Option for internal purposes.

We're not


Example 1:
`composite Some<T> { value: T }`
`constrained Option<T> = value: Dynamic | value is Some<T> || value is Unit`

Example 2:
`composite Cons<T> { head: T, tail: List<T> }`
`constrained List<T> = value: Dynamic | value is Cons<T> || value is Unit`
-/
inductive TypeDefinition where
| Composite (ty : CompositeType)
| Constrainted {ConstrainedType} (ty : ConstrainedType)

structure Program where
staticCallables : List Callable
staticFields : List Field
types : List TypeDefinition
Loading