Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory forms and evaluation #7

Merged
merged 4 commits into from
Jul 24, 2018

Conversation

DIJamner
Copy link
Contributor

Fixes the specification of BIL memory by introducing a proper memory value-form and managing sizes correctly in the evaluation rules. Introduces a type well-formedness judgment in order to do so. Also fixes a couple preservation bugs involving unknown.

@ivg ivg self-requested a review July 16, 2018 15:34
Copy link
Member

@ivg ivg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really reluctant to add the compute_type relation to the semantics of the BIL language. It moves us far away from the untyped lambda calculus and makes evaluation dependent on the typing context. Also, I believe, that we can express the operational semantics of our program without any type-level computations.

Basically, we have two places where we need typing information to define the semantics: loads from non-byte storages, and operations on unknowns.

We can handle unknowns (which represent undefined bitstrings of the specified length, not a diverging computation, aka the bottom value) by enumerating all possibilities on the irreducible level, e.g.. for the concatenation operator, a well-formed program may be reduced to only four forms:

  1. w <cat> unk
  2. unk <cat> w
  3. unk <cat> unk
  4. w <cat> w

Neither of which requires type computation. The same is true for concatenations and extracts. So they all could be defined without compute_type. The same is true for extract and bop.

We can employ the same approach with the load operations. We should first reduce the load operation to a normal form on its storage. An irreducible storage would be unknown of type mem<a,t>, as we represent the initial state of the memory as an array of unknown values of type t, indexed by values of type a. Thus a storage will always be evaluated with unknown memory storage at the bottom of the derivation tree.

@DIJamner
Copy link
Contributor Author

I realized that I didn't specify the definition of the compute_type function. It should be defined as follows:

compute_type (num:nat) = imm<nat>
compute_type (v[num:nat <- v : sz]) = mem<nat,sz>
compute_type (unknown[str]:t) = t

I thought that the above definition would correspond to your idea that the type of a term should be simply computable from the syntax. If this definition clears things up, I will add it to the LaTeX.

Otherwise, it would be possible to split each rule that uses this function into two separate cases, one for word or memory values and one for unknown. While this would add a number of additional rules, we could structure it that way if you prefer.

Thanks for the feedback.

Copy link
Member

@ivg ivg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry this PR slipped of my attention. I'm fine with having compute as a metafunction (though I'm not a big fan of the compute_type name, as it suggests that this function does some computation, so something like typeof, valuetype or anything that doesn't actually involve any verbs in the name would be better. However, it doesn't show up in the rules, so that's fine).

I'm also still confused why we need this function and why we can't destruct the value directly in the judgement, e.g.,

sz' > sz
succ w = w'
-----------------------------------------  load_word_be
delta |- (v[_:nat] <- v' : sz)[w ...

perhaps using the def metafunction for making it shorter.

So is there any fundamental problem that I'm misising? Or this just makes things easier and shorter? I'm fine with it, if it makes your life easier, then let's make it as it is.

@DIJamner
Copy link
Contributor Author

No problem, thanks for reviewing all of my changes. The issue is that a mem<nat,sz> is either a memory literal or an unknown. I could duplicate each relevant rule, once for each, but I felt it better captured what was happening, and reduced the number of rules, to abstract that out into a metafunction.

@ivg
Copy link
Member

ivg commented Jul 24, 2018

Ok, this explanation works for me. And it also looks nice, I must admit :)

@ivg ivg merged commit ca2b1e2 into BinaryAnalysisPlatform:master Jul 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants