
Slicing rules for desugaring #41

Closed
4 tasks done
rolyp opened this issue May 19, 2020 · 16 comments

rolyp (Collaborator) commented May 19, 2020

For explorable-viz/fluid#278 and explorable-viz/fluid#277 we need syntax sugar. Slicing needs to be defined for desugaring.

  • syntax of surface language (as separate figure), including if..then..else fluid#115
  • desugaring definition, with auxiliary function to totalise an eliminator
  • forward slicing rules
  • backward slicing rules
  • prove that desugar_fwd and desugar_bwd form a Galois connection – see Desugaring Galois connection #52
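To make the Galois connection requirement in the last task concrete, here is a toy sketch (in Haskell; `fwd` and `bwd` are illustrative stand-ins, not the real `desugar_fwd`/`desugar_bwd`): two monotone maps form a Galois connection when `fwd x ⊑ y` iff `x ⊑ bwd y`.

```haskell
-- Toy illustration of the Galois connection property, on the lattice
-- Bool with False ⊑ True. fwd/bwd are stand-ins for desugar_fwd/desugar_bwd.
leq :: Bool -> Bool -> Bool
leq x y = not x || y                      -- False ⊑ True

leq2 :: (Bool, Bool) -> (Bool, Bool) -> Bool
leq2 (a, b) (c, d) = leq a c && leq b d   -- pointwise order on pairs

fwd :: (Bool, Bool) -> Bool               -- stand-in for desugar_fwd
fwd (a, b) = a || b

bwd :: Bool -> (Bool, Bool)               -- stand-in for desugar_bwd
bwd c = (c, c)

-- adjointness, checked exhaustively over the finite domain
galois :: Bool
galois = and [ leq (fwd (a, b)) y == leq2 (a, b) (bwd y)
             | a <- [False, True], b <- [False, True], y <- [False, True] ]
```

The real proof obligation replaces these toy maps with the forward and backward desugaring-slicing relations.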
rolyp mentioned this issue May 19, 2020 (7 tasks)
rolyp assigned min-nguyen and unassigned rolyp Jun 29, 2020
rolyp (Collaborator, Author) commented Jun 29, 2020

@min-nguyen Here’s the sketch from earlier that shows the type of the relation, the form of the judgement, an example rule and the Galois connection that we’ll derive from it.
[Attached image: sketch of the relation type, the form of the judgement, an example rule and the derived Galois connection]

Once you’ve completed the final tidy-up of explorable-viz/fluid#300, we can have a chat about how to get started with formalising desugaring.

rolyp (Collaborator, Author) commented Jul 6, 2020

@min-nguyen If you’re happy getting started with this over the next couple of days, a concrete suggestion would be to sketch out a definition of the “desugars to” relation (written above) in the style of Fig. 9 in the paper. Start by giving the syntax of the surface language in the style of Fig. 2. (Not necessarily in LaTeX – paper would be fine.) Then define a relation like Fig. 9. You can ignore the greyed-out T :: that occurs to the left of the judgement in every premise/conclusion – these correspond to traces in the implementation. For now we needn’t worry about these for desugaring, although we will want them at some point. Ping me if you have questions; otherwise let’s talk when I’m free on Weds.

rolyp (Collaborator, Author) commented Jul 13, 2020

@min-nguyen See below. I recommend using GitHub branches when working with the formalism – knowing you’re on a branch should allow you to feel more relaxed about making changes! Feel free to ping me about macros or any other LaTeX questions.

[Three attached images: handwritten notes on the formalism]

rolyp (Collaborator, Author) commented Jul 21, 2020

@min-nguyen Notes on desugaring rules that want to reuse the surface language:
[Attached image: notes on desugaring rules that reuse the surface language]

And on “totalising” an eliminator with a default continuation, needed for the desugaring of partial generators (and which we can also use to add catch-all _ clauses to the surface language):
[Attached image: notes on totalising an eliminator with a default continuation]

rolyp (Collaborator, Author) commented Jul 30, 2020

@min-nguyen We sketched the form of the relation, but I don’t think we discussed any specific rules. Something I mentioned last time: we may not need a trace after all for desugaring. Instead we can probably use the original (unsugared) expression, which contains enough information to run the computation backward. (You can actually take this approach with evaluation as well, at least for a deterministic language, but it becomes prohibitively expensive. IIRC, you end up having to do a lot of evaluation during backwards slicing – in fact more than you would do just executing the program once.)

[Attached image: a rule from the desugaring relation, with the corresponding backward slicing rule underneath]

So, in the upper part of the diagram above, I’ve given a rule from the desugaring relation. Underneath is the corresponding backward slicing rule. The basic idea behind the bwd rules is to push the annotations on “core” language constructors back onto “surface” language constructors. For now you can just have an intuitive stab at this; when we come to define fwd slicing, we’ll need to make sure (and eventually prove) that the bwd propagation of annotations is sound w.r.t. fwd propagation, and vice versa. Don’t worry about this too much for now.

I think it will be useful here (and in the core language) to distinguish expressions which are unannotated all the way down (where the type of the annotations is Unit), from expressions where a bit is associated with every constructor (where the type of the annotations is 𝔹). For example, the parser should build SExpr Unit, which desugars to Expr Unit, which evaluates to Expl Unit and Val Unit. By doing map (const True), we can switch to a slicing setting, where desugar_fwd goes from SExpr 𝔹 to Expr 𝔹, and eval_fwd goes from Expr 𝔹 to Expl 𝔹 and Val 𝔹. That allows us to ignore annotations (since they’re unit) when we don’t care about them.

With that in mind, I’m abusing metavariables a bit here: on the left of the bwd arrow, we should probably think of s as ranging over SExpr Unit, whereas on the right of the bwd arrow, as ranging over SExpr 𝔹.
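The Unit/𝔹 annotation idea above can be sketched with a functor (hypothetical Haskell stand-ins for the PureScript `SExpr`; constructor names are illustrative):

```haskell
{-# LANGUAGE DeriveFunctor #-}
-- Hypothetical stand-in for annotated surface syntax: one annotation of
-- type a per constructor. The parser builds SExpr (); fmap (const True)
-- switches to the slicing setting over Bool.
data SExpr a = SVar a String
             | SPair a (SExpr a) (SExpr a)
  deriving (Show, Eq, Functor)

toSliced :: SExpr () -> SExpr Bool
toSliced = fmap (const True)
```

In this setting `desugar_fwd` would go from `SExpr Bool` to `Expr Bool`, while plain desugaring works over `SExpr ()`.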

rolyp (Collaborator, Author) commented Aug 6, 2020

@min-nguyen Ok, I’ve updated the implementation along the lines of what I proposed yesterday. There is now a type class [Bounded]JoinSemilattice for bona fide join semilattices, with join total and bot a nullary constant. There are instances for Boolean and Unit.

The old [Bounded]JoinSemilattice type class is now called [Bounded]Slices and has a partial maybeJoin and botOf: a -> a.

I’ve parameterised the Slices instances for Expr and Val so that they work for annotations of an arbitrary JoinSemilattice. For notational convenience (so that we can write joins in backward slicing without worrying that they may technically be undefined), every Slices is also a JoinSemilattice which implements join as fromJust <<< maybeJoin, using a helper called definedJoin.

I’ve also generalised the expr and val helpers so that they construct a term using bot, rather than assuming the annotation to be of type Boolean (which I think is the problem you were facing yesterday). And the Pretty instance is similarly parameterised over an arbitrary BoundedJoinSemilattice (since it uses expr, although only once).
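The classes described above might look roughly like this in Haskell (a hedged transcription; the actual implementation is PureScript and the details may differ):

```haskell
-- Bona fide join semilattices: total join, with bot a nullary constant.
class JoinSemilattice a where
  join :: a -> a -> a

class JoinSemilattice a => BoundedJoinSemilattice a where
  bot :: a

instance JoinSemilattice Bool where join = (||)
instance BoundedJoinSemilattice Bool where bot = False

instance JoinSemilattice () where join _ _ = ()
instance BoundedJoinSemilattice () where bot = ()

-- Slices: join may be undefined (maybeJoin); botOf erases annotations.
class Slices a where
  maybeJoin :: a -> a -> Maybe a
  botOf :: a -> a

-- definedJoin recovers a total join by asserting definedness.
definedJoin :: Slices a => a -> a -> a
definedJoin x y = maybe (error "maybeJoin: undefined") id (maybeJoin x y)
```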

rolyp (Collaborator, Author) commented Aug 6, 2020

@min-nguyen Nib was loose on my pencil 🤦 .

Thinking some more, having eval work with Expr Unit and eval_fwd work with Expr 𝔹 sounds nice, but might not be easy to do in practice. We will have an environment of type Env Unit (containing primitives and prelude) which we use to evaluate the program, which we will need to be of type Env 𝔹 when we come to forward slice in that environment. Although conceptually this is as simple as applying map (const true), unfortunately map doesn’t really work with environments: an environment is a DAG with a vast amount of redundancy (lots of sharing), and recursing over the DAG effectively unravels it into an enormous tree. Without some kind of memoisation/dynamic programming, it’s not a feasible computation.

So, in the formalism I think we can say the annotation type is (meta-)unit if it is convenient to do so, but in the implementation we will for now assume it’s always Boolean, even if sometimes we don’t care.

rolyp (Collaborator, Author) commented Aug 11, 2020

@min-nguyen Ok, I’ve had a chance to look over what you’ve done. This is looking good – you’ve really made progress. We’ve still a fair way to go to reach the point we need to get to, but that’s not a reflection on what you’ve done, just the normal process of iterating over something until it converges.

I have lots of minor suggestions, and a couple of other thoughts that we might need to discuss. If you could aim to process these changes by Friday then we can talk about other things.

  • I suggest dropping the σ,e index on the totalising relation; they look like additional arguments. It can be reasonable to use subscripts to generate overloads/variants of relation symbols, but it’s best used judiciously and probably also best to use a different font (e.g. a sans serif font) so that it’s clear the symbol isn’t acting as a metavariable.
  • Let’s stick to UK spelling, so totalising.
  • It’s probably overkill to define totalise as an inductively defined relation. It should be “obviously” total and deterministic, and thus definable directly as a function (via a set of defining recursive equations – look at the LaTeX environments used in Fig. 10 or 11). One reason to use inductive definitions is when we need to be able to treat the definition as a data type (as with evaluation traces); another is when the thing you’re defining is a proper relation, or at least not obviously a function. Neither of those apply here, so a function is probably best.
  • The var-elim and var-expr cases of totalise could use some parentheses around the eliminator argument. The nil case could probably lose the parentheses inside the Cons.
  • Contexts (Fig. 1) need to allow Δ as a metavariable.
  • Fig. 17 refers to “extended” terms in a few places. Let’s standardise on “surface-language terms”. For the use inside the figure itself we can probably say “Raw surface terms”.
  • Let’s standardise on “list range” rather than “list sequence”. Don’t forget the rule names as well.
  • Fig. 6 (Typing rules for terms) still includes surface-language terms.
  • Let’s always put the Nil case before the Cons case (e.g. in the definition of Pattern, Fig. 17)
  • Double and triple primes don’t always format well in LaTeX. There are macros in scope called (I think) \twoPrime and \threePrime which should help with this.
  • Drop the let binding from the list-comp-gen rule (but see comment below).
  • The list-comp-qual desugaring rule overlaps with list-comp-true, which makes the rules non-deterministic (and potentially non-terminating). This can be fixed by adding an appropriate side-condition to list-comp-qual.
  • I feel like we should add regular list notation to the surface language now, so that [] desugars to Nil and [s · \vec{s}] desugars to Cons(e, e') in the obvious way. Otherwise it seems weird that we support list comprehensions but not normal lists!
  • I think we need to be more explicit about an operation (probably best defined as a function) that converts a pattern and a continuation into a (singleton) eliminator. In particular list-comp-gen invokes totalise with an argument that isn’t an eliminator. (p \mapsto e doesn’t really make sense.)
  • The list-comp-decl rule is problematic because structured let bindings (let p = e in e’) don’t exist in the core language either, nor are they trivial to add without introducing the notion of pattern (or singleton eliminator) into the core. (Currently they are defined in the appendix.) Assuming we have a function that we can use to turn p and e into an eliminator σ, we can use that here and then desugar the let qualified into a match..as. That in turn suggests a better way of dealing with structured let more generally, rather than the way it’s currently dealt with in the appendix.
  • We should probably assume range is a curried function.
  • I don’t think we need Fig. 25 (pattern-matching by pattern).
  • Some of the RHSs in the backward slicing rules don’t look right – backward slicing must produce a slice of the original expression. In particular list-comp-true and list-comp-qual look wrong. I haven’t looked carefully yet at the others.

The list-comp-gen rule is going to take some thought. You seem to have a slightly odd mixture here: an anonymous function (which doesn’t exist in the core language), which you bind to a name f. Generating names during desugaring will require freshness side-conditions, which will make the rules non-deterministic, which will mean (for backward slicing) we will need a trace after all to record the names that were picked. That might be a complication to avoid if we can, which suggests we need anonymous functions in the core. But then without some further refactoring, we’ll end up with two kinds of closure (recursive and non-recursive). These are low-level details, mostly presentational, but we’ll need to decide how best to rearrange things. For now I’ve suggested dropping the binding to f and assuming anonymous functions exist in the core.

Finally, there’s some thinking to do about annotation-propagation in the slicing rules. The slicing rules we’ve given elsewhere don’t uniformly assume expressions have annotations on them (although the implementation does), but they do propagate annotations on data constructors and constants. We’ll need to do something similar for desugaring. But I think it’s fine to get the structure of the rules right first, and then think about the annotations.
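The “totalise with a default continuation” operation mentioned in the bullets can be sketched on a toy list-only eliminator (illustrative types, not the paper’s definitions):

```haskell
-- Toy list eliminator: one branch per constructor, either possibly missing.
data Elim k = Elim { nilBranch :: Maybe k, consBranch :: Maybe k }
  deriving (Show, Eq)

-- totalise fills each missing branch with the default continuation, so
-- matching can no longer get stuck (cf. adding a catch-all _ clause).
totalise :: k -> Elim k -> Elim k
totalise def (Elim n c) =
  Elim (Just (maybe def id n)) (Just (maybe def id c))
```

The real operation also recurses into nested eliminators; this sketch only shows the top-level idea of completing the missing branches.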

rolyp (Collaborator, Author) commented Aug 26, 2020

@min-nguyen Changes from today’s discussion:

  • (Figs. 6, 7 and perhaps elsewhere) Let’s generalise the list typing rules so that lists are polymorphic.
  • Let’s write [ ] as [] and ... as .. (a la Haskell). I would put the .. in code font if it isn’t already.
  • Rename “nil” and “cons” rules in Fig. 17 to “empty list” and “non-empty list”.
  • We need pattern forms for list notation as well (empty and non-empty). Fig. 19 needs the corresponding typing rules.
  • The list typing rules (but not the range typing) need to be made polymorphic.
  • We need a separate typing judgement for the sequence form that occurs in the second premise of the non-empty list typing rule. There are other approaches here, but this is consistent with the treatment of sequences of qualifiers.
  • if-then-else should now desugar to a lambda.
  • match-as needs to desugar the σ too (needs a separate definition).
  • totalise and branch can probably be combined into a single definition (see figure below).
  • Drop s' output of forward slicing judgement.
  • list-comp-gen backward slicing rule: express second premise in terms of untotalise function, which takes the original p and the output of totalise, σ, and produces a (slice of) the original e that p was mapped to.
  • Introduce “Surface terms” to Fig. 17, ranged over by t, which attach an annotation to a raw surface term. All the recursive occurrences of s in the definition of s should become t (and similarly for the occurrences of s in q).
  • We needn’t actually have lambdas as part of the core language – I think it’s enough just to assume that it’s clear how to add lambdas (since they’re described in the appendix), and then make use of them in the desugaring. The problem with adding them to the core language is that they give rise to a different notion of closure. We don’t want 2 kinds of closure in the core and unifying them isn’t easy. (It’s not so much technically challenging, just inelegant.)

Then we’ll be in a position to enrich Fig. 24 to specify how the annotations on an e are converted back into annotations on a t.

[Attached image: combined totalise/branch definition]

rolyp (Collaborator, Author) commented Sep 8, 2020

@min-nguyen I’m thinking we should aim for ICFP next year (deadline ~Feb). I’ll email separately about that. First some thoughts on how to move forward with the desugaring formalisation and implementation. These need to be tied together as we move forward and you should be in the driving seat for both, with my guidance.

A couple of high-level points, now that things are a bit clearer. First, I think we should treat eliminators as core language only — very similar to the way Haskell pattern-matching compiles to case trees. We already have some of this in place, as we had to introduce a pattern syntax for list comprehensions and a way of “totalising” these into eliminators. What I propose we do further is allow functions to be defined “equationally”, i.e. as lists of equations with patterns on the LHS. These equations should be turned into an eliminator by merging their patterns. This operation is only partial; if it fails, the equations are ill-formed, e.g. because they try to merge constructors from distinct data types.

The implementation does this already, so this is just a question of formalising this aspect of the implementation. The upside is that we will have a clean separation of concerns between the surface language (which uses more familiar pattern syntax) and core language (which uses eliminators), and a small but useful contribution of the paper will be to show how Galois slicing works in that setting.
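The merging of per-clause eliminators can be sketched on the same kind of toy list-only eliminator (illustrative; real merging handles all datatypes, merges nested eliminators recursively, and fails on clauses that mix constructors from different types — here simplified so that at most one clause may define each branch):

```haskell
-- Toy list eliminator, as before.
data Elim k = Elim { nilB :: Maybe k, consB :: Maybe k }
  deriving (Show, Eq)

-- Partial merge: Nothing signals ill-formed (overlapping) equations.
merge :: Elim k -> Elim k -> Maybe (Elim k)
merge (Elim n1 c1) (Elim n2 c2) = Elim <$> pick n1 n2 <*> pick c1 c2
  where
    pick (Just x) Nothing  = Just (Just x)
    pick Nothing  (Just y) = Just (Just y)
    pick Nothing  Nothing  = Just Nothing
    pick (Just _) (Just _) = Nothing   -- overlapping branches: ill-formed
```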

The other point concerns list notation (let’s use this term to mean notation like [x, y, …] rather than Nil and Cons). (Minor aside: it’s annoying to have both forms in the same surface language, but it seems we can’t do without; we might be able to make this slightly less painful by renaming Nil to [], and perhaps Cons to infix :, but I won’t suggest that just yet.)

The key observation about list notation is that I think we need to be syntactically explicit about the implied Cons nodes. Currently we give the syntax of lists as either [] or [t · \vec{t}], where \vec is effectively interpreted as a meta-level type constructor for sequences. This doesn’t really allow the surface language to track the annotations on the Cons nodes in the desugared expression. I think we need to model the grammar of list notation more explicitly:

[Attached image: grammar of the explicit list notation]

The “list rest” metavariable is the letter ell. I’ve left the typing and evaluation rules for ell for you to do (they’re straightforward). Hopefully this makes some kind of sense; if the rationale for doing things this way isn’t quite clear yet, hopefully it will become so once you start writing down the rules that propagate annotations between the two notations. (Returning to the aside above: if we were to rename Nil to [], we no longer need any extra rules for [].)
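The explicit grammar might be sketched like this (illustrative Haskell types, not Fluid’s; list elements are shown already desugared for brevity):

```haskell
-- Core expressions: each constructor carries an annotation slot a.
data Expr a = EInt a Int | ENil a | ECons a (Expr a) (Expr a)
  deriving (Show, Eq)

-- "List rest" (the ell metavariable): each implied Cons node gets its own
-- annotation, so desugaring can transfer annotations between notations.
data ListRest a = End a                        -- the closing ]
                | Next a (Expr a) (ListRest a) -- element, then rest
  deriving (Show, Eq)

desugarRest :: ListRest a -> Expr a
desugarRest (End a)      = ENil a              -- ] keeps its annotation on Nil
desugarRest (Next a e l) = ECons a e (desugarRest l)
```

The point is that every Cons in the desugared term has a corresponding annotation position in the surface notation, which is what the flat `[t · \vec{t}]` presentation couldn’t track.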

I’ll create a new issue for the “equational definitions” desugaring, and create a list of todo’s for list notation and other minor tweaks, this afternoon. (I’m tied up now until about 3pm.)

rolyp (Collaborator, Author) commented Sep 9, 2020

@min-nguyen Todo’s relating to list notation and other minor comments:

  • Let’s swap metavariables s and t so that s ranges over surface terms (I think this will be more memorable, but it means updating all the figures – sorry!)
  • New list notation as described above and associated typing and evaluation rules.
  • List patterns will also need to support the new list notation, with an analogous grammar.
  • Eliminators no longer need to be part of the surface language, and so can be removed from Fig. 17. We will need to address the question of partiality but we’ll do that as part of Piecewise definitions #51.
  • I don’t really understand the definition of eliminator desugaring (Fig. 23). Luckily we don’t need it any more.
  • Based on your observation that desugaring and forward slicing of desugaring are not very different, I suggest we turn Fig. 22 into a definition of forward slicing, rather than desugaring. This will mean extending each rule so that the conclusion takes a surface term on the LHS rather than a raw surface term and then specifies how annotations are produced on the output term. We won’t give an explicit definition of desugaring itself, but just say (in the text) that it’s the same as the forward slicing rules, but with the annotations disregarded.
  • Define forward and backward slicing for desugaring “list rest”. This is where you’ll see the point of the explicit list notation, because the annotations will be transferred to and from the desugared form.
  • Kill match from the surface language until we complete Piecewise definitions #51
  • The list-comp-true rule could be renamed list-comp-done.
  • I don’t think the t’ ∈ Bool side-condition on list-comp-guard is necessary (and I don’t really know what it means).
  • We need to take care to visually distinguish “object language” names like concatMap and range from meta-language names like totalise. For the latter, use the same formatting macro as we use for the val function that occurs in Fig. 15, which is also a meta-level function.
  • Figs. 26 & 27: If we want to use metavariables to give the signatures of these operations, I would use an equational notation, e.g. totalise ρ κ = σ. (That’s not a particularly common way of giving a function signature, but it’s better than treating metavariables as types. Note that in this case the first p is in the wrong font.) Alternatively, we could give these operations actual types, using indexed families (the set theoretic notation for dependent types). We can talk about how to do this; it’s not necessary, but it is a way of making definitions precise, and can be useful when operations do complex things like transform variable contexts in certain ways.
  • Fig. 26: a Cons eliminator has syntax Cons σ, not Cons ↦ σ. A couple of the equations have an extra .
  • Fig. 26: also, the Cons rules that defer to totalise don’t need to wrap the totalise call in parentheses – in the metalanguage we’re assuming functions have the notation f(x), in contrast to f x in the object language, which does sometimes need to be parenthesised.

rolyp (Collaborator, Author) commented Sep 19, 2020

@min-nguyen New to do list (migrating 2 remaining tasks from last week):

  • Update totalise and untotalise to new list notation.
  • Fig. 27: this definition looks broken in a few places: the third (false) equation looks incorrect; metavariables look wrong in equations 6 and 8 (cons cases). Inner parentheses in the fourth (pair) equation are also unnecessary. And I would probably use τ rather than σ on the LHS in the cases where σ is not already being introduced.
  • Fix list notation errors in various figures. Ensure macros are used consistently for all new notation.
  • Pattern typing rules should use \patt- macros.
  • List notation start and end symbols [ and ] should use same font as exNil macro. You should probably wrap the brackets in squiggly brackets { }; this will stop LaTeX treating the [ as a bracket symbol and messing up the layout.
  • Ensure variable fonts are used consistently: taking types as an example, A (math font) is a metavariable which ranges over (object-language) types; List (code font) is an object language type.
  • Forward slicing rules should use forward-slicing arrow.
  • Add the forward and backward annotation-propagation behaviour to the desugaring rules. Start with the list syntax.
  • Form of the judgement (metavariable p) is incorrect in Fig. 21.
  • Figs. 28 and 29 should use the \elim- macros and (for list eliminators) the \branchNil and \branchCons macros.
  • Use \vec rather than \overrightarrow for vectors.

rolyp (Collaborator, Author) commented Oct 2, 2020

@min-nguyen Adding some todo’s for today’s chat:

  • add annotations to the “list rest” and qualifier syntax
  • explicit if constructor for guards (check the macros do their work here!)
  • add the [] case to the desugaring rules (we should probably add rules for all surface language forms at some point, but it’s not urgent now)
  • remove spurious α’s in a few places in the outputs
  • identify all the places where there are annotations on the LHS or RHS of every judgement (e.g. nested function applications, qualifiers inside list comprehensions, etc.) and ensure the rules specify how all of these are consumed/produced as necessary – annotations should be conjoined going forward and disjoined going backwards
  • by the same token, remember that λσ is a raw term (and thus has its own annotation)
  • for the meta-level sequence notation \vec{q}, I suggest using the “·” (\cdot) symbol for cons and concat, so for example q · \vec{q} would pattern-match a sequence into a head and a tail (you can use the macro \concat for this).

It would be great if you could deal with these and make a start on #51 by next Friday. We can also start talking about/defining the lattices induced by the α᾽s, which we’ll need for the Galois connection.

rolyp (Collaborator, Author) commented Oct 30, 2020

@min-nguyen Pic from today:
[Attached image: sketch from today’s discussion]
(It’s unfortunate that Haskell uses left-arrow, right-arrow and equals to indicate variable binding in various contexts.)
To do’s we discussed:

  • fix spacing after , in surface syntax
  • drop Fig. 22, and fix second case of Fig. 23 as discussed a couple of weeks ago
  • check metavariables in list-comp-decl fwd rule
  • turn elim into the definition of clause desugaring (write using desugaring arrow)
  • elim is incorrect as defined: it claims to take a clause, but we call it with arbitrary continuations. It also needs to produce an eliminator containing desugared expressions. To address these points, perhaps elim needs to take not a clause as input, but a non-empty list of patterns and an already desugared expression – see implementation
  • merge should be desugaring for non-empty list of clauses (write using desugaring arrow)
  • unmerge can become a resugaring: write with backwards desugaring arrow, and give as an inductively defined relation rather than express via zipWith
  • use clause desugaring in list-comp-decl rule
  • use clause desugaring in list-comp-gen rule (see figure above)
  • totalise should be defined for eliminators, not (pattern, expr) pairs (totaliseRest no longer needed?)
  • untotalise should also be defined for eliminators (untotaliseRest no longer needed)
  • write out explicitly recursive definition of merge (as inductively defined relation)
  • conts (:) cases look incorrect – revisit once unmerge is rewritten in explicitly recursive, relational style

You should be able to have a go at finishing explorable-viz/fluid#115 as well (including migrating any examples of match..as where the scrutinee is a Boolean to if..then..else).

rolyp (Collaborator, Author) commented Nov 5, 2020

@min-nguyen Here you go. Page 12 is a little bit blurry but there’s very little on that page anyway.

[Seven attached images (IMG_0086–IMG_0092): scanned pages of handwritten notes]

rolyp (Collaborator, Author) commented Nov 30, 2020

@min-nguyen I think we can consider this task done. Well done, more of a slog than expected, but worth it!
