
Parser: work on NAppDef & OperatorInfo related code #1047

Conversation

Anton-Latukha (Collaborator)

No description provided.

Anton-Latukha commented Jan 23, 2022

Thoughts during this PR:

  • It seems a project can get a structural bonus from having a grammar/syntax module (or at least a section in the Parser), because the grammar is now better formulated & more obvious.

  • So far I have not found a particularly elegant way to remove OperatorInfo without big changes in the processing code & Pretty.

OperatorInfo has several ugly sides: function application (which we can say is always left-associative) and unary operators (which have no associativity property at all) still need to fake the associativity field. For binary operators, NAssocNone is still an associativity 🤣 (bidirectional). This mistake goes back at least to parser-combinators, which calls an associative infix operator "non-associative", while a binary operator genuinely lacking associativity is almost never encountered, since without associativity the algebra does not compose in any general sense of use.
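The unary/binary distinction above can be encoded so that the associativity field simply does not exist for the cases that lack it. A minimal sketch under GADT syntax; all names here (OpDef, AppDef, etc.) are invented for illustration and are not the real hnix types:

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical sketch: only binary operators carry an associativity
-- field, so function application and unary operators cannot fake one.
data Assoc = AssocLeft | AssocRight
  deriving (Eq, Show)

data OpDef where
  -- Function application: always left-associative, so no field at all.
  AppDef    :: Int -> OpDef
  -- Unary operators have no associativity property.
  UnaryDef  :: String -> Int -> OpDef
  -- Only binary operators genuinely carry associativity.
  BinaryDef :: String -> Assoc -> Int -> OpDef

precedence :: OpDef -> Int
precedence (AppDef p)        = p
precedence (UnaryDef _ p)    = p
precedence (BinaryDef _ _ p) = p

-- Associativity becomes a partial property instead of a faked field:
associativity :: OpDef -> Maybe Assoc
associativity (BinaryDef _ a _) = Just a
associativity _                 = Nothing
```

With this shape there is no NAssocNone at all: the absence of associativity is expressed by `Nothing`, not by a fake third constructor.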

I would like to have & use a better mapping of NAssoc to parser-combinators, and for Infix* to not be present at all, implied instead through a free abstraction in the source code.

Also, OperatorInfo is used in Pretty to change the operator information on the fly & pretend the operator associativity changes; I would like the pretty-printing of function application and of unary and binary operations to be honest and straightforward.

I would like there to be only one datatype & data structure for the language operator grammar (NOperatorDef), which would allow fixing the things above.

(I also do not know the current state of GADT performance in GHC.)

I think the use of patterns (or functions) on NOperatorDef might allow doing this.

Anton-Latukha commented Jan 23, 2022

The proper solution is probably to have one value that represents the language grammar.

And then a set of functions that return the properties of an NOperatorDef entry.

And then, on top of that, build N{Unary,Binary,Special}Op -> <property> retrieval functions.
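The shape proposed above can be sketched as one grammar value plus lookup functions over it. All names (OpEntry, grammar, precedenceOf) and the precedence numbers are illustrative, not the real hnix API or the real Nix operator table:

```haskell
-- Hypothetical sketch: one value represents the operator grammar,
-- and property-retrieval functions are built on top of it.
data OpAssoc = NAssocLeft | NAssocRight
  deriving (Eq, Show)

data OpEntry = OpEntry
  { opName  :: String
  , opPrec  :: Int
  , opAssoc :: Maybe OpAssoc  -- Nothing for unary operators
  } deriving (Eq, Show)

-- The one value that represents the operator grammar:
grammar :: [OpEntry]
grammar =
  [ OpEntry "!"  8 Nothing
  , OpEntry "++" 5 (Just NAssocRight)
  , OpEntry "+"  6 (Just NAssocLeft)
  ]

-- Property-retrieval functions derived from that single value:
lookupOp :: String -> Maybe OpEntry
lookupOp name =
  case filter ((name ==) . opName) grammar of
    (e : _) -> Just e
    []      -> Nothing

precedenceOf :: String -> Maybe Int
precedenceOf = fmap opPrec . lookupOp

associativityOf :: String -> Maybe OpAssoc
associativityOf op = lookupOp op >>= opAssoc
```

Because every consumer (Parser, Pretty) would read from the same `grammar` value, the information could not drift apart the way a detached OperatorInfo can.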

Anton-Latukha force-pushed the 2022-01-22-merge-NOperatorDef-n-OperatorInfo branch 6 times, most recently from d842a6c to d7d507d on January 26, 2022 18:00
Anton-Latukha force-pushed the 2022-01-22-merge-NOperatorDef-n-OperatorInfo branch from 1139aa8 to 0edf6d7 on January 26, 2022 20:54
Anton-Latukha commented Jan 27, 2022

Well.

The work is mostly done.

The main side-effect of working on the OperatorInfo removal was that the hacks & dirtiness around it in the {Parser, Pretty} code were cleaned up along the way & the results refactored.

The funny thing is that I basically arrived at the OperatorInfo design again, but with more straightforward code. It seems OperatorInfo may have existed to keep the Pretty module & its information sharing decoupled from Parser.NOperatorDef, so information could be transmitted without relying on the NOperatorDef type structure. The NOperatorDef structure was also underdeveloped at the time (it lacked precedence information). But that design was not that good overall either: OperatorInfo is not attached to the object it represents, so it always had to be matched externally, & because that was not done, the hacks in Pretty appeared, which puzzled the reader (me). And while being called OperatorInfo, it was used to represent not only operators but all kinds of things.

The next question: Nix.Pretty is in any case specialized code created to pretty-print this particular language, so it depends on the language, & therefore on the language's NOperatorDef. Even thinking further about a multipackage split, the Pretty module would probably be difficult to reuse under big third-party changes to the language, since those would change the pretty-printing as well.

@soulomoon (Collaborator)

Why stop using unless and for_? It seems pretty reasonable to use them in the original code.

Anton-Latukha commented Jan 28, 2022

So that people learn to read & work with the default Haskell functions traverse{,_} & sequenceA{,_} better.

Traversable is defined by traverse & sequenceA. All the others are essentially aliases of them: map{,_,M}, for{,_,M}, sequence{,_,M} (that is 9 functions) are older names & are now indeed aliases. And the old names create a combinatorial explosion.

That explosion creates a combinatorial explosion of knowledge requirements, of mappings between names, & of cognitive effort to process them.

Even writing rewrite rules for them requires either growing the combinatorial explosion exponentially through the composition lattice, or unifying the rewriting to the default names & keeping a standard concise rewriting lattice.

HLint has no built-in notion that aliases are equal, so in the long term HLint helps only when the default lattice rules match.

For example, return is now entering deprecation in GHC, and I have an HLint PR that unifies return -> pure; that change alone made the lattice behave better (it started to discover new patterns).

And rewriting is mostly what I did up to this point here.

I myself forget that the default name is sequenceA and not sequence. It is helpful to have HLint give suggestions.

Then one has to learn all the old & new intuitions & lattices. For new functional programmers & Haskellers it would be easier to learn the default Traversable traverse & sequenceA; they would have no intuition of an imperative for loop, which reverses control flow in Haskell (function application there gets reversed: for x f instead of traverse f x).

The rationalizations above apply to map{,_M}, for{,_M}, sequence{,_M} as well.
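The alias relationship can be seen directly in code. A small demonstration (safeDouble is an invented example function) showing that `for` is just `traverse` with the arguments flipped, and that sequencing a mapped structure is the same operation:

```haskell
import Data.Traversable (for)

-- An effectful function to thread through a structure:
safeDouble :: Int -> Maybe Int
safeDouble x
  | x >= 0    = Just (2 * x)
  | otherwise = Nothing

-- The default name: function first, structure second.
viaTraverse :: Maybe [Int]
viaTraverse = traverse safeDouble [1, 2, 3]

-- The alias: same computation, arguments flipped (for = flip traverse).
viaFor :: Maybe [Int]
viaFor = for [1, 2, 3] safeDouble

-- sequenceA over a pre-mapped structure is again the same result.
viaSequenceA :: Maybe [Int]
viaSequenceA = sequenceA (fmap safeDouble [1, 2, 3])
```

All three produce the same value, which is exactly the point: one core operation, several names.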

And I do such renames to form and stabilize a style in the project.

As a result, style does not need to be managed in the project at all. It allows me to accept, and be happy to take in, work done in any style; I do not require other people to adopt the style in reviews, since for me the improvements are what matter. I can relax the review constraints & concentrate on improvements, because the project code itself passively and politely nudges contributors toward a particular style if they can be nudged. And if they decide to contribute in their own style, I know they decided so, and for me that is enough to justify them contributing in their own style. I obtain a bus factor: the code's form itself would teach people this style, show that it is possible & works great, and without me it would provide style maintenance passively and continuously, making people consider this design style for this project (and their own styles for other projects) for a number of years.

But it may also be true that I went a bit far, essentially redefining the Prelude for the project.

But most of these changes are probably sane to expect from the next default Prelude, or to contribute as suggestions to it. I also promised to move to the new default Prelude when it is released.


On the unless rename.

Because I constantly need to remember how to read unless, and I have the perception that a lot of other people do too.

I have also met & read other people noting that they have difficulty reading unless, while when (not exists) needs no explanation or mental internalization of the name's semantic implication.

It is also why I aliased not . null as isPresent: programmers are interested in the inhabitants of a structure, and that a structure can be empty is an unfortunate consequence of structures.

The additional question is why unless/when are bool (pure ()) <expr> <condition>, when they could just as well be based on pure mempty. () is always a Monoid, pure behavior is always transparent and by definition does not depend on the value in a (because of the Applicative properties), so there is no reason not to use pure mempty for them.
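The point above can be sketched concretely. when', whenMempty, and isPresent are hypothetical names for this illustration, not base functions:

```haskell
import Data.Bool (bool)

-- when expressed through the bool eliminator, as in the base shape:
when' :: Applicative f => Bool -> f () -> f ()
when' c act = bool (pure ()) act c

-- Since () is a Monoid and pure is value-transparent, the same shape
-- generalizes to any monoidal result without changing behavior:
whenMempty :: (Applicative f, Monoid a) => Bool -> f a -> f a
whenMempty c act = bool (pure mempty) act c

-- The not . null alias discussed above:
isPresent :: Foldable t => t a -> Bool
isPresent = not . null
```

`when'` is then just `whenMempty` specialized to `f ()`, which is the sense in which `pure mempty` subsumes `pure ()`.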


All the style changes I made so far seem to align the code representation with the internal partial-application sharing process (first the default case, then the general case), which seems to align with the use of eliminators & control flow in Haskell.

Of course, control flow in do blocks works differently, & inside them for & map may land & read better. But I am aware I do not use do blocks enough; currently I have mostly internalized the default lambda & functional design patterns in Haskell and am still developing the style. do blocks are sometimes an effective way to create automatically optimized code, but I often prefer to compose the results afterward, apply the optimizations myself, gain a new semantic understanding and a simplified representation, & denote the functions in the composition.


I am happy to have a dialog on any point, or to receive additional reasoning points into the dialog.

Anton-Latukha commented Jan 28, 2022

I also currently have thoughts about when{,True,False,Nothing,Just,Left,Right,Pure,Free, ... etc}. Besides, I can be attacked for unflipping the arguments into the more lambda-style order; that is indeed a schism, as people would expect the order as in base. I am more and more aware that such one-sided helper functions imply too much & their names tell too little, and overall seem not to fit Haskell. From the name when it is not distinct what the default case implies: pure (), Monoid a => pure mempty, mempty, or id is not obvious, and the cases {mempty, pure mempty, id} all indeed get used. So at the least, distinct naming prefixes should be defined for those 3 cases. But that is right at the Fairbairn threshold, & we start to ask whether the when{,True,False,Nothing,Just,Left,Right,Pure,Free, ... etc} matrix even needs to exist, or whether to just use binary eliminators with {mempty, stub, id}, so that when becomes bool stub ... etc. That is just my consideration and is not related to the current PR.
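A sketch of the "binary eliminators + {mempty, stub, id}" alternative to a when-matrix. `stub` here is the pure-mempty default mentioned above; renderMaybe and listWhen are invented demo names:

```haskell
import Data.Bool (bool)

-- The explicit default used in place of a special-cased when variant:
stub :: (Applicative f, Monoid a) => f a
stub = pure mempty

-- A hypothetical "whenJust f m" collapses to:  maybe stub f m
-- (here the pure mempty default is just mempty, since we are in [])
renderMaybe :: Maybe Int -> [String]
renderMaybe = maybe mempty (\x -> [show x])

-- A hypothetical "whenTrue act c" collapses to:  bool stub act c
listWhen :: Bool -> Maybe [Int]
listWhen = bool stub (Just [1, 2])
```

The eliminator form makes the default case explicit at the call site, so no naming prefix has to encode whether the fallback is mempty, pure mempty, or id.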

M  src/Nix/Exec.hs
M  src/Nix/Standard.hs
M  src/Nix/Value.hs
Strangely, profiling showed that a pretty big part of the time was spent in this bool.
@Anton-Latukha Anton-Latukha marked this pull request as ready for review January 31, 2022 10:09
@Anton-Latukha Anton-Latukha merged commit 2a3cea0 into haskell-nix:master Jan 31, 2022
@Anton-Latukha Anton-Latukha deleted the 2022-01-22-merge-NOperatorDef-n-OperatorInfo branch January 31, 2022 10:20
soulomoon commented Feb 1, 2022

Thank you @Anton-Latukha for the detailed explanation. Haskell does suffer from the fact that a lot of function names point to the same or similar functionality. I do not have a strong opinion on these changes myself.
My concern is purely about the look of the code.

  • The replacement of for_ here alters the parameter positions and puts the much longer argument in front.
  • Replacing unless x y with when (not x) y introduces a little more structural complexity.

But overall, I suppose it is indeed a good idea to stick to a style.

Anton-Latukha commented Feb 1, 2022

Yes. But that the source character size of the arguments differs is not an argument for suddenly introducing a second function that does the same thing.

traverse does put the larger argument in the middle. A lot of functions with the canonical Haskell argument order, & the eliminators, put the larger argument in the middle. But it is also logical why that argument is larger: it is a function, and a function would almost always be larger than the argument it is applied to.

For me this Haskell design requires either being OK with reading the large section, or, if not OK with having that section, formalizing its semantics with a denotational abstraction (a fancy way of saying: it should be a nicely named local function).

For me, the fact that sectioning a function in the middle makes the code feel awkward is a feature: it indicates the section is above the Fairbairn threshold, & so the design nudges toward naming that code part properly, and as a result the code becomes more readable & understandable through composition.

The for pattern, by contrast, encourages hanging huge lambdas onto the tail without thinking about how to name/abstract them, because why abstract the tail (while in reality it is a function in the middle of the control-flow conveyor).

In both cases (traverse, for) the solution is to find a proper denotation for the function, but the g f a form directs toward doing it.

Indeed, sometimes I section things and like: "ugh, how to name this" - that is the intended property of the design.
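The nudge described above can be shown in a tiny example. renderAll and renderPositive are invented names; the point is the shape, naming the middle argument of traverse instead of hanging a lambda off the tail of for:

```haskell
-- Once the function argument in the middle of `traverse f xs` grows
-- past the Fairbairn threshold, the design nudges toward naming it:
renderAll :: [Int] -> Maybe [String]
renderAll = traverse renderPositive
  where
    -- the once-anonymous middle section, now denoted:
    renderPositive :: Int -> Maybe String
    renderPositive x
      | x > 0     = Just (show x)
      | otherwise = Nothing
```

The equivalent `for xs $ \x -> ...` form would have let the same logic live as an unnamed trailing lambda.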


At the same time, the traverse / g f a form makes the function composable, as composition of a function (generally) happens over the main path of argument passing.

With the ability to be composed, and being the default name, HLint would pick up rewriting simplifications for it when they occur.


While initially learning Haskell I found the map{,_,M}, for{,_,M}, traverse{,_,M}, sequence{,_,M} superset deeply puzzling. I additionally found very puzzling the functions & code that reverse & jump around in the control flow/function application/composition; it took time to be able to read them properly.

When I arrived at HNix, there were a number of z >>= d <&> a ?? h chains, as if someone had tried hard to reverse the control flow in Haskell. They also spanned multiple lines & were hard to read; even how to move one's eyes through them was hard to determine. This is an example of why control flow is better kept with <-: it is easier for newcomers to read & understand.
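A hypothetical illustration of the contrast (lookupName, chainStyle, doStyle are invented; this is not code from HNix): the same small pipeline written as an operator chain and as a do block with <-.

```haskell
import Data.Functor ((<&>))

-- A stand-in effectful lookup:
lookupName :: Int -> Maybe String
lookupName 1 = Just "alice"
lookupName _ = Nothing

-- Operator-chain style: the reader must parse >>=/<&> direction.
chainStyle :: Int -> Maybe String
chainStyle i = lookupName i <&> reverse

-- do-block style: control flow reads top-down via <-.
doStyle :: Int -> Maybe String
doStyle i = do
  name <- lookupName i
  pure (reverse name)
```

Both definitions are equivalent; the difference is purely in how the reader's eyes travel through the control flow.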
