Skip to content
This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Migrate the Xtext formatter to the formatter2 API #1489

Closed
miklossy opened this issue May 11, 2020 · 24 comments
Closed

Migrate the Xtext formatter to the formatter2 API #1489

miklossy opened this issue May 11, 2020 · 24 comments
Assignees
Milestone

Comments

@miklossy
Copy link
Contributor

The XtextFormatter and the corresponding test cases should be migrated to the formatter2 API infrastructure.

@cdietrich
Copy link
Member

cdietrich commented May 11, 2020

we should have a java api for the new formatter first
eclipse/xtext#1697

@miklossy
Copy link
Contributor Author

See also https://bugs.eclipse.org/bugs/show_bug.cgi?id=405184

Xtext still uses the old formatter, this should be changed also.

@ewillink
Copy link

Not "should" be changed, "must" be changed. The new infrastructure does not support the old Formatter consequently ALL primary Xtext related tools that use a formatter MUST be migrated well before the old infrastructure can be deleted.

@miklossy
Copy link
Contributor Author

miklossy commented Jun 12, 2020

The new infrastructure does not support the old Formatter

@ewillink It is possible to use the old formatter with the new workflow, currently, the Eclipse GEF project also uses that combination, see the GenerateDot.mwe2 and DotFormatter files.

@ewillink
Copy link

Thanks, although GenerateDot.mwe2 is commented as supporting the old Formatter, it doesn't but it provides the critical FragmentAdapter clue (using deprecated classes).

fragment = org.eclipse.xtext.generator.adapter.FragmentAdapter {
	fragment = org.eclipse.xtext.generator.formatting.FormatterFragment {}
}

@ArneDeutsch ArneDeutsch added this to the Release_2.24 milestone Sep 25, 2020
@ArneDeutsch ArneDeutsch self-assigned this Sep 25, 2020
@ArneDeutsch
Copy link
Contributor

I will do this as a first example for the java API for the formatter and to spot additional improvements to the API.

@ewillink
Copy link

Please be aware that there may soon be a new Serializer/Formatter option. See https://bugs.eclipse.org/bugs/show_bug.cgi?id=564265

I have this working for 5 OCL editors and am just polishing the Idioms DSL, prior to writing Xtext.idioms and Terminals.idioms to demonstrate its utility for Xtext itself.

You can look at https://git.eclipse.org/r/plugins/gitiles/ocl/org.eclipse.ocl/+/refs/heads/ewillink/563046/plugins/org.eclipse.ocl.xtext.base/src/org/eclipse/ocl/xtext/base/Base.idioms to get a hopefully self-explanatory example. The implementation of serializer+formatter requires just some shortish 100% declarative *.idioms files and a single '10-line' Java class to implement the bespoke getComment(semanticObject) method. It seems to be 2-3 times faster when measured around XtextResource.save, and perhaps 5-50 times faster for serialization+formatting although I'm not sure that the internal measurement is a fair comparison. The auto-generated XXXSerializationMetaData is a bit bigger than the old infrastructure Semantic+SyntacticSequencer+Formatter, but distinctly smaller than Semantic+SyntacticSequencer+Formatter+GrammarAccess so if the GrammarAccess is eliminated it's an all round win.

If you want to study the code now, look at the examples/org.eclipse.ocl.examples.xtext.{build , idioms , idioms.ui , serializer} plugins in the ewillink/563046 branch. Currently 99% OCL-free; just a few utility/build subroutines to clone.

ArneDeutsch added a commit that referenced this issue Sep 25, 2020
ArneDeutsch added a commit to eclipse/xtext-eclipse that referenced this issue Sep 25, 2020
ArneDeutsch added a commit that referenced this issue Oct 2, 2020
@szarnekow
Copy link
Contributor

@ewillink I see that you are still working on the serializer in the ocl repo. Could you please let me know when you find this to be ready for prime-time? I'd like to give it a shot with the Xtext examples.

@ewillink
Copy link

ewillink commented Oct 4, 2020

Xtext.txt

@szarnekow (when is anything ever ready for prime time?) It's certainly ready for you to evaluate to see what you do/don't like so that I can react to your comments. The four examples/org.eclipse.ocl.examples.xtext.{build, idioms, idioms.ui, serializer} plugins on branch ewillink/563046 of the OCL GIT should be independent of all other OCL plugins and so 'only' awaiting a repackaging if you choose to adopt them.

Currently I am sorting out re-formatting as opposed to total serialization save formatting which is fully working, except for the pathological cases that probably do not exist in practice.

I made very heavy weather of line wrapping idioms; one test works, more needed.

The attached Xtext.idioms (please rename) transliterates the Xtext formatter. Very similar but about 30% smaller and no code. I need to do some work to support both // and // comments. Currently just //

The *.idioms file sits adjacent to the *.xtext file; you may want to streamline the access/indexing/stubs.

/org.eclipse.ocl.examples.xtext.build/src/org/eclipse/ocl/examples/xtext/build/mwe2/GenerateIdiomsEditor.mwe2 uses the new Serializer/Formatter fragments.

Error handling needs a revisit.

@ewillink
Copy link

ewillink commented Oct 5, 2020

I am sorting out re-formatting as opposed to total serialization save formatting

Unfortunately because of https://bugs.eclipse.org/bugs/show_bug.cgi?id=405184 I'm not 100% sure what I'm trying to achieve.

My first attempt was a total serialization, oops mustn't change unselected text, so blend in a local total serialization, but oops that imposes construct order normalization, and normalizes typical Xtext String single/double quote optionality. Changed to a resynthesize selected whitespace, which seems to be what JDT does, although I cannot distinguish the intent of "Format" and "Format Element".

Xext is however inconsistent. Given

ContextDeclCS returns ContextDeclCS:
	PropertyContextDeclCS
	| ClassifierContextDeclCS
	| OperationContextDeclCS;

selecting the whole clause and reformatting makes no change, but selecting part of line 3 and part of line 4 then reformatting eliminates the line 3/4 line break. Perverse, a regularly formatted alternatives is reformatted to be irregular.

What is the expected behaviour? My current goal is to replace all non-comment whitespace within the selected text by the corresponding whitespace of a total serialization. The existence/ordering of all non-whitespace characters is unchanged. i.e. in Java the optional braces for a single statement if-body remain present/absent according to the prevailing user practice. Comment bodies are unchanged, but comment framing i.e. the /* ... * ... */ of multiline comments is imposed.

@ewillink
Copy link

ewillink commented Oct 9, 2020

Does a normal Xtext DSL user ever use the serializer? Surely not needed to load/parse/edit DSL source files? Any use during post-edit saving seems like a potentially gratuitous corruption - just a formatting is all that could be justified. If not then my efforts to rescue comments from the Node model are completely pointless.

@cdietrich
Copy link
Member

yes. e.g. using quickfixes

ArneDeutsch added a commit that referenced this issue Oct 9, 2020
ArneDeutsch added a commit that referenced this issue Oct 23, 2020
ArneDeutsch added a commit that referenced this issue Oct 23, 2020
ArneDeutsch added a commit that referenced this issue Oct 23, 2020
ArneDeutsch added a commit that referenced this issue Oct 23, 2020
ArneDeutsch added a commit that referenced this issue Oct 23, 2020
ArneDeutsch added a commit that referenced this issue Oct 23, 2020
ArneDeutsch added a commit that referenced this issue Oct 23, 2020
cdietrich pushed a commit to eclipse/xtext-eclipse that referenced this issue Oct 24, 2020
cdietrich added a commit to eclipse/xtext-eclipse that referenced this issue Oct 24, 2020
…. Adapted test expectations

Signed-off-by: Christian Dietrich <[email protected]>
cdietrich added a commit to eclipse/xtext-eclipse that referenced this issue Oct 24, 2020
…. Adapted test expectations

Signed-off-by: Christian Dietrich <[email protected]>
ArneDeutsch added a commit that referenced this issue Oct 30, 2020
ArneDeutsch added a commit to eclipse/xtext-eclipse that referenced this issue Oct 30, 2020
ArneDeutsch added a commit that referenced this issue Oct 30, 2020
ArneDeutsch added a commit that referenced this issue Nov 6, 2020
ArneDeutsch added a commit to eclipse/xtext-eclipse that referenced this issue Nov 6, 2020
ArneDeutsch added a commit that referenced this issue Nov 13, 2020
ArneDeutsch added a commit to eclipse/xtext-eclipse that referenced this issue Nov 13, 2020
ArneDeutsch added a commit that referenced this issue Nov 16, 2020
[#1489] Java based Xtext formatter (formatter2 API).

Remove API restrictions.
Testing java based formatter.
Formatter2 Java Fragment.
ArneDeutsch added a commit to eclipse/xtext-eclipse that referenced this issue Nov 16, 2020
@cdietrich
Copy link
Member

TODO

  • investigate open review comments.

@cdietrich
Copy link
Member

cdietrich commented Nov 17, 2020

@ArneDeutsch
more things to investigate:
eclipse/xtext-xtend#1140 (comment)

if something like
({XtendClass . annotationInfo = current}
is formatted, then the spaces wont be removed

e.g. sample grammar

MultiTypeReference returns JvmTypeReference:
	JvmTypeReference ({JvmSynonymTypeReference  .  references  +=current} ('|' references+=JvmTypeReference)+)?;

JvmTypeReference:
	name=ID; 

@cdietrich
Copy link
Member

comment from @szarnekow regarding grammar access generation

Unrelated: I wonder if we should use the verbatim rule body instead.
Why: The statemachine example grammar looks like being carefully formatted by hand. I suppose this is true for most languages - the grammar is usually written such that the developer perceives it as well formatted. Why do we bother using the formatter here?

ArneDeutsch added a commit that referenced this issue Nov 20, 2020
Added unit tests as well.

Signed-off-by: Arne Deutsch <[email protected]>
cdietrich added a commit that referenced this issue Nov 23, 2020
@cdietrich
Copy link
Member

Lets consider this fixed and handle follow ups via

#1626
#1625
#1618

@cdietrich
Copy link
Member

thx @ArneDeutsch for making this possible

@ewillink
Copy link

Does this render the Idioms DSL driven Declarative Formatter/Serializer unwanted? If so, I can save a couple of gratuitous modularization plugins and some class clones in the 2020-12 OCL release.

@cdietrich
Copy link
Member

no it does not. IFormatter2 is still full of bugs and not maintained as there is nobody who understandsit.
but as long as your approach does not cover 100% (or 95%)of the usecases.
we still need a possibility to write a IFormatter2 in java.

@szarnekow
Copy link
Contributor

As a side note: Before you put more effort into modularisation plugins or class clones, I suggest that you use the time to stabilize your efforts such that we can asses the approach. If it turns out to be the way forward from our perspective, too, we can make a plan how to move this into Xtext. I don't it's necessary right now to put more work into infrastructure topics withouth that review having happened.

@ewillink
Copy link

The modularization effort is all done; 4 org.eclipse.ocl,examples.xtext.* plugins with no other OCL dependencies. In SimRel M3,

Since Xtext will obviously not backport any of the plugins to Xtext 2.13 and Oxygen, OCL will have to ship at least the org.eclipse.ocl,examples.xtext.serializer plugin for many years, albeit with a class-test switch to the Xtext variant when available. I've given the Idioms editor a klunky name to ensure that the eventual Xtext variant is visually more attractive when both are installed.

I hit a problem with heterogeneous assignments e.g xOrY+=X* xOrY+=Y* where my static analysis foolishly ignored the 99.9% guaranteed ordering of xOrY to analyze the loop counts Set-wise in advance. Much easier to just peel them off in order without a static analysis. Therefore difficult rules now fall back to a runtime analysis, so there may be fewer pathological breakers.

My shortish term goal is to autogenerate XtextSerializationMetadata.java and then manually weave it into the Xtext plugin to demonstrate that it can work. Since the strange irreversible formulation of the parenthesis recursion in the Xtext grammar broke my code, I might give the MWE2 grammar a try too.

That and a manual code review are needed before I can reasonably claim job done developer-wise.

@ewillink
Copy link

https://bugs.eclipse.org/bugs/show_bug.cgi?id=564265#c22 identifies that the support for not-well-behaved grammars such as Xtext is taking more time than I can justify.

@szarnekow
Copy link
Contributor

Thank you for the effort @ewillink. It is greatly appreciated and unfortunate that the exercise didn't work out as we had hoped for arbitrary languages. Nevertheless it's great to see that the solution is applicable for the environment it was created for and that your languages formatting and serialization can benefit from it.

@ewillink
Copy link

It's not a problem with arbitrary languages; If I was currently developing Xtext, I would put Xtext.xext through an LALR checker using the Xtext2LPG conversion that I use for my own grammars. This would force a bit of clear thinking as to what the core productions are. Now that the Xtext grammar is 'legacy' the probably minor reorganisation to make the grammar well-behaved would surely encounter too many legacy compatibility concerns. The various validation checks on e.g. spurious duplicate cardinality assignment would probably vanish once the nesting was rational; I was impressed that every stupid thing I discovered could be legally done with the Xtext grammar seemed to be diagnosed by the validation.

There are also two different worlds. For OCL/QVTd where the AS model is the most important and the textual syntax a convenience; the two share a bijective relationship. For typical Xtext usage, the text dominates with the AS being a transient auto-derived internal artefact. There is no need for a bidirectional parsing/serialization symmetry, as evidenced by the lack of a lossless AS model comment representation and the magical stripping of gratuitous parentheses without corresponding Groups to support their reconstruction during serialization.

I've not studied Xcore to see how it handles the need to support a traditional Ecore AS model while offering a textual counterpart.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants