PDF Syntax Highlighting #17

Closed
10 tasks done
tajmone opened this issue Sep 7, 2018 · 12 comments
Assignees
Labels
💡 enhancement A new feature or enhancement request/proposal 🕑 WIP Work in progress (possibly with tasks list). 👑 PDF Format Issues with conversion to PDF format 💀 format porting issues Cross-format problems (ADoc, HTML, PDF, etc.) ⭐ syntax highlighting Topic: Syntax Highlighting

Comments

@tajmone
Collaborator

tajmone commented Sep 7, 2018

Task list and progress status of customization of syntax highlighting in PDF documents.

  • Create custom language definitions for XSLTHL:
  • Customize XSL stylesheets for sourcecode blocks, so that each language has its own color theme and all syntax elements are covered and styled:
    • Alan code examples
      • Add conditional checks in XSL/XSLTHL styles to customize look and feel of Alan source code.
      • Define styling for all syntax elements of the Alan language.
      • Choose a color scheme for Alan examples (chosen GitHub's old scheme)
    • BNF rules
      • Add conditional checks in XSL/XSLTHL styles to customize look and feel of BNF rules. (moved to Add XSLTHL EBNF Syntax Definition #38).
      • Define styling for all syntax elements of BNF (terminal, non terminal).
      • Choose a color scheme for BNF rules.

NOTE: asciidoctor-fopub uses Apache™ FOP to convert from DocBook to PDF, and XSLTHL for syntax highlighting source code (for more info, see the XSLTHL Wiki).

@tajmone tajmone added 💡 enhancement A new feature or enhancement request/proposal 💀 format porting issues Cross-format problems (ADoc, HTML, PDF, etc.) 🕑 WIP Work in progress (possibly with tasks list). labels Sep 7, 2018
@tajmone tajmone added this to the PDF Conversion Toolchain milestone Sep 7, 2018
@tajmone tajmone added the 👑 PDF Format Issues with conversion to PDF format label Sep 7, 2018
@tajmone
Collaborator Author

tajmone commented Sep 7, 2018

Ciao @thoni56,

I've finally managed to work my way through the mysteries of XSL stylesheets, and succeeded in creating different styles for Alan code examples, BNF rules and other verbatim blocks. I've also managed to add a colored background and a border with radius (it wasn't enabled in the original templates).

Alan Examples Theme

For testing purposes, I've used the Monokai color scheme — both because I like it and because I had a reusable ready-made template for it.

I've actually implemented a full color scheme in the XSL stylesheet, based on Base16 variables, so changing the colors is now very easy (otherwise, the attributes that control the styling of source code are scattered all over the place, but now they are centralized in a block of variables):

<!-- =============================================== -->
<!-- Monokai Base16 Color Scheme, by Wimer Hazenberg -->
<!-- =============================================== -->
  <xsl:param name="Monokai.base00">#272822</xsl:param><!-- Rangoon Green ( almost black ) -->
  <xsl:param name="Monokai.base01">#383830</xsl:param><!-- Armadillo ( almost black ) -->

<!-- [...] -->

<!-- ================================== -->
<!-- Syntax Highlighting Theme for Alan -->
<!-- ================================== -->
  <xsl:param name="AlanHL.background" select="$Monokai.base00"></xsl:param>
  <xsl:param name="AlanHL.normal"     select="$Monokai.base05"></xsl:param>
  <xsl:param name="AlanHL.quotedId"   select="$AlanHL.normal"></xsl:param><!-- TEST WITH base12 -->
  <xsl:param name="AlanHL.keyword"    select="$Monokai.base08"></xsl:param>
  <xsl:param name="AlanHL.comment"    select="$Monokai.base04"></xsl:param>

<!-- [...] -->
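
To illustrate how centralized theme parameters like these could be consumed, here is a minimal sketch of a highlighting template for an XSL-FO customization layer. It is an illustration under assumptions, not the stylesheet used in this project: the `xslthl` namespace URI and the `mode="xslthl"` convention follow the stock DocBook XSL highlighting templates, the `base05` and `base08` values are taken from the published Base16 Monokai scheme (they are elided in the snippet above), and the template applies to every language, so the per-language conditional checks mentioned in the task list are not shown.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:xslthl="http://xslthl.sf.net"
    exclude-result-prefixes="xslthl">

  <!-- A real customization layer would import the DocBook FO stylesheets
       here; the import is left out to keep this sketch self-contained. -->

  <!-- Palette entries. base05 and base08 come from the published Base16
       Monokai scheme (assumption: they are not shown in the snippet above). -->
  <xsl:param name="Monokai.base00">#272822</xsl:param>
  <xsl:param name="Monokai.base05">#f8f8f2</xsl:param>
  <xsl:param name="Monokai.base08">#f92672</xsl:param>

  <!-- Theme roles point at palette entries, so switching scheme only
       requires editing the palette block above. -->
  <xsl:param name="AlanHL.background" select="$Monokai.base00"/>
  <xsl:param name="AlanHL.normal"     select="$Monokai.base05"/>
  <xsl:param name="AlanHL.keyword"    select="$Monokai.base08"/>

  <!-- XSLTHL wraps each keyword token in an xslthl:keyword element;
       this template colors it using the theme parameter. -->
  <xsl:template match="xslthl:keyword" mode="xslthl">
    <fo:inline color="{$AlanHL.keyword}">
      <xsl:apply-templates mode="xslthl"/>
    </fo:inline>
  </xsl:template>

</xsl:stylesheet>
```

With this indirection, trying another Base16 scheme would only mean replacing the palette values, while the templates keep referring to the `AlanHL.*` roles.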

Choosing a Color Scheme

Personally, I love the Monokai scheme, but I know that not everyone likes dark schemes. Anyhow, changing the scheme is not a problem at all, and we can test various schemes easily with the new base16 system.

I think that if it has to be a light scheme it shouldn't be a strong yellow like the current one, because syntax highlighting colors need contrast with the background and among themselves (dark schemes provide the best contrast IMO).

Now it's a matter of choosing the final color scheme to adopt.

The ideal way would be to actually test the color schemes on Alan syntax using the Highlight tool, which now ships with the Alan syntax and all the Base16 schemes:

Alternatively, here is a link to live previews of color schemes on Highlight.js (no Alan syntax though):

We could refer to the names of those schemes to communicate and discuss choices.

About the New Alan Highlighter

I had to create a new syntax for Alan in XSLTHL. It works pretty well: I've done some stress testing against edge cases and it holds up (e.g., no false positives for keywords inside quoted identifiers).

I'm going to add to the project some test files for syntax highlighting (now PDF, later on HTML too), for both previewing and testing.

The new syntax is almost identical to the Highlight syntax, except that it doesn't capture special dollar symbols in strings (XSLTHL doesn't support styling escape sequences or interpolation inside strings).

Also, I didn't add highlighting of predefined classes this time, because I realized that there is a conflict between the actor, container and location classes and their keyword counterparts (i.e., the actor and location pseudo-attributes, as in Current Actor/Location, and the container property). The highlighter would be unable to tell when these are classes and when they are not; in fact I should remove them from the Highlight syntax too!

I now understand why the keywords list in the Manual doesn't include some predefined classes: it doesn't actually contain ANY classes, since the ones that appear there do so by virtue of being either pseudo-attributes or properties.

Is that correct, or am I misunderstanding the above point?

@thoni56
Contributor

thoni56 commented Sep 9, 2018

About the classes, yes, I think of them not as keywords, but as just predefined classes. So in my mind they should have the same treatment as other, author defined, classes (and identifiers).

Actually we should probably change the "syntax" from Current Actor to Current actor, semantically stating that following the Current keyword should be a class identifier, but that only the predefined actor and location are allowed.

Concerning colour scheme I prefer light ones, but since we want the section to be visually different from the white "paper" background that limits the choice. I've often used Solarized Light, but tweaked the background to a more grayish colour.

I don't think we want a predominant colour, like the blue-ish one in Lakeside Light; I'm leaning more towards GitHub, Foundation or Magula. I'd like the code blocks to be identifiable, but not intrusive (which I think dark themes mostly are), and the syntax colouring to be clearly visible but not like a Christmas tree.

(I went for Highlight.js showcase since I felt the update on Base16 was sluggish, particularly for some schemes. Maybe a performance issue with the highlighter?)

@tajmone
Collaborator Author

tajmone commented Sep 9, 2018

Predefined Classes

Actually we should probably change the "syntax" from Current Actor to Current actor, semantically stating that following the Current keyword should be a class identifier, but that only the predefined actor and location are allowed.

Indeed, this would be less confusing. Pseudo-attributes are mentioned in the Manual, but for practical purposes these technical distinctions aren't really important for the end user, and we should stick to the natural English side of Alan, instead of the "technically correct" programming semantics.

About the classes, yes, I think of them not as keywords, but as just predefined classes. So in my mind they should have the same treatment as other, author defined, classes (and identifiers).

This creates some confusion right now because I've added to the highlighter's keywords list all the keywords found in the Manual, and these include actor and location but not thing and object.

So, in these cases some predefined classes will be highlighted as keywords (shown in all caps) while others won't (lowercased):

Every man IsA ACTOR
Every room IsA LOCATION
Every magic IsA thing
Every toy IsA object

The XSLTHL highlighter doesn't use a stack; it takes a flat approach to tokens, so it's not possible to highlight selectively according to context. This would be possible with Rouge, which we can use only for HTML, but we would end up with different syntax highlighting in PDF and HTML, which is not good.

So, we either remove actor and location from the keywords list or we also add thing and object. With the former solution actor and location would never be highlighted as keywords, even when they are pseudo-attributes.

Current actor     --> 'actor' not a keyword!
Current location  --> 'location' not a keyword!

But then this would allow us to add all the predefined classes to a separate group of syntax elements, and we could style them differently if we wanted (e.g. in bold, just to remind the reader that they are native classes).

The latter solution doesn't make much sense because classes shouldn't be colored like keywords in my opinion.
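
The "separate group" option could look roughly like the following XSLTHL syntax definition. This is a hypothetical, heavily trimmed sketch and not the project's actual definition file: the keyword sample is tiny, the style name `classes` is made up for illustration, and it relies on the understanding that XSLTHL lets a highlighter override its output style through a `<style>` element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical, trimmed-down XSLTHL definition for Alan (illustration only). -->
<highlighters>

  <!-- Comments run from "--" to the end of the line. -->
  <highlighter type="oneline-comment">--</highlighter>

  <!-- Double-quoted strings. -->
  <highlighter type="string">
    <string>"</string>
  </highlighter>

  <!-- Group 1: true keywords (small sample), emitted with the default
       "keyword" style. Alan is case-insensitive, hence ignoreCase. -->
  <highlighter type="keywords">
    <keyword>every</keyword>
    <keyword>isa</keyword>
    <keyword>current</keyword>
    <ignoreCase/>
  </highlighter>

  <!-- Group 2: predefined classes, kept out of the keywords group and
       tagged with a custom style name (assumed name: "classes"). -->
  <highlighter type="keywords">
    <keyword>actor</keyword>
    <keyword>location</keyword>
    <keyword>thing</keyword>
    <keyword>object</keyword>
    <ignoreCase/>
    <style>classes</style>
  </highlighter>

</highlighters>
```

Because the matching is flat, `actor` in `Current actor` would always receive the class style under this layout, never the keyword style, which is the trade-off described above. The XSL stylesheet would then need a template for the custom style to give the classes their own look.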

Literals: String?

I am still confused about the literals though. Should String be treated as a class or a keyword in highlighting?


Color Scheme

Concerning colour scheme I prefer light ones, but since we want the section to be visually different from the white "paper" background that limits the choice.

Since I've added a thin border around the code, even very light bg colors should work fine, and a slightly darker border will frame the code and separate it visually from the page.

Transparent BG?

... besides, there is no golden rule that code needs a background color or border. Since it uses monospace fonts and custom text coloring, we could actually do without any bg color and border at all, and it would probably look better, especially when there is a page break in the middle of a code block, which in the PDF slices the content abruptly (although in DocBook this can be controlled and fixed).

Should we try transparent background, and just focus on foreground colors? After all, we only need a few contrasting colors here:

  • normal text
  • keywords
  • strings
  • comments
  • numbers (optional)
  • operators (optional)

All the schemes you mentioned (GitHub, Foundation, Magula) could be tested with transparent (i.e. white) background. Some schemes were designed to work well on white out of the box (e.g. Google).
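
One way a transparent, borderless style for Alan code could be expressed is by overriding the DocBook XSL-FO `shade.verbatim.style` attribute set. The sketch below is an assumption-based illustration, not the project's stylesheet: the `Verbatim.*` parameter names and their colors are hypothetical, and it assumes that Alan listings reach the stylesheet as verbatim elements carrying `language="alan"`.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- A real customization layer would import the DocBook FO stylesheets
       here; the import is left out to keep this sketch self-contained. -->

  <!-- Enable shading of verbatim blocks (DocBook XSL-FO parameter). -->
  <xsl:param name="shade.verbatim" select="1"/>

  <!-- Hypothetical colors for verbatim blocks other than Alan code. -->
  <xsl:param name="Verbatim.background">#f5f5f5</xsl:param>
  <xsl:param name="Verbatim.border">#cccccc</xsl:param>

  <!-- The attribute values are computed with the verbatim element as the
       context node, so its language attribute can be tested here. -->
  <xsl:attribute-set name="shade.verbatim.style">
    <xsl:attribute name="background-color">
      <xsl:choose>
        <!-- Alan code: no background at all (assumed language name). -->
        <xsl:when test="@language = 'alan'">transparent</xsl:when>
        <xsl:otherwise><xsl:value-of select="$Verbatim.background"/></xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
    <xsl:attribute name="border-style">
      <xsl:choose>
        <!-- Alan code: no border either. -->
        <xsl:when test="@language = 'alan'">none</xsl:when>
        <xsl:otherwise>solid</xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
    <xsl:attribute name="border-width">0.5pt</xsl:attribute>
    <xsl:attribute name="border-color">
      <xsl:value-of select="$Verbatim.border"/>
    </xsl:attribute>
  </xsl:attribute-set>

</xsl:stylesheet>
```

Keeping the decision inside the attribute set means the foreground colors can be tuned separately, which fits the idea of focusing only on a few contrasting text colors.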

I'll do some tests and post some grabbed screenshots here.

Highlight Base16 Problems

... I felt the update on Base16 was sluggish, particularly for some schemes. Maybe a performance issue with the highlighter?

Could you please expand on this point? I'm the contributor of the Base16 update in Highlight, so if there are problems with it I'd like to fix it.

@thoni56
Contributor

thoni56 commented Sep 9, 2018

Predefined classes

My suggestion is to remove actor and location from the keyword list. As a direct consequence of this decision, string should also be removed. They are all just classes, although special, predefined, ones.

They could be highlighted differently but if I read you correctly you think that is not a good idea, and I tend to agree.

Transparent background

Let's try transparent background, then.

Base16

By sluggish, I mean that when I changed the scheme it first rendered everything green (or something) and then after a second or two it drew the correct colouring. This made it impossible to skip through the schemes with any speed.

I tried it again just now, and the problem is gone. Browser issue maybe...

@tajmone
Collaborator Author

tajmone commented Sep 9, 2018

Other Blocks Coloring

In choosing the color scheme for Alan code we should keep in mind that there are other blocks that require a colored background, and each of them should have a different color to differentiate them and avoid confusion.

These are all the colored (aka "shaded" in XSL terminology) blocks:

  • BNF RULES: orange background (like the original, maybe slightly more pastel).
  • PLAY TRANSCRIPTS — although not a literal block, it should have border and bg color. I was thinking of a very light pastel blue, with normal text a dark blue (almost black but not quite).
  • SHELL/CMD — this should definitely be dark, as usually shells are. I propose to adopt the MS DOS colors here (almost white text on black) since it's a familiar scheme, and because Linux shells don't have a standard color scheme.
  • COMPILER ERRORS/MESSAGES — as found in appendices (Run-Time, etc.). I think that a very light grey could do, with a border, or maybe we could use yellow here (since we're not going to use it anymore for Alan code).

This being the context of colored blocks, I think that having Alan code without border and bg color could be actually a good idea since it would make it stand out by the fact that it doesn't need bg color nor border (i.e., it would make it "special" in this respect).

Keep in mind that Alan code blocks are also padded, and this together with the monospaced fonts and custom colors should make it clear that it's code. The only problem might be visually tracking the indentation of the code, but it's probably not really an issue in practice.

@thoni56
Contributor

thoni56 commented Sep 9, 2018

Sounds ok. Until I see it in real life ;-)

@tajmone
Collaborator Author

tajmone commented Sep 9, 2018

... when I changed the scheme it first rendered everything green (or something) and then after a second or two it drew the correct colouring. This made it impossible to skip through the schemes with any speed.

I tried it again just now, and the problem is gone. Browser issue maybe...

Probably a cache issue then. While I was creating and testing the Base16 schemes in the Highlight GUI I didn't experience any problems, nor after the schemes were included in the next Highlight release (which has created a separate list for the Base16 schemes, and placed them in a subfolder).

If you tested via Highlight CLI, the problem might have been due to the subfoldering of Base16 schemes. Who knows ...

@tajmone
Collaborator Author

tajmone commented Sep 9, 2018

Next Steps

Probably what I should do now is also create the color schemes for the other blocks, so we can compare Alan code with the general context:

  • Alan code: transparent, no border
  • BNF: orange (already there)
  • Transcripts: light blue
  • Shell: white on black
  • Compiler Messages: light grey or yellow

I'll have a go at them this afternoon then!

tajmone added a commit that referenced this issue Sep 9, 2018
Begin implementing changes discussed in Issue #17:
- Add new syntax elements group "classes" to highlight predefined classes:
  `actor`, `entity`, `location`, `object`, `string`, `thing`.
- Remove `actor` and `location` from "keywords" group.
- Update stylesheets accordingly.

NOTE: Now predefined classes are styled in blue, for testing.
tajmone added a commit that referenced this issue Sep 9, 2018
Carry on implementing changes of Issue #17:
Add customizable attributes to control border-style and -width in source code
and verbatim blocks.
tajmone added a commit that referenced this issue Sep 9, 2018
Implements experimental changes discussed at Issue #17:
- Alan code example without border and with transparent BG color.
- Custom colors used for testing (no specific scheme), might need adjusting.
  - predefined classes are shown in dark purple.
  - operators are styled as normal text (no highlighting visible)
  - numbers are shown in orange.
@tajmone
Collaborator Author

tajmone commented Sep 9, 2018

I've created a commit in a test branch so I could share a preview of the PDF documents using no bg color and border for Alan code:

The above links will always point to the latest PDF produced on the test branch, so even if we start tweaking colors they will always work.

I haven't actually followed any of the schemes you pointed out, because I've noticed they relied on some bg color, so I just worked out a quick temporary palette that would fit a white bg. Current colors are temporary, and they could be discussed and improved.

What I wanted to test here is if presenting code without border nor background color looks nice or not.

I think it looks good, but I have mixed feelings about it (keep in mind that I'm a strong supporter of always using dark schemes for code, because they are less stressful to the eye).

Definitely, when a code block gets interrupted by a page-break it looks better without border or background.

Also, I have the impression that without a box around it these examples seem to flow in with the discourse a bit more (ie, the box creates a big contrast with the body text, without box the code seems more attached to the text, so to speak).

In any case, I doubt that the lack of a boxing frame makes it difficult to distinguish between code and text: monospace font, syntax highlighting and different font sizes make it clear which is which.

What's your opinion? Keep going down the no-border no-bg road, or revert to using a color scheme and just find the right one?

@thoni56
Contributor

thoni56 commented Sep 9, 2018

I'm much for the code to blend in with the text flow, like you say, but it should still be easily identifiable as code. This, I think, leads us to a transparent, or very light, background.

I agree that a border is probably a bad idea, since I presume it will generate two boxes if broken by page break.

An ever so light grey might work, maybe.

The current colours and styling definitely do work, possibly the result of a very strong red keyword colour. I think I'd prefer a slightly less strong one. I don't want the keywords to be eye-magnets; ideally you should be able to read the code as easily as the text around it, while it still clearly identifies itself as code.

Yes, I realize that having no parts of code "stand in front" is one way to describe it.

Also you should be able to squint with your eyes and then only see the text in quotes ;-)

@tajmone
Collaborator Author

tajmone commented Sep 9, 2018

Ok, I've tried both the Foundation and GitHub schemes, as found on the Highlight.js website. I avoided Magula because it looked too dark.

Bear in mind that with the new variables-based system changing schemes in the XSL stylesheet is very easy, and I actually just commented out the previous schemes, so restoring them is a matter of a few clicks. Once we settle on a specific scheme I'll just delete the older ones from the source.

The PDF links above are now updated to the GitHub scheme. I tried Foundation but it looked a bit too darkish; anyhow, below are screenshots of both.

Here's a screenshot of Foundation:

Foundation Scheme

And here's a screenshot of GitHub:

GitHub Scheme

Also note that in GitHub I've removed the bold style from keywords, and I think it looks nicer (and I would keep it that way for any other scheme too).

I prefer the GitHub scheme — by the way, I think that this is actually the old color scheme used by GitHub, the newer one is a bit brighter in colors.

@thoni56
Contributor

thoni56 commented Sep 9, 2018

I agree that the GitHub theme is better, and I think it is quite good. In the full manual PDF I feel it flows very nicely.

tajmone added a commit that referenced this issue Sep 12, 2018
- Alan XSLTHL: Fix syntax:
  - Add missing keywords: `meta`, `transitively`, `indirectly`.
  - Add `literal` to predefined classes.
  - Create new group `hero` for highlighting `hero` instance (optional).
- XSL Stylesheets: integrate new `hero` syntax element.
- Add to `_dev/hl/syntax-highlighting.asciidoc` new code to test predefined
  classes and `hero`.
(see #15 and #17 for details)
tajmone added a commit that referenced this issue Sep 12, 2018
- Add missing keywords `meta`, `transitively` and `indirectly` to the list in
  "D.2. Keywords" and remove `actor` and `location`.
- Move keywords lists to external CSV file (`manual_keywords.csv`) and include
  it in the table.
(see #15 and #17 for more details)
tajmone added a commit to tajmone/highlight that referenced this issue Sep 20, 2018
This commit fixes some keywords issues and improves the syntax:
- kwd-ID 1: Add missing Keywords: `indirectly`, `meta`, `transitively`.
- kwd-ID 2: Add missing Predefined Classes: `literal`, `string`.
- Add new keywords group (ID 3) and move `hero` into it (from kwd-ID 2).
For more details on these changes, see discussion with Alan developer Thomas
Nilefalk (@thoni56) at:
    alan-if/alan-docs#15
    alan-if/alan-docs#17
@tajmone tajmone added the ⭐ syntax highlighting Topic: Syntax Highlighting label Sep 29, 2018
@tajmone tajmone self-assigned this Oct 6, 2018
@tajmone tajmone closed this as completed Dec 19, 2018

2 participants