Switching to the Rouge Highlighter #107

Closed
tajmone opened this issue May 26, 2021 · 10 comments
Labels
💡 enhancement A new feature or enhancement request/proposal 👑 HTML Format Issues with conversion to HTML format 👑 PDF Format Issues with conversion to PDF format 🔨 Asciidoctor PDF Tool: Asciidoctor PDF backend 🔨 asciidoctor-fopub Tool: asciidoctor-fopub (PDF toolchain) 🔨 Rouge Tool: Rouge syntax highlighter (Ruby) ⭐ syntax highlighting Topic: Syntax Highlighting 👅 Ruby Lang: Ruby

Comments

@tajmone
Collaborator

tajmone commented May 26, 2021

The problem of supporting callouts in Highlight (#36) has been pending a solution for so many years that I think it might be worth considering switching to Rouge instead, and creating an ALAN definition for it.

Rouge has the following pros and cons over Highlight:

  • PROS:
    • It's natively supported by Asciidoctor Ruby, for both the HTML and Asciidoctor PDF backends; see: Asciidoctor Documentation » Syntax Highlighting » Rouge.
      • The Asciidoctor v2.0.0 update didn't affect syntax highlighting via Rouge, whereas our Highlight extension has become legacy code.
    • It's written in Ruby, so it integrates well with Asciidoctor Ruby.
    • Its Lexer is state-driven, so it can handle powerful syntax constructs (e.g. nested states), pretty much like Pygments.
    • Doesn't require end users to install another language dependency (like Python for Pygments), since Asciidoctor users will have installed Ruby already.
    • Its themes are 100% compatible with Pygments stylesheets.
  • CONS:
    • It only supports UTF-8, so we might have problems with ISO-8859-1 ALAN sources (but this is already a problem with Asciidoctor anyhow). However:
      • The next Alan Beta will introduce support for UTF-8 source, which mitigates this problem.
      • This only affects highlighting ALAN sources in isolated form, and does not affect highlighting code blocks inside AsciiDoc documents.
    • The documentation on how to write a Lexer was lacking (or poor) in the past, but it's now fairly well documented, and tutorials are also available.
    • Using a custom syntax for our project might be a bit trickier, compared to Highlight, and maybe not even possible without submitting our ALAN syntax to the Rouge repository. Because:
      • Rouge is installed as a Gem, and I'm not sure it's possible to use external syntaxes (lexers) not included in the package.
      • Asciidoctor interfaces to Rouge via its Gem library, not the command line, so even if it was possible to use custom lexers via command line options, these might not be accessible via Asciidoctor.
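
For the isolated-sources case mentioned in the list above, transcoding a file to UTF-8 before handing it to the highlighter is one possible workaround. The following is a minimal sketch, assuming the input bytes are known to be ISO-8859-1; the helper name `latin1_to_utf8` and the sample string are hypothetical and not part of any existing tool:

```ruby
# Hypothetical helper: transcode ISO-8859-1 source text to UTF-8.
def latin1_to_utf8(bytes)
  # Re-tag a copy of the raw bytes as ISO-8859-1, then convert to UTF-8.
  bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

raw  = "caf\xE9".b            # "café" as raw ISO-8859-1 bytes
utf8 = latin1_to_utf8(raw)
puts utf8.encoding            # reports the encoding of the converted string
puts utf8.valid_encoding?     # reports whether the result is well-formed
```

The converted string could then be written to a temporary file, or passed on to whatever highlights it.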

The most problematic downside (and possibly a blocking issue) is definitely the complication of using a custom syntax (i.e. lexer) which is not part of the official Rouge Gem. Ideally, we would like to be able to immediately benefit from any fixes to our custom ALAN syntax, without having to wait for our PR to be merged upstream, and then for the next Rouge release, because this would prevent us from benefiting from any newly added features (e.g. the introduction of block comments in ALAN).

Also, we might want to create smaller custom lexers (e.g. to highlight the BNF rules in the ALAN Manual) which most likely are not eligible for inclusion in the upstream Gem. With Highlight, we're free to override any of its bundled syntaxes with our own; so far I haven't found a way to do this with Rouge.

It would probably be possible to set up the Alan Docs repository so that it would use our fork of the Rouge repository (as a Git submodule) instead of the globally installed Gem (e.g. using bundler, or some other tool); but we would still need to ensure that Asciidoctor picks our version instead of the (globally installed) Rouge Gem.

Any ideas on how to achieve this?

@tajmone tajmone added 💡 enhancement A new feature or enhancement request/proposal 👑 PDF Format Issues with conversion to PDF format 👑 HTML Format Issues with conversion to HTML format ⭐ syntax highlighting Topic: Syntax Highlighting labels May 26, 2021
@tajmone
Collaborator Author

tajmone commented May 26, 2021

ALAN Lexer for Rouge: WIP

@thoni56, in the last few days I've started experimenting with Rouge, and after some fiddling around to find out how to switch from Ruby 3.x to Ruby 2.7 (required only for visually testing custom lexers) I finally managed to start working on a custom ALAN lexer.

So far, so good. Rouge is fairly simple to work with, and I quickly managed to draft an ALAN syntax. It shouldn't take long to come up with a polished syntax capable of handling contextual semantics either.
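
To illustrate what "contextual" means for a state-driven lexer, here is a toy sketch in plain Ruby (standard library only, not Rouge's API). It keeps a stack of states, so the same character is tokenized differently depending on where it occurs. The rules, token names and the sample input are illustrative assumptions, not the actual ALAN lexer:

```ruby
require 'strscan'

# Toy state-driven lexer: a stack of states decides which rules apply.
# Token names and rules are illustrative only.
def toy_lex(source)
  scanner = StringScanner.new(source)
  states  = [:root]            # state stack; the last entry is the active state
  tokens  = []

  until scanner.eos?
    case states.last
    when :root
      if scanner.scan(/--[^\n]*/)
        tokens << [:comment, scanner.matched]
      elsif scanner.scan(/"/)
        tokens << [:string, scanner.matched]
        states.push(:string)   # enter the nested string state
      elsif scanner.scan(/[A-Za-z_]\w*/)
        tokens << [:word, scanner.matched]
      elsif scanner.scan(/\s+/)
        tokens << [:space, scanner.matched]
      else
        tokens << [:other, scanner.getch]
      end
    when :string
      if scanner.scan(/"/)
        tokens << [:string, scanner.matched]
        states.pop             # return to the enclosing state
      elsif scanner.scan(/\$[a-z0-9]/)
        tokens << [:escape, scanner.matched]
      elsif scanner.scan(/[^"$]+/)
        tokens << [:string, scanner.matched]
      else
        tokens << [:string, scanner.getch]
      end
    end
  end
  tokens
end

toy_lex(%q(say "Hi$n" -- "x")).each do |type, text|
  puts "#{type}: #{text.inspect}"
end
```

In this sketch `$n` is only an escape inside a string, and a quote inside a comment never opens a string, which is the kind of distinction a purely keyword-based definition cannot make.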

Right now I'm struggling more with attempting to solve clashes for the .i file extension, which is used by another syntax too. Rouge can resolve such clashes via disambiguation rules, but this feature is poorly documented, so I haven't found a solution yet.

The next thing I want to do is discover if and how we could use our custom ALAN syntax in this project, without depending on the official Rouge gem/repository. It's not that I don't want to submit the ALAN syntax once it's ready (I will), it's just that I don't like the idea of depending on the upstream repository for our own updates.

Anyhow, I just wanted to mention here that I'm confident about the possibility of creating an ALAN syntax definition for Rouge, and that it would be superior in quality to the current syntaxes we have.

Since this project already requires Ruby, switching to Rouge wouldn't add to the dependency burden (it would actually reduce it, since it's easier to install and update a Gem than a third-party tool like Highlight).

Switching from Fopub to Asciidoctor-PDF?

Furthermore, since the Rouge syntax would also work with Asciidoctor's native PDF backend (asciidoctor-pdf), we can start checking whether the problems that previously prevented us from using it (see #9) are now solved (three years have passed in the meantime!).

I personally would love to switch from asciidoctor-fopub to asciidoctor-pdf, for various reasons:

  1. Fopub/DocBook templates are too hard to understand and customize, and they are ugly too. From what I remember, asciidoctor-pdf templates are a breeze to work with.
  2. I hate having to read and write XML (I'm sure it's recognized as a kind of ocular-torture in some civilized country), and Fopub/DocBook is all XML.
  3. I hate and distrust Java, and asciidoctor-fopub in the past has forced me to install the Java SDKs (considered unsafe) in order to use it. Eventually it was upgraded to use JDK 10, but since the asciidoctor-fopub project is looking for new maintainers, and it's not regularly updated (last commit 2018), this situation will probably repeat in the future.
  4. Asciidoctor-pdf is the officially endorsed PDF backend by the Asciidoctor Project, so it will always receive more attention than asciidoctor-fopub — and will most likely supersede it entirely.
  5. Asciidoctor-pdf is in Ruby — I'd like to keep most dependencies in Ruby, since we're using Asciidoctor Ruby here!

So, if the original limitations which prevented us from using asciidoctor-pdf three years ago are now resolved, having an Alan syntax in Rouge would be the only other requirement for us to make the switch.

@tajmone tajmone pinned this issue May 27, 2021
@tajmone
Collaborator Author

tajmone commented May 27, 2021

Using Custom Lexer: It's Possible!

OK, after some tests, and after having created a PoC Alan lexer for Rouge, I can safely say that we can use our custom lexers from outside the Rouge Gem.

Assuming our lexer is called alan3.rb, to highlight the sample.alan file from the command line we only need to use the --require/-r option:

$ rougify sample.alan --require ./alan3.rb

As for Asciidoctor, it doesn't seem like there's a way to enforce the --require option on Rouge. So my best guess right now is we might need to write our custom Rouge extension and override Asciidoctor's native API for Rouge:

This shouldn't be too hard; we've already done something similar to enable Highlight support. The only potential issue I can foresee is that we might have to rename our highlighter as RougeCustom in case it clashes with the built-in Rouge API (I haven't checked this; it's only a hypothesis, and it might just as well override it).

So at this point I'm very optimistic about this!

I've already produced a usable Alan lexer for Rouge, which is already more accurate than the current Highlight and highlight.js syntaxes, and I could even improve it further.

Also, with respect to Issues #81 and #83, I'm starting to think that our best option for creating a good toolchain to build the docs is to rely on Ruby rather than Bash scripts in the future.

We already need Ruby for the project because of Asciidoctor, so we might just as well use Ruby scripts to build the documents (which would give us finer-grained control over Asciidoctor settings and options, via its API), and we could even think of using one of Ruby's many build frameworks to configure the different aspects of our toolchain.
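
As a rough idea of what such a Ruby build helper might look like, the sketch below only assembles an `asciidoctor` command line, so it runs without Asciidoctor installed; a real script would more likely call the Asciidoctor API directly. The helper name `asciidoctor_args`, the document name `manual.asciidoc` and the `github` style value are placeholders, and it is an assumption that requiring the lexer file via `-r` is sufficient (the file would at least need to load Rouge itself):

```ruby
# Hypothetical helper: build the argument list for one Asciidoctor run
# with Rouge enabled. `lexer_file` is assumed to be a Ruby file that loads
# Rouge and defines the custom lexer; whether `-r` alone is enough depends
# on the actual setup.
def asciidoctor_args(document, lexer_file: './alan3.rb', attributes: {})
  args = ['asciidoctor', '-r', lexer_file, '-a', 'source-highlighter=rouge']
  attributes.each { |name, value| args.push('-a', "#{name}=#{value}") }
  args << document
end

args = asciidoctor_args('manual.asciidoc', attributes: { 'rouge-style' => 'github' })
puts args.join(' ')
# To actually run the build: system(*args) or raise 'asciidoctor failed'
```

Keeping the options in one Ruby method means every document in the project would be built with the same highlighter settings, instead of repeating them across several Bash scripts.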

As for the idea of #83, regarding moving shared ALAN toolchain assets into a common repository, a better alternative could be to create a dedicated Gem that covers the common needs of ALAN documents in AsciiDoc and handling the ALAN SDK for building and testing examples and libraries, and to dynamically extract their source and generated transcripts into ADoc sources.

In the coming days I'll be evaluating whether it would be better to move similar assets into a repository (which can then be submoduled via Git in Alan Docs, StdLib, etc.) or whether I should create a Ruby Gem that exposes all the required assets. The advantage of using a Gem is that it would also benefit private projects that don't rely on Git.

@tajmone tajmone added 🔨 Rouge Tool: Rouge syntax highlighter (Ruby) 👅 Ruby Lang: Ruby labels May 27, 2021
@tajmone
Collaborator Author

tajmone commented Jun 11, 2021

Rouge Tests Now Publicly Available

A fully working (but not completed yet) Rouge lexer for Alan can now be found at:

https://github.com/alan-if/Alan-Testbed/tree/master/Rouge

That folder is currently being used for testing integration of custom Rouge lexers in the Asciidoctor toolchain.

tajmone added a commit to alan-if/Alan-Testbed that referenced this issue Jun 12, 2021
Thanks to @mojavelinux (asciidoctor/asciidoctor#4080) we now have our
first working test on how to use Asciidoctor with custom lexers for ALAN
that are not part of the Rouge gem.

This milestone confirms that we'll be able to address and fix the Issues
discussed at alan-if/alan-docs#107 and alan-if/alan-docs#36.
@tajmone
Collaborator Author

tajmone commented Jul 1, 2021

Rouge Lexer and Themes Almost Ready

@thoni56, I just wanted to update you on the work regarding the migration from Highlight and highlight.js to Rouge (Ruby) for the HTML backend.

Both the ALAN lexer and custom theme are almost ready. I still need to tweak the colours in the three themes for ALAN code, and make a couple of styling decisions, but we should be ready to migrate in a few weeks' time.

You can always see a live preview of the ongoing work in these two HTML documents:

Since the Rouge lexer is more accurate in terms of the tokens it captures, I'll need to add a couple of missing colours to cover some elements that weren't available with Highlight and highlight.js.

I'll try to stick to the original themes and decisions, and for example hide some elements by assigning them the default text colour (e.g. numbers, operators, etc.) or colouring them like ALAN keywords (e.g. punctuation), just as we did so far, and agreed upon.

However, I'll have to tweak the default theme (GitHub inspired) because I need at least two extra colours. I'll keep as close to the current scheme as possible (I was thinking of retrieving the original GitHub colour scheme and taking the missing colours from it, keeping our customization of the background colour as is).

Also, I'd like to enforce more consistency across the three ALAN themes, e.g. use similar colour hues to represent the same syntax elements. But I'll probably need to replace the current dark (library) and light (tutorial) themes with some other colour scheme that has more colour choices. I guess you're OK with this, since I picked those alternative schemes myself.

The problem is that the current alternative schemes (Base16 Eighties and Google Dark) contain too many duplicate entries, leaving me with too few distinct colours to use.

Anyhow, the switch to Rouge will not only allow us to drop two dependencies for the HTML backend, but will also finally empower us to use call-outs in code blocks, which is a feature I really wanted to use in the ALAN Manual, since it's really cool.

When everything is production ready, I'll buzz you again, so you can check the details of the new themes (via the above links) and give me the green light (or ask for revisions) to go ahead and switch to Rouge in the ALAN Docs.

@thoni56
Contributor

thoni56 commented Jul 1, 2021

Nice work and progress.

With just a quick look, I feel the red may be too red. I think the other parts of the documents have a more "soft" touch. It's a fine balance between colouring everything that needs it (everything might not...) and not making a distracting and hard-to-read colour salad ;-)

@tajmone
Collaborator Author

tajmone commented Jul 1, 2021

The colours in the sample document are still mostly unmatched to the original scheme, but those that are matched use the original Sass colour schemes, so the red of the default theme is the red of the original scheme used in Alan Docs.

The syntax test document is another matter: I'm using extra bright colours just for testing, and that's a working scheme only.

What I miss in terms of colours are a good contrasting yellow and an alternative orange or purple, really. Something that will look nice in strings, to represent delimiters, escapes and interpolation. The current colours (in all schemes) are either too contrasted with the string's green, or too dark. I was thinking of adding some ad hoc colours for string elements, to ensure they look nice (after all, strings are the core of adventures, since they represent output).

I'm calibrating my monitor colours every day with the calibrator gadget (a rather painful and lengthy process), so I'm 100% sure that the colours I see are as they are intended to be (in fact, I can barely distinguish the different colours of the default theme, which all look too similar to me, and I need a magnifying lens and/or Chrome's debugger to check that elements are represented in a given colour). Some colours only show up properly on bold elements, unfortunately, and with normal weight text tend to look alike.

@tajmone
Collaborator Author

tajmone commented Jul 6, 2021

Problems with Substitutions in Code Blocks

I just discovered that the native Asciidoctor interface for Rouge doesn't actually support substitutions in code blocks, which means we won't be able to use Rouge with the ALAN Beginner's Guide, which makes frequent use of it to highlight single lines of code:

So we might have to keep using Highlight for the Beginner's Guide, while switching to Rouge for all other docs. At least the ALAN Manual will finally be syntax highlighted; but I was hoping to adopt a single solution for all HTML docs.

Also, I'm having problems implementing the new block comments feature in Highlight, which will make its future use problematic with documents updated to mirror the latest Beta 8 features.

@tajmone tajmone unpinned this issue Jul 30, 2021
@tajmone
Collaborator Author

tajmone commented Jul 30, 2021

🎵🎺🎵 TADAAA! 🎵🎺🎵 Rouge Is Here!

@thoni56, we've done it!!!

Against all odds, we can now finally use Rouge for syntax highlighting the documents in this project.

I've managed to keep the ALAN themes identical to those we had with Highlight: I basically suppressed all the extra tokens that the new lexer can match by assigning them the same colour (so the tokens are there, you just don't see them because everything is coloured as before).
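
The idea of hiding extra tokens behind existing colours can be sketched in plain Ruby as follows. This is only an illustration, assuming Pygments-style dotted token names; the colour values are placeholders rather than the actual Alan Docs palette, and `colour_for` is a hypothetical helper, not part of Rouge:

```ruby
# Placeholder palette: only a few "visible" token types get their own colour.
BASE_COLOURS = {
  'Text'           => '#333333',
  'Keyword'        => '#a71d5d',
  'Literal.String' => '#183691',
  'Comment'        => '#969896'
}.freeze

# Resolve a token's colour by walking up its dotted name until a styled
# ancestor is found; tokens with no styled ancestor get the plain text colour.
def colour_for(token, colours = BASE_COLOURS)
  parts = token.split('.')
  until parts.empty?
    colour = colours[parts.join('.')]
    return colour if colour
    parts.pop
  end
  colours.fetch('Text')
end

%w[Literal.String.Escape Literal.String.Interpol Operator Keyword.Type].each do |token|
  puts "#{token}: #{colour_for(token)}"
end
```

With a scheme like this, finer-grained tokens such as string escapes remain in the markup and could be given their own colour later by adding a single entry to the palette.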

As a first test, I've tweaked the build script of the Alan Design docs, so they now are using Rouge, as you can see on their updated live versions:

(the Alan Rules! document doesn't have any ALAN code, so I didn't link it, but if you add code it will be highlighted).

So, let me know what you think ... and when you give the OK I can enforce Rouge on the ALAN Manual too, which currently is still using highlight.js, which is not working online (see #109).

Just remember that Rouge supports callouts but not inline spans or highlight markers, whereas Highlight is the other way round (no callouts, but supports spans and highlight/marked).

Since the Beginner's Guide uses a lot of highlight markers in the code, we'll have to keep Highlight for the moment, along with Rouge. We'll also keep highlight.js, just in case we ever need it.

@thoni56
Contributor

thoni56 commented Jul 30, 2021

Great work, Tristano! And congratulations on achieving that goal, which I know you have worked hard to reach.

It looks very nice, so I'm trusting you to handle any upcoming issues. So you definitely have my OK.

@tajmone
Collaborator Author

tajmone commented Jul 31, 2021

I've just updated the HTML versions of the Manual (beta and alpha) on the website, with their new Rouge version.
Now they are both syntax highlighted and nice to read!

@tajmone tajmone added 🔨 Asciidoctor PDF Tool: Asciidoctor PDF backend 🔨 asciidoctor-fopub Tool: asciidoctor-fopub (PDF toolchain) labels Aug 21, 2021