Error when trying to import remark-torchlight #1

Open
Benjaminsson opened this issue Sep 27, 2021 · 10 comments

Comments

@Benjaminsson

I get this error when trying to import remark-torchlight:

SyntaxError: Cannot use import statement outside a module

I reproduced the error in this CodeSandbox

@cody-quinn

👍 Am getting the exact same issue

import remarkTorchlight from "remark-torchlight";
error - /home/~~Redacted~~/node_modules/remark-torchlight/index.js:1
import parse5 from 'parse5'
^^^^^^

SyntaxError: Cannot use import statement outside a module

N0tExisting added a commit to N0tExisting/remark-torchlight that referenced this issue Jan 14, 2022
This pr fixes torchlight-api#1 by converting the package to an esm only package.
@ghost

ghost commented Apr 13, 2022

Same issue here

@jobyh

jobyh commented Jun 15, 2022

Also preventing me using in NextJS (swc compiler) 🙁 @aarondfrancis is there something we can help with toward getting this resolved?

@aarondfrancis
Contributor

Ok yall, thanks for your patience. I just updated the underlying library and this library to ESM, but now the test is failing. It worked with the old node_modules versions I had installed locally, but when I pushed to GH it failed. I've spent a couple hours digging and trying to figure out what changed, but the whole remark/rehype/unified/hast ecosystem stuff is not super straightforward to me.

Can anyone tell if something has changed and I need to update the shape of the nodes coming out of this plugin?

@aarondfrancis
Contributor

Ok I finally figured it out. Between remark-html 14.0.0 and 14.0.1 they changed a default param.

https://github.com/torchlight-api/remark-torchlight/actions/runs/2506264156

The test now works, but I don't feel great that it relies on sanitize false.

Should this be a rehype plugin instead?

@jobyh

jobyh commented Jun 16, 2022

Amazing @aarondfrancis thank you! Will have a go with this later on.

RE: sanitize: false I agree it feels incorrect, particularly after reading the remark docs.

RE: Rehype - in short, yes, I think you could be right. While it wasn't possible to use this library yesterday I ended up doing some digging of my own and integrating rehype-highlight for SSR and started wondering the same thing.

The full pipeline felt like it made the most sense as a chain via unified. A torchlight-rehype plugin could replace the use of rehype-highlight below:

import {readFileSync} from 'fs'
import {unified} from 'unified'
import highlight from 'rehype-highlight'
import parse from 'remark-parse'
import rehype from 'remark-rehype'
import stringify from 'rehype-stringify'
import matter from 'gray-matter'

const markdown = matter(readFileSync('./path/to/your/markdown.md', 'utf8'))
const rendered = (await unified()
  .use(parse)
  .use(rehype)
  .use(highlight)
  .use(stringify)
  .process(markdown.content))
  .toString()

> the whole remark/rehype/unified/hast ecosystem stuff is not super straightforward to me

...yep 🤯 - my takeaway was: remark for raw markdown content to HTML and rehype for processing HTML. They're all built on top of unified.

@mnapoli

mnapoli commented Dec 29, 2022

@aarondfrancis it seems the last release that fixes this (0.0.3) is not published to NPM: https://www.npmjs.com/package/remark-torchlight?activeTab=versions

Any way to publish it to NPM? Thanks!

@mcgrealife

mcgrealife commented Jan 21, 2023

v0.0.3 might be unpublished because it requires a workaround that disables remark-html's sanitize option (creating an XSS vulnerability).

As a temporary solution, downloading v0.0.3 and providing it to package.json as a local module (i.e. in package.json reference the package via the syntax file:path/to/local-module) works

Stale research:

For a long-term solution:

await unified() // unified directly
  .use(remarkParse) // unified has no parser of its own
  .use(remarkRehype)
  .use(rehypeTorchlight)
  .use(rehypeSanitize, {schema}) // schema with allowed class names
  .use(rehypeStringify)
  .process(markdown)

(instead of the current v0.0.3 way):

await remark()
  .use(remarkhtml, {sanitize: false})
  .use(remarkTorchlight)
  .process(markdown)

@mcgrealife

mcgrealife commented Jan 22, 2023

Stale research

Update: I have some working versions, and details about sanitization. Conclusion: remark-torchlight does not need to be re-written as a rehype plugin.

remark = markdown
rehype = hypertext (HTML)

Currently, remark-torchlight works while the code is still in a markdown abstract syntax tree (mdast), before being converted (by remarkRehype) into an HTML abstract syntax tree (hast).
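
To make the two tree shapes concrete, here is a dependency-free sketch of what a fenced code block roughly looks like as a markdown syntax tree (mdast) node and as an HTML syntax tree (hast) node. The `codeToHast` helper is hypothetical and only handles `code` nodes; in a real pipeline remark-rehype does this conversion.

```javascript
// Simplified mdast node for a fenced ```js code block.
const mdastCode = {
  type: 'code',
  lang: 'js',
  value: 'const answer = 42'
}

// Hypothetical toy stand-in for the mdast -> hast step.
// It is NOT remark-rehype; it only illustrates the change in node shape.
function codeToHast (node) {
  const properties = node.lang ? { className: ['language-' + node.lang] } : {}
  return {
    type: 'element',
    tagName: 'pre',
    properties: {},
    children: [{
      type: 'element',
      tagName: 'code',
      properties,
      children: [{ type: 'text', value: node.value + '\n' }]
    }]
  }
}

const hastPre = codeToHast(mdastCode)
console.log(JSON.stringify(hastPre, null, 2))
```

A plugin that runs before the conversion sees the flat `code` node with `lang` and `value`; a plugin that runs after it sees nested `pre` and `code` elements with a `className` list instead.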

Plugin chain

Both chain configurations work. The important new plugin is .use(rehypeSanitize):

Edit: there is a 3rd simpler option that allows passing the schema object to the sanitize prop in remark-html (see final comment below)

await unified():

The benefit of starting from unified() (instead of remark()) seems to be that your input can begin as something other than markdown. The drawback is that it requires more .use() plugins.

import { unified } from 'unified'
import remarkParse from 'remark-parse'

import torchlight from '../modules/remark-torchlight'
import remarkRehype from 'remark-rehype'
import rehypeSanitize, { defaultSchema } from 'rehype-sanitize'
import rehypeStringify from 'rehype-stringify'

await unified()
  .use(remarkParse) // required when using unified
  .use(torchlight, config) // _before_ markdown is transformed into HTML
  .use(remarkRehype) // converts from mdast to hast (markdown syntax tree to HTML syntax tree)
  .use(rehypeSanitize, schemaObject) // removes every attribute not explicitly allowed in schemaObject (e.g. class names, styles)
  .use(rehypeStringify)
  .process(markdownInput)

await remark():

Closer to the current document. Under the hood, remark() is unified() with remarkParse (and remarkStringify) already applied.

import { remark } from 'remark'

import torchlight from '../modules/remark-torchlight'
import remarkRehype from 'remark-rehype'
import rehypeSanitize, { defaultSchema } from 'rehype-sanitize'
import rehypeStringify from 'rehype-stringify'

// If we know we're starting with markdown, we can start directly with remark(),
// which does not require remarkParse.
await remark()
  .use(torchlight, config) // highlight code while it's still markdown
  .use(remarkRehype) // convert to HTML
  .use(rehypeSanitize, schemaObject) // same sanitize step as in the unified() chain
  .use(rehypeStringify)
  .process(markdownInput)

Schema challenges of .use(rehypeSanitize, schema)

If an attribute is not explicitly provided, it will be removed.

import rehypeSanitize, { defaultSchema } from 'rehype-sanitize'

// ...

 .use(rehypeSanitize, {
      ...defaultSchema,
      attributes: {
        ...defaultSchema.attributes,
        pre: [
          ...(defaultSchema.attributes.pre || []),
          ['className', 'torchlight', 'has-highlight-lines', 'has-focus-lines'],
          ['style', 'background-color: #2e3440ff'],
        ],
        code: [
          ...(defaultSchema.attributes.code || []),
          ['className', 'language-js', 'js']
        ],
        div: [
          ...(defaultSchema.attributes.div || []),
          ['className', 'line', 'line-focus', 'line-highlight', 'line-focus', 'line-has-background', 'yourCustomClass'],
          ['style', 'background-color: #3b4252'],
          ['id', 'customId'],
        ],
        span: [
          ...(defaultSchema.attributes.span || []),
          ['className', 'line-number'],
          ['style', 'color: #D8DEE9;', 'color: #88C0D0;', 'color: #A3BE8C;', 'color: #88C0D0;', 'color: #D8DEE9FF;', 'color: #ECEFF4;'], // requires semicolons?
        ]
      }
    })

Some schema challenges:

  1. In the example, I added some classNames manually, but torchlight supports more. Maybe torchlight exports a map of HTML elements and possible classNames.
  2. torchlight themes are implemented as color and background-color properties on the style attribute, with much variance between themes. I can't find any theme declaration files that could be used as a dictionary yet.
  3. torchlight allows users to add custom classes and ids, which would need to be passed to the schema
    a. note: rehypeSanitize will prepend user-content- to your custom id. E.g. from id="yourFakeId" to id="user-content-yourFakeId"
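
One way to cope with challenges 1 and 3 could be a small helper that extends a base schema from a map of tag names to allowed class names. This is only a sketch: `extendSchemaWithClasses`, `stubSchema`, `torchlightClasses` and `sanitizedId` are hypothetical names, and the stub stands in for `defaultSchema` so the snippet runs without dependencies.

```javascript
// Hypothetical helper: extend a sanitize schema with extra allowed class names.
// `baseSchema` is assumed to look like rehype-sanitize's `defaultSchema`,
// i.e. an object with an `attributes` map of tag name -> attribute definitions.
function extendSchemaWithClasses (baseSchema, classesByTag) {
  const attributes = { ...(baseSchema.attributes || {}) }
  for (const [tagName, classNames] of Object.entries(classesByTag)) {
    attributes[tagName] = [
      ...(attributes[tagName] || []),
      ['className', ...classNames]
    ]
  }
  return { ...baseSchema, attributes }
}

// Stand-in for `defaultSchema` (assumption, not the real default).
const stubSchema = { attributes: { code: ['title'] } }

// Class names taken from the example above; torchlight may emit more.
const torchlightClasses = {
  pre: ['torchlight', 'has-highlight-lines', 'has-focus-lines'],
  div: ['line', 'line-focus', 'line-highlight', 'line-has-background'],
  span: ['line-number']
}

const torchlightSchema = extendSchemaWithClasses(stubSchema, torchlightClasses)
console.log(torchlightSchema.attributes.pre)

// Per the note above, a custom id comes out with a `user-content-` prefix,
// so CSS or anchors targeting it need the prefixed form.
function sanitizedId (id) {
  return 'user-content-' + id
}
console.log(sanitizedId('yourFakeId'))
```

User-defined classes could be merged into `torchlightClasses` before building the schema, so the allowlist lives in one place.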

Other small observations
  • I noticed that the remark-torchlight code imports "parse5" and "parseFrom5" to convert the plugin input from a markdown abstract syntax tree (mdast) to an HTML abstract syntax tree (hast)

  • the output of remarkRehype is already a hast. So maybe an opportunity for simplification. (it would require moving the .use(torchlight) down the chain, so it's not used until after .use(remarkRehype))

@mcgrealife

mcgrealife commented Jan 22, 2023

The `remark-html` plugin allows passing a **schema object into the sanitize prop**. E.g.

await remark()
    .use(html, {
      sanitize: torchlightSchemaObject // schema object here!
    })
    .use(torchlight, config)
    .process(markdownInput)

It works like an 'allowlist' for HTML attributes and their values. I.e. classNames and styles.
If a class or style is not explicitly provided, it will be removed.
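
The allowlist behaviour can be modelled with a toy filter. This is a simplified sketch of the idea, not the hast-util-sanitize implementation, and `filterProperties` is a hypothetical name. It assumes a definition is either a bare attribute name (any value allowed) or an array of the name followed by the allowed values.

```javascript
// Toy model of attribute allowlisting; NOT the real sanitizer.
// A definition is 'name' (any value) or ['name', ...allowedValues].
function filterProperties (properties, definitions) {
  const result = {}
  for (const definition of definitions) {
    const [name, ...allowed] = Array.isArray(definition) ? definition : [definition]
    if (!(name in properties)) continue
    const value = properties[name]
    if (allowed.length === 0) {
      result[name] = value // no value list: keep whatever is there
    } else if (Array.isArray(value)) {
      // list-like attributes such as className are filtered item by item
      const kept = value.filter((item) => allowed.includes(item))
      if (kept.length > 0) result[name] = kept
    } else if (allowed.includes(value)) {
      result[name] = value // single strings such as style must match exactly
    }
  }
  return result
}

const preDefinitions = [
  ['className', 'torchlight', 'has-focus-lines'],
  ['style', 'background-color: #2e3440ff']
]

const cleaned = filterProperties(
  { className: ['torchlight', 'evil'], style: 'background-color: red', onclick: 'x' },
  preDefinitions
)
console.log(cleaned)
```

In this model a `style` value survives only if the whole string is listed, which matches the observation that full style strings have to be spelled out in the schema.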

Example extended schema

There are numerous quirks with this.

But it allows most of the default torchlight theme.

import { defaultSchema } from 'rehype-sanitize' // or from 'hast-util-sanitize'

const extendedSchema = {
  ...defaultSchema,
  attributes: {
    ...defaultSchema.attributes,
    '*': ['style', 'className'], // this `*` wildcard syntax allows these attributes on any HTML element.
    pre: [
      ...(defaultSchema.attributes.pre || []),
      ['className', 'torchlight', 'has-highlight-lines', 'has-focus-lines', 'torchlight has-highlight-lines has-focus-lines'], // i.e. allow these classes on the 'pre' element
      ['style']
      // ['style', 'background-color: #2e3440ff; --theme-selection-background: #88c0d099;'], // this has to be uncommented or a full string like this
    ],
    code: [
      ...(defaultSchema.attributes.code || []),
      ['className', 'language-js', 'js', 'has-focus-lines', 'torchlight', 'has-highlight-lines', 'has-focus-lines']
    ],
    div: [
      ...(defaultSchema.attributes.div || []),
      ['className', 'line', 'line-focus', 'line-highlight', 'line-focus', 'line-has-background', 'yourCustomClass'],
      ['style', 'background-color: #3b4252', 'background - color: #3b4252; '],
      ['id', 'customId'],
    ],
    span: [
      ...(defaultSchema.attributes.span || []),
      ['className', 'line-number'],
      ['style', 'color: #D8DEE9;', 'color: #88C0D0;', 'color: #A3BE8C;', 'color: #88C0D0;', 'color: #D8DEE9FF;', 'color: #ECEFF4;', 'color: #d8dee9;', 'color:#d8dee9; text-align: right; -webkit-user-select: none; user-select: none;', 'color:#4c566a; text-align: right; -webkit-user-select: none; user-select: none;'], // long strings required for line number style
    ]
  }
}

Creating a robust torchlight schema likely requires exporting some dictionaries from the core torchlight library (or from its underlying shiki processor). For example, a map of all supported language keys.
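
If such a list of language keys were available, the `code` entry of the schema could be generated instead of typed by hand. A minimal sketch, assuming the `language-<key>` plus bare `<key>` class pattern from the example schema; `codeClassDefinition` and `exampleLanguages` are hypothetical, and no such export exists in torchlight as far as this thread shows.

```javascript
// Hypothetical helper: turn language keys into a className definition
// for `code` elements, e.g. ['className', 'language-js', 'js'].
function codeClassDefinition (languageKeys) {
  const classNames = languageKeys.flatMap((key) => ['language-' + key, key])
  return ['className', ...new Set(classNames)] // Set drops duplicate entries
}

// Illustrative list only; the real keys would have to come from torchlight.
const exampleLanguages = ['js', 'php', 'html']
const codeDefinition = codeClassDefinition(exampleLanguages)
console.log(codeDefinition)
```

The result could be appended to `defaultSchema.attributes.code` in the same way as the hand-written entry above.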
