With version 1.0+ this module is ESM only: Node 12+ is needed to use it and it must be imported instead of required.
This rehype plugin matches W3C-style annotations to their targets in the parsed HTML file. It wraps text range selections in mark elements and adds attributes and hooks to matched locations that can be used in other processors or browser scripts to implement further behaviours.
Note: this modifies the original tree and in some cases can add class attributes. Make sure to sanitise the tree afterwards.
The script does not embed annotation-provided styles, although that is on the roadmap. We do assign the values of the styleClass property to the annotated nodes so the client can provide their own stylesheet.
License: Apache 2.0
We haven't yet published this package on npm, but you can install it directly from the GitHub repository.
npm:
npm install RebusFoundation/rehype-annotate
rehype-annotate should be used as a rehype or unified plugin to match annotations to a hast syntax tree.
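// This module is ESM only, so it is loaded with import rather than require.
// The default imports below assume the CommonJS-era releases of the helper
// packages; their later ESM-only releases use named exports instead
// (for example: import { unified } from "unified").
import { createRequire } from "module";
import vfile from "to-vfile";
import unified from "unified";
// Assumes the plugin is the module's default export.
import annotate from "rehype-annotate";
import parse from "rehype-parse";
import stringify from "rehype-stringify";
import report from "vfile-reporter";

// require is not defined in ES modules; recreate it to load the JSON file.
const require = createRequire(import.meta.url);

// Parse the HTML file, match the annotations, and serialise the result.
async function annotateFile(file, options) {
  return unified()
    .use(parse)
    .use(annotate, options)
    .use(stringify)
    .process(await vfile.read(file));
}

const options = {
  // Should be an array of W3C Web Annotations
  annotations: require("./path/to/annotations/json"),
  // the base url for the original html
  url: "https://syndicated.example.com/annotated.html",
  // the canonical url for the html
  canonical: "https://example.org/annotated.html",
};

annotateFile("path/to/example/htmlfile.html", options)
  .then((file) => {
    console.log(report(file));
    console.log(String(file));
  })
  .catch((err) => console.error(err));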
The above code prints any issues that were found to the console, followed by the processed HTML.
The file.data.annotations property will contain the annotations that have been matched to the HTML, in the order that they appear in the HTML file itself.
Configure processor to modify the hast syntax tree to match annotations to their target locations in the HTML.
The following attributes are added when an element node is matched by a selector:
data-annotation-id: the id of the matched annotation.
data-annotation-motivation: space-separated list of the motivations from the annotation's motivation property.
data-annotation-purpose: space-separated list of the purposes from the annotation body's purpose property.
class: the value of the annotation's styleClass property is added when present.
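As a rough illustration of the attribute list above, the following sketch derives the same attribute values from an annotation object. The helpers `toList` and `annotationAttributes` are hypothetical and not part of rehype-annotate. The sketch assumes that `motivation` and each body's `purpose` may be either a string or an array, and that `styleClass` is read from the annotation's `target`, as it is in the examples in this document.

```javascript
// Hypothetical helpers for illustration only; not rehype-annotate's API.

// Normalise a value that may be missing, a single item, or an array.
function toList(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Build the attribute map described above from a W3C-style annotation.
function annotationAttributes(annotation) {
  const purposes = toList(annotation.body).flatMap((body) =>
    toList(body.purpose)
  );
  const attributes = {
    "data-annotation-id": annotation.id,
    "data-annotation-motivation": toList(annotation.motivation).join(" "),
    "data-annotation-purpose": purposes.join(" "),
  };
  // Assumption: styleClass sits on the target, as in this document's examples.
  const styleClass = annotation.target && annotation.target.styleClass;
  if (styleClass) attributes.class = styleClass;
  return attributes;
}

console.log(
  annotationAttributes({
    id: "http://example.com/annotations1",
    motivation: "bookmarking",
    body: [
      { type: "TextualBody", purpose: "tagging", value: "love" },
      { type: "TextualBody", purpose: "describing", value: "<p>j'adore !</p>" },
    ],
    target: { styleClass: "Bookmarked" },
  })
);
```

The real plugin writes these values onto hast element properties; the plain object here only mirrors the resulting HTML attributes.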
If the source HTML is as follows:
<h2 id="test-id">Stirs ending exceeding fond muster fall Bagshot.</h2>
And rehype-annotate is run with the following annotation:
{
  "id": "http://example.com/annotations1",
  "type": "Annotation",
  "motivation": "bookmarking",
  "creator": {
    "id": "http://example.org/user1",
    "type": "Person",
    "name": "A. Person",
    "nickname": "user1"
  },
  "created": "2015-10-13T13:00:00Z",
  "stylesheet": {
    "id": "http://example.org/stylesheet1",
    "type": "CssStylesheet"
  },
  "body": [
    {
      "type": "TextualBody",
      "purpose": "tagging",
      "value": "love"
    },
    {
      "type": "TextualBody",
      "value": "<p>j'adore !</p>",
      "format": "text/html",
      "language": "fr",
      "purpose": "describing"
    }
  ],
  "target": {
    "source": "https://example.com/tests/fixtures/fragment-multibody.input.html",
    "styleClass": "Bookmarked",
    "selector": {
      "type": "FragmentSelector",
      "value": "test-id"
    }
  }
}
Then the result should be (provided the url or canonical options match the source):
<h2
  id="test-id"
  data-annotation-id="http://example.com/annotations1"
  data-annotation-motivation="bookmarking"
  class="Bookmarked"
  data-annotation-purpose="tagging describing"
>
  Stirs ending exceeding fond muster fall Bagshot.
</h2>
If the source HTML is as follows:
<p>
  Resilient Garulf key quest abandon knives lifted niceties tonight disappeared
  strongest plates. Farthing ginger large. Nobody tosses a Dwarf. Makes
  Shadowfax nearly lesser south deceive hates 22nd missing others!
</p>
And rehype-annotate is run with the following annotation:
{
  "id": "http://example.com/annotations1",
  "type": "Annotation",
  "motivation": "highlighting",
  "created": "2015-10-13T13:00:00Z",
  "body": [
    {
      "type": "TextualBody",
      "value": "<p>j'adore !</p>",
      "format": "text/html",
      "language": "fr",
      "purpose": "commenting"
    }
  ],
  "target": {
    "source": "https://example.com/tests/fixtures/text-quote.input.html",
    "styleClass": "Bookmarked",
    "selector": {
      "type": "TextQuoteSelector",
      "exact": "Resilient Garulf key quest abandon knives"
    }
  }
}
Then the result should be (provided the url or canonical options match the source):
<p>
  <mark
    data-annotation-id="http://example.com/annotations1"
    data-annotation-motivation="highlighting"
    class="Bookmarked"
    data-annotation-purpose="commenting"
    >Resilient Garulf key quest abandon knives</mark
  >
  lifted niceties tonight disappeared strongest plates. Farthing ginger large.
  Nobody tosses a Dwarf. Makes Shadowfax nearly lesser south deceive hates 22nd
  missing others!
</p>
options.annotations: an array of annotations that conform to the W3C Web Annotation Data Model. See the note below on selector support.
An annotation is only matched to the HTML source if its annotation.target.source property matches either canonical or url.
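The matching rule can be pictured with the sketch below. `targetMatchesDocument` is a hypothetical helper, not the plugin's own code. It assumes a plain string comparison, and it also accepts a target given as a bare URL string; the real implementation may normalise or compare URLs differently.

```javascript
// Hypothetical helper for illustration; not rehype-annotate's implementation.
// Returns true when the annotation's target source equals either option.
function targetMatchesDocument(annotation, { url, canonical } = {}) {
  const target = annotation.target;
  // Assumption: a target may be a bare URL string or an object with `source`.
  const source = typeof target === "string" ? target : target && target.source;
  if (!source) return false;
  return source === url || source === canonical;
}

const options = {
  url: "https://syndicated.example.com/annotated.html",
  canonical: "https://example.org/annotated.html",
};

const annotations = [
  { id: "a1", target: { source: "https://example.org/annotated.html" } },
  { id: "a2", target: { source: "https://example.com/other.html" } },
];

// Keep only the annotations that apply to this document.
const applicable = annotations.filter((annotation) =>
  targetMatchesDocument(annotation, options)
);
console.log(applicable.map((annotation) => annotation.id));
```

Filtering up front like this can help explain why an annotation was silently skipped: its source matched neither url nor canonical.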
Selector support:
CssSelector: limited to the selectors supported by hast-util-select.
XPathSelector: because rehype/hast doesn't come with built-in XPath support, these selectors only work when they are very simple, e.g. /html/body/p[1].
FragmentSelector: supports only HTML fragment ids.
RangeSelector: supported when both startSelector and endSelector resolve to element nodes.
TextQuoteSelector: the implementation is loosely based on the excellent dom-anchor-text-quote by Randall Leeds.
TextPositionSelector
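To give a sense of what "very simple" can mean for an XPathSelector, here is a sketch that resolves an absolute path made only of element names and optional 1-based positions, such as /html/body/p[1], against a hast-like tree. The function `resolveSimpleXPath` is hypothetical and is not how rehype-annotate is implemented; any other XPath syntax is simply rejected.

```javascript
// Hypothetical sketch; not rehype-annotate's implementation.
// Supports only absolute paths of the form /name/name[2]/... on a hast-like tree.
function resolveSimpleXPath(tree, xpath) {
  // Reject relative paths and the descendant axis (//).
  if (!xpath.startsWith("/") || xpath.includes("//")) return null;
  const steps = xpath.split("/").filter(Boolean);
  if (steps.length === 0) return null;

  let node = tree;
  for (const step of steps) {
    const match = /^([A-Za-z][\w-]*)(?:\[(\d+)\])?$/.exec(step);
    if (!match) return null; // predicates, axes, functions: unsupported
    const tagName = match[1].toLowerCase();
    // A step without a position is treated as the first matching child.
    const position = match[2] ? Number(match[2]) : 1;
    const candidates = (node.children || []).filter(
      (child) => child.type === "element" && child.tagName === tagName
    );
    node = candidates[position - 1]; // XPath positions are 1-based
    if (!node) return null;
  }
  return node;
}

// A tiny hast-like tree: <html><body><p>first</p><p>second</p></body></html>
const tree = {
  type: "root",
  children: [
    {
      type: "element",
      tagName: "html",
      children: [
        {
          type: "element",
          tagName: "body",
          children: [
            { type: "element", tagName: "p", children: [{ type: "text", value: "first" }] },
            { type: "element", tagName: "p", children: [{ type: "text", value: "second" }] },
          ],
        },
      ],
    },
  ],
};

console.log(resolveSimpleXPath(tree, "/html/body/p[2]").children[0].value);
```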
For performance reasons, text quote and text position selectors that overlap each other in the document are not supported.
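Because overlapping ranges are not supported, it may be worth checking a set of selectors before running the plugin. The sketch below does this for TextPositionSelector values only, using their start and end offsets (end is exclusive in the W3C model). `findOverlappingPositions` is a hypothetical helper and not part of the plugin; text quote selectors would first have to be resolved to offsets, which this sketch does not attempt.

```javascript
// Hypothetical pre-flight check; not part of rehype-annotate.
// Returns pairs [earlier, later] where `later` starts before `earlier` ends.
function findOverlappingPositions(selectors) {
  const sorted = selectors
    .filter((selector) => selector.type === "TextPositionSelector")
    .sort((a, b) => a.start - b.start);

  const overlaps = [];
  let widest = null; // the range with the furthest end seen so far
  for (const current of sorted) {
    // `end` is exclusive, so ranges that merely touch do not overlap.
    if (widest && current.start < widest.end) {
      overlaps.push([widest, current]);
    }
    if (!widest || current.end > widest.end) widest = current;
  }
  return overlaps;
}

const overlaps = findOverlappingPositions([
  { type: "TextPositionSelector", start: 0, end: 10 },
  { type: "TextPositionSelector", start: 5, end: 15 },
  { type: "TextPositionSelector", start: 15, end: 20 },
]);
console.log(overlaps.length); // 1: only the first two ranges overlap
```

Each reported pair names the widest earlier range, so the result answers "is anything overlapping, and where" rather than listing every overlapping pair.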
You can use the refinedBy property on a selector that resolves to a single element node (CssSelector, XPathSelector, FragmentSelector) to create a new scope or root for another selector, including the TextPositionSelector or the TextQuoteSelector.
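For instance, a FragmentSelector can scope a TextQuoteSelector to a single element. The sketch below writes such a selector as a JavaScript object, together with a small hypothetical helper, `selectorChain`, that lists the selector types in the order they would be applied. The id `test-id` and the quoted words are borrowed from the fragment example earlier in this document; the helper is not part of rehype-annotate.

```javascript
// A selector that first resolves the element with id "test-id",
// then looks for the quote only inside that element.
const selector = {
  type: "FragmentSelector",
  value: "test-id",
  refinedBy: {
    type: "TextQuoteSelector",
    exact: "fond muster",
  },
};

// Hypothetical helper: walk the refinedBy chain from outermost to innermost.
function selectorChain(start) {
  const chain = [];
  for (let current = start; current; current = current.refinedBy) {
    chain.push(current.type);
  }
  return chain;
}

console.log(selectorChain(selector)); // [ 'FragmentSelector', 'TextQuoteSelector' ]
```

Scoping a quote this way narrows the text that has to be searched, which can also help disambiguate a quote that occurs more than once in the document.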