Browser executable-less preprocessor #1469
Comments
Definitely. I've hit this on Dreamhost too. The problem in the past (years ago!) was that JSDOM was super-duper slow for whatever reason, especially on big specs, so it wasn't feasible to use it. That's probably been addressed since then, but someone would need to try it out. |
I thought I'd report here because I've been investigating how to make Respec run in JSDOM (for Reffy). Good news is it's mostly doable. Bad news is it's not exactly straightforward... The tweaks that I needed to make to JSDOM and Respec in Reffy are parked in a
With the workarounds and limitations mentioned above, I manage to run Reffy on all W3C specs that Reffy knows about, and the outcome seems good! |
Thanks for sharing your experience! Yeah, I also tried JSDOM and faced those limits. Maybe we should change the whole logic. Some personal thoughts:
|
@tidoust, I thought Reffy ran against "static" documents? If we had a Reffy web service, ReSpec could simply post a static snapshot directly from the browser without needing to run any transformations on the server. To rephrase: what is Reffy's input? |
Reffy's goal is to generate a knowledge graph of specs and an analysis of that graph to create a list of potential anomalies. To generate the knowledge graph, Reffy:
The input of the Reffy web service that we talked about is slightly different. It would take as input:
From that, it would simply have to parse the provided spec and analyse the result against the knowledge graph. No transformation is needed on the server for that (unless the input is the raw source, obviously). I started to develop such a service (see reffy-service). I haven't had time to test it with I'm now looking at ways to make the service more robust before we deploy it on one of our servers. The investigation reported here is part of that, as an attempt to reduce the number of dependencies that would have to run on the server. |
@tidoust thanks for clarifying. @saschanaz, do you know if puppeteer runs without needing a virtual screen? If yes, then that addresses the GPU-less CLI environment problem in one sense... still a fat dependency, but at least we don't need to refactor code significantly to backport ReSpec to JSDOM. |
puppeteer runs in headless mode by default, so yes. |
@tidoust, given that Puppeteer runs without needing the virtual display, is it a viable alternative to JSDOM for Reffy? |
Sure! I was not reporting these issues as requests to change ReSpec. The problem here is not how ReSpec is written, it is JSDOM not yet supporting enough features for ReSpec. I just wanted to document the current limitations somewhere. Actually, we still have a virtual screen for now on the server, but maybe that's no longer needed with Puppeteer (it was for nightmare). We just haven't tried without. But that was not the reason why I wanted to get rid of RespecDocWriter, it was mostly to simplify dependencies and avoid having to run a full chromeless browser on the server. |
Ok cool. Sorry I was confused. I appreciate the documentation above also, so thanks for that!
Understood. Ok, if there is anything else we can do, happy to try to help. Reffy looks super cool and if we can integrate somehow directly (via POST or whatever), happy to help with that however we can! |
Reffy needs to run ReSpec on Editor's Drafts. It relied on RespecDocWriter until now, which uses Puppeteer, which in turn relies on a headless version of Chrome. That requires downloading 100s of MB of dependencies and is a tad heavy for the task at hand, especially as we consider running Reffy as an HTTP service.

With this update, RespecDocWriter is no longer being used. Instead, Reffy makes ReSpec run in JSDOM, which is much more lightweight (and JS-based). Problem is that JSDOM does not yet support enough features to run ReSpec. Concrete issues are documented in: https://github.com/w3c/respec/issues/1469#issuecomment-388560878

This update monkey-patches JSDOM's code to add good-enough implementations of missing features, as well as to regain control over network requests so that all network requests go through our cache. It also monkey-patches ReSpec to:
1. drop a couple of modules that cannot run because they use non-implemented features of JSDOM that cannot be easily patched
2. update the code of a couple of ReSpec dependencies for the same reason

The whole thing remains fragile: updates to ReSpec or to its dependencies may well break the regular expressions used to monkey-patch the code. Most of the monkey-patching should resist minor changes. A couple of the patches may not, though.
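The fragility of regex-based monkey-patching described above can be made visible with a guard that fails loudly as soon as a pattern stops matching. This is only a minimal sketch, not Reffy's actual code: `applyPatches`, the patch list, and the sample source string are all hypothetical and stand in for whatever the real patches target.

```javascript
// Hypothetical helper (not Reffy's implementation): apply regex-based
// patches to a source string, throwing when a pattern no longer matches,
// e.g. after an upstream update changed the code being patched.
function applyPatches(source, patches) {
  let patched = source;
  for (const { name, pattern, replacement } of patches) {
    // Patterns are expected to be non-global, so `test` keeps no state.
    if (!pattern.test(patched)) {
      throw new Error(`Patch "${name}" no longer matches the source`);
    }
    patched = patched.replace(pattern, replacement);
  }
  return patched;
}

// Illustrative input only; this is not actual ReSpec code.
const exampleSource = 'import "./core/fancy-module.js";\nrun();';
const examplePatches = [
  {
    name: 'drop-fancy-module',
    pattern: /import "\.\/core\/fancy-module\.js";\n/,
    replacement: '',
  },
];

const patchedSource = applyPatches(exampleSource, examplePatches);
console.log(patchedSource);
```

Throwing on a missing match turns a silent mis-patch into an immediate failure, which is easier to notice when a dependency is updated.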
@marcoscaceres This has always been my ultimate purpose: lighter, binary-less dependencies, a scriptable API, easier testing without Karma, and easier debugging. Could we bake an experimental build with minimal features to check what should be done? I'm currently not sure which feature to start with, though. Edit: My vague thought:
|
Excited to see where this ends up! |
I got a blocker from hyperHTML, which cannot be imported on node.js. Maybe we have to fork |
Pinged WebReflection for guidance. |
hyperHTML is 100% code covered via node, all you need to do is to write I've used basichtml to serve output even in cloudflare so everything you need should be there. However, viperHTML is always an alternative if you need just plain SSR 👋 |
So it looks like |
@saschanaz if you have If you don't you can use an empty i.e.
const window = {};
require('basichtml').init({window});
const {customElements, document, HTMLElement} = window;
customElements.define('hello-world', class extends HTMLElement {
  connectedCallback() {
    this.textContent = 'Hello world!';
  }
});
// workaround to execute with a temporary global document
const $ = fn => {
  const oldDocument = global.document;
  global.document = document;
  const result = fn();
  if (oldDocument)
    global.document = oldDocument;
  else
    delete global.document;
  return result;
};
const {bind} = $(() => require('hyperhtml'));
$(() => {
  bind(document.documentElement)`
<head>
<title>This is basicHTML</title>
</head>
<body><hello-world/></body>`;
});
console.log(document.toString());
/*
<!DOCTYPE html><html><head>
<title>This is basicHTML</title>
</head><body><hello-world>Hello world!</hello-world></body></html>
*/
but then again, if you have already a |
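One caveat with the `$` helper above: if `fn` throws, the temporary `global.document` is never restored. A possible variant, sketched here under the assumption that restoring state matters to the caller, wraps the call in `try`/`finally`. The name `withGlobalDocument` and the stand-in document object are hypothetical, used only for illustration.

```javascript
// Variant of the `$` helper: temporarily expose `doc` as the global
// `document`, and restore the previous state even if `fn` throws.
function withGlobalDocument(doc, fn) {
  const hadDocument = 'document' in globalThis;
  const oldDocument = globalThis.document;
  globalThis.document = doc;
  try {
    return fn();
  } finally {
    if (hadDocument) {
      globalThis.document = oldDocument;
    } else {
      delete globalThis.document;
    }
  }
}

// Usage with a stand-in object; a basicHTML or jsdom document would go here.
const fakeDocument = { title: 'stand-in' };
const seenTitle = withGlobalDocument(fakeDocument, () => globalThis.document.title);
console.log(seenTitle);
```

Checking `'document' in globalThis` rather than the truthiness of the old value also restores a pre-existing `document` that happens to be falsy.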
I mean Currently I decided to use viperhtml for template string and parse the result with jsdom again, as I'm currently trying to remove any global or module level states. It's still a draft, so maybe I could migrate to |
that is not a real-world use case, but even if you need to do that, you can pass an object instead, so nothing gets polluted, and jsdom will work.
You are contradicting yourself, since jsdom pollutes the global scope, right?
Sure, whatever works best. I don't even know what your actual goal is (had no time to read this whole thread, I just jumped in at "I got stuck due to node" when all I do is use node for everything I develop on the web 😁) |
AFAIK it doesn't, does it? |
actually my bad, it doesn't. I've seen tests written with Then again, neither does In regards of |
@tidoust Hi! Unfortunately we found that #2187 introduces too many changes and creates additional maintenance burden to us. (Even when it might give some hypothetical wins 👀) I want to hear your current need again to decide the future of #2187. Could you use |
Hello @saschanaz,
I don't think that there is anything blocking us from using The one thing that I do now, and which I don't know if I can easily do if we switch to I totally understand the maintenance burden argument. I actually considered using a forked version of Respec with #2187 baked in... and decided against it for the very same reason. +@dontcallmedom who was the one advocating for the switch to JSDOM in the first place. |
I agree it'd be a nice-to-have; if it's too costly to maintain, then we'll absorb the cost of running with headless Chromium. |
Okay, that means this is now officially low-priority unless we find an easier way. @marcoscaceres I still want to keep #2187 open to 1) extract some good things from there 2) maybe just for me to experiment with things. Sounds good? |
Sounds good. Yes, let’s grab whatever goodies we can from #2187. |
Making the latest version of Respec run under JSDOM is too much of a maintenance burden in the end: https://github.com/w3c/respec/issues/1469#issuecomment-546790835

This update makes Reffy use respecDocWriter (which uses Puppeteer, and thus a headless browser, under the hood) again. This allows running the latest version of Respec.

JSDOM is still used to load specifications that do not use Respec and to detect when a specification uses Respec. Note the code no longer runs external scripts when it loads specifications: that seems unneeded in practice. The code still contains minimal monkey-patching for JSDOM because a number of specifications do have inline scripts that call `window.matchMedia`, which JSDOM does not support.

This will fix #134.
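For illustration, the kind of minimal `window.matchMedia` patch mentioned above could look like the following. This is a sketch under assumptions, not Reffy's actual patch: `installMatchMediaStub` is a hypothetical name, and the stub simply reports that no media query ever matches.

```javascript
// Hypothetical stub (not Reffy's actual patch): give a window-like object
// a `matchMedia` that accepts any query and never matches.
function installMatchMediaStub(window) {
  if (typeof window.matchMedia === 'function') {
    return; // keep a real implementation if one exists
  }
  window.matchMedia = query => ({
    matches: false,
    media: String(query),
    onchange: null,
    addListener() {},       // legacy listener API, no-op
    removeListener() {},
    addEventListener() {},  // modern listener API, no-op
    removeEventListener() {},
    dispatchEvent() { return false; },
  });
}

// Demonstrated on a plain object standing in for a JSDOM window.
const fakeWindow = {};
installMatchMediaStub(fakeWindow);
const mediaResult = fakeWindow.matchMedia('(prefers-color-scheme: dark)');
console.log(mediaResult.matches, mediaResult.media);
```

With jsdom, one possible place to call such a function is the `beforeParse(window)` constructor option, so that the stub is in place before inline scripts run.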
A new possibility: someone made a JS-to-binary compiler that also supports WASM. Not sure whether it's a valid option, though. |
ReSpec shouldn't require installing real browsers to use `respec2html.js`, as jsdom supports browser-like DOM manipulation in Node.js. Dropping the Nightmare (and so Electron) dependency will allow easier build environment setup on GPU-less CLI environments.