This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

disambiguate "web compatibility" #142

Closed
bmeck opened this issue Jun 28, 2018 · 113 comments

Comments

@bmeck
Member

bmeck commented Jun 28, 2018

Similar to "transparent" interop, the term "web compatibility" is a bit muddy in usage.

I think we need to discern 2 main things:

  1. Code compatibility concerns: if users are being encouraged to/must produce code that cannot run on web browsers. Examples include:
    • using things that are not present on import.meta in browsers
    • using magic variables like __filename which are not going to be present in the browser
    • using modules written in CJS
  2. Platform compatibility concerns: if the Node implementation of ESM varies from the WHATWG implementation in a way that prevents people from shipping an ESM module graph that works in Node to the web. Examples include:
    • having a different cache mechanism for indirection
    • having a different value for import.meta.url
    • having a different resolution algorithm

I want to be very clear that platform concerns are about how ESM graphs are loaded and what contextual data is provided to those modules. The underlying formats used to ship modules are unaffected and may be overloaded through different container mechanisms, such as webpackage or BinaryAST.

Notably, I would like to draw a clear line on whether importing non-ESM is a compatibility concern on the Code level or the Platform level.

I believe very firmly that importing non-ESM is neither.

  • For the Code compatibility concern, the simplest example of this is to show an application that contains only ESM. The Code compatibility concern is about requiring code be written using syntax or APIs that cannot work on the web. If you don't import CJS, you don't have this compatibility concern, even if the feature of importing CJS exists.

  • For the Platform compatibility concern, people can claim that import 'foo'; resolving to CJS in Node is a platform compatibility concern. I want to claim this is a false concern.

    1. Having foo resolve to ESM is still possible either by porting, bundling, or shipping multiple builds. At no point is foo required to be CJS by writing import 'foo';.
    2. Consider the idea of the web shipping a format that Node does not support (such as HTML modules). Nothing prevents the author of an HTML module from taking the inverse approach so that it could be loaded in Node. If we are to state that importing CJS is a platform compatibility concern due to requiring specific dependencies be written in CJS, we would also need to state the inverse for HTML modules, which is false, as import does not require the format of the dependency being loaded to be anything, be it CJS or HTML.

I will leave migration concerns up to @demurgos 's incoming review of the issue, but felt that we should address the topic of what "web compatibility" is in a way that we can be concrete in what breaks when we talk about features.

I am open to changing definitions above, but want to make us be more mindful of using the term "web compatibility" without explaining what breaks by including a feature. Is it the ability to use ESM loading mechanisms in both places, or is it the ability to run the same source text in both places? We should also seek to prioritize if the ability to run the same source text in both places is mandatory given that any usage of CJS will not be possible to run without some assistance in the browser.

@jkrems
Contributor

jkrems commented Jun 28, 2018

I would call import 'cjs' a code compatibility concern, just like __filename or accessing node-specific globals like Buffer. In all cases it's valid syntax but will do different things in the browser (in most cases: it will break in the browser). You can change another part of your project to "fix" it but at that point you changed the code and it is no longer import 'cjs'.

Magically switching import 'cjs' once a library exposes a module also has the disadvantage that we get one of two undesirable outcomes:

  1. Libraries need to keep exporting default as an object containing a copy of their named exports.
  2. Libraries need to treat adding module support as a breaking change.

There's an implicit third outcome, equally undesirable, where adding support for <insert mechanism that allows named exports for CJS> is a breaking change or forces a weird reserved default export.
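To make the outcomes above concrete, here is a rough sketch that models module namespaces as plain objects. The helper names (`namespaceForCjs`, `namespaceForEsm`, `consumerWrittenForCjs`) and the `parse` export are hypothetical, and it assumes a CJS module is exposed only through a `default` export. Real namespaces are built by the loader, not by user code.

```javascript
// Hypothetical model of what a consumer sees when importing 'some-lib'.

// While some-lib is CJS, assume the loader exposes module.exports only as `default`.
function namespaceForCjs(moduleExports) {
  return { default: moduleExports };
}

// Once some-lib ships ESM, its named exports become the namespace keys.
// `keepDefault` models outcome 1: keep a default object mirroring the named exports.
function namespaceForEsm(namedExports, keepDefault) {
  const ns = { ...namedExports };
  if (keepDefault) ns.default = { ...namedExports };
  return ns;
}

// A consumer written against the CJS shape: `import lib from 'some-lib'; lib.parse(...)`.
function consumerWrittenForCjs(ns) {
  if (ns.default === undefined) {
    throw new SyntaxError("module does not provide an export named 'default'");
  }
  return ns.default.parse('1');
}

const exportsOfLib = { parse: (s) => Number(s) };

const before = consumerWrittenForCjs(namespaceForCjs(exportsOfLib));
const afterWithDefault = consumerWrittenForCjs(namespaceForEsm(exportsOfLib, true));

let brokeWithoutDefault = false;
try {
  // Outcome 2: the library drops the default object, so existing consumers break.
  consumerWrittenForCjs(namespaceForEsm(exportsOfLib, false));
} catch (err) {
  brokeWithoutDefault = err instanceof SyntaxError;
}
```

In this model the consumer keeps working only while the library carries the mirrored `default` object; dropping it is what turns adding module support into a breaking change.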

@bmeck
Member Author

bmeck commented Jun 28, 2018

@jkrems can I ask for examples that mandate people to import CJS, or present a path that prevents them from shipping ESM?

What you describe is that there is a lack of feature parity between the platforms. I claim that is not a compatibility concern, since nothing forces people to do something that must be incompatible with the web.

You are describing a migration problem as well, but you are not stating anything that prohibits people from shipping a full ESM graph to the web. Do you have examples of how allowing import of formats not supported by the web prevents importing web supported formats?

@bmeck
Member Author

bmeck commented Jun 28, 2018

@jkrems maybe to clarify a bit: what about import 'cjs'; differs from import.meta.require('cjs') in a way that lets import.meta.require avoid breaking module graphs shipped to both platforms?

@jkrems
Contributor

jkrems commented Jun 28, 2018

what about import 'cjs'; differs from import.meta.require('cjs') in a way that lets import.meta.require avoid breaking module graphs shipped to both platforms?

Nothing. It just makes it immediately apparent that a file containing import.meta.require('some-lib') won't work in the browser. For import 'some-lib' I have to start researching and/or reading some-lib's code.

@bmeck
Member Author

bmeck commented Jun 28, 2018

@jkrems would the notification of linkage failing and certain parse failures not serve the same purpose? In addition as I mention above some-lib could take multiple routes to support shipping both formats. Is your concern just about wanting to define the format of the module being imported at the location it is imported?

@GeoffreyBooth
Member

I’ve thought it was odd that Node allows import './file' but browsers require the full filename including extension, e.g. import './file.js'/.mjs. Ditto for folders resolving to /index.mjs. Regardless of whether you think this is behavior that Node should allow in ESM, I think this automatic resolving of file extensions and of folder entry points is something that should be on the list of code compatibility concerns.

I think there’s a compelling argument to be made for dropping automatic extension resolution and automatic folder index.mjs discovery, in the interest of browser compatibility. The use case for those features, it seems to me, is import interoperability where Node can automatically resolve to .mjs or .js, or index.mjs vs index.js. That’s a valuable use case, to be sure, but I wonder if it shouldn’t be something that users opt into by adding a loader.
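As a rough illustration of the difference, the sketch below contrasts plain URL resolution with a simplified extension and index search. The file set, the `example.com` URLs, and the candidate order are assumptions made for the example; this is not Node's actual algorithm, and it assumes a static server that only answers for files that exist.

```javascript
// Files that exist on the hypothetical server or disk.
const existing = new Set([
  'https://example.com/app/file.mjs',
  'https://example.com/app/lib/index.mjs',
]);

// Browser-style: resolve the specifier as a URL and request exactly that.
function resolveLikeBrowser(specifier, base) {
  const url = new URL(specifier, base).href;
  return existing.has(url) ? url : null;
}

// Node-style (simplified): also try extensions and a folder index.
function resolveWithSearch(specifier, base) {
  const url = new URL(specifier, base).href;
  const candidates = [url, url + '.mjs', url + '.js', url + '/index.mjs', url + '/index.js'];
  return candidates.find((candidate) => existing.has(candidate)) || null;
}

const base = 'https://example.com/app/main.mjs';

const browserResult = resolveLikeBrowser('./file', base);   // null: nothing is served at /app/file
const searchResult = resolveWithSearch('./file', base);     // finds /app/file.mjs
const folderResult = resolveWithSearch('./lib', base);      // finds /app/lib/index.mjs
```

The same specifier `'./file'` loads under the search-based resolver and fails under the URL-only one, which is the source text incompatibility being described.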

Such an approach would pair with the package name maps proposal to provide ways to do things like import 'lodash' in both browsers and Node. Assuming that gets adopted, lots of NPM modules might be otherwise perfectly web-compatible aside from their dropping of file extensions in import statements, for example. It might be a good idea for Node to nudge them in the direction of browser compatibility, by somehow making them opt in to the browser-incompatible syntax.

@bmeck
Member Author

bmeck commented Jun 28, 2018

@GeoffreyBooth I've put the resolution algorithm into Platform compatibility concerns.

@bmeck
Member Author

bmeck commented Jun 28, 2018

I'd also note that even with the example of import 'lodash' which doesn't have an extension it would still be able to resolve to any format, so it is separate from the ability to import non-ESM.

@ljharb
Member

ljharb commented Jun 28, 2018

@jkrems you have to do that research regardless, because any module anywhere in the graph might use fs. Are you suggesting that ESM shouldn't be able to import node's core modules?
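A small sketch of why that research is needed either way: one core import deep in the graph is enough to make the whole graph unloadable on the web. The graph, the module names, and the short core list are made up for illustration.

```javascript
// Hypothetical module graph: each key imports the specifiers in its array.
const graph = {
  './app.mjs': ['./util.mjs', 'some-lib'],
  './util.mjs': [],
  'some-lib': ['./some-lib/internal.mjs'],
  './some-lib/internal.mjs': ['fs'],
};

// A few Node core module names, for illustration only.
const nodeCore = new Set(['fs', 'http', 'path']);

// Depth-first walk that records the first path found from the entry
// to each core module it reaches.
function findCoreImports(entry, seen = new Set(), trail = []) {
  if (seen.has(entry)) return [];
  seen.add(entry);
  const here = [...trail, entry];
  if (nodeCore.has(entry)) return [here];
  const deps = graph[entry] || [];
  return deps.flatMap((dep) => findCoreImports(dep, seen, here));
}

const offenders = findCoreImports('./app.mjs');
```

Here `./app.mjs` never mentions `fs` itself, yet the walk reports it through `some-lib`, regardless of which syntax was used to load that dependency.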

@ljharb
Member

ljharb commented Jun 28, 2018

@GeoffreyBooth a) browsers don't mandate the extension, they use URLs. if the URL lacks an extension, so too can the import path. b) with package name maps, you'll be able to omit extensions in browsers (just like is the best practice in node/npm), and it will work the same without a build step. This is a good thing.
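For readers unfamiliar with the idea, a minimal sketch of map-based resolution follows. The map shape here is a deliberately simplified, hypothetical one (bare specifier to URL path) and is not the proposal's actual format; `resolveSpecifier` and the URLs are likewise invented for the example.

```javascript
// Hypothetical, simplified map from bare specifier to URL path.
const packageMap = {
  lodash: '/node_modules/lodash-es/lodash.js',
};

function resolveSpecifier(specifier, base, map) {
  // Relative, absolute, and scheme-prefixed specifiers resolve as ordinary URLs.
  if (/^(\.\.?\/|\/|[a-z][a-z0-9+.-]*:)/i.test(specifier)) {
    return new URL(specifier, base).href;
  }
  // Bare specifiers consult the map; without an entry there is nothing to load.
  if (!Object.prototype.hasOwnProperty.call(map, specifier)) {
    throw new TypeError(`Cannot resolve bare specifier "${specifier}"`);
  }
  return new URL(map[specifier], base).href;
}

const pageUrl = 'https://example.com/app/index.html';
const lodashUrl = resolveSpecifier('lodash', pageUrl, packageMap);
const localUrl = resolveSpecifier('./file.js', pageUrl, packageMap);
```

The point is only that the extension lives in the map rather than in the import statement, so `import 'lodash'` can stay the same source text on both platforms.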

@GeoffreyBooth
Member

a) browsers don’t mandate the extension, they use URLs. if the URL lacks an extension, so too can the import path.

I don’t quite follow. I didn’t mean to imply that browsers care about extensions, I only meant that they require fully resolvable paths/filenames (yes, as part of a URL). My example import './file.js' is using a relative URL, like <script src="./file.js">. If the URL lacks an extension, doesn’t that just mean that the webserver needs to decide what to serve for that? So for import './file', the webserver would need to either serve an extensionless file named file or know to serve file.js, the same way that many webservers automatically serve index.html for folders?

Since it’s not standard for webservers to resolve an URL ending in file to file.js or file.mjs, it feels like something that Node shouldn’t do either, at least not by default. If it becomes standard, such as via package name maps, sure, that would be great. In that case, though, I would think that Node should do it the same way, through package name maps, rather than its own custom implementation.

@ljharb
Member

ljharb commented Jun 28, 2018

It's something node already does by default, and it's something users expect. Whether it's implemented in terms of package name maps or not, I think that it would be extremely hostile to users to omit.

@MylesBorins
Contributor

MylesBorins commented Jun 28, 2018 via email

@GeoffreyBooth
Member

@ljharb I hear what you’re saying, and the user hostility is why there should still be some way for them to do it: via a loader, via package name maps, or something else. But put simply, either we’re trying to be equivalent with browsers by default or we’re not.

The current behavior certainly wouldn’t change in CommonJS, it’s only ESM we’re discussing here.

I’ve come to accept that Node isn’t going to support everything that current transpilers do, at least not without loaders or other hacks/patches. I think the explanation that “Node has added support for import and export in the same way that browsers support that syntax and ES modules” is a compelling explanation that most users will grasp, and helps defuse complaints about things that don’t behave as users expect or might want.

@ljharb
Member

ljharb commented Jun 28, 2018

I don't think we should be trying to be equivalent by default with browsers - browsers don't have filesystem access, nor a massive CJS ecosystem it needs to retain compatibility with (they have a much more massive legacy ecosystem to retain compatibility with).

@bmeck
Member Author

bmeck commented Jun 28, 2018

But put simply, either we’re trying to be equivalent with browsers by default or we’re not.

I'd note that in all scenarios we are not equivalent. using import.meta.require() is the same level of concern as supporting import 'cjs'; since it also won't work in browsers because the dependency is an unsupported format, not to mention that import.meta.require won't even be shipping in browsers. There is no scenario where we have this false sense of equivalence.

@zenparsing

Is there any advantage to choosing import.meta.require versus import "cjs"?

  • import.meta.require is less complicated to explain
  • import "cjs" can give the user some surprise, if they are expecting named exports
  • import.meta.require nudges authors to intentionally opt-in to an ESM API, and therefore may encourage migration

@GeoffreyBooth
Member

GeoffreyBooth commented Jun 28, 2018

I think the distinction is that, we don’t expect equivalence with browsers whenever CommonJS or interoperability with CommonJS is involved; but in all-ESM mode, I would expect Node to behave the way browsers do. If I can do import './file.js' in a browser, I should be able to do the same in Node. If I can’t do import './file' in a browser, where the file to be loaded has an extension, I wouldn’t expect Node to let me do so (without patching).

@devsnek
Member

devsnek commented Jun 28, 2018

i think that's a bit of a red herring. both node and browsers have non-esm formats they would like to interact with (wasm, html, cjs, c++, etc).

@ljharb
Member

ljharb commented Jun 28, 2018

@zenparsing imo requiring intentional opt-in actually discourages migration :-/

@GeoffreyBooth you can't use document in node; i don't think it's reasonable to assume that "i can do it in a browser" means "i can do it in node", and certainly not the reverse.

@GeoffreyBooth
Member

i don’t think it’s reasonable to assume that “i can do it in a browser” means “i can do it in node”, and certainly not the reverse.

Sure, not for everything. I don’t expect Node to support document, as there’s no document in Node. But there’s no reason Node can’t support import './file.js'. The goal is browser equivalence. Just because there are zillions of examples you can point to of places where Node and browsers diverge doesn’t mean that we should add more if we don’t have to.

Let’s cut to the chase: @ljharb, I assume you want to make import './file' work because you want file-level import interoperability where import can import both a script-goal .js file or an ESM .mjs file. Right? And/or the convenience of leaving out extensions, as is common in Node. Any other reasons?

And so the question is whether those goals are more or less important than enforcing browser equivalence when it comes to the import statements that users type in JavaScript code that could potentially execute in either Node or browsers. We have conflicting goals here, and so it’s a subjective call of which priorities are more valued.

@ljharb
Member

ljharb commented Jun 28, 2018

That's the primary reason, yes - and I think it's much more important. Package name maps allow the browser to work as ergonomically as node does here, which is great.

I'm also happy to have node support package name maps in some form; but I think it's critical that the default implementation work like node already does.

@bmeck
Member Author

bmeck commented Jun 28, 2018

@zenparsing those have counterpoints so it doesn't really aid:

import.meta.require is less complicated to explain

requires knowing the difference in the algorithms, supported formats, loaders, and caches. i presume this is more burden than learning how import treats specific formats.

import "cjs" can give the user some surprise, if they are expecting named exports

clear linking errors remove that point, same as when you import anything that doesn't have a specific named export. surprise is easily fixed.

import.meta.require nudges authors to intentionally opt-in to an ESM API, and therefore may encourage migration

import.meta.require guarantees your source text cannot be used on the web platform. there is no migration path with it.

The only point that seems to have a bigger impact than its counterpoint is the one about surprise and named exports. However, it has clear errors and is easy to learn either by documentation or experimentation using import * as or import().

@GeoffreyBooth you make a point about module resolution but nothing about loading CJS. If you resolved to ./file.js and it was served with application/node it would error, but nothing prevents you from loading ESM on the web by letting Node load application/node. The same error of being unable to load a dependency due to it not being supported by browsers applies if you use import.meta.require or if you have any file that imports a node core library like http.

In addition, you can load ./file:

// this line won't work on the web
// linkage fails so this file doesn't even evaluate
// under the argument for being unable to import things the web cannot it is a compatibility problem
// we should use `import.meta.require('http')`
import http from 'http';

// everything in here doesn't work because the web also doesn't ship these APIs
// we once again are not equivalent
http.createServer((req, res) => {
  const files = {
    '/': {type: 'text/html', body: `<script type=module src=/file></script>`},

    // no extension required, just MIME
    // can also do similar using .htaccess / CDNs / cloud storage providers like S3 / ...
    // <FilesMatch "^[^.]+$"> // change as needed
    //   ForceType text/javascript
    // </FilesMatch>
    '/file': {type: 'text/javascript', body: `console.log(1)`},
  };
  const file = files[req.url];
  if (!file) {
    // e.g. /favicon.ico: reply 404 instead of throwing on file.type
    res.writeHead(404);
    res.end();
    return;
  }
  res.writeHead(200, {'content-type': file.type});
  res.end(file.body);
}).listen(9090);

addendum: fun fact! even if you set *.js to application/node it will load as Script on the web when using non-module type <script> tags! <script type=text/javascript> doesn't check the content type it loads, at all!

@GeoffreyBooth
Member

@bmeck I didn’t mean in my example that file.js was to be assumed to be CommonJS or Script, forgive me for leaving that out. I said above that I was describing an all-ESM environment, but I should’ve made extra clear that I was intending file.js to be an ES module. I understand that’s confusing in this context since in experimental-modules ES modules must be files ending in .mjs. What I had in mind were the examples of how to use import in browsers, such as at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import, where you see lines like:

import * as myModule from '/modules/my-module.js';

Yes, obviously you can write lots and lots of Node code that doesn’t work in browsers, and vice versa. That doesn’t mean there isn’t value in trying to align with browsers where we can. This is one place we can, admittedly at the cost of preventing other use cases that people really want (without loaders/patches). Maybe we’ll decide that browser equivalence here isn’t worth the cost, and we would rather make an exception to the browser equivalence goal for these other use cases that are deemed more important.

@zenparsing

There seem to be a couple of fairly strong arguments against both import.meta.require and import "cjs":

  • import.meta.require injects legacy APIs into the new module system
  • import "cjs" (assuming no named exports) will probably frustrate users when moving from transpiled ESM to native ESM that are expecting named imports

The max-min implementation must provide some way to load CJS from ESM, but given that both are controversial we might need a third max-min option. Wasn't there a "createRequire" proposal or something floating around?

@zenparsing

Also, a meta-note: I'm not really digging the GH emoji responses in the context of this debate. I find them emotionally distracting; I think they make it harder to come to a shared understanding.

@GeoffreyBooth
Member

Also, a meta-note: I’m not really digging the GH emoji responses in the context of this debate. I find them emotionally distracting; I think they make it harder to come to a shared understanding.

Strong +1 to this (almost used an emoji for it). People have a right to express themselves, of course, but 👍s to support one side of a debate only make arguments more heated. It feels like a lot of 👍s are given to express solidarity with one side or another, which doesn’t help forge consensus.

@devsnek
Member

devsnek commented Jun 29, 2018

i like at least 👍 because it lets us gauge agreement without cluttering comments

@ljharb
Member

ljharb commented Jun 29, 2018

I think the frustration of "i can't have named imports from CJS, but oh well, i can still import it and destructure in another line" is way way less of a downside than legacy APIs in the new module system.
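A tiny sketch of that workaround, with the namespace modeled as a plain object. The `cjs-lib` name and its `readConfig` and `version` members are hypothetical, and it assumes a CJS module is exposed only as a `default` export.

```javascript
// Stand-in for what `import lib from 'cjs-lib'` would bind to `lib`,
// assuming CJS modules only provide a default export.
const ns = { default: { readConfig: () => ({ debug: true }), version: '1.0.0' } };

// Equivalent of: import lib from 'cjs-lib';
const lib = ns.default;

// ...followed by destructuring on another line instead of named imports.
const { readConfig, version } = lib;
```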

@bmeck
Member Author

bmeck commented Jun 29, 2018

I can try and expand to comments instead of emojis to give greater clarity in what I agree with within other comments, but already feel I occupy a large amount of text in multiple threads.

@GeoffreyBooth you do have a point with some documentation, but there are plenty of others with named exports from CJS, loading CJS using .js, and loading files while using Node's path resolution algorithm without extensions and using node_modules; some are even saying to use .mjs even if you are purely on the web platform now since Modules don't act like Scripts. I'm not sure having one specific set of documentation say one thing means anything of note when others say different things.

@zenparsing I'm not sure minmax makes sense here because the very nature of being able to load CJS in any form is the heart of the problem if you claim that Node should be equivalent to web browsers. There is not a solution with no downsides because of this. Minmax is about finding the solution with the minimal set of non-controversial features, but the very nature of being able to use any of the CJS on npm or from Node core is the root of the claimed compatibility issues. The only solution would be to be unable to load CJS or Node core because those are being claimed as incompatible with the web. I agree that they are not features that the web provides, but they do not prevent code from being written for both if we don't expose APIs that cannot ship to the web (notably any form of require()). Allowing people to import dependencies that are CJS is vastly different from forcing them to use web incompatible APIs; they have multiple migration paths if they can change the implementation of a dependency to ESM, and can use things like service workers, package name maps, etc. to run on the web without rewriting their source text. Using loader provided synchronous APIs would not work with approaches like package name maps or even service workers.

Per createRequireFunction, it has the same effect as import.meta.require in terms of downsides:

import Module from 'module'; // web is missing feature here, error

// could also just assign this to a variable, but it ends up the same as import.meta.require
import.meta.require = Module.createRequireFunction(import.meta.url);

It moves the web incompatibility to a module rather than all import.meta objects, which has ergonomic problems but prevents our loader from always having parts that are encouraging people to use CJS; they would need to opt into using non-web compat userland by using non-web compat core modules. The key difference in this opt-in vs using a runtime function is that the module graph would fail to load earlier and be easier to find ahead of running code.

I agree with @ljharb on what you see as controversial. I don't think it is a strong enough point to claim that surprise is controversial when documentation, reflection, error messages, etc. allow multiple paths of finding the solution quickly. It also leaves the solution to the consumer so they don't need to wait on upstream changes. If we were to take the API based approach using require, the consumer and the author would both need to change to use ESM based APIs.

@bmeck
Member Author

bmeck commented Jul 7, 2018

Actually, I said that Node in a CommonJS context treats .js as application/node. We haven’t yet decided on how .js should be treated in an ESM context. Per convention, webservers and browsers have decided that .js in an ESM context should be treated as text/javascript. Node could decide the same.

One of the reasons for my disagreement is that I don't see a differing require() and import context; I only see a Node context. Having a single file differ based upon method of consumption is something I was arguing against above.

Really all we know from reading the code is that this file could be loaded as application/node or it could be loaded as text/javascript. That’s the same information we get from the fact that the file extension is .js.

This is not the same information: the UMD wrapper is a well known method to feature detect the environment a Script is being run in and is compatible with multiple environments. The script performs its side effects after detecting the environment it runs in. This is after it is given a MIME and is being evaluated. We have been arguing about MIMEs here and how sometimes multiple MIMEs can apply to a single source text.

You don't need such a complex example to show that some source texts can run in different modes, a simple console.log(123); is doing the same in being able to run in all sorts of ways.
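For reference, a simplified sketch of the kind of feature detection a UMD wrapper does. The `installUmd` helper and the `env` object are stand-ins invented here so the branches can be exercised without real globals; an actual UMD wrapper inspects the ambient `module`, `define`, and global object directly.

```javascript
// Simplified UMD-style detection. `env` stands in for the ambient environment.
function installUmd(env, name, factory) {
  if (typeof env.module === 'object' && env.module !== null && env.module.exports) {
    env.module.exports = factory(); // CommonJS-style environment
    return 'commonjs';
  }
  if (typeof env.define === 'function' && env.define.amd) {
    env.define([], factory); // AMD-style environment
    return 'amd';
  }
  env.root[name] = factory(); // plain script: attach to the global object
  return 'global';
}

const factory = () => ({ hello: () => 'hi' });

// The same wrapper run in two different hypothetical environments.
const cjsEnv = { module: { exports: {} }, root: {} };
const scriptEnv = { root: {} };

const asCjs = installUmd(cjsEnv, 'greeter', factory);
const asScript = installUmd(scriptEnv, 'greeter', factory);
```

The detection happens at evaluation time, after the host has already decided how to parse and load the file, which is the distinction being drawn from a MIME or extension.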

Unfortunately, the presence of a .js extension—even from files loaded from the NPM registry—just isn’t enough to disambiguate files intended for importing into CommonJS versus files intended for importing into a browser environment or ESM (either Node ESM or browser ESM).

For all of Node's existence it has been application/node. I agree that file extension is not enough, but I firmly believe you have to opt out of the existing behavior, not opt out of new differing behaviors.

Since a single file can be both CommonJS and ESM at once, as we see above with jquery.js, we can’t rely on anything inside a .js file or in its filename/extension to disambiguate it. That’s why relying on .js meaning CommonJS, even in the context of Node and even in the context of files loaded from the NPM registry, is unsafe.

How so? You have only shown that there are source texts that work between different formats; we have not proven that formats themselves are irrelevant. The fact that the source text is designed to work in different formats doesn't really mean it mustn't be one, but instead just shows that it can be any of the supported ones, such as application/node. You have shown that .js can have a source text that works in multiple formats, but I don't follow the requirement that we cannot use a default MIME for the file extension. You even argue that we should use a default MIME of text/javascript.

It may be correct most of the time, but “most of the time” is all the more dangerous because then edge cases are truly surprising.

I'm not sure we can have a 100% foolproof solution without defining the disambiguation mechanism. As you have shown and others, there are a plethora of source texts that can run in multiple formats, and some that even result in different execution. The solution here is to stay unambiguous, and defaulting the MIME to a specific format is likely part of that solution.

Before we dive into solutions, can we at least agree on this? That Node allowing import './file.js' to by default assume that file.js is CommonJS based solely on the .js extension is both an unsafe assumption and an incompatibility with how browsers and typical webservers operate?

I agree that there is a compatibility concern due to the ambiguity. I do not agree that assuming it is CJS is unsafe because that is how Node treats all .js files currently and it isn't having problems either in Node, UMD examples you show above, or bundlers if it stays that way. Node is its own context and still needs disambiguation, webservers could use whatever mechanism but that requires extra effort beyond the dumb servers we have been talking about. We already have these concerns you have when CJS files are served as text/javascript though so I'm not sure how the inverse doesn't apply to .js being non-CJS being incompatible with Node.

@ljharb
Member

ljharb commented Jul 7, 2018

I do not agree, because it is not how browsers operate - browsers ignore the file extension, and require an out of band mechanism (the “type” attribute on a script tag) to differentiate. node choosing the file extension as that mechanism is 100% compatible with browsers, and also with webservers that already use file extensions to differentiate file types.

@GeoffreyBooth
Member

node choosing the file extension as that mechanism is 100% compatible with browsers, and also with webservers that already use file extensions to differentiate file types.

Browsers aren’t really the issue here, webservers are. As discussed above, most (if not all) webservers will choose the same MIME type for both .js and .mjs, causing browsers to load both types of files as ESM; and this isn’t likely to change. This is the incompatibility with Node treating .js and .mjs as meaning different things, since the same is not true of most webservers.

I agree that there is a compatibility concern due to the ambiguity. I do not agree that assuming it is CJS is unsafe because that is how Node treats all .js files currently and it isn’t having problems either in Node, UMD examples you show above, or bundlers if it stays that way.

So we at least agree that there’s a problem here, and the issue is what to do about it, which may be nothing. I’ll try to find time on Monday or Tuesday to condense this discussion into a summary that the group can digest.

@ljharb
Member

ljharb commented Jul 8, 2018

The concern isn't the MIME type, it's what script tag "type" they'll choose, since that's the mechanism the browsers use to determine how an entry point is parsed.

@GeoffreyBooth
Member

The concern isn’t the MIME type, it’s what script tag “type” they’ll choose, since that’s the mechanism the browsers use to determine how an entry point is parsed.

The discussion above is around import statements that import .js files. We’re already in ESM mode, in both Node and browsers, to be able to run an import statement at all; the type attribute isn’t relevant by that point. The incompatibility is that an import statement of a .js file will load the file as CommonJS/Script in Node --experimental-modules and as ESM/Module in browsers (as served by a typical webserver).
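A minimal sketch of that mismatch, under the assumptions stated in this thread: `--experimental-modules` picks the format from the extension, while a typical static webserver sends one JavaScript MIME type for both extensions. The function names and the MIME table are illustrative, not any real server's configuration.

```javascript
// Assumption from the thread: under --experimental-modules, `import` treats
// .mjs as ESM and .js as CommonJS.
function nodeExperimentalFormat(filename) {
  if (filename.endsWith('.mjs')) return 'esm';
  if (filename.endsWith('.js')) return 'commonjs';
  return 'unknown';
}

// Assumption: a typical static webserver maps both extensions to one MIME type,
// and a browser loads any JavaScript MIME type reached via `import` as a module.
const serverMimeTypes = { '.js': 'text/javascript', '.mjs': 'text/javascript' };

function browserImportFormat(filename) {
  const ext = filename.slice(filename.lastIndexOf('.'));
  return serverMimeTypes[ext] === 'text/javascript' ? 'esm' : 'error';
}

const files = ['file.js', 'file.mjs'];
const mismatches = files.filter(
  (name) => nodeExperimentalFormat(name) !== browserImportFormat(name)
);
```

Under these assumptions only `file.js` ends up in `mismatches`; `file.mjs` is treated as ESM on both sides.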

Until I write up a summary, you can just look at the demo, including reading the code in that repo. That explains the incompatibility pretty succinctly, and it has working code you can run. The webserver in that demo behaves as a typical real-world webserver would.

@ljharb
Member

ljharb commented Jul 8, 2018

Yes, of course - but you're assuming no build process. Before anything gets to the browser, the files can easily have been preprocessed to do the right thing.

@GeoffreyBooth
Member

Let me also respond to some of @bmeck’s points:

One of the reasons for my disagreement is that I don’t see a differing require() and import context; I only see a Node context. Having a single file differ based upon method of consumption is something I was arguing against above.

I think I’m finally starting to understand where you’re coming from. I suppose that under the hood, Node’s implementation of loading files probably shares a lot of code between import and require, so I can see why as a Node developer you’d see them as nearly equivalent; but from my perspective as a user they’re very different. As a user, I wouldn’t expect require in CommonJS mode to behave equivalently to import in ESM mode. If that was the expectation, then if import './commonjs-file.js' works then so should require('./esm-file.mjs').

Having a single file differ upon method of consumption already happens, at least for initial entry points. Take my demo’s ambiguous.js file. If you have a browser load it via <script src="ambiguous.js"> you get sloppy mode, whereas via <script type="module" src="ambiguous.js"> you get strict mode. So I guess the question is, if the method of consumption is relevant for initial entry points, why shouldn’t it be relevant everywhere?
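The sloppy versus strict difference can be reproduced outside a browser with a short sketch. The source string below is a stand-in, since the contents of the demo's ambiguous.js are not shown in this thread, and a `"use strict"` directive is used to approximate module-goal strictness.

```javascript
// One source text whose result depends on the mode it is evaluated in.
const source = 'return (function () { return this === undefined; })();';

// Script-goal evaluation: sloppy mode, so `this` falls back to the global object.
const asScript = new Function(source)();

// Module code is always strict; a "use strict" directive approximates that here.
const asModule = new Function('"use strict"; ' + source)();
```

`asScript` is `false` and `asModule` is `true`: the same text gives a different result depending only on how it was consumed.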

For all of Node’s existence it has been application/node. I agree that file extension is not enough, but I firmly believe you have to opt out of the existing behavior, not opt out of new differing behaviors.

The user is opting into new behavior by using the import statement. We’ve never had the import statement before, and it’s not equivalent to require; it doesn’t simply inherit require’s existing API. By opting into using the import statement, users are opting into its new syntax and new behaviors.

I’m not sure we can have a 100% foolproof solution without defining the disambiguation mechanism. As you have shown and others, there are a plethora of source texts that can run in multiple formats, and some that even result in different execution. The solution here is to stay unambiguous, and defaulting the MIME to a specific format is likely part of that solution.

Truly staying unambiguous would mean that you would never need to disambiguate.

Unambiguous on the author side would be .mjs for ESM and .cjs for CommonJS. I just don’t think we can really safely assume anything about the author’s intent from a .js extension. By definition, .js is ambiguous.

Unambiguous on the consumer side would be having import only ever import ESM files into ESM mode, like what browsers do. If one wants CommonJS-into-ESM import interoperability, one could use a separate API to achieve that, such as import.meta.require. It’s hard to get more unambiguous than completely separate APIs for importing CommonJS versus importing ESM; I think that’s why @MylesBorins was assuming that import.meta.require would be uncontroversial, since you could always have that in addition to a disambiguating import statement. Having separate APIs seems to be the plan for CommonJS, where the leading proposal seems to be using import() for importing ESM into CommonJS, rather than having require disambiguate CommonJS from ESM by looking for an .mjs extension.
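To make the "separate APIs" idea concrete, here is a minimal sketch of loading ESM through import() rather than through require. The inline data: URL is only a convenience so the example needs no second file; it assumes a runtime (such as a recent Node) that supports data: URLs with a text/javascript MIME in import().

```javascript
// ESM source supplied inline so the sketch is self-contained.
const esmSource = 'export const answer = 42;';
const specifier = 'data:text/javascript,' + encodeURIComponent(esmSource);

// import() is usable from both CommonJS/Script code and Module code.
// It returns a promise for the module namespace object.
const loaded = import(specifier).then((namespace) => namespace.answer);

loaded.then((answer) => console.log(answer));
```

The consumer picks the loading API explicitly here, so require never has to guess whether its target is ESM.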

@bmeck
Member Author

bmeck commented Jul 8, 2018

I think I’m finally starting to understand where you’re coming from. I suppose that under the hood, Node’s implementation of loading files probably shares a lot of code in common between import and require, so I can see why as a Node developer you’d see them as nearly equivalent; but from my perspective as a user they’re very different. As a user, I wouldn’t expect require in CommonJS mode to behave equivalently as import in ESM mode. If that was the expectation, then if import './commonjs-file.js' works then so should require('./esm-file.mjs').

They actually share very little code. I am saying this as someone who has been authoring both ESM and CJS in my Node applications. Having a file of unknown format is problematic for tooling, teaching, defensive programming, etc.

My expectation as an author is that my file doesn't get interpreted in different ways than what I intended it to be interpreted as. That applies to execution and linking, both forwards compatibility and backwards compatibility.

Having a single file differ upon method of consumption already happens, at least for initial entry points. Take my demo’s ambiguous.js file. If you have a browser load it via <script src="ambiguous.js"> you get sloppy mode, whereas via <script type="module" src="ambiguous.js"> you get strict mode. So I guess the question is, if the method of consumption is relevant for initial entry points, why shouldn’t it be relevant everywhere?

There are people from WHATWG that regret it being ambiguous, but given how <script> without type=module works, it was inevitable. I don't think showing other regrettable forced decisions is really a compelling argument.

In particular, it shouldn't be relevant elsewhere because it makes it hard to determine what MIME a given file should have. If a file should be one MIME when loaded one way but another MIME when loaded another way, it becomes unclear how to deal with the file, which matches my concerns above about using both CJS and ESM.

The user is opting into new behavior by using the import statement. We’ve never had the import statement before, and it’s not equivalent to require; it doesn’t simply inherit require’s existing API. By opting into using the import statement, users are opting into its new syntax and new behaviors.

It does not inherit require's implementation, but there has been no opting in by the author; as @demurgos has pointed out, there are both consumer and author migration concerns. The consumer has opted into the new behavior, but the author has not. I am on the side of the author being the authority on how the file needs to be interpreted. So not all "users" are opting in as you stated; only one side of the pair of interacting modules has.

It’s hard to get more unambiguous than completely separate APIs for importing CommonJS versus importing ESM; I think that’s why @MylesBorins was assuming that import.meta.require would be uncontroversial, since you could always have that in addition to a disambiguating import statement. Having separate APIs seems to be the plan for CommonJS, where the leading proposal seems to be using import() for importing ESM into CommonJS, rather than having require disambiguate CommonJS from ESM by looking for an .mjs extension.

This is a large point of controversy to me, because it doesn't make any plans for disambiguating the .js type, but intends to keep using it. The idea represented here is dangerous because it acts as if it doesn't keep ambiguity, but is actually making the ambiguity worse by limiting options for disambiguation and being web incompatible. In particular, people could intentionally be using ambiguous files as you show in your UMD example and we would need to start preserving that behavior.

I absolutely think import.meta.require does not separate CommonJS from ESM since it is operating on the same files and is easy to show as tainting web compatibility.

@GeoffreyBooth
Member

There are people from WHATWG that regret it being ambiguous, but given how <script> without type=module works, it was inevitable. I don’t think showing other regrettable forced decisions is really a compelling argument.

WHATWG could have required a new MIME type (and therefore, a new file extension) for ESM JavaScript—but they didn’t. WHATWG settled on using text/javascript for both Script and Module JavaScript. The fact that their spec uses .js in its examples is proof enough that they don’t encourage file extension disambiguation.

This idea that an individual file needs to control how it’s parsed is really the source of the incompatibility here. That’s not the case on the Web, and if Node insists on it, it will lead to this incompatibility and probably others.

@ljharb
Member

ljharb commented Jul 8, 2018

That’s not the case on the web because browsers made a mistake. We don’t have to repeat their mistake, nor would correcting it cause an incompatibility.

@bmeck
Member Author

bmeck commented Jul 8, 2018

WHATWG could have required a new MIME type (and therefore, a new file extension) for ESM JavaScript—but they didn’t. WHATWG settled on using text/javascript for both Script and Module JavaScript. The fact that their spec uses .js in its examples is proof enough that they don’t encourage file extension disambiguation.

That would not have helped: <script> without type=module would still have loaded whatever MIME they chose as Script, because it has never checked MIMEs. You would still have ambiguity because Script would always be possible.

This idea that an individual file needs to control how it’s parsed is really the source of the incompatibility here. That’s not the case on the Web, and if Node insists on it, it will lead to this incompatibility and probably others.

I'm not sure I understand this point, the web lets you specify how it should be parsed via content-type HTTP headers.

@GeoffreyBooth
Member

I’m not sure I understand this point, the web lets you specify how it should be parsed via content-type HTTP headers.

The idea that an individual JavaScript file needs to control its own parse goal, I mean, is the source of the incompatibility. The Web treats both .js and .mjs files as text/javascript, and both Script and Module .js files as text/javascript, and so therefore the Web simply lacks the concept of author-specified unambiguity. All disambiguation happens on the consumer side.

@bmeck
Member Author

bmeck commented Jul 8, 2018

@GeoffreyBooth and the argument is that we need to preserve this because? To my knowledge the plan is to completely replace the need for <script> with type=module alternatives. Is there an exact concern about what the problem is with disambiguating things?

In particular <script> never paying attention to MIME is precisely why things like WASM won't work with it. That isn't a good path to treat as a beacon of being forwards compatible.

@ljharb
Member

ljharb commented Jul 8, 2018

@GeoffreyBooth do coffeescript users type coffeescript in .js files, or in .coffee files? Is everything required or imported from coffeescript parsed as coffeescript?

@GeoffreyBooth
Member

In particular <script> never paying attention to MIME is precisely why things like WASM won’t work with it. That isn’t a good path to treat as a beacon of being forwards compatible.

WASM is served by webservers as application/wasm, so it doesn’t have the issue that JavaScript Script vs Module has. In WASM’s case, they did choose a new MIME type to represent the new file. And maybe WASM won’t be importable from a <script> tag, but I would think it would be importable via <script type="module">import './app.wasm';</script>.

and the argument is that we need to preserve [the Web’s lack of author-specified disambiguation] because?

Because otherwise we have a major incompatibility with the Web. If I as a package author want to publish a JavaScript Module library for wide use on the Web, I want to publish it as a .js file because that’s much more compatible across the webservers of the world than .mjs is. That’s why it wasn’t a mistake for WHATWG to reject the new MIME type: the benefits of author-specified unambiguity are outweighed by the cost of all the webservers of the world needing updates to serve a new file extension with a new MIME type. A similar cost/benefit analysis applies to Node: Node gets some benefits, sure, from author-specified unambiguity; but the cost is incompatibility with .js ESM files, which the Web allows and encourages, and of which there will soon be many as ESM support in browsers becomes widespread.

Basically, Node doesn’t need author-defined file-level unambiguity. Consumer-defined disambiguation can work, though you may not prefer its syntax. If I had to choose between conflicting goals of allowing authors to enforce the parse goal of their file, versus more compatibility with the Web, I would choose the latter. Authors can always informally specify the parse goal of their file, via filenames like foo.esm.js, to signal to consumers how a file should be consumed. I don’t think the enforcement is all that valuable or even desirable.

I agree with this from the TC39’s discussion of the issue:

TC39 has decided that the host environment can choose to detect a script or module depending on host-specific hints. TC39 just provides the two entry points and allows a given string to be parsed by both of them. The host environment can restrict certain strings from certain sources to only go through one entry point or another, but that’s up to them. Really, you have complete freedom here, and should be glad you don’t get your choices constrained by a standards body.

@ljharb
Member

ljharb commented Jul 8, 2018

The paragraph you quoted also means that node doesn't need to have its choices constrained by browsers' standards body.

@bmeck
Member Author

bmeck commented Jul 9, 2018

WASM is served by webservers as application/wasm, so it doesn’t have the issue that JavaScript Script vs Module has. In WASM’s case, they did choose a new MIME type to represent the new file. And maybe WASM won’t be importable from a <script> tag, but I would think it would be importable via <script type="module">import './app.wasm';</script>.

Indeed this is part of why having ambiguity is problematic. I can create a file:

// test.wasm
console.log(123);

serve it with application/wasm and it still runs in a <script> tag as a Script. Your point about serving Script as text/javascript doesn't apply to anything, because no means of loading that checks for text/javascript loads the file as a Script. I was trying to point out that if there is a claim of ambiguity around text/javascript between files and loading as Script vs Module, the same ambiguity exists for any MIME, including things like WASM. You can serve it so that it loads in type=module using some format, but it always collides with the Script goal of JS due to <script>. I don't see how this claim of ambiguity only affecting .js files is true.

Because otherwise we have a major incompatibility with the Web.

I still don't see the incompatibility issue at heart here. We can still have things treated as text/javascript if you opt-in. In addition, the claim of incompatibility is weak in my eyes because it relies on <script> which as I said earlier applies to all files not just .js files that are served with text/javascript.

There certainly is a difference in default behavior if we choose to continue treating .js as application/node, but I don't see any incompatibility that prevents people from writing code that works on both platforms.

If I as a package author want to publish a JavaScript Module library for wide use on the Web, I want to publish it as a .js file because that’s much more compatible across the webservers of the world than .mjs is.

This seems a fine goal, but it isn't necessarily related to the default behavior of either the web or Node. This could be an opt-in thing, and I'm unsure why it being opt-in is problematic when we are already using non-web-compatible features like package.json in our ecosystem.

That’s why it wasn’t a mistake for WHATWG to reject the new MIME type: the benefits of author-specified unambiguity are outweighed by the cost of all the webservers of the world needing updates to serve a new file extension with a new MIME type.

text/javascript has never had significant meaning on web browsers. It is only used when determining whether something can load as a Module, never as a Script. The MIME remains unambiguous on web browsers, but it seems like you think it means Script as well for browsers; if so, can you clarify how browsers use it in relation to the Script goal?

A similar cost/benefit analysis applies to Node: Node gets some benefits, sure, from author-specified unambiguity; but the cost is incompatibility with .js ESM files, which the Web allows and encourages, and of which there will soon be many as ESM support in browsers becomes widespread.

I think as long as opt-in to treat .js as text/javascript is easy or even automated in some way it isn't costly at all. I am, however, seriously concerned with the idea of not being able to statically determine what format a file is in. It is easier and less bug-prone for the author to clarify intent than for us to make things muddy. As long as the mechanism to enable your use case is simple and efficient, I don't see how this difference in defaults should be blocking. A difference in defaults makes sense, hence why Node was given a standards track MIME instead of a vendored MIME. Node has a significant existing ecosystem, and opting into different behavior could be as easy as adding a "mode" flag to your package.json. In addition, things that disambiguate in a user-configurable fashion allow people to put JSX/Flow/etc. in their .js files while remaining unambiguous (as long as JSX/Flow/etc. make a MIME for their format).
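A minimal sketch of what such an author-side opt-in could look like, to show that the format stays statically determinable. Everything here is an assumption for illustration: the `formatFor` helper, the "mode" field name, and its "esm" value are hypothetical and not an existing Node API or package.json field.

```javascript
// Decide a file's format from author-controlled signals only: the file
// extension and a hypothetical "mode" field in the nearest package.json.
function formatFor(filename, packageJson = {}) {
  if (filename.endsWith('.mjs')) return 'esm';
  if (filename.endsWith('.wasm')) return 'wasm';
  if (filename.endsWith('.js')) {
    // Default keeps existing behavior; the author must opt in to ESM.
    return packageJson.mode === 'esm' ? 'esm' : 'commonjs';
  }
  return 'unknown';
}

console.log(formatFor('lib/index.js'));                   // existing default
console.log(formatFor('lib/index.js', { mode: 'esm' }));  // author opted in
console.log(formatFor('lib/index.mjs'));                  // unambiguous extension
```

The point of the sketch is that the answer depends only on things the author wrote down, never on which statement the consumer used to load the file.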

Basically, Node doesn’t need author-defined file-level unambiguity.

I believe the loss of static guarantees about how files are intended to be run is enough to make it a need even if you disagree.

Consumer-defined disambiguation can work, though you may not prefer its syntax.

I don't think any of my comments so far have been about syntax needing to be a specific way. They are rooted in ambiguity problems.

If I had to choose between conflicting goals of allowing authors to enforce the parse goal of their file, versus more compatibility with the Web, I would choose the latter. Authors can always informally specify the parse goal of their file, via filenames like foo.esm.js, to signal to consumers how a file should be consumed. I don’t think the enforcement is all that valuable or even desirable.

You can have both using an opt in mechanism. I don't understand this comment.

@demurgos demurgos mentioned this issue Jul 9, 2018
@GeoffreyBooth
Member

I think as long as opt-in to treat .js as text/javascript is easy or even automated in some way it isn’t costly at all.

If there’s a way to have Node treat .js files as ESM, yes, the incompatibility goes away. Then the question becomes how that should be implemented, and whether or when it should be the default. This feels like a great stopping point to shift that discussion into a new thread.

@ljharb
Member

ljharb commented Jul 9, 2018

If there's a way for Node to treat .js files as ESM, shouldn't there be a way to treat .js files as WASM, and .wasm files as CJS, and .anything files as "anything"? (I realize this may come off as a sarcastic question, but it's a genuine one.)

@bmeck
Member Author

bmeck commented Jul 9, 2018

@ljharb I would assume so, yes. In particular I'm interested in handling things like Flow/JSX/etc. that also live in .js.

@mathiasbynens

The fact that their spec uses .js in its examples is proof enough that they don’t encourage file extension disambiguation.

The HTML Standard uses both .mjs and .js in its examples for module scripts, as well as URLs without an extension or ending with .cgi. This matches reality where on the web, file extensions don’t matter at all to user agents; HTTP headers do.

But how does one decide which HTTP headers to send out? In practice, it happens based on the file extension as opposed to on a file-by-file basis. In general, you’re gonna have an easier time during development but also when configuring your server by using .mjs for modules and .js for scripts and sticking to it consistently.
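As a sketch of how that extension-based decision typically looks on the server side: the table and the `contentTypeFor` helper below are hypothetical, and real webservers ship their own (often configurable) mappings.

```javascript
// Hypothetical extension-to-MIME table; not taken from any real server.
const MIME_BY_EXTENSION = new Map([
  ['.js', 'text/javascript'],
  ['.mjs', 'text/javascript'],
  ['.wasm', 'application/wasm'],
]);

// Return the Content-Type a server using the table above would send.
function contentTypeFor(urlPath) {
  const dot = urlPath.lastIndexOf('.');
  const slash = urlPath.lastIndexOf('/');
  // There is no extension unless a dot follows the last path separator.
  const extension = dot > slash ? urlPath.slice(dot).toLowerCase() : '';
  return MIME_BY_EXTENSION.get(extension) ?? 'application/octet-stream';
}

console.log(contentTypeFor('/lib/app.mjs'));
console.log(contentTypeFor('/lib/app.js'));
console.log(contentTypeFor('/lib/app.wasm'));
```

With a table like this, both .js and .mjs reach the browser as text/javascript, so the header alone does not tell Script from Module; a consistent extension convention mainly helps the humans and tools configuring the server.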

@bmeck
Member Author

bmeck commented Dec 30, 2018

This topic seems to have cooled and is being addressed on a per phase basis.
