Scope hoisting for ES6 and CommonJS modules #1135

Merged 188 commits on Jun 14, 2018

Contributor

@fathyb fathyb commented Apr 4, 2018

Closes #1104.
Fixes #392.

  • eval
  • UMD
  • Pure ES6 support
  • Import CommonJS from ES6
  • Import ES6 from CommonJS
  • Dynamic import
  • Wildcards (export * from)
    • ES6
    • CommonJS
  • minify using Uglify
  • update the non-HMR tests to use scope-hoisting
  • make scope hoisting the default when HMR is disabled
  • sideEffects package field (partial support)
  • Source map

Output :

  • a.js
import {hello, world} from './b'

console.log(`${hello} ${world}`)
  • b.js
export * from './c'
  • c.js
const localHello = 'hello'
export {localHello as hello}

export const world = 'world'
export const bar = 'tree-shake me if you can'

--no-minify

(function () {
  /* ASSET: 5 - c.js */

  var $5$export$hello = 'hello';

  var $5$export$world = 'world';

  var $5$export$bar = 'tree-shake me if you can';

  /* ASSET: 3 - b.js */


  /* ASSET: 1 - a.js */
  console.log($5$export$hello + ' ' + $5$export$world);
})();

--minify

console.log("hello world");

@fathyb fathyb added 💬 RFC Request For Comments 🙋‍♀️ Feature 📝 WIP Work In Progress 💡POC Proof Of Concept labels Apr 4, 2018
@devongovett
Member

Oh, awesome! I was actually working on something similar. Will take a look in a bit

if (source && specifiers.length > 0) {
if (BREAK_COMMON_JS) {
path.replaceWith(
t.variableDeclaration(
Member

I think it should be possible to avoid these extra variables by just renaming the local references to point to the imported one, e.g.

path.scope.rename(specifier.local, IMPORT_NAME_HERE);

Contributor Author

Better! Fixed

@@ -173,7 +174,10 @@ class JSAsset extends Asset {

 return {
 js: code,
-map: this.sourceMap
+map: this.sourceMap,
+'@reflect': {
Member

Rather than adding these to generated and then removing them later, you could just add them to asset.cacheData. That already gets written to the cache and you should be able to access it from the packager.

Contributor Author

I tried cacheData and I didn't get it to work (cacheData.exports is empty). Should I manually update it? I pushed it

Member

Yeah I think you'll have to add it to the asset in the main process from processed.cacheData here: https://github.com/parcel-bundler/parcel/blob/master/src/Bundler.js#L447

Contributor Author

Perfect, fixed

@devongovett
Member

As for commonjs support, I think you can just create an object at the top of the scope like we had before and add all the exports to it, but also leave the individual variables for ES6 usage like you have here. Those objects will get thrown away by the minifier if they aren't used.

.split('$' + asset.id + '$expand_exports$' + t.toIdentifier(dep.name))
.join('$parcel$expand_exports(' + mod.id + ',' + asset.id + ')');

if (dep.isES6Import) {
Member

We track imports to see if they are from ES6, as well as exports to mark a module as ES6. If they are both ES6, then we can just replace the import names with exports. If a commonjs module is imported with ES6, then the default import resolves to module.exports, and other names resolve to module.exports.NAME.
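
The resolution rule described in this comment can be sketched as a small function. This is only an illustration, not the packager's code: `resolveImport` and the shape of the `mod` argument (`id`, `isES6`) are hypothetical; only the `$id$export$NAME` and `$id$exports` naming follows the output shown in this thread.

```javascript
// Hypothetical sketch: pick the hoisted identifier an ES6 import should point at.
// `mod` is an assumed shape: {id: number, isES6: boolean}.
function resolveImport(mod, name) {
  if (mod.isES6) {
    // ES6 importing ES6: reference the exported variable directly.
    return `$${mod.id}$export$${name}`;
  }
  // ES6 importing CommonJS: the default import is module.exports itself,
  // any other name is a property on it.
  return name === 'default'
    ? `$${mod.id}$exports`
    : `$${mod.id}$exports.${name}`;
}

console.log(resolveImport({id: 5, isES6: true}, 'foo'));      // $5$export$foo
console.log(resolveImport({id: 7, isES6: false}, 'default')); // $7$exports
console.log(resolveImport({id: 7, isES6: false}, 'foo'));     // $7$exports.foo
```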

return binding;
})
);
// super.write(
Member

I think these re-exports are going to be problematic since they are supposed to be live bindings rather than a copy. Commented for now while we figure it out.

// For each specifier, rename the local variables to point to the imported name.
// This will be replaced by the final variable name of the resolved asset in the packager.
for (let specifier of path.node.specifiers) {
if (t.isImportDefaultSpecifier(specifier)) {
Member

support more types of specifiers

)
);
} else if (t.isImportNamespaceSpecifier(specifier)) {
path.scope.rename(
Member

importing a namespace, e.g. import * as blah from 'blah', just does the same thing as a commonjs require.

Contributor Author

@fathyb fathyb Apr 5, 2018

For stuff like this I thought about a Babel transform we could do in the worker, where basically we would replace wildcard imports with named ones if possible (bailing out if the namespace is called). We could also do the same to turn, for example, const {foo} = require('foo') into import {foo} from 'foo'. It may simplify our logic and should be safe enough.

Member

Yeah that could be done. I wonder if minifiers will be able to do that for us though. We're already creating an object with the namespace for every module, so it seems like the minifier should be able to statically resolve those names off the object...


if (!source) {
let declarations;
ExportDefaultDeclaration(path, asset) {
Member

Added default export support

ExportNamedDeclaration(path, asset) {
let {declaration, source, specifiers} = path.node;

if (source) {
Member

Support re-exports, e.g. export {foo} from 'bar'. Again, this will be problematic I realized, and the code to insert a variable here probably won't work right.

} else if (declaration) {
path.replaceWith(declaration);

if (
Member

Added support for a bunch of different types of declarations. Still more to support, see todos.

BINDING: name,
INIT: init
}).declarations[0];
path.insertAfter(
Member

For each export, we don't create a new variable, we just rename the local variable within the module to the exported name. Also, we add a property on the exports object for use in namespaces and commonjs imports of ES6 modules.

@devongovett
Member

Did a few things, and added some inline comments to describe them. Hope you don't mind me committing to your branch, sorry about that. 😜

Currently the only problematic cases are re-exports:

  • export * from 'foo'
  • export {foo} from 'bar'

Until now, we were handling those as variable replacements: we'd make a new variable in the context of the module that is doing the re-exporting, and then dependencies of that module would use the renamed variable instead of the original one.

The problem with this is that if the value of the original binding changes, the renamed one will not. So what we really need to do is resolve re-exported names to their original names in the modules that use them.

Here's an example that illustrates the problem:

a.js

export var foo = 2;
export function changeFoo(x) {
  foo = x;
}

b.js

export * from './a.js'

c.js

import {foo, changeFoo} from './b.js';

console.log(foo) // => 2
changeFoo(10);
console.log(foo) // => 2 (should be 10)
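
The same effect can be reproduced without any bundler by simulating the hoisted scope by hand. In this sketch the variable names (`aFoo`, `aChangeFoo`, `bFooCopy`) are made up for illustration; the point is only that a re-export implemented as a new variable is a snapshot, while an ES6 re-export is supposed to stay bound to the original.

```javascript
// Simulated hoisted scope: module a's binding lives in one shared scope.
let aFoo = 2;
function aChangeFoo(x) {
  aFoo = x;
}

// Re-export handled as a new variable: this takes a snapshot of the value.
const bFooCopy = aFoo;

aChangeFoo(10);

console.log(bFooCopy); // 2  (stale copy)
console.log(aFoo);     // 10 (original binding, which is what the importer should see)
```

Resolving the importer's reference straight to the original variable avoids the stale copy.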

@fathyb
Contributor Author

fathyb commented Apr 5, 2018

Thanks for the update, saved me some time 👍 I will look into details later but I noticed it broke tree-shaking and export * support, it now compiles to :

(function () {

/* ASSET: /Users/fathy/Documents/Git/test-parcel/c.js */
var $5$exports = {};
var $5$export$foo = 'hello';
$5$exports.foo = $5$export$foo;
var $5$export$bar = 'tree-shake me if you can';
$5$exports.bar = $5$export$bar;

/* ASSET: /Users/fathy/Documents/Git/test-parcel/b.js */
export * from './c';

/* ASSET: /Users/fathy/Documents/Git/test-parcel/a.js */
console.log('foo', $3$export$foo);
})();

IMHO we should not do $5$exports.bar = $5$export$bar; unless $5$exports is used somewhere else in the code. Except for Closure, minifiers aren't smart enough to trim it if it has property assignments (tried using Uglify 3 and Babili). We should really focus on making the code as simple as possible to minify.

About the foo/changeFoo situation, I think we could just parse and transform the final bundle using Babel; we could then correctly replace references to a named export with the original variable name. If we end up using babel/minify, we could parse the final code using Babylon, transform it, and pass it directly to babel/minify. That gives us a clean, non-regexy way of "expanding" exports and stuff like that.

@devongovett
Member

Yeah I commented out the export * support for now because of the above mentioned reason. I played around with it for a bit and realized that we need to do some further renaming in the packager once we have all the modules.

Huh, that's strange about it not removing the unused object. When I tested with uglify on another input, it did seem to remove them. On further testing, it looks like uglify will remove the object but only if there is only one property added to it? Weird.

Here's an example. test3.js has one export and it's unused. uglify totally removes it from the output. As soon as you add another property to $3$exports, it won't remove it anymore.

(function () {

/* ASSET: test3.js */
var $3$exports = {};
function $3$export$default() {
  return 4;
}

$3$exports.default = $3$export$default;

/* ASSET: index.js */
var $1$exports = {};

function $1$var$add(a, b) {
  return a + b;
}

var $1$export$foo = $1$var$add(1, 2);
$1$exports.foo = $1$export$foo;

console.log($1$export$foo);
})();

output:

!function(){var o=1+2;console.log(o)}();

@fathyb
Contributor Author

fathyb commented Apr 5, 2018

I pushed a commit with a fix for the wildcard, but it breaks commonjs, I will fix it later. It's messy and I reverted some of your changes, feel free to revert the commit 😄

What I'm trying to do :

  • resolve the exports ahead of time using Babel (src/transforms/concat.js) to get a fully optimisable interop with cjs, umd, es6
  • find if an import should be replaced with $id$export$name or $id$exports.name
  • add the $id$exports.name = $id$export$name assignment only when needed (based on your comment on Uglify and multiple props)

Output

  • a.js
import {foo} from './b'

console.log('foo', foo)
  • b.js
export * from './c'
  • c.js
export let foo = 'hello'
export const bar = 'trim me'

export function changeFoo(next) {
    foo = next
}
  • bundle.js (love the prelude-less bundles 😍)
(function () {

  /* ASSET: /~/test-parcel/c.js */
  var $5$export$foo = 'hello';
  var $5$export$bar = 'trim me';

  function $5$export$changeFoo(next) {
    $5$export$foo = next;
  }

  /* ASSET: /~/test-parcel/b.js */


  /* ASSET: /~/test-parcel/a.js */
  console.log('foo', $5$export$foo);
})();

@devongovett
Member

devongovett commented Apr 6, 2018

Thought it would be useful to summarize all of the possible cases of import/export syntax, and what happens for both ES6 and commonjs modules. Eventually we should have tests for all combinations of these.

Imports:

| syntax | action |
| --- | --- |
| import foo from 'bar' | rename foo to $bar_id$export$default |
| import {foo} from 'bar' | rename foo to $bar_id$export$foo |
| import {foo as bar} from 'bar' | rename bar to $bar_id$export$foo |
| import * as foo from 'bar' | rename foo to $bar_id$exports |
| var foo = require('bar') | var foo = $bar_id$exports |

Exports:

| syntax | es6 | commonjs |
| --- | --- | --- |
| export default 4 | var $id$export$default = 4 | $id$exports.default = $id$export$default |
| export default function foo() {} | var $id$export$default = foo | $id$exports.default = $id$export$default |
| export default foo | var $id$export$default = foo | $id$exports.default = $id$export$default |
| export var foo = 4 | rename foo to $id$export$foo | $id$exports.foo = $id$export$foo |
| export {foo} | rename foo to $id$export$foo | $id$exports.foo = $id$export$foo |
| export {foo as bar} | rename foo to $id$export$bar | $id$exports.bar = $id$export$bar |
| export {foo} from 'bar' | rename $id$export$foo to $bar_id$export$foo | $id$exports.foo = $bar_id$export$foo |
| export {foo as bar} from 'bar' | rename $id$export$bar to $bar_id$export$foo | $id$exports.bar = $bar_id$export$foo |
| export * from 'bar' | rename all exports in bar from $id$export$NAME to $bar_id$export$NAME | Object.assign($id$exports, $bar_id$exports) |
| export foo from 'bar' | rename $id$export$foo to $bar_id$export$default | $id$exports.foo = $bar_id$export$default |
| export * as foo from 'bar' | var $id$export$foo = $bar_id$exports | $id$exports.foo = $id$export$foo |
| module.exports = 4 | $id$exports = 4 | $id$exports = 4 |
| exports.foo = 4 | $id$exports.foo = 4 | $id$exports.foo = 4 |

@devongovett
Member

I changed the way export all is implemented a bit. Rather than making identifiers with the ids of the two modules, I just keep track of whether a dependency was produced by an export all and replace the variables in all subsequent modules pointing to exports from the parent module. Since assets are added to the bundle in order of use, this should be safe.

The other thing I did was create the commonjs exports all at once at the end of the file if possible rather than creating an empty object at the start and incrementally assigning to it. This should make it possible for minifiers to prune them more easily. We do incrementally assign to it if the module or exports variables are referenced in the module though, so you could have both commonjs and es6 in the same module (probably more common than you'd think because babel).

The only problematic one is again export all, which requires an Object.assign to add the imported module to exports. The problem is that this causes minifiers to include both exports objects and all of their functions even if neither are used anywhere else. We might need to do that part in the packager only if we detect that a module with export all is referenced by a commonjs module or namespace import.

All these string variable replacements do seem somewhat dangerous, but since we're using unique names should be relatively safe. I somewhat like the idea of running babel over everything again in the packager (and we might anyway for minification), but I'm worried about performance. Should probably measure on some larger apps once we get this working fully.
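
To make the "all at once at the end of the file" idea concrete, here is a minimal sketch of emitting the CommonJS exports object as a single statement rather than an empty object plus incremental assignments. `buildExportsObject` is a hypothetical helper written for this illustration; it is not taken from the PR, and it only handles plain named exports.

```javascript
// Hypothetical helper: emit a module's CommonJS exports object in one statement
// instead of `var $id$exports = {}` followed by `$id$exports.NAME = ...` lines.
function buildExportsObject(id, names) {
  const props = names.map(name => `${name}: $${id}$export$${name}`);
  return `var $${id}$exports = {${props.join(', ')}};`;
}

console.log(buildExportsObject(5, ['foo', 'bar']));
// var $5$exports = {foo: $5$export$foo, bar: $5$export$bar};
```

A single object literal with no later writes is the shape the comment above expects minifiers to prune more easily when the object is never read.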

@devongovett
Member

Actually Object.assign is not even gonna work for export all since they need to be live bindings... Here's what babel does for export * from 'foo':

var _foo = require('foo');

Object.keys(_foo).forEach(function (key) {
  if (key === "default" || key === "__esModule") return;
  Object.defineProperty(exports, key, {
    enumerable: true,
    get: function get() {
      return _foo[key];
    }
  });
});

Also apparently default is excluded from re-exports...
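
The difference between the two approaches can be checked in isolation. In the sketch below, `foo` is a stand-in object for the required module and `value` is a made-up binding; the getter loop has the same shape as the Babel output quoted above.

```javascript
// Stand-in for module 'foo': `value` is a live binding exposed through a getter.
let value = 1;
const foo = {
  get value() { return value; },
  default: 'not re-exported',
};

// Object.assign reads each property once and copies the current value.
const copied = Object.assign({}, foo);

// Getter-based re-export: every read goes back to the source module.
const live = {};
Object.keys(foo).forEach(function (key) {
  if (key === 'default' || key === '__esModule') return;
  Object.defineProperty(live, key, {
    enumerable: true,
    get: function () { return foo[key]; },
  });
});

value = 2;
console.log(copied.value);      // 1 (stale)
console.log(live.value);        // 2 (live)
console.log('default' in live); // false
```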

@fathyb
Contributor Author

fathyb commented Apr 6, 2018

Turns out Babylon AST can be fully (de)serialised using JSON 😮

const ast = JSON.parse(JSON.stringify(babylon.parse(code)))

@DeMoorJasper
Member

@fathyb I tested this in my tree-shaking experiment a little while back and it had circular properties (it might have been a non-Babel AST that caused them, as I wasn't only stringifying Babel ASTs). Just wanted to let you know, as you might run into the same issue in the future.
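
Both observations are easy to reproduce with plain objects. The node shapes below are made up for illustration (they are not real Babylon output), and the `parent` back-reference is a hypothetical example of how a tree can become circular.

```javascript
// A plain tree made only of objects, arrays and primitives survives a JSON round trip.
const tree = {type: 'Program', body: [{type: 'Identifier', name: 'foo'}]};
const clone = JSON.parse(JSON.stringify(tree));
console.log(clone.body[0].name); // foo

// As soon as a node points back at an ancestor, JSON.stringify throws.
const child = {type: 'Identifier', name: 'bar'};
const parent = {type: 'Program', body: [child]};
child.parent = parent; // hypothetical back-reference

let error = null;
try {
  JSON.stringify(parent);
} catch (e) {
  error = e;
}
console.log(error instanceof TypeError); // true
```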

@fathyb
Contributor Author

fathyb commented Apr 6, 2018

@DeMoorJasper I was a bit afraid of this 😕 thanks for letting me know

@devongovett Love the syntax matrices 👍

I tweaked the Babel packager implementation to generate correct exports using the scope (Binding.constant). The pure ES6 code is still perfectly tree-shaken, and the CommonJS one is a bit better. The exports are generated on the fly when needed, and each one is emitted as a property assignment or a getter depending on whether the variable is ever reassigned:

  • a.js
const {foo} = require('./b')

require('./d')

console.log('foo', foo)
  • b.js
export * from './c'

export const some = 'thing'
  • c.js
export let foo = 'hello'
export const bar = 'trim me'

export function changeFoo(next) {
    foo = next
}
  • d.js
console.log('require(./b)', require('./b'))
console.log('require(./c)', require('./c'))
  • bundle.js
(function () {

  /* ASSET: /~/c.js */
  var $7$export$foo = 'hello';
  var $7$export$bar = 'trim me';

  function $7$export$changeFoo(next) {
    $7$export$foo = next;
  }

  /* ASSET: /~/b.js */

  var $3$export$some = 'thing';

  /* ASSET: /~/d.js */
  var $3$exports = {
    changeFoo: $7$export$changeFoo,

    get foo() {
      return $7$export$foo;
    },

    bar: $7$export$bar,
    some: $3$export$some
  };
  console.log('require(./b)', $3$exports);
  var $7$exports = {
    changeFoo: $7$export$changeFoo,

    get foo() {
      return $7$export$foo;
    },

    bar: $7$export$bar
  };
  console.log('require(./c)', $7$exports);

  /* ASSET: /~/a.js */
  var $1$var$_require = $3$exports,
      $1$var$foo = $1$var$_require.foo;

  var $5$exports = {};
  $5$exports;

  console.log('foo', $1$var$foo);
})();
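
The property-or-getter choice visible in the bundle above can be sketched as follows. `makeExports` and the `{constant, read}` descriptor shape are hypothetical and written only for this illustration; in the PR the decision is described as coming from Babel's `Binding.constant`.

```javascript
// Simulated hoisted scope for c.js.
let $7$export$foo = 'hello';     // reassigned by changeFoo, so not constant
const $7$export$bar = 'trim me'; // never reassigned, so constant
function $7$export$changeFoo(next) {
  $7$export$foo = next;
}

// Hypothetical helper: `bindings` maps export name -> {constant, read}.
function makeExports(bindings) {
  const result = {};
  for (const [name, {constant, read}] of Object.entries(bindings)) {
    if (constant) {
      result[name] = read(); // safe to copy: the value never changes
    } else {
      // Reassigned binding: expose it through a getter so reads stay live.
      Object.defineProperty(result, name, {enumerable: true, get: read});
    }
  }
  return result;
}

const $7$exports = makeExports({
  foo: {constant: false, read: () => $7$export$foo},
  bar: {constant: true, read: () => $7$export$bar},
  changeFoo: {constant: true, read: () => $7$export$changeFoo},
});

$7$exports.changeFoo('bye');
console.log($7$exports.foo); // bye
console.log($7$exports.bar); // trim me
```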

@devongovett
Member

@fathyb did you commit that?

@fathyb
Contributor Author

fathyb commented Apr 7, 2018

RxJS has a file which begins like this :

var isArray_1 = require('../util/isArray');
var isArrayLike_1 = require('../util/isArrayLike');

Because of how we do string transformation using toIdentifier and string matching it creates bad code :

var isArray_1 = $352$exports;
var isArrayLike_1 = $352$exportsLike;
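
The failure mode is a plain prefix collision and can be reproduced with nothing but string operations. The placeholder names below (`$1$require$utilIsArray` and so on) are assumptions made for this sketch; the exact placeholders the packager used are not shown in this thread. The boundary-aware replacement at the end is just one possible guard, not the fix that was pushed.

```javascript
// Assumed placeholders: an identifier derived from the required path.
const source = [
  'var isArray_1 = $1$require$utilIsArray;',
  'var isArrayLike_1 = $1$require$utilIsArrayLike;',
].join('\n');

// split/join on the first placeholder also hits the second one,
// because the first placeholder is a prefix of the second.
const broken = source.split('$1$require$utilIsArray').join('$352$exports');
console.log(broken);
// var isArray_1 = $352$exports;
// var isArrayLike_1 = $352$exportsLike;

// One possible guard: only match when no identifier character follows.
const guarded = source.replace(
  /\$1\$require\$utilIsArray(?![\w$])/g,
  () => '$352$exports'
);
console.log(guarded);
// var isArray_1 = $352$exports;
// var isArrayLike_1 = $1$require$utilIsArrayLike;
```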

I replaced the logic in Babel instead with string literals so we don't lose information like here :

import {toIdentifier} from 'babel-types'

toIdentifier('testSass') // testSass
toIdentifier('test-sass') // testSass
toIdentifier('test.sass') // testSass

The good news is I successfully compiled RxJS, material-ui, Angular and React! The result on Angular is pretty good. Pushed!

@devongovett
Member

Yeah I was worried about toIdentifier for filenames. Maybe instead we could just hash the filename?
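
A rough sketch of the hashing idea, using Node's built-in `crypto` module. `hashName`, the choice of SHA-256, and the 8-character truncation are all assumptions made for illustration; nothing in this thread says which hash or length would be used.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a fixed-length hex token from a file path.
function hashName(filename) {
  return crypto.createHash('sha256').update(filename).digest('hex').slice(0, 8);
}

const a = hashName('../util/isArray');
const b = hashName('../util/isArrayLike');
console.log(a, b);

// Every token has the same length, so one can never be a proper prefix of another,
// and 'test-sass' and 'test.sass' no longer collapse into the same name.
console.log(a.length === b.length);                           // true
console.log(hashName('test-sass') !== hashName('test.sass')); // true
```

Truncating the digest keeps identifiers short but raises the odds of a collision, so the length is a trade-off to measure rather than a given.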

@fathyb
Contributor Author

fathyb commented Apr 7, 2018

It's getting better! By using Babel's scope we get deep wildcards with tree-shaking without using assign or any copy. Example (no exports objects are created for the wildcards, and a variable is used for foo and some while moduleName uses a property access) :

  • a.js
import {foo, moduleName, some} from './d'

console.log('foo', foo, moduleName, some)
  • b.js
export * from './c'

export const some = 'thing'
  • c.js
export let foo = 'hello'
export const bar = 'trim me'

export function changeFoo(next) {
    foo = next
}
  • d.js
export * from './e'
  • e.js
export * from './b'

exports.moduleName = 'e.js'
  • bundle.js
(function () {

  /* ASSET: /~/c.js */
  var $9$export$foo = 'hello';
  var $9$export$bar = 'trim me';

  function $9$export$changeFoo(next) {
    $9$export$foo = next;
  }

  /* ASSET: /~/b.js */
  var $7$export$some = 'thing';

  /* ASSET: /~/e.js */
  var $7$exports = {};

  $7$exports.moduleName = 'e.js';

  /* ASSET: /~/d.js */

  /* ASSET: /~/a.js */
  console.log('foo', $9$export$foo, $7$exports.moduleName, $7$export$some);
})();
  • bundle.min.js (Uglify)
!function(){
  var o="hello";
  var e={
    moduleName: "e.js"
  };
  console.log("foo", o, e.moduleName, "thing")
}();
  • bundle.min.js (babel/minify)
(function() {
  var b = "hello",
    e = {};
  (e.moduleName = "e.js"), console.log("foo", b, e.moduleName, "thing");
})();

@devongovett devongovett changed the title [WIP] Scope hoisting for ES6 and CommonJS modules Scope hoisting for ES6 and CommonJS modules Jun 13, 2018
Member

@devongovett devongovett left a comment

💯🏆

@devongovett devongovett merged commit 0ac4e29 into master Jun 14, 2018
@devongovett devongovett deleted the fathy/hoist-es6 branch June 14, 2018 08:12
@joseluisq

Awesome! 🥇

@Siyfion

Siyfion commented Jun 14, 2018

Great work @Fathy 🥇 🏅

@fathyb
Contributor Author

fathyb commented Jun 14, 2018

Haha thanks! But @devongovett did most of the work, so congrats and thanks to him for spending so much time working on this! 🥇

).split('');

/**
* This is a very specialized mangler designed to mangle only names in the top-level scope.
@vigneshshanmugam vigneshshanmugam Jun 14, 2018

Just curious, why does the mangler fail for topLevel scope? Babel minify supports toplevel mangling and I am interested to know what was the reason to have similar mangling step again here? or is it done for uglify since you are using it as the default minifier?

Member

The point of this mangler is to ONLY mangle the top level scope once all files are concatenated. All other scopes are mangled on a per file level.
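
As a rough idea of what a top-level-only renaming pass involves, here is a self-contained sketch of a short-name generator applied to a list of top-level names. It is not the code from this PR: `CHARSET`, `shortName`, and `mangleTopLevel` are hypothetical, and a real mangler would work on scope bindings and would also have to skip reserved words and names already in use.

```javascript
// Hypothetical character set for generated names.
const CHARSET = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$_'.split('');

// Map 0, 1, 2, ... to 'a', 'b', ..., '_', 'aa', 'ab', ...
function shortName(index) {
  let name = '';
  do {
    name = CHARSET[index % CHARSET.length] + name;
    index = Math.floor(index / CHARSET.length) - 1;
  } while (index >= 0);
  return name;
}

// Assign a short name to each top-level name, skipping anything in `reserved`
// (the caller is assumed to pass keywords and names that must not be shadowed).
function mangleTopLevel(names, reserved = new Set()) {
  const map = new Map();
  let i = 0;
  for (const name of names) {
    let short;
    do {
      short = shortName(i++);
    } while (reserved.has(short));
    map.set(name, short);
  }
  return map;
}

const renames = mangleTopLevel(['$5$export$foo', '$5$export$bar'], new Set(['a']));
console.log(renames.get('$5$export$foo')); // b
console.log(renames.get('$5$export$bar')); // c
```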

* Mangling of names in other scopes happens at a file level inside workers, but we can't
* mangle the top-level scope until scope hoisting is complete in the packager.
*/
function mangleScope(scope) {
@vigneshshanmugam vigneshshanmugam Jun 14, 2018

I can see the babel minify license being violated here: Parcel is using babel minify's mangler code, which is MIT licensed. Can you please add a link to the project and the license?

Labels
📝 WIP Work In Progress 💡POC Proof Of Concept 🙋‍♀️ Feature 💬 RFC Request For Comments

7 participants