Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix,Deps] Reduce size by making dependency free and add tests without Weakmap and Map #12

Closed
wants to merge 3 commits into from

Conversation

Tofandel
Copy link
Contributor

@Tofandel Tofandel commented Mar 13, 2022

Will allow a fix for ljharb/qs#404

Codecoverage improved from 89% to 100%

@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

Note that code coverage is combined across every CI run; coverage is currently at 89.39%.

test/index.js Outdated Show resolved Hide resolved
index.js Outdated
var listGetNode = function (list, key) { // eslint-disable-line consistent-return
for (var prev = list, curr; (curr = prev.next) !== null; prev = curr) {
if (curr.key === key) {
prev.next = curr.next;
curr.next = list.next;
list.next = curr; // eslint-disable-line no-param-reassign
return curr;
}
}
};

var listGet = function (objects, key) {
var node = listGetNode(objects, key);
return node && node.value;
};
var listSet = function (objects, key, value) {
var node = listGetNode(objects, key);
if (node) {
node.value = value;
} else {
// Prepend the new node to the beginning of the list
objects.next = { // eslint-disable-line no-param-reassign
key: key,
next: objects.next,
value: value
};
}
};
var listHas = function (objects, key) {
return !!listGetNode(objects, key);
};
var withoutInspector = require('./withoutInspector');

module.exports = function getSideChannel() {
var $wm;
var $m;
var $o;
var channel = {
assert: function (key) {
if (!channel.has(key)) {
throw new $TypeError('Side channel does not contain ' + inspect(key));
}
},
get: function (key) { // eslint-disable-line consistent-return
if ($WeakMap && key && (typeof key === 'object' || typeof key === 'function')) {
if ($wm) {
return $weakMapGet($wm, key);
}
} else if ($Map) {
if ($m) {
return $mapGet($m, key);
}
} else {
if ($o) { // eslint-disable-line no-lonely-if
return listGet($o, key);
}
}
},
has: function (key) {
if ($WeakMap && key && (typeof key === 'object' || typeof key === 'function')) {
if ($wm) {
return $weakMapHas($wm, key);
}
} else if ($Map) {
if ($m) {
return $mapHas($m, key);
}
} else {
if ($o) { // eslint-disable-line no-lonely-if
return listHas($o, key);
}
}
return false;
},
set: function (key, value) {
if ($WeakMap && key && (typeof key === 'object' || typeof key === 'function')) {
if (!$wm) {
$wm = new $WeakMap();
}
$weakMapSet($wm, key, value);
} else if ($Map) {
if (!$m) {
$m = new $Map();
}
$mapSet($m, key, value);
} else {
if (!$o) {
/*
* Initialize the linked list as an empty node, so that we don't have
* to special-case handling of the first node: we can always refer to
* it as (previous node).next, instead of something like (list).head
*/
$o = { key: {}, next: null };
}
listSet($o, key, value);
}
}
};
return channel;
return withoutInspector(inspect);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think we need this "withoutInspector" complexity; object-inspect should remain in all code paths, and we should only have one function as this package's API.

Copy link
Contributor Author

@Tofandel Tofandel Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yet it's another 9kb dependency that probably adds little value except for debugging and that can't be tree shaked, so we need to give the option to remove it, especially if you don't use assert, like in qs, because then you have 9Kb of dead code that can't be optimized out.

I would make it the default or switch between them based on process.env.NODE_ENV but then it would be breaking

In the next major you should probably do that though

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean "can't be tree shaked"? Tree-shaking is just a half-measure to try to remove code that's unused; in this case it is used.

"except for debugging" is massively valuable, since that's where most of the time is spent when coding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's used, unless you're not using assert or not using the error message, the 9kb overhead is reason enough to give the option to use it or not but it should be an option, and not imposed

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean "not using assert"?

Copy link
Contributor Author

@Tofandel Tofandel Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're not using channel.assert() then there is no point in having this dependency, because it's not used, but an optimizer won't remove it because its within a method that provides an object, no optimizer is this smart

If you look at qs, channel.assert() is never used, thus rendering 9kb of dead code

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohhhh thank you for clarifying, i couldn't figure out what you meant.

Let me think about this one, but for this PR, let's just focus on everything else.

index.js Show resolved Hide resolved
light.js Outdated Show resolved Hide resolved
@Tofandel
Copy link
Contributor Author

Could you run the workflows please there is a bug on github actions that prevents me from running them on my fork

@codecov
Copy link

codecov bot commented Mar 14, 2022

Codecov Report

Merging #12 (15cdcd2) into main (ecbe70e) will increase coverage by 10.60%.
The diff coverage is 100.00%.

❗ Current head 15cdcd2 differs from pull request most recent head 0f1b586. Consider uploading reports for the commit 0f1b586 to get more accurate results

Impacted file tree graph

@@             Coverage Diff              @@
##             main       #12       +/-   ##
============================================
+ Coverage   89.39%   100.00%   +10.60%     
============================================
  Files           1         2        +1     
  Lines          66        60        -6     
  Branches       19        22        +3     
============================================
+ Hits           59        60        +1     
+ Misses          7         0        -7     
Impacted Files Coverage Δ
index.js 100.00% <100.00%> (+10.60%) ⬆️
light.js 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ecbe70e...0f1b586. Read the comment docs.

@Tofandel
Copy link
Contributor Author

Tofandel commented Mar 14, 2022

Rolled back NYC for older node support, should pass now, I believe we could go up to v13 of NYC

package.json Outdated Show resolved Hide resolved
light.js Outdated
if (!$m) {
$m = new $Map();
}
$Map.prototype.set.call($m, key, value);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not robust, because .call or Map.prototype.set might be deleted later. call-bind/callBound is strictly necessary here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would anyone delete prototypes? Everybody is quite clear that you shouldn't ever touch native prototypes

I can save it on require as it does the same

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything that is possible, is done, and code should be robust against it.

You can save Map.prototype.set, but you can't call-bind it without Function.call.bind(), and you can't rely on .bind existing in ES3 engines, which this package supports - that's what call-bind is for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why do I need to call .bind() ?

Why wouldn't that work?

var $MapSet = Map.prototype.set;

$MapSet.call($m, key, value);

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The library-less way to do it is:

var $MapSet = Function.call.bind(Map.prototype.set);

$MapSet($m, key, value);

that way, you aren't looking up .call at runtime.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's absolutely necessary, and would be both a breaking change and an undesirable one. All 300+ of my packages are robust against this sort of thing, and will eternally remain so.

function-bind can and should be swapped out in your bundler config for module.exports = Function.prototype.bind; including it is the proper default.

Copy link
Contributor Author

@Tofandel Tofandel Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yeah, then I guess I just make the same PR to call-bind then so we can use it here without getIntrinsinc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah well I can't, callBound has a direct implementation dependency https://github.com/Tofandel/call-bind/blob/main/callBound.js#L10

So light version it is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, sumarized in

var bind = require('function-bind');

var $call = Function.prototype.call;

var callBound = function (original) {
	return $call.call(bind, $call, original);
};

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, now you've basically replicated call-bind without get-intrinsic :-) using get-intrinsic, however, serves as a dependency-graph-wide cache of builtins, so that the first module doing the caching sets up the robustness threshold, instead of each one starting fresh.

light.js Outdated Show resolved Hide resolved
@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

I went ahead and split this up into two commits; one with the tests, and another with the index.js changes. I pushed them up separately to make sure the tests passed first.

Turns out that the tests are failing because nyc itself isn't robustly accessing Map :-) i'm not sure it's that useful to delete the globals, since CI runs in every node minor, including 0.10 which lacks both Map and WeakMap.

so, i ended up dropping that commit entirely, since it wasn't really adding value.

@Tofandel
Copy link
Contributor Author

Tofandel commented Mar 14, 2022

Hmm I think if you're using local.js with the current codebase it would fail because of the getIntrinsinc dependency caching stuff, you'd need to delete the require cache of getIntrinsinc

Deleting the globals is usefull in a local env where you can't actually test all the node versions, or do you mean the tests added in index.js fails?

@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

Such a local env isn't particularly important; my local env has nvm, so i have every node version available, and only CI really matters, where all versions are tested.

@Tofandel
Copy link
Contributor Author

I had some coverage improvement that you dropped as well in index.js

@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

Note that you can use call-bind without GetIntrinsic (ie, callBind, not callBound), if you want to pass them in yourself, but that bypasses the whole point of get-intrinsic - caching intrinsics across the entire dep graph. i'd still prefer to use call-bind rather than function-bind directly, though.

@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

ah, let me restore the index test coverage, thanks for the reminder.

@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

k, i restored those added tests, and also split up the editorconfig into its own commit, and added eclint to enforce it.

@Tofandel
Copy link
Contributor Author

Tofandel commented Mar 14, 2022

I made a better local.js, now the tests should pass no matter nyc implementation because I restore the globals just after the file is required so it actually checks for robustness, if it wasn't robust codecov would drop back down from 100%

'use strict';

var test = require('tape');

var getSideChannel = require('../');

var protos = [null, 'WeakMap', 'Map'];

var restore = protos.map(function (proto) {
	return global[proto];
});

var removeProtos = function (proto) {
	var i = 0;
	while (protos[i] !== proto) {
		i = i + 1;
		global[protos[i]] = undefined;
	}
};

var restoreProtos = function () {
	protos.forEach(function (proto, index) {
		global[proto] = restore[index];
	});
};

protos.forEach(function (toRemove) {
	var testName = '';
	if (toRemove) {
		if (!global[toRemove]) {
			return; // 'Test skipped'
		}
		testName = ' without ' + toRemove;
	}
	test('setup' + testName, function (t) {
		removeProtos(toRemove);
		// with old implementation
		// delete require.cache[require.resolve('get-intrinsic')];
		delete require.cache[require.resolve('../')];

		/* eslint-disable global-require */
		getSideChannel = require('../');
		/* eslint-enable global-require */
		restoreProtos();

		t.end();
	});
	test('export' + testName, function (t) {
		t.equal(typeof getSideChannel, 'function', 'is a function');
		t.equal(getSideChannel.length, 0, 'takes no arguments');

		var channel = getSideChannel();
		t.ok(channel, 'is truthy');
		t.equal(typeof channel, 'object', 'is an object');

		t.end();
	});

	test('assert' + testName, function (t) {
		var channel = getSideChannel();
		t['throws'](
			function () {
				channel.assert({});
			},
			TypeError,
			'nonexistent value throws'
		);

		var o = {};
		channel.set(o, 'data');
		t.doesNotThrow(function () {
			channel.assert(o);
		}, 'existent value noops');

		t.end();
	});

	test('has' + testName, function (t) {
		var channel = getSideChannel();
		var o = [];

		t.equal(channel.has(o), false, 'nonexistent value yields false');

		channel.set(o, 'foo');
		t.equal(channel.has(o), true, 'existent value yields true');

		t.equal(channel.has('abc'), false, 'non object value non existent yields false');

		channel.set('abc', 'foo');
		t.equal(channel.has('abc'), true, 'non object value that exists yields true');

		t.end();
	});

	test('get' + testName, function (t) {
		var channel = getSideChannel();
		var o = {};
		t.equal(channel.get(o), undefined, 'nonexistent value yields undefined');

		var data = {};
		channel.set(o, data);
		t.equal(channel.get(o), data, '"get" yields data set by "set"');

		t.end();
	});

	test('set' + testName, function (t) {
		var channel = getSideChannel();
		var o = function () {
		};
		t.equal(channel.get(o), undefined, 'value not set');

		channel.set(o, 42);
		t.equal(channel.get(o), 42, 'value was set');

		channel.set(o, Infinity);
		t.equal(channel.get(o), Infinity, 'value was set again');

		var o2 = {};
		channel.set(o2, 17);
		t.equal(channel.get(o), Infinity, 'o is not modified');
		t.equal(channel.get(o2), 17, 'o2 is set');

		channel.set(o, 14);
		t.equal(channel.get(o), 14, 'o is modified');
		t.equal(channel.get(o2), 17, 'o2 is not modified');

		t.end();
	});
});

@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

I don't think it's really worth it; just deleting the globals doesn't necessarily match older environments - for example, there's some with neither Map nor WeakMap, and some with just Map but not WeakMap, and there's some with neither natively, but Map is shimmed (we're not testing for this right now, ofc). I don't think local coverage is important as long as aggregated coverage is complete.

index.js Show resolved Hide resolved
@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

I've merged in the first two commits, leaving only the third one in this PR.

@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

Since it seems like the bulk of the size decreases are about removing object-inspect and call-bind/get-intrinsic, and the former is required for debugging (but i'll keep thinking about ways to address this), and the latter is required for robustness, i'm not sure it's worth the change.

@Tofandel
Copy link
Contributor Author

Well this whole thing was about this, it's 20kb of dependencies, that we don't want in qs, so unless we just make a bare side-channel in qs, the problem stays that we can't/(won't) use the latest version

And I've just shown that it can be addressed without affecting robustness

@ljharb
Copy link
Owner

ljharb commented Mar 14, 2022

It's slightly less robust - imagine this scenario:

  1. something in the dep graph requires get-intrinsic (this is highly likely; i use it in a ton of projects)
  2. malicious or incompetent code loads and breaks the builtins
  3. side-channel is required (via qs, or elsewhere)

Here, qs and side-channel would be more vulnerable than they currently are.

@Tofandel
Copy link
Contributor Author

Tofandel commented Mar 15, 2022

Well worrying about the order of imported deps in the first place for robustness shouldn't be the job of a dependency because that could lead to some much bigger problems and then it's a slippery slope. What If I import lib-1 which uses get intrinsinc which caches an undefined global and then import a polyfill for that global, and then import lib-2 which uses get intrinsinc and expect it to use the polyfill but instead it uses the cached undefined and I had no idea

And same for globals being modified, if somebody is tempering with globals they should expect their code to break no question asked, and in fact I'd argue it's a bug if you're modifying a global and it's not changing the behavior of your dependency and in the end it's not dependencies job to take care of that, at the detriment of other users

Sure if we can do a little bit about robustness, why not cache globals, but pushing it to that level to the detriment of users is not a sane thing to do

In the end breaking changes should come from time to time, they're healthy for the ecosystem to avoid some people staying on a shady 10 years old unsecure version of something, and if they stayed on that version for so long then they're not updating their deps anyways. It's the same problem as IE all over again, an old version of something forcing people to have a lot of overhead

I respect your view and your points though for your libs and if you don't want it here then by all means, keep them robust

But we should really find a solution for qs like a fork of side-channel which in the end is just Weakmap or Map or a simple polyfill, because triple the size for developer experience at the expense of user experience is really not an okay thing

@ljharb
Copy link
Owner

ljharb commented Mar 15, 2022

Polyfills can only be safely used at the application level, and it's the application owner's responsibility to guarantee NO code runs prior to polyfills running.

It is completely reasonable to tamper with globals and assume code not to break; that lots of code fails to do that doesn't mean i need to lower my standards for my hundreds of projects.

Breaking changes should ideally never come - that's why the web and javascript are ubiquitous.

@Tofandel
Copy link
Contributor Author

Tofandel commented Mar 15, 2022

Well in the future when the ecma spec reaches a point it stops evolving, I will agree that breaking changes should ideally not happen but that's not the state of things right now, in the global context of things we're still in the early years of js and Web development (I'd say teenager 😁), where everything is still evolving at an incredible pace

Don't get me wrong, I don't want you to lower your standards for your libs, I think they're great and indeed if they had this standard before they should aim to keep it

But there has to be golden mean for qs because this standard came after the fact to fix a small bug that cost us 20kb

Re get-intrinsic I was pointing out that you're creating side effects in it, by caching globals. And maybe then you should just cache the whole of globals at first require of get intrinsic, as this would be the most robust

Then you should probably implement some tree shaking in get-intrinsic, so that each global can be imported separately and that it can correctly be optimised as opposed to the arrays there is now

Basically cache all globals in one file and export them for internal use and make one file per global
require('intrinsic/Map')
require('intrinsic/Map').prototype.get

All the files could be generated automatically based on some rules

It's a completely different approach and could be a different package but if you ask me how I'd implement robustness, this is how because this kind of code can then be optimised based on usage

Or even better, getIntrinsics should be in the JS specs, we should be able to retrieve the original implementation of something from anywhere if that's what we want

Anyhow that's really a discussion for another day

The problem right now is, can we remove side channel from qs and have this 1kb version of side channel directly in qs? Or can we have a version here without assert or object-inspect that saves 9kb of dead code? Being the golden mean, keeping get-intrinsic?

@ljharb
Copy link
Owner

ljharb commented Mar 15, 2022

The cost to qs was because it needed to handle cycles. The minimum robust implementation for that is effectively side-channel.

Making get-intrinsic have separate files would mean that instead of all intrinsics being cached once the first is requested, each intrinsic would be uncached until it was first requested.

As for get-intrinsic being in the JS specs, that's exactly what my https://github.com/tc39/proposal-get-intrinsic proposal is for.

@Tofandel
Copy link
Contributor Author

Tofandel commented Mar 15, 2022

Nice I really hope it will make it in the specs then 👍

Re qs, yes but as I'm saying there is two options

  1. Partial fix here by shipping another file without assert implementation saves 9kb which is halfway there
  2. "robust in the context of qs" approach where we put the light side-channel I proposed in qs and not using the dependency graph of get-intrinsic should really be okay, gets us all the way there

Re separate files, I meant one global file that declares absolutely all variables and prototypes like

$Array = typeof Array === "undefined" ? undefined : Array
$ArrayPrototypeMap = $Array ? $Array.prototype.map : undefined
// Etc etc

module.exports = {
  Array: $Array,
  ArrayPrototypeMap: $ArrayPrototypeMap,
  // Etc etc
};

Generated automatically

Wouldn't technically need folders in fact, but might be cleaner to use require('intrinsic/Array').prototype.map which would also be an autogenerated file that requires the correct global import and exports the correct one than require('instrinsic').ArrayPrototypeMap

A file like this can be optimized, so the bulk of the work happens at build time, basically no function, just exported variables

Don't know if you know what I mean, but that's my quick 2 cents idea where I only thought about it for a few hours as opposed to the few years you probably did

@Tofandel Tofandel closed this by deleting the head repository Jan 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants