Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move minification check to ReactDOM #10640

Closed
wants to merge 1 commit into from

Conversation

gaearon
Copy link
Collaborator

@gaearon gaearon commented Sep 7, 2017

Addresses @sebmarkbage’s concern.

if (longVariableName && source.match(/longVariableName/g).length === 3) {
// We are not minified.
// This might be a Node environment where DCE is not expected anyway.
return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we allow this to not be minimized at all?

The other concern was that if shipping works because of this bail out. Then if byte code reversing gets deployed, then it'll skip this and start failing. Without the site doing anything differently.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I don't understand the first question but in Node environment it is expected that code is not minimized or altered in any way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't hit the server entry point. I guess you could still have a require on this for findDOMNode.

If you run unit tests with react-dom in production, I guess you'd hit this too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm mostly referring to findDOMNode case.

I don't think we support running tests in production. For example test renderer throws in entry point if you try that. I guess we could relax that if there are valid use cases.

@gaearon
Copy link
Collaborator Author

gaearon commented Sep 7, 2017

Can you comment in a more detailed way about bytecode? It's not clear to me which scenario you're worried about.

@sebmarkbage
Copy link
Collaborator

sebmarkbage commented Sep 7, 2017

The format of toString isn't at all standardized but the proposals for standardization aren't what current implementations do. This proposal has had a few turns and stagnated but something like it isn't dead. https://tc39.github.io/Function-prototype-toString-revision/

The first things VMs do when they need to invoke a function, eagerly while parsing, or eagerly in spare cycles is to parse the string source into an AST and then compile that to a byte code. After that, the byte code is the only representation it needs.

Currently VMs also keeps around a copy of the source string. Either in memory or at least in the cache. The only reason to keep this around is for toString. However this is very wasteful. Especially if you're caching scripts in byte code form, or even deploying byte code directly without the source code copy (such as in mobile and embedded devices). Some of these might drop toString support completely which we try to be resilient against here.

However, there is an alternative to storing an entire copy for all source code while still preserving toString semantics. For compatibility with code that depends on it. Instead you can reverse engineer the byte code into a source string on demand. However, obviously this is a lossy format. It might not even have the same structure of the code. SpiderMonkey used to do this before.

It can duplicate the same code paths if it has been inlined, strip variable names, reintroduce new variables etc. So all of our assumptions we make here might be blown out the window.

Another use case is SES/Caja which rewrites your code into secure subsets AOT or on the fly. They don't want to ship an entire copy of the original source code along with the transpiled code just for toString. So the toString you do get is the transpiled code. Similar to how Babel compiled code is not the same neither.

That's why the spec proposals to try to solve this didn't specify the exact format nor that it would retain the original source code. The only thing it did specify is that it would be functionally equivalent if you did something like: eval(fn.toString()) to clone it (minus closed over variables).

So, while the whole thing is sketchy, the thing I do know is that any attempt at reversing this from byte code would have to include at least one string "toString" since that's a property name observable from the outside. So that check would pass. It would also have to include "unreachable" since that code path is not true dead code. However, it could easily, and likely would have to, rename longVariableName while otherwise being completely functionally equivalent. In fact, that's exactly what minifiers rely on.

@gaearon
Copy link
Collaborator Author

gaearon commented Sep 7, 2017

Are you worried about false negative then? Or is there false positive too?

@sebmarkbage
Copy link
Collaborator

What is positive or negative? :) Regardless, I think both can happen.

@gaearon
Copy link
Collaborator Author

gaearon commented Sep 7, 2017

I think I get it now. You're saying there's no heuristic for "has minifier been applied" that doesn't also trigger for hypothetical "VM has messed with the reported source".

@sebmarkbage
Copy link
Collaborator

Yea. Additionally, the fact that we use toString at all will probably give some unfortunate metrics in browsers tracking feature usage.

It can also slow things down significantly if browsers or Node now need to read the source from disk cache separately from the cached byte code. Circumventing anything that byte code cache gives us.

This technique is great for dev but seems problematic for prod.

@gaearon
Copy link
Collaborator Author

gaearon commented Sep 11, 2017

#10673

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants