Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(v7.x backport) url: updates to the WHATWG URL parser #12507

Merged
merged 21 commits into from
Apr 25, 2017

Conversation

TimothyGu
Copy link
Member

@TimothyGu TimothyGu commented Apr 19, 2017

This pull request brings the WHATWG URL implementation in v7.x up to speed with master, including the code itself, the docs, and the tests.

Note: #12042 is not backported since #11956 (which it depends on) has not been backported yet.

Backports of (in the order they were landed to master):

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

url

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. dont-land-on-v4.x lib / src Issues and PRs related to general changes in the lib or src directory. querystring Issues and PRs related to the built-in querystring module. whatwg-url Issues and PRs related to the WHATWG URL implementation. labels Apr 19, 2017
@TimothyGu TimothyGu removed the lib / src Issues and PRs related to general changes in the lib or src directory. label Apr 19, 2017
@TimothyGu
Copy link
Member Author

Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rubber stamp LGTM

TimothyGu and others added 18 commits April 24, 2017 17:56
This commit implements the Web IDL USVString conversion, which mandates
all unpaired Unicode surrogates be turned into U+FFFD REPLACEMENT
CHARACTER. It also disallows Symbols to be used as USVString per spec.

Certain functions call into C++ methods in the binding that use the
Utf8Value class to access string arguments. Utf8Value already does the
normalization using V8's String::Write, so in those cases, instead of
doing the full USVString normalization, only a symbol check is done
(`'' + val`, which uses ES's ToString, versus `String()` which has
special provisions for symbols).

PR-URL: nodejs#12507
Reviewed-By: James M Snell <[email protected]>
The ES addition operator calls the ToPrimitive() abstract operation
without hint String, leading a subsequent OrdinaryToPrimitive() to call
valueOf() first on an object rather than the desired toString().

Instead, use template literals which directly call ToString() abstract
operation, per Web IDL spec.

PR-URL: nodejs#12507
Fixes: b610a4d "url: enforce valid UTF-8 in WHATWG parser"
Refs: nodejs@b610a4d#commitcomment-21200056
Refs: https://tc39.github.io/ecma262/#sec-addition-operator-plus-runtime-semantics-evaluation
Refs: https://tc39.github.io/ecma262/#sec-template-literals-runtime-semantics-evaluation
Reviewed-By: James M Snell <[email protected]>
This step was never part of the URL Standard's host parser algorithm,
and is rendered unnecessary after IDNA errors are no longer ignored.

PR-URL: nodejs#12507
Refs: c2a302c "src: do not ignore IDNA conversion error"
Reviewed-By: James M Snell <[email protected]>
The entire `URLSearchParams` class is now fully spec-compliant.

PR-URL: nodejs#12507
Fixes: nodejs#10821
Reviewed-By: James M Snell <[email protected]>
The object is used as a structure, not as a map, which `StorageObject`
was designed for.

PR-URL: nodejs#12507
Reviewed-By: James M Snell <[email protected]>
Provides a factory method to convert a native URL class
into a JS URL object.

```c++
Environment* env = ...

URL url("http://example.org/a/b/c?query#fragment");

MaybeLocal<Value> val = url.ToObject(env);
```

PR-URL: nodejs#12507
Reviewed-By: James M Snell <[email protected]>
- Clarify port state
- Remove scheme flag
- Clarify URL_FLAG_TERMINATED

PR-URL: nodejs#12507
Reviewed-By: James M Snell <[email protected]>
This changes to the way path parsing for non-special URLs.
It allows paths to be empty for non-special URLs and also
takes that into account when serializing.

PR-URL: nodejs#12507
Fixes: nodejs#11962
Refs: whatwg/url#213
Reviewed-By: James M Snell <[email protected]>
It should trim the slashes after the colon into three for file URL.

PR-URL: nodejs#12507
Refs: web-platform-tests/wpt#5195
Fixes: nodejs#11188
Reviewed-By: James M Snell <[email protected]>
@evanlucas evanlucas mentioned this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
PR-URL: #12507
Fixes: #10635
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
This commit implements the Web IDL USVString conversion, which mandates
all unpaired Unicode surrogates be turned into U+FFFD REPLACEMENT
CHARACTER. It also disallows Symbols to be used as USVString per spec.

Certain functions call into C++ methods in the binding that use the
Utf8Value class to access string arguments. Utf8Value already does the
normalization using V8's String::Write, so in those cases, instead of
doing the full USVString normalization, only a symbol check is done
(`'' + val`, which uses ES's ToString, versus `String()` which has
special provisions for symbols).

PR-URL: #12507
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
The ES addition operator calls the ToPrimitive() abstract operation
without hint String, leading a subsequent OrdinaryToPrimitive() to call
valueOf() first on an object rather than the desired toString().

Instead, use template literals which directly call ToString() abstract
operation, per Web IDL spec.

PR-URL: #12507
Fixes: b610a4d "url: enforce valid UTF-8 in WHATWG parser"
Refs: b610a4d#commitcomment-21200056
Refs: https://tc39.github.io/ecma262/#sec-addition-operator-plus-runtime-semantics-evaluation
Refs: https://tc39.github.io/ecma262/#sec-template-literals-runtime-semantics-evaluation
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
This step was never part of the URL Standard's host parser algorithm,
and is rendered unnecessary after IDNA errors are no longer ignored.

PR-URL: #12507
Refs: c2a302c "src: do not ignore IDNA conversion error"
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
The entire `URLSearchParams` class is now fully spec-compliant.

PR-URL: #12507
Fixes: #10821
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
The object is used as a structure, not as a map, which `StorageObject`
was designed for.

PR-URL: #12507
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
Provides a factory method to convert a native URL class
into a JS URL object.

```c++
Environment* env = ...

URL url("http://example.org/a/b/c?query#fragment");

MaybeLocal<Value> val = url.ToObject(env);
```

PR-URL: #12507
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
- Clarify port state
- Remove scheme flag
- Clarify URL_FLAG_TERMINATED

PR-URL: #12507
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
This changes to the way path parsing for non-special URLs.
It allows paths to be empty for non-special URLs and also
takes that into account when serializing.

PR-URL: #12507
Fixes: #11962
Refs: whatwg/url#213
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
It should trim the slashes after the colon into three for file URL.

PR-URL: #12507
Refs: web-platform-tests/wpt#5195
Fixes: #11188
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
PR-URL: #12507
Fixes: #11485
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
- Use ordinary properties instead of symbols/getter redirection for
  internal object
- Use template string literals
- Remove unneeded custom inspection for internal objects
- Remove unneeded OpaqueOrigin class
- Remove unneeded type checks

PR-URL: #12507
Reviewed-By: James M Snell <[email protected]>
evanlucas pushed a commit that referenced this pull request May 1, 2017
evanlucas pushed a commit that referenced this pull request May 1, 2017
* reduce indentation
* refactor URL inlined methods
* prefer templates over macros
* do not export ARG_* flags in url binding

PR-URL: #12507
Reviewed-By: James M Snell <[email protected]>
evanlucas added a commit that referenced this pull request May 2, 2017
Notable changes:

* **crypto**:
  - add randomFill and randomFillSync (Evan Lucas)
    #10209
* **meta**: Added new collaborators
  - add lucamaraschi to collaborators (Luca Maraschi)
    #12538
  - add DavidCai1993 to collaborators (David Cai)
    #12435
  - add jkrems to collaborators (Jan Krems)
    #12427
  - add AnnaMag to collaborators (AnnaMag)
    #12414
* **process**:
  - fix crash when Promise rejection is a Symbol (Cameron Little)
    #11640
* **url**:
  - make WHATWG URL more spec compliant (Timothy Gu)
    #12507
* **v8**:
  - fix stack overflow in recursive method (Ben Noordhuis)
    #12460
  - fix build errors with g++ 7 (Ben Noordhuis)
    #12392

PR-URL: #12775
evanlucas added a commit that referenced this pull request May 2, 2017
Notable changes:

* **crypto**:
  - add randomFill and randomFillSync (Evan Lucas)
    #10209
* **meta**: Added new collaborators
  - add lucamaraschi to collaborators (Luca Maraschi)
    #12538
  - add DavidCai1993 to collaborators (David Cai)
    #12435
  - add jkrems to collaborators (Jan Krems)
    #12427
  - add AnnaMag to collaborators (AnnaMag)
    #12414
* **process**:
  - fix crash when Promise rejection is a Symbol (Cameron Little)
    #11640
* **url**:
  - make WHATWG URL more spec compliant (Timothy Gu)
    #12507
* **v8**:
  - fix stack overflow in recursive method (Ben Noordhuis)
    #12460
  - fix build errors with g++ 7 (Ben Noordhuis)
    #12392

PR-URL: #12775
evanlucas added a commit that referenced this pull request May 3, 2017
Notable changes:

* **crypto**:
  - add randomFill and randomFillSync (Evan Lucas)
    #10209
* **meta**: Added new collaborators
  - add lucamaraschi to collaborators (Luca Maraschi)
    #12538
  - add DavidCai1993 to collaborators (David Cai)
    #12435
  - add jkrems to collaborators (Jan Krems)
    #12427
  - add AnnaMag to collaborators (AnnaMag)
    #12414
* **process**:
  - fix crash when Promise rejection is a Symbol (Cameron Little)
    #11640
* **url**:
  - make WHATWG URL more spec compliant (Timothy Gu)
    #12507
* **v8**:
  - fix stack overflow in recursive method (Ben Noordhuis)
    #12460
  - fix build errors with g++ 7 (Ben Noordhuis)
    #12392

PR-URL: #12775
evanlucas added a commit that referenced this pull request May 3, 2017
Notable changes:

* **crypto**:
  - add randomFill and randomFillSync (Evan Lucas)
    #10209
* **meta**: Added new collaborators
  - add lucamaraschi to collaborators (Luca Maraschi)
    #12538
  - add DavidCai1993 to collaborators (David Cai)
    #12435
  - add jkrems to collaborators (Jan Krems)
    #12427
  - add AnnaMag to collaborators (AnnaMag)
    #12414
* **process**:
  - fix crash when Promise rejection is a Symbol (Cameron Little)
    #11640
* **url**:
  - make WHATWG URL more spec compliant (Timothy Gu)
    #12507
* **v8**:
  - fix stack overflow in recursive method (Ben Noordhuis)
    #12460
  - fix build errors with g++ 7 (Ben Noordhuis)
    #12392

PR-URL: #12775
imyller added a commit to imyller/meta-nodejs that referenced this pull request May 4, 2017
    Notable changes:

    * **crypto**:
      - add randomFill and randomFillSync (Evan Lucas)
        nodejs/node#10209
    * **meta**: Added new collaborators
      - add lucamaraschi to collaborators (Luca Maraschi)
        nodejs/node#12538
      - add DavidCai1993 to collaborators (David Cai)
        nodejs/node#12435
      - add jkrems to collaborators (Jan Krems)
        nodejs/node#12427
      - add AnnaMag to collaborators (AnnaMag)
        nodejs/node#12414
    * **process**:
      - fix crash when Promise rejection is a Symbol (Cameron Little)
        nodejs/node#11640
    * **url**:
      - make WHATWG URL more spec compliant (Timothy Gu)
        nodejs/node#12507
    * **v8**:
      - fix stack overflow in recursive method (Ben Noordhuis)
        nodejs/node#12460
      - fix build errors with g++ 7 (Ben Noordhuis)
        nodejs/node#12392

    PR-URL: nodejs/node#12775

Signed-off-by: Ilkka Myller <[email protected]>
anchnk pushed a commit to anchnk/node that referenced this pull request May 6, 2017
Notable changes:

* **crypto**:
  - add randomFill and randomFillSync (Evan Lucas)
    nodejs#10209
* **meta**: Added new collaborators
  - add lucamaraschi to collaborators (Luca Maraschi)
    nodejs#12538
  - add DavidCai1993 to collaborators (David Cai)
    nodejs#12435
  - add jkrems to collaborators (Jan Krems)
    nodejs#12427
  - add AnnaMag to collaborators (AnnaMag)
    nodejs#12414
* **process**:
  - fix crash when Promise rejection is a Symbol (Cameron Little)
    nodejs#11640
* **url**:
  - make WHATWG URL more spec compliant (Timothy Gu)
    nodejs#12507
* **v8**:
  - fix stack overflow in recursive method (Ben Noordhuis)
    nodejs#12460
  - fix build errors with g++ 7 (Ben Noordhuis)
    nodejs#12392

PR-URL: nodejs#12775
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. querystring Issues and PRs related to the built-in querystring module. whatwg-url Issues and PRs related to the WHATWG URL implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants