Tests and spec don't agree about base URL in match result `inputs` #133

lucacasonato · 2021-09-07T19:52:28Z

As per spec, the second item in the inputs field of the match result should be the stringification of the parsed base url URL (it actually says to add the URL object itself, but I'll just consider that a bug).

For this test case, the stringification of the base url is 'https://example.com/".

{
    "pattern": [{ "pathname": "/foo/bar" }],
    "inputs": [ "./foo/bar", "https://example.com" ],
    "expected_match": {
      "hostname": { "input": "example.com", "groups": { "0": "example.com" } },
      "pathname": { "input": "/foo/bar", "groups": {} },
      "protocol": { "input": "https", "groups": { "0": "https" } }
    }
  }

Yet the test runner wants it to be https://example.com. This is the case in Chromium right now (it appends the URL string, not the stringification of the parsed url). See https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/modules/url_pattern/url_pattern.cc;l=496-497;drc=1a1342c1f2d295948d79211cd860721732990ed8

I would say that the spec behaviour actually makes more sense, and that the test should be fixed to look like this:

{
    "pattern": [{ "pathname": "/foo/bar" }],
    "inputs": [ "./foo/bar", "https://example.com" ],
    "expected_match": {
      "inputs": [ "./foo/bar", "https://example.com/" ],
      "hostname": { "input": "example.com", "groups": { "0": "example.com" } },
      "pathname": { "input": "/foo/bar", "groups": {} },
      "protocol": { "input": "https", "groups": { "0": "https" } }
    }
  }

If we say this is right, then I suggest for the URLParserInit we should also change the spec & Chromium to have the inputs array contain the "processed" URLParserInit instead of the original.

The text was updated successfully, but these errors were encountered:

wanderview · 2021-09-07T20:05:11Z

This was discussed in issues previously at:

#34

The intent was to make the result.inputs have the original unprocessed value. So I'm inclined to fix the spec to match the implementation.

lucacasonato · 2021-09-07T20:14:16Z

Ah, missed that issue. Seems fine, although I do wonder why this inputs field is needed at all then. If the input is just the unprocessed value, I'd argue there is no point in returning it at all (the user already has it, as they passed it in).

wanderview · 2021-09-07T20:17:40Z

Its a weak use case, but the idea was the user may want to pass the result on to other code to process and it may expect the same exact input for some other comparison/operation. This matches what Regexp.exec() does, etc.

lucacasonato · 2021-09-07T20:20:09Z

Ok - I could maybe see some value here, but it seems to me it wouldn't be much more difficult to just pass the inputs manually as user. I sorta had the expectation when I first saw the dictionary definition that these "inputs" are the normalized values, as this "plain passthrough" is not very common.

wanderview · 2021-09-07T20:21:52Z

Its not too late to change this. I'm just not sure which is better. @domenic do you have any thoughts here?

domenic · 2021-09-07T20:28:12Z

This is a tough one. I'm afraid I can't be of much help as a tiebreaker; I see both sides.

My only contribution is that I could see the RegExp precedent cutting either way. Is inputs meant to respresent the input string, or the input URL? Clearly for RegExps it's the input string, but maybe for URLPattern the input domain is more like URLs, and we just happen to represent them as strings?

wanderview · 2021-09-07T20:36:11Z

@jeffpossnick do you have an opinion from a web developer point of view?

wanderview · 2021-09-07T20:42:31Z

I made a twitter poll as well:

https://twitter.com/wanderview/status/1435342537288556548

lucacasonato · 2021-09-07T21:11:51Z

Looking at the poll, my opinion is obviously in the minority. I'd say let's keep behaviour like it is implemented in Chrome, and update the spec. Seems most people's expectations would be met. The normalized values are always available via the input on individual component results anyway.

lucacasonato · 2021-09-07T21:55:31Z

Someone on Twitter did bring up this point: https://twitter.com/bhathos/status/1435353793588350979. This shows that the expectation is that the matcher matches against whatever "result.input" is - that is however not the case. The matcher matches against the normalized value of result.input. Maybe there would be more insight into the system by users if the matcher returned the normalized inputs.

I'm impartial now.

wanderview · 2021-09-07T23:42:49Z

The matcher also doesn't operate on the entire inputs array. It operates on individual component values after normalization. Those normalized inputs are available at result.pathname.input, etc. Combined with result.inputs that lets you see what normalization took place before the matcher ran.

As I wrote in the previous issue, though, it is all a bit weird and I'm not sure if there is a perfect answer. I think the current spec is the least bad approach, though.

wanderview · 2021-09-08T20:58:43Z

Final poll results:

173 votes
46.8% for original raw value
17.3% for encoded value
35.8% don't know

I'm going to go ahead and update the spec to use the original raw value.

Add original baseURL string to result.inputs. (Fixes #133)

This was referenced Sep 7, 2021

Tests and spec don't agree about base URL in match result inputs denoland/deno#11949

Closed

Fix URLPatternResult#inputs behaviour denoland/rust-urlpattern#12

Closed

wanderview closed this as completed in ad777fe Sep 8, 2021

wanderview added a commit that referenced this issue Sep 8, 2021

Merge pull request #135 from WICG/dev-inputs

ae656a4

Add original baseURL string to result.inputs. (Fixes #133)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tests and spec don't agree about base URL in match result `inputs` #133

Tests and spec don't agree about base URL in match result `inputs` #133

lucacasonato commented Sep 7, 2021

wanderview commented Sep 7, 2021

lucacasonato commented Sep 7, 2021

wanderview commented Sep 7, 2021

lucacasonato commented Sep 7, 2021 •

edited

Loading

wanderview commented Sep 7, 2021

domenic commented Sep 7, 2021

wanderview commented Sep 7, 2021

wanderview commented Sep 7, 2021

lucacasonato commented Sep 7, 2021

lucacasonato commented Sep 7, 2021 •

edited

Loading

wanderview commented Sep 7, 2021

wanderview commented Sep 8, 2021

Tests and spec don't agree about base URL in match result inputs #133

Tests and spec don't agree about base URL in match result inputs #133

Comments

lucacasonato commented Sep 7, 2021

wanderview commented Sep 7, 2021

lucacasonato commented Sep 7, 2021

wanderview commented Sep 7, 2021

lucacasonato commented Sep 7, 2021 • edited Loading

wanderview commented Sep 7, 2021

domenic commented Sep 7, 2021

wanderview commented Sep 7, 2021

wanderview commented Sep 7, 2021

lucacasonato commented Sep 7, 2021

lucacasonato commented Sep 7, 2021 • edited Loading

wanderview commented Sep 7, 2021

wanderview commented Sep 8, 2021

Tests and spec don't agree about base URL in match result `inputs` #133

Tests and spec don't agree about base URL in match result `inputs` #133

lucacasonato commented Sep 7, 2021 •

edited

Loading

lucacasonato commented Sep 7, 2021 •

edited

Loading