From e679f89b3482544b2d213982451a9995ce971e2a Mon Sep 17 00:00:00 2001
From: Mike West
Date: Fri, 25 May 2018 10:42:07 +0200
Subject: [PATCH 1/7] Define hosts' public suffix and registrable domain.

This patch is another attempt at whatwg/url#72, and defers most of the actual
work to the algorithms defined at https://publicsuffix.org/list/.
---
 url.bs | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)

diff --git a/url.bs b/url.bs
index dd5d571d..209e3c15 100644
--- a/url.bs
+++ b/url.bs
@@ -272,6 +272,97 @@ for further processing.
 U+0020 SPACE, U+0023 (#), U+0025 (%), U+002F (/), U+003A (:), U+003F (?), U+0040 (@), U+005B ([),
 U+005C (\), or U+005D (]).
+

A host's public suffix is +the portion of a host which is controlled by a registrar, public or otherwise. To +obtain host's public suffix, run the following steps: + +

    +
  1. Let parsed be the result of host parsing host. + +

  2. If parsed is not a domain, return the empty string. + +

  3. Return the public suffix obtained by executing the + algorithm defined by the Public Suffix List. [[!PSL]]. +

+ +

A host's registrable +domain is a formally valid domain name that could be registered at a registry. To obtain +host's registrable domain, run the following steps: + +

    +
  1. Let parsed be the result of host parsing host. + +

  2. If parsed is not a domain, return the empty string. + +

  3. If parsed's public suffix is host, return the empty + string. + +

  4. Return the registrable domain obtained by executing the + algorithm defined by the Public Suffix List. [[!PSL]]. +

+ +
+ + + + + + + + + + + + + +
Host + Public Suffix + Registrable Domain +
com + com + +
example.com + com + example.com +
www.example.com + com + example.com +
sub.www.example.com + com + example.com +
EXAMPLE.COM + com + example.com +
github.io + github.io + +
whatwg.github.io + github.io + whatwg.github.io +
whatwg.github.io + github.io + whatwg.github.io +
إختبار + xn--kgbechtv + +
example.إختبار + xn--kgbechtv + example.xn--kgbechtv +
sub.example.إختبار + xn--kgbechtv + example.xn--kgbechtv +
+
+ +

Two hosts, A and B are said to be +same-site with each other if either of the +following statements are true: + +

IDNA

From cbf9063ffa7d3d8a47a4f21c2ec198dd56ea4be9 Mon Sep 17 00:00:00 2001
From: Mike West
Date: Fri, 25 May 2018 13:22:37 +0200
Subject: [PATCH 2/7] fixup second pass

---
 url.bs | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/url.bs b/url.bs
index 209e3c15..80f48dfc 100644
--- a/url.bs
+++ b/url.bs
@@ -277,12 +277,11 @@ the portion of a host which is controlled by a registrar, public or
 obtain host's public suffix, run the following steps:
    -
  1. Let parsed be the result of host parsing host. - -

  2. If parsed is not a domain, return the empty string. +

  3. If host is not a domain, return null.

  4. Return the public suffix obtained by executing the - algorithm defined by the Public Suffix List. [[!PSL]]. + algorithm defined by the Public Suffix List on + host. [[!PSL]].

A host's registrable @@ -290,21 +289,18 @@ domain is a formally valid domain name that could be registered at a regis host's registrable domain, run the following steps:

    -
  1. Let parsed be the result of host parsing host. - -

  2. If parsed is not a domain, return the empty string. - -

  3. If parsed's public suffix is host, return the empty - string. +

  4. If host's public suffix is null, or + equals host, return null.

  5. Return the registrable domain obtained by executing the - algorithm defined by the Public Suffix List. [[!PSL]]. + algorithm defined by the Public Suffix List on + host. [[!PSL]].

- @@ -355,11 +351,11 @@ domain is a formally valid domain name that could be registered at a regis

Two hosts, A and B are said to be -same-site with each other if either of the +same site with each other if either of the following statements are true:

From 6ea048dd2dae64b979f1d0f21e7b814f90b8736a Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Fri, 25 May 2018 15:24:22 +0200
Subject: [PATCH 3/7] minor cleanup

---
 url.bs | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/url.bs b/url.bs
index 80f48dfc..295dfe80 100644
--- a/url.bs
+++ b/url.bs
@@ -272,25 +272,26 @@ for further processing.
 U+0020 SPACE, U+0023 (#), U+0025 (%), U+002F (/), U+003A (:), U+003F (?), U+0040 (@), U+005B ([),
 U+005C (\), or U+005D (]).
-

A host's public suffix is -the portion of a host which is controlled by a registrar, public or otherwise. To -obtain host's public suffix, run the following steps: +

A host's public suffix is the portion of a +host which is controlled by a registrar, public or otherwise. To obtain +host's public suffix, run these steps:

    -
  1. If host is not a domain, return null. +

  2. If host is not a domain, then return null.

  3. Return the public suffix obtained by executing the algorithm defined by the Public Suffix List on host. [[!PSL]].

-

A host's registrable -domain is a formally valid domain name that could be registered at a registry. To obtain -host's registrable domain, run the following steps: +

A host's registrable domain is a domain that could +be registered at a registry. To obtain host's registrable domain, run +these steps:

    -
  1. If host's public suffix is null, or - equals host, return null. +

  2. If host's public suffix is null or host's + public suffix equals host, then return + null.

  3. Return the registrable domain obtained by executing the algorithm defined by the Public Suffix List on @@ -300,9 +301,9 @@ domain is a formally valid domain name that could be registered at a regis

Host + Host Input Public Suffix Registrable Domain
- -
Host Input - Public Suffix - Registrable Domain + Host input + Public suffix + Registrable domain
com com @@ -351,15 +352,15 @@ domain is a formally valid domain name that could be registered at a regis

Two hosts, A and B are said to be -same site with each other if either of the -following statements are true: +same site with each other if either of the following statements are true:

+

IDNA

The domain to ASCII algorithm, given a domain

From cc03e7b671db1e68140078453c1c7e03e8006348 Mon Sep 17 00:00:00 2001
From: Mike West
Date: Mon, 4 Jun 2018 09:32:04 +0200
Subject: [PATCH 4/7] fixup examples and warnings

---
 url.bs | 48 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 37 insertions(+), 11 deletions(-)

diff --git a/url.bs b/url.bs
index 295dfe80..d953235f 100644
--- a/url.bs
+++ b/url.bs
@@ -273,7 +273,7 @@ U+0020 SPACE, U+0023 (#), U+0025 (%), U+002F (/), U+003A (:), U+003F (?), U+0040
 U+005C (\), or U+005D (]).

A host's public suffix is the portion of a -host which is controlled by a registrar, public or otherwise. To obtain +host which is included on the Public Suffix List [[!PSL]]. To obtain host's public suffix, run these steps:

    @@ -290,8 +290,7 @@ these steps:
    1. If host's public suffix is null or host's - public suffix equals host, then return - null. + public suffix equals host, then return null.

    2. Return the registrable domain obtained by executing the algorithm defined by the Public Suffix List on @@ -307,7 +306,7 @@ these steps:

com com - + null
example.com com @@ -327,11 +326,7 @@ these steps:
github.io github.io - -
whatwg.github.io - github.io - whatwg.github.io + null
whatwg.github.io github.io @@ -339,7 +334,7 @@ these steps:
إختبار xn--kgbechtv - + null
example.إختبار xn--kgbechtv @@ -355,11 +350,42 @@ these steps: same site with each other if either of the following statements are true: +
+

Assuming that suffix.example is a public suffix, and that + example.com is not: + +

    +
  • example.com, sub.example.com, other.example.com, + sub.sub.example.com, and sub.other.example.com are all same site + with each other (and themselves), as each host's registrable domain is + example.com. + +

  • registrable.suffix.example, sub.registrable.suffix.example, + other.registrable.suffix.example, sub.sub.registrable.suffix.example, + and sub.other.registrable.suffix.example are all same site with each other + (and themselves), as each host's registrable domain is + registrable.suffix.example. + +

  • example.com and registrable.suffix.example are not same + site with each other, as their registrable domains differ. + +

  • suffix.example is not same site with suffix.example, as + it is a public suffix, and therefore has a null registrable + domain. +

+
+ +

Specifications should avoid relying on "public suffix", +"registrable domain", and "same site". The public suffix list will diverge +from client to client, and cannot be relied-upon to provide a hard security boundary.

+

IDNA
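
Reviewer sketch (not part of the patch): the public suffix and registrable domain steps as they stand after this commit could be illustrated in Python roughly as follows. This assumes a tiny hard-coded stand-in for the Public Suffix List and a host that has already been parsed to ASCII. The names TOY_SUFFIXES, is_domain, public_suffix, and registrable_domain are hypothetical, and the wildcard and exception rules of the real list are ignored.

```python
import ipaddress
from typing import Optional

# Hypothetical toy stand-in for the Public Suffix List. The real list is far
# larger and also has wildcard and exception rules, which this sketch ignores.
TOY_SUFFIXES = {"com", "io", "github.io"}


def is_domain(host: str) -> bool:
    """Rough stand-in for the "host is a domain" check (assumption)."""
    if not host:
        return False
    try:
        ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        return True  # not an IP address, so treat it as a domain
    return False


def public_suffix(host: str) -> Optional[str]:
    # Step 1: if host is not a domain, return null.
    if not is_domain(host):
        return None
    labels = host.lower().split(".")
    # Step 2: try the longest candidate first so "github.io" wins over "io".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in TOY_SUFFIXES:
            return candidate
    # No rule matched: fall back to the last label (the list's default "*" rule).
    return labels[-1]


def registrable_domain(host: str) -> Optional[str]:
    suffix = public_suffix(host)
    host = host.lower()
    # Step 1: a null suffix, or a host that is itself a suffix, gives null.
    if suffix is None or suffix == host:
        return None
    # Step 2: the public suffix plus the one label immediately before it.
    label = host[: -len(suffix) - 1].split(".")[-1]
    return f"{label}.{suffix}"
```

With this toy list, whatwg.github.io maps to the public suffix github.io and to the registrable domain whatwg.github.io, while github.io itself has a null (None) registrable domain, in line with the table rows in the patch.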

From bd35e7e1decefa692332f3af06bb658c983acdbc Mon Sep 17 00:00:00 2001
From: Mike West
Date: Mon, 4 Jun 2018 10:37:55 +0200
Subject: [PATCH 5/7] fixup more feedback.

---
 url.bs | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/url.bs b/url.bs
index d953235f..b42bdfc8 100644
--- a/url.bs
+++ b/url.bs
@@ -284,9 +284,10 @@ U+005C (\), or U+005D (]).
 host. [[!PSL]].
-

A host's registrable domain is a domain that could -be registered at a registry. To obtain host's registrable domain, run -these steps: +

A host's registrable domain is a domain formed by +the most specific public suffix, along with the domain label immediately preceeding it. If no such +label is available, the registrable domain is null. To obtain host's +registrable domain, run these steps:

  1. If host's public suffix is null or host's @@ -384,7 +385,11 @@ these steps:

    Specifications should avoid relying on "public suffix", "registrable domain", and "same site". The public suffix list will diverge -from client to client, and cannot be relied-upon to provide a hard security boundary.

    +from client to client, and cannot be relied-upon to provide a hard security boundary. Specifications +which ignore this advice are encouraged to carefully consider whether URLs' schemes ought to be +incorporated into any decision made based upon whether or not two hosts are same +site. HTML's same origin-domain concept is a reasonable example of this consideration in +practice.

    IDNA
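
Reviewer sketch (not part of the patch): one way to fold the scheme into a same-site decision, as the note added in this commit suggests. The statements that define "same site" are not visible in this excerpt, so the comparison used here (equal, non-null registrable domains) is an assumption chosen to agree with the listed examples. TOY_REGISTRABLE, same_site, and same_site_with_scheme are hypothetical names, and the lookup table stands in for a real registrable-domain algorithm.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical lookup standing in for a real registrable-domain algorithm.
# It assumes suffix.example is a public suffix, as in the patch's example.
TOY_REGISTRABLE = {
    "example.com": "example.com",
    "sub.example.com": "example.com",
    "registrable.suffix.example": "registrable.suffix.example",
    "sub.registrable.suffix.example": "registrable.suffix.example",
    "suffix.example": None,
}

Lookup = Callable[[str], Optional[str]]


def same_site(a: str, b: str, lookup: Lookup = TOY_REGISTRABLE.get) -> bool:
    """Assumed rule: registrable domains are equal and non-null."""
    reg_a, reg_b = lookup(a), lookup(b)
    return reg_a is not None and reg_a == reg_b


def same_site_with_scheme(
    url_a: str, url_b: str, lookup: Lookup = TOY_REGISTRABLE.get
) -> bool:
    """Same-site check that also requires the two URLs' schemes to match."""
    parts_a, parts_b = urlsplit(url_a), urlsplit(url_b)
    if parts_a.scheme != parts_b.scheme:
        return False
    return same_site(parts_a.hostname or "", parts_b.hostname or "", lookup)
```

Under these assumptions, https://example.com/ and https://sub.example.com/ compare as same site, while https://example.com/ and http://sub.example.com/ do not, because only the scheme differs.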

From 2be718c3194163834ad1ac471caa9bd519db8f3e Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Mon, 4 Jun 2018 14:13:48 +0200
Subject: [PATCH 6/7] nits

---
 url.bs | 39 +++++++++++++++++++--------------------
 1 file changed, 19 insertions(+), 20 deletions(-)

diff --git a/url.bs b/url.bs
index b42bdfc8..d5c0f23b 100644
--- a/url.bs
+++ b/url.bs
@@ -273,8 +273,8 @@ U+0020 SPACE, U+0023 (#), U+0025 (%), U+002F (/), U+003A (:), U+003F (?), U+0040
 U+005C (\), or U+005D (]).

    A host's public suffix is the portion of a -host which is included on the Public Suffix List [[!PSL]]. To obtain -host's public suffix, run these steps: +host which is included on the Public Suffix List. To obtain +host's public suffix, run these steps: [[!PSL]]

    1. If host is not a domain, then return null. @@ -285,9 +285,8 @@ U+005C (\), or U+005D (]).

    A host's registrable domain is a domain formed by -the most specific public suffix, along with the domain label immediately preceeding it. If no such -label is available, the registrable domain is null. To obtain host's -registrable domain, run these steps: +the most specific public suffix, along with the domain label immediately preceeding it, if any. To +obtain host's registrable domain, run these steps:

    1. If host's public suffix is null or host's @@ -295,7 +294,7 @@ label is available, the registrable domain is null. To obtain

      Return the registrable domain obtained by executing the algorithm defined by the Public Suffix List on - host. [[!PSL]]. + host. [[!PSL]]

    @@ -351,45 +350,45 @@ label is available, the registrable domain is null. To obtain same site with each other if either of the following statements are true:
    -

    Assuming that suffix.example is a public suffix, and that +

    Assuming that suffix.example is a public suffix and that example.com is not:

    -

    Specifications should avoid relying on "public suffix", +

    Specifications should avoid depending on "public suffix", "registrable domain", and "same site". The public suffix list will diverge from client to client, and cannot be relied-upon to provide a hard security boundary. Specifications which ignore this advice are encouraged to carefully consider whether URLs' schemes ought to be -incorporated into any decision made based upon whether or not two hosts are same -site. HTML's same origin-domain concept is a reasonable example of this consideration in -practice. +incorporated into any decision made based upon whether or not two hosts are +same site. HTML's same origin-domain concept is a reasonable example of this +consideration in practice.

    IDNA

From 8828de3bfb6c4254b824f0aefacad14de2aaf160 Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Mon, 4 Jun 2018 14:45:31 +0200
Subject: [PATCH 7/7] the nits have nits

---
 url.bs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/url.bs b/url.bs
index d5c0f23b..a94982bc 100644
--- a/url.bs
+++ b/url.bs
@@ -364,13 +364,13 @@ obtain host's registrable domain, run these steps:
    • example.com, sub.example.com, other.example.com, sub.sub.example.com, and sub.other.example.com are all same site - with each other (and themselves), as each their registrable domain is + with each other (and themselves), as their registrable domains are example.com.

    • registrable.suffix.example, sub.registrable.suffix.example, other.registrable.suffix.example, sub.sub.registrable.suffix.example, and sub.other.registrable.suffix.example are all same site with each other - (and themselves), as each their registrable domain is + (and themselves), as their registrable domains are registrable.suffix.example.

    • example.com and registrable.suffix.example are not