The spec is unclear on the relationship of percent-encoding and Unicode #230

hsivonen · 2023-09-12T12:24:27Z

The spec says:

A text directive is a kind of directive representing a range of text to be indicated to the user. It is a struct that consists of four strings: start, end, prefix, and suffix.

So start, end, prefix, and suffix are defined as strings, though without a link to the Infra notion of "string".

The spec then says:

Set retVal’s prefix to the percent-decoding of the result of removing the last character from potential prefix.

Percent-decode returns a byte sequence but the spec assigns the return value to prefix, which is a string. (Likewise for start, end, and suffix.)

The spec should say whether the byte sequence is converted to a string by applying "UTF-8 decode without BOM" or by applying "UTF-8 decode without BOM or fail" (and, if the latter, what happens on failure).
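
To make the difference between the two candidate algorithms concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not spec text: the function names are hypothetical, `urllib.parse.unquote_to_bytes` stands in for the URL Standard's percent-decode, and Python's `errors="replace"` handling is assumed to approximate the Encoding Standard's U+FFFD substitution.

```python
from typing import Optional
from urllib.parse import unquote_to_bytes


def percent_decode(value: str) -> bytes:
    # Stand-in for the URL Standard's percent-decode: the result is a
    # byte sequence, not a string. Malformed "%" sequences are left as-is.
    return unquote_to_bytes(value)


def decode_without_bom(raw: bytes) -> str:
    # Approximates "UTF-8 decode without BOM": never fails, and ill-formed
    # sequences become U+FFFD. A leading BOM is kept, not stripped.
    return raw.decode("utf-8", errors="replace")


def decode_without_bom_or_fail(raw: bytes) -> Optional[str]:
    # Approximates "UTF-8 decode without BOM or fail": returns None
    # (failure) on ill-formed input, so the caller must define what
    # failure means for the directive.
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


well_formed = percent_decode("caf%C3%A9")  # valid UTF-8 bytes for "café"
ill_formed = percent_decode("%FF")         # 0xFF never appears in UTF-8

print(decode_without_bom(well_formed))
print(decode_without_bom_or_fail(well_formed))
print(repr(decode_without_bom(ill_formed)))
print(decode_without_bom_or_fail(ill_formed))
```

Both functions agree on well-formed input. They differ only on ill-formed bytes such as `%FF`: the first yields a string containing U+FFFD, while the second reports failure, which is the case the spec would then have to handle.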

hsivonen · 2023-09-12T12:31:03Z

Evidently Chrome uses UTF-8 decode without BOM.

* Specify parsing imperatively

  This commit overhauls the parsing steps to avoid using the EBNF grammar for validity, instead specifying that imperatively. It also moves parsing to happen earlier in the process so that we pass around parsed Text Directive objects. Also makes the steps more precise, referring to infra types and correctly decoding the strings.

  Fixes #221
  Fixes #230

* Fix and make grammar non-normative

  The grammar is now provided solely as a convenience so this makes the section non-normative. Also fixes it so that UnknownDirective doesn't subsume TextDirective.

  Fixes #220

bokand added the spec issue label Nov 17, 2023

bokand mentioned this issue Dec 13, 2023

[Spec] Overhaul directive parsing #247

Merged

bokand closed this as completed in #247 Dec 13, 2023
