The spec is unclear on the relationship of percent-encoding and Unicode #230

Closed
hsivonen opened this issue Sep 12, 2023 · 1 comment · Fixed by #247

hsivonen commented Sep 12, 2023

The spec says:

A text directive is a kind of directive representing a range of text to be indicated to the user. It is a struct that consists of four strings: start, end, prefix, and suffix.

So start, end, prefix, and suffix are defined as strings, though without a link to the Infra notion of "string".

The spec then says:

Set retVal’s prefix to the percent-decoding of the result of removing the last character from potential prefix.

Percent-decode returns a byte sequence but the spec assigns the return value to prefix, which is a string. (Likewise for start, end, and suffix.)

The spec should say whether the byte sequence is converted to a string by applying "UTF-8 decode without BOM" or by applying "UTF-8 decode without BOM or fail" (and, if the latter, what happens on failure).
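
To make the difference between the two options concrete, here is a minimal Python sketch. It is an approximation under stated assumptions, not the spec's algorithm: the function names are hypothetical, `urllib.parse.unquote_to_bytes` stands in for the URL Standard's percent-decode, and Python's `utf-8` codec with `errors="replace"` or `errors="strict"` stands in for the two Encoding Standard algorithms.

```python
from typing import Optional
from urllib.parse import unquote_to_bytes


def percent_decode(value: str) -> bytes:
    """Percent-decode a string into raw bytes (stand-in for the URL Standard step)."""
    return unquote_to_bytes(value)


def utf8_decode_without_bom(data: bytes) -> str:
    """Lossy variant: invalid byte sequences become U+FFFD, a leading BOM is kept."""
    return data.decode("utf-8", errors="replace")


def utf8_decode_without_bom_or_fail(data: bytes) -> Optional[str]:
    """Strict variant: returns None (standing in for 'failure') on invalid UTF-8."""
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    valid = percent_decode("caf%C3%A9")  # well-formed UTF-8 for "café"
    invalid = percent_decode("%FF")      # 0xFF is never valid in UTF-8

    print(utf8_decode_without_bom(valid))
    print(repr(utf8_decode_without_bom(invalid)))
    print(utf8_decode_without_bom_or_fail(invalid))
```

For well-formed input both variants agree. They diverge only on invalid byte sequences, which is exactly the case the spec text leaves undefined: the lossy variant always yields a string, while the strict variant needs a defined outcome for failure (for example, treating the directive as invalid).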

@hsivonen

Evidently Chrome uses UTF-8 decode without BOM.

bokand added a commit to bokand/ScrollToTextFragment that referenced this issue Nov 30, 2023
This commit overhauls the parsing steps to avoid using the EBNF grammar
for validity, instead specifying that imperatively. It also moves
parsing to happen earlier in the process so that we pass around parsed
Text Directive objects.

Also makes the steps more precise, referring to infra types and
correctly decoding the strings.

Fixes WICG#221
Fixes WICG#230
bokand added a commit to bokand/ScrollToTextFragment that referenced this issue Dec 13, 2023
This commit overhauls the parsing steps to avoid using the EBNF grammar
for validity, instead specifying that imperatively. It also moves
parsing to happen earlier in the process so that we pass around parsed
Text Directive objects.

Also makes the steps more precise, referring to infra types and
correctly decoding the strings.

Fixes WICG#221
Fixes WICG#230
bokand added a commit that referenced this issue Dec 13, 2023
* Specify parsing imperatively

This commit overhauls the parsing steps to avoid using the EBNF grammar
for validity, instead specifying that imperatively. It also moves
parsing to happen earlier in the process so that we pass around parsed
Text Directive objects.

Also makes the steps more precise, referring to infra types and
correctly decoding the strings.

Fixes #221
Fixes #230

* Fix and make grammar non-normative

The grammar is now provided solely as a convenience so this makes the
section non-normative. Also fixes it so that UnknownDirective doesn't
subsume TextDirective.

Fixes #220