Dedent only the raw array, and cook the dedented raw #60

Merged 7 commits on Sep 13, 2022


jridgewell
Member

Fixes #57.

This changes the algorithm so that we dedent only the raw array, which means escape sequences are treated as non-whitespace characters when calculating the common indentation. We then cook each dedented raw string to produce the cooked array.

The practical change is demonstrated as:

// A)
String.dedent`
  \x20 foo
`;

// B)
String.dedent`
  foo\n  bar
`

A) now produces a cooked `"··foo"` instead of `"foo"` (here `·` marks a space). The raw remains `"\\x20·foo"`.
B) now produces a cooked `"foo\n··bar"` instead of `"foo\nbar"`. The raw remains `"foo\\n··bar"`.
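
The two cases above can be reproduced with a rough sketch of "dedent the raw string, then cook it". This is not the spec text: `dedentRaw` and `cook` are hypothetical helpers, the dedent handles only a single string with no substitutions, the indent is measured by length rather than by matching whitespace characters, and `cook` covers only the common valid escapes (it does not return `undefined` for invalid ones).

```javascript
// Hypothetical helper (not the spec algorithm): dedent one raw template string.
function dedentRaw(raw) {
  // Simplification: assume the template starts and ends with a line break,
  // and drop the opening and closing lines.
  const body = raw.split("\n").slice(1, -1);
  let indent = Infinity;
  for (const line of body) {
    if (/^[ \t]*$/.test(line)) continue; // blank lines do not affect the indent
    indent = Math.min(indent, line.match(/^[ \t]*/)[0].length);
  }
  if (indent === Infinity) indent = 0;
  return body
    .map((line) => (/^[ \t]*$/.test(line) ? "" : line.slice(indent)))
    .join("\n");
}

// Hypothetical helper: cook a raw string, covering only common valid escapes.
function cook(raw) {
  const simple = { n: "\n", t: "\t", r: "\r", b: "\b", f: "\f", v: "\v", 0: "\0" };
  const escape = /\\(x[0-9a-fA-F]{2}|u[0-9a-fA-F]{4}|u\{[0-9a-fA-F]+\}|\r\n|[\s\S])/g;
  return raw.replace(escape, (match, esc) => {
    if (esc.length > 1 && esc[0] === "x") return String.fromCharCode(parseInt(esc.slice(1), 16));
    if (esc.length > 1 && esc.startsWith("u{")) return String.fromCodePoint(parseInt(esc.slice(2, -1), 16));
    if (esc.length > 1 && esc[0] === "u") return String.fromCharCode(parseInt(esc.slice(1), 16));
    if (/^[\r\n\u2028\u2029]/.test(esc)) return ""; // line continuation
    return esc in simple ? simple[esc] : esc; // e.g. \\ becomes \, \` becomes `
  });
}

// String.raw exposes the raw text, so \x20 and \n are still backslash sequences.
const rawA = dedentRaw(String.raw`
  \x20 foo
`);
const rawB = dedentRaw(String.raw`
  foo\n  bar
`);

console.log(JSON.stringify(rawA), JSON.stringify(cook(rawA))); // "\\x20 foo" "  foo"
console.log(JSON.stringify(rawB), JSON.stringify(cook(rawB))); // "foo\\n  bar" "foo\n  bar"
```

Because the indentation is measured on the raw text, `\x20` and `\n` count as ordinary content, and the whitespace they produce only appears after cooking.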

spec.emu (outdated)
1. Let _cooked_ be a new empty List.
1. For each element _str_ of _raw_, do
  1. If Type(_str_) is String, then
    1. Let _template_ be the string-concatenation of *"`"*, _str_, and *"`"*.
Contributor

There's no reason to do this concatenation; you can just parse the string as a |TemplateCharacters| instead of as a |NoSubstitutionTemplate|.

I don't love this implementation, though. It has weird implications for the normative behavior of String.dedent - for example, it rejects { raw: ['${'] }. While it's true that this isn't something which can appear in a template, there's no particular reason that String.dedent ought to reject it. It would be better to define a more permissive version of |TemplateCharacters| (sharing most of the productions, just getting rid of $ [lookahead ≠ {] and changing the last one to SourceCharacter but not \ or LineTerminator), and define TV over the new thing. I can send a PR for that if you'd like.
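
As a loose userland analogy for the concern above (not the spec mechanism), cooking a raw string by evaluating it between backticks shows the same kind of rejection. `cookByParsing` and `tryCook` are hypothetical names, and `new Function` is used purely for illustration; it should never be run on untrusted input.

```javascript
// Hypothetical stand-in for "parse the raw text as a template literal".
function cookByParsing(raw) {
  return new Function("return `" + raw + "`;")();
}

// Report whether the raw text could be cooked this way.
function tryCook(raw) {
  try {
    return { ok: true, value: cookByParsing(raw) };
  } catch (error) {
    return { ok: false, error: error.name };
  }
}

console.log(tryCook("foo\\n  bar")); // cooks normally
console.log(tryCook("${"));           // rejected with a SyntaxError
console.log(tryCook("`"));            // rejected with a SyntaxError
```

A more permissive grammar, as suggested, would let `${` and a bare backtick cook as themselves instead of failing.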

(Until we get a better method for it, I'm going to use String.dedent on non-template strings with something like String.dedent({ raw: ['\n' + input.replace(/\\/g, '\\\\') + '\n'] }), and I would expect that to work even if input happens to contain ${ or a literal backtick.)
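
The workaround in the parenthetical can be sketched as a small helper. `toDedentInput` is a hypothetical name, and the sketch only builds the argument object; it does not call `String.dedent`, which is a proposal and may not exist in a given runtime.

```javascript
// Hypothetical helper: wrap an arbitrary string so it can be passed as a
// template-strings-like object with a single raw segment.
function toDedentInput(input) {
  // Double every backslash so that cooking turns it back into one backslash.
  const escaped = input.replace(/\\/g, "\\\\");
  // Surround with line breaks, mimicking the opening and closing lines of a template.
  return { raw: ["\n" + escaped + "\n"] };
}

const input = "  C:\\temp ${x} `";
const template = toDedentInput(input);
console.log(JSON.stringify(template.raw[0]));
```

Only backslashes are escaped here, so `${` and the backtick reach the raw array unchanged, which is exactly the input the comment expects to be accepted.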

Member Author

> There's no reason to do this concatenation; you can just parse the string as a |TemplateCharacters| instead of as a |NoSubstitutionTemplate|.

Done.

> It would be better to define a more permissive version of |TemplateCharacters| (sharing most of the productions, just getting rid of $ [lookahead ≠ {] and changing the last one to SourceCharacter but not \ or LineTerminator), and define TV over the new thing. I can send a PR for that if you'd like.

My first attempt was to iterate over all the chars, look for \, and then try to parse the next few chars, but it quickly became unwieldy because of how many branches are needed to cover TemplateEscapeSequence and NotEscapeSequence. If you can make it work better, I'm happy to accept the PR.

jridgewell and others added 6 commits September 12, 2022 22:49
@jridgewell jridgewell merged commit 94e0773 into master Sep 13, 2022
@jridgewell jridgewell deleted the escapes branch September 13, 2022 02:51
jridgewell added a commit that referenced this pull request Sep 13, 2022
After #60, we no longer need to handle `undefined` in the input template strings array. That's because a well-formed `raw` array can **only** contain strings (only the `cooked` array can contain `undefined`).

Because we no longer have `undefined`s in our input, several of the AOs (abstract operations) can remove their branches for handling an empty block of lines.
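
The difference between the two arrays can be seen with a plain tagged template. This is a minimal sketch; `inspect` is a hypothetical tag function used only for illustration.

```javascript
// Hypothetical tag that returns the first cooked and raw segments.
function inspect(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

const valid = inspect`\x20 foo`;
const invalid = inspect`\unicode`; // \u without hex digits is an invalid escape

console.log(typeof valid.cooked, typeof valid.raw);     // both are strings
console.log(typeof invalid.cooked, typeof invalid.raw); // cooked is undefined, raw is a string
```

In a tagged template, an invalid escape leaves a hole in the cooked array, while the raw array still holds the source text.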
jridgewell added a commit to jridgewell/string-dedent that referenced this pull request Sep 29, 2022
Successfully merging this pull request may close these issues.

Raw can mismatch cooked string in unexpected ways