This repository has been archived by the owner on May 9, 2019. It is now read-only.

Wrong size when left padding a unicode string #58

Closed
dubzzz opened this issue Mar 27, 2018 · 3 comments

Comments

@dubzzz
Contributor

dubzzz commented Mar 27, 2018

There is an inconsistency when padding strings containing Unicode characters outside the BMP (Basic Multilingual Plane), i.e. code points encoded as two code units in UTF-16.

leftPad('a\u{1f431}b', 4, 'x') => 'a\u{1f431}b' // in: 3 code points, out: 3 code points
leftPad('abc', 4, '\u{1f431}') => '\u{1f431}abc' // in: 3 code points, out: 4 code points
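
The two results above presumably come down to String.length counting UTF-16 code units rather than code points. A minimal sketch of that difference follows; the helper names codeUnits and codePoints are illustrative only and are not part of left-pad:

```javascript
// U+1F431 (cat face) lies outside the BMP, so UTF-16 stores it as a surrogate pair.
const catFace = '\u{1f431}';

// String.length counts UTF-16 code units.
const codeUnits = (s) => s.length;
// Spreading a string iterates it by code point.
const codePoints = (s) => [...s].length;

console.log(codeUnits(catFace), codePoints(catFace));   // 2 1

const input = 'a' + catFace + 'b';
console.log(codeUnits(input), codePoints(input));       // 4 3
```

Since the input already measures 4 code units, a padding routine that compares the target length against String.length sees nothing left to pad.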

You should maybe specify that left-pad does not treat code points outside the BMP as single characters.

Failure found using property based testing:
https://runkit.com/dubzzz/5ab9f3d8cc861f0012852eff

@stevemao
Member

You should maybe specify that left-pad does not treat code points outside the BMP as single characters.

Let's add it to the docs. PR welcome

@dubzzz
Contributor Author

dubzzz commented Apr 1, 2018

I had a look at how recent versions of the ECMAScript specification define padStart. They chose to treat a code point outside the BMP as two distinct characters (UTF-16 code units).

With padStart on Chrome and Firefox I get the following:

'a\u{1f431}b'.padStart(4, 'x') => "a🐱b"
'abc'.padStart(4, '\u{1f431}') => "\ud83dabc"
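
The second result can be checked directly: padStart truncates the filler to the number of missing code units, so only the first half of the surrogate pair is kept. A small sketch, assuming an engine that supports String.prototype.padStart (variable names are illustrative):

```javascript
const filler = '\u{1f431}'; // two UTF-16 code units: \ud83d \udc31

// Only one code unit is missing, so the filler is cut in the middle of the pair.
const padded = 'abc'.padStart(4, filler);

console.log(padded.length);                      // 4
console.log(padded.charCodeAt(0).toString(16));  // d83d (a lone high surrogate)
console.log(padded === '\ud83dabc');             // true

// The first case is left untouched because its length is already 4 code units.
const unchanged = 'a\u{1f431}b'.padStart(4, 'x');
console.log(unchanged === 'a\u{1f431}b');        // true
```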

@dubzzz
Contributor Author

dubzzz commented Apr 1, 2018

If the choice for left-pad is to be consistent with padStart, then the only way to handle this case would be to solve the issue of the third argument not accepting multiple characters.

Otherwise, if the choice is for leftPad to work on code points (and not code units), the fix consists in bypassing String.length and measuring the length of the input string manually (only code units in the range \ud800 to \udfff can come in pairs). The padding code itself can then stay the same.
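
A rough sketch of that second option, for illustration only. The names codePointLength and leftPadByCodePoint are hypothetical and this is not left-pad's actual implementation; it also assumes the pad argument is a single code point:

```javascript
// Hypothetical helper: counts code points by treating each valid
// surrogate pair (high \ud800-\udbff followed by low \udc00-\udfff) as one.
function codePointLength(str) {
  let count = 0;
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < str.length) {
      const next = str.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) i++; // skip the low surrogate
    }
    count++;
  }
  return count;
}

// Hypothetical code-point-aware padding; assumes `ch` is one code point.
function leftPadByCodePoint(str, len, ch) {
  str = String(str);
  const fill = ch === undefined || ch === '' ? ' ' : String(ch);
  const missing = len - codePointLength(str);
  return missing > 0 ? fill.repeat(missing) + str : str;
}

console.log(leftPadByCodePoint('a\u{1f431}b', 4, 'x') === 'xa\u{1f431}b');   // true
console.log(leftPadByCodePoint('abc', 4, '\u{1f431}') === '\u{1f431}abc');   // true
```

With this measurement both examples from the original report end up with 4 code points, at the cost of no longer matching padStart.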
