Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TextDecoder regressed for ascii, windows-1252, ... encodings in v23.4.0 #56219

Open
lutzroeder opened this issue Dec 11, 2024 · 6 comments · May be fixed by #56222
Open

TextDecoder regressed for ascii, windows-1252, ... encodings in v23.4.0 #56219

lutzroeder opened this issue Dec 11, 2024 · 6 comments · May be fixed by #56222
Labels
confirmed-bug Issues with confirmed bugs.

Comments

@lutzroeder
Copy link

lutzroeder commented Dec 11, 2024

Version

23.4.0

Platform

macOS 15.1.1 (24B91)

Subsystem

No response

What steps will reproduce the bug?

~: node
Welcome to Node.js v23.4.0.
> b = new Uint8Array([97, 108, 101])
> decoder = new TextDecoder('ascii')
> decoder.decode(b)
<Buffer 61 6c 65>

How often does it reproduce? Is there a required condition?

Reproduces reliably.

What is the expected behavior? Why is that the expected behavior?

Returns a string instead of a buffer. See output from v23.3.0:

~: node
Welcome to Node.js v23.3.0.
> b = new Uint8Array([97, 108, 101])
> decoder = new TextDecoder('ascii')
> decoder.decode(b)
'ale'

What do you see instead?

Returns a buffer.

Additional information

@mertcanaltin @RafaelGSS @anonrig @jasnell @lemire @targos #55275

@lutzroeder lutzroeder changed the title TextDecoder ascii regressed in 23.4.0 TextDecoder regressed for ascii encoding in v23.4.0 Dec 11, 2024
@lemire lemire added the confirmed-bug Issues with confirmed bugs. label Dec 11, 2024
@lemire
Copy link
Member

lemire commented Dec 11, 2024

Verified.

> new TextDecoder('us-ascii').decode(new Uint8Array([0x32, 0x44]))
<Buffer 32 44>
> new TextDecoder('windows-1252').decode(new Uint8Array([0x32, 0x44]))
<Buffer 32 44>
> new TextDecoder('latin1').decode(new Uint8Array([0x32, 0x44]))
<Buffer 32 44>

@lemire lemire changed the title TextDecoder regressed for ascii encoding in v23.4.0 TextDecoder regressed for ascii, windows-1252, ... encodings in v23.4.0 Dec 11, 2024
@lemire
Copy link
Member

lemire commented Dec 11, 2024

This is related to #55275

@mertcanaltin Can you check?

I think that the C++ function returns a buffer, not a string.

@mertcanaltin
Copy link
Member

This is related to #55275

@mertcanaltin Can you check?

I think that the C++ function returns a buffer, not a string.

hello I'm looking into this, I'll post an improvement

@mertcanaltin
Copy link
Member

mertcanaltin commented Dec 11, 2024

hello, I am fixed this problem after then ı testet my local enviorement, ı see improvement this problem

➜  node git:(mert/fast-path-fix) ./node
Welcome to Node.js v24.0.0-pre.
Type ".help" for more information.
>  b = new Uint8Array([97, 108, 101])
Uint8Array(3) [ 97, 108, 101 ]
> decoder = new TextDecoder('ascii')
TextDecoder { encoding: 'windows-1252', fatal: false, ignoreBOM: false }
> decoder.decode(b)
'ale'
> 

many thanks for reporting the bug and for the suggestions.

@lutzroeder
Copy link
Author

@mertcanaltin @lemire since this is an impactful regression rolling out as part of v23.4.0, are there plans to fast follow shipping a fix in a v23.4.1 release?

@lemire
Copy link
Member

lemire commented Dec 12, 2024

@lutzroeder It is definitively getting fixed in the next release. As to whether this next release is labelled v23.4.1 or not... I do not know.

There is also a related issue: nodejs/performance#183

That last one is troubling as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants