[OTP 26] Unexpected conversion when transferring data over stdout #7132

Closed
robertoaloi opened this issue Apr 18, 2023 · 7 comments · Fixed by #7211

Comments

@robertoaloi
Contributor

Under OTP 26-RC3, an unexpected conversion (latin1 to unicode?) seems to be happening behind the scenes when transferring data via stdout. I put together a minimal reproducible example showing the issue:

https://github.com/robertoaloi/otp-26-stdout/

The repo consists of two tiny escripts: a client and a server. Upon request, the server sends EETF-encoded data to the client via stdout. This works under OTP 25, but the data arrives corrupted under OTP 26-RC3.

Please refer to the README for how to run the sample code, but essentially this is what happens:

OTP 25:

asdf global erlang 25.3
./client
<<131,104,4,100,0,2,105,115,100,0,4,116,104,105,115,100,0,1,97,100,0,3,98,117,103>>
{is,this,a,bug}%

OTP 26:

asdf global erlang 26.0-rc3
./client
<<194,131,104,4,119,2,105,115,119,4,116,104,105,115,119,1,97,119,3,98,117,103>>
escript: exception error: bad argument
  in function  binary_to_term/1
     called as binary_to_term(<<194,131,104,4,119,2,105,115,119,4,116,104,105,
                                115,119,1,97,119,3,98,117,103>>)
     *** argument 1: invalid external representation of a term

Things I tried (which didn't help):

  • Various emulator flags, trying to enforce latin1
  • Various io:setopts configurations
  • Specifying {minor_version, 1} in term_to_binary/2 (because atoms are UTF-8 encoded by default in OTP 26)

What could explain the inconsistency?

@garazdawi
Contributor

I got curious about this so I dug a bit in the code to figure out why this is happening.

Before 26, the user module handled all escript I/O. It defaulted to latin1 mode, which simply passed the bytes sent by the program through to stdout. In 26, the group module handles escript I/O and forwards it to user_drv and, ultimately, prim_tty. In your example, group converts the latin1 string to unicode before sending it to user_drv, because user_drv only accepts unicode data.
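
The conversion described above can be reproduced without any I/O. The following is an illustrative sketch only: the module and function names are hypothetical, and unicode:characters_to_binary/3 stands in for whatever group does internally (an assumption, not a reading of the group source). It takes the payload from the OTP 26 dump with the leading 194 removed, re-encodes it from latin1 to UTF-8, and arrives at the bytes the client received. The 100 vs 119 atom tags in the two dumps reflect the change in default atom encoding in OTP 26 and are unrelated; the only byte above 127 is the leading 131, which becomes 194,131.

```erlang
-module(stdout_conv_demo).
-export([sent/0, as_utf8/1, decodes/1]).

%% Payload taken from the OTP 26 dump in this issue, minus the leading 194.
sent() ->
    <<131,104,4,119,2,105,115,119,4,116,104,105,115,
      119,1,97,119,3,98,117,103>>.

%% Treat each byte as a Latin-1 code point and re-encode it as UTF-8.
%% Bytes below 128 are unchanged; 131 becomes the two bytes 194,131.
%% Assumption: this mirrors the effect of the conversion, not its actual code path.
as_utf8(Bytes) when is_binary(Bytes) ->
    unicode:characters_to_binary(Bytes, latin1, utf8).

%% True if the bytes are a valid external term, false if binary_to_term/1
%% raises badarg.
decodes(Bytes) ->
    try binary_to_term(Bytes) of
        _ -> true
    catch
        error:badarg -> false
    end.
```

In a shell, stdout_conv_demo:as_utf8(stdout_conv_demo:sent()) should yield the binary starting with 194,131 shown in the OTP 26 output, and decodes/1 should return true for the original payload and false for the converted one.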

TL;DR: with the OTP 26 shell there is no way to send raw bytes to stdout. It might be a good idea to change group so that, when the encoding is set to latin1 using io:setopts and a {put_chars,latin1,...} request is made, the data is forwarded unchanged to user_drv and onward.
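
From the caller's side, the proposal might look roughly like the sketch below. The module name is hypothetical. It assumes that file:write/2 on an I/O device issues a {put_chars,latin1,...} request (my understanding of the file module, not something stated in this thread), and whether the bytes actually reach stdout unmodified depends on the OTP version and on whether a change like the one proposed has landed.

```erlang
-module(raw_stdout_sketch).
-export([emit/1]).

%% Switch standard_io to latin1, then write the encoded term as bytes.
%% Assumption: file:write/2 sends the binary as a latin1 put_chars request,
%% which is the case the proposal would forward unchanged.
emit(Term) ->
    ok = io:setopts(standard_io, [{encoding, latin1}]),
    file:write(standard_io, term_to_binary(Term)).
```

A server escript would call raw_stdout_sketch:emit({is,this,a,bug}) in place of its current write to stdout; the client side stays a plain binary_to_term/1 on the bytes it reads.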

@rickard-green added the team:VM label (Assigned to OTP team VM) on Apr 24, 2023

@robertoaloi
Contributor Author

@rickard-green @frazze-jobb Is this something you plan to address before the official OTP 26 release?

@michalmuskala
Contributor

For context: this is currently breaking eqWAlizer, so we'd like to understand whether we need to find workarounds to support OTP 26.

@frazze-jobb
Contributor

This does not seem to be simple; I will not be able to address it before the OTP 26 release.

@robertoaloi
Contributor Author

This is unfortunate, as various tools may break because of this. As Michał suggested, the eqWAlizer type checker (used to type check OTP itself, as far as I know) will not work under OTP 26 without modifications. The same may be true for other tools that communicate via stdout, which could slow down OTP 26 adoption. LSP language servers and DAP debuggers come to mind.

In my opinion this should at least be marked as a potential incompatibility at:

https://www.erlang.org/docs/24/general_info/upcoming_incompatibilities.html#otp-26

@rickard-green
Contributor

@robertoaloi @michalmuskala #7211 will most likely be merged into master prior to the release of OTP 26

@lukaszsamson
Contributor

It looks like the same problem affects reading from raw stdin. ElixirLS is not able to correctly read LSP and DAP protocol messages on OTP 26-rc.3. I haven't tested master with #7211 yet, but from reading the code I assume that only the stdout case was fixed.
