-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8356985: Use "stdin.encoding" in Console's read*() methods #25271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8356985: Use "stdin.encoding" in Console's read*() methods #25271
Conversation
|
👋 Welcome back naoto! A progress list of the required criteria for merging this PR into |
|
@naotoj This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 158 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copyright year update is missing.
|
|
||
| try { | ||
| Terminal terminal = TerminalBuilder.builder().encoding(charset) | ||
| Terminal terminal = TerminalBuilder.builder().encoding(outCharset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't ideally JdkConsole::charset and Terminal::encoding be adapted for stdin/stdout variants?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also noticed DumbTerminalProvider::sysTerminal calls DumbTerminal with new FileInputStream(FileDescriptor.in). Later on DumbTerminal applies encoding() both for passed stdin and std{out,err}. In short, TerminalProvider might need to undergo a similar refactoring separating input and output encodings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All FileDescriptor.in encounters in jdk.internal.org.jline.terminal that might need attention:
src/jdk.internal.le/share/classes/jdk/internal/org/jline/terminal/impl/DumbTerminalProvider.java
src/jdk.internal.le/share/classes/jdk/internal/org/jline/terminal/impl/exec/ExecPty.java
src/jdk.internal.le/share/classes/jdk/internal/org/jline/terminal/impl/ffm/FfmTerminalProvider.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JLine is a 3rd party library. It would be desirable that they change their implementation to separately handle in/out in their terminal, but that is out of scope of this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created jline/jline3#1282.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JLine seems to incorporate the stdin encoding already, which is quick!
| static final Charset STDIN_CHARSET = | ||
| Charset.forName(System.getProperty("stdin.encoding"), UTF_8.INSTANCE); | ||
| static final Charset STDOUT_CHARSET = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess these can be marked private?
| static final Charset STDIN_CHARSET = | |
| Charset.forName(System.getProperty("stdin.encoding"), UTF_8.INSTANCE); | |
| static final Charset STDOUT_CHARSET = | |
| private static final Charset STDIN_CHARSET = | |
| Charset.forName(System.getProperty("stdin.encoding"), UTF_8.INSTANCE); | |
| private static final Charset STDOUT_CHARSET = |
| public class StdinEncodingTest { | ||
|
|
||
| @Test | ||
| @EnabledOnOs({OS.LINUX, OS.MAC}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we replace @EnabledOnOs with the @requires (os.family == "linux" | os.family == "mac") JTreg directive instead per JDK-8211673?
| protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) { | ||
| while (in.remaining() > 0) { | ||
| char c = (char)in.get(); | ||
| if (c != '\n') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Character.toUpperCase('\n') == '\n', not? If so, do we still need this if-branching?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eliminating newlines was a hack to ignore them in the expect responses so that it can check the combined output in one shot. Now I think it is better to check the result separately, so modified as such.
| import java.util.Iterator; | ||
|
|
||
| // A test charset provider that decodes every input byte into its uppercase | ||
| public class MockCharsetProvider extends CharsetProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe a more self-explanatory name, e.g., UpperCasingCharsetProvider?
| * @build csp/* | ||
| * @run junit StdinEncodingTest | ||
| */ | ||
| public class StdinEncodingTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT, there is no similar test (e.g., one using a mock CharsetProvider) for stdout.encoding. Will it be addressed by another ticket? Shall we consider adding a similar StdoutEncodingTest too? (Not necessarily in this PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stdout.encoding validity is tested through the public charset() mehtod, which is in CharsetTest.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed there are stdout.encoding tests in test/jdk/java/io/Console, yet none1 that thoroughly tests them with expect and a dedicated (mock) CharsetProvider as you did here. FWIW, I really liked your new test using a mock CharsetProvider in combination with expect, hence my question for doing same for stdout and stderr too.
For the record, AFAICT, there are no tests for stderr.encoding.
1 There is script.exp, but it tests sun.stdout.encoding, not stdout.encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually providing mock charset was a workaround of not having public method for getting the input encoding. I think it would be an overkill to introduce a new public method because it will not be used much, as most cases are suffice with the existing one (Console is used for interactive user enviornment, and I don't believe users would like to see different characters displayed for the input).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is script.exp, but it tests sun.stdout.encoding, not stdout.encoding.
It is addressed in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the JBS bug ID needs to be added to CharsetTest.java as well then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bugid added to the test. Additionally replaced testsrc/jdk/classes with Utils ones
| * stdin.encoding} differs from {@link System##stdout.encoding | ||
| * stdout.encoding}, in which case read operations use the {@code Charset} | ||
| * designated by {@code stdin.encoding}. | ||
| * <p> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Console.charset() states "The returned charset is used for interpreting the input and output source (e.g., keyboard and/or display) specified by the host environment or user, which defaults to the one based on stdout.encoding." If stdin.encoding is set otherwise, this is no longer true, so I think this method may need a wording update as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Brought the same wording to the charset() method description for further clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by the actual behavior here. What might be helpful is to divide the discussion between a) what charsets get used for input and output, and b) the return value of the charset() method.
I'm not entirely sure, but since stdin.encoding and stdout.encoding are always set to something -- whether it comes from the platform or the command line -- won't Console just use stdin.encoding for input and stdout.encoding for output? If this is true, maybe just say this instead of deferring to the charset() method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I agree with Stuart and it would be better to say that stdin.encoding is used for reading, and stdout.encoding for writing. They are usually the same but if they differ then Console will return the charset for output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I reworded the descriptions of the class and charset() method.
| /** | ||
| * @test | ||
| * @bug 8356985 | ||
| * @summary Tests if "stdin.encoding" is reflected for reading |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be helpful to include why the test is limited to only linux and mac,
"expect" command in Windows/Cygwin does not work as expected. Ignoring tests on Windows.
| } | ||
|
|
||
| // invoking "expect" command | ||
| var testSrc = System.getProperty("test.src", "."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test already imports the JDK test lib, can we just replace this and the below ocurrences with jdk.test.lib.Utils.TEST_SRC/JDK/CLASSES directly?
| public void testStdinEncoding() throws Throwable { | ||
| // check "expect" command availability | ||
| var expect = Paths.get("/usr/bin/expect"); | ||
| if (!Files.exists(expect) || !Files.isExecutable(expect)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use Assumptions.assumeTrue here. Condition becomes more readable as: Files.exists(expect) && Files.isExecutable(expect)
justin-curtis-lu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New wording is straightforward and looks good to me.
|
I may be overthinking this, but let me just toss this out there to get people's opinions. This changes the API of Well, the IDEs can probably easily adapt based on the JDK version. Or, if we want to be nice, we can leave the two-arg |
Can they even possibly do so? |
Sure, as far as I know, IntelliJ IDEA runs on its own version of the JDK, and it can easily be invoked or modified to allow use of those modules. I took a quick look at NetBeans and didn't see any references to No need to worry about this now, then. Sorry for the distraction. |
FWIW, I am not aware about NetBeans using JdkConsoleProvider. |
|
IntelliJ seems to be going with |
AlanBateman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spec update and src changes looks good.
I only skimmed through the test changes (not a detailed review) and they look reasonable. I assume you've done some "test repeat" jobs to ensure that any variance in the version of "expect" on test machines doesn't cause any issues.
Actually I have not. Thanks for reminding. So I ran this through our internal CI on Linux x64, macOS x64, and macOS AArch64. Across 10 tasks (i.e., machines), each with 30 repetitions, that’s a total of 900 test runs—all completed without failure. |
|
Thanks for the reviews! |
|
Going to push as commit b2a61a9.
Your commit was automatically rebased without conflicts. |
java.io.Consoleuses the charset specified by thestdout.encodingsystem property for both input and output. While this is generally sufficient, since Console is intended for interactive terminal use, some platforms allow different encodings to be configured for input and output. In such cases, using a single encoding may lead to incorrect behavior when reading from the terminal. To address this, the newly introduced system property,stdin.encoding, should be used specifically for input where appropriate.Progress
Issues
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25271/head:pull/25271$ git checkout pull/25271Update a local copy of the PR:
$ git checkout pull/25271$ git pull https://git.openjdk.org/jdk.git pull/25271/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25271View PR using the GUI difftool:
$ git pr show -t 25271Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25271.diff
Using Webrev
Link to Webrev Comment