Enable Error Prone check: DefaultCharset #12313
plugin/trino-atop/src/main/java/io/trino/plugin/atop/AtopProcessFactory.java
plugin/trino-atop/src/test/java/io/trino/plugin/atop/TestingAtopFactory.java
plugin/trino-druid/src/test/java/io/trino/plugin/druid/DruidQueryRunner.java
plugin/trino-local-file/src/main/java/io/trino/plugin/localfile/LocalFileRecordCursor.java
what does this read? where does modelData come from?
Hm. Looks like nobody calls this method?
Ah, it's called dynamically in `io.trino.plugin.ml.ModelUtils#deserialize(io.airlift.slice.Slice)`. The `byte[]` data comes from the `Slice`. I have no idea what the contents of the slice are - some representation of the ML model, it appears, and it doesn't look like text 🤔
Upon further investigation: the model data is some binary metadata followed by a textual representation of the model. The text is generated by writing to a temporary file using `java.io.DataOutputStream#writeBytes(String)`, which... treats the string as "a sequence of bytes" (`out.write((byte) s.charAt(i))` 😱). So it's actually either `US_ASCII` or `ISO_8859_1`.
ouch. use UTF-8 and add a comment that this may or may not be the correct choice
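To make the `writeBytes` observation above concrete, here is a minimal sketch of what happens to non-ASCII text. The class and method names (`WriteBytesDemo`, `writeBytes`) are hypothetical and not taken from the Trino code; only the JDK calls are real. It suggests why decoding as UTF-8 is harmless as long as the model text is pure ASCII, and why it "may or may not be the correct choice" otherwise.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Trino
class WriteBytesDemo
{
    // Encodes text the way DataOutputStream#writeBytes does:
    // one byte per char, with the high eight bits of each char discarded
    static byte[] writeBytes(String text)
    {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeBytes(text);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args)
    {
        // Pure ASCII: US_ASCII, ISO_8859_1 and UTF_8 all decode it identically
        byte[] ascii = writeBytes("svm_type c_svc");
        System.out.println(new String(ascii, StandardCharsets.UTF_8)); // svm_type c_svc

        // A Latin-1 character (U+00E9) becomes the single byte 0xE9
        String latin1 = "caf\u00e9";
        byte[] bytes = writeBytes(latin1);
        System.out.println(bytes.length); // 4
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1).equals(latin1)); // true
        // A lone 0xE9 is malformed UTF-8, so it decodes to the replacement character
        System.out.println(new String(bytes, StandardCharsets.UTF_8).equals(latin1)); // false
    }
}
```

Characters above U+00FF are lost at write time regardless of which charset is used to read the bytes back, so no decoding choice can recover them.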
.../trino-tpch/src/main/java/io/trino/plugin/tpch/statistics/TableStatisticsDataRepository.java
what is out? sounds like default charset can be appropriate, but idk
`this.out = new PrintStream(config.getEventLogFile());` IOW, this is a log file.
I'll change it to default then.
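As an illustration of "change it to default" for a local log file, the sketch below names the platform charset explicitly instead of relying on it implicitly. `EventLogDemo`, `openLog`, and `writeAndReadBack` are hypothetical names, and the actual change in the PR may look different; the `PrintStream(File, String)` constructor is a real JDK API.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of Trino
class EventLogDemo
{
    // Same behavior as new PrintStream(file), but the charset choice is now
    // visible in the code, which is what the DefaultCharset check asks for
    static PrintStream openLog(File file)
    {
        try {
            return new PrintStream(file, Charset.defaultCharset().name());
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes one line to a temporary log file and reads it back with the same charset
    static String writeAndReadBack(String line)
    {
        try {
            Path path = Files.createTempFile("event-log", ".txt");
            try (PrintStream out = openLog(path.toFile())) {
                out.print(line);
            }
            String content = new String(Files.readAllBytes(path), Charset.defaultCharset());
            Files.delete(path);
            return content;
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(writeAndReadBack("query finished"));
    }
}
```

The reasoning is that a log file is read by tools on the same machine, so the local system's charset is a defensible choice there.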
testing/trino-plugin-reader/src/main/java/io/trino/server/PluginReader.java
66c8b80 to b0b963f
plugin/trino-ml/src/main/java/io/trino/plugin/ml/SvmClassifier.java
plugin/trino-ml/src/main/java/io/trino/plugin/ml/SvmRegressor.java
service/trino-verifier/src/main/java/io/trino/verifier/JsonEventClient.java
testing/trino-benchmark/src/main/java/io/trino/benchmark/SimpleLineBenchmarkResultWriter.java
This is focused on charsets, so it won't catch locale-related problems like `String#toLowerCase` with default locale. But it will catch `String` to `byte[]` (and vice versa) conversion with default charset, in addition to the IO stream related issues fixed in this commit. The existing cases were fixed to explicitly use `defaultCharset()` where the data being read comes from (or is written to a file on) the local system, otherwise they are fixed to use `UTF_8` - unless there are special considerations to use a different charset.
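The rule of thumb in the commit message can be sketched as follows. The class and method names are hypothetical and chosen only for illustration; the point is that calls such as `text.getBytes()` or `new String(bytes)` without a charset argument are the kind of code the check reports, and each fix picks a charset explicitly.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Trino
class CharsetChoiceDemo
{
    // Data exchanged with other systems: use UTF_8 unless there is a reason not to
    static byte[] encodeForExchange(String text)
    {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    static String decodeFromExchange(byte[] bytes)
    {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Data that lives on the local system: keep the platform charset, but say so explicitly
    static byte[] encodeForLocalFile(String text)
    {
        return text.getBytes(Charset.defaultCharset());
    }

    public static void main(String[] args)
    {
        String text = "\u00e9"; // a single non-ASCII character
        // The byte representation depends on the charset, which is why an implicit default is risky
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length); // 2
        System.out.println(text.getBytes(StandardCharsets.ISO_8859_1).length); // 1
        System.out.println(decodeFromExchange(encodeForExchange(text)).equals(text)); // true
    }
}
```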
b0b963f to 9c8345e
AC, thanks @findepi
Description
This is focused on charsets, so it won't catch locale-related problems like `String#toLowerCase` with default locale. But it will catch `String` to `byte[]` (and vice versa) conversion with default charset, in addition to the IO stream related issues fixed in this commit.
The existing cases were fixed to explicitly use `defaultCharset()` where the data being read comes from (or is written to a file on) the local system, otherwise they are fixed to use `UTF_8` - unless there are special considerations to use a different charset.
Static analysis improvement.
All over the code (some fixes in places reported by Error Prone).
Some internal improvements that should make everything slightly more stable.
Related issues, pull requests, and links
https://errorprone.info/bugpattern/DefaultCharset
Documentation
(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.
Release notes
(x) No release notes entries required.
( ) Release notes entries required with the following suggested text: