Upgrade kudu client to 1.15.0 #10940
Want to let the CI process run so there is a build showing the timeouts causing query failure. After that, will push another commit to actually resolve the timeout problems by setting
Also if we want to split the kudu client upgrade from the background flushing problem let me know. There just is not a way to [easily] confirm the background flushing problem with kudu client 1.10.0 since the timeouts never/rarely happen.
a67b228 to
7f3eb5e
plugin/trino-kudu/src/main/java/io/trino/plugin/kudu/TypeHelper.java
de3d4de to
d32db3f
Was this introduced in the latest version?
No, this config also existed in 1.10.0.
See the PR description, the only way I could reproduce this bug (#5687) was by upgrading the kudu client to 1.15.0 which triggered timeouts to kudu.
That is why I fixed this bug in this upgrade client PR.
Can we extract it from the version update? It looks like OperationResponse is returned in old APIs too.
Same comment as above, there is no way to easily verify the fix because you need a very specific sequence of timeouts [no timeouts when initially connecting to kudu, but timeouts during deletes/upserts/etc]. The kudu 1.15.0 client [accidentally] provides those timeouts due to a bug in the client.
See here for the failing kudu tests when upgrading to kudu 1.15.0:
https://github.com/trinodb/trino/runs/5057440792?check_suite_focus=true
And here for the workaround that makes tests pass:
https://github.com/trinodb/trino/runs/5060298096?check_suite_focus=true
If we do not mind having no tests around the change, I can extract this into a separate PR.
plugin/trino-kudu/src/main/java/io/trino/plugin/kudu/KuduRecordCursor.java
plugin/trino-kudu/src/test/java/io/trino/plugin/kudu/TestingKuduServer.java
d32db3f to
9a02003
hashhar left a comment
I believe all needs to be squashed except the last commit.
LGTM
plugin/trino-kudu/src/main/java/io/trino/plugin/kudu/TypeHelper.java
5d09fa3 to
5e2a90c
hashhar left a comment
LGTM % doc update.
Please update "Requirements" in Kudu docs to specify minimum supported version as 1.15 now. It might work with older versions but we don't test it now - if we intend to claim otherwise then add a test extending BaseConnectorSmokeTest with an older Kudu version.
5e2a90c to
6e51247
Kudu 1.10.0 is pretty old (> 2 years, released on November 1, 2019), so I just updated the docs to say we only support 1.15.0 or higher.
That's fair. 1.13 is the oldest supported release anyway according to https://kudu.apache.org/releases.
If we care about supporting 1.13, let me know and I can add an additional set of smoke tests.
6e51247 to
07dff60
Would be nice to do if it's not too much work; otherwise we can tackle that separately. Please remember to update docs accordingly.
07dff60 to
5d749ef
plugin/trino-kudu/src/test/java/io/trino/plugin/kudu/TestKudu115SmokeTests.java
plugin/trino-kudu/src/test/java/io/trino/plugin/kudu/TestKudu113SmokeTests.java
plugin/trino-kudu/src/test/java/io/trino/plugin/kudu/TestingKuduServer.java
plugin/trino-kudu/src/main/java/io/trino/plugin/kudu/KuduClientConfig.java
plugin/trino-kudu/src/main/java/io/trino/plugin/kudu/KuduPageSink.java
plugin/trino-kudu/src/main/java/io/trino/plugin/kudu/KuduUpdatablePageSource.java
...ino-kudu/src/main/java/io/trino/plugin/kudu/schema/SchemaEmulationByTableNameConvention.java
c29de25 to
796587a
Upgrading the kudu client revealed a few problems:
1. Timeouts to kudu tablets were sometimes occurring during deletes due to a bug in the kudu java client in version 1.13.0.
2. Timeouts were *not* failing query execution because the kudu connector was configured to flush operations in the background.
3. The two combined above meant tests that did deletes sometimes actually did not perform deletes and would fail.
This patch upgrades the kudu client, explicitly fails trino execution when kudu rpcs timeout, and marks unsupported data types from kudu 1.15.0.
Does not do anything in kudu 1.15.0
796587a to
1e28ba5
{
    private static final String KUDU_VERSION = "1.13.0";

    public static class TestKuduSmokeTestWithDisabledInferSchema
Any reason this is not defined as a top-level class?
See #10940 (comment). To keep all tests for same version together.
Sharing a constant doesn't require nesting classes.
I admire cleverness, but this also means an unnecessary class hierarchy, which doesn't help when browsing the code.
If the paradigm were more frequent in the code base, I would probably get used to it and wouldn't complain.
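To illustrate the point about sharing a constant without nesting, here is a minimal sketch. All class and field names are hypothetical and are not taken from the PR; it only shows that two top-level classes can read one shared constant.

```java
// Hypothetical holder for the Kudu versions under test; not part of the PR.
final class KuduTestVersions
{
    static final String KUDU_1_13 = "1.13.0";

    private KuduTestVersions() {}
}

// Both classes below are top-level and read the same constant, so no nesting is needed.
class Kudu113SmokeTestSketch
{
    String kuduVersion()
    {
        return KuduTestVersions.KUDU_1_13;
    }
}

class Kudu113DisabledInferSchemaSmokeTestSketch
{
    String kuduVersion()
    {
        return KuduTestVersions.KUDU_1_13;
    }
}
```

The trade-off is one extra small class in exchange for a flat hierarchy that test runners and IDEs handle without special cases.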
I noticed that, depending on how you run the tests:
mvn test -Dtest=io.trino.plugin.kudu.KuduLatestConnectorTests
mvn test
some tests in the class hierarchy will be skipped.
(EDIT: BaseConnectorTest has a test to ensure the class name ends in ConnectorTest. For some reason, when this is not satisfied for a static inner test class, no test failure happens and the test gets skipped instead.)
Additionally, when trying to run tests through IntelliJ's UI, you can only run the static inner classes (not the top-level class).
The tooling support for static inner test classes just does not seem good, so I'm going to move these to top-level classes.
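As a rough illustration of the naming rule mentioned in the EDIT note, the sketch below checks whether a class's simple name ends in "ConnectorTest". This is an assumption about the shape of such a check, not the actual BaseConnectorTest code, and the nested classes are hypothetical stand-ins for the layout under discussion.

```java
class NamingCheckSketch
{
    // Hypothetical outer class with a nested test class, mirroring the nested layout.
    static class KuduLatestConnectorTests
    {
        static class WithDisabledInferSchema {}
    }

    // Hypothetical class whose name satisfies the assumed convention.
    static class KuduLatestConnectorTest {}

    // Assumed convention: the simple class name must end in "ConnectorTest".
    static boolean followsNamingConvention(Class<?> testClass)
    {
        return testClass.getSimpleName().endsWith("ConnectorTest");
    }

    public static void main(String[] args)
    {
        System.out.println(followsNamingConvention(KuduLatestConnectorTest.class));
        System.out.println(followsNamingConvention(KuduLatestConnectorTests.class));
        System.out.println(followsNamingConvention(KuduLatestConnectorTests.WithDisabledInferSchema.class));
    }
}
```

Note that `getSimpleName()` drops the outer class name for a nested class, so a nested class is judged only by its own name under this kind of check.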
Upgrading the kudu client revealed a few problems:
1. Timeouts to kudu tablets were sometimes occurring during deletes due to this change introduced in the kudu java client in version 1.13.0: apache/kudu@d23ee5d#diff-f1f50409d81052b8f8d7aea7da663c185c704c6206cb0ec901114f4d9ee8c28f
(see here for the reason why that commit broke the client: https://gerrit.cloudera.org/#/c/18166/)
2. Timeouts were not failing query execution because the kudu connector was configured to flush operations in the background.
3. The two combined above meant tests that did deletes sometimes actually did not perform deletes and would fail.
The first commit upgrades the kudu client and additionally explicitly fails trino execution when kudu rpcs timeout. This resolves: #5687
The second commit fixes one source of kudu timeouts in the 1.15.0 client.
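The difference between silently dropping write errors and failing on them can be sketched as follows. This is a simplified model, not the connector's code and not the Kudu client API: `WriteResult` and both methods are hypothetical stand-ins for per-operation results reported after a flush.

```java
import java.util.List;

class FlushErrorSketch
{
    // Hypothetical stand-in for the per-operation result a client reports after a flush.
    static final class WriteResult
    {
        final String operation;
        final String error; // null means the write succeeded

        WriteResult(String operation, String error)
        {
            this.operation = operation;
            this.error = error;
        }
    }

    // Fire-and-forget handling: results are never inspected, so a timed-out
    // delete looks exactly like a successful one to the caller.
    static int countIgnoringErrors(List<WriteResult> results)
    {
        return results.size();
    }

    // Explicit handling: fail on the first reported error so the
    // query cannot finish successfully with unapplied writes.
    static int countOrFail(List<WriteResult> results)
    {
        for (WriteResult result : results) {
            if (result.error != null) {
                throw new IllegalStateException(
                        "Write failed for " + result.operation + ": " + result.error);
            }
        }
        return results.size();
    }

    public static void main(String[] args)
    {
        List<WriteResult> results = List.of(
                new WriteResult("delete row 1", null),
                new WriteResult("delete row 2", "timed out"));

        System.out.println("ignoring errors: " + countIgnoringErrors(results) + " operations reported as applied");
        try {
            countOrFail(results);
        }
        catch (IllegalStateException e) {
            System.out.println("failing on errors: " + e.getMessage());
        }
    }
}
```

In the first variant a test that deletes rows can pass or fail depending on whether a timeout happened to occur, which matches the flaky behavior described above; the second variant turns the same timeout into a visible failure.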