[Coral-Trino] Cast char fields, if necessary, to varchar type in the set operation - SqlCallTransformer#442
Conversation
coral-trino/src/test/java/com/linkedin/coral/trino/rel2trino/TestUtils.java
Show resolved
Hide resolved
SqlCallTransformer
3a4dda3 to
37f39d3
Compare
|
|
||
| Optional<RelDataType> selectNodeDataType; | ||
| try { | ||
| selectNodeDataType = Optional.of(deriveRelDatatype(sqlNode)); |
There was a problem hiding this comment.
@aastha25 in some cases while running the HiveToTrinoConverterTest#testViews I stumbled on RuntimeException issued from deriveRelDataType(SqlNode).
java.lang.RuntimeException: Failed to derive the RelDataType for SqlNode: CAST(row("b"."b1") as row("b1" varchar)) AS `b` with topSqlNode: SELECT `tablef`.`a`, `generic_project`(`tablef`.`b`, 'b', 'struct<b1:string>') AS `b`
FROM `test`.`tablef` AS `tablef`
at com.linkedin.coral.common.utils.TypeDerivationUtil.getRelDataType(TypeDerivationUtil.java:87)
at com.linkedin.coral.common.transformers.SqlCallTransformer.deriveRelDatatype(SqlCallTransformer.java:68)
Is this expected?
I currently workaround this limitation by just working with column types for which the type can actually be derived.
Also side note - Why throwing the very generic RuntimeException in TypeDerivationUtil instead of specialized exceptions?
There was a problem hiding this comment.
@findinpath were you able to isolate the test case which is leading to the failure?
There was a problem hiding this comment.
When not catching the exception
try {
selectNodeDataType = Optional.of(deriveRelDatatype(sqlNode));
} catch (RuntimeException e) {
// The type derivation may fail for complex expressions
selectNodeDataType = Optional.empty();
}
, the following tests will be failing:
fuzzy_union_view_double_branch_evolved_differentfuzzy_union_view_more_than_two_branches_evolvedfuzzy_union_view_map_with_struct_value_evolvedfuzzy_union_view_array_with_struct_value_evolvedfuzzy_union_view_deeply_nested_struct_evolvedfuzzy_union_view_deeply_nested_complex_struct_evolved
There was a problem hiding this comment.
@ljfgem Can we fix the generic type derivation method and also throw the underlying error?
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| List<SqlNode> updatedOperands = new ArrayList<>(sqlCall.getOperandList().size()); | ||
| for (SqlNode operand : operandsList) { |
There was a problem hiding this comment.
Any reason we iterate on the operandsList twice? Can we merge this loop and the above? I see below we cast to SqlSelect in the first loop we check if it is SqlSelect. In both we iterate on the selectNodes, etc. I think there is room for consolidation? PS. In any case let us reuse the naming conventions suggested above.
There was a problem hiding this comment.
We need to find the least restrictive type for each select item index from the operands - this is one looping.
Once we have the least restrictive type, we need another looping to verify whether the operands need to be updated.
I didn't find a better alternative to do this.
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
...o/src/main/java/com/linkedin/coral/trino/rel2trino/transformers/UnionSqlCallTransformer.java
Outdated
Show resolved
Hide resolved
|
@KevinGe00 Can we run integration tests on this PR? Trino tests are enough. |
be01343 to
a2bf267
Compare
…set operation In case of dealing with Hive views which make use of set operation (e.g. UNION) ensure that the `char` fields from the inner SELECT statements have the same type as the output field types of the set operation. Due to wrong coercion between `varchar` and `char` in Trino, as described in trinodb/trino#9031 , a work-around needs to be applied in case of translating Hive views which contain a UNION dealing with char and varchar types. The work-around consists in the explicit cast of the field having char type towards varchar type corresponding of the set operation output type.
a2bf267 to
61aa89d
Compare
|
@wmoustafa addressed comments and rebased on |
wmoustafa
left a comment
There was a problem hiding this comment.
Confirmed with @KevinGe00 that integration tests passed. Thanks @findinpath for working on this. Appreciate the spin-off from #438.
… in the set operation (linkedin#442)" This reverts commit 4bb38a1.
… in the set operation (linkedin#442)" This reverts commit 4bb38a1. # Conflicts: # coral-trino/src/main/java/com/linkedin/coral/trino/rel2trino/DataTypeDerivedSqlCallConverter.java
What changes are proposed in this pull request, and why are they necessary?
In case of dealing with Hive views which make use of set operation (e.g. UNION) ensure that the
charfields from the inner SELECT statements have the same type as the output field types of the set operation.Due to wrong coercion between
varcharandcharin Trino, as described in trinodb/trino#9031, a work-around needs to be applied in case of translating Hive views which contain a UNION dealing with char and varchar types. The work-around consists in the explicit cast of the field having char type towards varchar type corresponding of the set operation output type.
This PR is a spin-off from #438 using a slighly different functionality to achieve a similar functionality.
From a technical design point of view, the present PR extends
SqlCallTransformerto modify on the fly the select branch of the set operation which needs explicit casting for its char columns (which need to be changed to varchar type due to trinodb/trino#9031 Trino limitation.How was this patch tested?
The patch has been tested with automatic unit tests.