Commit 9ec0f09

ckunki and kaklakariada authored
Bug/26 charset conversion (#33)
* updated version, pk fix, artifact references and added documentation
* #26: Enabled use of MySQL databases with character set `latin1` and characters that are not strictly ASCII.
* updated to latest PR of VSCJDBC
* removed repository maven.exasol.com
* upgraded to exasol-testcontainers 6.4.0
* excluded vulnerability

Co-authored-by: Christoph Pirkl <[email protected]>
1 parent 3bce096 commit 9ec0f09

13 files changed: +368 -36 lines changed

doc/changes/changelog.md

Lines changed: 1 addition & 0 deletions
Generated file; the diff is not rendered by default.

doc/changes/changes_4.1.0.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# Virtual Schema for MySQL 4.1.0, released 2022-12-05
+
+Code name: Configurable datatype detection
+
+## Summary
+
+Virtual-schema-common-jdbc version 10.0.0 introduced enhanced detection for data types of result sets.
+
+Unfortunately, the new algorithm can cause compatibility problems with the source database under the following circumstances:
+
+* data type `CHAR` or `VARCHAR`
+* 8-bit character sets with encodings like `latin1` or `ISO-8859-1`
+* characters that are not strictly ASCII, e.g. German umlaut "Ü"
+
+The current release therefore uses an updated version of `virtual-schema-common-jdbc` with an additional adapter property to configure the data type detection.
+
+For details, please see [Adapter Properties for JDBC-Based Virtual Schemas](https://github.com/exasol/virtual-schema-common-jdbc/blob/main/README.md#adapter-properties-for-jdbc-based-virtual-schemas).
+
+## Bugfixes
+
+* #26: Enabled use of MySQL databases with character set `latin1` and characters that are not strictly ASCII.
+
+## Dependency Updates
+
+### Compile Dependency Updates
+
+* Updated `com.exasol:virtual-schema-common-jdbc:10.0.1` to `10.1.0`
+
+### Test Dependency Updates
+
+* Updated `com.exasol:exasol-testcontainers:6.3.1` to `6.4.0`
+* Updated `com.exasol:virtual-schema-common-jdbc:10.0.1` to `10.1.0`
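
The three circumstances listed in the changelog come down to one encoding fact, which the following minimal Java sketch illustrates. The class `UmlautEncodingDemo` and its helper methods are hypothetical and not part of the adapter; the sketch only uses the JDK's `java.nio.charset` API. It shows that "Ü" (U+00DC, the same code point as the `0xDC` used in the integration test) occupies one byte in ISO-8859-1, two bytes in UTF-8, and cannot be encoded in US-ASCII at all. This is consistent with the "Charset conversion from 'UTF-8' to 'ASCII' failed" error that the integration test in this commit expects.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the virtual schema code base.
class UmlautEncodingDemo {
    // U+00DC is the German umlaut "Ü" mentioned in the changelog.
    static final String UMLAUT = "\u00DC";

    // Number of bytes needed to store the umlaut in the given character set.
    static int byteLength(final Charset charset) {
        return UMLAUT.getBytes(charset).length;
    }

    // True if every character of the text can be represented in US-ASCII.
    static boolean fitsAscii(final String text) {
        final CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(final String[] args) {
        System.out.println(byteLength(StandardCharsets.ISO_8859_1)); // prints 1
        System.out.println(byteLength(StandardCharsets.UTF_8)); // prints 2
        System.out.println(fitsAscii(UMLAUT)); // prints false
    }
}
```

A column type derived under the assumption that the data is pure ASCII can therefore not hold such a value, which is presumably why the data type detection needs to be configurable.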

doc/user_guide/mysql_user_guide.md

Lines changed: 8 additions & 5 deletions
@@ -28,7 +28,7 @@ You need to specify the following settings when adding the JDBC driver via EXAOp
 | Port (optional) | default 3306 |

 IMPORTANT: Currently you have to **Disable Security Manager** for the driver if you want to connect to MySQL using Virtual Schemas.
-It is necessary because JDBC driver requires a JAVA permission which we do not grant by default.
+It is necessary because JDBC driver requires a JAVA permission which we do not grant by default.

 ## Uploading the JDBC Driver to BucketFS

@@ -50,10 +50,11 @@ CREATE SCHEMA SCHEMA_FOR_VS_SCRIPT;
 The SQL statement below creates the adapter script, defines the Java class that serves as entry point and tells the UDF framework where to find the libraries (JAR files) for Virtual Schema and JDBC database driver.

 ```sql
+--/
 CREATE OR REPLACE JAVA ADAPTER SCRIPT SCHEMA_FOR_VS_SCRIPT.ADAPTER_SCRIPT_MYSQL AS
 %scriptclass com.exasol.adapter.RequestDispatcher;
-%jar /buckets/<BFS service>/<bucket>/virtual-schema-dist-10.0.1-mysql-4.0.1.jar;
-%jar /buckets/<BFS service>/<bucket>/mysql-connector-j-<version>.jar;
+%jar /buckets/<BFS service>/<bucket>/virtual-schema-dist-10.1.0-mysql-4.1.0.jar;
+%jar /buckets/<BFS service>/<bucket>/mysql-connector-java-<version>.jar;
 /
 ;
 ```
@@ -81,6 +82,8 @@ CREATE VIRTUAL SCHEMA <virtual schema name>
 CATALOG_NAME = '<database name>';
 ```

+See also [Adapter Properties for JDBC-Based Virtual Schemas](https://github.com/exasol/virtual-schema-common-jdbc#adapter-properties-for-jdbc-based-virtual-schemas).
+
 ## Data Types Conversion

 | MySQL Data Type | Supported | Converted Exasol Data Type| Known limitations |
@@ -115,8 +118,8 @@ CREATE VIRTUAL SCHEMA <virtual schema name>
 | VARCHAR || VARCHAR | |
 | YEAR || DATE | |

-* The tested versions of MySQL Connector JDBC Driver return the column's size depending on the charset and its collation.
-As the real data in a MySQL table can sometimes exceed the size that we get from the JDBC driver, we set the size for all TEXT columns to 65535 characters.
+* The tested versions of MySQL Connector JDBC Driver return the column's size depending on the charset and its collation.
+As the real data in a MySQL table can sometimes exceed the size that we get from the JDBC driver, we set the size for all TEXT columns to 65535 characters.

 If you need to use currently unsupported data types or find a way around known limitations, please, create a github issue in the [VS repository](https://github.com/exasol/virtual-schemas/issues).

pk_generated_parent.pom

Lines changed: 1 addition & 1 deletion
Generated file; the diff is not rendered by default.

pom.xml

Lines changed: 11 additions & 14 deletions
@@ -2,23 +2,14 @@
 <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
     <modelVersion>4.0.0</modelVersion>
     <artifactId>mysql-virtual-schema</artifactId>
-    <version>4.0.1</version>
+    <version>4.1.0</version>
     <name>Virtual Schema for MySQL</name>
     <description>Virtual Schema for MySQL</description>
     <url>https://github.com/exasol/mysql-virtual-schema/</url>
     <properties>
-        <vscjdbc.version>10.0.1</vscjdbc.version>
+        <vscjdbc.version>10.1.0</vscjdbc.version>
         <org.testcontainers.version>1.17.6</org.testcontainers.version>
     </properties>
-    <repositories>
-        <repository>
-            <id>maven.exasol.com</id>
-            <url>https://maven.exasol.com/artifactory/exasol-releases</url>
-            <snapshots>
-                <enabled>false</enabled>
-            </snapshots>
-        </repository>
-    </repositories>
     <dependencies>
         <dependency>
             <groupId>com.exasol</groupId>
@@ -55,7 +46,7 @@
         <dependency>
             <groupId>com.exasol</groupId>
             <artifactId>exasol-testcontainers</artifactId>
-            <version>6.3.1</version>
+            <version>6.4.0</version>
             <scope>test</scope>
         </dependency>
         <dependency>
@@ -155,7 +146,13 @@
                     <excludeVulnerabilityIds>
                         <!-- False positive in snakeyaml. According to https://bitbucket.org/snakeyaml/snakeyaml/issues/531/stackoverflow-oss-fuzz-47081
                         this is already fixed in 1.32. -->
-                        <exclude>CVE-2022-38752</exclude>
+                        <exclude>xCVE-2022-38752</exclude>
+                        <!-- Exclude vulnerabilities found in transitive dependency
+                        to org.yaml:snakeyaml:jar:1.33
+                        required by virtual-schema-shared-integration-tests
+                        as there is no update available currently -->
+                        <exclude>CVE-2022-1471</exclude>
+                        <exclude>CVE-2022-40150</exclude>
                     </excludeVulnerabilityIds>
                 </configuration>
             </plugin>
@@ -171,7 +168,7 @@
     <parent>
         <artifactId>mysql-virtual-schema-generated-parent</artifactId>
         <groupId>com.exasol</groupId>
-        <version>4.0.1</version>
+        <version>4.1.0</version>
         <relativePath>pk_generated_parent.pom</relativePath>
     </parent>
 </project>

src/test/java/com/exasol/adapter/dialects/mysql/IntegrationTestConstants.java

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 import java.nio.file.Path;

 public final class IntegrationTestConstants {
-    public static final String VIRTUAL_SCHEMAS_JAR_NAME_AND_VERSION = "virtual-schema-dist-10.0.1-mysql-4.0.1.jar";
+    public static final String VIRTUAL_SCHEMAS_JAR_NAME_AND_VERSION = "virtual-schema-dist-10.1.0-mysql-4.1.0.jar";
     public static final String EXASOL_DOCKER_IMAGE_REFERENCE = "7.1.14";
     public static final String MYSQL_DOCKER_IMAGE_REFERENCE = "mysql:8.0.30";
     public static final Path PATH_TO_VIRTUAL_SCHEMAS_JAR = Path.of("target", VIRTUAL_SCHEMAS_JAR_NAME_AND_VERSION);
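
The JAR name changed here appears to combine the two versions bumped in `pom.xml` (`vscjdbc.version` 10.1.0 and project version 4.1.0). The sketch below spells out that naming pattern, assuming it is `virtual-schema-dist-<vscjdbc version>-mysql-<dialect version>.jar`. The class `JarNameSketch` and both methods are hypothetical helpers for illustration; the commit itself keeps the name as a hard-coded constant.

```java
import java.nio.file.Path;

// Hypothetical helper, not part of the commit.
final class JarNameSketch {
    private JarNameSketch() {
        // static helpers only
    }

    // Builds the distribution JAR name from the two version numbers (assumed pattern).
    static String jarName(final String vscjdbcVersion, final String dialectVersion) {
        return "virtual-schema-dist-" + vscjdbcVersion + "-mysql-" + dialectVersion + ".jar";
    }

    // Resolves the JAR inside the Maven build directory "target".
    static Path pathInTarget(final String vscjdbcVersion, final String dialectVersion) {
        return Path.of("target", jarName(vscjdbcVersion, dialectVersion));
    }

    public static void main(final String[] args) {
        System.out.println(jarName("10.1.0", "4.1.0")); // prints virtual-schema-dist-10.1.0-mysql-4.1.0.jar
    }
}
```

Deriving the name in one place would avoid having to update the user guide, the test constant and the POM separately on each release, at the cost of one more indirection.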

src/test/java/com/exasol/adapter/dialects/mysql/MySQLSqlDialectIT.java

Lines changed: 73 additions & 6 deletions
@@ -4,21 +4,26 @@
 import static com.exasol.matcher.ResultSetMatcher.matchesResultSet;
 import static com.exasol.matcher.ResultSetStructureMatcher.table;
 import static org.hamcrest.MatcherAssert.assertThat;
-import static org.junit.jupiter.api.Assertions.assertDoesNotThrow;
-import static org.junit.jupiter.api.Assertions.fail;
+import static org.hamcrest.Matchers.matchesRegex;
+import static org.junit.jupiter.api.Assertions.*;

 import java.io.IOException;
 import java.sql.*;
-import java.util.Collections;
-import java.util.List;
+import java.util.*;
 import java.util.stream.Collectors;

 import org.hamcrest.Matcher;
+import org.junit.Assume;
 import org.junit.jupiter.api.*;
 import org.testcontainers.junit.jupiter.Testcontainers;

+import com.exasol.adapter.dialects.DataTypeDetection;
+import com.exasol.adapter.dialects.mysql.charset.ColumnInspector;
+import com.exasol.adapter.dialects.mysql.charset.Version;
+import com.exasol.containers.ExasolDockerImageReference;
 import com.exasol.dbbuilder.dialects.*;
 import com.exasol.dbbuilder.dialects.exasol.VirtualSchema;
+import com.exasol.dbbuilder.dialects.mysql.MySQLIdentifier;
 import com.exasol.dbbuilder.dialects.mysql.MySqlSchema;
 import com.exasol.matcher.TypeMatchMode;

@@ -90,13 +95,11 @@ private ResultSet getExpectedResultSet(final List<String> expectedColumns, final
                 + String.join(", ", expectedColumns) + ")");
         statement.execute("INSERT INTO " + qualifiedExpectedTableName + " VALUES" + expectedValues);
         return statement.executeQuery("SELECT * FROM " + qualifiedExpectedTableName);
-
     }

     private ResultSet getActualResultSet(final String query) throws SQLException {
         final Statement statement = SETUP.getExasolStatement();
         return statement.executeQuery(query);
-
     }

     private static void createMySqlSimpleTable(final Schema mySqlSchema) {
@@ -145,6 +148,70 @@ private static void createMySqlStringTable(final Schema mySqlSchema) {
         table.insert(null, null, "aaaaa", "a", "blob", "text", null, null, null, null, null, null, null, null);
     }

+    @Test
+    void importDataTypesFromResultSet() throws SQLException {
+        Assume.assumeTrue(runCharsetTest());
+        final String query = setupCharacterSet(DataTypeDetection.Strategy.FROM_RESULT_SET);
+        final ResultSet actual = getActualResultSet(query);
+        final ResultSet expected = getExpectedResultSet(List.of("c1 CHAR(1) UTF8", "c2 CHAR(1) UTF8"), //
+                List.of(SPECIAL_CHAR_QUOTED + ", " + SPECIAL_CHAR_QUOTED));
+        assertThat(actual, matchesResultSet(expected));
+    }
+
+    @Test
+    void importDataTypesExasolCalculated() throws SQLException {
+        Assume.assumeTrue(runCharsetTest());
+        final String query = setupCharacterSet(DataTypeDetection.Strategy.EXASOL_CALCULATED);
+        final Exception exception = assertThrows(SQLException.class, () -> getActualResultSet(query));
+        assertThat(exception.getMessage(),
+                matchesRegex("ETL-3009: .*Charset conversion from 'UTF-8' to 'ASCII' failed.*"));
+    }
+
+    private boolean runCharsetTest() {
+        final ExasolDockerImageReference dockerImage = SETUP.getExasolContainer().getDockerImageReference();
+        if (!dockerImage.hasMajor() || !dockerImage.hasMinor() || !dockerImage.hasFix()) {
+            return false;
+        }
+        final Version version = Version.of(dockerImage.getMajor(), dockerImage.getMinor(), dockerImage.getFixVersion());
+        if ((dockerImage.getMajor() == 7) && version.isGreaterOrEqualThan(Version.parse("7.1.14"))) {
+            return true;
+        }
+        if ((dockerImage.getMajor() == 8) && version.isGreaterOrEqualThan(Version.parse("8.6.0"))) {
+            return true;
+        }
+        return false;
+    }
+
+    private String setupCharacterSet(final DataTypeDetection.Strategy strategy) throws SQLException {
+        final String tableName = MYSQL_SOURCE_TABLE;
+        createMySqlTableWithCharacterSet(MYSQL_SOURCE_SCHEMA, tableName, "latin1");
+        this.virtualSchema = SETUP.createVirtualSchema( //
+                Map.of(DataTypeDetection.STRATEGY_PROPERTY, strategy.name()), //
+                MYSQL_SOURCE_SCHEMA);
+        final ColumnInspector inspector = SETUP.getColumnInspector(MYSQL_SOURCE_SCHEMA);
+        inspector.describeFromMetadata(MYSQL_SOURCE_SCHEMA, MYSQL_SOURCE_TABLE);
+        final String query = String.format("select * from %s.%s", MYSQL_SOURCE_SCHEMA, MYSQL_SOURCE_TABLE);
+        inspector.describeFromQuery(MYSQL_SOURCE_SCHEMA, query);
+        return "SELECT * FROM " + this.virtualSchema.getName() + "." + tableName;
+    }
+
+    private static final char[] GERMAN_UMLAUT = { 0xDC };
+    private static final String SPECIAL_CHAR = new String(GERMAN_UMLAUT);
+    private static final String SPECIAL_CHAR_QUOTED = "'" + SPECIAL_CHAR + "'";
+
+    private void createMySqlTableWithCharacterSet(final String schemaName, final String tableName,
+            final String characterSet) {
+        this.sourceSchema = getSchemaWithCharacterSet(schemaName, "latin1");
+        final String mySqlEnum = "ENUM('A', " + SPECIAL_CHAR_QUOTED + ")";
+        final Table table = this.sourceSchema.createTable(tableName, List.of("c1", "c2"),
+                List.of("CHAR(1)", mySqlEnum));
+        table.insert(SPECIAL_CHAR, SPECIAL_CHAR);
+    }
+
+    private static MySqlSchema getSchemaWithCharacterSet(final String schemaName, final String characterSet) {
+        return new MySqlSchema(SETUP.getTableWriterWithCharacterSet(characterSet), MySQLIdentifier.of(schemaName));
+    }
+
     @Test
     void testSelectAll() throws SQLException {
         final String query = "SELECT * FROM " + virtualSchemaJdbc + "." + MYSQL_SIMPLE_TABLE;
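
The `Version` class that `runCharsetTest()` relies on is not among the diffs rendered on this page, so the following is only a hypothetical stand-in that sketches how such a version gate could work. `SimpleVersion` and `supportsCharsetTest` are illustrative names; only the thresholds 7.1.14 and 8.6.0 and the method names `of`, `parse` and `isGreaterOrEqualThan` are taken from the test code above. The point of the sketch is that versions are compared numerically component by component, so that 7.1.9 sorts before 7.1.14.

```java
import java.util.Arrays;

// Hypothetical stand-in for the Version class used in the test; not the project's implementation.
final class SimpleVersion implements Comparable<SimpleVersion> {
    private final int[] parts;

    private SimpleVersion(final int... parts) {
        this.parts = parts;
    }

    static SimpleVersion of(final int major, final int minor, final int fix) {
        return new SimpleVersion(major, minor, fix);
    }

    // Parses a version of the form "major.minor.fix".
    static SimpleVersion parse(final String text) {
        final int[] parsed = Arrays.stream(text.split("\\.")).mapToInt(Integer::parseInt).toArray();
        if (parsed.length != 3) {
            throw new IllegalArgumentException("Expected major.minor.fix but got '" + text + "'");
        }
        return new SimpleVersion(parsed);
    }

    int major() {
        return this.parts[0];
    }

    // Lexicographic comparison of the numeric components.
    @Override
    public int compareTo(final SimpleVersion other) {
        return Arrays.compare(this.parts, other.parts);
    }

    boolean isGreaterOrEqualThan(final SimpleVersion other) {
        return compareTo(other) >= 0;
    }

    // Mirrors the gate in runCharsetTest(): 7.x from 7.1.14 on, 8.x from 8.6.0 on.
    static boolean supportsCharsetTest(final SimpleVersion version) {
        if (version.major() == 7) {
            return version.isGreaterOrEqualThan(parse("7.1.14"));
        }
        if (version.major() == 8) {
            return version.isGreaterOrEqualThan(parse("8.6.0"));
        }
        return false;
    }
}
```

With this gate, `SimpleVersion.supportsCharsetTest(SimpleVersion.of(7, 1, 14))` is true, while 7.1.13, 8.5.9 and any major version other than 7 or 8 are rejected, which matches the branches of the test helper.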

src/test/java/com/exasol/adapter/dialects/mysql/MySQLSqlDialectTest.java

Lines changed: 3 additions & 1 deletion
@@ -26,6 +26,7 @@
 import org.mockito.junit.jupiter.MockitoExtension;

 import com.exasol.adapter.AdapterProperties;
+import com.exasol.adapter.dialects.DataTypeDetection;
 import com.exasol.adapter.dialects.SqlDialect;
 import com.exasol.adapter.jdbc.ConnectionFactory;
 import com.exasol.adapter.jdbc.RemoteMetadataReaderException;
@@ -139,7 +140,8 @@ void testCreateRemoteMetadataReaderConnectionFails() throws SQLException {
     void testGetSupportedProperties() {
         assertThat(this.dialect.getSupportedProperties(),
                 containsInAnyOrder(CONNECTION_NAME_PROPERTY, TABLE_FILTER_PROPERTY, CATALOG_NAME_PROPERTY,
-                        EXCLUDED_CAPABILITIES_PROPERTY, DEBUG_ADDRESS_PROPERTY, LOG_LEVEL_PROPERTY));
+                        EXCLUDED_CAPABILITIES_PROPERTY, DEBUG_ADDRESS_PROPERTY, LOG_LEVEL_PROPERTY,
+                        DataTypeDetection.STRATEGY_PROPERTY));
     }

     @Test

src/test/java/com/exasol/adapter/dialects/mysql/MySQLVirtualSchemaIntegrationTestSetup.java

Lines changed: 11 additions & 8 deletions
@@ -6,18 +6,22 @@
 import java.io.Closeable;
 import java.io.FileNotFoundException;
 import java.sql.*;
-import java.util.*;
+import java.util.HashMap;
+import java.util.Map;
 import java.util.concurrent.TimeoutException;
 import java.util.logging.Level;
 import java.util.logging.Logger;

 import org.testcontainers.containers.MySQLContainer;

+import com.exasol.adapter.dialects.mysql.charset.ColumnInspector;
+import com.exasol.adapter.dialects.mysql.charset.TableWriterWithCharacterSet;
 import com.exasol.bucketfs.Bucket;
 import com.exasol.bucketfs.BucketAccessException;
 import com.exasol.containers.ExasolContainer;
 import com.exasol.containers.ExasolService;
 import com.exasol.dbbuilder.dialects.exasol.*;
+import com.exasol.dbbuilder.dialects.mysql.MySqlImmediateDatabaseObjectWriter;
 import com.exasol.dbbuilder.dialects.mysql.MySqlObjectFactory;
 import com.exasol.errorreporting.ExaError;
 import com.exasol.udfdebugging.UdfTestSetup;
@@ -154,13 +158,12 @@ public VirtualSchema createVirtualSchema(final Map<String, String> additionalPro
                 .properties(properties).build();
     }

-    Map<String, String> debugProperties() {
-        final String debugAddress = System.getProperty("com.exasol.virtualschema.debug.address");
-        if (debugAddress == null) {
-            return Collections.emptyMap();
-        }
-        final String logLevel = System.getProperty("com.exasol.virtualschema.debug.level");
-        return Map.of("DEBUG_ADDRESS", debugAddress, "LOG_LEVEL", (logLevel != null ? logLevel : "ALL"));
+    public MySqlImmediateDatabaseObjectWriter getTableWriterWithCharacterSet(final String characterSet) {
+        return new TableWriterWithCharacterSet(this.mySqlConnection, characterSet);
+    }
+
+    public ColumnInspector getColumnInspector(final String catalogName) {
+        return ColumnInspector.from(this.mySqlConnection, catalogName);
     }

     @Override
