[Bug] [Maxcompute] Failed to parse some maxcompute type #3894

Merged: 18 commits merged into apache:dev from the dev-3893 branch on Mar 4, 2023

Contributor

@stdnt-xiao stdnt-xiao commented Jan 7, 2023

Purpose of this pull request

close #3893

I have tested most data types, e.g.:

set odps.sql.hive.compatible=true;

DROP TABLE IF EXISTS fake_source;

CREATE TABLE IF NOT EXISTS fake_source(c1 TINYINT,c2 SMALLINT,c3 INT,c4 BIGINT,c5 FLOAT,c6 DOUBLE
,c7 VARCHAR(10),c8 CHAR(10),c9 STRING,c10 DATE,c11 DATETIME,c12 TIMESTAMP,c13 BOOLEAN,c14 BINARY
,c15 MAP<STRING,STRING>,c16 ARRAY<STRING>,c17 STRUCT<s1:STRING,s2:INT,s3:ARRAY<DOUBLE>>);

INSERT INTO fake_source(c1, c2, c3, c4, c5, c6, c7,c8,c9,c10,c11,c12,c13,c14,c15,c16,c17) VALUES (
CAST(-128 AS TINYINT ),CAST(-32768 AS SMALLINT ) ,0,10000000000000,0.01,0.0000000000000001
,CAST("varchar" as VARCHAR(10)),CAST("char" as CHAR(10)),"hello0",CAST("2022-12-31" as DATE )
,CAST("2022-12-31 23:59:59" as DATETIME ),CAST("2022-12-31 23:59:59.999" as TIMESTAMP ),FALSE,CAST("bytes" AS BINARY )
,MAP("int",1,"str","hello"),ARRAY("11","22"),named_struct("s1","s1","s2",100,"s3",array(1.1, 2.2)));

SELECT * FROM fake_source;
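
Editor's sketch: the nested types in the test table (MAP, ARRAY, STRUCT) are what makes type parsing non-trivial, because a type string has to be read recursively. The following is a minimal, self-contained illustration of that idea. It is not the connector's or the ODPS SDK's implementation; `TypeNode` and `TypeStringParser` are hypothetical names introduced only for this example, and the parser does no validation of type names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical, simplified model of a parsed type (not the ODPS SDK's TypeInfo). */
final class TypeNode {
    final String name;             // e.g. "INT", "VARCHAR(10)", "ARRAY", "MAP", "STRUCT"
    final List<String> fieldNames; // only populated for STRUCT
    final List<TypeNode> children; // element type, key/value types, or field types

    TypeNode(String name, List<String> fieldNames, List<TypeNode> children) {
        this.name = name;
        this.fieldNames = fieldNames;
        this.children = children;
    }

    @Override
    public String toString() {
        if (children.isEmpty()) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name).append('<');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) sb.append(',');
            if (!fieldNames.isEmpty()) sb.append(fieldNames.get(i)).append(':');
            sb.append(children.get(i));
        }
        return sb.append('>').toString();
    }
}

/** Hypothetical recursive-descent parser for type strings such as MAP<STRING,STRING>. */
final class TypeStringParser {
    private final String src;
    private int pos;

    private TypeStringParser(String text) {
        this.src = text.replaceAll("\\s+", ""); // whitespace carries no meaning here
    }

    static TypeNode parse(String text) {
        TypeStringParser parser = new TypeStringParser(text);
        TypeNode node = parser.parseType();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return node;
    }

    private TypeNode parseType() {
        int start = pos;
        // Read the type name, keeping any "(precision,scale)" suffix attached to it.
        while (pos < src.length() && "<>,:".indexOf(src.charAt(pos)) < 0) {
            if (src.charAt(pos) == '(') {
                int close = src.indexOf(')', pos);
                if (close < 0) throw new IllegalArgumentException("Unclosed '(' at position " + pos);
                pos = close + 1;
            } else {
                pos++;
            }
        }
        String name = src.substring(start, pos).toUpperCase(Locale.ROOT);
        if (name.isEmpty()) throw new IllegalArgumentException("Expected a type name at position " + start);

        List<String> fieldNames = new ArrayList<>();
        List<TypeNode> children = new ArrayList<>();
        if (pos < src.length() && src.charAt(pos) == '<') {
            pos++; // consume '<'
            boolean isStruct = name.equals("STRUCT");
            while (true) {
                if (isStruct) { // struct members are written as fieldName:type
                    int colon = src.indexOf(':', pos);
                    if (colon < 0) throw new IllegalArgumentException("Expected 'name:type' at position " + pos);
                    fieldNames.add(src.substring(pos, colon));
                    pos = colon + 1;
                }
                children.add(parseType()); // recursion handles arbitrary nesting
                if (pos >= src.length()) throw new IllegalArgumentException("Unclosed '<'");
                char c = src.charAt(pos++);
                if (c == '>') break;
                if (c != ',') throw new IllegalArgumentException("Expected ',' or '>' at position " + (pos - 1));
            }
        }
        return new TypeNode(name, fieldNames, children);
    }
}
```

Under these assumptions, `TypeStringParser.parse("MAP<STRING,STRING>")` yields a node named `MAP` with two `STRING` children, and a struct keeps its field names alongside the field types.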

Check list

@stdnt-xiao stdnt-xiao changed the title from "[Bug] [Maxcompute] Failed to parse some maxcompute type #3893" to "[Bug] [Maxcompute] Failed to parse some maxcompute type" Jan 7, 2023
@stdnt-xiao stdnt-xiao closed this Jan 7, 2023
@stdnt-xiao stdnt-xiao reopened this Jan 7, 2023
@stdnt-xiao
Contributor Author

@TaoZex
Could you give me some pointers? Thanks.


SimpleArrayTypeInfo(TypeInfo typeInfo) {
if (typeInfo == null) {
throw new IllegalArgumentException("Invalid element type.");
Member

throw new MaxcomputeConnectorException(CommonErrorCode.UNSUPPORTED_DATA_TYPE)
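
Editor's sketch of the pattern the reviewer is asking for: replace a generic `IllegalArgumentException` with a connector-specific exception that carries an error code, so callers can tell unsupported-type failures apart from other argument errors. `DemoErrorCode`, `DemoConnectorException`, and `ArrayTypeSketch` below are hypothetical stand-ins written for this example; the real `MaxcomputeConnectorException` and `CommonErrorCode` classes may have different constructors and code values.

```java
/** Hypothetical stand-in for an error-code enum such as CommonErrorCode. */
enum DemoErrorCode {
    UNSUPPORTED_DATA_TYPE("DEMO-01", "Unsupported data type"); // code value is made up

    final String code;
    final String description;

    DemoErrorCode(String code, String description) {
        this.code = code;
        this.description = description;
    }
}

/** Hypothetical stand-in for a connector exception that carries an error code. */
final class DemoConnectorException extends RuntimeException {
    private final DemoErrorCode errorCode;

    DemoConnectorException(DemoErrorCode errorCode, String detail) {
        super("[" + errorCode.code + "] " + errorCode.description + ": " + detail);
        this.errorCode = errorCode;
    }

    DemoErrorCode getErrorCode() {
        return errorCode;
    }
}

/** Simplified array type holder showing where the check would live. */
final class ArrayTypeSketch {
    final String elementType;

    ArrayTypeSketch(String elementType) {
        if (elementType == null) {
            // Previously: throw new IllegalArgumentException("Invalid element type.");
            throw new DemoConnectorException(
                    DemoErrorCode.UNSUPPORTED_DATA_TYPE, "Invalid element type.");
        }
        this.elementType = elementType;
    }
}
```

The same substitution would apply to the map and struct checks that the following review comments point at.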

this.keyType = keyType;
this.valueType = valueType;
} else {
throw new IllegalArgumentException("Invalid key or value type for map.");
Member

Same as above.

private void validateParameters(List<String> names, List<TypeInfo> typeInfos) {
if (names != null && typeInfos != null && !names.isEmpty() && !typeInfos.isEmpty()) {
if (names.size() != typeInfos.size()) {
throw new IllegalArgumentException("The amount of field names must be equal to the amount of field types.");
Member

Same as above.

throw new IllegalArgumentException("The amount of field names must be equal to the amount of field types.");
}
} else {
throw new IllegalArgumentException("Invalid name or element type for struct.");
Member

Same as above.

@stdnt-xiao
Contributor Author

@TyrantLucifer
I have fixed the problem you pointed out. thx

@Hisoka-X
Member

Can you help add a MaxCompute E2E test? That would be very helpful.

@stdnt-xiao
Contributor Author

stdnt-xiao commented Jan 17, 2023

> Can you help add a MaxCompute E2E test? That would be very helpful.

@Hisoka-X
I'm happy to complete it, but MaxCompute is a commercial product of Alibaba Cloud, so I can't build an instance from a Docker image.

@Hisoka-X
Member

> Can you help add a MaxCompute E2E test? That would be very helpful.

> @Hisoka-X I'm happy to complete it, but MaxCompute is a commercial product of Alibaba Cloud, so I can't build an instance from a Docker image.

Got it.

@CalvinKirs
Member

> Can you help add a MaxCompute E2E test? That would be very helpful.

> @Hisoka-X I'm happy to complete it, but MaxCompute is a commercial product of Alibaba Cloud, so I can't build an instance from a Docker image.

It would be even better if you could provide the corresponding test code. You can mark it as disabled so that it does not run by default, and other developers can then configure it for their own environment to run the IT.
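
Editor's sketch of the "disabled by default" idea. In a JUnit 5 test class the usual tools are the `@Disabled` annotation or a condition such as `@EnabledIfEnvironmentVariable`. The framework-free version below shows the same gating logic in plain Java so it can be read on its own: the test body only runs when the developer has supplied connection settings. The class name `MaxcomputeItGate` and the environment variable names are assumptions made up for this example, not something defined by the connector.

```java
import java.util.Map;

/** Hypothetical gate that skips an integration test unless credentials are configured. */
final class MaxcomputeItGate {
    // Assumed variable names, chosen only for this illustration.
    static final String[] REQUIRED_KEYS = {
        "MAXCOMPUTE_ACCESS_ID",
        "MAXCOMPUTE_ACCESS_KEY",
        "MAXCOMPUTE_ENDPOINT",
        "MAXCOMPUTE_PROJECT"
    };

    /** Returns true only if every required setting is present and non-blank. */
    static boolean isConfigured(Map<String, String> env) {
        for (String key : REQUIRED_KEYS) {
            String value = env.get(key);
            if (value == null || value.trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }

    /** Runs the test body when configured; otherwise reports that it was skipped. */
    static String runOrSkip(Map<String, String> env, Runnable testBody) {
        if (!isConfigured(env)) {
            return "SKIPPED";
        }
        testBody.run();
        return "RAN";
    }
}
```

A developer with access to a MaxCompute project could export the four variables and call `MaxcomputeItGate.runOrSkip(System.getenv(), ...)`; everyone else gets a skip instead of a failure.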

@stdnt-xiao
Contributor Author

> Can you help add a MaxCompute E2E test? That would be very helpful.

> @Hisoka-X I'm happy to complete it, but MaxCompute is a commercial product of Alibaba Cloud, so I can't build an instance from a Docker image.

> It would be even better if you could provide the corresponding test code. You can mark it as disabled so that it does not run by default, and other developers can then configure it for their own environment to run the IT.

OK, I will supplement.

@stdnt-xiao
Contributor Author

@Hisoka-X @CalvinKirs
I have added a MaxCompute E2E test, similar to connector-datahub-e2e.

@CodingGPT
Contributor

I tested the code using a tableSchema like "id BIGINT, shopname STRING, start_date DATE, price DOUBLE, create_time DATETIME, update_timestamp TIMESTAMP, f1 tinyint, f2 smallint, f3 int, f4 float, f5 decimal(38,18), f6 char(20), f7 varchar(30), f8 boolean" with a MySQL sink, and mostly got the right result.

TyrantLucifer previously approved these changes Mar 4, 2023
Member

@TyrantLucifer TyrantLucifer left a comment

Overall LGTM, but you should provide test screenshots to verify that this pull request is valid. Thank you for your contribution to SeaTunnel.

stdnt-xiao and others added 3 commits March 4, 2023 17:13
# Conflicts:
#	seatunnel-connectors-v2/connector-maxcompute/src/main/java/com/aliyun/odps/type/SimpleArrayTypeInfo.java
#	seatunnel-connectors-v2/connector-maxcompute/src/main/java/com/aliyun/odps/type/SimpleMapTypeInfo.java
#	seatunnel-connectors-v2/connector-maxcompute/src/main/java/com/aliyun/odps/type/SimpleStructTypeInfo.java
#	seatunnel-connectors-v2/connector-maxcompute/src/main/java/org/apache/seatunnel/connectors/seatunnel/maxcompute/sink/MaxcomputeWriter.java
#	seatunnel-connectors-v2/connector-maxcompute/src/main/java/org/apache/seatunnel/connectors/seatunnel/maxcompute/util/MaxcomputeTypeMapper.java
#	seatunnel-connectors-v2/connector-maxcompute/src/test/java/BasicTypeToOdpsTypeTest.java
#	seatunnel-e2e/seatunnel-connector-v2-e2e/connector-maxcompute-e2e/pom.xml
#	seatunnel-e2e/seatunnel-connector-v2-e2e/connector-maxcompute-e2e/src/test/java/org/apache/seatunnel/e2e/connector/maxcompute/MaxcomputeIT.java
#	seatunnel-e2e/seatunnel-connector-v2-e2e/pom.xml
@stdnt-xiao
Contributor Author

> Overall LGTM, but you should provide test screenshots to verify that this pull request is valid. Thank you for your contribution to SeaTunnel.

Maxcompute is a commercial product of Alibaba Cloud, so I can only provide screenshots.
image

Member

@hailin0 hailin0 left a comment

Member

@Hisoka-X Hisoka-X left a comment

LGTM

@Hisoka-X Hisoka-X merged commit 642901f into apache:dev Mar 4, 2023
@stdnt-xiao stdnt-xiao deleted the dev-3893 branch March 4, 2023 12:46
@Dannila

Dannila commented Jul 26, 2023

Could you help me check whether there is something wrong with my configuration file? It is as follows:

{
  env {
    execution.parallelism = 1
    job.mode = "BATCH"
  }
  source {
    Maxcompute {
      parallelism = 3
      accessId = "LxxxxxxxxxxxxxxxxxxxxxxxK"
      accesskey = "9xxxxxxxxxxxxxxxxxxxxxxxj"
      endpoint = "http://service.cn-shanghai.maxcompute.aliyun.com/api"
      project = "maxcompute_test_1"
      table_name = "ods_xxxxxxxxx_month_dailyupdate"
    }
  }
  sink {
    Jdbc {
      url = "jdbc:mysql://172.xx.x.xxx:xxxx"
      driver = "com.mysql.cj.jdbc.Driver"
      connection_check_timeout_sec = 100
      user = "root"
      password = "xxxxxxx"
      query = "insert into `ods`.`ods_repo_month_dailyupdate_maxc` (repo_id, repo_name, stars, commits, pushes, pr_creators, pr_reviews, pr_reviewers, total_issues, forks, month) values (?,?,?,?,?,?,?,?,?,?,?,?);"
    }
  }
}

My execution command is as follows:

${SEATUNNEL_HOME}/bin/start-seatunnel-flink-15-connector-v2.sh --config /opt/apache-seatunnel-2.3.2/script/ods_repo.conf -e run

But the task fails with the following error log:

Caused by: org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Cannot initialize resource provider.
   at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:158) ~[flink-dist-1.15.3.jar:1.15.3]
   at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:241) ~[flink-dist-1.15.3.jar:1.15.3]
   at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:218) ~[flink-dist-1.15.3.jar:1.15.3]
   ... 24 more
Caused by: org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Could not start resource manager client.
   at org.apache.flink.yarn.YarnResourceManagerDriver.initializeInternal(YarnResourceManagerDriver.java:191) ~[flink-dist-1.15.3.jar:1.15.3]
   at org.apache.flink.runtime.resourcemanager.active.AbstractResourceManagerDriver.initialize(AbstractResourceManagerDriver.java:81) ~[flink-dist-1.15.3.jar:1.15.3]
   at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:156) ~[flink-dist-1.15.3.jar:1.15.3]
   at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:241) ~[flink-dist-1.15.3.jar:1.15.3]
   at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:218) ~[flink-dist-1.15.3.jar:1.15.3]
   ... 24 more
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
   at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:81) ~[seatunnel-hadoop3-3.1.4-uber.jar:2.3.2]
...
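
Editor's sketch: independent of the YARN `ResourceManagerException` shown in the log, one part of a Jdbc sink config that can be checked mechanically is whether the number of `?` placeholders in the `query` matches the number of listed columns. The helper below is a naive illustration under stated assumptions: `InsertPlaceholderCheck` is a hypothetical name, it expects a simple `insert into t (cols) values (...)` statement, and it does not handle `?` inside string literals or parentheses inside the column list.

```java
import java.util.Locale;

/** Hypothetical sanity check for simple parameterized INSERT statements. */
final class InsertPlaceholderCheck {

    /** Index of the ')' that closes the first parenthesized list (the column list). */
    private static int columnListEnd(String sql) {
        int open = sql.indexOf('(');
        if (open < 0) throw new IllegalArgumentException("No column list found");
        int close = sql.indexOf(')', open + 1);
        if (close < 0) throw new IllegalArgumentException("Column list is not closed");
        return close;
    }

    /** Counts the comma-separated names in the column list. */
    static int countColumns(String sql) {
        int close = columnListEnd(sql);
        String list = sql.substring(sql.indexOf('(') + 1, close).trim();
        return list.isEmpty() ? 0 : list.split(",").length;
    }

    /** Counts '?' characters that appear after the VALUES keyword. */
    static int countPlaceholders(String sql) {
        int close = columnListEnd(sql);
        int values = sql.toLowerCase(Locale.ROOT).indexOf("values", close);
        if (values < 0) throw new IllegalArgumentException("No VALUES clause found");
        int count = 0;
        for (int i = values; i < sql.length(); i++) {
            if (sql.charAt(i) == '?') count++;
        }
        return count;
    }

    /** True when every listed column has exactly one placeholder. */
    static boolean isBalanced(String sql) {
        return countColumns(sql) == countPlaceholders(sql);
    }
}
```

Running `isBalanced` on the `query` string from a config before submitting the job is a cheap way to rule out this class of mistake; it says nothing about the cluster-side error in the stack trace.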
