Hive: Fix an error when create external table in hive catalog #4916
@renshangtao: No version of CDH ships Hive 2.3.8. I think CDH 5 has Hive 1.1.x, CDH 6 has Hive 2.1.x, and CDP 7 has Hive 3.1.x, so there must be some miscommunication here. Could you please elaborate? What is the error when you try to create the external table the first time (without creating the Iceberg table first)? Also, could you please describe what you are trying to achieve here? Thanks.
@pvary Thank you for your reply. I didn't make it clear at first: this issue is only an error message and doesn't affect queries. I used Hive 2.3.8 for testing; today I tested with Hive 2.1.1 and the problem is the same. We have a customer who previously used CDH 6.0 to process big data and has now deployed a Flink + Iceberg environment to process streaming data. They want to use CDH Hive 2.1.1 to read the Iceberg tables created by Flink, because they have a lot of SQL written for Hive. So I create an EXTERNAL table in CDH Hive 2.1.1 to access the Iceberg table created by Flink. The database name and table name must match those in Iceberg, otherwise I can't read any data from the table. When I create a table with the same name in Hive, it returns a "Table already exists" error. The two environments use different HMS instances. Thank you.
@renshangtao: Thanks for the detailed answer. This helps a lot.
I see 2 options:
@pvary Thank you. The PR I submitted only aims to fix the error message. I think it should not return the "table already exists" error message.
@renshangtao: Registering a table to multiple catalogs could cause serious trouble later, so it is not supported. See: #4946 (comment) If the same HMS could not be used then it might be time to develop the feature where the non-default HMS could be used for a catalog when accessing the table:
This way the linked table will reflect the changes on the original table, and there will be no conflict between the cleanup processes.
@pvary OK, thank you.
To access an Iceberg table from CDH Hive 2.1.1, we need to create an external table in CDH Hive and make sure the database name and table name match those in Iceberg. If they do not match, no data can be queried.
Creating the external table reports a "table already exists" error, because when iceberg.mr creates the table with a Hive catalog, it does not first check whether the table exists.
hive> CREATE EXTERNAL TABLE iceberg_db.test_eet(id int, chengshi string)
      STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
      LOCATION 'hdfs://xx.x.xx.xx:9000/user/hive/warehouse/iceberg_db.db/test_eet'
      TBLPROPERTIES ('iceberg.catalog'='hive_catalog');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.iceberg.exceptions.AlreadyExistsException: Table already exists: iceberg_db.test_eet
Like this:
cdh hiveA 2.1.1
cdh hmsA
iceberg B
iceberg hmsB
Hive A wants to read the Iceberg table in Iceberg B, so we must create an external table in Hive A. When I create the external table in Hive A, an error is returned: Table already exists: iceberg_db.test_eet.
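The check-before-create idea described above can be sketched roughly as follows. This is only an illustration and not the actual diff of this PR: `InMemoryCatalog`, `createOrReuse`, and the example location are hypothetical names made up for the sketch, and the catalog here is a plain map rather than the Iceberg `Catalog` API or a real HMS.

```java
import java.util.HashMap;
import java.util.Map;

public class ExternalTableRegistration {

    /** Hypothetical stand-in for a catalog; NOT the Iceberg Catalog API. */
    static class InMemoryCatalog {
        private final Map<String, String> tables = new HashMap<>();

        boolean tableExists(String name) {
            return tables.containsKey(name);
        }

        // Mirrors the reported behavior: creating an existing table fails.
        void createTable(String name, String location) {
            if (tables.containsKey(name)) {
                throw new IllegalStateException("Table already exists: " + name);
            }
            tables.put(name, location);
        }
    }

    /**
     * Hypothetical helper: check for the table first and only create it
     * when it is missing, instead of failing on the second attempt.
     */
    static String createOrReuse(InMemoryCatalog catalog, String name, String location) {
        if (catalog.tableExists(name)) {
            return "reused";
        }
        catalog.createTable(name, location);
        return "created";
    }

    public static void main(String[] args) {
        InMemoryCatalog catalog = new InMemoryCatalog();
        // Placeholder location, assumed for illustration only.
        String location = "hdfs://example/warehouse/iceberg_db.db/test_eet";

        System.out.println(createOrReuse(catalog, "iceberg_db.test_eet", location)); // created
        System.out.println(createOrReuse(catalog, "iceberg_db.test_eet", location)); // reused
    }
}
```

Note that this only models the error-message side of the problem. As pointed out in the conversation, registering the same table in multiple catalogs is not supported, so skipping the error does not make that setup safe.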