Conversation

@openinx
Member

@openinx openinx commented Jun 3, 2021

This PR is trying to address issue #2572.

People who want to create an Iceberg table under a Hadoop catalog can use the following SQL:

CREATE TABLE hadoop_table (
  id   BIGINT,
  data STRING
) WITH (
  'connector' = 'iceberg',
  'catalog-name' = 'hadoop-catalog',
  'catalog-type' = 'hadoop',
  'catalog-database' = 'local_db',
  'warehouse' = 'hdfs://nn:9090/path/to/warehouse'
);

To create an Iceberg table under a Hive catalog, we could use the following SQL:

CREATE TABLE hive_table (
  id   BIGINT,
  data STRING
) WITH (
  'connector' = 'iceberg',
  'catalog-name' = 'hive-catalog',
  'catalog-type' = 'hive',
  'catalog-database' = 'default',
  'uri' = 'thrift://localhost:9093',
  'warehouse' = 'hdfs://nn:9090/path/to/warehouse'
);

The Flink CREATE TABLE with the 'connector'='iceberg' option actually maps the underlying Iceberg table to a Flink table managed in Flink's in-memory catalog, so in theory, executing an ALTER TABLE or DROP TABLE clause won't affect the underlying Iceberg table. For convenience, this PR will initialize/create the underlying Iceberg table if it does not exist when mapping the Flink table from the in-memory catalog to the Iceberg table, so people don't need to create the real Iceberg table before mapping the Flink table to it.

@github-actions github-actions bot added the flink label Jun 3, 2021
options.add(CATALOG_TYPE);
options.add(CATALOG_NAME);
options.add(CATALOG_DATABASE);
return Sets.newHashSet();
Contributor

Are you intending to return options here? Possibly I am not understanding the interface fully, but it seems like options should be returned.

Member Author

You are correct, we will need to return the options. Thanks for checking.
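
For reference, a minimal sketch of the fix being discussed, assuming the CATALOG_TYPE, CATALOG_NAME and CATALOG_DATABASE ConfigOptions from the hunk above and Flink's DynamicTableFactory#requiredOptions contract; whether every one of these options ends up required is a separate judgment call:

// Sketch only: return the populated set instead of a fresh empty one.
@Override
public Set<ConfigOption<?>> requiredOptions() {
  Set<ConfigOption<?>> options = Sets.newHashSet();
  options.add(CATALOG_TYPE);
  options.add(CATALOG_NAME);
  options.add(CATALOG_DATABASE);
  return options;
}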

return String.format("file://%s", warehouse.getRoot().getAbsolutePath());
}

protected List<Row> sql(String query, Object... args) {
Contributor

FlinkTestBase already has a sql method. It seems that the only difference is the extra call to tableResult.await() here. Is it necessary, considering we are calling tableResult.collect()?

@gaborgsomogyi Jun 11, 2021

+1 on double-checking whether it's needed at all. When I removed the whole function, the test still passed.
If there is an objective reason why this is needed, then @Override must be added.

Member Author

Sounds reasonable.
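
For context, a minimal sketch of the helper under discussion, assuming the getTableEnv() accessor from FlinkTestBase; draining collect() already blocks until all rows are produced, which is why the extra await() looks redundant:

protected List<Row> sql(String query, Object... args) {
  TableResult tableResult = getTableEnv().executeSql(String.format(query, args));
  try (CloseableIterator<Row> iter = tableResult.collect()) {
    // Iterating the result consumes the job output, so no separate await() is needed.
    return Lists.newArrayList(iter);
  } catch (Exception e) {
    throw new RuntimeException("Failed to collect table result", e);
  }
}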

tableProps.put("catalog-database", "local_db");
tableProps.put(CatalogProperties.WAREHOUSE_LOCATION, warehouseRoot());

sql("CREATE TABLE hadoop_table (id BIGINT, data STRING) WITH %s", toWithClause(tableProps));
Contributor

Here we configure the catalog-name and catalog-database in the props. Flink SQL supports the namespace syntax; trying to understand whether that is a problem.

https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/create/

CREATE TABLE  [catalog_name.][db_name.]table_name

Member Author

Here I'd like to explain why I planned to introduce catalog-name & catalog-database in the table properties: for Flink users, the SQL CREATE TABLE iceberg_sink(id BIGINT, data STRING) WITH ('connector'='iceberg', ... ) actually creates a Flink table managed in Flink's in-memory catalog default_catalog and default_database; it doesn't have any relationship to the Iceberg catalog, which manages the storage layer's database -> table -> table location and provides global table lock services.

Those table properties from the Flink CREATE DDL actually build the mapping between Flink's in-memory catalog and the storage-layer catalog.

If we use CREATE TABLE [catalog_name.][db_name.]table_name, that is the way we currently document; in that case the Flink catalog is exactly the same Iceberg catalog.
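
To make the distinction concrete, a hedged sketch in the style of this PR's tests (sql(...), toWithClause(...) and warehouseRoot() are the test helpers shown elsewhere in this diff; the catalog and database names are illustrative):

// Connector approach: the table lives in Flink's in-memory default_catalog.default_database
// and is mapped to the Iceberg table <catalog-database>.hadoop_table in the configured catalog.
sql("CREATE TABLE hadoop_table (id BIGINT, data STRING) WITH %s", toWithClause(tableProps));

// Catalog approach: the Flink catalog *is* the Iceberg catalog, so the identifier itself
// addresses the Iceberg database and table.
sql("CREATE CATALOG hadoop_catalog WITH ('type'='iceberg', 'catalog-type'='hadoop', 'warehouse'='%s')",
    warehouseRoot());
sql("CREATE TABLE hadoop_catalog.local_db.hadoop_table (id BIGINT, data STRING)");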

@qyw919867774 Mar 30, 2022

I understand the meaning of specifying catalog-database, but I still don't understand why catalog-name must be specified. What's the use of specifying catalog-name? For Flink, any value of catalog-name can be specified, and it has no impact on my data processing tasks. @openinx

Preconditions.checkNotNull(catalogDatabase, "Table property '%s' cannot be null", CATALOG_DATABASE.key());

org.apache.hadoop.conf.Configuration hadoopConf = FlinkCatalogFactory.clusterHadoopConf();
CatalogLoader catalogLoader = FlinkCatalogFactory.createCatalogLoader(catalogName, tableProps, hadoopConf);
Contributor

I am a little confused. FlinkCatalogFactory#createCatalogLoader is a non-static method. How does this line work?

Also, since we are creating a FlinkCatalogFactory instance below, we can call the non-static createCatalogLoader there.

@stevenzwu please see above. In this PR, createCatalogLoader is made static. +1 on not making it static.

Member Author

Yes, we could use the factory created below to create the CatalogLoader instance; we don't have to make it static here.
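
A two-line sketch of that non-static alternative, assuming catalogName, tableProps and hadoopConf are in scope as in the hunk above:

FlinkCatalogFactory factory = new FlinkCatalogFactory();
// createCatalogLoader stays an instance method; no need to make it static.
CatalogLoader catalogLoader = factory.createCatalogLoader(catalogName, tableProps, hadoopConf);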

@Rule
public final TemporaryFolder warehouse = new TemporaryFolder();

Nit: Extra newline can be removed.

EnvironmentSettings settings = EnvironmentSettings
.newInstance()
.useBlinkPlanner()
.inBatchMode()

I've double-checked that it works with inStreamingMode(). It would be good to add parameterized tests which run in both modes.

Member Author

Yes, we can add the parameterized isStreaming flag to test both the batch job and the streaming job, although I think the batch job is enough to cover this PR's changes.

OK, this is nice to have.
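
A sketch of how the parameterized flag could drive the environment setup, assuming the Flink 1.12/1.13 EnvironmentSettings API and an isStreaming field supplied by the JUnit 4 Parameterized runner:

EnvironmentSettings.Builder settingsBuilder = EnvironmentSettings
    .newInstance()
    .useBlinkPlanner();
// isStreaming comes from @Parameterized.Parameters, as discussed above.
EnvironmentSettings settings = isStreaming
    ? settingsBuilder.inStreamingMode().build()
    : settingsBuilder.inBatchMode().build();
TableEnvironment tEnv = TableEnvironment.create(settings);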

@gaborgsomogyi left a comment

LGTM.

@openinx
Member Author

openinx commented Jun 23, 2021

@rdblue, @stevenzwu do you have any other concerns about this PR? I think this is a very important issue for Flink users because it's more in line with their usage habits.

@rdblue
Contributor

rdblue commented Jun 23, 2021

@openinx, sorry for the delay. I'll make some time to review this.

I also know that while I was out there were a few PRs that I wasn't able to help move along. If you have a list of PRs that are important for you to get in, please send them to me and I'll make time to get them reviewed as I can. Thank you for being patient with me!

try {
  flinkCatalog.createDatabase(catalogDatabase, new CatalogDatabaseImpl(Maps.newHashMap(), null), true);
} catch (DatabaseAlreadyExistException | CatalogException e) {
  throw new RuntimeException(e);
Contributor

nit: maybe add an error msg like "Failed to create database"

try {
  flinkCatalog.createTable(objectPath, catalogTable, true);
} catch (TableAlreadyExistException | CatalogException e) {
  throw new RuntimeException(e);
Contributor

nit: add an error msg

// Drop and create it again.
sql("DROP TABLE %s", TABLE_NAME);
sql("CREATE TABLE %s (id BIGINT, data STRING) WITH %s", TABLE_NAME, toWithClause(tableProps));
Assert.assertEquals("Should have expected rows",
Contributor

drop table doesn't purge data and metadata files, right? Creating the table again will be able to see the old data. I guess it is to prevent disaster from an accidental "drop table"?

Member Author

drop table doesn't purge data and metadata files, right?

Yes, this DROP TABLE test_table is actually doing DROP TABLE default_catalog.default_database.test_table, which means dropping test_table from Flink's in-memory catalog; the underlying Iceberg catalog & table (test-hadoop.default.test_table) that the Flink table maps to won't be affected. Here I want to make sure that we can map the Flink in-memory table to the underlying Iceberg table again once we drop and re-create it.

sql("SELECT * FROM %s", TABLE_NAME));

sql("DROP TABLE %s", TABLE_NAME);
HiveMetaStoreClient metaStoreClient = new HiveMetaStoreClient(hiveConf);
Contributor

Should we run this cleanup block in a try-finally? Another option is to split this file into two files (one for Hadoop and one for Hive). We could also potentially remove some of the redundant code with a base test class.

Member Author

Moving the cleanup block into a try-finally looks great to me. About splitting it into two files for Hadoop and Hive: I think we won't have too many test branches for Hadoop and Hive, because most of the Flink+Iceberg integration test work is done in the Flink catalog test suite; this UT is only used to make CREATE TABLE xx () WITH ('connector'='iceberg') work, so I put them into this single test class.
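
A rough sketch of the try-finally shape being suggested, assuming the hiveConf, TABLE_NAME and sql(...) fixtures from the surrounding test and an enclosing test method declared with throws Exception:

HiveMetaStoreClient metaStoreClient = new HiveMetaStoreClient(hiveConf);
try {
  // CREATE TABLE / INSERT INTO / SELECT assertions against the Hive-backed table go here.
} finally {
  sql("DROP TABLE IF EXISTS %s", TABLE_NAME);
  metaStoreClient.close();
}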

@openinx
Member Author

openinx commented Jun 24, 2021

@openinx, sorry for the delay. I'll make some time to review this.

I also know that while I was out there were a few PRs that I wasn't able to help move along. If you have a list of PRs that are important for you to get in, please send them to me and I'll make time to get them reviewed as I can. Thank you for being patient with me!

@rdblue, thanks for your time. All the PRs I have published are here: https://github.com/apache/iceberg/pulls/openinx . There are three parts:

  1. The first part is about improvements to the iceberg+flink module; I think this PR is the highest priority. The others are related to the Flink CDC write path; let's discuss them in the second part.

  2. Write & analyze change log events in format v2 Iceberg tables. After Flink: Support SQL primary key #2410, we can now ingest change log events into Apache Iceberg with pure Flink SQL. But we still have other important issues that need to be addressed:
    a. How to ensure the stability of streaming jobs? Currently, the most important PR is Core: Add RocksDBStructLikeMap #2680. There are some other problems; for example, random change log events may cause a large number of Parquet writers to be opened in a checkpoint, eventually causing the job to OOM. I will open a specific issue for that.
    b. How to implement the compaction action for v2. As we know, we are implementing minor compaction (I mean translating equality deletes to pos-deletes), but I think we could get the major compaction merged first (of course, the current patch still has a few concerns that need to be addressed, which I will do). I also have a temporary fix for Handle the case that RewriteFiles and RowDelta commit the transaction at the same time #2308 and will publish it to the apache repo for review.

  3. Aliyun OSS + DLF integration work. I have an open PR for Aliyun OSS and a pending PR for Aliyun DLF integration (currently in my personal repo). Let's put these PRs aside for now; I will try to split them appropriately so that we can review them better in the future.

Contributor

@stevenzwu left a comment

LGTM

if (catalog != null) {
  tableLoader = createTableLoader(catalog, objectIdentifier.toObjectPath());
} else {
  tableLoader = createTableLoader(catalogTable, tableProps, objectIdentifier.getObjectName());
Contributor

@rdblue Jun 26, 2021

Does this discard the rest of the identifier? What if the database is non-null? It seems like it should not be ignored and is a better way to get the database than a property.

Member Author

I think this comment is similar to @stevenzwu's question. If the Flink table identifier is flink_catalog.flink_database.table_name (when creating the table using connector=iceberg, not the Iceberg catalog approach), then we map the Flink table named table_name to the underlying Iceberg table table_name in the database configured in the table properties. Flink's flink_catalog and flink_database do not have any relationship to the Iceberg catalog & database.

Maybe it's better to forbid specifying the connector=iceberg property when creating a table under the Iceberg catalog (adding a Preconditions.checkArgument in createDynamicTableSource and createDynamicTableSink), so we don't mix the two approaches of creating a Flink+Iceberg table and mapping a Flink table to an Iceberg table.
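
A hypothetical sketch of that guard (the "connector" property key and the message are illustrative; catalog is the Iceberg FlinkCatalog this factory is constructed with, which is null on the connector-style path):

Preconditions.checkArgument(
    catalog == null || !"iceberg".equalsIgnoreCase(tableProps.get("connector")),
    "Cannot use 'connector'='iceberg' when the table is created under an Iceberg catalog");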

Contributor

Okay, I thought that this was a way to run DDL even if you don't have an Iceberg catalog defined. It sort of does that, but it also creates a reference in the in-memory catalog. That's fine, but it does bring up a couple other questions:

  • How do you create an in-memory catalog table pointing to an Iceberg table if the Iceberg table already exists? What if the DDL, like the schema, doesn't match?
  • Why share the table name between the in-memory and external catalog but not the database? I think it makes sense to default the external database and table name using the ones from the DDL command but allow both to be overridden.

Member Author

How do you create an in-memory catalog table pointing to an Iceberg table if the Iceberg table already exists? What if the DDL, like the schema, doesn't match?

If the underlying Iceberg table already exists, then we still need to create the in-memory catalog table pointing to it. If the in-memory catalog table schema does not match the underlying Iceberg table, the CREATE TABLE statement won't throw any exception, but executing a SELECT or INSERT INTO query will throw an exception if the RowData read from the Iceberg table cannot be parsed with the in-memory catalog table schema. People will then need to re-create the in-memory table and map it to the underlying table once again. That's the default behavior for Flink users, because most Flink connectors behave similarly (such as the JDBC, Hive, and HBase connectors).

Member Author

Why share the table name between the in-memory and external catalog but not the database?

I think you are right. I checked the JDBC connector; we can specify a different JDBC table name when creating the Flink table. It makes sense to allow both the db & table name to be overridden.
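
As an illustration only (the 'catalog-table' option name is hypothetical for this PR; whichever option is finally exposed would play this role), overriding both the database and the table name could look like:

sql("CREATE TABLE flink_table (id BIGINT, data STRING) WITH ("
    + "'connector'='iceberg', 'catalog-name'='hive-catalog', 'catalog-type'='hive', "
    + "'catalog-database'='iceberg_db', 'catalog-table'='iceberg_table', "
    + "'uri'='thrift://localhost:9093', 'warehouse'='hdfs://nn:9090/path/to/warehouse')");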

TableLoader tableLoader = createTableLoader(objectPath);
ObjectIdentifier objectIdentifier = context.getObjectIdentifier();
Map<String, String> tableProps = context.getCatalogTable().getOptions();
CatalogTable catalogTable = context.getCatalogTable();
Contributor

When is catalogTable set in the context? Is it possible that it is set and the factory's context is non-null? In that case, should an error be thrown?

Member Author

I think you are asking whether it is possible to access the catalogTable via context.getCatalogTable() before it is set inside the context, in which case catalogTable would be null.

I don't think we need to worry about that, because the catalogTable is set in the context constructor here, and in the Flink code path the catalogTable that is set must not be null. So I think it's OK here :-)

Contributor

Sorry, I meant is there a case when context.getCatalogTable() is non-null but passed to a source where catalog is set in the constructor?

Member Author

I got your question; yes, it's possible in your case. For example, when we create a table under the Iceberg catalog in Flink SQL, it will create the FlinkDynamicTableFactory with the specified Iceberg catalog.

There should be no problem in that case: the catalogTable is the correct Flink table parsed from SQL such as INSERT INTO iceberg_catalog.iceberg_db.iceberg_table, although we don't use it in the current code path.

try {
  flinkCatalog.createDatabase(catalogDatabase, new CatalogDatabaseImpl(Maps.newHashMap(), null), true);
} catch (DatabaseAlreadyExistException | CatalogException e) {
  throw new RuntimeException(String.format("Failed to create database %s.%s", catalogName, catalogDatabase), e);
Contributor

What about using Iceberg's exceptions here, like AlreadyExistsException, instead of RuntimeException?

Member Author

Yeah, that looks pretty good to me!
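
A sketch of that suggestion, assuming Iceberg's org.apache.iceberg.exceptions.AlreadyExistsException and its (Throwable, String, Object...) formatted-message constructor; whether the CatalogException case deserves the same exception type is a judgment call:

try {
  flinkCatalog.createDatabase(catalogDatabase, new CatalogDatabaseImpl(Maps.newHashMap(), null), true);
} catch (DatabaseAlreadyExistException | CatalogException e) {
  throw new AlreadyExistsException(e, "Failed to create database %s.%s", catalogName, catalogDatabase);
}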

try {
  flinkCatalog.createTable(objectPath, catalogTable, true);
} catch (TableAlreadyExistException | CatalogException e) {
  throw new RuntimeException(String.format("Failed to create table %s.%s", catalogName, objectPath), e);
Contributor

Same here. I think it is better to throw Iceberg exceptions if they exist.

}

@Test
public void testHadoop() {
Contributor

I would like to have tests for when the database option and the one from the identifier conflict.

private final boolean isStreaming;
private volatile TableEnvironment tEnv;

@Parameterized.Parameters(name = "isStreaming={0}")
Contributor

Could the catalog be a parameter as well? We do that in Spark tests and it works well.
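
A sketch of what parameterizing both dimensions might look like with the JUnit 4 runner already used here (the catalog names are illustrative placeholders):

@Parameterized.Parameters(name = "catalogName={0}, isStreaming={1}")
public static Iterable<Object[]> parameters() {
  return Arrays.asList(new Object[][] {
      {"testhadoop", true}, {"testhadoop", false},
      {"testhive", true}, {"testhive", false}
  });
}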

@rdblue
Contributor

rdblue commented Jun 26, 2021

@openinx, I had a few questions but this looks close overall.

Thanks for giving me an overview of what you're currently working on. I agree that the v2 writes are important. And I think we need the initial fix for #2308 as soon as possible. I'll mark that as a blocker for 0.12.0.


// Drop and create it again.
sql("DROP TABLE %s", TABLE_NAME);
sql("CREATE TABLE %s (id BIGINT, data STRING) WITH %s", TABLE_NAME, toWithClause(tableProps));
Contributor

Is it possible to omit the schema for cases where the underlying table already exists?

Member Author

Unfortunately, Flink SQL doesn't provide the syntax to support that now, although I agree it would be a better user experience. Currently, Flink SQL provides CREATE TABLE a LIKE b, but it still requires that table b is a Flink catalog table.

@openinx
Member Author

openinx commented Jul 13, 2021

Ping @rdblue, how do you feel about this PR now?

@rdblue
Contributor

rdblue commented Jul 20, 2021

@openinx, I'll take another look at this. Is there a description of the proposed behavior for these tables in the Flink catalog?

@openinx
Member Author

openinx commented Sep 2, 2021

Is there a description of the proposed behavior for these tables in the Flink catalog?

Yes, the default in-memory Flink catalog maintains those tables. Once people drop those tables from the in-memory Flink catalog, the in-memory catalog will no longer see them, but the underlying Iceberg tables in the file system are still there. If people want to drop that data, the correct way is to drop the tables from the backend catalog.

@openinx
Member Author

openinx commented Sep 2, 2021

I think this PR has been fully reviewed. Since this useful feature has been blocked for so long, I plan to get it merged into the official Apache Iceberg repo. If people have more comments or improvements, please file a new issue or PR to address them. Thanks all for reviewing!
