Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
296 changes: 296 additions & 0 deletions site/docs/flink.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,296 @@
<!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
-->

# Flink

Apache Iceberg support both [Apache Flink](https://flink.apache.org/)'s DataStream API and Table API to write records into iceberg table. Currently,
we only integrate iceberg with apache flink 1.11.x .

| Feature support | Flink 1.11.0 | Notes |
|------------------------------------------------------------------------|--------------------|--------------------------------------------------------|
| [SQL create catalog](#creating-catalogs-and-using-catalogs) | ✔️ | |
| [SQL create database](#create-database) | ✔️ | |
| [SQL create table](#create-table) | ✔️ | |
| [SQL alter table](#alter-table) | ✔️ | Only support altering table properties, Columns/PartitionKey changes are not supported now|
| [SQL drop_table](#drop-table) | ✔️ | |
| [SQL select](#querying-with-sql) | ️ | |
| [SQL insert into](#insert-into) | ✔️ ️ | Support both streaming and batch mode |
| [SQL insert overwrite](#insert-overwrite) | ✔️ ️ | |
| [DataStream read](#reading-with-datastream) | ✔️ ️ | |
| [DataStream append](#appending-data) | ✔️ ️ | |
| [DataStream overwrite](#overwrite-data) | ✔️ ️ | |
| [Metadata tables](#inspecting-tables) | ️ | |

## Preparation

To create iceberg table in flink, we recommend to use [Flink SQL Client](https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sqlClient.html) because it's easier for users to understand the concepts.

Step.1 Downloading the flink 1.11.x binary package from the apache flink [download page](https://flink.apache.org/downloads.html). We now use scala 2.12 to archive the apache iceberg-flink-runtime jar, so it's recommended to use flink 1.11 bundled with scala 2.12.

```bash
wget https://downloads.apache.org/flink/flink-1.11.1/flink-1.11.1-bin-scala_2.12.tgz
tar xzvf flink-1.11.1-bin-scala_2.12.tgz
```

Step.2 Start a standalone flink cluster within hadoop environment.

```bash
# HADOOP_HOME is your hadoop root directory after unpack the binary package.
export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`

# Start the flink standalone cluster
./bin/start-cluster.sh
```

Step.3 Start the flink SQL client.

We've created a separate `flink-runtime` module in iceberg project to generate a bundled jar, which could be loaded by flink SQL client directly.

If we want to build the `flink-runtime` bundled jar manually, please just build the `iceberg` project and it will generate the jar under `<iceberg-root-dir>/flink-runtime/build/libs`. Of course, we could also download the `flink-runtime` jar from the [apache official repository](https://repo.maven.apache.org/maven2/org/apache/iceberg/iceberg-flink-runtime/).

```bash
# HADOOP_HOME is your hadoop root directory after unpack the binary package.
export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`

./bin/sql-client.sh embedded -j <flink-runtime-directory>/iceberg-flink-runtime-xxx.jar shell
```

By default, iceberg has included hadoop jars for hadoop catalog. If we want to use hive catalog, we will need to load the hive jars when opening the flink sql client. Fortunately, apache flink has provided a [bundled hive jar](https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/hive/#using-bundled-hive-jar) for sql client. So we could open the sql client
as the following:

```bash
# HADOOP_HOME is your hadoop root directory after unpack the binary package.
export HADOOP_CLASSPATH=`$HADOOP_HOME/bin/hadoop classpath`

# wget the flink-sql-connector-hive-2.3.6_2.11-1.11.0.jar from the above bundled jar URL firstly.

# open the SQL client.
./bin/sql-client.sh embedded \
-j <flink-runtime-directory>/iceberg-flink-runtime-xxx.jar \
-j <hive-bundlded-jar-directory>/flink-sql-connector-hive-2.3.6_2.11-1.11.0.jar \
shell
```

## Creating catalogs and using catalogs.

Flink 1.11 support to create catalogs by using flink sql.

This creates an iceberg catalog named `hive_catalog` that loads tables from a hive metastore:

```sql
CREATE CATALOG hive_catalog WITH (
'type'='iceberg',
'catalog-type'='hive',
'uri'='thrift://localhost:9083',
'clients'='5',
'property-version'='1'
);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's necessary to fix this issue #1437 before we get this pr merged, because when i create the hive_catalog, it will try to create the default database but failed to :

Flink SQL> CREATE CATALOG hive_catalog WITH (
>   'type'='iceberg',
>   'catalog-type'='hive',
>   'uri'='thrift://localhost:9083',
>   'clients'='5',
>   'property-version'='1'
> );
2020-09-24 18:24:21,111 WARN  org.apache.flink.table.client.cli.CliClient                  [] - Could not execute SQL statement.
org.apache.flink.table.client.gateway.SqlExecutionException: Could not execute statement: CREATE CATALOG hive_catalog WITH (
  'type'='iceberg',
  'catalog-type'='hive',
  'uri'='thrift://localhost:9083',
  'clients'='5',
  'property-version'='1'
)
	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeSql(LocalExecutor.java:362) ~[flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.cli.CliClient.callDdl(CliClient.java:642) ~[flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.cli.CliClient.callDdl(CliClient.java:637) ~[flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.cli.CliClient.callCommand(CliClient.java:357) ~[flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at java.util.Optional.ifPresent(Optional.java:159) [?:1.8.0_221]
	at org.apache.flink.table.client.cli.CliClient.open(CliClient.java:212) [flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:142) [flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.SqlClient.start(SqlClient.java:114) [flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.SqlClient.main(SqlClient.java:201) [flink-sql-client_2.12-1.11.1.jar:1.11.1]
Caused by: java.lang.IllegalArgumentException: Can not create a Path from a null string
	at org.apache.hadoop.fs.Path.checkPathArg(Path.java:159) ~[hadoop-common-2.9.2.jar:?]
	at org.apache.hadoop.fs.Path.<init>(Path.java:175) ~[hadoop-common-2.9.2.jar:?]
	at org.apache.hadoop.fs.Path.<init>(Path.java:110) ~[hadoop-common-2.9.2.jar:?]
	at org.apache.iceberg.hive.HiveCatalog.convertToDatabase(HiveCatalog.java:459) ~[?:?]
	at org.apache.iceberg.hive.HiveCatalog.lambda$createNamespace$5(HiveCatalog.java:214) ~[?:?]
	at org.apache.iceberg.hive.ClientPool.run(ClientPool.java:54) ~[?:?]
	at org.apache.iceberg.hive.HiveCatalog.createNamespace(HiveCatalog.java:213) ~[?:?]
	at org.apache.iceberg.flink.FlinkCatalog.createDatabase(FlinkCatalog.java:195) ~[?:?]
	at org.apache.iceberg.flink.FlinkCatalog.open(FlinkCatalog.java:116) ~[?:?]
	at org.apache.flink.table.catalog.CatalogManager.registerCatalog(CatalogManager.java:191) ~[flink-table-blink_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.createCatalog(TableEnvironmentImpl.java:1086) ~[flink-table-blink_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeOperation(TableEnvironmentImpl.java:1019) ~[flink-table-blink_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:690) ~[flink-table-blink_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.gateway.local.LocalExecutor.lambda$executeSql$7(LocalExecutor.java:360) ~[flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.gateway.local.ExecutionContext.wrapClassLoader(ExecutionContext.java:255) ~[flink-sql-client_2.12-1.11.1.jar:1.11.1]
	at org.apache.flink.table.client.gateway.local.LocalExecutor.executeSql(LocalExecutor.java:360) ~[flink-sql-client_2.12-1.11.1.jar:1.11.1]
	... 8 more

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm not sure about that. Maybe we should not be using the value of hive.metastore.warehouse.dir.

If we are not using hive.metastore.uris, then it doesn't make sense to use the Hive warehouse path, either. Maybe we should support the warehouse config property for Hive catalog. Then we could recommend how to configure the catalog with it and not need to load hive-site.xml.

```

* `type`: Please just use `iceberg` for iceberg table format. (Required)
* `catalog-type`: Iceberg currently support `hive` or `hadoop` catalog type. (Required)
* `uri`: The Hive metastore's thrift URI. (Required)
* `clients`: The Hive metastore client pool size, default value is 2. (Optional)
* `property-version`: Version number to describe the property version. This property can be used for backwards compatibility in case the property format changes. The currently property version is `1`. (Optional)

Iceberg also supports a directory-based catalog in HDFS that can be configured using `'catalog-type'='hadoop'`:

```sql
CREATE CATALOG hadoop_catalog WITH (
'type'='iceberg',
'catalog-type'='hadoop',
'warehouse'='hdfs://nn:8020/warehouse/path',
'property-version'='1'
);
```

* `warehouse`: The HDFS directory to store metadata files and data files. (Required)

We could execute the sql command `USE CATALOG hive_catalog` to set the current catalog.

## DDL commands

### `CREATE DATABASE`

By default, iceberg will use the `default` database in flink. Using the following example to create a separate database if we don't want to create tables under the `default` database:

```sql
CREATE DATABASE iceberg_db;
USE iceberg_db;
```

### `CREATE TABLE`

```sql
CREATE TABLE hive_catalog.default.sample (
id BIGINT COMMENT 'unique id',
data STRING
);
```

Table create commands support the most commonly used [flink create clauses](https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/sql/create.html#create-table) now, including:

* `PARTITION BY (column1, column2, ...)` to configure partitioning, apache flink does not yet support hidden partitioning.
* `COMMENT 'table document'` to set a table description.
* `WITH ('key'='value', ...)` to set [table configuration](../configuration) which will be stored in apache iceberg table properties.

Currently, it does not support computed column, primary key and watermark definition etc.

### `PARTITIONED BY`

To create a partition table, use `PARTITIONED BY`:

```sql
CREATE TABLE hive_catalog.default.sample (
id BIGINT COMMENT 'unique id',
data STRING
) PARTITIONED BY (data);
```

Apache Iceberg support hidden partition but apache flink don't support partitioning by a function on columns, so we've no way to support hidden partition in flink DDL now, we will improve apache flink DDL in future.

### `ALTER TABLE`

Iceberg only support altering table properties in flink 1.11 now.

```sql
ALTER TABLE hive_catalog.default.sample SET ('write.format.default'='avro')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is UNSET supported as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we flink does not yet support UNSET now, it only provide SET syntax. That means we could not remove the table properties from iceberg table, but could set it to the target value. It sounds not so friendly to users but could meet their requirement now. @JingsongLi , we may need to propose apache flink to support UNSET ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://issues.apache.org/jira/browse/FLINK-19062 There is a ticket for tracking this.

```

### `ALTER TABLE .. RENAME TO`

```sql
ALTER TABLE hive_catalog.default.sample RENAME TO hive_catalog.default.new_sample;
```

### `DROP TABLE`

To delete a table, run:

```sql
DROP TABLE hive_catalog.default.sample;
```

## Querying with SQL

Iceberg does not support streaming read or batch read in flink now, it's still working in-progress.

## Writing with SQL

Iceberg support both `INSERT INTO` and `INSERT OVERWRITE` in flink 1.11 now.

Notice: we could execute the following sql command to switch the execute type from 'streaming' mode to 'batch' mode, and vice versa:

```sql
-- Execute the flink job in streaming mode for current session context
SET execution.type = streaming

-- Execute the flink job in batch mode for current session context
SET execution.type = batch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a good place to link to in Flink docs for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

```

### `INSERT INTO`

To append new data to a table with a flink streaming job, use `INSERT INTO`:

```sql
INSERT INTO hive_catalog.default.sample VALUES (1, 'a');
INSERT INTO hive_catalog.default.sample SELECT id, data from other_kafka_table;
```

### `INSERT OVERWRITE`

To replace data in the table with the result of a query, use `INSERT OVERWRITE` in batch job (flink streaming job does not support `INSERT OVERWRITE`). Overwrites are atomic operations for Iceberg tables.

Partitions that have rows produced by the SELECT query will be replaced, for example:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, this is a good way to state what will happen.

But, the example is incomplete because it doesn't state which partitions would be replaced. I think this needs to have an example table as well as a query. If Flink supports a CASE statement, then I suggest partitioning by "even" and "odd" ids. That's easy to understand and you can show what happens with very few rows.


```sql
INSERT OVERWRITE sample VALUES (1, 'a');
```

Iceberg also support overwriting given partitions by the `select` values:

```sql
INSERT OVERWRITE hive_catalog.default.sample PARTITION(data='a') SELECT 6;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation for overwrites doesn't account for hidden partitions.

When all the partition columns are set a value in PARTITION clause, it is inserting into a static partition. . . . When partial partition columns (prefix part of all partition columns) are set a value in PARTITION clause, it is writing the query result into a dynamic partition.

It would be great to clarify how hidden partitions are handled in that documentation, but it definitely needs to be covered here.

I think this should also specifically say that unpartitioned tables are completely overwritten by INSERT OVERWRITE.

```

For a partitioned iceberg table, when all the partition columns are set a value in `PARTITION` clause, it is inserting into a static partition, otherwise if partial partition columns (prefix part of all partition columns) are set a value in `PARTITION` clause, it is writing the query result into a dynamic partition.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This copies from the Flink docs, but doesn't cover what happens with hidden partitioning. I think this should give an example and be specific: hidden partitions are overwritten dynamically.

For an unpartitioned iceberg table, its data will be completely overwritten by `INSERT OVERWRITE`.

## Reading with DataStream

Iceberg does not support streaming or batch read now, but it's working in-progress.

## Writing with DataStream

Iceberg support writing to iceberg table from different DataStream input.


### Appending data.

we have supported writing `DataStream<RowData>` and `DataStream<Row` to the sink iceberg table natively.

```java
StreamExecutionEnvironment env = ...;

DataStream<RowData> input = ... ;
Configuration hadoopConf = new Configuration();
TableLoader tableLoader = TableLoader.fromHadooptable("hdfs://nn:8020/warehouse/path");

FlinkSink.forRowData(input)
.tableLoader(tableLoader)
.hadoopConf(hadoopConf)
.build();

env.execute("Test Iceberg DataStream");
```

The iceberg API also allows users to write generic `DataStream<T>` to iceberg table, more example could be found in this [unit test](https://github.com/apache/iceberg/blob/master/flink/src/test/java/org/apache/iceberg/flink/sink/TestFlinkIcebergSink.java).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that I agree that the API supports writing a generic DataStream<T> because the rows must be converted before they are actually written. Writing a generic T would require a way to control the data model. I think it may be worth adding support to write Iceberg generics, but probably not arbitrary T.


### Overwrite data

To overwrite the data in existing iceberg table dynamically, we could set the `overwrite` flag in FlinkSink builder.

```java
StreamExecutionEnvironment env = ...;

DataStream<RowData> input = ... ;
Configuration hadoopConf = new Configuration();
TableLoader tableLoader = TableLoader.fromHadooptable("hdfs://nn:8020/warehouse/path");

FlinkSink.forRowData(input)
.tableLoader(tableLoader)
.overwrite(true)
.hadoopConf(hadoopConf)
.build();

env.execute("Test Iceberg DataStream");
```

## Inspecting tables.

Iceberg does not support inspecting table in flink sql now, we need to use [iceberg's Java API](../api) to read iceberg's meta data to get those table information.

## Future improvement.

There are some features that we do not yet support in the current flink iceberg integration work:

* Don't support creating iceberg table with hidden partitioning. [Discussion](http://mail-archives.apache.org/mod_mbox/flink-dev/202008.mbox/%3cCABi+2jQCo3MsOa4+ywaxV5J-Z8TGKNZDX-pQLYB-dG+dVUMiMw@mail.gmail.com%3e) in flink mail list.
* Don't support creating iceberg table with computed column.
* Don't support creating iceberg table with watermark.
* Don't support adding columns, removing columns, renaming columns, changing columns. [FLINK-19062](https://issues.apache.org/jira/browse/FLINK-19062) is tracking this.
* Don't support flink read iceberg table in batch or streaming mode. [#1346](https://github.com/apache/iceberg/pull/1346) and [#1293](https://github.com/apache/iceberg/pull/1293) are tracking this.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that documentation is a good place for this. If you want, we could reference a tag in the issue tracker, or a milestone. But we don't want to need to update docs when these are done. Compatibility and features should be covered mainly in the table.

1 change: 1 addition & 0 deletions site/mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ nav:
- How to Release: how-to-release.md
- User docs:
- Getting Started: getting-started.md
- Flink: flink.md
- Spark: spark.md
- Spark Streaming: spark-structured-streaming.md
- Presto: presto.md
Expand Down