Conversation

@JingsongLi (Contributor) commented Aug 4, 2020:

Fixes #1275 (Flink: Implement Flink InputFormat and integrate it to FlinkCatalog)

This is a proof of concept (POC) for the Flink reader.

The Flink reader is essentially the same as the Spark one.

  • Flink's InputFormat is similar to the Hive (Hadoop) input format: its splits are generated in the job manager, so an Iceberg catalog loader is needed there to obtain the Iceberg Table object (see the sketch after this list).
  • Flink's TableFactory and TableSource are similar to Spark's TableProvider and SparkScanBuilder; Flink also provides projection push-down (ProjectableTableSource) and filter push-down (FilterableTableSource).
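
A minimal sketch of the catalog-loader idea from the first bullet, assuming the loader is created on the client, serialized into the InputFormat, and invoked on the job manager; the interface shape mirrors the CatalogLoader snippet discussed later in this thread, but is illustrative only:

import java.io.Serializable;

import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.catalog.Catalog;

// Shipped (serialized) with the Flink InputFormat so that the job manager,
// which generates the splits, can load the Catalog and from it the Table.
public interface CatalogLoader extends Serializable {
  Catalog loadCatalog(Configuration hadoopConf);
}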

Work can be divided into:

@JingsongLi force-pushed the flink_reader branch 2 times, most recently from 76426e6 to 3e43afd, on August 4, 2020 10:41

@Override
public Map<String, String> requiredContext() {
  throw new UnsupportedOperationException("Iceberg Table Factory can not loaded from Java SPI");
Contributor:

Nit: Probable typo. It should probably be: can not *be* loaded from Java SPI.

I also think it might help the average reader if you defined the acronym SPI at least once.

@openinx (Member) left a comment:

I had a rough look and left a few comments; will take a closer review later.

Comment on lines 81 to 113
switch (fileFormat) {
  case AVRO:
    appender = Avro.write(Files.localOutput(file))
        .schema(table.schema())
        .createWriterFunc(DataWriter::create)
        .named(fileFormat.name())
        .build();
    break;

  case PARQUET:
    appender = Parquet.write(Files.localOutput(file))
        .schema(table.schema())
        .createWriterFunc(GenericParquetWriter::buildWriter)
        .named(fileFormat.name())
        .build();
    break;

  case ORC:
    appender = ORC.write(Files.localOutput(file))
        .schema(table.schema())
        .createWriterFunc(GenericOrcWriter::buildWriter)
        .build();
    break;

  default:
    throw new UnsupportedOperationException("Cannot write format: " + fileFormat);
}

try {
  appender.addAll(records);
} finally {
  appender.close();
}
Member:

It seems we could abstract this appender building into a separate method; I saw lots of unit tests depend on this common logic. It could be a separate issue to do this.

Contributor Author:

I think we can have a GenericAppenderFactory
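
A minimal sketch of what such a GenericAppenderFactory could look like, simply wrapping the builder calls from the switch above; class and method names here are illustrative, and the real factory was introduced separately in #1340:

import java.io.IOException;

import org.apache.iceberg.FileFormat;
import org.apache.iceberg.Schema;
import org.apache.iceberg.avro.Avro;
import org.apache.iceberg.data.Record;
import org.apache.iceberg.data.avro.DataWriter;
import org.apache.iceberg.data.orc.GenericOrcWriter;
import org.apache.iceberg.data.parquet.GenericParquetWriter;
import org.apache.iceberg.io.FileAppender;
import org.apache.iceberg.io.OutputFile;
import org.apache.iceberg.orc.ORC;
import org.apache.iceberg.parquet.Parquet;

class GenericAppenderFactory {
  private final Schema schema;

  GenericAppenderFactory(Schema schema) {
    this.schema = schema;
  }

  // Same logic as the switch in the test above, shared by callers instead of copied.
  FileAppender<Record> newAppender(OutputFile outputFile, FileFormat fileFormat) throws IOException {
    switch (fileFormat) {
      case AVRO:
        return Avro.write(outputFile)
            .schema(schema)
            .createWriterFunc(DataWriter::create)
            .named(fileFormat.name())
            .build();
      case PARQUET:
        return Parquet.write(outputFile)
            .schema(schema)
            .createWriterFunc(GenericParquetWriter::buildWriter)
            .named(fileFormat.name())
            .build();
      case ORC:
        return ORC.write(outputFile)
            .schema(schema)
            .createWriterFunc(GenericOrcWriter::buildWriter)
            .build();
      default:
        throw new UnsupportedOperationException("Cannot write format: " + fileFormat);
    }
  }
}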

Contributor Author:

Created #1340

* Flink needs to get {@link Table} objects in the cluster (for example, to get splits), not just on the client side.
* So we need an Iceberg catalog loader to get the {@link Catalog} and get the {@link Table} object.
*/
public interface CatalogLoader extends Serializable {
Member:

It seems we could move this class to the hive module?

Contributor Author:

Maybe not in hive, but somewhere in the Iceberg modules; I don't know whether any other modules want to use CatalogLoader.
We can discuss it in #1332.
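
For illustration only, a hypothetical Hive-backed loader (the class name and constructor usage are assumptions, not code from this PR); the interface itself stays module-agnostic even if concrete loaders end up living next to their catalogs:

import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.hive.HiveCatalog;

// Hypothetical: only the metastore URI needs to be serialized; the Catalog is rebuilt on the job manager.
class HiveCatalogLoader implements CatalogLoader {
  private final String metastoreUri;

  HiveCatalogLoader(String metastoreUri) {
    this.metastoreUri = metastoreUri;
  }

  @Override
  public Catalog loadCatalog(Configuration hadoopConf) {
    hadoopConf.set("hive.metastore.uris", metastoreUri);
    return new HiveCatalog(hadoopConf); // assumes the iceberg-hive HiveCatalog(Configuration) constructor
  }
}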

this.originalCatalog = icebergCatalog;
this.icebergCatalog = cacheEnabled ? CachingCatalog.wrap(icebergCatalog) : icebergCatalog;
this.originalCatalog = catalogLoader.loadCatalog(
HadoopUtils.getHadoopConfiguration(GlobalConfiguration.loadConfiguration()));
Member:

How about adding a Configuration argument to this constructor? IMO FlinkCatalog shouldn't handle the configuration-initialization logic; moving the configuration loading to FlinkCatalogFactory sounds more reasonable. Besides, unit tests for FlinkCatalog will be easier because they won't depend on how Flink loads its configuration.

Contributor Author:

We can discuss it in #1332
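
A sketch of the suggested shape, with illustrative parameter and field names rather than the PR's actual signature; the point is only that the Configuration is created by FlinkCatalogFactory (or a test) and passed in, so FlinkCatalog never calls GlobalConfiguration.loadConfiguration() itself:

private final Catalog originalCatalog;
private final Catalog icebergCatalog;

FlinkCatalog(CatalogLoader catalogLoader, Configuration hadoopConf, boolean cacheEnabled) {
  // The caller supplies the Hadoop Configuration instead of this class loading it globally.
  Catalog loaded = catalogLoader.loadCatalog(hadoopConf);
  this.originalCatalog = loaded;
  this.icebergCatalog = cacheEnabled ? CachingCatalog.wrap(loaded) : loaded;
}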

}

public FlinkInputFormat build() {
return new FlinkInputFormat(
Member:

We'd better have a Preconditions#check for those arguments, in case an NPE is thrown further down the call stack.
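
A minimal sketch of the suggested check, using Guava's Preconditions; the field names (tableLoader, projectedSchema) are illustrative, not the builder's actual ones:

import com.google.common.base.Preconditions;

// At the top of FlinkInputFormat.Builder#build(), before constructing the FlinkInputFormat:
Preconditions.checkNotNull(tableLoader, "TableLoader should not be null");
Preconditions.checkNotNull(projectedSchema, "Projected schema should not be null");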


@Override
protected List<Row> executeWithSnapshotId(Table table, long snapshotId) throws IOException {
  return run(builder.table(table).options(ScanOptions.builder().snapshotId(snapshotId).build()).build());
Member:

Should we also add unit tests for the following cases:

  1. scan with both startSnapshotId and endSnapshotId;
  2. scan with only asOfTimestamp;
  3. scan with only startSnapshotId.
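
As a sketch, the first case could look roughly like this, sitting next to executeWithSnapshotId in the same test class; it assumes ScanOptions exposes startSnapshotId/endSnapshotId builder methods analogous to snapshotId above, and expectedRowsBetween is a hypothetical helper producing the expected rows:

@Test
public void testScanBetweenSnapshots() throws IOException {
  List<Row> actual = run(builder.table(table)
      .options(ScanOptions.builder()
          .startSnapshotId(firstSnapshotId)   // hypothetical field holding an earlier snapshot id
          .endSnapshotId(secondSnapshotId)    // hypothetical field holding a later snapshot id
          .build())
      .build());
  Assert.assertEquals(expectedRowsBetween(firstSnapshotId, secondSnapshotId), actual);
}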


/**
* Prune columns from a {@link Schema} using projected fields.
* TODO Why does Spark care about filters?
Member:

As the javadoc says:

The filters list of {@link Expression} is used to ensure that columns referenced by filters are projected.

In my understanding, when doing filter push-down we need to keep the columns involved in a pushed-down filter, even if they are not in the projection column list.

We do not implement the FilterableTableSource interface yet, so I think we don't need to consider it for now.

@JingsongLi (Contributor Author), Aug 14, 2020:

The columns involved in a pushed-down filter must be in the projection column list because, just like Spark:
Spark doesn't support residuals per task, so return all filters to get Spark to handle record-level filtering.
The Flink source also doesn't support residuals per task, so this filtering is left to the Flink planner.

CloseableIterable<RowData> iterable = newIterable(task, idToConstant);
ProjectionRowData projectionRow = new ProjectionRowData();
return (finalProjection == null ? iterable : CloseableIterable.transform(
    iterable, rowData -> (RowData) projectionRow.replace(rowData, finalProjection))).iterator();
Member:

Q: Why do we need to transform the CloseableIterable<RowData> into a projected RowData iterable again? I mean, we've created an Avro iterable with the projected read schema; that should guarantee the iterable only contains the projected columns?

Contributor Author:

You can take a look at FlinkSchemaUtil.pruneWithoutReordering: it just keeps the order from the original schema, but the real Flink projection may change the order. For example, if the table schema is (id, data, ts) and the query projects (ts, id), pruning yields (id, ts) while Flink expects (ts, id), so the extra transform reorders the fields.

@Override
public FlinkInputSplit[] createInputSplits(int minNumSplits) throws IOException {
  // Invoked by Job manager, so it is OK to load table from catalog.
  tableLoader.open(HadoopUtils.getHadoopConfiguration(GlobalConfiguration.loadConfiguration()));
Member:

Q: Are we binding the Iceberg table's Hadoop configuration to the Flink job manager's Hadoop configuration? Is it possible to access an Iceberg table in a Hadoop cluster different from Flink's Hadoop cluster? A more reasonable way seems to be passing a customized SerializableConfiguration from the client, so the job manager could access any Hadoop cluster.

Contributor Author:

I'm OK with passing a configuration to FlinkInputFormat, although there will be some serialization cost.
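
A minimal sketch of such a wrapper, written by hand for illustration (the PR may use an existing utility instead); it relies on Hadoop's Configuration being Writable, so the client-side configuration can be shipped with the InputFormat instead of re-loading GlobalConfiguration on the job manager:

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

import org.apache.hadoop.conf.Configuration;

class SerializableConfiguration implements Serializable {

  private transient Configuration conf;

  SerializableConfiguration(Configuration conf) {
    this.conf = conf;
  }

  Configuration get() {
    return conf;
  }

  private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    conf.write(out); // Hadoop Configuration implements Writable
  }

  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    this.conf = new Configuration(false);
    conf.readFields(in);
  }
}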

@JingsongLi (Contributor Author):

Hi @openinx, I have created #1346 for the InputFormat and addressed your comments.

@openinx (Member) commented Aug 17, 2020:

Thanks for the update, I will take a look today.

}

@Override
public Optional<TableFactory> getTableFactory() {
Member:

Q: The javadoc says it's deprecated now. When is the time for us to use getFactory in the future?

	 * @deprecated Use {@link #getFactory()} for the new factory stack. The new factory stack uses the
	 *             new table sources and sinks defined in FLIP-95 and a slightly different discovery mechanism.
	 */
	@Deprecated
	default Optional<TableFactory> getTableFactory() {

Contributor Author:

Until the new API is really ready...
In 1.11 and master, the FLIP-95 interfaces still lack many things. I am not sure about Flink 1.12; maybe 1.13 is the time.

* Flink Iceberg table factory to create table source and sink.
* Only works for catalog, can not be loaded from Java SPI (Service Provider Interface).
*/
class FlinkTableFactory implements TableSourceFactory<RowData> {
Member:

Contributor Author:

Yes, should be in this class too.

private final CatalogLoader catalogLoader;
private final Configuration hadoopConf;
private final TableSchema schema;
private final Map<String, String> options;
Member:

How about renaming it to scanOptions? I thought it was an options map of the Iceberg table.

Contributor Author:

It also includes table hints.

public TableSource<RowData> createTableSource(Context context) {
  ObjectIdentifier identifier = context.getObjectIdentifier();
  ObjectPath objectPath = new ObjectPath(identifier.getDatabaseName(), identifier.getObjectName());
  TableIdentifier icebergIdentifier = catalog.toIdentifier(objectPath);
Member:

Nit: the toIdentifier could be a static method in the catalog.

Contributor Author:

No, it cannot, because it needs the baseNamespace information.

Member:

OK, I did not look into toNamespace; sounds good.

try {
  Table table = catalog.getIcebergTable(objectPath);
  // Excludes computed columns
  TableSchema icebergSchema = TableSchemaUtils.getPhysicalSchema(context.getTable().getSchema());
Member:

Q: What if someone queries the Flink table projecting a computed column (which does not exist in the Iceberg table)? Does that work fine in the current version, or will it throw an exception?

Contributor Author:

It does not work.
Even if the Iceberg table supports computed columns in the future, these computed columns will be generated by Flink instead of the Iceberg source. (So this should always be getPhysicalSchema.)

@JingsongLi (Contributor Author):

Thanks all for your help; all sub-PRs have been completed. I'll close this PR.

@JingsongLi closed this on Oct 9, 2020
@JingsongLi deleted the flink_reader branch on November 5, 2020 09:42