Conversation

@Flyangz (Contributor) commented Sep 14, 2021

This work is for #2558 and builds on #2629, so thanks to @zhangjun0x01 for that contribution.
Below are some explanations:

  1. The options are not checked in FlinkCatalogFactory.createCatalog(); this keeps compatibility with the old implementation, and it means customized options can be used as before without pre-declaring them (see the sketch after this list).
  2. The implementation is not compatible with Flink 1.12, because the CatalogFactory interface does not extend the Factory interface in 1.12.
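
A minimal, self-contained sketch of point 1 (the class and method names here are illustrative stand-ins, not the PR's actual code): the factory forwards the user's options map instead of validating it against a fixed set of declared options, so custom keys survive.

    import java.util.HashMap;
    import java.util.Map;

    public class UncheckedOptionsSketch {

      // Deliberately no "unknown option" validation: every key from
      // CREATE CATALOG ... WITH (...) is forwarded to the catalog untouched.
      static Map<String, String> createCatalogOptions(Map<String, String> userOptions) {
        return new HashMap<>(userOptions);
      }

      public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("type", "iceberg");
        options.put("custom-key", "custom-value"); // never pre-declared, still accepted
        System.out.println(createCatalogOptions(options));
      }
    }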

@kbendick (Contributor) commented Sep 14, 2021

> 1. The options are not checked in FlinkCatalogFactory.createCatalog(); this keeps compatibility with the old implementation, and it means customized options can be used as before without pre-declaring them.

+1 to this. We have a number of customized options for things like AWS S3 / IAM, etc., that go on catalogs, and we need to be able to support them.

> 1. The implementation is not compatible with Flink 1.12, because the CatalogFactory interface does not extend the Factory interface in 1.12.

Does this mean we're dropping support entirely for Flink 1.12 in the next release (and effectively in the master branch)? This seems potentially premature. I thought Flink 1.13 added the Factory interface but kept the old TableFactory interface (the one extended in 1.12) for backwards compatibility.

Is it possible to do the same (continue to implement both interfaces, or have two separate classes) to keep support for Flink 1.12, @Flyangz?

@wg1026688210 (Contributor) commented

We do not have a plan to upgrade Flink to 1.13 in the near term. Is there a way to support both the 1.12 and 1.13 versions?

@kbendick (Contributor) commented Sep 15, 2021

> We do not have a plan to upgrade Flink to 1.13 in the near term. Is there a way to support both the 1.12 and 1.13 versions?

I believe that @zhangjun0x01's original PR supported both Flink 1.12 and Flink 1.13. Like I mentioned, if FlinkCatalogFactory implements both APIs, will that not suffice to support both 1.12 and 1.13?

Dropping support for Flink 1.12 is a big decision in my opinion, and should be brought up on the mailing list or at the community sync-up.

Would my idea of continuing to implement TableFactory, in addition to adding the newer Factory interface functions (just like Flink's own CatalogFactory does), not work? I thought that was why Flink's CatalogFactory implemented both the old and new interfaces: to mark the old interface as deprecated while still supporting both versions.

@Flyangz (Contributor, Author) commented Sep 17, 2021

The committed code implements both TableFactory and Factory. Actually, Flink 1.13 will continue to use the TableFactory API, so implementing only TableFactory is the approach with the fewest changes and the lowest risk for upgrading to 1.13. Which way do you prefer: implementing both, or just TableFactory? (A sketch of the "implement both" shape follows below.)
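
For illustration, a self-contained sketch of the "implement both" shape, using hypothetical stand-in interfaces (LegacyFactory and NewFactory stand in for Flink's TableFactory-style and Factory-style contracts; the real interfaces have more methods):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;

    // Stand-in for the legacy, TableFactory-style discovery contract (1.12).
    interface LegacyFactory {
      Map<String, String> requiredContext();

      List<String> supportedProperties();
    }

    // Stand-in for the new, Factory-style contract (1.13+).
    interface NewFactory {
      String factoryIdentifier();
    }

    // One class serving both discovery paths: Flink 1.12 only sees the
    // legacy side, while Flink 1.13 could use either.
    class DualCatalogFactorySketch implements LegacyFactory, NewFactory {

      @Override
      public Map<String, String> requiredContext() {
        return Collections.singletonMap("type", "iceberg");
      }

      @Override
      public List<String> supportedProperties() {
        return Collections.singletonList("*"); // accept any property, unvalidated
      }

      @Override
      public String factoryIdentifier() {
        return "iceberg";
      }
    }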

@kbendick (Contributor) commented Sep 17, 2021

> The committed code implements both TableFactory and Factory. Actually, Flink 1.13 will continue to use the TableFactory API, so implementing only TableFactory is the approach with the fewest changes and the lowest risk for upgrading to 1.13. Which way do you prefer: implementing both, or just TableFactory?

Oh, that's rather unfortunate. I had hoped that Flink 1.13 would use the new interface when present, while allowing the old interface to also be implemented for backwards compatibility.

Perhaps somebody more involved with the Flink community than myself can comment on whether a flag exists, or could be added, to tell Flink to prefer the new interface (when present).

I would suggest implementing both interfaces for now (and grouping the functions by which interface they come from), so as to keep support for Flink 1.12.

While this won't give us the benefit of the new API, we can upgrade to Flink 1.13.2 and consider the importance of that later on. I would still implement the new API functions, though (since it's already been done), and then we can track that in a follow-up issue. We might indeed decide that dropping support for 1.12 is worth it, but we can't make that decision on our own.

EDIT: Perhaps it can be broken up into two different classes, and then when we build the jar, only one META-INF file is placed on the classpath depending on the Flink version. Seems really messy, but possibly we can consider this down the road.

Any thoughts on this @stevenzwu @openinx?

@@ -1,7 +1,7 @@
org.slf4j:* = 1.7.25
org.apache.avro:avro = 1.10.1
org.apache.calcite:* = 1.10.0
Contributor:

Do we know if our calcite version needs to be upgraded?

Member:

The Apache Flink runtime won't depend on this Calcite jar; it was introduced for Hive before. So I think we don't need to change the Calcite version for the Flink upgrade.

openinx changed the title from "upgrade to flink 1.13" to "Flink: Upgrade to flink 1.13.2" on Sep 22, 2021
Comment on lines 162 to 165
public static void assertThrowsRootCause(String message,
Class<? extends Exception> expected,
String containedInMessage,
Runnable runnable) {
Member:

Nit: this isn't the correct Iceberg indentation. I think you will need to configure your IDE by following this: https://iceberg.apache.org/community/#setting-up-ide-and-code-style

Contributor (Author):

Fixed it.

}

@Override
public Map<String, String> requiredContext() {
Member:

Since requiredContext and supportedProperties have been marked as deprecated, I would suggest implementing these via requiredOptions & optionalOptions instead.

Contributor (Author):

Since the Context class is missing in 1.12, I think we can only implement these deprecated methods (a sketch of what that looks like follows below).
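
For illustration, a minimal, self-contained sketch of implementing only the deprecated discovery methods (the class name and the exact context keys are assumptions, not copied from the PR):

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class LegacyOnlyCatalogFactorySketch {

      // Matched against CREATE CATALOG ... WITH ('type'='iceberg').
      public Map<String, String> requiredContext() {
        Map<String, String> context = new HashMap<>();
        context.put("type", "iceberg");
        context.put("property-version", "1"); // assumed key, common in legacy factories
        return context;
      }

      // "*" tells the legacy discovery to accept any property, which is what
      // keeps arbitrary user-defined catalog options working (point 1 above).
      public List<String> supportedProperties() {
        return Collections.singletonList("*");
      }
    }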


org.apache.iceberg.flink.FlinkDynamicTableFactory
org.apache.iceberg.flink.FlinkDynamicTableFactory
org.apache.iceberg.flink.FlinkCatalogFactory
Member:

Should we remove this file (org.apache.flink.table.factories.TableFactory) if we plan to upgrade the Flink version to 1.13.2 directly?

Contributor (Author):

If we only support 1.13, then it is OK to remove it. But if we want to be compatible with 1.12, FlinkCatalogFactory needs to be registered as a TableFactory, because the Context class is missing in 1.12, and without implementing createCatalog(Context context), an exception will be thrown when we list FlinkCatalogFactory under org.apache.flink.table.factories.Factory. (The registration layout is sketched below.)
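
Roughly, the ServiceLoader registration under discussion looks like this (the SPI file names follow Flink's convention; which class belongs in which file is inferred from the diff above, so treat the mapping as an assumption):

    # META-INF/services/org.apache.flink.table.factories.TableFactory
    org.apache.iceberg.flink.FlinkCatalogFactory

    # META-INF/services/org.apache.flink.table.factories.Factory
    org.apache.iceberg.flink.FlinkDynamicTableFactory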

@openinx (Member) commented Sep 22, 2021

@kbendick @Flyangz According to the last discussion on the mailing list, I think we'd better keep the same strategy that Apache Spark plans to follow; users from the Apache Iceberg community have also asked to support both Flink 1.12 & Flink 1.13. Since the difference between Flink 1.12 & 1.13 is not so large, I think it's possible to support both of them in a single flink module (I mean we don't need to separate them into two different modules).

@Flyangz (Contributor, Author) commented Sep 25, 2021

In order to be compatible with Flink 1.12 and 1.13, this commit only implements the TableFactory interface. By 'compatible', I mean the code can be compiled against both 1.12 and 1.13 and passes all unit tests in the iceberg-flink module.
I have tried to make Iceberg use a different META-INF file based on the Flink version, but this still does not work, because of the createCatalog(Context context) method in Factory. The Context class is new in 1.13. Is there any feasible way to compile Iceberg against 1.12 without this class?
@kbendick @openinx

@openinx (Member) commented Sep 26, 2021

> In order to be compatible with Flink 1.12 and 1.13, this commit only implements the TableFactory interface. By 'compatible', I mean the code can be compiled against both 1.12 and 1.13 and passes all unit tests in the iceberg-flink module.
> I have tried to make Iceberg use a different META-INF file based on the Flink version, but this still does not work, because of the createCatalog(Context context) method in Factory. The Context class is new in 1.13. Is there any feasible way to compile Iceberg against 1.12 without this class?
> @kbendick @openinx

Does the deprecated createCatalog work for both Flink 1.12 & Flink 1.13?

@Flyangz (Contributor, Author) commented Sep 26, 2021

> In order to be compatible with Flink 1.12 and 1.13, this commit only implements the TableFactory interface. By 'compatible', I mean the code can be compiled against both 1.12 and 1.13 and passes all unit tests in the iceberg-flink module.
> I have tried to make Iceberg use a different META-INF file based on the Flink version, but this still does not work, because of the createCatalog(Context context) method in Factory. The Context class is new in 1.13. Is there any feasible way to compile Iceberg against 1.12 without this class?
> @kbendick @openinx

> Does the deprecated createCatalog work for both Flink 1.12 & Flink 1.13?

I think so; at least it passes all the unit tests in the iceberg-flink module when compiled with either 1.12 or 1.13.
In 1.13, as I mentioned above, Flink will use the deprecated createCatalog first for compatibility (see the sketch below). In 1.12, this method is not deprecated and is already used in the current Iceberg master code.
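
A hedged sketch of that selection behavior, as an illustration of the idea rather than Flink's actual discovery code (the interface names are stand-ins):

    import java.util.Map;

    class CatalogCreationDispatchSketch {

      // Stand-in for the deprecated createCatalog(name, properties) path.
      interface LegacyPath {
        Object createCatalog(String name, Map<String, String> properties);
      }

      // Stand-in for the 1.13-only createCatalog(Context) path.
      interface NewPath {
        Object createCatalog(Object context);
      }

      // Prefer the deprecated path whenever it is implemented, which is why a
      // TableFactory-only implementation runs on 1.12 and 1.13 alike.
      static Object create(Object factory, String name, Map<String, String> props) {
        if (factory instanceof LegacyPath) {
          return ((LegacyPath) factory).createCatalog(name, props);
        }
        return ((NewPath) factory).createCatalog(null /* Context stand-in */);
      }
    }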

@openinx (Member) commented Sep 26, 2021

Okay, so are there any other problems with guaranteeing compatibility for both Flink 1.12 & Flink 1.13 in your own context? I think we will need to make the whole unit test suite work for both Flink 1.12 & Flink 1.13; the work should be similar to this PR: 111fe81. I think it's good to make it into a separate PR!

@Flyangz (Contributor, Author) commented Sep 26, 2021

> Okay, so are there any other problems with guaranteeing compatibility for both Flink 1.12 & Flink 1.13 in your own context? I think we will need to make the whole unit test suite work for both Flink 1.12 & Flink 1.13; the work should be similar to this PR: 111fe81. I think it's good to make it into a separate PR!

  1. No, no other problems.
  2. Yes; integration testing is a more reliable way to check compatibility.

@openinx (Member) left a review comment:

LGTM! Thanks @Flyangz for contributing, and thanks @kbendick for reviewing!

openinx merged commit bae5ce9 into apache:master on Sep 26, 2021
@openinx (Member) commented Sep 26, 2021

@Flyangz Please file a new issue to address the unit test work, and attach it to the [Priority 1] Flink: Upgrade to 1.13.2 project.

@Flyangz (Contributor, Author) commented Sep 27, 2021

> @Flyangz Please file a new issue to address the unit test work, and attach it to the [Priority 1] Flink: Upgrade to 1.13.2 project.

Sorry for the late reply. Do you mean issue #3183, which you have already created?

@openinx (Member) commented Sep 28, 2021

@Flyangz Yes, that's the correct issue. You can just assign it to yourself.
