
Conversation

@zhangjun0x01
Contributor

upgrade to flink 1.13
#2558

.stringType()
.defaultValue("true")
.withDescription("Whether to cache the catalog in FlinkCatalog.");
}


missing option default-database

Contributor Author


These are the options for creating a catalog with SQL; I don't think we need to add default-database: http://iceberg.apache.org/flink/#creating-catalogs-and-using-catalogs
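For context, a catalog created through Flink SQL (per the linked docs) looks roughly like this; the catalog name and connection values below are placeholders:

```sql
CREATE CATALOG hive_catalog WITH (
  'type'='iceberg',
  'catalog-type'='hive',
  'uri'='thrift://localhost:9083',
  'warehouse'='hdfs://nn:8020/warehouse/path'
);
```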

@nastra
Contributor

nastra commented Jun 8, 2021

There are a bunch of docs that still mention an older version of Flink. Should we update those as well as part of this PR?

@zhangjun0x01
Contributor Author

> There are a bunch of docs that still mention an older version of Flink. Should we update those as well as part of this PR?

At present, the community has not reached agreement on whether to upgrade to Flink 1.13. If you need Flink 1.13, you can apply this PR to your branch first.

@gaborgsomogyi

I've tested the changes and they work fine with 1.13.0.

@mbalassi

Are these changes backwards compatible with Flink 1.12 too or does it strictly require 1.13?

"Version number to describe the property version. This property can be used for backwards " +
"compatibility in case the property format changes. The current property version is `1`. (Optional)");

public static final ConfigOption<String> URI =
Contributor


can we reuse some constants from CatalogProperties in iceberg-core module? like

ConfigOptions.key(CatalogProperties.URI)
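A hedged sketch of what that suggestion could look like (assuming Flink's `ConfigOptions` builder API and the `CatalogProperties.URI` constant from iceberg-core; the description text is illustrative, not taken from the PR):

```java
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;
import org.apache.iceberg.CatalogProperties;

// Reuse the key string defined once in iceberg-core so the Flink option
// and the core catalog property can never drift apart.
public static final ConfigOption<String> URI =
    ConfigOptions.key(CatalogProperties.URI)
        .stringType()
        .noDefaultValue()
        .withDescription("URI of the underlying catalog, e.g. the Hive metastore thrift URI. (Optional)");
```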

Contributor

@stevenzwu stevenzwu left a comment


LGTM. I am also +1 for upgrading to 1.13

Contributor

@kbendick kbendick left a comment


Thanks for working on this. Some people are interested in using Flink + Iceberg in environments where Hadoop is not easily configured, which we'd hoped the access to the class loader provided by the new CatalogFactory.Context might make easier.

I don't want to block this PR, so I've left some comments. My only outstanding concern is the change from using * in supportedOptions to the defined list in optionalOptions.

Left a few comments with some configs that I know of.

FactoryUtil.createCatalogFactoryHelper(this, context);
helper.validate();

return createCatalog(context.getName(), context.getOptions(), clusterHadoopConf());
Contributor


There is an effort to decouple the need for Hadoop from the FlinkCatalogFactory for environments where Hadoop is not easily configurable and for catalog implementations that don't actually need it (basically anything that uses S3FileIO - though we'll need to update GlueCatalog and eventually DynamoDbCatalog as well).

Even after updating GlueCatalog to remove the Hadoop Configurable interface, this call to clusterHadoopConf() still makes it so that Hadoop is needed on the class path.

I'm not proposing that we change that in this PR (as this PR has been open for a while, and it's a separate concern), but I wanted to draw attention to the issue as I'm a bit less informed on the Flink side compared to many of you: #3044

Comment on lines +115 to +125
public Set<ConfigOption<?>> optionalOptions() {
final Set<ConfigOption<?>> options = new HashSet<>();
options.add(FlinkCatalogFactoryOptions.PROPERTY_VERSION);
options.add(FlinkCatalogFactoryOptions.URI);
options.add(FlinkCatalogFactoryOptions.WAREHOUSE);
options.add(FlinkCatalogFactoryOptions.CLIENTS);
options.add(FlinkCatalogFactoryOptions.BASE_NAMESPACE);
options.add(FlinkCatalogFactoryOptions.HIVE_CONF_DIR);
options.add(FlinkCatalogFactoryOptions.CACHE_ENABLED);
return options;
}
Contributor


Do we need to add any of the known additional catalog properties that are used with some of the AWS catalogs?

Also, for catalogs that offer additional options (including custom catalogs), how will users set those? Presently, supportedOptions has a *; keeping the wildcard would be the easiest path forward, but that may no longer be possible.

If not, the ones I can think of are presently coming from the AWS catalogs (though we should also look into the JDBC catalog as well).

Additional Catalog Properties To Consider:
General Catalog Properties

  • catalog-impl
  • io-impl

GlueCatalog Specific Options

  • lock-impl
  • lock.table

DynamoDbCatalog Specific Options

  • dynamodb.table-name

AWS Client Configuration

  • client.factory

There are further ones which I'll try to link to in a bit (such as S3FileIO options for AWS authentication and encryption parameters), but those are the first ones I encountered in the docs.
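If the * wildcard is no longer possible, each of the keys above would presumably need to be declared. For illustration, a Glue catalog defined in Flink SQL passes them like this (class names are taken from the Iceberg AWS docs of this era and may differ; bucket and table names are placeholders):

```sql
CREATE CATALOG glue_catalog WITH (
  'type'='iceberg',
  'catalog-impl'='org.apache.iceberg.aws.glue.GlueCatalog',
  'io-impl'='org.apache.iceberg.aws.s3.S3FileIO',
  'warehouse'='s3://my-bucket/my/warehouse/path',
  'lock-impl'='org.apache.iceberg.aws.glue.DynamoLockManager',
  'lock.table'='myGlueLockTable'
);
```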

Contributor


Also, will users be able to specify their own catalog implementation (with its own options) still?

@kbendick
Contributor

@tiborkiss mentioned in the apache-iceberg Slack that this PR would also be needed to fully support Flink SQL (and not just streaming apps): #2666

@openinx
Member

openinx commented Sep 2, 2021

@zhangjun0x01, I think it's time to upgrade the version to Flink 1.13 now. Would you like to fix the conflicts? I'd like to check whether it's OK to get this merged.

@Flyangz
Contributor

Flyangz commented Sep 8, 2021

@zhangjun0x01, are you still working on this? If not, or if you don't have time to fix it, I'd like to pick it up.

@openinx
Member

openinx commented Sep 26, 2021

Closed via #3116

@openinx openinx closed this Sep 26, 2021

9 participants