Skip to content

Core: Add view support for JDBC catalog#9487

Merged
danielcweeks merged 1 commit intoapache:mainfrom
jbonofre:8697
Feb 17, 2024
Merged

Core: Add view support for JDBC catalog#9487
danielcweeks merged 1 commit intoapache:mainfrom
jbonofre:8697

Conversation

@jbonofre
Copy link
Copy Markdown
Member

@jbonofre jbonofre commented Jan 16, 2024

This PR adds view support to the JDBC catalog:

  • JdbcCatalog extends BaseMetastoreViewCatalog
  • Introduce JdbcViewOperations
  • Dedicated JDBC table to store views and associated namespaces
  • JdbcUtil refactoring to deal with SQL statements common to table and view
  • Introduce TestJdbcViewCatalog by extending ViewCatalogTests

Close #8697

@github-actions github-actions bot added the core label Jan 16, 2024
@jbonofre
Copy link
Copy Markdown
Member Author

Few notes about this PR:

  1. In JdbcUtil, I used methods to generate the SQL statement for table and view as it's very similar. I used a boolean to define if the generated SQL statement targets a table or a view. For clarity, I can define an enum instead of boolean.
  2. I verified the JDBC catalog for both table and view using ViewCatalogTests and CatalogTests.

@nastra @rdblue do you mind to take a look ? Thanks !

@jbonofre
Copy link
Copy Markdown
Member Author

@ajantha-bhat @nk1506 if you guys want to take a look :) Thanks !

@ajantha-bhat ajantha-bhat self-requested a review January 17, 2024 02:01
Copy link
Copy Markdown
Member

@ajantha-bhat ajantha-bhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. Thanks for the contributions.

Some minor comments about public interfaces. Almost look good to me.

@ajantha-bhat
Copy link
Copy Markdown
Member

Tagging @amogh-jahagirdar, as I feel this PR can help in reducing the changes at Trino side for REST catalog view support (trinodb/trino#19818 (comment))

@jbonofre jbonofre force-pushed the 8697 branch 2 times, most recently from 989b15a to ac2d4c1 Compare January 18, 2024 13:38
@jbonofre
Copy link
Copy Markdown
Member Author

@ajantha-bhat I should have addressed your comments 😄

Copy link
Copy Markdown
Member

@ajantha-bhat ajantha-bhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much cleaner now. Almost there.

Copy link
Copy Markdown
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall this looks almost ready, I've added a few comments but nothing major

@jbonofre
Copy link
Copy Markdown
Member Author

jbonofre commented Feb 9, 2024

@nastra thanks ! I addressed your comments. The PR is ready for a new round 😄 Thanks again !

Copy link
Copy Markdown
Contributor

@danielcweeks danielcweeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking much closer. The main issue I see is that a lot of the differentiation between the original and new schemas are referenced as "old" and "new", which is very contextual to this particular change.

I think we need to say the original schmea and related queries are V0 and this set of changes is V1 to enable future changes.

This would mean that we don't rely on checks like:
if (isNewSchema) ? newBehavior : oldBehavior
which will be difficult to maintain going forward.

Ideally, we would have somethimg more like:

switch(schemaVersion) {
  case V0:  ...
  case V1: ...
}

@jbonofre
Copy link
Copy Markdown
Member Author

@danielcweeks Thanks ! Yeah, I assumed we won't have new schema update, but you are right, it's possible to have new updates in the future. So let me update the PR with schema versioning. Thanks !

@jbonofre
Copy link
Copy Markdown
Member Author

@nastra @ajantha-bhat @danielcweeks @rdblue I updated the PR by introducing SchemaVersion, but I keep the same jdbc.add-view-support property for users.
Some notes:

  1. I didn't rename all statements to use V1_, only on the statements where we have a difference between V0 and V1.
  2. As we only have V0 and V1 for now, I still use if (not switch). We will eventually change later when introducing a new schema version.
  3. After 1.5.0 release, I plan to refactore the JdbcCatalog using JdbcAdapter (similar to what I did in Apache ActiveMQ for the JDBC store). It will allow to support different backends and having statements specific to backend when needed.

Copy link
Copy Markdown
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall this LGTM and thanks @jbonofre for working on this. I'm ok using using if with V0/ V1 for now, but obviously that won't scale in the long run, so we would need some kind of wrapper class that contains all the SQLs for a particular schema version. We could then instantiate this schema class for the given schema version and make the code easier to read and maintain. However, I don't think that this is a 1.5.0 blocker, so it should be ok to ship this with the next release

@jbonofre
Copy link
Copy Markdown
Member Author

@nastra yes, actually, I plan to improve V0 / V1 and more support in a separate PR introducing JdbcAdapter allowing us to adapt to different backend (like I did here https://github.com/apache/activemq/tree/main/activemq-jdbc-store/src/main/java/org/apache/activemq/store/jdbc/adapter).
I propose to move forward with this PR for Iceberg 1.5.0, then I will do a refactoring of the JdbcCatalog for 1.6.0.

Thanks !

Copy link
Copy Markdown
Contributor

@danielcweeks danielcweeks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jbonofre!

@danielcweeks danielcweeks merged commit 0316be3 into apache:main Feb 17, 2024
@ajantha-bhat
Copy link
Copy Markdown
Member

Thanks for working on this @jbonofre.

Thanks for reviewing it @danielcweeks, @nastra, @rdblue, @amogh-jahagirdar.

I think we also need to keep the PR ready for Trino and pyiceberg.
For 1.5.0 version bump, Trino might need related changes.

I will start working on 1.5.0 release and try to have an RC by Monday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Add view support for JDBC catalog

6 participants