@jzhuge
Member

@jzhuge jzhuge commented Jan 30, 2023

What changes were proposed in this pull request?

This PR adds support to load, create, alter, and drop views in DataSource V2 catalogs.

  • View substitution rule
  • Create view DDL
  • View SQL DDLs

Why are the changes needed?

Support views stored in DataSourceV2 catalogs. Details in SPIP.

Does this PR introduce any user-facing change?

  • Support views from DataSource V2 catalogs in SQL
  • Support views from DataSource V2 catalogs in DataFrame API

How was this patch tested?

New unit tests

  • TODO

Regression

  • DDLParserSuite
  • PlanResolutionSuite
  • DataSourceV2SQLSuite

@github-actions github-actions bot added the SQL label Jan 30, 2023
@mridulm
Contributor

mridulm commented Jan 30, 2023

+CC @shardulm94, @wmoustafa

@jzhuge
Member Author

jzhuge commented Jan 30, 2023

TODOs and comments:

  • More unit tests
  • Conform to “v2 command framework” (SPARK-36586). ResolveCatalogs seems to have too much extra view code, but that cleanup might require migrating more SQL commands to “v2 command framework”. Maybe a good opportunity to split the PR.
  • Further cleanup/simplify/unify code, especially around ResolvedV2View.

@jzhuge
Member Author

jzhuge commented Jan 30, 2023

Thanks @amogh-jahagirdar for driving this PR!

@jzhuge
Member Author

jzhuge commented Jan 30, 2023

cc @holdenk (Shepherd) @cloud-fan @imback82 @huaxingao @xkrogen

protected def analyzer: Analyzer = new Analyzer(catalogManager) {

  override val extendedSubstitutionRules: Seq[Rule[LogicalPlan]] =
    Seq(ViewSubstitution(sqlParser))
Contributor

why does it need to be a substitution rule? The v1 view is resolved in the main resolution batch.

Member Author

@jzhuge jzhuge Jan 31, 2023

ViewSubstitution had to run in the same batch as CTESubstitution to handle CTEs and views that are nested in each other.

However, since there have been many changes in master since I originally wrote the code, this requirement may no longer be necessary. Let me revisit. It'd be much better to move the rule to Resolution batch if possible!

@cloud-fan
Contributor

shall we have the first PR only support creating and reading v2 views? Then we can add alter view commands (which can be done in parallel) later.

@jzhuge
Member Author

jzhuge commented Jan 31, 2023

shall we have the first PR only support creating and reading v2 views? Then we can add alter view commands (which can be done in parallel) later.

Yes, that is exactly what we planned. Amogh will split the PR.

@amogh-jahagirdar

Thanks @jzhuge, happy to help! Yes, I plan on splitting this PR; I will have more time later this week to look into this.

}

- test("View commands are not supported in v2 catalogs") {
+ ignore("View commands are not supported in v2 catalogs") {
Member

Could you remove the test if all of those commands are now supported?


override protected def run(): Seq[InternalRow] = {
  val schema = desc.schema.map(_.name).mkString("(", ", ", ")")
  val create = s"CREATE VIEW ${desc.identifier} $schema AS\n${desc.query}\n"
Member

Just in case, shouldn't ${desc.identifier} be quoted?
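
For illustration only (not code from this PR): a minimal Java sketch of what quoting the identifier in the generated statement could look like. The class `IdentifierQuoting` and its methods are hypothetical names. The sketch assumes backtick quoting with embedded backticks doubled, applied to each name part separately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the PR: quotes identifier parts for generated DDL.
final class IdentifierQuoting {
    private IdentifierQuoting() {}

    // Wraps one name part in backticks, doubling any embedded backtick.
    static String quotePart(String part) {
        return "`" + part.replace("`", "``") + "`";
    }

    // Joins the namespace parts and the view name into a dotted, fully quoted identifier.
    static String quoted(List<String> namespace, String name) {
        List<String> parts = new ArrayList<>(namespace);
        parts.add(name);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(quotePart(parts.get(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A view name containing a dot or a space would be ambiguous if left unquoted.
        String ident = quoted(Arrays.asList("db"), "my.view name");
        System.out.println("CREATE VIEW " + ident + " (id, name) AS\nSELECT 1, 'a'\n");
    }
}
```

Quoting each part separately keeps a name such as `my.view name` from being read back as two namespace levels when the statement is parsed again.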


override protected def run(): Seq[InternalRow] = {
  val exists = try {
    catalog.viewExists(ident)
Member

viewExists() doesn't throw the NoSuchViewException exception. Even the default implementation catches it and returns a boolean, see

default boolean viewExists(Identifier ident) {
  try {
    return loadView(ident) != null;
  } catch (NoSuchViewException e) {
    return false;
  }
}
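
For illustration only: a self-contained Java sketch of why the call site needs no try/catch around `viewExists`. The types here (`ToyViewCatalog`, `InMemoryViewCatalog`, `ToyNoSuchViewException`) are toy stand-ins that use plain strings for identifiers. They are assumptions made for this example, not Spark's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the real exception type; illustration only.
class ToyNoSuchViewException extends Exception {
    ToyNoSuchViewException(String ident) {
        super("View not found: " + ident);
    }
}

// Toy stand-in for a view catalog; identifiers are plain strings here.
interface ToyViewCatalog {
    String loadView(String ident) throws ToyNoSuchViewException;

    // Same shape as the default method quoted in the review comment:
    // the checked exception is absorbed and turned into a boolean.
    default boolean viewExists(String ident) {
        try {
            return loadView(ident) != null;
        } catch (ToyNoSuchViewException e) {
            return false;
        }
    }
}

final class InMemoryViewCatalog implements ToyViewCatalog {
    private final Map<String, String> views = new HashMap<>();

    void createView(String ident, String sql) {
        views.put(ident, sql);
    }

    @Override
    public String loadView(String ident) throws ToyNoSuchViewException {
        String sql = views.get(ident);
        if (sql == null) {
            throw new ToyNoSuchViewException(ident);
        }
        return sql;
    }
}

final class ViewExistsDemo {
    // The call site needs no try/catch: a missing view simply yields false.
    static String describe(ToyViewCatalog catalog, String ident) {
        return catalog.viewExists(ident) ? "exists" : "missing";
    }

    public static void main(String[] args) {
        InMemoryViewCatalog catalog = new InMemoryViewCatalog();
        catalog.createView("db.v1", "SELECT 1");
        System.out.println(describe(catalog, "db.v1"));
        System.out.println(describe(catalog, "db.nope"));
    }
}
```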

} catch {
  case e: IllegalArgumentException =>
    throw new SparkException(s"Invalid view change: ${e.getMessage}", e)
  case e: UnsupportedOperationException =>
Member

alterView doesn't throw UnsupportedOperationException; it throws NoSuchViewException. Could you update the comment on alterView to match the current behavior?

  catalog.alterView(ident, changes: _*)
} catch {
  case e: IllegalArgumentException =>
    throw new SparkException(s"Invalid view change: ${e.getMessage}", e)
Member

Please introduce an error class in error-classes.json if a suitable one doesn't already exist.

Contributor

@MaxGekk I see you manually reviewing and noticing when PRs introduce new cases of errors that don't go through the error-classes, I wonder if we can capture this with a linter rule to save you some trouble? 🙂

  case e: IllegalArgumentException =>
    throw new SparkException(s"Invalid view change: ${e.getMessage}", e)
  case e: UnsupportedOperationException =>
    throw new SparkException(s"Unsupported view change: ${e.getMessage}", e)
Member

Same here: please use an error class.
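
For illustration only: a rough Java sketch of the idea behind the error-class requests above, where an error is identified by a class name plus named parameters rather than an ad hoc message string. Everything here is an assumption made for the example: the class names `INVALID_VIEW_CHANGE` and `UNSUPPORTED_VIEW_CHANGE`, the templates, the `<param>` placeholder syntax, and the `Toy*` types. None of it is taken from Spark's error-classes.json or its exception API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry: names and templates are made up for this sketch.
final class ToyErrorClasses {
    private static final Map<String, String> TEMPLATES = new HashMap<>();

    static {
        TEMPLATES.put("INVALID_VIEW_CHANGE", "Invalid change to view <viewName>: <reason>");
        TEMPLATES.put("UNSUPPORTED_VIEW_CHANGE", "Unsupported change to view <viewName>: <reason>");
    }

    private ToyErrorClasses() {}

    // Looks up the template for an error class and fills in its named parameters.
    static String format(String errorClass, Map<String, String> params) {
        String template = TEMPLATES.get(errorClass);
        if (template == null) {
            throw new IllegalArgumentException("Unknown error class: " + errorClass);
        }
        String message = template;
        for (Map.Entry<String, String> p : params.entrySet()) {
            message = message.replace("<" + p.getKey() + ">", p.getValue());
        }
        return "[" + errorClass + "] " + message;
    }
}

// Hypothetical exception that carries the error class alongside the rendered message.
final class ToyViewException extends RuntimeException {
    final String errorClass;

    ToyViewException(String errorClass, Map<String, String> params, Throwable cause) {
        super(ToyErrorClasses.format(errorClass, params), cause);
        this.errorClass = errorClass;
    }
}

final class ErrorClassDemo {
    // Wraps a low-level exception from a catalog call into a classified error.
    static ToyViewException invalidChange(String viewName, IllegalArgumentException cause) {
        Map<String, String> params = new HashMap<>();
        params.put("viewName", viewName);
        params.put("reason", String.valueOf(cause.getMessage()));
        return new ToyViewException("INVALID_VIEW_CHANGE", params, cause);
    }

    public static void main(String[] args) {
        ToyViewException e =
            invalidChange("`db`.`v`", new IllegalArgumentException("bad property"));
        System.out.println(e.errorClass);
        System.out.println(e.getMessage());
    }
}
```

Keeping the class name on the exception lets callers and tests match on a stable identifier instead of parsing message text.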

@github-actions

github-actions bot commented Jun 3, 2023

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@jzhuge
Member Author

jzhuge commented Dec 4, 2023

Current status:

  • Incorporate all @MaxGekk comments (took care of some)
  • Add unit tests

Follow up:

  • Support user specified column names
  • Support viewSQLConfigs
  • Conform to “v2 command framework” (SPARK-36586). ResolveCatalogs seems to have too much extra view code, but that cleanup might require migrating more SQL commands to “v2 command framework”.

@holdenk holdenk removed the Stale label Dec 5, 2023
@holdenk
Contributor

holdenk commented Dec 5, 2023

It won't let me re-open, can you re-create the PR?

@arpitporwal2293

Hey @jzhuge, since this pull request was not merged, is DSv2 support for views still missing in Spark? I am not able to find any follow-up pull requests that add this functionality. I would appreciate the help.

@jzhuge
Member Author

jzhuge commented Oct 1, 2025

Could you share your use case? Specifically what type of view backend?

Together with the Iceberg community, we added view support to Iceberg 1.5. The Spark components for view support were added to the Iceberg SQL extension.

If desirable, I will be happy to restart the effort for a Spark PR.

@sunchao
Member

sunchao commented Dec 19, 2025

@jzhuge we're also very interested in this feature. Currently we are using Delta Lake with DSv2 (we have a custom v2 catalog impl), and it'd be great if the view support is in Spark itself. Happy to help review & push the Spark PR too!
