-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-39800][SQL][WIP] DataSourceV2: View Support #39796
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
+CC @shardulm94, @wmoustafa |
|
TODOs and comments:
|
|
Thanks @amogh-jahagirdar for driving this PR! |
|
cc @holdenk (Shepherd) @cloud-fan @imback82 @huaxingao @xkrogen |
| protected def analyzer: Analyzer = new Analyzer(catalogManager) { | ||
|
|
||
| override val extendedSubstitutionRules: Seq[Rule[LogicalPlan]] = | ||
| Seq(ViewSubstitution(sqlParser)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does it need to be a substitution rule? The v1 view is resolved in the main resolution batch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ViewSubstitution had to be done in the same batch as CTESubstitution to handle nested cte or view in each other.
However, since there have been many changes in master since I originally wrote the code, this requirement may no longer be necessary. Let me revisit. It'd be much better to move the rule to Resolution batch if possible!
|
shall we have the first PR only support creating and reading v2 views? Then we can add alter view commands (which can be done in parallel) later. |
Yes it is exactly what we planed. Amogh will split the PR. |
|
Thanks @jzhuge, happy to help ! Yes I plan on splitting this PR, I will get more time later this week to look into this. |
| } | ||
|
|
||
| test("View commands are not supported in v2 catalogs") { | ||
| ignore("View commands are not supported in v2 catalogs") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove the test if you supported all those commands.
|
|
||
| override protected def run(): Seq[InternalRow] = { | ||
| val schema = desc.schema.map(_.name).mkString("(", ", ", ")") | ||
| val create = s"CREATE VIEW ${desc.identifier} $schema AS\n${desc.query}\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case, shouldn't ${desc.identifier} be quoted?
|
|
||
| override protected def run(): Seq[InternalRow] = { | ||
| val exists = try { | ||
| catalog.viewExists(ident) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
viewExists() doesn't throw the NoSuchViewException exception. Even the default implementation catches it and returns a boolean, see
spark/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java
Lines 108 to 114 in d12abff
| default boolean viewExists(Identifier ident) { | |
| try { | |
| return loadView(ident) != null; | |
| } catch (NoSuchViewException e) { | |
| return false; | |
| } | |
| } |
| } catch { | ||
| case e: IllegalArgumentException => | ||
| throw new SparkException(s"Invalid view change: ${e.getMessage}", e) | ||
| case e: UnsupportedOperationException => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alterView doesn't throw UnsupportedOperationException but throws NoSuchViewException. Could you adjust the comment of alterView according to status quo.
| catalog.alterView(ident, changes: _*) | ||
| } catch { | ||
| case e: IllegalArgumentException => | ||
| throw new SparkException(s"Invalid view change: ${e.getMessage}", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, introduce an error class in error-classes.json if such class doesn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaxGekk I see you manually reviewing and noticing when PRs introduce new cases of errors that don't go through the error-classes, I wonder if we can capture this with a linter rule to save you some trouble? 🙂
| case e: IllegalArgumentException => | ||
| throw new SparkException(s"Invalid view change: ${e.getMessage}", e) | ||
| case e: UnsupportedOperationException => | ||
| throw new SparkException(s"Unsupported view change: ${e.getMessage}", e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same, please, use an error class.
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
|
Current status:
Follow up:
|
|
It won't let me re-open, can you re-create the PR? |
|
Hey @jzhuge, since this pull request is not merged. Is DSv2 support for Views missing in spark? I am not able to find any follow up pull requests which adds that functionality. Would appreciate the help. |
|
Could you share your use case? Specifically what type of view backend? Along with Iceberg community, we added view support to Iceberg 1.5. Spark components for view support were added to the Iceberg SQL extension. If desirable, I will be happy to restart the effort for a Spark PR. |
|
@jzhuge we're also very interested in this feature. Currently we are using Delta Lake with DSv2 (we have a custom v2 catalog impl), and it'd be great if the view support is in Spark itself. Happy to help review & push the Spark PR too! |
What changes were proposed in this pull request?
This PR adds support to load, create, alter, and drop views in DataSource V2 catalogs.
Why are the changes needed?
Support views stored in DataSourceV2 catalogs. Details in SPIP.
Does this PR introduce any user-facing change?
How was this patch tested?
New unit tests
Regression