[SPARK-46179][SQL] Add CrossDbmsQueryTestSuites, which runs other DBMS against golden files, starting with Postgres #44084
dtenedor
left a comment
Thanks for adding this, it will be useful to increase test coverage!
Resolved (now outdated) review threads on:
- sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala
- sql/core/src/test/scala/org/apache/spark/sql/crossdbms/JdbcConnection.scala (3 threads)
- sql/core/src/test/scala/org/apache/spark/sql/crossdbms/CrossDbmsQueryTestSuite.scala (4 threads)
- sql/core/src/test/resources/sql-tests/inputs/postgres-crosstest/sqllogictest-select1.sql
This reverts commit bef8011.
sql/core/src/test/scala/org/apache/spark/sql/crossdbms/CrossDbmsQueryTestSuite.scala
The reason we wanted to use a DBMS to generate golden files is that we didn't want CI/CD or developers to have to install some version of Postgres or another DBMS just to run tests.
Isn't it better to trust CI/CD, rather than humans, to install the proper version of the DBMS and run the tests? AFAIK we already do this to run tests for JDBC data sources. We already have Postgres tests in https://github.com/apache/spark/blob/master/connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/PostgresIntegrationSuite.scala
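The trade-off in the two comments above can be sketched as follows. This is a minimal illustration, not the PR's code: `GoldenFileModes`, `check`, and the `regenerate` flag are hypothetical names, and plain strings stand in for query output. It shows that when golden files are checked in, the reference DBMS is only needed while regenerating them.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}

// Hypothetical helper, not part of Spark.
object GoldenFileModes {
  /**
   * With regenerate = true, `reference` (for example, a query against Postgres)
   * is evaluated and its output is written to the golden file.
   * With regenerate = false, `reference` is never evaluated, so no DBMS is needed.
   * Returns true when the golden file content equals `actual`.
   */
  def check(golden: Path, regenerate: Boolean, actual: String)(reference: => String): Boolean = {
    if (regenerate) Files.write(golden, reference.getBytes(UTF_8))
    new String(Files.readAllBytes(golden), UTF_8) == actual
  }
}
```

Running the DBMS in CI instead, as suggested above, would correspond to always evaluating `reference`, at the cost of requiring a container or installation on every run.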
@cloud-fan I didn't know about those tests! I made the changes to add this as an integration test instead, as you suggested. The two tests (
  }
}

/** A test case. */
I assume this is just code being moved around, with no actual changes?
Yes, that's correct.
I think it's OK. We have JDBC tests in
Is there a way to refactor it?
I tried to move as much code as possible to the trait
@cloud-fan Can we merge?
Thanks, merging to master!
…S against golden files with other DBMS, starting with Postgres

### What changes were proposed in this pull request?
Create `CrossDbmsQueryTestSuite`, which extends `SQLQueryTestHelper` and `DockerIntegrationSuite`. `CrossDbmsQueryTestSuite` is a trait class that allows testing golden files against other DBMS. `PostgreSQLQueryTestSuite` is an implementation of `CrossDbmsQueryTestSuite`. For starters, sql files in the subquery sql-tests are automatically opted into this test. In this PR, all files except for `exists-having.sql` are ignored, otherwise this PR would have 10K+ line changes (I would like to do that in the next PR, if possible). I had to change the syntax for view creation in `exists-having.sql` slightly, and this is reflected in the `analyzer-results` file, but crucially, the query output (in the `results` file) are not changed.

Note that this will not be applicable to many of the current sql tests we have due to:
- Incompatible SQL syntax between spark sql and postgres.
- Incompatible data types.
- Difference in numerical precision with doubles.
- Missing functions in either system.
- Test files with specific configs, such as ANSI, count bug etc.

### Why are the changes needed?
For correctness checking of our SQLQueryTestSuites, we want to run SQLQueryTestSuites with Postgres as a reference DBMS. This can be easily extensible to other DBMS.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
This is a test-related PR, does not affect system behaviors.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#44084 from andylam-db/crossdbms.

Authored-by: Andy Lam <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
Create `CrossDbmsQueryTestSuite`, which extends `SQLQueryTestHelper` and `DockerIntegrationSuite`. `CrossDbmsQueryTestSuite` is a trait that allows testing golden files against other DBMSs. `PostgreSQLQueryTestSuite` is an implementation of `CrossDbmsQueryTestSuite`. For starters, SQL files in the subquery sql-tests are automatically opted into this test. In this PR, all files except `exists-having.sql` are ignored; otherwise this PR would have 10K+ line changes (I would like to do that in the next PR, if possible). I had to change the syntax for view creation in `exists-having.sql` slightly, and this is reflected in the `analyzer-results` file, but crucially, the query output (in the `results` file) is not changed.

Note that this will not be applicable to many of the current SQL tests we have, due to:
- Incompatible SQL syntax between Spark SQL and Postgres.
- Incompatible data types.
- Differences in numerical precision with doubles.
- Missing functions in either system.
- Test files with specific configs, such as ANSI, count bug, etc.
Why are the changes needed?
For correctness checking of our SQLQueryTestSuites, we want to run SQLQueryTestSuites with Postgres as a reference DBMS. This can easily be extended to other DBMSs.
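To make the extensibility point concrete, here is a minimal sketch of a pluggable reference DBMS. It is not the PR's implementation: `ReferenceDbms`, `FakeDbms`, and `GoldenCheck` are hypothetical names, an in-memory map stands in for a real JDBC connection, and sorting rows before comparing is an assumption made here for simplicity.

```scala
// Hypothetical sketch: these names are illustrative, not Spark's actual API.
trait ReferenceDbms {
  def name: String
  /** Runs one query and returns each result row rendered as a string. */
  def run(query: String): Seq[String]
}

// Stand-in for a real JDBC-backed connection; it answers from a fixed map.
final class FakeDbms(val name: String, answers: Map[String, Seq[String]]) extends ReferenceDbms {
  def run(query: String): Seq[String] =
    answers.getOrElse(query, sys.error(s"$name cannot run: $query"))
}

object GoldenCheck {
  /** Returns the queries whose reference output differs from the golden output. */
  def mismatches(golden: Map[String, Seq[String]], dbms: ReferenceDbms): Seq[String] =
    golden.toSeq.collect {
      // Assumption: rows are sorted so the comparison ignores row order.
      case (query, expected) if dbms.run(query).sorted != expected.sorted => query
    }
}
```

Under this shape, supporting another DBMS would mean adding one more `ReferenceDbms` implementation while the comparison logic stays shared.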
Does this PR introduce any user-facing change?
No.
How was this patch tested?
This is a test-only PR and does not affect system behavior.
Was this patch authored or co-authored using generative AI tooling?
No.