Skip to content

Conversation

@imback82
Copy link
Contributor

What changes were proposed in this pull request?

  1. Move ALTER NAMESPACE ... SET LOCATION parsing tests to AlterNamespaceSetLocationParserSuite.
  2. Put common ALTER NAMESPACE ... SET LOCATION tests into one trait org.apache.spark.sql.execution.command.AlterNamespaceSetLocationSuiteBase, and put datasource specific tests to the v1.AlterNamespaceSetLocationSuite and v2.AlterNamespaceSetLocationSuite.

The changes follow the approach of #30287.

Why are the changes needed?

  1. The unification will allow to run common ALTER NAMESPACE ... SET LOCATION tests for both DSv1/Hive DSv1 and DSv2
  2. We can detect missing features and differences between DSv1 and DSv2 implementations.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests and new tests.

@github-actions github-actions bot added the SQL label Nov 15, 2021
@imback82
Copy link
Contributor Author

cc @cloud-fan @MaxGekk Thanks in advance!

withTempDir { tmpDir =>
sql(s"ALTER NAMESPACE $catalog.$ns SET LOCATION '${tmpDir.toURI}'")
val sessionCatalog = spark.sessionState.catalog
val uriInCatalog = sessionCatalog.getDatabaseMetadata(ns).locationUri
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan This check is used in v1 session catalog (existing code), but this can be unified with v2 test if we use DESCRIBE NAMESPACE, WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea let's unify it, so that we can move this test to the base trait.

@SparkQA
Copy link

SparkQA commented Nov 15, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49720/

@SparkQA
Copy link

SparkQA commented Nov 15, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49720/

@SparkQA
Copy link

SparkQA commented Nov 16, 2021

Test build #145250 has finished for PR 34610 at commit 4629166.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

runBasicTest()
}

test("Empty location string is allowed") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another difference. v1 doesn't allow empty string because it converts the string to URI.

@SparkQA
Copy link

SparkQA commented Nov 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49953/

@SparkQA
Copy link

SparkQA commented Nov 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49953/

@SparkQA
Copy link

SparkQA commented Nov 21, 2021

Test build #145481 has finished for PR 34610 at commit 6df9efd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

assert(message.contains(s"$notFoundMsgPrefix '$ns' not found"))
}

// Hive catalog does not support "ALTER NAMESPACE ... SET LOCATION", thus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can have a

protected def namespaceKeyword = "NAMESPACE"

and in the hive test we override it

override def namespaceKeyword = "DATABASE"

Then we can define the test in base trait

test(...) {
  ...
  sql(s"ALTER $namepaceKeyword ...")
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or we just allow using NAMESPACE in hive context as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant Hive doesn't support altering location:

override def alterDatabase(database: CatalogDatabase): Unit = withHiveState {
if (!getDatabase(database.name).locationUri.equals(database.locationUri)) {
// SPARK-29260: Enable supported versions once it support altering database location.
if (!(version.equals(hive.v3_0) || version.equals(hive.v3_1))) {
throw QueryCompilationErrors.alterDatabaseLocationUnsupportedError(version.fullVersion)

Copy link
Contributor

@cloud-fan cloud-fan Nov 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the explanation!

cloud-fan pushed a commit that referenced this pull request Nov 24, 2021
…ty location consistently across v1 and v2 command

### What changes were proposed in this pull request?

Currently, there is an inconsistency when handling an empty location for `ALTER NAMESPACE .. SET LOCATION` between v1 and v2 command. In v1 command, an empty string location will result in the `IllegalArgumentException` exception thrown whereas v2 uses the empty string as it is.

This PR proposes to make the behavior consistent by following the v1 command behavior.

### Why are the changes needed?

To make the behavior consistent and the reason for following v1 behavior is that "Spark should be responsible to qualify the user-specified path using its spark/hadoop configs, before passing the path to v2 sources": #34610 (comment)

### Does this PR introduce _any_ user-facing change?

Yes, now the empty string location will result in the `IllegalArgumentException` exception thrown even for v2 catalogs.

### How was this patch tested?

Added a new test

Closes #34686 from imback82/empty_location_fix.

Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
}
}

test("SPARK-37444: ALTER NAMESPACE .. SET LOCATION using v2 catalog with empty location") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now being tested in test("Empty location string")

@SparkQA
Copy link

SparkQA commented Nov 25, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50089/

@SparkQA
Copy link

SparkQA commented Nov 25, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50089/

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@SparkQA
Copy link

SparkQA commented Nov 25, 2021

Test build #145617 has finished for PR 34610 at commit caacba6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

cloud-fan pushed a commit that referenced this pull request Nov 25, 2021
… command by default

### What changes were proposed in this pull request?

This PR proposes to use V2 commands as default as outlined in [SPARK-36588](https://issues.apache.org/jira/browse/SPARK-36588), and this PR migrates `ALTER NAMESPACE ... SET LOCATION` to use v2 command by default.

Note that the work to make v1/v2 commands consistent was done in #34686 and tests covering both v1/v2 were done in #34610.

### Why are the changes needed?

It's been a while since we introduced the v2 commands,  and it seems reasonable to use v2 commands by default even for the session catalog, with a legacy config to fall back to the v1 commands.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing *AlterNamespaceSetLocationSuite tests cover this PR's change.

Closes #34708 from imback82/v2_alter_ns_location.

Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
@imback82 imback82 changed the title [SPARK-34332][SQL][TEST] Unify v1 and v2 ALTER TABLE .. SET LOCATION tests [SPARK-34332][SQL][TEST] Unify v1 and v2 ALTER NAMESPACE .. SET LOCATION tests Dec 9, 2021
cloud-fan pushed a commit that referenced this pull request Dec 10, 2021
…NAMESPACE .. SET LOCATION

### What changes were proposed in this pull request?

This is a follow up to #34610 to remove unnecessary test for `ALTER NAMESPACE .. SET LOCATION`. The test being removed is already covered by https://github.com/apache/spark/blob/7b000116575cdc8e42653a694d9ad5273346fa61/sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterNamespaceSetLocationSuiteBase.scala#L65

### Why are the changes needed?

To remove unnecessary test.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Removing test.

Closes #34854 from imback82/followup-SPARK-34332.

Authored-by: Terry Kim <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Jan 23, 2024
### What changes were proposed in this pull request?
This pr aims to upgrade Arrow from 14.0.2 to 15.0.0, this version fixes the compatibility issue with Netty 4.1.104.Final(GH-39265).

Additionally, since the `arrow-vector` module uses `eclipse-collections` to replace `netty-common` as a compile-level dependency, Apache Spark has added a dependency on `eclipse-collections` after upgrading to use Arrow 15.0.0.

### Why are the changes needed?
The new version brings the following major changes:

Bug Fixes
GH-34610 - [Java] Fix valueCount and field name when loading/transferring NullVector
GH-38242 - [Java] Fix incorrect internal struct accounting for DenseUnionVector#getBufferSizeFor
GH-38254 - [Java] Add reusable buffer getters to char/binary vectors
GH-38366 - [Java] Fix Murmur hash on buffers less than 4 bytes
GH-38387 - [Java] Fix JDK8 compilation issue with TestAllTypes
GH-38614 - [Java] Add VarBinary and VarCharWriter helper methods to more writers
GH-38725 - [Java] decompression in Lz4CompressionCodec.java does not set writer index

New Features and Improvements
GH-38511 - [Java] Add getTransferPair(Field, BufferAllocator, CallBack) for StructVector and MapVector
GH-14936 - [Java] Remove netty dependency from arrow-vector
GH-38990 - [Java] Upgrade to flatc version 23.5.26
GH-39265 - [Java] Make it run well with the netty newest version 4.1.104

The full release notes as follows:

- https://arrow.apache.org/release/15.0.0.html

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #44797 from LuciferYang/SPARK-46718.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants