
Conversation

@MaxGekk (Member) commented Nov 16, 2019

What changes were proposed in this pull request?

In the PR, I propose to remove the following SQL configs:

  1. spark.sql.fromJsonForceNullableSchema
  2. spark.sql.legacy.compareDateTimestampInTimestamp
  3. spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation

that were declared to be removed in Spark 3.0.

Why are the changes needed?

To make the code cleaner and improve maintainability.

Does this PR introduce any user-facing change?

Yes

How was this patch tested?

By `TypeCoercionSuite`, `JsonExpressionsSuite`, and `DDLSuite`.
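For context, a minimal sketch (not part of this PR) of the `from_json` behavior that becomes unconditional once `spark.sql.fromJsonForceNullableSchema` is gone; the session setup and field names here are illustrative:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// A user-supplied schema that claims field `a` is non-nullable.
val schema = new StructType().add("a", IntegerType, nullable = false)

// With spark.sql.fromJsonForceNullableSchema removed, the schema passed to
// from_json is always forced to nullable, so a record missing `a` parses to
// null instead of violating a non-null guarantee (see SPARK-23173).
val df = Seq("""{"b": 1}""").toDF("json")
  .select(from_json($"json", schema).as("parsed"))

df.printSchema() // parsed.a is reported as nullable = true
df.show(false)   // parsed contains {null} because `a` is absent
```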

@SparkQA commented Nov 17, 2019

Test build #113948 has finished for PR 26559 at commit 4359dd1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member) left a comment:

Could you check the other configs once more? You may want to remove `spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation`, too.

@dongjoon-hyun (Member) commented

Also, cc @srowen.

@SparkQA commented Nov 17, 2019

Test build #113954 has finished for PR 26559 at commit 051c6f5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

"the SQL parser.")
.fallbackConf(ANSI_ENABLED)

val ALLOW_CREATING_MANAGED_TABLE_USING_NONEMPTY_LOCATION =
A member commented:

Was this explicitly slated to be removed in 3.0? The doc doesn't say so, but it may have been otherwise documented or well understood.

@MaxGekk (Author) replied:

Yes, it is mentioned in the SQL migration guide:

- Since Spark 2.4, creating a managed table with a nonempty location is not allowed; an exception is thrown when attempting to do so. Setting `spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation` to `true` restores the previous behavior. This option will be removed in Spark 3.0.
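For illustration, a hedged sketch of the behavior the migration guide describes; the table name, format, and location are assumptions, and the error text is paraphrased:

```scala
// `spark` is the usual SparkSession available in spark-shell.
// Assumption: the default location for managed table `t` (e.g.
// spark-warehouse/t) already contains files left over from an earlier run.
spark.sql("CREATE TABLE t (id INT) USING parquet")
// Since Spark 2.4 this throws an AnalysisException roughly of the form:
// "Can not create the managed table('`t`'). The associated location ... already exists."

// The removed legacy flag restored the pre-2.4 behavior of silently reusing
// the nonempty location:
spark.conf.set("spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation", "true")
spark.sql("CREATE TABLE t (id INT) USING parquet") // succeeds under the flag
```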

@dongjoon-hyun (Member) left a comment:

+1, LGTM. Thank you, @MaxGekk and @srowen.
Merged to master.

Review context from `SQLConf.scala`:

```scala
  .stringConf
  .createWithDefault("_corrupt_record")

val FROM_JSON_FORCE_NULLABLE_SCHEMA = buildConf("spark.sql.fromJsonForceNullableSchema")
```
A member commented:

Can we throw an exception if users try to set the removed conf to a value that is different from the default?

@MaxGekk (Author) replied:

Sure, we could throw an exception for the 3 configs. I am just wondering why we silently ignore non-existent SQL configs:

```scala
scala> spark.conf.set("spark.sql.abc", 1)
```

How about throwing `AnalysisException` for non-existent SQL configs that have the `spark.sql` prefix but are not present in

```scala
private[sql] val sqlConfEntries = java.util.Collections.synchronizedMap(
```

? Or are there SQL configs that we have to bypass for some reason?
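A rough sketch of the check being floated here, assuming a hypothetical `validateSetCommand` helper and a registry map; none of these names are Spark's actual internals:

```scala
import org.apache.spark.sql.AnalysisException

// Hypothetical guard: reject spark.sql.* keys that are not registered.
// `registered` stands in for the synchronized sqlConfEntries map quoted above;
// the plain-message AnalysisException constructor matches the Spark 3.0 era.
def validateSetCommand(key: String, registered: java.util.Map[String, AnyRef]): Unit = {
  if (key.startsWith("spark.sql.") && !registered.containsKey(key)) {
    throw new AnalysisException(s"The SQL config '$key' does not exist.")
  }
}
```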

@MaxGekk (Author) replied:

Here is the PR #27057

HyukjinKwon pushed a commit that referenced this pull request on Jan 3, 2020 (…removed SQL configs):

### What changes were proposed in this pull request?
In the PR, I propose to throw `AnalysisException` when a removed SQL config is set to non-default value. The following SQL configs removed by #26559 are marked as removed:
1. `spark.sql.fromJsonForceNullableSchema`
2. `spark.sql.legacy.compareDateTimestampInTimestamp`
3. `spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation`

### Why are the changes needed?
To improve the user experience with Spark SQL by notifying users when they set removed SQL configs.

### Does this PR introduce any user-facing change?
Yes. Before, the `set` command was silently ignored:
```sql
spark-sql> set spark.sql.fromJsonForceNullableSchema=false;
spark.sql.fromJsonForceNullableSchema	false
```
After, an exception is raised:
```sql
spark-sql> set spark.sql.fromJsonForceNullableSchema=false;
Error in query: The SQL config 'spark.sql.fromJsonForceNullableSchema' was removed in the version 3.0.0. It was removed to prevent errors like SPARK-23173 for non-default value.;
```

### How was this patch tested?
Added new tests to `SQLConfSuite` for both cases: when removed SQL configs are set to their default values and to non-default values.

Closes #27057 from MaxGekk/remove-sql-configs-followup.

Authored-by: Maxim Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
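For readers following along, a simplified, self-contained sketch of the removed-config mechanism the follow-up commit describes; the structure is assumed rather than copied from Spark, and only the config whose error message appears above is registered here:

```scala
// Sketch of a removed-config registry; names approximate, not Spark's code.
final case class RemovedConfig(
    key: String, version: String, defaultValue: String, comment: String)

val removedSQLConfigs: Map[String, RemovedConfig] = Map(
  "spark.sql.fromJsonForceNullableSchema" -> RemovedConfig(
    "spark.sql.fromJsonForceNullableSchema", "3.0.0", "true",
    "It was removed to prevent errors like SPARK-23173 for non-default value."))

// Only reject when the new value differs from the default the removed config
// had, so jobs that redundantly set the default keep working.
def checkRemoved(key: String, value: String): Unit =
  removedSQLConfigs.get(key).foreach { cfg =>
    if (value != cfg.defaultValue) {
      throw new IllegalArgumentException(
        s"The SQL config '$key' was removed in the version ${cfg.version}. ${cfg.comment}")
    }
  }

// checkRemoved("spark.sql.fromJsonForceNullableSchema", "false") // throws
// checkRemoved("spark.sql.fromJsonForceNullableSchema", "true")  // no-op
```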
@MaxGekk deleted the remove-sql-configs branch on June 5, 2020.