From ac10d8ed21ed4e7c6c6ca34ea4920a50c002bffe Mon Sep 17 00:00:00 2001
From: Gengliang Wang
Date: Tue, 18 Aug 2020 11:46:24 +0800
Subject: [PATCH 1/4] add migration guide

---
 docs/sql-migration-guide.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index c7f6116b88f87..6857d9d29ccce 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -36,6 +36,10 @@ license: |
 
 - In Spark 3.1, NULL elements of structures, arrays and maps are converted to "null" in casting them to strings. In Spark 3.0 or earlier, NULL elements are converted to empty strings. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.castComplexTypesToString.enabled` to `true`.
 
+- In Spark 3.1, when `spark.sql.ansi.enabled` is false, sum aggregation of decimal type column always returns `null` on decimal value overflow. In Spark 3.0 or earlier, when `spark.sql.ansi.enabled` is false and decimal value overflow happens in sum aggregation of decimal type column:
+  - If it is hash aggregation with `group by` clause, a runtime exception is thrown.
+  - Otherwise, null is returned.
+
 ## Upgrading from Spark SQL 3.0 to 3.0.1
 
 - In Spark 3.0, JSON datasource and JSON function `schema_of_json` infer TimestampType from string values if they match to the pattern defined by the JSON option `timestampFormat`. Since version 3.0.1, the timestamp type inference is disabled by default. Set the JSON option `inferTimestamp` to `true` to enable such type inference.

From b9dd5cded15f5a2e271de4f1d6c6334c73a1ca23 Mon Sep 17 00:00:00 2001
From: Gengliang Wang
Date: Tue, 18 Aug 2020 16:04:47 +0800
Subject: [PATCH 2/4] revise

---
 docs/sql-migration-guide.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 6857d9d29ccce..118c0d451fb33 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -36,9 +36,7 @@ license: |
 
 - In Spark 3.1, NULL elements of structures, arrays and maps are converted to "null" in casting them to strings. In Spark 3.0 or earlier, NULL elements are converted to empty strings. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.castComplexTypesToString.enabled` to `true`.
 
-- In Spark 3.1, when `spark.sql.ansi.enabled` is false, sum aggregation of decimal type column always returns `null` on decimal value overflow. In Spark 3.0 or earlier, when `spark.sql.ansi.enabled` is false and decimal value overflow happens in sum aggregation of decimal type column:
-  - If it is hash aggregation with `group by` clause, a runtime exception is thrown.
-  - Otherwise, null is returned.
+- In Spark 3.1, when `spark.sql.ansi.enabled` is false, Spark always returns null if the sum of decimal type column overflows. In Spark 3.0 or earlier, when `spark.sql.ansi.enabled` is false, the sum of decimal type column may return null or incorrect answer, or even fails at runtime, depending on the actual query plan execution.
 
 ## Upgrading from Spark SQL 3.0 to 3.0.1
 

From a6b03f84e5966331a85a4913583f8ca8ca683c3a Mon Sep 17 00:00:00 2001
From: Gengliang Wang
Date: Tue, 18 Aug 2020 16:07:16 +0800
Subject: [PATCH 3/4] revise

---
 docs/sql-migration-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 118c0d451fb33..417b58e70b304 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -36,7 +36,7 @@ license: |
 
 - In Spark 3.1, NULL elements of structures, arrays and maps are converted to "null" in casting them to strings. In Spark 3.0 or earlier, NULL elements are converted to empty strings. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.castComplexTypesToString.enabled` to `true`.
 
-- In Spark 3.1, when `spark.sql.ansi.enabled` is false, Spark always returns null if the sum of decimal type column overflows. In Spark 3.0 or earlier, when `spark.sql.ansi.enabled` is false, the sum of decimal type column may return null or incorrect answer, or even fails at runtime, depending on the actual query plan execution.
+- In Spark 3.1, when `spark.sql.ansi.enabled` is false, Spark always returns null if the sum of decimal type column overflows. In Spark 3.0 or earlier, when `spark.sql.ansi.enabled` is false, the sum of decimal type column may return null or incorrect result, or even fails at runtime (depending on the actual query plan execution).
 
 ## Upgrading from Spark SQL 3.0 to 3.0.1
 

From 877f47653cd1b8cc580a221870be78c5bd407bea Mon Sep 17 00:00:00 2001
From: Gengliang Wang
Date: Wed, 19 Aug 2020 11:33:20 +0800
Subject: [PATCH 4/4] address comment

---
 docs/sql-migration-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 417b58e70b304..4e8e7378d9a1e 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -36,7 +36,7 @@ license: |
 
 - In Spark 3.1, NULL elements of structures, arrays and maps are converted to "null" in casting them to strings. In Spark 3.0 or earlier, NULL elements are converted to empty strings. To restore the behavior before Spark 3.1, you can set `spark.sql.legacy.castComplexTypesToString.enabled` to `true`.
 
-- In Spark 3.1, when `spark.sql.ansi.enabled` is false, Spark always returns null if the sum of decimal type column overflows. In Spark 3.0 or earlier, when `spark.sql.ansi.enabled` is false, the sum of decimal type column may return null or incorrect result, or even fails at runtime (depending on the actual query plan execution).
+- In Spark 3.1, when `spark.sql.ansi.enabled` is false, Spark always returns null if the sum of a decimal type column overflows. In Spark 3.0 or earlier, in this case, the sum of a decimal type column may return null or an incorrect result, or even fail at runtime (depending on the actual query plan execution).
 
 ## Upgrading from Spark SQL 3.0 to 3.0.1
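
To make the documented behavior change concrete, here is a minimal, self-contained Scala sketch (not part of the patch itself) that forces a decimal sum overflow. The object name, the temp view name `t`, and the literal values are hypothetical, chosen only to trigger the overflow. With `spark.sql.ansi.enabled` set to false, Spark 3.1 is expected to return NULL for the aggregation below, while Spark 3.0 or earlier could return null, an incorrect result, or fail, depending on the physical plan.

```scala
import org.apache.spark.sql.SparkSession

object DecimalSumOverflowDemo {
  def main(args: Array[String]): Unit = {
    // Local session with ANSI mode off (the mode this migration note describes).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("DecimalSumOverflowDemo")
      .config("spark.sql.ansi.enabled", "false")
      .getOrCreate()

    // Two rows at the DECIMAL(38,0) maximum: their sum cannot fit in the
    // sum's result type, so the aggregation overflows. The view name and
    // values are hypothetical, picked only to trigger the overflow.
    spark.sql(
      """CREATE OR REPLACE TEMP VIEW t AS
        |SELECT CAST(v AS DECIMAL(38, 0)) AS d
        |FROM VALUES ('99999999999999999999999999999999999999'),
        |            ('99999999999999999999999999999999999999') AS data(v)""".stripMargin)

    // Spark 3.1 with spark.sql.ansi.enabled=false: consistently NULL.
    // Spark 3.0 or earlier: null, an incorrect result, or a runtime failure,
    // depending on the actual query plan execution.
    spark.sql("SELECT sum(d) AS total FROM t").show(truncate = false)

    spark.stop()
  }
}
```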