@MaxGekk
Member

@MaxGekk MaxGekk commented Jun 14, 2020

What changes were proposed in this pull request?

  • Modify DateTimeRebaseBenchmark to benchmark the default date-time rebasing mode, EXCEPTION, for saving/loading dates/timestamps to/from Parquet files. The mode is benchmarked for modern timestamps after 1900-01-01 00:00:00Z and dates after 1582-10-15.
  • Regenerate benchmark results in the environment:

| Item | Description |
| ---- | ---- |
| Region | us-west-2 (Oregon) |
| Instance | r3.xlarge |
| AMI | ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1 (ami-06f2f779464715dc5) |
| Java | OpenJDK 64-Bit Server VM 1.8.0_252 and OpenJDK 64-Bit Server VM 11.0.7+10 |

Why are the changes needed?

The EXCEPTION rebasing mode is the default mode of the SQL configs spark.sql.legacy.parquet.datetimeRebaseModeInRead and spark.sql.legacy.parquet.datetimeRebaseModeInWrite. The changes are needed to improve benchmark coverage for default settings.
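
To illustrate why the benchmark restricts itself to modern values, here is a minimal sketch of the kind of check the EXCEPTION mode implies: values after the cut-offs quoted above pass through untouched, while older ones are rejected instead of being rebased. This is plain Python and not Spark's implementation; the names `check_exception_mode` and `RebaseRequiredError` are hypothetical, and treating the cut-off itself as modern is an assumption.

```python
from datetime import date, datetime, timezone

# Cut-offs quoted in the PR description. Treating the boundary value itself
# as "modern" is an assumption of this sketch.
FIRST_MODERN_DATE = date(1582, 10, 15)
FIRST_MODERN_TS = datetime(1900, 1, 1, tzinfo=timezone.utc)


class RebaseRequiredError(Exception):
    """Hypothetical stand-in for the error raised in EXCEPTION mode."""


def check_exception_mode(value):
    """Return the value unchanged if it is modern; raise if it would need rebasing.

    Timestamps must be timezone-aware datetimes; dates are plain `date` objects.
    """
    # datetime is a subclass of date, so it has to be tested first.
    if isinstance(value, datetime):
        ancient = value < FIRST_MODERN_TS
    else:
        ancient = value < FIRST_MODERN_DATE
    if ancient:
        raise RebaseRequiredError(
            f"{value.isoformat()} is older than the cut-off and would need rebasing"
        )
    return value


# Modern values, like those used in the benchmark, pass the check as-is.
check_exception_mode(date(2020, 6, 14))
check_exception_mode(datetime(2020, 6, 14, tzinfo=timezone.utc))
```

In this sketch the modern path costs one comparison per value, which is roughly the overhead the benchmark is meant to expose for the default settings.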

Does this PR introduce any user-facing change?

No

How was this patch tested?

By running the benchmark and checking the results manually.

@SparkQA

SparkQA commented Jun 15, 2020

Test build #124016 has finished for PR 28829 at commit 16e90be.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MaxGekk
Member Author

MaxGekk commented Jun 15, 2020

jenkins, retest this, please

@MaxGekk MaxGekk changed the title from "[WIP][SQL] Benchmark the EXCEPTION rebase mode" to "[SPARK-31992][SQL] Benchmark the EXCEPTION rebase mode" Jun 15, 2020
@MaxGekk
Member Author

MaxGekk commented Jun 15, 2020

@cloud-fan Please review the PR.

@SparkQA

SparkQA commented Jun 15, 2020

Test build #124033 has finished for PR 28829 at commit 16e90be.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

retest this please

@cloud-fan
Contributor

This updates the benchmark only and doesn't affect the Jenkins builder, so I'm merging it to master/3.0. Thanks!

@cloud-fan cloud-fan closed this in 9d95f1b Jun 15, 2020
cloud-fan pushed a commit that referenced this pull request Jun 15, 2020
Closes #28829 from MaxGekk/benchmark-exception-mode.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 9d95f1b)
Signed-off-by: Wenchen Fan <[email protected]>
@SparkQA

SparkQA commented Jun 15, 2020

Test build #124039 has finished for PR 28829 at commit 16e90be.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MaxGekk MaxGekk deleted the benchmark-exception-mode branch December 11, 2020 20:26