@MaxGekk
Member

@MaxGekk MaxGekk commented Feb 14, 2021

What changes were proposed in this pull request?

In this PR, I propose to update the Spark SQL guide to document the SQL configs related to datetime rebasing:

  • spark.sql.parquet.int96RebaseModeInWrite
  • spark.sql.parquet.datetimeRebaseModeInWrite
  • spark.sql.parquet.int96RebaseModeInRead
  • spark.sql.parquet.datetimeRebaseModeInRead
  • spark.sql.avro.datetimeRebaseModeInWrite
  • spark.sql.avro.datetimeRebaseModeInRead

the Parquet options added by #31489:

  • datetimeRebaseMode
  • int96RebaseMode

and Avro options added by #31529:

  • datetimeRebaseMode
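
As a rough illustration of how the per-read Parquet options listed above might be used from PySpark, here is a minimal sketch. The helper names `parquet_rebase_options` and `read_parquet_with_rebase` are hypothetical, and the mode values `EXCEPTION`, `CORRECTED` and `LEGACY` are assumed from the Spark 3.x documentation rather than taken from this PR text. Leaving an option unset is assumed to fall back to the corresponding SQL config.

```python
# Rebase modes as documented for Spark 3.x (an assumption; not listed in this PR text).
REBASE_MODES = ("EXCEPTION", "CORRECTED", "LEGACY")


def parquet_rebase_options(datetime_mode=None, int96_mode=None):
    """Build a dict of per-read Parquet options (hypothetical helper).

    A mode of None is skipped, so the reader is assumed to fall back
    to the matching SQL config.
    """
    options = {}
    for name, mode in (("datetimeRebaseMode", datetime_mode),
                       ("int96RebaseMode", int96_mode)):
        if mode is None:
            continue
        mode = mode.upper()
        if mode not in REBASE_MODES:
            raise ValueError(f"unsupported rebase mode for {name}: {mode}")
        options[name] = mode
    return options


def read_parquet_with_rebase(spark, path, **modes):
    """Read a Parquet path with the given rebase options (hypothetical helper).

    `spark` is an existing SparkSession supplied by the caller.
    """
    return spark.read.options(**parquet_rebase_options(**modes)).parquet(path)
```

For example, `read_parquet_with_rebase(spark, "/tmp/old_files", datetime_mode="LEGACY")` would pass `datetimeRebaseMode=LEGACY` for that read only, while a session-wide setting would go through `spark.conf.set("spark.sql.parquet.datetimeRebaseModeInRead", "LEGACY")`. The path here is a placeholder.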

[Image: Screenshot 2021-02-17 at 21:42:09]

Why are the changes needed?

To inform users about supported DS options and SQL configs.

Does this PR introduce any user-facing change?

No

How was this patch tested?

By generating the doc and manually checking:

$ SKIP_API=1 SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll serve --watch

@github-actions github-actions bot added the DOCS label Feb 14, 2021
@SparkQA

SparkQA commented Feb 14, 2021

Test build #135150 has finished for PR 31564 at commit 4795e90.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 14, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39731/

@SparkQA

SparkQA commented Feb 14, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39731/

@MaxGekk
Member Author

MaxGekk commented Feb 15, 2021

@dongjoon-hyun @gengliangwang @cloud-fan @HyukjinKwon Could you review this PR, please?

@cloud-fan
Contributor

These options are for migration purposes (same as the legacy configs). Do we really need to mention them in the public doc? How about the migration guide?

@MaxGekk
Member Author

MaxGekk commented Feb 17, 2021

These options are for migration purposes (same as the legacy configs).

I believe the rebase configs were placed in the legacy namespace by mistake, because they can be used not only for migration from previous Spark versions but also for reading (and writing) files written by other systems, frameworks, and libraries. So the configs will stay with us forever. I would like to propose "renaming" the existing configs.
ConfigBuilder has the method withAlternative, so we can introduce an alternative for each legacy rebase config:

spark.sql.legacy.parquet.int96RebaseModeInRead -> spark.sql.parquet.int96RebaseModeInRead

and deprecate

spark.sql.legacy.parquet.int96RebaseModeInRead

After that, document spark.sql.parquet.int96RebaseModeInRead in the Spark SQL guide.
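
The fallback behaviour this proposal relies on can be sketched with a toy model. This is not Spark's `ConfigBuilder`; `ConfEntry` and `resolve` are hypothetical names, and the default value is illustrative only. It assumes that the new key takes precedence and the deprecated legacy key is consulted only when the new one is unset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfEntry:
    """Toy stand-in for a config entry with alternative (deprecated) keys."""
    key: str
    default: str
    alternatives: tuple = ()


def resolve(entry, settings):
    """Return (value, key_used); the primary key wins over any alternative."""
    for key in (entry.key, *entry.alternatives):
        if key in settings:
            return settings[key], key
    # Nothing was set explicitly, so use the entry's default.
    return entry.default, None


INT96_READ = ConfEntry(
    key="spark.sql.parquet.int96RebaseModeInRead",
    default="EXCEPTION",  # illustrative default, not necessarily Spark's
    alternatives=("spark.sql.legacy.parquet.int96RebaseModeInRead",),
)
```

With this model, a job that still sets only the legacy key keeps working, while a job that sets both keys gets the value of the new one.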

@MaxGekk
Member Author

MaxGekk commented Feb 17, 2021

Since #31576 has been merged by @cloud-fan , I documented public configs here.

@SparkQA

SparkQA commented Feb 17, 2021

Test build #135203 has finished for PR 31564 at commit 773cb0b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39784/

@SparkQA

SparkQA commented Feb 17, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39784/


</div>

## Data Source Option
Member

Do you plan to list other data source options here? If not, I would make this a separate section, such as "Rebasing Datetime".

Member Author

Yes, other options should be here.

@HyukjinKwon
Member

Merged to master.
