Databricks Delta defaults to LEGACY for int96RebaseModeInWrite #8166

Closed
tgravescs opened this issue Apr 21, 2023 · 2 comments
Assignees
Labels
bug Something isn't working

Comments

@tgravescs
Collaborator

Describe the bug
It looks like Databricks defaults all of the rebase mode settings to LEGACY. It appears that Delta writes previously did not write this metadata to the Parquet file, but recently changed to honor it. This means that we can't read any INT96 timestamp fields with the GPU Parquet reader. This particular job had spark.sql.parquet.int96RebaseModeInRead=CORRECTED, but since the file itself has the INT96 tag (key = org.apache.spark.legacyINT96), the tag takes precedence and we end up failing to read.
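To make the precedence described above concrete, here is a minimal sketch in Python. It is a simplified illustration only, not Spark's or the plugin's actual code: the function name `effective_int96_rebase_mode` is hypothetical, and it assumes that the mere presence of the footer key forces LEGACY regardless of the session setting.

```python
# Simplified illustration only; not Spark's or the RAPIDS plugin's real logic.
LEGACY_INT96_KEY = "org.apache.spark.legacyINT96"  # footer key named in this issue


def effective_int96_rebase_mode(file_metadata: dict, session_mode: str) -> str:
    """Return the rebase mode a reader would apply to INT96 timestamps.

    ASSUMPTION: the presence of the footer key forces LEGACY, no matter
    what spark.sql.parquet.int96RebaseModeInRead is set to.
    """
    if LEGACY_INT96_KEY in file_metadata:
        return "LEGACY"
    return session_mode


# A tagged file is treated as LEGACY even when the session asks for CORRECTED.
tagged_footer = {LEGACY_INT96_KEY: ""}
print(effective_int96_rebase_mode(tagged_footer, "CORRECTED"))
print(effective_int96_rebase_mode({}, "CORRECTED"))
```

Under that assumption, setting the read config to CORRECTED does not help once the writer has tagged the file.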

tgravescs added the bug and ? - Needs Triage labels Apr 21, 2023
@tgravescs
Collaborator Author

Here I think we may be able to actually check the data to see if it needs to be rebased and only throw if it actually does. Investigating this approach.
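A rough sketch of what such a data-driven check could look like, in Python for brevity. Everything here is an assumption for illustration: the helper names are hypothetical, timestamps are modeled as microseconds since the Unix epoch, and the 1900-01-01T00:00:00Z cutoff is a placeholder rather than the value the plugin uses.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# ASSUMPTION: placeholder cutoff; values at or after it are treated as
# identical in both calendars and therefore safe to read without rebasing.
ASSUMED_CUTOFF = datetime(1900, 1, 1, tzinfo=timezone.utc)
ASSUMED_CUTOFF_MICROS = (ASSUMED_CUTOFF - EPOCH) // timedelta(microseconds=1)


def needs_rebase(timestamps_micros, cutoff_micros=ASSUMED_CUTOFF_MICROS):
    """True if any non-null timestamp is older than the cutoff."""
    return any(ts is not None and ts < cutoff_micros for ts in timestamps_micros)


def check_int96_column(timestamps_micros, rebase_mode):
    """Fail only when LEGACY mode is requested AND the data is old enough to matter."""
    if rebase_mode == "LEGACY" and needs_rebase(timestamps_micros):
        raise ValueError("INT96 timestamps older than the cutoff require rebasing")
    return timestamps_micros


# Modern timestamps pass even though the file is marked LEGACY.
modern = [0, 1_682_035_200_000_000, None]
check_int96_column(modern, "LEGACY")
```

The point of the design is that the mode alone never causes a failure; only data that would actually change under rebasing does.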

tgravescs self-assigned this Apr 21, 2023
tgravescs removed the ? - Needs Triage label Apr 24, 2023
@tgravescs
Collaborator Author

So while Databricks does default to this, the customer issue reported was actually just that they had some timestamps that were too old. We already have some base logic to see if a date is older than the max timestamp of any timezone and only throw if it is older.

Closing this.
