feat: unload parquet with version as created_by. #15680

Merged
merged 1 commit into from
May 30, 2024


@youngsofun youngsofun commented May 30, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

  1. By default, `created_by` is `concat!("parquet-rs version ", env!("CARGO_PKG_VERSION"))`.
  2. Since we have our own extension types, and only Databend knows how to interpret them, we should mark the file as Databend-created.
  3. We need to add some additional metadata, and it may change over time; with the version recorded, we can know how to read files generated by older versions. For example, we can currently read a column that IS a variant type, but not one that HAS a variant inside it (bug: unable to varaint inside tuple from parquet #13385), so we clearly need a new way to do so.
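To illustrate points 2 and 3, here is a minimal sketch of how a reader might use the `created_by` string to decide whether a file was written by Databend and by which version. The helper `databend_version` is hypothetical and not part of this PR; it assumes the tag has the form `Databend <major>.<minor>.<patch>` with an optional `v` prefix and an optional suffix such as `-nightly`.

```rust
/// Hypothetical helper (not from this PR): parse a parquet `created_by` string.
/// Returns `Some((major, minor, patch))` when the string looks like
/// "Databend 1.2.261-nightly", and `None` for any other writer.
fn databend_version(created_by: &str) -> Option<(u32, u32, u32)> {
    // Files from other writers (e.g. "parquet-rs version ...") are rejected here.
    let rest = created_by.strip_prefix("Databend ")?;
    // Tolerate an optional leading 'v', as in the longer version string.
    let rest = rest.strip_prefix('v').unwrap_or(rest);
    // Keep only the numeric core; drop suffixes such as "-nightly" or "(rust-...)".
    let core = rest
        .split(|c: char| c == '-' || c == '(' || c == ' ')
        .next()?;
    // Each component must parse as an unsigned integer.
    let mut parts = core.split('.').map(|p| p.parse::<u32>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    Some((major, minor, patch))
}
```

A reader could then branch on the result: `None` means the file should be treated as plain parquet with no Databend extension types, while `Some(version)` allows version-specific handling of the extra metadata.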

#13385

  • Fixes #[Link the issue here]

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label May 30, 2024
@youngsofun youngsofun force-pushed the unload branch 2 times, most recently from 889f7b7 to 9538374 Compare May 30, 2024 03:58
@youngsofun youngsofun requested review from b41sh and sundy-li May 30, 2024 03:58

youngsofun commented May 30, 2024

The errors occur only because the unloaded file size changed.
I will update the expected results once the PR has been approved by at least one reviewer.


youngsofun commented May 30, 2024

I changed the version string in parquet from something like `Databend v1.2.261-nightly-a2fcca8d1d(rust-1.78.0-nightly-2024-05-30T08:45:18.525800000Z)` to simply `Databend 1.2.261-nightly`, if that is no problem with you @b41sh @sundy-li

@youngsofun youngsofun force-pushed the unload branch 4 times, most recently from 1e6e6c2 to d45d2d5 Compare May 30, 2024 11:54
@BohuTANG BohuTANG merged commit 112a1f9 into databendlabs:main May 30, 2024
69 checks passed
4 participants