-
Notifications
You must be signed in to change notification settings - Fork 122
Sketch / MVP: Upgrade to arrow-rs 55.0.0 / parquet 55.0.0 / object_store 0.12.0 #831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| //! This module exists to help re-export the version of arrow used by default-engine and other | ||
| //! parts of kernel that need arrow | ||
| #[cfg(feature = "arrow_53")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure that simply deleting support for the older arrow releases is not the right approach for this crate but it was the most expedient for my purposes
|
awesome, thanks @alamb! just saw arrow 55 is out, thanks for starting this - ill pick it up and see if we can support arrow 55 soon! regarding your questions:
yea we have pinned chrono so that we can support arrow 53 (some context here) - perhaps we can just get by with dropping 53 support now and just do 54/55?
ah yep - I can try to follow up over there to see if they can bump too! |
I don't think I am in a position to judge this 🤷 -- that will be on the maintainers of this crate I think |
|
did the same thing for the |
|
@alamb Thanks for taking a crack at this, I noticed 55 yesterday when doing Personally I think the hdfs-native-object-store is a boat anchor that should be I'll see what I can do with the chrono juggling to ensure that kernel try to keep support for multiple arrow versions via feature flags. If we can get that to work, then the arrow_55 flag could go into a 0.9.1 without much trouble 🤞 |
Donator of the code here. What trouble does it cause? First I'm hearing of any issues. Just something with transitive dependencies? I didn't notice there was a new object store release, but happy to push a new release out asap for it. |
|
@Kimahriman the challenge with hdfs-native-object-store is not really about the code that's there or how it works, but the dependency treadmill that we can get trapped in. Every set of arrow/datafusion/object_store releases creates a juggling act that must be done before any new release can be cut. This can mean that in the cases of object_store incompatibilities the delta-rs project is blocked from releasing new code until hdfs-native-object-store is upgraded, a crate outside of our control. The crate does get updated fairly regularly which is helpful. I have prepared what I hope is an informative graphic for this situation (ref) 😺 With #873 I'm hoping that we can unlink delta-kernel-rs from hdfs-native-object-store as a hard linked dependency. |
|
Seems slightly dramatic considering the the big effort it takes for every arrow and datafusion major version bump 😛 , and I can have a new release out in like 30 minutes once I actually know there's a new object store version out there. I might make an issue on the object store repo seeing if they'd be willing to just merge it into their base to remove that as an issue completely. The whole thing is a very thin wrapper around my actual HDFS library, so it would only need updates for the API changes, and I could get a second set of eyes on how I've implemented the object_store semantics |
## What changes are proposed in this pull request? TLDR: add support for arrow 55 and drop support for arrow 53 (this let's us unpin our `chrono` dependency) Exhaustive list of changes: 1. New `pub` `FileSize` and `FileIndex` which are both `u64` now instead of `usize`. 2. removing `arrow_53` feature flag 3. removing `object_store` feature flag (now we just 'include it' in `default-engine`) 4. adding `arrow-55` feature flag 5. renamed `arrow_54` to `arrow-54` 6. adding respective `object_store` to each of `arrow_54` and `arrow_55` (and re-exporting all of object_store) 7. got rid of `cloud` feature flag: now `aws`, `azure`, and `gcp` are just on anytime `object_store` is pulled in 8. unpin `chrono` dep 9. handful of conditionally-compiled things to take care of breaking changes across arrow 54/55 ### This PR affects the following public APIs Adds new `arrow-55` flag (and renames `arrow_54` to `arrow-54`) and removes `arrow_53`, `cloud`, and `object_store`. Additionally re-exports all of `object_store` that's compatible with the version of arrow you've selected. Note there are two additional `pub` types (or rather aliases) now: `FileSize` and `FileIndex`. related: #831 ## How was this change tested? Existing
|
Thanks @zachschuermann |

This is part of testing delta.rs with DataFusion 37 (which will, among other things, use a new version of arrow-rs and object_store)
I don't really intend this PR to be merged as is -- it is simply enough modifications to get the code to compile to unblock me testing delta.rs
Notes for anyone who finds this PR:
Note this is based on the
v0.8.0tag