Closed
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -101,7 +101,7 @@ num-bigint = "0.4.6"
once_cell = "1.20"
opendal = "0.55.0"
ordered-float = "4"
-parquet = "57.0"
+parquet = "57.1.0"
Contributor

Why do we need to change this? We expect to keep the parquet version the same as arrow's.

Contributor Author

I didn't realize it initially, but `ParquetMetaDataReader::with_metadata_options`, which this change needs, was not added until 57.1.0. You can see the initial clippy failure here: https://github.com/apache/iceberg-rust/actions/runs/20211830111/job/58018793697#step:4:854

Given this, I am fine if you'd prefer to just close this PR for now. I assume upgrading Arrow also isn't a preferred solution for such a minor PR.

Also, sorry for forgetting to update the PR comment with these details!

Contributor

Thanks @lgingerich, I'd prefer to wait for the next round of arrow upgrades before adding this change.

Contributor Author

Sounds good, will close this.
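
The version bump discussed in this thread can be illustrated with a simplified sketch. It assumes Cargo's default caret semantics for a bare version string: `"57.0"` accepts any 57.x release, while `"57.1.0"` raises the minimum to the release that introduced `with_metadata_options`. The helpers `parse` and `caret_matches` are hypothetical, cover only requirements with a non-zero major version, and are not Cargo's actual resolver.

```rust
/// A (major, minor, patch) triple; pre-release tags are ignored in this sketch.
type Version = (u64, u64, u64);

/// Parse "57", "57.0" or "57.1.0" into a triple, filling missing parts with 0.
/// Hypothetical helper, not part of Cargo.
fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = match parts.next() {
        Some(m) => m?,
        None => 0,
    };
    let patch = match parts.next() {
        Some(p) => p?,
        None => 0,
    };
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Simplified caret matching for requirements whose major version is non-zero:
/// the candidate must share the major version and be at least the requirement.
/// Returns None for unparsable input or for 0.x requirements (different rules).
fn caret_matches(req: &str, candidate: &str) -> Option<bool> {
    let r = parse(req)?;
    let c = parse(candidate)?;
    if r.0 == 0 {
        return None;
    }
    Some(c.0 == r.0 && c >= r)
}

fn main() {
    // The old requirement admits a release that lacks the new method.
    println!("57.0 accepts 57.0.0: {:?}", caret_matches("57.0", "57.0.0"));
    // The new requirement rules that release out.
    println!("57.1.0 accepts 57.0.0: {:?}", caret_matches("57.1.0", "57.0.0"));
}
```

Under these assumptions, the old requirement still permits a 57.0.x build that lacks the method, which is why the PR needed to raise the floor.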

pilota = "0.11.10"
port_scanner = "0.1.5"
pretty_assertions = "1.4"
16 changes: 9 additions & 7 deletions crates/iceberg/src/arrow/reader.rs
@@ -1705,19 +1705,21 @@ impl<R: FileRead> AsyncFileReader for ArrowFileReader<R> {
         )
     }

-    // TODO: currently we don't respect `ArrowReaderOptions` cause it don't expose any method to access the option field
-    // we will fix it after `v55.1.0` is released in https://github.com/apache/arrow-rs/issues/7393
-    fn get_metadata(
-        &mut self,
-        _options: Option<&'_ ArrowReaderOptions>,
-    ) -> BoxFuture<'_, parquet::errors::Result<Arc<ParquetMetaData>>> {
+    fn get_metadata<'a>(
+        &'a mut self,
+        options: Option<&'a ArrowReaderOptions>,
+    ) -> BoxFuture<'a, parquet::errors::Result<Arc<ParquetMetaData>>> {
         async move {
-            let reader = ParquetMetaDataReader::new()
+            let mut reader = ParquetMetaDataReader::new()
                 .with_prefetch_hint(self.metadata_size_hint)
                 // Set the page policy first because it updates both column and offset policies.
                 .with_page_index_policy(PageIndexPolicy::from(self.preload_page_index))
                 .with_column_index_policy(PageIndexPolicy::from(self.preload_column_index))
                 .with_offset_index_policy(PageIndexPolicy::from(self.preload_offset_index));
+
+            if let Some(opts) = options {
+                reader = reader.with_metadata_options(Some(opts.metadata_options().clone()));
+            }
             let size = self.meta.size;
             let meta = reader.load_and_finish(self, size).await?;

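
The hunk above stops ignoring the caller's options and forwards them to the metadata reader only when they are present. A minimal, standard-library-only sketch of that pattern follows. Every type here (`MetadataOptions`, `ReaderOptions`, `MetadataReader`), the `skip_stats` field, and the `build_reader` function are hypothetical stand-ins invented for illustration; they are not the parquet crate's API and only mirror the shape of the calls in the diff.

```rust
// Hypothetical stand-in for the options carried by the caller; the field is made up.
#[derive(Clone, Debug, Default, PartialEq)]
struct MetadataOptions {
    skip_stats: bool,
}

// Hypothetical stand-in for the outer reader options that wrap the metadata options.
#[derive(Debug, Default)]
struct ReaderOptions {
    metadata_options: MetadataOptions,
}

impl ReaderOptions {
    fn metadata_options(&self) -> &MetadataOptions {
        &self.metadata_options
    }
}

// Hypothetical builder-style metadata reader.
#[derive(Debug, Default)]
struct MetadataReader {
    prefetch_hint: Option<usize>,
    metadata_options: Option<MetadataOptions>,
}

impl MetadataReader {
    fn new() -> Self {
        Self::default()
    }

    fn with_prefetch_hint(mut self, hint: Option<usize>) -> Self {
        self.prefetch_hint = hint;
        self
    }

    fn with_metadata_options(mut self, opts: Option<MetadataOptions>) -> Self {
        self.metadata_options = opts;
        self
    }
}

// Build the reader from fixed settings first, then layer on caller options if given.
fn build_reader(hint: Option<usize>, options: Option<&ReaderOptions>) -> MetadataReader {
    let mut reader = MetadataReader::new().with_prefetch_hint(hint);
    if let Some(opts) = options {
        // Clone because the builder takes ownership while the caller keeps its copy.
        reader = reader.with_metadata_options(Some(opts.metadata_options().clone()));
    }
    reader
}

fn main() {
    let opts = ReaderOptions {
        metadata_options: MetadataOptions { skip_stats: true },
    };
    println!("{:?}", build_reader(Some(4096), Some(&opts)));
    println!("{:?}", build_reader(Some(4096), None));
}
```

Rebinding `reader` inside the `if let` is why the diff changes `let reader` to `let mut reader`: the builder methods consume `self`, so the conditional step has to reassign the variable.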