Contributor

@alamb alamb commented Sep 24, 2025

Which issue does this PR close?

Rationale for this change

In #8340 I am trying to separate three pieces of logic: the IO, locating the metadata within the file, and decoding Thrift into Rust structures.

I want to make it as easy as possible to review, so I split it into pieces; see #8340 for how it all fits together.

What changes are included in this PR?

This PR moves the code that parses the 8-byte Parquet file footer, FooterTail, into its own module and constructor.

Are these changes tested?

yes, by CI

Are there any user-facing changes?

No, this is an entirely internal reorganization, and I left a pub use so the public API is unchanged.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Sep 24, 2025
@alamb alamb force-pushed the alamb/extract_footer_tail_parsing branch from 999eb6a to 94a75ae Compare September 24, 2025 17:44
@alamb alamb force-pushed the alamb/extract_footer_tail_parsing branch from 94a75ae to 17f9972 Compare September 24, 2025 17:46
pub fn decode_footer_tail(slice: &[u8; FOOTER_SIZE]) -> Result<FooterTail> {
    let magic = &slice[4..];
Contributor Author

The code to parse the footer was previously a member function of ParquetMetadataReader, but I need to use it in the ParquetMetadataPushDecoder, so I pulled it into its own function.
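For readers unfamiliar with the footer layout, here is a minimal sketch of what a standalone footer-tail parser does. It is not the code from this PR: `FooterTailSketch` and `decode_footer_tail_sketch` are hypothetical names, and the error type is simplified to a `String`. It only assumes the Parquet format's layout, in which the last 8 bytes of a file are a 4-byte little-endian metadata length followed by a 4-byte magic (`PAR1`, or `PARE` when the footer is encrypted).

```rust
/// The Parquet footer tail is 8 bytes: 4-byte length + 4-byte magic.
pub const FOOTER_SIZE: usize = 8;

/// Hypothetical stand-in for the real `FooterTail` type (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FooterTailSketch {
    /// Length in bytes of the Thrift-encoded metadata preceding the tail.
    pub metadata_length: usize,
    /// True when the magic is `PARE`, i.e. the footer itself is encrypted.
    pub encrypted_footer: bool,
}

/// Hypothetical free function: decode the last 8 bytes of a Parquet file.
pub fn decode_footer_tail_sketch(
    slice: &[u8; FOOTER_SIZE],
) -> Result<FooterTailSketch, String> {
    // Bytes 4..8 hold the magic marker.
    let magic = [slice[4], slice[5], slice[6], slice[7]];
    let encrypted_footer = if &magic == b"PAR1" {
        false
    } else if &magic == b"PARE" {
        true
    } else {
        return Err(format!("invalid Parquet magic: {:?}", magic));
    };

    // Bytes 0..4 hold the metadata length as a little-endian u32.
    let len = u32::from_le_bytes([slice[0], slice[1], slice[2], slice[3]]);

    Ok(FooterTailSketch {
        metadata_length: len as usize,
        encrypted_footer,
    })
}
```

Because the function takes only a fixed-size byte array and performs no IO, both a pull-based reader and a push-based decoder could call it once they have the final 8 bytes in hand.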

};
#[cfg(feature = "encryption")]
use crate::thrift::{TCompactSliceInputProtocol, TSerializable};
pub use footer_tail::FooterTail;
Contributor Author

This maintains the same public API.
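As a rough illustration of why a re-export keeps callers working, here is a small self-contained sketch. The module layout and `old_caller` are hypothetical and much simpler than the real crate; the only point is that a type moved into a private submodule remains reachable at its old path through `pub use`. (`self::` is written out here only so the sketch compiles regardless of Rust edition.)

```rust
pub mod reader {
    // New, private home for the type (stands in for the extracted module).
    mod footer_tail {
        #[derive(Debug, PartialEq)]
        pub struct FooterTail {
            pub metadata_length: usize,
        }
    }

    // Re-export at the old location so `reader::FooterTail` still resolves.
    pub use self::footer_tail::FooterTail;
}

// Hypothetical downstream code written before the move: it names the type
// by its old path and compiles unchanged.
pub fn old_caller() -> reader::FooterTail {
    reader::FooterTail { metadata_length: 16 }
}
```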

@alamb alamb requested a review from etseidl September 24, 2025 18:23
Contributor

@etseidl etseidl left a comment

So much code for 8 bytes 😅

LGTM. Shouldn't impact the remodel much. Thanks!

Contributor Author

alamb commented Sep 24, 2025

> So much code for 8 bytes 😅
>
> LGTM. Shouldn't impact the remodel much. Thanks!

Well, to be fair, there are also a lot of license and doc lines :)

Member

@mbrobbel mbrobbel left a comment

Thanks @alamb

@alamb alamb merged commit b444ea7 into apache:main Sep 25, 2025
16 checks passed
Contributor Author

alamb commented Sep 25, 2025

Thanks again @etseidl and @mbrobbel

@alamb alamb deleted the alamb/extract_footer_tail_parsing branch September 25, 2025 21:23
Labels

parquet Changes to the parquet crate

3 participants