feat: Include statistics for Reserved Fields #1849

Fokko · 2025-11-12T18:29:13Z

This is a behavioral change.

In Iceberg-Rust we require the field IDs of upper/lower bounds to be part of the schema. That does not always hold, for example for reserved fields, which never appear in a table schema. In PyIceberg some tests expect these values:

FAILED tests/integration/test_inspect_table.py::test_inspect_files[2] - AssertionError: Difference in column lower_bounds: {} != {2147483546: b's3://warehouse/default/table_metadata_files/data/00000-0-8d621c18-079b-4217-afd8-559ce216e875.parquet', 2147483545: b'\x00\x00\x00\x00\x00\x00\x00\x00'}
assert {} == {2147483545: ...e875.parquet'}
  Right contains 2 more items:
  {2147483545: b'\x00\x00\x00\x00\x00\x00\x00\x00',
   2147483546: b's3://warehouse/default/table_metadata_files/data/00000-0-8d621c1'
               b'8-079b-4217-afd8-559ce216e875.parquet'}
  Full diff:
    {
  +  ,
  -  2147483545: b'\x00\x00\x00\x00\x00\x00\x00\x00',
  -  2147483546: b's3://warehouse/default/table_metadata_files/data/00000-0-8d621c1'
  -              b'8-079b-4217-afd8-559ce216e875.parquet',
    }
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
==== 1 failed, 238 passed, 32 skipped, 3123 deselected in 61.56s (0:01:01) =====

This is a positional delete where the field-IDs are constant, but never part of a schema (they are reserved).
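
To make the intended behavior concrete, here is a minimal sketch (not the actual change in `_serde.rs`). It assumes the bounds are held as a map from field ID to serialized bytes. The helper names `is_reserved_field_id` and `retain_bounds`, and the constant name for the file-path ID, are hypothetical. The two IDs are the ones from the failing test above.

```rust
use std::collections::{HashMap, HashSet};

// Reserved field IDs of a position delete file, matching the IDs in the
// failing test above. The name of the file-path constant is an assumption.
const RESERVED_FIELD_ID_DELETE_FILE_PATH: i32 = i32::MAX - 101; // 2147483546
const RESERVED_FIELD_ID_DELETE_FILE_POS: i32 = i32::MAX - 102; // 2147483545

/// Hypothetical helper: true for the reserved IDs that never appear in a
/// table schema but can still carry statistics.
fn is_reserved_field_id(id: i32) -> bool {
    matches!(
        id,
        RESERVED_FIELD_ID_DELETE_FILE_PATH | RESERVED_FIELD_ID_DELETE_FILE_POS
    )
}

/// Hypothetical helper: keep a bound when its field ID is in the schema
/// or is one of the reserved IDs, and drop everything else.
fn retain_bounds(
    bounds: HashMap<i32, Vec<u8>>,
    schema_field_ids: &HashSet<i32>,
) -> HashMap<i32, Vec<u8>> {
    bounds
        .into_iter()
        .filter(|(id, _)| schema_field_ids.contains(id) || is_reserved_field_id(*id))
        .collect()
}
```

With a filter like this, the `lower_bounds` of the position delete file in the test would keep both reserved entries instead of coming back empty.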

Which issue does this PR close?

Closes #.

What changes are included in this PR?

Are these changes tested?

kevinjqliu

LGTM! This aligns with both pyiceberg and spark behavior

kevinjqliu · 2025-11-12T19:32:47Z

manually retriggering ci runs, due to #1838 😞

crates/iceberg/src/spec/manifest/_serde.rs

They don't show up in the table schema, but can be important for query optimization
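
As an illustration of why these bounds matter for query optimization, a reader could use the file-path bounds of a position delete file to skip delete files that cannot reference a given data file. This is a sketch under assumptions, not code from this PR: the function name `delete_file_may_apply` is hypothetical, and the bounds are assumed to be the UTF-8 bytes of the path, as in the test output above.

```rust
/// Hypothetical check: can a position delete file with the given file-path
/// bounds contain deletes for `data_file_path`?
/// A missing bound is treated as "unknown", so the delete file is kept.
fn delete_file_may_apply(
    data_file_path: &str,
    lower: Option<&[u8]>,
    upper: Option<&[u8]>,
) -> bool {
    let path = data_file_path.as_bytes();
    // Byte slices compare lexicographically, which matches the ordering
    // of UTF-8 encoded strings.
    let above_lower = lower.map_or(true, |lo| path >= lo);
    let below_upper = upper.map_or(true, |hi| path <= hi);
    above_lower && below_upper
}
```

The check is conservative: it only rules a delete file out when the data file path falls strictly outside the recorded range.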

kevinjqliu

LGTM!!

should we update the PR title/description to reflect the new changes?

kevinjqliu · 2025-12-12T05:07:15Z

crates/iceberg/src/metadata_columns.rs

+
+/// Reserved field ID for the spec ID (_spec_id) column per Iceberg spec
+pub const RESERVED_FIELD_ID_SPEC_ID: i32 = i32::MAX - 4;
+


nit: add _partition here for completeness?

kevinjqliu · 2025-12-12T05:07:53Z

crates/iceberg/src/metadata_columns.rs

+
+/// Reserved field ID for the position in position delete files
+pub const RESERVED_FIELD_ID_DELETE_FILE_POS: i32 = i32::MAX - 102;
+


nit: add row for completeness?

I left out row on purpose, since it is a struct that corresponds with the table schema. row can be used to decide if a positional delete is relevant for your query (since you collect statistics for the rows that are dropped), but I don't think any engine leverages that today. There was even a thread on the dev-list to deprecate this functionality.

crates/iceberg/src/spec/manifest/_serde.rs

Co-authored-by: Kevin Liu <[email protected]>

kevinjqliu

LGTM!

looks like the ci is stuck, i retriggered it

liurenjie1024

Thanks @Fokko , LGTM!

Fokko · 2025-12-15T11:02:10Z

Thanks @kevinjqliu and @liurenjie1024 for checking 🙌

Fokko mentioned this pull request Nov 12, 2025

[epic] address manifest reader feature gaps between rust and python implementations #1714

Closed

10 tasks

kevinjqliu approved these changes Nov 12, 2025

kevinjqliu mentioned this pull request Nov 12, 2025

Tracking issues of Iceberg Rust 0.8 Release #1850

Closed

17 tasks

liurenjie1024 reviewed Nov 13, 2025

crates/iceberg/src/spec/manifest/_serde.rs (outdated)

Return statistics for field-IDs

6e32474

They don't show up in the table schema, but can be important for query optimization

Fokko force-pushed the fd-include-statistics-that-are-not-part-of-the-schema branch from 13f25cd to 6e32474 Compare December 11, 2025 17:27

cargo fmt

489989a

kevinjqliu approved these changes Dec 12, 2025

Fokko changed the title from "feat: Don't drop additional statistics" to "feat: Include statistics for Reserved Fields" Dec 12, 2025

Fokko and others added 4 commits December 12, 2025 07:57

Less is more, thanks Kevin!

518f63e

Co-authored-by: Kevin Liu <[email protected]>

cargo fmt

26a9392

Make clippy happy

5feaa96

Add _partition as well

af99db1

kevinjqliu approved these changes Dec 12, 2025

kevinjqliu requested a review from liurenjie1024 December 12, 2025 16:14

liurenjie1024 approved these changes Dec 15, 2025

liurenjie1024 added 2 commits December 15, 2025 17:58

Merge branch 'main' into fd-include-statistics-that-are-not-part-of-the-schema

d736a31

Merge branch 'main' into fd-include-statistics-that-are-not-part-of-the-schema

43558ea

liurenjie1024 merged commit b047baa into apache:main Dec 15, 2025
17 checks passed

Fokko deleted the fd-include-statistics-that-are-not-part-of-the-schema branch December 15, 2025 11:02
