chore(deps): Bump aws-sdk-glue from 1.125.0 to 1.126.0 #1812
Merged
Bumps [aws-sdk-glue](https://github.com/awslabs/aws-sdk-rust) from 1.125.0 to 1.126.0.
- [Release notes](https://github.com/awslabs/aws-sdk-rust/releases)
- [Commits](https://github.com/awslabs/aws-sdk-rust/commits)

---
updated-dependencies:
- dependency-name: aws-sdk-glue
  dependency-version: 1.126.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
liurenjie1024 approved these changes on Nov 3, 2025
gbrgr added a commit to RelationalAI/iceberg-rust that referenced this pull request on Nov 3, 2025
* fix(reader): fix position delete bugs with row group skipping (apache#1806)

## Which issue does this PR close?

Partially addresses apache#1749.

## What changes are included in this PR?

This PR fixes two related correctness bugs in `ArrowReader::build_deletes_row_selection()` where position deletes targeting rows in skipped or skipped-to row groups were not being applied correctly.

### Background: How These Bugs Were Discovered

While running Apache Spark + Apache Iceberg integration tests through DataFusion Comet, we discovered that the following tests were failing or hanging:

- org.apache.iceberg.spark.extensions.TestMergeOnReadMerge
- org.apache.iceberg.spark.extensions.TestMergeOnReadDelete
- org.apache.iceberg.spark.extensions.TestMergeOnReadUpdate

Investigation revealed that recent work to support Iceberg's file splitting feature (via `filter_row_groups_by_byte_range()`) exposed latent bugs in the position delete logic. While the byte range filtering code itself is correct, it exercises code paths that were previously untested, revealing these pre-existing issues.

#### Bug 1: Missing base index increment when skipping row groups

**The Issue:** When processing a Parquet file with multiple row groups, if a position delete targets a row in a later row group, the function would skip row groups without deletes but fail to increment `current_row_group_base_idx`. This caused the row index tracking to become desynchronized.

**Example scenario:**

- File with 2 row groups: rows 0-99 (group 0) and rows 100-199 (group 1)
- Position delete targets row 199 (last row in group 1)
- When processing group 0: delete (199) is beyond the group's range, so code hits `continue` at lines 469-471
- BUG: `current_row_group_base_idx` is NOT incremented, stays at 0
- When processing group 1: code thinks rows start at 0 instead of 100
- Delete at position 199 is never applied (thinks file only has rows 0-99)

**The Fix:** Add `current_row_group_base_idx += row_group_num_rows` before the two `continue` statements at lines ~470 and ~481. This ensures row index tracking stays synchronized when skipping row groups.

#### Bug 2: Stale cached delete index when skipping unselected row groups

**The Issue:** When row group selection is active (e.g., via byte range filtering for file splits) and an unselected row group is skipped, the cached `next_deleted_row_idx_opt` variable can become stale, leading to either lost deletes or infinite loops depending on the scenario.

The function maintains a cached value (`next_deleted_row_idx_opt`) containing the next delete to apply. When skipping unselected row groups, it calls `delete_vector_iter.advance_to(next_row_group_base_idx)` to position the iterator, but this doesn't automatically update the cached variable.

**Two problematic scenarios:**

1. Stale cache causes infinite loop (the bug we hit):
   - File with 2 row groups: rows 0-99 (group 0) and rows 100-199 (group 1)
   - Position delete at row 0 (in group 0)
   - Row group selection: read ONLY group 1
   - Initial state: `next_deleted_row_idx_opt = Some(0)` (cached)
   - Skip group 0: `advance_to(100)` positions iterator past delete at 0
   - BUG: cached value still `Some(0)` - STALE!
   - Process group 1: loop condition `0 < 200` is `true`, but `current_idx (100) != next_deleted_row_idx (0)`, so neither branch executes, which can result in an infinite loop
2. Unconditionally calling `next()` loses deletes:
   - File with 2 row groups: rows 0-99 (group 0) and rows 100-199 (group 1)
   - Position delete at row 199 (in group 1)
   - Row group selection: read ONLY group 1
   - Initial state: `next_deleted_row_idx_opt = Some(199)` (cached, already correct!)
   - Skip group 0: `advance_to(100)` - iterator already positioned correctly
   - If we call `next()`: BUG - consumes delete at 199, advancing past it
   - Process group 1: iterator exhausted, delete is lost

**The Fix:**

- If `cached value < next_row_group_base_idx` (stale), update it, thus avoiding the infinite loop
- If `cached value >= next_row_group_base_idx` (still valid), keep it, thus preserving the delete

## Are these changes tested?

Yes. This PR adds three unit tests in reader.rs:

1. `test_position_delete_across_multiple_row_groups` - Tests bug 1 (missing base index increment)
2. `test_position_delete_with_row_group_selection` - Tests bug 2 scenario where delete is in selected group
3. `test_position_delete_in_skipped_row_group` - Tests bug 2 scenario where delete is in skipped group (would hang without fix)

Additionally, these fixes resolve failures in Iceberg Java's spark-extension tests when running with DataFusion Comet’s PR apache/datafusion-comet#2528:

- org.apache.iceberg.spark.extensions.TestMergeOnReadMerge
- org.apache.iceberg.spark.extensions.TestMergeOnReadDelete
- org.apache.iceberg.spark.extensions.TestMergeOnReadUpdate

* feat(datafusion): implement the partitioning node for DataFusion to define the partitioning (apache#1620)

## Which issue does this PR close?

- Closes apache#1543

## What changes are included in this PR?

Implement a physical execution repartition node that determines the relevant DataFusion partitioning strategy based on the Iceberg table schema and metadata.

1. Unpartitioned tables: Uses round-robin partitioning
2. Partitioned tables: It depends on the transform type:
   - Identity or Bucket transforms: Uses hash partitioning on the _partition column
   - Temporal transforms (Year, Month, Day, Hour): Uses round-robin partitioning

_Minor change: I created a new `schema_ref()` helper method._

## Are these changes tested?

Yes, with unit tests

---------

Signed-off-by: Florian Valeye <[email protected]>

* feat(reader): Date32 from days since epoch for Literal:try_from_json (apache#1803)

* chore(deps): Bump aws-sdk-glue from 1.125.0 to 1.126.0 (apache#1812)

Bumps [aws-sdk-glue](https://github.com/awslabs/aws-sdk-rust) from 1.125.0 to 1.126.0.

<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a href="https://github.com/awslabs/aws-sdk-rust/commits">compare view</a></li>
</ul>
</details>
<br />

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* chore(deps): Bump astral-sh/setup-uv from 6 to 7 (apache#1811)

Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 6 to 7.

<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/astral-sh/setup-uv/releases">astral-sh/setup-uv's releases</a>.</em></p>
<blockquote>
<h2>v7.0.0 🌈 node24 and a lot of bugfixes</h2>
<h2>Changes</h2>
<p>This release comes with a load of bug fixes and a speed up. Because of switching from node20 to node24 it is also a breaking change. If you are running on GitHub hosted runners this will just work, if you are using self-hosted runners make sure, that your runners are up to date. If you followed the normal installation instructions your self-hosted runner will keep itself updated.</p>
<p>This release also removes the deprecated input <code>server-url</code> which was used to download uv releases from a different server. The <a href="https://github.com/astral-sh/setup-uv?tab=readme-ov-file#manifest-file">manifest-file</a> input supersedes that functionality by adding a flexible way to define available versions and where they should be downloaded from.</p>
<h3>Fixes</h3>
<ul>
<li>The action now respects when the environment variable <code>UV_CACHE_DIR</code> is already set and does not overwrite it. It now also finds <a href="https://docs.astral.sh/uv/reference/settings/#cache-dir">cache-dir</a> settings in config files if you set them.</li>
<li>Some users encountered problems that <a href="https://github.com/astral-sh/setup-uv?tab=readme-ov-file#disable-cache-pruning">cache pruning</a> took forever because they had some <code>uv</code> processes running in the background. Starting with uv version <code>0.8.24</code> this action uses <code>uv cache prune --ci --force</code> to ignore the running processes</li>
<li>If you just want to install uv but not have it available in path, this action now respects <code>UV_NO_MODIFY_PATH</code></li>
<li>Some other actions also set the env var <code>UV_CACHE_DIR</code>. This action can now deal with that but as this could lead to unwanted behavior in some edgecases a warning is now displayed.</li>
</ul>
<h3>Improvements</h3>
<p>If you are using minimum version specifiers for the version of uv to install for example</p>
<pre lang="toml"><code>[tool.uv]
required-version = ">=0.8.17"
</code></pre>
<p>This action now detects that and directly uses the latest version. Previously it would download all available releases from the uv repo to determine the highest matching candidate for the version specifier, which took much more time.</p>
<p>If you are using other specifiers like <code>0.8.x</code> this action still needs to download all available releases because the specifier defines an upper bound (not 0.9.0 or later) and "latest" would possibly not satisfy that.</p>
<h2>🚨 Breaking changes</h2>
<ul>
<li>Use node24 instead of node20 <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/608">#608</a>)</li>
<li>Remove deprecated input server-url <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/607">#607</a>)</li>
</ul>
<h2>🐛 Bug fixes</h2>
<ul>
<li>Respect UV_CACHE_DIR and cache-dir <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/612">#612</a>)</li>
<li>Use --force when pruning cache <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/611">#611</a>)</li>
<li>Respect UV_NO_MODIFY_PATH <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/603">#603</a>)</li>
<li>Warn when <code>UV_CACHE_DIR</code> has changed <a href="https://github.com/jamesbraza"><code>@jamesbraza</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/601">#601</a>)</li>
</ul>
<h2>🚀 Enhancements</h2>
<ul>
<li>Shortcut to latest version for minimum version specifier <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/598">#598</a>)</li>
</ul>
<h2>🧰 Maintenance</h2>
<ul>
<li>Bump dependencies <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/613">#613</a>)</li>
<li>Fix test-uv-no-modify-path <a href="https://github.com/eifinger"><code>@eifinger</code></a> (<a href="https://github.com/astral-sh/setup-uv/issues/604">#604</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/astral-sh/setup-uv/commit/85856786d1ce8acfbcc2f13a5f3fbd6b938f9f41"><code>8585678</code></a> Bump dependencies (<a href="https://github.com/astral-sh/setup-uv/issues/664">#664</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/22d500a65c34c827ebc95094040adcee254e92fa"><code>22d500a</code></a> Bump github/codeql-action from 4.30.8 to 4.30.9 (<a href="https://github.com/astral-sh/setup-uv/issues/652">#652</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/14d557131df7147286f7ce93a5b6b0189f8e1bc3"><code>14d5571</code></a> chore: update known checksums for 0.9.5 (<a href="https://github.com/astral-sh/setup-uv/issues/663">#663</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/29cd2350cd44d155ed44d0eba0e1b63d07fb3b69"><code>29cd235</code></a> Use tar for extracting the uv zip file on Windows too (<a href="https://github.com/astral-sh/setup-uv/issues/660">#660</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/2ddd2b9cb38ad8efd50337e8ab201519a34c9f24"><code>2ddd2b9</code></a> chore: update known checksums for 0.9.4 (<a href="https://github.com/astral-sh/setup-uv/issues/651">#651</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/b7bf78939d77607a9ccb489e4ec4651ba1092d5c"><code>b7bf789</code></a> Fix "lowest" resolution strategy with lower-bound only (<a href="https://github.com/astral-sh/setup-uv/issues/649">#649</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/cb6c0a53d9c61608defba05145184489d20183b2"><code>cb6c0a5</code></a> Change version in docs to v7 (<a href="https://github.com/astral-sh/setup-uv/issues/647">#647</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/dffc6292f2060d80116faf1baee66598a67f042c"><code>dffc629</code></a> Use working-directory to detect empty workdir (<a href="https://github.com/astral-sh/setup-uv/issues/645">#645</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/6e346e1653b720be5aaa194026b82bdef65869c7"><code>6e346e1</code></a> chore: update known checksums for 0.9.3 (<a href="https://github.com/astral-sh/setup-uv/issues/644">#644</a>)</li>
<li><a href="https://github.com/astral-sh/setup-uv/commit/3ccd0fd498ef6303a98d4125859aae05eedf6294"><code>3ccd0fd</code></a> Bump github/codeql-action from 4.30.7 to 4.30.8 (<a href="https://github.com/astral-sh/setup-uv/issues/639">#639</a>)</li>
<li>Additional commits viewable in <a href="https://github.com/astral-sh/setup-uv/compare/v6...v7">compare view</a></li>
</ul>
</details>
<br />

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Renjie Liu <[email protected]>

* chore(deps): Bump crate-ci/typos from 1.38.1 to 1.39.0 (apache#1810)

---------

Signed-off-by: Florian Valeye <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Matt Butrovich <[email protected]>
Co-authored-by: Florian Valeye <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Renjie Liu <[email protected]>
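
For readers following the position delete fix (apache#1806) in the commit above, here is a minimal sketch of the two fixes as the commit message describes them. It is not the code from `ArrowReader::build_deletes_row_selection()`: `Run`, `DeleteIter`, `push_run`, and `build_row_selection` are hypothetical stand-ins, and only the variable names `current_row_group_base_idx`, `next_row_group_base_idx`, `row_group_num_rows`, and `next_deleted_row_idx_opt` are borrowed from the description. It also assumes delete positions are absolute row indices within the file.

```rust
/// A run of consecutive rows that is read (`skip == false`) or skipped.
/// Hypothetical stand-in for a Parquet row selector.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Run {
    pub skip: bool,
    pub rows: u64,
}

/// Hypothetical stand-in for the delete-vector iterator.
pub struct DeleteIter {
    positions: Vec<u64>,
    cursor: usize,
}

impl DeleteIter {
    pub fn new(mut positions: Vec<u64>) -> Self {
        positions.sort_unstable();
        positions.dedup();
        Self { positions, cursor: 0 }
    }

    /// Discards every pending position smaller than `row_idx`.
    pub fn advance_to(&mut self, row_idx: u64) {
        while self.positions.get(self.cursor).map_or(false, |&p| p < row_idx) {
            self.cursor += 1;
        }
    }
}

impl Iterator for DeleteIter {
    type Item = u64;
    fn next(&mut self) -> Option<u64> {
        let pos = self.positions.get(self.cursor).copied()?;
        self.cursor += 1;
        Some(pos)
    }
}

/// Appends a run, merging it with the previous run of the same kind.
fn push_run(runs: &mut Vec<Run>, skip: bool, rows: u64) {
    if rows == 0 {
        return;
    }
    if let Some(last) = runs.last_mut() {
        if last.skip == skip {
            last.rows += rows;
            return;
        }
    }
    runs.push(Run { skip, rows });
}

pub fn build_row_selection(
    row_group_sizes: &[u64],
    selected_row_groups: Option<&[usize]>,
    deletes: Vec<u64>,
) -> Vec<Run> {
    let mut delete_iter = DeleteIter::new(deletes);
    // Cached "next delete to apply", as in the description above.
    let mut next_deleted_row_idx_opt = delete_iter.next();
    let mut current_row_group_base_idx = 0u64;
    let mut runs = Vec::new();

    for (idx, &row_group_num_rows) in row_group_sizes.iter().enumerate() {
        let next_row_group_base_idx = current_row_group_base_idx + row_group_num_rows;
        let selected = selected_row_groups.map_or(true, |sel| sel.contains(&idx));

        if !selected {
            delete_iter.advance_to(next_row_group_base_idx);
            // Bug 2 fix: refresh the cache only when it is stale. A cached
            // value at or beyond the next base is still valid and is kept.
            if next_deleted_row_idx_opt.map_or(false, |d| d < next_row_group_base_idx) {
                next_deleted_row_idx_opt = delete_iter.next();
            }
            current_row_group_base_idx = next_row_group_base_idx;
            continue;
        }

        let has_delete = next_deleted_row_idx_opt.map_or(false, |d| d < next_row_group_base_idx);
        if !has_delete {
            push_run(&mut runs, false, row_group_num_rows);
            // Bug 1 fix: advance the base index before `continue`.
            current_row_group_base_idx = next_row_group_base_idx;
            continue;
        }

        let mut current_idx = current_row_group_base_idx;
        while current_idx < next_row_group_base_idx {
            match next_deleted_row_idx_opt {
                Some(d) if d < next_row_group_base_idx => {
                    push_run(&mut runs, false, d - current_idx); // rows before the delete
                    push_run(&mut runs, true, 1); // the deleted row itself
                    current_idx = d + 1;
                    next_deleted_row_idx_opt = delete_iter.next();
                }
                _ => {
                    push_run(&mut runs, false, next_row_group_base_idx - current_idx);
                    current_idx = next_row_group_base_idx;
                }
            }
        }
        current_row_group_base_idx = next_row_group_base_idx;
    }
    runs
}
```

With two row groups of 100 rows each, calling `build_row_selection(&[100, 100], None, vec![199])` exercises the Bug 1 path, while passing `Some(&[1usize][..])` as the selection with a delete at row 0 or row 199 exercises the two Bug 2 scenarios. In this sketch the stale-cache case terminates and the delete at 199 is still applied.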