@b-hahn b-hahn commented Dec 10, 2025

This change handles the case where a bounding box predicted by a VLM has a width or height <= 0. Previously, such a box crashed the pipeline while saving the PIL image. Now, malformed bounding boxes are ignored: the extracted text is kept and the provenance field is left empty.

Fixes docling-project/docling#2763
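
To make the described behavior concrete, here is a minimal sketch of the kind of guard the description implies. It is not the code from this PR: `Box`, `valid_box_or_none`, and `build_item` are hypothetical names, and the dictionary layout is only an illustration of "keep the text, leave provenance empty".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """Hypothetical pixel-space box (left, top, right, bottom); not docling's own type."""
    l: float
    t: float
    r: float
    b: float


def valid_box_or_none(box: Box) -> Optional[Box]:
    """Return the box if both width and height are positive, otherwise None."""
    width = box.r - box.l
    height = box.b - box.t
    if width <= 0 or height <= 0:
        return None
    return box


def build_item(text: str, box: Box) -> dict:
    """Always keep the text; attach provenance only when the box is usable."""
    checked = valid_box_or_none(box)
    prov = [] if checked is None else [{"bbox": checked}]
    return {"text": text, "prov": prov}
```

With this guard, a degenerate box such as `Box(10, 10, 10, 20)` (zero width) never reaches any image-cropping or image-saving step, which is where the crash was reported.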

github-actions bot commented Dec 10, 2025

DCO Check Failed

Hi @b-hahn, your pull request has failed the Developer Certificate of Origin (DCO) check.

This repository supports remediation commits, so you can fix this without rewriting history — but you must follow the required message format.


🛠 Quick Fix: Add a remediation commit

Run this command:

git commit --allow-empty -s -m "DCO Remediation Commit for Benjamin Hahn <[email protected]>

I, Benjamin Hahn <[email protected]>, hereby add my Signed-off-by to this commit: 5e71eec8d01cd6d05caa470561962d2f8a9533a7"
git push

🔧 Advanced: Sign off each commit directly

For the latest commit:

git commit --amend --signoff
git push --force-with-lease

For multiple commits:

git rebase --signoff origin/main
git push --force-with-lease

More info: DCO check report

mergify bot commented Dec 10, 2025

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 Require two reviewers for test updates

This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
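
For checking a title locally before pushing, the pattern from the rule above can be tried with Python's `re` module. This is only a sketch: the helper name `is_conventional` and the example titles are made up for illustration, and Mergify's own matching may differ in details.

```python
import re

# Pattern copied verbatim from the merge-protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


# Hypothetical titles, for illustration only.
print(is_conventional("fix: ignore zero-area bounding boxes"))
print(is_conventional("Fix bounding boxes"))
```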

@b-hahn b-hahn force-pushed the main branch 3 times, most recently from ff062c3 to 1c3bd97 Compare December 10, 2025 20:17

codecov bot commented Dec 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

kklein commented Dec 11, 2025

@dolfim-ibm @cau-git:

I think that @b-hahn's push from yesterday would require another approval of the CI runs on your end. :)

@dolfim-ibm
Member

@b-hahn you identified the right function, i.e. load_from_doctags(), to fix the issue. As we discussed, instead of making an empty prov: [] we would like to drop the element completely if it has zero size.

@cau-git
Copy link
Contributor

cau-git commented Dec 11, 2025

@dolfim-ibm we agreed on the proposed solution here. It’s legit for some elements to have missing prov
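
The two options discussed in this thread differ only in what happens to an element whose box has no area. A rough sketch of the contrast, using a hypothetical item layout (`{"text": ..., "bbox": (l, t, r, b)}`) rather than docling's actual data model:

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # hypothetical (left, top, right, bottom)


def has_area(bbox: BBox) -> bool:
    """True only when both width and height are strictly positive."""
    l, t, r, b = bbox
    return (r - l) > 0 and (b - t) > 0


def keep_with_empty_prov(items: List[Dict]) -> List[Dict]:
    """Option A: keep every element; zero-size boxes yield prov == []."""
    return [
        {"text": it["text"], "prov": [it["bbox"]] if has_area(it["bbox"]) else []}
        for it in items
    ]


def drop_zero_size(items: List[Dict]) -> List[Dict]:
    """Option B: drop elements whose box has no area, text included."""
    return [
        {"text": it["text"], "prov": [it["bbox"]]}
        for it in items
        if has_area(it["bbox"])
    ]
```

Option A preserves the extracted text at the cost of elements without provenance; option B keeps every remaining element located on the page but discards the text of the malformed ones.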

Development

Successfully merging this pull request may close these issues.

SystemError: tile cannot extend outside image caused by VLM zero-area predictions (Integer Rounding Edge Case)