vision_utils: clamp resize new_w/new_h to >=1 by danielhanchen · Pull Request #696 · unslothai/unsloth-zoo

danielhanchen · 2026-05-25T11:46:30Z

Summary

The integer-rounding formula in UnslothVisionDataCollator._resize_images_inplace can produce new_w = 0 or new_h = 0 for degenerate aspect ratios, causing PIL.resize to raise `height and width must be > 0`.
Repro: a 1024x1 image with resize=256, resize_dimension=\"max\" evaluates to new_h = (1 * 256 + 512) // 1024 = 0 and crashes.
The fix mirrors the max(1, ...) guard that already exists in Studio's MLX path (studio/backend/core/training/worker.py::_mlx_vlm_max_resized_size), bringing the two paths into parity.
Triggerable from Studio's new vision_image_size knob (Studio: expose image size setting in training UI unsloth#5743) when a dataset contains an image with one side small enough that the downscale ratio rounds it to 0. Realistic LaTeX_OCR / banner-style data (e.g. 1500x60, 5000x600) is unaffected; the crash only fires on truly degenerate inputs.

Test plan

Repro the crash with 1024x1 + resize=256 against current main.
Apply the patch; the same call now returns a 256x1 image.
Confirm aggregate matrix of 180 cases (9 caps x 25 weird shapes + boundary + multi-image + uncovered PIL modes + idempotence + real-world aspects) all pass with the patch.
Confirm realistic inputs (1500x60, 5000x600, 800x50) are byte-identical to pre-patch behavior.

The integer-rounding formula in _resize_images_inplace can produce new_w=0 or new_h=0 for inputs with extreme aspect ratios (e.g. 1024x1 with image_size=256: new_h = (1*256 + 512) // 1024 = 0). PIL.resize then raises "height and width must be > 0". This is reachable from Studio's vision_image_size knob when a dataset contains a degenerate image and the user picks a small cap. The fix mirrors the max(1, ...) guard already present in Studio's MLX resize helper (studio/backend/core/training/worker.py::_mlx_vlm_max_resized_size).

gemini-code-assist

Code Review

This pull request introduces a safety clamp to image resizing dimensions, ensuring they are at least 1 to prevent crashes in PIL.Image.resize when handling extreme aspect ratios. The reviewer suggested an optimization to cache the result of self.size_func(img) in a local variable to avoid redundant function calls and improve efficiency.

gemini-code-assist · 2026-05-25T11:51:43Z

+                new_w = max(1, (w * image_size + self.size_func(img) // 2) // self.size_func(img))
+                new_h = max(1, (h * image_size + self.size_func(img) // 2) // self.size_func(img))


While the max(1, ...) clamp correctly prevents a crash in PIL.Image.resize, the repeated calls to self.size_func(img) (up to 5 times per image in this block) are redundant. It is more efficient to store the result of self.size_func(img) in a local variable before calculating new_w and new_h.

current_size = self.size_func(img) new_w = max(1, (w * image_size + current_size // 2) // current_size) new_h = max(1, (h * image_size + current_size // 2) // current_size)

Pure refactor on top of the existing clamp; output byte-identical for every input that has worked since the collator landed. - Hoist is_tuple, snap_to_patch_size, factor out of the per-image loop. - Cache size_func(img) so it is not called 3x per image. No behavioral change. Same code paths, same numeric output, same exception surface.

After the clamp, surface a one-shot UserWarning when the resized image would have aspect_ratio > 200 (MAX_RATIO). Qwen2-VL / Qwen2.5-VL preprocessors reject such inputs in their own smart_resize; without this warning users only see the downstream crash and have no signal that the issue is a degenerate-aspect training image. Non-degenerate inputs are unaffected (warning is gated on the same MAX_RATIO check zoo's own smart_resize already enforces).

chatgpt-codex-connector · 2026-05-27T15:00:14Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

chatgpt-codex-connector · 2026-05-27T15:45:48Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

danielhanchen mentioned this pull request May 25, 2026

Studio: expose image size setting in training UI unslothai/unsloth#5743

Merged

gemini-code-assist Bot reviewed May 25, 2026

View reviewed changes

danielhanchen added 3 commits May 25, 2026 13:16

Tighten degenerate-aspect clamp comment

14ced56

Tighten comments in vision_utils._resize_images_inplace

4a026c1

danielhanchen merged commit ac1e962 into main May 27, 2026
15 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

vision_utils: clamp resize new_w/new_h to >=1#696

vision_utils: clamp resize new_w/new_h to >=1#696
danielhanchen merged 5 commits into
mainfrom
vision-resize-degenerate-clamp

danielhanchen commented May 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 25, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 27, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		new_w = max(1, (w * image_size + self.size_func(img) // 2) // self.size_func(img))
		new_h = max(1, (h * image_size + self.size_func(img) // 2) // self.size_func(img))

Conversation

danielhanchen commented May 25, 2026

Summary

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 25, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot commented May 27, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant