Remove unnecessary isinstance(part, dict) checks in image extraction#5439
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c02b86c694
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 5775033. Configure here.
| if isinstance(msg.get("content"), list): | ||
| for part in msg["content"]: | ||
| if isinstance(part, dict) and part.get("type") == "image": | ||
| if part["type"] == "image": |
There was a problem hiding this comment.
Missed fourth instance of the same guard pattern
Low Severity
The PR updates three of the four isinstance(part, dict) and part.get("type") == "image" checks but misses the identical pattern at grpo_trainer.py line 1660 inside the tool-call result handling. This leaves an inconsistency within the same file, violating the project rule that duplicated logic blocks must stay aligned and that not propagating a change across all copies is a bug.
Additional Locations (2)
Triggered by project rule: ../.ai/AGENTS.md
Reviewed by Cursor Bugbot for commit 5775033. Configure here.


Remove redundant
isinstance(part, dict)guards introduced in #5323 when checking for image parts in structured content lists.When
message["content"]is a list, elements are always dicts ({"type": "text", ...}, {"type": "image", ...}). If they aren't, that's a bug we want to surface via an error, not silently skip.Note
Medium Risk
Slight behavior change: malformed
contentlists will now raise (e.g.,TypeError/KeyError) instead of being silently ignored, which could surface previously hidden dataset/tool formatting issues during training.Overview
Simplifies image extraction in
GRPOTrainerandRLOOTrainerby removing redundantisinstance(part, dict)/part.get("type")checks and directly indexingpart["type"]when scanning structuredcontentblocks.This makes multimodal prompt and tool-response handling fail fast if
contentlist elements are not the expected dict shape (e.g., missing"type"/"image").Reviewed by Cursor Bugbot for commit 5775033. Bugbot is set up for automated code reviews on this repo. Configure here.