Skip to content

Support follow-up edits for generated images#5712

Merged
danielhanchen merged 28 commits into
unslothai:mainfrom
wasimysaid:wire-openai-image-generation-edits
May 26, 2026
Merged

Support follow-up edits for generated images#5712
danielhanchen merged 28 commits into
unslothai:mainfrom
wasimysaid:wire-openai-image-generation-edits

Conversation

@wasimysaid
Copy link
Copy Markdown
Collaborator

Summary

  • carry OpenAI image_generation_call ids through chat history
  • send prior image_generation_call refs on follow-up image generation turns
  • cover the request wiring and tool-event metadata in backend tests

Testing

  • cd studio/frontend && npm run typecheck
  • python3 -m py_compile studio/backend/core/inference/external_provider.py studio/backend/models/inference.py studio/backend/tests/test_openai_image_generation.py
  • ~/.unsloth/studio/unsloth_studio/bin/python -m pytest studio/backend/tests/test_openai_image_generation.py

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements support for OpenAI's image_generation_call references, allowing follow-up prompts to refine generated images without resending the full payload. Key changes include updating the backend to handle these call references in streaming responses and tool events, adding new Pydantic models for content parts, and modifying the frontend chat adapter to collect and forward these references in outbound messages. A potential issue was identified in the frontend where empty message content might bypass validation checks and lead to backend errors.

Comment thread studio/frontend/src/features/chat/api/chat-adapter.ts
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6045257e35

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread studio/frontend/src/features/chat/provider-capabilities.ts
@chatgpt-codex-connector
Copy link
Copy Markdown

💡 Codex Review

if (pendingImageEditReferenceForRun) {
runtime.clearPendingImageEditReference();

P2 Badge Preserve pending image edit reference until request can run

pendingImageEditReference is cleared before any capability checks run, but the same function can immediately abort afterward (for example when imageGenerationEnabledForThisTurn is false and we throw "Image generation edit unavailable"). In that path, the selected image reference is lost before the user can retry, so the follow-up send no longer carries the explicit target image and can degrade into editing the wrong/latest image (or a fresh generation). Clear this state only after the request has passed validation (or after a successful dispatch), not at function entry.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 328816ccea

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread studio/frontend/src/features/chat/api/chat-adapter.ts Outdated
Comment thread studio/frontend/src/features/chat/api/chat-adapter.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9ee5c9be8d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread studio/frontend/src/features/chat/api/chat-adapter.ts Outdated
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8db2f5eb07

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread studio/backend/core/inference/external_provider.py Outdated
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

1 similar comment
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.

@danielhanchen danielhanchen merged commit f364b08 into unslothai:main May 26, 2026
30 checks passed
rhsCZ pushed a commit to rhsCZ/unsloth that referenced this pull request May 26, 2026
Main moved forward 17 commits during PR review (latest: 953c8bf). Real
conflicts in five files; resolved by combining both branches' changes.

studio/backend/core/inference/external_provider.py
- Add fast_mode (Anthropic Opus 4.6/4.7 speed flag, unslothai#5715) to
  stream_chat_completion and Anthropic-branch call site, alongside
  existing Gemini tools/tool_choice forwarding.
- Add _openai_image_generation_tool() helper (action:"edit" for follow-
  up image edits, unslothai#5712) and use it inside the existing
  _responses_hosted_builtins_allowed gate so the forced-function /
  tool_choice="none" suppression added in rounds 21+ still applies.
- Keep Anthropic web_fetch gated on _anthropic_hosted_builtins_allowed
  (round 19+ hosted-builtin gate) while taking main's per-model
  version selector (web_fetch_20260209 vs _20250910).

studio/backend/routes/inference.py
- Add `openai = provider_type == "openai"` (used by main's reasoning
  content forwarding for follow-up image edits).
- Keep the round 25/26 Gemini filter chain (_filter_tool_calls drops
  synthetic server-builtin cards, marks tc_id so the matching
  role="tool" follow-up gets skipped, extra_content gated to native
  Gemini host).
- Forward fast_mode alongside tools/tool_choice.

studio/backend/tests/test_openai_image_generation.py
- Combine assertions: both _server_tool: True (PR) and
  openai_image_generation_call_id (main) are present on the tool_start
  arguments.

studio/frontend/src/features/chat/shared-composer.tsx
- Add supportsBuiltinWebFetch declaration (separate Fetch pill from
  unslothai#5742) before the PR's isExternalGemini constant so both the Gemini
  image-tier gating and the standalone Anthropic Fetch pill compile.

studio/frontend/src/features/chat/api/chat-adapter.ts
- Add main's normalizeOpenAIReasoningItem, toOpenAIImageEditReferenceMessage,
  isAnthropicRefusalMessage helpers alongside PR's collectAssistantToolCalls,
  collectToolResultMessages, SerializedMessage, collectAssistantTextThoughtSignature.
- toOpenAIMessages (PR) now also early-returns on isAnthropicRefusalMessage
  so refused turns get pruned from outbound history.
- Add a thin toOpenAIMessage (singular) wrapper for the OpenAI image-
  edit replay path's flat .map() usage.
- Merge per-turn enable flags: keep PR's imageGenerationEnabledForThisTurn,
  geminiImageModeForThisTurn, codeExecEnabledForThisTurn !geminiImageMode
  gate; take main's webFetchEnabledForThisTurn (sourced from independent
  webFetchToolsEnabled pill state).
- Outbound build chains main's anthropic_refusal survivingMessages prune,
  then flatMap(toOpenAIMessages) (PR), then PR's selectedImageEditReference
  reference message prepend; image-edit unavailable toast from main fires
  before any of that when the pill is off.
- tool_end merge: do main's nextArgs spread first, then PR's Gemini
  native_part parts concat so both OpenAI image-call ids and Gemini
  executableCode/codeExecutionResult/inlineData round-trip.
- Cumulative + final yields: orderAssistantContent(pinTextThoughtSignature(...))
  composes main's tool-vs-text ordering with PR's per-text thoughtSignature pin.

Tests: gemini provider 148/148; openai_responses_translation + openai_code_execution
+ openai_image_generation + anthropic_code_execution + anthropic_web_fetch +
external_provider_usage_chunk + providers_api: 50 passed, 42 skipped; main's
new anthropic_fast_mode + citations + openai_citation_markers + openai_tool_result_fallbacks
suites all 43/43.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants