
refactor: move experimental rollout engines to canonical rllm/engine/rollout/ #488

Closed
listar2000 wants to merge 2 commits into main from refactor/move-experimental-rollouts

Conversation

Collaborator

@listar2000 listar2000 commented Apr 5, 2026

Summary

  • Graduate rollout engines from rllm/experimental/rollout/ to rllm/engine/rollout/ (RolloutEngine, VerlEngine, TinkerEngine, types, completer)
  • Update 24+ consumer files to import from the canonical location
  • Add backward-compat shim in rllm/experimental/rollout/__init__.py with DeprecationWarning
  • Rename .validate to .is_validation across rollout engines (4 files, 8 occurrences)
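
For reference, the backward-compat shim pattern can be demonstrated with a self-contained toy (module names below are stand-ins, not rLLM's actual files):

```python
import sys
import types
import warnings

# Toy stand-in for the canonical module; names are illustrative only.
canonical = types.ModuleType("toy_canonical_rollout")
canonical.RolloutEngine = type("RolloutEngine", (), {})
sys.modules["toy_canonical_rollout"] = canonical

# Body of the shim module, shaped like what a deprecated
# __init__.py re-exporting from the canonical location might contain.
shim_src = """
import warnings

warnings.warn(
    "this path is deprecated; import from the canonical location instead",
    DeprecationWarning,
    stacklevel=2,
)

from toy_canonical_rollout import RolloutEngine  # re-export
"""

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    shim = types.ModuleType("toy_experimental_rollout")
    exec(shim_src, shim.__dict__)

# The old import path still resolves, and the warning fired.
assert shim.RolloutEngine is canonical.RolloutEngine
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```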

Test plan

  • pytest tests/engine/ passes
  • pytest tests/parser/ passes
  • Importing from rllm.experimental.rollout still works (with deprecation warning)
  • End-to-end GPU test with Tinker backend
  • End-to-end GPU test with Verl backend

🤖 Generated with Claude Code

…rollout/

The experimental rollout engines (RolloutEngine, ModelOutput, TinkerEngine,
VerlEngine, Completer, TITOCompleter, types) are now the canonical
implementations in rllm/engine/rollout/. The experimental versions were
strict supersets of the non-experimental ones, with additional support for
TITO (token-in-token-out), weight version tracking, and training pipeline
integration.

Key changes:
- Move all engine classes and types from rllm/experimental/rollout/ to
  rllm/engine/rollout/ (the canonical location)
- Add backward-compat shim in rllm/experimental/rollout/__init__.py that
  re-exports from the canonical location with a DeprecationWarning
- Update all 24+ consumer imports from rllm.experimental.rollout to
  rllm.engine.rollout
- Standardize on .is_validation (not .validate) as the attribute name
  across all rollout engine consumers
- OpenAIEngine and FireworksEngine remain in rllm/engine/rollout/ and
  inherit the enhanced base RolloutEngine automatically

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
from rllm.tools.tool_base import ToolCall as RllmToolCall

# ------------------------------------------------------------------
# Fixtures
Collaborator


Hi @listar2000, it looks like all the tool-related tests were removed. Should we add equivalent ones back with the new TinkerChatTemplateParser?

Collaborator Author


Yeah @luyuzhe111, you are spot on -- these were removed from test_tinker_engine since the original functions we tested are no longer used. These utility functions are now part of the TinkerChatTemplateParser. I will add the tests back to the corresponding test_tinker_parser.py file.

max_model_length=2048,
sampling_params={"train": {"temperature": 0.0}, "val": {"temperature": 0.0}},
bypass_render_with_parser=True,
disable_thinking=True,
Collaborator


This test will fail since disable_thinking: bool = False has been removed. Why was it removed?

Collaborator Author


You mean the bypass_render_with_parser flag, or something else? I think the removal of bypass_render_with_parser is the only change here.

Collaborator Author


Oh I see the point you've made. Yeah I need to go back and check the argument list of the TinkerEngine.

Comment thread rllm/parser/tinker_parser.py Outdated

return tokens

def _prepare_messages(self, messages: list[dict], tools: list[Tool | dict] | None = None) -> list[dict]:
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like these functions assume the tool format is the rLLM format instead of the OpenAI format. How does this work with agentflow & gateway, where the messages and tool calls are in OpenAI format?

Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@luyuzhe111 You make a good point (I think this is where this Linear issue was raised) -- after all, rLLM lacks a well-defined messages format standard (it's mostly like the OpenAI messages format, but it also has its own ToolCall classes). What I will need to do here is (at least temporarily) detect both the rLLM and the OpenAI formats, and process each differently before handing off to Tinker.

@luyuzhe111
Collaborator

@listar2000 great work with TinkerChatTemplateParser and the refactoring! I would suggest breaking it into multiple PRs, though, for easier review & history tracking. For example, consider:

  • Base RolloutEngine enrichment + engine migration (move from experimental to canonical, add the new abstract methods, backward-compat shim)
  • TinkerChatTemplateParser + TinkerEngine refactor (the parser, removing bypass_render_with_parser)
  • VerlEngine refactor (is_validation rename, new TITO methods, stop_reason mapping)

@listar2000
Collaborator Author

@luyuzhe111 Agreed. For now, though, I think two PRs separated by purpose seems reasonable.

I will try to get the migration done first -- i.e., making sure there will be no "dual" rollout engine paths -- while the TinkerEngine stays in its original form.

In parallel, I can work on another PR making sure the TinkerChatTemplateParser works well for both message formats.

@listar2000 listar2000 force-pushed the refactor/move-experimental-rollouts branch from 9f16421 to aebb38b on April 6, 2026 00:40
@listar2000 listar2000 changed the title from "refactor: graduate experimental rollout engines and adopt TinkerChatTemplateParser" to "refactor: move experimental rollout engines to canonical rllm/engine/rollout/" on Apr 6, 2026
@listar2000
Collaborator Author

@luyuzhe111 This PR currently only contains the first part (i.e. the migration of rollout engines out of experimental/). I think it should not block the agent flow or gateway (which are already using the experimental rollout engine anyway?).

As for the Verl engine-related changes: the renaming has been manually checked for consistency, and the new TITO methods are nothing novel -- just a breakdown of the originally very long get_model_response method (messages -> ModelOutput) into more semantically clear stages (messages -> token inputs -> token outputs -> ModelOutput).

Will also be running some real tests to see how the older trainers work with this new Verl Engine.
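
That staged breakdown can be sketched with a self-contained toy (synchronous and with stand-in types for brevity; the real VerlEngine methods are async and carry more fields):

```python
from dataclasses import dataclass

# Stand-in types; the real rLLM classes are richer.
@dataclass
class TokenOutput:
    token_ids: list[int]
    stop_reason: str

@dataclass
class ModelOutput:
    text: str
    completion_ids: list[int]
    stop_reason: str

class StagedEngine:
    """Sketch of the staged pipeline:
    messages -> token input -> token output -> ModelOutput."""

    def tokenize_messages(self, messages: list[dict]) -> list[int]:
        # Stand-in for chat-template rendering + tokenization.
        text = "\n".join(m["content"] for m in messages)
        return [ord(c) for c in text]

    def get_token_output_from_token_input(self, token_input: list[int]) -> TokenOutput:
        # Stand-in for the actual inference-server call; always answers "ok".
        return TokenOutput(token_ids=[ord(c) for c in "ok"], stop_reason="stop")

    def assemble_model_output(self, token_output: TokenOutput) -> ModelOutput:
        return ModelOutput(
            text="".join(chr(t) for t in token_output.token_ids),
            completion_ids=token_output.token_ids,
            stop_reason=token_output.stop_reason,
        )

    def get_model_response(self, messages: list[dict]) -> ModelOutput:
        token_input = self.tokenize_messages(messages)
        token_output = self.get_token_output_from_token_input(token_input)
        return self.assemble_model_output(token_output)

out = StagedEngine().get_model_response([{"role": "user", "content": "hi"}])
assert out.text == "ok" and out.stop_reason == "stop"
```

Each stage can then be tested (or overridden) independently, which is the point of the decomposition.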

prompt_length = len(prompt_ids)
if enforce_max_prompt_length and prompt_length > self.max_prompt_length:
raise TerminationEvent(TerminationReason.MAX_PROMPT_LENGTH_EXCEEDED)
token_output: TokenOutput = await self.get_token_output_from_token_input(token_input=request_prompt_ids, **kwargs)
Collaborator


It looks like image_data never made it into **kwargs, so it never reached self.server_manager.generate() either? Might be an old bug in the experimental path.

Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kylemontgomery1 cc'ing Kyle, who might be more of an expert here. If I remember correctly, the image_data is manually processed into list[int] and combined with the text prompt:

if any(msg.get("images", None) is not None and msg["role"] == "user" for msg in messages) and self.processor is not None:
image_data = self.chat_parser.process_image_data(messages) # list[PIL.Image.Image]
model_inputs = self.processor(text=[prompt], images=image_data)
prompt_ids = model_inputs.pop("input_ids")[0] # list[int]
model_inputs.pop("attention_mask")
multi_modal_inputs = dict(model_inputs)

I think the experimental Verl engine's handling here should at least match the legacy (non-experimental) one, right?

Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rllm/engine/rollout/verl_engine.py in main correctly passes image_data to server_manager.generate; the experimental one does not. Merging would break multimodal support for verl in the agent workflow trainer (it may have never worked in the unified trainer...).

@listar2000 I'd suggest holding off on moving the experimental verl rollout engine out of experimental/ for now; the other rollout engines seem fine to move. We have some other changes coming to the verl rollout engine to enable fully async (e.g., we will use a different server manager). Besides, the verl backend for the unified trainer is still in experimental (and rightly so imo).

Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kylemontgomery1 Sounds good, and thanks for the review. In that case I will hold off on this PR until the changes to the (experimental) Verl rollout engine come in (they might not land directly on the current branch in this PR -- but I can cherry-pick those changes).

I also agree that the Verl backend is still experimental -- but considering that the (non-experimental) Verl workflow trainer isn't much more stable either, it makes sense for us to keep working on making the Verl backend more stable and then migrate it out soon.

Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kylemontgomery1 I think it's time to revisit this PR.

Are there any major changes to the (experimental) Verl rollout engine in your incoming fully-async PR? If so I can wait for that before migrating it out of experimental.

Also, I think this PR should be merged before your next PR on the rLLM gateway -- which, if I understand correctly, also needs to touch the rollout engines slightly.

Comment thread rllm/engine/__init__.py
from .rollout.openai_engine import OpenAIEngine as _OpenAIEngine

return _OpenAIEngine
if name == "TinkerEngine":
Collaborator


Pre-existing pattern, this PR just extends it. Two things about the __getattr__ blocks:

  1. The try/except Exception converts ImportError into AttributeError, which hides the real error. If someone writes from rllm.engine.rollout import TinkerEngine and tinker isn't installed, they'll see "module has no attribute 'TinkerEngine'" instead of "No module named 'tinker'". Since these are explicit imports (not speculative hasattr checks), letting the ImportError propagate is more helpful.

  2. The as _TinkerEngine underscore alias is unnecessary -- the imported name is local to the __getattr__ function and never leaks into the module namespace, so the plain name works just as well.

Could simplify to:

if name == "TinkerEngine":
    from .tinker_engine import TinkerEngine
    return TinkerEngine

Same applies to the existing VerlEngine and OpenAIEngine blocks.

Since CC tends to pattern-match a lot, I would recommend putting the right pattern in this PR and fixing the other cases soon in a separate PR : )
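
(For context, this pattern relies on PEP 562 module-level __getattr__. A self-contained toy, with stand-in module names rather than rLLM's layout, shows the lazy resolution and the AttributeError-only-for-unknown-names behavior:)

```python
import sys
import types

# Toy package module using PEP 562 module-level __getattr__ for lazy imports.
pkg = types.ModuleType("toy_engine")

def _lazy_getattr(name: str):
    if name == "TinkerEngine":
        # In a real package this would be `from .tinker_engine import TinkerEngine`;
        # a missing dependency surfaces as ImportError, not AttributeError.
        from toy_tinker_engine import TinkerEngine
        return TinkerEngine
    raise AttributeError(f"module 'toy_engine' has no attribute {name!r}")

pkg.__getattr__ = _lazy_getattr
sys.modules["toy_engine"] = pkg

# Stub for the lazily imported submodule.
stub = types.ModuleType("toy_tinker_engine")
stub.TinkerEngine = type("TinkerEngine", (), {})
sys.modules["toy_tinker_engine"] = stub

import toy_engine

# The attribute resolves through __getattr__ to the lazily imported class.
assert toy_engine.TinkerEngine is stub.TinkerEngine
```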

Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Will do another PR later reverting similar patterns across the codebase.

@luyuzhe111
Collaborator

@listar2000 thanks for the hard work and splitting the PR!

Regarding verl related changes tho, as I mentioned above, the VerlEngine TITO refactoring (splitting get_model_response into get_token_output_from_token_input + assemble_model_output) seems to introduce a subtle image_data regression — it's computed in _get_model_response but never forwarded to get_token_output_from_token_input, so server_manager.generate() no longer receives it for VLM prompts.

Would it make sense to keep the VerlEngine TITO refactor as a follow-up PR and just do the mechanical migration here (import path change + validate → is_validation)? That way this PR stays a clean move with no behavioral changes, and we can merge it faster without needing rigorous GPU testing — it already cleans things up significantly as-is. The TITO refactor can then be reviewed and tested properly on its own.
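
(A minimal toy sketch of the kind of forwarding fix under discussion -- all method names and signatures here are hypothetical stand-ins, not VerlEngine's actual API:)

```python
import asyncio

class ToyEngine:
    """Toy engine showing image_data threaded through **kwargs
    instead of being dropped before the generate call."""

    def __init__(self):
        self.last_image_data = None

    async def _generate(self, token_input, image_data=None):
        # Stand-in for server_manager.generate: record what it received.
        self.last_image_data = image_data
        return [1, 2, 3]

    async def get_token_output_from_token_input(self, token_input, **kwargs):
        return await self._generate(token_input, **kwargs)

    async def get_model_response(self, messages, image_data=None, **kwargs):
        token_input = [0]  # stand-in for rendered prompt ids
        if image_data is not None:
            kwargs["image_data"] = image_data  # the forwarding step at issue
        return await self.get_token_output_from_token_input(token_input, **kwargs)

engine = ToyEngine()
asyncio.run(engine.get_model_response([{"role": "user"}], image_data=["img"]))
assert engine.last_image_data == ["img"]
```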

@listar2000
Collaborator Author

@luyuzhe111 Sounds good, and thanks for the detailed review! Will work on these later today.

Drop try/except that converted ImportError to AttributeError (hiding
the real missing-dependency error) and remove unnecessary underscore
aliases for OpenAIEngine, TinkerEngine, VerlEngine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@listar2000
Collaborator Author

Closing this for now since we might have a newer version of this feature.

@listar2000 listar2000 closed this May 1, 2026