Contributor

@fracapuano fracapuano commented Oct 8, 2025

What this does

Solves #2142 by introducing build_inference_frame, a helper that encapsulates the conversion of robot observations into the inputs that models expect.

Crucially, this PR enables a block of code like the following to run without any issues:

obs = robot.get_observation()
obs_frame = build_inference_frame(obs, device, dataset_metadata.features)

obs = preprocess(obs_frame)

action = model.select_action(obs)

action = make_robot_action(action)
action = postprocess(action)
robot.send_action(action)

Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the preprocessing steps for policy inference by introducing a new build_inference_frame function that encapsulates the conversion of robot observations into the format expected by machine learning models. This addresses issue #2142 by providing a clean interface between robot observations and model inference.

Key changes:

  • Added build_inference_frame function to centralize observation preprocessing logic
  • Replaced inline preprocessing code in predict_action with a call to the new function
  • The new function handles tensor conversion, image normalization, device transfer, and metadata addition
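The steps listed above can be illustrated with a minimal sketch. It uses NumPy rather than PyTorch, so device transfer is left out, and it assumes camera images arrive as height x width x channel uint8 arrays. The function name prepare_frame and the "task" key are hypothetical and are not taken from this PR's diff.

```python
import numpy as np


def prepare_frame(frame, task=""):
    """Convert a frame of NumPy arrays into batched float32 model inputs.

    Hypothetical helper for illustration; not the lerobot implementation.
    """
    out = {}
    for key, value in frame.items():
        arr = np.asarray(value)
        if "image" in key:
            # Assumed layout: (H, W, C) uint8 with values in [0, 255].
            arr = arr.astype(np.float32) / 255.0  # scale to [0, 1]
            arr = np.transpose(arr, (2, 0, 1))  # channel-first (C, H, W)
        else:
            arr = arr.astype(np.float32)
        out[key] = arr[np.newaxis, ...]  # add a leading batch dimension
    out["task"] = task  # metadata travels alongside the arrays
    return out


frame = {
    "observation.state": np.array([0.5, -1.0]),
    "observation.images.camera1": np.full((4, 6, 3), 255, dtype=np.uint8),
}
ready = prepare_frame(frame, task="pick up the cube")
print(ready["observation.images.camera1"].shape)  # (1, 3, 4, 6)
```

In a PyTorch version, the device transfer would presumably happen right after the batch dimension is added; that step is omitted here to keep the sketch dependency-light.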

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
src/lerobot/policies/utils.py | Adds the new build_inference_frame function with observation preprocessing logic
src/lerobot/utils/control_utils.py | Replaces inline preprocessing with call to build_inference_frame function


@fracapuano fracapuano changed the title Fix/encapsulate preprocessing steps for policy inference (improve api) Add the Build-Inference-Frame Util to Allow API-based Inference Oct 8, 2025
@imstevenpmwork imstevenpmwork added the labels enhancement (Suggestions for new features or improvements), policies (Items related to robot policies), and processor (Issue related to processor) Oct 8, 2025
Collaborator

@imstevenpmwork imstevenpmwork left a comment

I agree with the linked issue and this PR's goal. However, instead of just refactoring into a function, I suggest a more robust solution: implement the linked code as a formal processor step within the policy's preprocessor.

This would make the main control loop much cleaner:

obs = robot.get_observation()
obs = preprocess(obs)  # Now includes the new step
action = model.select_action(obs)
action = postprocess(action)
# robot.send_action(action)

This approach also addresses the root cause mentioned in point #7 of the previous review: the flawed assumptions in batch_to_transition. By moving this logic to a processor, we also make progress on point #1 of that same comment, which is to migrate policy boilerplate into proper processor steps.
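The processor-based approach described above can be sketched as follows, assuming a pipeline is simply an ordered list of callables that each take and return a dict. The names Pipeline, frame_from_raw_step, and scale_state_step are hypothetical and do not correspond to lerobot's processor classes.

```python
from typing import Any, Callable

Step = Callable[[dict[str, Any]], dict[str, Any]]


class Pipeline:
    """Applies a sequence of dict -> dict steps in order (hypothetical)."""

    def __init__(self, steps: list[Step]) -> None:
        self.steps = list(steps)

    def __call__(self, data: dict[str, Any]) -> dict[str, Any]:
        for step in self.steps:
            data = step(data)
        return data


def frame_from_raw_step(raw: dict[str, Any]) -> dict[str, Any]:
    """New first step: raw robot observation -> frame keyed like the dataset."""
    state = [raw[k] for k in sorted(raw) if k.startswith("motor")]
    return {"observation.state": state}


def scale_state_step(frame: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for an existing preprocessing step (assumed scaling by 0.5)."""
    scaled = [0.5 * x for x in frame["observation.state"]]
    return {**frame, "observation.state": scaled}


# Prepending the new step lets the control loop pass raw observations directly.
preprocess = Pipeline([frame_from_raw_step, scale_state_step])
obs = preprocess({"motor1": 1.0, "motor2": -2.0})
print(obs)  # {'observation.state': [0.5, -1.0]}
```

The trade-off is that the frame-building logic becomes part of the serialized pipeline, so any change to it would affect every pipeline that has already been saved.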

Shall we proceed with this processor-based approach, or would you prefer to stick with the current PR's function-based refactor? (I'm good either way)

Thanks anyway for looking into this!

@imstevenpmwork imstevenpmwork linked an issue Oct 8, 2025 that may be closed by this pull request
2 tasks
@imstevenpmwork imstevenpmwork changed the title (improve api) Add the Build-Inference-Frame Util to Allow API-based Inference feat(scripts): Introduce build_inference_frame util to easily allow API-based Inference Oct 8, 2025
@fracapuano
Contributor Author

Shall we proceed with this processor-based approach, or would you prefer to stick with the current PR's function-based refactor? (I'm good either way)

I would stick to the current PR and open a "good first issue" for the community to implement this as part of the pipeline system. The main rationale is that this option lets us ship the tutorial soonest (i.e., tomorrow) without delaying the release further.

Also, I think there is a rather big CI blocker for the pipeline system: we need to update all the models on the hub whenever we change the pipeline they use. That is why I'd argue we should choose more carefully when to roll out changes to pipelines that are already uploaded, to avoid disruptions. For instance, updates to the pipelines used by a model (such as the one you're describing here) could be introduced in a future 0.x.0 release. Happy to discuss this further, though!

@fracapuano
Contributor Author

fracapuano commented Oct 9, 2025

@imstevenpmwork just flagging that 06654e1 adds a very similar modification on the policy's outputs too

I added this modification to this PR just to move fast. Conceptually, both changes stem from the same problem.

@fracapuano fracapuano changed the title feat(scripts): Introduce build_inference_frame util to easily allow API-based Inference feat(scripts): Introduce build_inference_frame/make_robot_action util to easily allow API-based Inference Oct 9, 2025
@fracapuano fracapuano removed the processor (Issue related to processor) label Oct 10, 2025
fracapuano and others added 2 commits October 10, 2025 16:22
…on to only perform data type handling (whole conversion is: keys matching + data type conversion)
@fracapuano
Contributor Author

Hey @imstevenpmwork 👋 I wanted to let you know that (1) I successfully tested these last changes using the lerobot-record CLI and (2) I think I had a (tiny) bug in record_loop, which I have now fixed.

Regarding (2), I think I unnecessarily called build_dataset_frame twice on the same, already converted object. Not a big deal, but I don't think it is ideal either. Here are my thoughts:

  1. Within record_loop, we need to convert a raw observation into a frame, either for inference or to append it to a dataset. At a high level, converting a raw observation {motor1: ..., motor2: ..., ..., camera1: ...} into a frame {observation.state: ..., observation.images.camera1: ...} consists of (1) turning one dictionary into another and (2) turning arrays into tensors with specific data types.
  2. We can do both things using build_inference_frame. However, in lerobot-record the code's logic is such that we need to split these two operations. In particular, you can see that we first convert the raw observation into a useful frame here:
    observation_frame = build_dataset_frame(dataset.features, obs_processed, prefix=OBS_STR)
  3. Then, we call predict_action on the frame:
    action_values = predict_action(
    observation=observation_frame,
    policy=policy,
    device=get_safe_torch_device(policy.config.device),
    preprocessor=preprocessor,
    postprocessor=postprocessor,
    use_amp=policy.config.use_amp,
    )
  4. Inside predict_action, we don't need to convert the frame again; we only need to change the data types accordingly. This means that we need a third util that encapsulates the conversion logic currently implemented on main, i.e., (1) handling the image dtype conversion (int -> float) and (2) turning arrays into tensors where necessary.

The new small function build_inference_frame thus groups these two operations, so that clean API examples can simply call build_inference_frame (which under the hood calls build_dataset_frame for the dict -> dict operations and prepare_frame_for_inference for data type handling).
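The split described above can be sketched as two small functions plus a wrapper that composes them. The feature-dictionary layout (a "names" list per feature) and all three function names are assumptions made for illustration; the real signatures of build_dataset_frame and the inference helpers may differ.

```python
import numpy as np


def match_keys(features, raw_obs):
    """Stage 1 (dict -> dict): group raw values under dataset-style keys."""
    frame = {}
    for key, spec in features.items():
        if key.startswith("observation.images."):
            camera = key.removeprefix("observation.images.")
            frame[key] = raw_obs[camera]
        else:
            values = [raw_obs[name] for name in spec["names"]]
            frame[key] = np.array(values, dtype=np.float32)
    return frame


def convert_dtypes(frame):
    """Stage 2 (dtype handling): uint8 images -> float32 in [0, 1]."""
    out = {}
    for key, value in frame.items():
        if value.dtype == np.uint8:
            value = value.astype(np.float32) / 255.0
        out[key] = value
    return out


def build_frame(features, raw_obs):
    """Wrapper for API users: both stages in one call."""
    return convert_dtypes(match_keys(features, raw_obs))


features = {
    "observation.state": {"names": ["motor1", "motor2"]},
    "observation.images.camera1": {"names": ["height", "width", "channels"]},
}
raw = {"motor1": 0.5, "motor2": -1.0, "camera1": np.zeros((2, 2, 3), dtype=np.uint8)}

# A recording loop can run stage 1 once and reuse that frame for the dataset,
# then run only stage 2 before inference, avoiding a second key-matching pass.
dataset_frame = match_keys(features, raw)
inference_frame = convert_dtypes(dataset_frame)
print(sorted(inference_frame))  # ['observation.images.camera1', 'observation.state']
```

Keeping stage 2 separate is what removes the duplicated call: code that already holds a dataset frame only pays for the dtype conversion, while standalone API users call the wrapper.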

@imstevenpmwork
Collaborator

imstevenpmwork commented Oct 14, 2025

After merging the slight changes from #2198, I'm good with approving and merging this PR once we have added a working code example (similar to the snippet in the PR description) to the examples folder that uses build_inference_frame, as this function is currently unused in the codebase.

@imstevenpmwork imstevenpmwork self-requested a review October 14, 2025 13:12
Collaborator

@imstevenpmwork imstevenpmwork left a comment

API examples will be added later. Consider moving these functions to processor steps in the future.

Thanks !

@fracapuano fracapuano mentioned this pull request Oct 14, 2025
@imstevenpmwork imstevenpmwork merged commit 723013c into main Oct 14, 2025
10 checks passed
@imstevenpmwork imstevenpmwork deleted the fix/encapsulate-preprocessing-steps-for-policy-inference branch October 14, 2025 13:47
@fracapuano fracapuano mentioned this pull request Oct 22, 2025
annarborace01 pushed a commit to annarborace01/lerobot that referenced this pull request Nov 16, 2025
…util to easily allow API-based Inference (huggingface#2143)

* fix: expose a function explicitly building a frame for inference

* fix: first make dataset frame, then make ready for inference

* fix: reducing reliance on lerobot record for policy's ouptuts too

* fix: encapsulating squeezing out + device handling from predict action

* fix: remove duplicated call to build_inference_frame and add a function to only perform data type handling (whole conversion is: keys matching + data type conversion)

* fix(policies): right utils signature + docstrings (huggingface#2198)

---------

Co-authored-by: Steven Palma <[email protected]>
nepyope pushed a commit that referenced this pull request Nov 21, 2025

Labels

enhancement Suggestions for new features or improvements policies Items related to robot policies

Development

Successfully merging this pull request may close these issues.

(improving api) Over-reliance on lerobot-record for inference

3 participants