[Feat] Add new data form support for dataset queries#2688
[Feat] Add new data form support for dataset queries#2688screw-44 wants to merge 6 commits intohuggingface:mainfrom
Conversation
- Add special handling for 'affordance' key in _get_query_indices() - Return variable-length indices from current frame to episode end for affordance - Implement affordance-specific query in _query_hf_dataset() - Map affordance queries to 'action' column - Update get_delta_indices() to handle affordance key specially
There was a problem hiding this comment.
Pull request overview
This PR adds affordance support to LeRobot datasets, enabling queries that return variable-length action trajectories from the current frame to the end of an episode. This feature supports new forms of VLA (Vision-Language-Action) training by providing complete manipulation trajectories.
Key Changes:
- Special handling for "affordance" key in dataset queries to return variable-length sequences
- Mapping of affordance queries to the "action" column in the underlying dataset
- Support for empty delta_timestamps list via
{"affordance": []}
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/lerobot/datasets/utils.py | Added special handling in get_delta_indices() to preserve empty list for affordance key |
| src/lerobot/datasets/lerobot_dataset.py | Implemented affordance-specific logic in _get_query_indices() to generate indices from current frame to episode end, added affordance mapping in _query_hf_dataset(), and updated documentation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| query_indices = {} | ||
| padding = {} | ||
|
|
||
| for key, delta_idx in self.delta_indices.items(): | ||
| # Special handling for "affordance" key: return all frames from current to episode end | ||
| if key == "affordance": | ||
| # Generate indices from idx to ep_end (exclusive), clamped to valid range | ||
| affordance_indices = list(range(max(idx, ep_start), ep_end)) | ||
| # If range is empty, return at least the current frame | ||
| if not affordance_indices: | ||
| affordance_indices = [max(idx, ep_start)] |
There was a problem hiding this comment.
The logic in line 939 has a potential issue. When idx < ep_start, the code uses max(idx, ep_start) which would start from ep_start. However, if idx is before the episode start, this means the current frame is outside the episode bounds, which seems like an invalid state. The code should either handle this edge case more explicitly or document why this behavior is acceptable. Additionally, the condition on line 941 checking if affordance_indices is empty would only be true if ep_start >= ep_end, which suggests an invalid episode. This edge case should be handled more robustly or validated earlier.
| query_indices = {} | |
| padding = {} | |
| for key, delta_idx in self.delta_indices.items(): | |
| # Special handling for "affordance" key: return all frames from current to episode end | |
| if key == "affordance": | |
| # Generate indices from idx to ep_end (exclusive), clamped to valid range | |
| affordance_indices = list(range(max(idx, ep_start), ep_end)) | |
| # If range is empty, return at least the current frame | |
| if not affordance_indices: | |
| affordance_indices = [max(idx, ep_start)] | |
| # Validate episode bounds and requested index to avoid querying outside the episode. | |
| if ep_start >= ep_end: | |
| raise ValueError(f"Invalid episode bounds: dataset_from_index ({ep_start}) " | |
| f"is not less than dataset_to_index ({ep_end}).") | |
| if not (ep_start <= idx < ep_end): | |
| raise IndexError( | |
| f"Frame index {idx} is outside the episode range " | |
| f"[{ep_start}, {ep_end})." | |
| ) | |
| query_indices = {} | |
| padding = {} | |
| for key, delta_idx in self.delta_indices.items(): | |
| # Special handling for "affordance" key: return all frames from current to episode end | |
| if key == "affordance": | |
| # Generate indices from the current frame to ep_end (exclusive). | |
| # The range is guaranteed to be non-empty by the bounds check above. | |
| affordance_indices = list(range(idx, ep_end)) |
| # Special handling for "affordance" key: keep empty list (will be handled specially in _get_query_indices) | ||
| if key == "affordance": | ||
| delta_indices[key] = [] | ||
| else: | ||
| delta_indices[key] = [round(d * fps) for d in delta_ts] |
There was a problem hiding this comment.
The new affordance feature lacks test coverage. Given that the repository has comprehensive test suites for dataset functionality (as seen in tests/datasets/test_delta_timestamps.py, test_datasets.py, etc.), this new feature should include tests to verify: 1) affordance queries return correct variable-length sequences from current frame to episode end, 2) empty list handling in get_delta_indices, 3) affordance mapping to action column in _query_hf_dataset, and 4) edge cases like querying at the last frame of an episode.
| if key == "affordance": | ||
| # Generate indices from idx to ep_end (exclusive), clamped to valid range | ||
| affordance_indices = list(range(max(idx, ep_start), ep_end)) | ||
| # If range is empty, return at least the current frame | ||
| if not affordance_indices: | ||
| affordance_indices = [max(idx, ep_start)] | ||
| query_indices[key] = affordance_indices | ||
| # No padding needed for affordance (variable length by design) | ||
| padding[f"{key}_is_pad"] = torch.BoolTensor([False] * len(affordance_indices)) |
There was a problem hiding this comment.
The affordance-specific logic in _get_query_indices lacks test coverage. Tests should verify: 1) correct generation of indices from current frame to episode end, 2) behavior when querying at different positions within an episode (start, middle, end), 3) padding tensor generation with all False values, and 4) integration with the rest of the dataset querying pipeline.
| torchvision.transforms.v2 here which will be applied to visual modalities (whether they come | ||
| from videos or images). Defaults to None. | ||
| delta_timestamps (dict[list[float]] | None, optional): _description_. Defaults to None. | ||
| delta_timestamps (dict[list[float]] | None, optional): _description_. Defaults to None. Use {"affordance":[]} to get full trajectory. |
There was a problem hiding this comment.
The documentation for the delta_timestamps parameter is incomplete. It still shows "description" as a placeholder and the added note "Use {"affordance":[]} to get full trajectory" is unclear. The documentation should: 1) replace "description" with a proper explanation of what delta_timestamps does, 2) explain what "affordance" means in this context, 3) clarify what "full trajectory" refers to, and 4) describe the expected structure and behavior of the affordance feature more thoroughly.
| delta_timestamps (dict[list[float]] | None, optional): _description_. Defaults to None. Use {"affordance":[]} to get full trajectory. | |
| delta_timestamps (dict[str, list[float]] | None, optional): Mapping from feature names to lists | |
| of relative time offsets (in seconds) that control which timesteps are materialized from the | |
| underlying recorded trajectory. Each key should be the name of a feature or feature group | |
| present in the dataset (for example, an observation, action, or label namespace), and each | |
| value should be a list of floats representing time deltas relative to the reference control | |
| timestamp for a given step. When ``delta_timestamps`` is ``None`` (default), all modalities are | |
| read at their native timestamps without additional temporal offsets. | |
| In this context, ``"affordance"`` refers to a special feature namespace used to store | |
| task‑specific targets or labels that are defined per timestep along the trajectory. Passing | |
| ``{"affordance": []}`` disables sub‑sampling for that namespace and exposes the full trajectory, | |
| i.e. the complete sequence of timesteps recorded for each episode, rather than a reduced set of | |
| timesteps derived from non‑empty offset lists. The dictionary is expected to use feature names | |
| as keys and lists of float offsets (typically sorted, in seconds) as values; these offsets are | |
| validated against the dataset frame rate so that each offset corresponds to an existing frame | |
| or step when added to the reference timestamps. Defaults to None. |
| delta_indices = {} | ||
| for key, delta_ts in delta_timestamps.items(): | ||
| delta_indices[key] = [round(d * fps) for d in delta_ts] | ||
| # Special handling for "affordance" key: keep empty list (will be handled specially in _get_query_indices) |
There was a problem hiding this comment.
The comment refers to "_get_query_indices" but should use proper backticks for code references to improve readability and follow documentation best practices. The comment should be: "Special handling for 'affordance' key: keep empty list (will be handled specially in _get_query_indices)"
| # Special handling for "affordance" key: keep empty list (will be handled specially in _get_query_indices) | |
| # Special handling for "affordance" key: keep empty list (will be handled specially in `_get_query_indices`) |
| if not affordance_indices: | ||
| affordance_indices = [max(idx, ep_start)] | ||
| query_indices[key] = affordance_indices | ||
| # No padding needed for affordance (variable length by design) |
There was a problem hiding this comment.
The comment on line 944 states "No padding needed for affordance (variable length by design)" but this is misleading. The code still creates a padding tensor with all False values, so padding is technically still present, just never set to True. The comment should clarify this: "Affordance sequences are variable length by design - padding tensor created with all False values since all frames are valid"
| # No padding needed for affordance (variable length by design) | |
| # Affordance sequences are variable length by design - padding tensor created with all False values since all frames are valid |
| # Special handling for "affordance": query from "action" column | ||
| query_key = "action" if key == "affordance" else key | ||
| try: | ||
| result[key] = torch.stack(self.hf_dataset[key][relative_indices]) |
There was a problem hiding this comment.
The first try block attempts to access self.hf_dataset[key][relative_indices] but should use query_key instead of key to properly handle the affordance case. When key is "affordance", the code should query the "action" column, but the try block still uses the original key name which will fail to find the "affordance" column.
| result[key] = torch.stack(self.hf_dataset[key][relative_indices]) | |
| result[key] = torch.stack(self.hf_dataset[query_key][relative_indices]) |
Type / Scope
Summary / Motivation
What changed
How was this tested
How to run locally (reviewer)
initualize the dataset as this
Checklist (required before merge)
pre-commit run -a)pytest)Reviewer notes