Skip to content

Conversation

@AdilZouitine
Copy link
Collaborator

@AdilZouitine AdilZouitine commented Sep 18, 2025

Summary

Add intelligent migration detection to DataProcessorPipeline.from_pretrained() to guide users when they try to load old LeRobot models that need migration to the new processor format. During implementation, we discovered opportunities to dramatically simplify the loading logic by requiring explicit config_filename parameter.

🎯 Main Purpose: Migration Detection

✨ Migration Detection Features

  • ProcessorMigrationError: Custom exception with actionable migration guidance
  • Smart detection: Analyzes JSON configs to detect old LeRobot models requiring migration
  • Precise targeting: Avoids false positives on other HuggingFace model types
  • Clear guidance: Provides exact migration command to run

Migration Detection Logic

# Before: Users got confusing errors
DataProcessorPipeline.from_pretrained("old-lerobot-model", config_filename="config.json")
# KeyError: 'steps' - unhelpful error

# After: Clear migration guidance  
DataProcessorPipeline.from_pretrained("old-lerobot-model", config_filename="config.json")
# ProcessorMigrationError: Model requires migration. 
# Run: python src/lerobot/processor/migrate_policy_normalization.py --pretrained-path old-lerobot-model

🔧 Secondary Improvements: Simplified Loading Logic

💥 Breaking Change: Explicit config_filename Required

While implementing migration detection, we realized the auto-detection logic was unnecessarily complex and prone to bugs. We simplified it by requiring explicit config_filename:

# Before: Auto-detection with potential confusion
DataProcessorPipeline.from_pretrained("model/")  # Which config? Dangerous with multiple configs!

# After: Explicit and unambiguous
DataProcessorPipeline.from_pretrained("model/", config_filename="policy_preprocessor.json")   # Clear!
DataProcessorPipeline.from_pretrained("model/", config_filename="policy_postprocessor.json")  # Clear!

🗑️ Massive Simplification Achieved

  • Removed methods: _resolve_config_source (65 lines), _looks_like_local_path (25 lines)
  • Simplified: _load_config (40 lines → 20 lines with cleaner logic)
  • Eliminated: Complex auto-detection and path heuristics
  • Total reduction: ~135 lines of complex logic eliminated

Super-Simplified 3-Way Loading

# New simple logic (was 100+ lines, now 20 lines):
model_path = Path(model_id)

if model_path.is_dir():
    # Directory: load specified config from directory
    config_path = model_path / config_filename
elif model_path.is_file():
    # File: load file directly (config_filename ignored)
    config_path = model_path
else:
    # Hub: download specified config
    config_path = hf_hub_download(repo_id=model_id, filename=config_filename, ...)

🔧 Refactoring for Better Maintainability

Modular Helper Methods

Method Purpose Lines Tests
_load_config Simplified 3-way loading 20 5
_validate_loaded_config Config validation & migration 15 3
_resolve_step_class Registry vs import resolution 25 4
_instantiate_step Step creation with overrides 15 1
_load_step_state State file loading 20 1
_build_steps_with_overrides Step construction orchestration 25 0*
_validate_overrides_used Override validation 15 3

*Tested through integration tests

Before vs After Complexity

# Before: Monolithic method (270+ lines)
@classmethod
def from_pretrained(cls, ...):
    # 270+ lines of mixed responsibilities with complex auto-detection

# After: Clean orchestration (30 lines)
@classmethod  
def from_pretrained(cls, ...):
    loaded_config, base_path = cls._load_config(model_id, config_filename, hub_kwargs)
    cls._validate_loaded_config(model_id, loaded_config, config_filename)
    steps, remaining_overrides = cls._build_steps_with_overrides(...)
    cls._validate_overrides_used(remaining_overrides, loaded_config)
    return cls(steps=steps, ...)

🚨 Breaking Change Details

Why Explicit config_filename is Critical

The main driver was preventing dangerous silent bugs in migration detection implementation:

# Real-world scenario that would cause subtle bugs:
# HuggingFace repo contains:
# - policy_preprocessor.json    (input normalization)
# - policy_postprocessor.json   (output denormalization)

# Before: Auto-detection could silently load wrong config
pipeline = DataProcessorPipeline.from_pretrained("model/")  # Gets which one? 🤷‍♂️
# Result: Subtle data processing bugs, very hard to debug

# After: Explicit selection prevents confusion
preprocessor = DataProcessorPipeline.from_pretrained("model/", config_filename="policy_preprocessor.json")   ✅
postprocessor = DataProcessorPipeline.from_pretrained("model/", config_filename="policy_postprocessor.json") ✅
# Result: Correct configs loaded, no confusion possible

Migration Guide

# Before (breaks):
DataProcessorPipeline.from_pretrained("/model/")
DataProcessorPipeline.from_pretrained("user/repo")

# After (required):
DataProcessorPipeline.from_pretrained("/model/", config_filename="processor.json")
DataProcessorPipeline.from_pretrained("user/repo", config_filename="processor.json")

Impact Analysis

  • Breaking changes: Only in test code (18 test cases updated mechanically)
  • Production code: No known breaking changes in real usage
  • Migration effort: Simple addition of config_filename parameter

🧪 Testing

Comprehensive Test Coverage

  • 17 tests for migration detection (test_migration_detection.py)
  • 17 tests for simplified helper methods (test_pipeline_from_pretrained_helpers.py)
  • 65 tests for pipeline functionality (all updated and passing)
  • Total: 99 tests - comprehensive coverage maintained

Test Structure Improvements

  • Function-based tests: Simplified structure, no unnecessary classes
  • Clear naming: test_pipeline_from_pretrained_helpers.py (descriptive)
  • Independent tests: No shared state, easier to debug
  • Focused testing: Each test targets specific functionality

💡 Migration Example (Main Feature)

# User tries to load old model
DataProcessorPipeline.from_pretrained("old-model", config_filename="config.json")

# Gets clear migration guidance
ProcessorMigrationError: Model 'old-model' requires migration to processor format.
Run: python src/lerobot/processor/migrate_policy_normalization.py --pretrained-path old-model

# User runs migration
python src/lerobot/processor/migrate_policy_normalization.py --pretrained-path old-model

# Now loading works with new processor configs
DataProcessorPipeline.from_pretrained("old-model", config_filename="policy_preprocessor.json")  # ✅ Success!

🎯 Smart Migration Detection Algorithm

def _should_suggest_migration(model_path: Path) -> bool:
    """Detect old LeRobot models that need migration."""
    json_files = list(model_path.glob("*.json"))
    if not json_files:
        return False  # No configs = no migration needed
    
    # Check if ANY file is a valid processor config
    for json_file in json_files:
        if _is_processor_config(json.load(json_file.open())):
            return False  # Valid processor found = no migration needed
    
    return True  # JSON exists but none are processors = migration needed

🚀 Benefits

For Users (Primary Goal)

  • Clear migration guidance: Exact commands provided for upgrading old models
  • No confusion: Actionable errors instead of cryptic KeyError: 'steps'
  • Smooth upgrade path: Easy transition from old to new processor format
  • Prevents silent bugs: Can't accidentally load wrong config (preprocessor vs postprocessor)

For Developers (Secondary Benefits)

  • Dramatically simplified logic: 135+ lines of complex code eliminated
  • Better maintainability: Focused helper methods with single responsibilities
  • Enhanced testability: Individual methods can be unit tested
  • Comprehensive documentation: Decision trees fully explained in docstrings

… processor configurations

- Added ProcessorMigrationError to handle migration requirements for old model formats.
- Enhanced DataProcessorPipeline.from_pretrained to include robust migration detection logic.
- Implemented methods for resolving configuration sources, validating loaded configs, and checking for valid processor configurations.
- Introduced comprehensive tests for migration detection and configuration validation to ensure correct behavior.
@AdilZouitine AdilZouitine self-assigned this Sep 18, 2025
@AdilZouitine AdilZouitine added refactor Code cleanup or restructuring without changing behavior processor Issue related to processor labels Sep 18, 2025
…ection

- Refactored DataProcessorPipeline to implement a simplified three-way loading strategy for configuration files.
- Introduced explicit config_filename parameter to avoid ambiguity during loading.
- Updated ProcessorMigrationError to provide clearer error messages for migration requirements.
- Enhanced tests to cover new loading logic and ensure proper migration detection.
- Removed deprecated methods related to config source resolution.
@AdilZouitine AdilZouitine merged commit a0def4a into user/azouitine/2025-7-4-convert-codebase-with-pipeline Sep 18, 2025
5 checks passed
@AdilZouitine AdilZouitine deleted the refactor/add_migration_error_processor branch September 18, 2025 10:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

processor Issue related to processor refactor Code cleanup or restructuring without changing behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants