Conversation

@Teeeio Teeeio commented Nov 27, 2025

PR Category

New Features

PR Types

New Features

PR Description

Add comprehensive Pi0.5 support: a complete model development and deployment pipeline

Overview

This PR introduces end-to-end support for Pi0.5, covering the entire lifecycle from training to production serving with an expert-enhanced architecture.

🏗️ Architecture Implementation

  • Pi0.5 Model Architecture: PaliGemmaWithExpert implementation with MoE routing
  • Expert Routing System: Dynamic expert selection for better performance (a minimal routing sketch follows this list)
  • Multi-modal Capabilities: Support for text, vision, and robotics tasks
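
For reference, the sketch below shows one common top-k expert-routing scheme. The class and method names are illustrative only and are not the actual `PaliGemmaWithExpert` API in `flagscale/models/pi0/`.

```python
# Minimal sketch of dynamic expert routing (illustrative, not the flagscale API).
import torch
import torch.nn as nn


class SimpleExpertRouter(nn.Module):
    """Route each token to its top-k experts and mix the expert outputs."""

    def __init__(self, hidden_size: int, num_experts: int = 4, top_k: int = 1):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(hidden_size, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, 4 * hidden_size),
                nn.GELU(),
                nn.Linear(4 * hidden_size, hidden_size),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq, hidden]
        scores = self.gate(x).softmax(dim=-1)                 # [B, S, num_experts]
        weights, indices = scores.topk(self.top_k, dim=-1)    # [B, S, top_k]
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (indices[..., k] == e).unsqueeze(-1)   # [B, S, 1]
                out = out + mask * weights[..., k : k + 1] * expert(x)
        return out
```

A production MoE layer would normally dispatch tokens to experts in a vectorized way and add a load-balancing auxiliary loss; the loops above are kept only for readability.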

🚀 Model Support Components

Inference (flagscale/models/pi0/)
  • paligemma_with_expert.py: Core inference engine with expert routing
  • modeling_pi0_5.py: Complete Pi0.5 model implementation
  • Optimized inference pipeline with batching support

Training Support (configs/training/)
  • Pi0.5 specific training configurations
  • Expert-aware training strategies
  • Performance optimization settings
  • Integration with existing Megatron-LM backend

Serving Infrastructure (flagscale/serve/)
  • HTTP API Server: Production-ready RESTful service
  • OpenAI Compatibility: Drop-in replacement for existing integrations
  • Multi-node Deployment: Distributed serving across multiple GPUs/nodes
  • Client Library: Python SDK for seamless integration
  • Monitoring & Logging: Health checks, metrics, and debugging tools

@Teeeio Teeeio changed the title Feature/pi05 support Feature/pi05 inference/train/server support Nov 27, 2025
@Teeeio Teeeio force-pushed the feature/pi05-support branch 2 times, most recently from c9efc51 to 07aace1 on November 27, 2025 04:18

Serving Infrastructure:
- Add Flask-based HTTP server with CORS support for Pi0.5 inference (a minimal server sketch follows this list)
- Implement comprehensive API endpoints for real-time model serving
- Support batch and single inference requests with JSON/image input
- Include robust error handling and logging capabilities
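
As a rough illustration of the server described above, here is a minimal Flask sketch with CORS and an `/infer` endpoint. The payload fields (`prompt`, `images`, `state`) and the `run_policy` helper are assumptions made for this sketch, not the shipped `flagscale/serve` interface.

```python
# Minimal Flask /infer server sketch (field names and run_policy are illustrative).
import base64
import io

from flask import Flask, jsonify, request
from flask_cors import CORS
from PIL import Image

app = Flask(__name__)
CORS(app)  # allow cross-origin requests, as described in the PR


def run_policy(prompt, images, state):
    # Placeholder for the actual Pi0.5 policy call; returns a dummy
    # 16-step x 32-dim action chunk so the sketch runs end to end.
    return [[0.0] * 32 for _ in range(16)]


@app.route("/infer", methods=["POST"])
def infer():
    try:
        payload = request.get_json(force=True)
        prompt = payload.get("prompt", "")
        # Images arrive as base64-encoded strings (see the client section below).
        images = [
            Image.open(io.BytesIO(base64.b64decode(b64)))
            for b64 in payload.get("images", [])
        ]
        state = payload.get("state", [])  # discrete robot state vector
        actions = run_policy(prompt, images, state)
        return jsonify({"actions": actions})
    except Exception as exc:  # simplified error handling for the sketch
        return jsonify({"error": str(exc)}), 500


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```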

Client Implementation:
- Add Python client with HTTP API integration for model interaction (a minimal client sketch follows this list)
- Support image encoding and base64 transmission for vision inputs
- Include action space configuration and discrete state handling
- Provide easy-to-use interface for robotics applications
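
A minimal sketch of the client side, assuming the same JSON fields as the server sketch above; the names are illustrative and not the actual SDK interface.

```python
# Minimal HTTP client sketch for the /infer endpoint (illustrative field names).
import base64

import requests


def encode_image(path: str) -> str:
    """Read an image file and return it as a base64 string for JSON transport."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def query_pi05(server_url: str, prompt: str, image_paths: list, state: list) -> list:
    payload = {
        "prompt": prompt,
        "images": [encode_image(p) for p in image_paths],
        "state": state,  # discrete state vector for the robot
    }
    resp = requests.post(f"{server_url}/infer", json=payload, timeout=60)
    resp.raise_for_status()
    # Expecting an action chunk, e.g. 16 steps x 32 action dimensions.
    return resp.json()["actions"]


if __name__ == "__main__":
    # Illustrative call; the state dimensionality depends on the robot.
    actions = query_pi05(
        "http://localhost:5000", "pick up the cup", ["cam_front.png"], state=[0.0] * 32
    )
    print(len(actions), "action steps returned")
```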

Configuration Management:
- Add comprehensive serving configuration templates
- Support both high-level and detailed serving configs
- Include host, port, model parameters, and engine settings (an illustrative config shape follows this list)
- Maintain compatibility with existing FlagScale serving patterns
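
For illustration only, a serving config of the shape described above could look roughly like this when built with OmegaConf; the actual template keys under the FlagScale configs may differ.

```python
# Illustrative serving config shape (keys are assumptions, not the real templates).
from omegaconf import OmegaConf

cfg = OmegaConf.create(
    {
        "serve": {"host": "0.0.0.0", "port": 5000},
        "model": {
            "name": "pi0_5",
            "checkpoint_path": "/path/to/pi0_5_checkpoint",
            "tokenizer_path": "/path/to/tokenizer",
            "action_dim": 32,
            "action_steps": 16,
        },
        "engine": {"dtype": "bfloat16", "device": "cuda"},
    }
)
print(OmegaConf.to_yaml(cfg))
```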

API Features:
- RESTful HTTP API with /infer endpoint for model predictions
- Real-time image processing and action generation
- Support for 32-dimensional action space output
- Configurable tokenizer and model parameters

Core Model Implementation:
- Add PI0_5_Policy model with 32-dimensional action space support
- Implement discrete state input processing for robotics tasks
- Add PaliGemmaWithExpert backbone with AdaRMSNorm for flow matching (an AdaRMSNorm sketch follows this list)
- Support 16-step action prediction with temporal modeling
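
In its common formulation, AdaRMSNorm is an RMSNorm whose per-channel scale is predicted from a conditioning vector such as the flow-matching timestep embedding. A minimal sketch under that assumption follows; it is not necessarily the exact in-repo implementation.

```python
# Minimal adaptive RMSNorm sketch, conditioned on a timestep embedding.
import torch
import torch.nn as nn


class AdaRMSNorm(nn.Module):
    def __init__(self, hidden_size: int, cond_size: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        # Project the conditioning vector (e.g. flow-matching timestep
        # embedding) into a per-channel scale; zero-init keeps the layer
        # close to a plain RMSNorm at the start of training.
        self.to_scale = nn.Linear(cond_size, hidden_size)
        nn.init.zeros_(self.to_scale.weight)
        nn.init.zeros_(self.to_scale.bias)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq, hidden], cond: [batch, cond_size]
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        x = x * rms
        scale = 1.0 + self.to_scale(cond).unsqueeze(1)  # [batch, 1, hidden]
        return x * scale
```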

Inference Configuration:
- Add comprehensive inference configuration templates
- Support both high-level and detailed inference configs
- Include tokenizer and action dimension parameters
- Maintain compatibility with existing Pi0 inference patterns

Technical Details:
- Extended Pi0 architecture for expert-enhanced multimodal reasoning
- Flow matching timestep injection with adaRMS normalization (a flow-matching loss sketch follows this list)
- Vision-language-action model with discrete state integration
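
For context, a standard flow-matching training objective for action chunks looks roughly like the sketch below; the exact interpolation schedule and conditioning used in this PR may differ.

```python
# Minimal flow-matching loss sketch for action chunks (standard formulation).
import torch
import torch.nn.functional as F


def flow_matching_loss(model, actions: torch.Tensor, cond) -> torch.Tensor:
    """actions: [batch, steps, action_dim] ground-truth action chunk."""
    batch = actions.shape[0]
    noise = torch.randn_like(actions)
    # Sample a timestep t in (0, 1) per example.
    t = torch.rand(batch, 1, 1, device=actions.device)
    # Linear interpolation between clean actions (t=0) and pure noise (t=1).
    x_t = (1.0 - t) * actions + t * noise
    target_velocity = noise - actions
    # The model predicts the velocity, conditioned on observations and t.
    pred_velocity = model(x_t, t.flatten(), cond)
    return F.mse_loss(pred_velocity, target_velocity)
```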

Code Quality:
- Applied Black code formatting
- Fixed trailing whitespace and line endings
- Ensured isort import organization compliance

Training Pipeline:
- Add complete training script with distributed data parallel support
- Implement Megatron-Energon integration for efficient data loading
- Support wandb logging and experiment tracking
- Include training resume and checkpoint management capabilities (a DDP checkpoint/resume sketch follows this list)
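
Below is a generic sketch of a DDP training loop with checkpoint save and resume, launched via `torchrun`. The Megatron-Energon data loading and wandb logging from the PR are omitted, and all names are illustrative.

```python
# Generic DDP training sketch with checkpoint save/resume.
# Launch with: torchrun --nproc_per_node=<num_gpus> <script>.py
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def train(model, dataloader, ckpt_path="pi05_ckpt.pt", epochs=1):
    dist.init_process_group("nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    model = DDP(model.cuda(local_rank), device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    start_epoch = 0
    if os.path.exists(ckpt_path):  # resume if a checkpoint is present
        ckpt = torch.load(ckpt_path, map_location="cuda")
        model.module.load_state_dict(ckpt["model"])
        optimizer.load_state_dict(ckpt["optimizer"])
        start_epoch = ckpt["epoch"] + 1

    for epoch in range(start_epoch, epochs):
        for batch in dataloader:
            loss = model(**batch)  # placeholder forward that returns a loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        if dist.get_rank() == 0:  # only rank 0 writes checkpoints
            torch.save(
                {
                    "model": model.module.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "epoch": epoch,
                },
                ckpt_path,
            )
```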

Training Configuration:
- Add comprehensive training configuration templates
- Support both standard and simplified training configs
- Include optimizer, scheduler, and data processing parameters
- Maintain compatibility with the Hydra configuration system (a Hydra entrypoint sketch follows this list)
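
A minimal Hydra entrypoint of the kind described above might look like the sketch below; the config path, config name, and keys are assumptions, not the actual templates under configs/training/.

```python
# Minimal Hydra entrypoint sketch (expects a YAML config at the assumed path).
import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(config_path="configs/training", config_name="pi0_5", version_base=None)
def main(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))
    # Typical keys described in the PR: optimizer, scheduler, data processing.
    lr = cfg.optimizer.lr
    batch_size = cfg.data.batch_size
    # ... build the model, dataloader, and trainer from cfg ...


if __name__ == "__main__":
    main()
```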

Technical Features:
- Distributed training with DDP across multiple GPUs
- Task encoder integration for robotics data processing
- Automatic mixed precision training support (an AMP loop sketch follows this list)
- Comprehensive logging with wandb integration
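
The AMP support mentioned above typically follows the standard `torch.cuda.amp` pattern; a minimal sketch of the inner loop, not the actual training script:

```python
# Minimal AMP (float16 autocast + GradScaler) loop sketch.
import torch


def train_epoch_amp(model, dataloader, optimizer):
    """Run one epoch with automatic mixed precision."""
    scaler = torch.cuda.amp.GradScaler()
    for batch in dataloader:
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():
            loss = model(**batch)  # placeholder forward returning a loss
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
```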

Model Pipeline:
- Add parallelize and pipeline transformations for Pi0.5
- Support expert-enhanced model training
- Include flow matching training objectives
- Optimized for large-scale robotics data

- Format paligemma_with_expert.py
- Format modeling_pi0_5.py

…ron-LM version

- Remove megatron/inference/text_generation/sampling.py.patch
- Remove megatron/inference/text_generation/tokenization.py.patch
- These patches reference files that don't exist in the current Megatron-LM version
- Resolves patch application failures during unpatching
@Teeeio Teeeio force-pushed the feature/pi05-support branch from bdb222e to 91bf345 on November 27, 2025 06:31

- Update Megatron-LM from feature/pi05-support to the latest main (5153663)
- Remove outdated patch files as they are no longer needed after the upstream sync
- Pi0.5 support remains fully compatible with the latest Megatron-LM
- CI/CD compatibility restored by using the standard upstream version

Key improvements from upstream:
- Enhanced multimodal support beneficial for Pi0.5
- Performance optimizations with CUDA graphs and dynamic inference
- FSDP stability fixes for distributed training
- Extended quantization support (FP8, NVFP4)
- Improved tokenizer and checkpoint handling

The sync maintains full Pi0.5 functionality while ensuring
compatibility with CI/CD pipelines that require upstream alignment.
@Teeeio Teeeio closed this Nov 27, 2025