Feature/pi05 inference/train/server support #935
Closed
Conversation
Serving Infrastructure:
- Add Flask-based HTTP server with CORS support for Pi0.5 inference
- Implement comprehensive API endpoints for real-time model serving
- Support batch and single inference requests with JSON/image input
- Include robust error handling and logging capabilities

Client Implementation:
- Add Python client with HTTP API integration for model interaction
- Support image encoding and base64 transmission for vision inputs
- Include action space configuration and discrete state handling
- Provide an easy-to-use interface for robotics applications

Configuration Management:
- Add comprehensive serving configuration templates
- Support both high-level and detailed serving configs
- Include host, port, model parameters, and engine settings
- Maintain compatibility with existing FlagScale serving patterns

API Features:
- RESTful HTTP API with /infer endpoint for model predictions
- Real-time image processing and action generation
- Support for 32-dimensional action space output
- Configurable tokenizer and model parameters
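The serving flow described above (an HTTP /infer endpoint taking a base64-encoded image plus a discrete state and returning a 32-dimensional action vector) can be sketched roughly as follows. This is a minimal illustration, not the actual server in flagscale/serve/: the handler and payload field names are assumptions, the Pi0.5 forward pass is stubbed out, and CORS headers are added by hand rather than via flask-cors.

```python
import base64
import logging

from flask import Flask, jsonify, request

app = Flask(__name__)

ACTION_DIM = 32  # Pi0.5 emits 32-dimensional actions, per the PR description


def run_policy(image_bytes, state):
    """Stub standing in for the real Pi0.5 model forward pass."""
    return [0.0] * ACTION_DIM


@app.after_request
def add_cors_headers(resp):
    # CORS support, done by hand here instead of via flask-cors
    resp.headers["Access-Control-Allow-Origin"] = "*"
    return resp


@app.route("/infer", methods=["POST"])
def infer():
    try:
        payload = request.get_json(force=True)
        image_bytes = base64.b64decode(payload["image"])  # base64 vision input
        state = payload.get("state", [])
        return jsonify({"actions": run_policy(image_bytes, state)})
    except Exception as exc:  # the commit notes robust error handling/logging
        logging.exception("inference request failed")
        return jsonify({"error": str(exc)}), 400
```

A client would POST a JSON body containing the encoded image and state vector and read back the 32-element action list from the response.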
Core Model Implementation:
- Add PI0_5_Policy model with 32-dimensional action space support
- Implement discrete state input processing for robotics tasks
- Add PaliGemmaWithExpert backbone with AdaRMSNorm for flow matching
- Support 16-step action prediction with temporal modeling

Inference Configuration:
- Add comprehensive inference configuration templates
- Support both high-level and detailed inference configs
- Include tokenizer and action dimension parameters
- Maintain compatibility with existing Pi0 inference patterns

Technical Details:
- Extended Pi0 architecture for expert-enhanced multimodal reasoning
- Flow matching timestep injection with adaRMS normalization
- Vision-language-action model with discrete state integration

Code Quality:
- Applied Black code formatting
- Fixed trailing whitespace and line endings
- Ensured isort import organization compliance
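The AdaRMSNorm idea mentioned above (a normalization whose scale and shift are modulated by a conditioning signal, such as a flow-matching timestep embedding) can be sketched in a few lines of NumPy. This is a generic illustration of adaptive RMS normalization, not the code in paligemma_with_expert.py; the projection weights and shapes are assumptions.

```python
import numpy as np


def rms_norm(x, eps=1e-6):
    # Plain RMSNorm: divide features by their root-mean-square.
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)


def ada_rms_norm(x, cond, w_scale, w_shift, eps=1e-6):
    """Adaptive RMSNorm: per-feature scale and shift are predicted from a
    conditioning vector (e.g. a flow-matching timestep embedding) through
    hypothetical learned projections w_scale/w_shift.
    With zero conditioning this reduces to plain RMSNorm."""
    scale = cond @ w_scale  # (batch, dim)
    shift = cond @ w_shift
    return rms_norm(x, eps) * (1.0 + scale) + shift
```

The `1.0 + scale` form is a common choice so that small conditioning outputs perturb, rather than replace, the identity scaling.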
Training Pipeline:
- Add complete training script with distributed data parallel support
- Implement Megatron-Energon integration for efficient data loading
- Support wandb logging and experiment tracking
- Include training resume and checkpoint management capabilities

Training Configuration:
- Add comprehensive training configuration templates
- Support both standard and simplified training configs
- Include optimizer, scheduler, and data processing parameters
- Maintain compatibility with the Hydra configuration system

Technical Features:
- Distributed training with DDP across multiple GPUs
- Task encoder integration for robotics data processing
- Automatic mixed precision training support
- Comprehensive logging with wandb integration

Model Pipeline:
- Add parallelize and pipeline transformations for Pi0.5
- Support expert-enhanced model training
- Include flow matching training objectives
- Optimized for large-scale robotics data
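The resume-and-checkpoint behavior listed above can be illustrated with a toy loop. This sketch only shows the general pattern (save every N steps, reload the step counter and parameters on restart) and is an assumption, not the actual FlagScale training script; the real pipeline adds DDP, Megatron-Energon data loading, mixed precision, and wandb logging.

```python
import json
import os


def save_checkpoint(path, step, params):
    # Persist the step counter and parameters so training can resume.
    with open(path, "w") as f:
        json.dump({"step": step, "params": params}, f)


def load_checkpoint(path):
    # Return (0, None) when no checkpoint exists, i.e. a fresh run.
    if not os.path.exists(path):
        return 0, None
    with open(path) as f:
        ckpt = json.load(f)
    return ckpt["step"], ckpt["params"]


def train(total_steps, ckpt_path, save_every=10):
    step, params = load_checkpoint(ckpt_path)
    if params is None:
        params = [0.0]
    while step < total_steps:
        params = [p - 0.1 for p in params]  # stand-in for one gradient step
        step += 1
        if step % save_every == 0 or step == total_steps:
            save_checkpoint(ckpt_path, step, params)
    return step, params
```

Calling `train` a second time with a larger step budget picks up exactly where the last checkpoint left off instead of restarting from scratch.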
- Format paligemma_with_expert.py
- Format modeling_pi0_5.py
…ron-LM version
- Remove megatron/inference/text_generation/sampling.py.patch
- Remove megatron/inference/text_generation/tokenization.py.patch
- These patches reference files that don't exist in the current Megatron-LM version
- Resolves patch application failures during unpatching
- Update Megatron-LM from feature/pi05-support to latest main (5153663)
- Remove outdated patch files, as they are no longer needed with the upstream sync
- Pi0.5 support remains fully compatible with latest Megatron-LM
- CI/CD compatibility restored by using the standard upstream version

Key improvements from upstream:
- Enhanced multimodal support beneficial for Pi0.5
- Performance optimizations with CUDA graphs and dynamic inference
- FSDP stability fixes for distributed training
- Extended quantization support (FP8, NVFP4)
- Improved tokenizer and checkpoint handling

The sync maintains full Pi0.5 functionality while ensuring compatibility with CI/CD pipelines that require upstream alignment.
PR Category
New Features
PR Types
New Features
PR Description
Add comprehensive Pi0.5 support: a complete AI model development and deployment pipeline
Overview
This PR introduces end-to-end support for Pi0.5, covering the entire lifecycle from training to production serving with expert-enhanced architecture.
🏗️ Architecture Implementation
🚀 Model Support Components
Inference (flagscale/models/pi0/)
- paligemma_with_expert.py: Core inference engine with expert routing
- modeling_pi0_5.py: Complete Pi0.5 model implementation

Training Support (configs/training/)

Serving Infrastructure (flagscale/serve/)
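For the serving client mentioned above, request construction might look like the following sketch. The field names (`image`, `state`, `actions`) are assumptions for illustration, not the documented FlagScale API, and the HTTP transport itself (e.g. posting the body to the /infer endpoint) is left out.

```python
import base64
import json


def build_infer_payload(image_bytes, state):
    """Encode an image and a discrete state vector into a JSON body
    of the shape an /infer endpoint might expect (hypothetical schema)."""
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "state": list(state),
    })


def decode_actions(response_text, action_dim=32):
    """Parse the server reply and sanity-check the action dimensionality."""
    actions = json.loads(response_text)["actions"]
    assert len(actions) == action_dim, "Pi0.5 emits 32-dimensional actions"
    return actions
```

Base64 keeps binary image data safe inside a JSON body, at the cost of roughly a 33% size overhead per request.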