Open
31 commits
050c2f4
update: robotics training support, add modelings and tests in main()
Hchnr Nov 4, 2025
ec826f6
update: rdata process (jsonl -> webdataset) for pi0 and robotics
Hchnr Nov 4, 2025
c010541
update: data process (jsonl -> webdataset) for pi0 and robotics
Hchnr Nov 4, 2025
f8bf0f4
update: data process (jsonl -> webdataset) for pi0 and robotics
Hchnr Nov 4, 2025
9be7732
format: with black
Hchnr Nov 4, 2025
fdc7c9a
Update tools/datasets/qwenvl/convert_pi0.py
Hchnr Nov 4, 2025
b4674f9
Update tools/datasets/qwenvl/convert_pi0.py
Hchnr Nov 4, 2025
b7da7ef
update: move data processing tools for pi0 and robotics to tools/data…
Hchnr Nov 4, 2025
3ddcd2d
Merge branch 'pi0_webdataset_convert' of github.com:Hchnr/FlagScale i…
Hchnr Nov 4, 2025
de0c940
update: move data processing tools for pi0 and robotics to tools/data…
Hchnr Nov 4, 2025
910b2ae
Merge branch 'main' into robotics_1120
Hchnr Nov 4, 2025
c2f7ab2
merge: pi0_webdataset_convert
Hchnr Nov 4, 2025
2039fe7
update: support robotics qwen_groot stucture.
Hchnr Nov 5, 2025
d279950
update: robotics qwen_groot main() work, data_loader & forward & back…
Hchnr Nov 5, 2025
d4d8c79
update: robotics qwen_groot main() work, data_loader & forward & back…
Hchnr Nov 6, 2025
a001c02
Merge branch 'main' into robotics_1120
Hchnr Nov 7, 2025
1aa77cf
Merge branch 'main' into robotics_1120
Hchnr Nov 7, 2025
2c955f6
format: qwen_groot from pretrained
Hchnr Nov 10, 2025
71261a9
format: qwen_groot from_pretrain and save_pretrain
Hchnr Nov 10, 2025
193ca48
update: add qwen_groot example
Hchnr Nov 10, 2025
282d8ef
update: add qwen_groot serve entrypoint
Hchnr Nov 10, 2025
f09dc35
update: add qwen_groot serve entrypoint
Hchnr Nov 10, 2025
897311d
update: qwen_groot ddp=2
Hchnr Nov 13, 2025
cf7beea
merge: merge main and resolve conflict
Hchnr Nov 13, 2025
f94a6b8
format: black and isort
Hchnr Nov 13, 2025
2b83c50
delete: qwenpi model
Hchnr Nov 13, 2025
5cbdc40
update: readme and dryrun script
Hchnr Nov 13, 2025
9129be9
Merge branch 'main' into robotics_train_bm
Hchnr Nov 17, 2025
81aa6f4
update: image download in readme file for test-client
Hchnr Nov 17, 2025
8a227da
Merge branch 'main' into robotics_train_bm
Hchnr Nov 18, 2025
7f2334a
update: move flagscale before megatron in python_path, add transforme…
Hchnr Nov 20, 2025
6 changes: 3 additions & 3 deletions examples/pi0/README.md
@@ -159,9 +159,9 @@ Download test images:

```sh
cd FlagScale/
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_0_latest.jpg
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_1_latest.jpg
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_2_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_0_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_1_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_2_latest.jpg
```
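
The switch from `blob` to `raw` matters because a `blob` URL typically serves the repository's HTML file viewer rather than the image bytes. A minimal sketch for checking that a downloaded file really is a JPEG, assuming only the standard JPEG start-of-image marker (the helper name `looks_like_jpeg` is hypothetical and not part of FlagScale):

```python
from pathlib import Path

# JPEG files begin with the start-of-image marker FF D8, followed by FF.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(path):
    """Return True if the file starts with the JPEG magic bytes."""
    with open(path, "rb") as f:
        return f.read(3) == JPEG_MAGIC


if __name__ == "__main__":
    for name in ("orbbec_0_latest.jpg", "orbbec_1_latest.jpg", "orbbec_2_latest.jpg"):
        if Path(name).exists():
            status = "ok" if looks_like_jpeg(name) else "not a JPEG (HTML page?)"
            print(f"{name}: {status}")
```

Run it from `FlagScale/` after the downloads; a file saved from a viewer page would be reported as not a JPEG.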

Run client:
12 changes: 9 additions & 3 deletions examples/robobrain_x0/README.md
@@ -16,6 +16,12 @@ cd FlagScale/

Install train and inference env according to [README](https://github.com/FlagOpen/FlagScale/blob/main/README.md)

Install transformers 4.53.0. Higher versions cause problems in image pre-processing.

```sh
pip install transformers==4.53.0
```
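
A quick way to confirm the pin took effect before serving. This is only a sketch: `check_pin` is a hypothetical helper, and the only library call it relies on is the standard `importlib.metadata.version`.

```python
from importlib.metadata import PackageNotFoundError, version

REQUIRED = "4.53.0"  # the version pinned by the pip command above


def check_pin(installed, required=REQUIRED):
    """Return True when the installed version string matches the pin exactly."""
    return installed.strip() == required


if __name__ == "__main__":
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed")
    else:
        print(f"transformers {installed}: pin satisfied = {check_pin(installed)}")
```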

# Download Model

```sh
@@ -70,9 +76,9 @@ Download test images:

```sh
cd FlagScale/
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_0_latest.jpg
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_1_latest.jpg
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_2_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_0_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_1_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_2_latest.jpg
```

Run client:
71 changes: 68 additions & 3 deletions examples/robobrain_x0_5/README.md
@@ -17,6 +17,7 @@ cd FlagScale/
Install train and inference env according to [README](https://github.com/FlagOpen/FlagScale/blob/main/README.md)

# Download Model

The checkpoint is not published yet.

Directory structure:
@@ -40,6 +41,7 @@ Directory structure:
`-- config.yaml
```


# Serving

## Edit Config
@@ -66,9 +68,9 @@ Download test images:

```sh
cd FlagScale/
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_0_latest.jpg
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_1_latest.jpg
-wget https://gitee.com/hchnr/flag-scale/blob/robotics_dataset/orbbec_2_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_0_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_1_latest.jpg
+wget https://gitee.com/hchnr/flag-scale/raw/robotics_dataset/orbbec_2_latest.jpg
```

Run client:
@@ -82,3 +84,66 @@ python examples/robobrain_x0_5/client_libero.py \
--right-wrist-img orbbec_2_latest.jpg \
--num-steps 20
```


# Training

## Prepare Dataset

FlagScale uses the WebDataset format and the Megatron-Energon data loader, so you need to convert your data first.

For example, [demo_0913_n2](https://gitee.com/hchnr/flag-scale/tree/robotics_dataset/demo_0913_n2/wds-2) is a small dataset with 2 timesteps.

Download demo_0913_n2:

```sh
git archive --remote=git@gitee.com:hchnr/flag-scale.git robotics_dataset demo_0913_n2/ | tar -xv -C .
```

The directory structure of demo_0913_n2 is as follows:

- build_dep.sh: copies .npy and .jpg files from the production environment to ./deps
- demo_0913_n2.jsonl: each line is a single timestep, including: task (str), images (.jpg), action (.npy), state (.npy)
- deps: .npy and .jpg files
- wds-2: data in WebDataset format (DP=2), generated by tools/datasets/vla/convert.py
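
As a sketch of what a timestep record might look like in practice, the snippet below reads a JSONL file and reports records that lack any of the four fields listed above. The key names (`task`, `images`, `action`, `state`) are assumptions taken from that description, not from the converter's source; the `--images-key=image` option in the conversion command hints that the image key may actually be `image`, so pass your own key list if needed. The function name is hypothetical.

```python
import json

# Assumed key names, taken from the field list above; the real file may differ.
EXPECTED_KEYS = ("task", "images", "action", "state")


def find_incomplete_records(jsonl_path, expected=EXPECTED_KEYS):
    """Return (line_number, missing_keys) for every record lacking a field."""
    problems = []
    with open(jsonl_path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            missing = [key for key in expected if key not in record]
            if missing:
                problems.append((line_number, missing))
    return problems


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(find_incomplete_records(sys.argv[1]))
```

An empty list means every record carries all the expected keys.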

Generate data in WebDataset format (DP=2) in ./demo_0913_n2/wds-2:

```sh
python tools/datasets/vla/convert.py \
--dataset-root=./demo_0913_n2 \
--output-root=./demo_0913_n2 \
--json=demo_0913_n2.jsonl \
--train-split 1 \
--val-split 0 \
--images-key=image \
--videos-key=video \
--vision-root='' \
--shuffle-tars \
--num-workers=1 \
--max-samples-per-tar 100000 \
--dp-size 2
```
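
To inspect the result, recall that in the WebDataset convention a sample is the set of tar members sharing the same name up to the first dot of the file name, and the remainder is the field extension. The sketch below groups the members of one shard that way using only the standard library. Which fields the converter actually writes per sample is not stated here, and `group_samples` is a hypothetical helper, not part of FlagScale.

```python
import tarfile
from collections import defaultdict


def group_samples(tar_path):
    """Map each sample key to the sorted list of field extensions stored for it."""
    samples = defaultdict(list)
    with tarfile.open(tar_path, "r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories and other non-file entries
            directory, _, filename = member.name.rpartition("/")
            # Key = path up to the first dot of the file name; rest = extension.
            stem, _, extension = filename.partition(".")
            key = f"{directory}/{stem}" if directory else stem
            samples[key].append(extension)
    return {key: sorted(exts) for key, exts in samples.items()}


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for key, extensions in group_samples(sys.argv[1]).items():
            print(key, extensions)
```

Pass the path of any `.tar` shard under `./demo_0913_n2/wds-2` as the first argument.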

Copy the .jpg and .npy files from ./demo_0913_n2/deps to the filesystem root (/), keeping their directory layout:

```sh
mkdir -p /share/
cp -r ./demo_0913_n2/deps/* /
```

## Edit Config

```sh
cd FlagScale/
vim examples/robobrain_x0_5/conf/train/libero_qwengroot.yaml
```

Change the following fields:

- checkpoint_dir: path to the model checkpoint
- framework.qwenvl.base_vlm: path to the backbone model checkpoint (for example, qwenvl)
- datasets.data_path: path to the dataset, for example: ./demo_0913_n2/wds-2
- datasets.data_path: path to dataset, for example: ./demo_0913_n2/wds-2

## Start Training

```sh
cd FlagScale/
python run.py --config-path ./examples/robobrain_x0_5/conf --config-name train action=run
```
@@ -1,9 +1,9 @@
defaults:
- _self_
-  - train: 3_3b
+  - train: libero_qwengroot

experiment:
-exp_name: Robotics-3.3B
+exp_name: libero_qwengroot
seed: 42
save_steps: 10000
load: null
@@ -12,7 +12,7 @@ experiment:
task:
type: train
backend: robotics
-entrypoint: flagscale/train/train_robotics.py
+entrypoint: flagscale/train/train_robotics_qwengroot.py
runner:
per_node_task: false
no_shared_fs: false
@@ -22,15 +22,10 @@ experiment:
before_start: echo "Starting Robotics Training"
envs:
LOGLEVEL: "INFO"
-CUDA_VISIBLE_DEVICES: "1"
+CUDA_VISIBLE_DEVICES: "0,1"
CUDA_DEVICE_MAX_CONNECTIONS: 1
-# Set python paths for: robotics, lerobot, openpi-client, FlagScale, Megatron-LM
-PYTHONPATH: python/paths
-# Set lerobot data path
-HF_LEROBOT_HOME: lerobot/data/path
WANDB_MODE: offline


action: run

hydra:
102 changes: 102 additions & 0 deletions examples/robobrain_x0_5/conf/train/libero_qwengroot.yaml
@@ -0,0 +1,102 @@
config_path: examples/robobrain_x0_5/conf/train/libero_qwengroot.yaml
seed: 42
trackers: [jsonl, wandb]
wandb_entity: jinhuiye
wandb_project: StarVLA_Libero
is_debug: false

batch_size: 2
resume: false
checkpoint_dir: results/ckpt_in
exp_name: starvla
project_name: starvla
wandb_enabled: false
output_directory: results/ckpt_out
log_freq: 10
train_steps: 100

framework:
name: QwenGR00T
qwenvl:
base_vlm: /repos/flagscale_new_robotics/FlagScale/results/ckpt_in/backbone
# attn_implementation: flash_attention_2
attn_implementation: eager
vl_hidden_dim: 2048
dino:
dino_backbone: dinov2_vits14
action_model:
action_model_type: DiT-L
action_hidden_dim: 1024
hidden_size: 1024
add_pos_embed: true
max_seq_len: 1024
action_dim: 14
state_dim: 7
future_action_window_size: 29
action_horizon: 30
past_action_window_size: 0
repeated_diffusion_steps: 8
noise_beta_alpha: 1.5
noise_beta_beta: 1.0
noise_s: 0.999
num_timestep_buckets: 1000
num_inference_timesteps: 4
num_target_vision_tokens: 32
diffusion_model_cfg:
cross_attention_dim: 2048
dropout: 0.2
final_dropout: true
interleave_self_attention: true
norm_type: ada_norm
num_layers: 16
output_dim: 1024
positional_embeddings: null
reduce_in_full_precision: true

datasets:
task_encoder:
vision_root: ""
state_key: eepose
action_horizon: 7
action_key: eepose
data_path: /repos/flagscale_new_robotics/FlagScale/demo_0913_n2/wds-1
vlm_data: {}
vla_data: {}

trainer:
epochs: 10
max_train_steps: 36000
num_warmup_steps: 3600
save_interval: 3600
eval_interval: 500
learning_rate:
base: 3.0e-05
qwen_vl_interface: 1.0e-05
action_model: 1.0e-04
lr_scheduler_type: cosine_with_min_lr
scheduler_specific_kwargs:
min_lr: 1.0e-06
freeze_modules: true
loss_scale:
vla: 1.0
vlm: 0.1
max_grad_norm: 1.0
warmup_ratio: 0.1
weight_decay: 0.0
logging_frequency: 10
gradient_clipping: 1.0
gradient_accumulation_steps: 1
optimizer:
name: AdamW
betas: [0.9, 0.95]
eps: 1.0e-08
weight_decay: 1.0e-08
is_resume: false
resume_epoch: null
resume_step: null
enable_gradient_checkpointing: true
enable_mixed_precision_training: true

system: {}
model: {}
data: {}
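
To make the schedule settings in this config concrete, here is a sketch of the learning rate over time. It assumes that `cosine_with_min_lr` means linear warmup for `num_warmup_steps`, then cosine decay from the base rate down to `min_lr` at `max_train_steps`. That shape is an assumption about the scheduler, not something the config states, and `lr_at` is a hypothetical function. Note that 3600 warmup steps is 10% of 36000, which agrees with `warmup_ratio: 0.1`.

```python
import math

# Values copied from the config above.
BASE_LR = 3.0e-05      # learning_rate: base
MIN_LR = 1.0e-06       # scheduler_specific_kwargs: min_lr
WARMUP_STEPS = 3600    # num_warmup_steps
MAX_STEPS = 36000      # max_train_steps


def lr_at(step, base_lr=BASE_LR, min_lr=MIN_LR, warmup=WARMUP_STEPS, total=MAX_STEPS):
    """Assumed schedule: linear warmup, then cosine decay from base_lr to min_lr."""
    if step < warmup:
        return base_lr * step / warmup
    progress = min(1.0, (step - warmup) / max(1, total - warmup))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    for step in (0, 1800, 3600, 19800, 36000):
        print(step, lr_at(step))
```

With `train_steps: 100` and 3600 warmup steps, a run of that length would stay entirely inside the warmup phase under this assumption.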
94 changes: 0 additions & 94 deletions examples/robotics/README.md

This file was deleted.
