Merged
11 changes: 11 additions & 0 deletions README.md
@@ -412,6 +412,17 @@ uv run python examples/converters/convert_dcp_to_hf.py \
--dcp-ckpt-path results/grpo/step_170/policy/weights/ \
--hf-ckpt-path results/grpo/hf
```

If you have a model saved in Megatron format, you can use the following command to convert it to Hugging Face format prior to running evaluation:

```sh
# Example for a GRPO checkpoint at step 170
uv run --extra mcore python examples/converters/convert_megatron_to_hf.py \
--config results/grpo/step_170/config.yaml \
--megatron-ckpt-path results/grpo/step_170/policy/weights/iter_0000000 \
--hf-ckpt-path results/grpo/hf
```

> **Note:** Adjust the paths according to your training output directory structure.
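Before handing the converted checkpoint to evaluation, it can be worth sanity-checking that the output directory looks like a valid Hugging Face checkpoint. The helper below is a minimal sketch (the function name is illustrative; it only checks for the conventional Hugging Face files: a `config.json` plus `*.safetensors` or `pytorch_model*.bin` weights):

```python
from pathlib import Path

def looks_like_hf_checkpoint(ckpt_dir: str) -> bool:
    """Heuristic: does ckpt_dir contain the files from_pretrained expects?"""
    d = Path(ckpt_dir)
    has_config = (d / "config.json").is_file()
    has_weights = any(d.glob("*.safetensors")) or any(d.glob("pytorch_model*.bin"))
    return has_config and has_weights
```

For example, `looks_like_hf_checkpoint("results/grpo/hf")` should return `True` after a successful conversion.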

For an in-depth explanation of checkpointing, refer to the [Checkpointing documentation](docs/design-docs/checkpointing.md).
16 changes: 14 additions & 2 deletions docs/design-docs/checkpointing.md
@@ -1,8 +1,10 @@
# Checkpointing with Hugging Face Models
# Exporting Checkpoints to Hugging Face Format

NeMo RL provides two checkpoint formats for Hugging Face models: Torch distributed and Hugging Face format. Torch distributed is used by default for efficiency, and Hugging Face format is provided for compatibility with Hugging Face's `AutoModel.from_pretrained` API. Note that Hugging Face format checkpoints save only the model weights, ignoring the optimizer states. It is recommended to use Torch distributed format to save intermediate checkpoints and to save a Hugging Face checkpoint only at the end of training.

A checkpoint converter is provided to convert a Torch distributed checkpoint checkpoint to Hugging Face format after training:
## Converting Torch Distributed Checkpoints to Hugging Face Format

A checkpoint converter is provided to convert a Torch distributed checkpoint to Hugging Face format after training:

```sh
uv run examples/converters/convert_dcp_to_hf.py --config=<YAML CONFIG USED DURING TRAINING> <ANY CONFIG OVERRIDES USED DURING TRAINING> --dcp-ckpt-path=<PATH TO DIST CHECKPOINT TO CONVERT> --hf-ckpt-path=<WHERE TO SAVE HF CHECKPOINT>
@@ -17,3 +19,13 @@ CKPT_DIR=results/sft/step_10
uv run examples/converters/convert_dcp_to_hf.py --config=$CKPT_DIR/config.yaml --dcp-ckpt-path=$CKPT_DIR/policy/weights --hf-ckpt-path=${CKPT_DIR}-hf
rsync -ahP $CKPT_DIR/policy/tokenizer ${CKPT_DIR}-hf/
```
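The `rsync` step above simply copies the tokenizer directory into the converted checkpoint. A cross-platform Python equivalent (a sketch; the helper name is illustrative and the layout mirrors the `CKPT_DIR` example above):

```python
import shutil
from pathlib import Path

def copy_tokenizer(ckpt_dir: str, hf_dir: str) -> Path:
    """Copy <ckpt_dir>/policy/tokenizer into <hf_dir>/tokenizer, like the rsync step."""
    src = Path(ckpt_dir) / "policy" / "tokenizer"
    dst = Path(hf_dir) / "tokenizer"
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

Usage for the example above would be `copy_tokenizer("results/sft/step_10", "results/sft/step_10-hf")`.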

## Converting Megatron Checkpoints to Hugging Face Format

For models trained with the Megatron-LM backend, a separate converter is available to convert Megatron checkpoints to Hugging Face format. This script requires Megatron-Core, so make sure to launch the conversion with the `mcore` extra. For example:

```sh
CKPT_DIR=results/sft/step_10

uv run --extra mcore examples/converters/convert_megatron_to_hf.py --config=$CKPT_DIR/config.yaml --megatron-ckpt-path=$CKPT_DIR/policy/weights/iter_0000000/ --hf-ckpt-path=<path_to_save_hf_ckpt>
```
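Megatron writes weights under per-iteration directories (e.g. `iter_0000000` above). If you are unsure which iteration directory to pass as `--megatron-ckpt-path`, a small helper can pick the newest one. This is a sketch; it assumes the zero-padded `iter_*` naming, so lexicographic order matches iteration order:

```python
from pathlib import Path

def latest_megatron_iter(weights_dir: str) -> Path:
    """Return the highest-numbered iter_* subdirectory of weights_dir."""
    iters = sorted(p for p in Path(weights_dir).glob("iter_*") if p.is_dir())
    if not iters:
        raise FileNotFoundError(f"no iter_* directories under {weights_dir}")
    return iters[-1]
```

For the example above, `latest_megatron_iter("results/sft/step_10/policy/weights")` would resolve the `iter_0000000` path.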
7 changes: 7 additions & 0 deletions examples/converters/convert_megatron_to_hf.py
@@ -18,6 +18,13 @@

from nemo_rl.models.megatron.community_import import export_model_from_megatron

""" NOTE: this script requires mcore. Make sure to launch with the mcore extra:
uv run --extra mcore python examples/converters/convert_megatron_to_hf.py \
--config <path_to_ckpt>/config.yaml \
--megatron-ckpt-path <path_to_ckpt>/policy/weights/iter_xxxxx \
--hf-ckpt-path <path_to_save_hf_ckpt>
"""


def parse_args():
"""Parse command line arguments."""