diff --git a/docs/training/multi-token-prediction.md b/docs/training/multi-token-prediction.md
index 3c8a3deb05..d7c3f63b2b 100644
--- a/docs/training/multi-token-prediction.md
+++ b/docs/training/multi-token-prediction.md
@@ -85,7 +85,7 @@ cfg.dataset.path_to_cache = "/path/to/cache"
 # MTP Configuration
 cfg.mtp_num_layers = 1
 cfg.mtp_loss_scaling_factor = 0.1
-pretrain(config)
+pretrain(cfg)
 ```
 
 Follow the [DCLM Tutorial](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/tutorials/data/dclm) to prepare the training data