Merged
28 changes: 28 additions & 0 deletions examples/summarization/README.md
@@ -209,6 +209,34 @@ PT_HPU_MAX_COMPOUND_OP_SIZE=512 python ../gaudi_spawn.py \
--deepspeed ds_flan_t5_z3_config_bf16.json
```

Here is an example that fine-tunes t5-large on 8 Gaudi2 HPUs with DeepSpeed ZeRO-2:
```bash
PT_HPU_LAZY_MODE=0 python ../gaudi_spawn.py \
    --world_size 8 \
    --use_deepspeed run_summarization.py \
    --deepspeed ../../tests/configs/deepspeed_zero_2.json \
    --do_train \
    --do_eval \
    --overwrite_output_dir \
    --predict_with_generate \
    --use_habana \
    --gaudi_config_name Habana/t5 \
    --ignore_pad_token_for_loss False \
    --pad_to_max_length \
    --save_strategy no \
    --throughput_warmup_steps 15 \
    --model_name_or_path t5-large \
    --source_prefix '"summarize:"' \
    --dataset_name cnn_dailymail \
    --dataset_config '"3.0.0"' \
    --output_dir /tmp/tst-summarization \
    --per_device_train_batch_size 20 \
    --per_device_eval_batch_size 20 \
    --max_train_samples 2000 \
    --torch_compile_backend hpu_backend \
    --torch_compile
```
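
As a rough sanity check on the settings above, the sketch below works out the global batch size and the number of optimizer steps the run would take. It assumes a gradient accumulation of 1 and 3 training epochs, neither of which is set in the command, so adjust them if your arguments or DeepSpeed config differ. The variable names are illustrative only.

```bash
#!/usr/bin/env bash
# Values copied from the command above.
world_size=8
per_device_train_batch_size=20
max_train_samples=2000
throughput_warmup_steps=15

# Assumptions (not set in the command above): change these to match your run.
gradient_accumulation_steps=1
num_train_epochs=3

# Samples consumed per optimizer step across all devices.
global_batch_size=$(( world_size * per_device_train_batch_size * gradient_accumulation_steps ))

# Ceiling division: a final partial batch still counts as a step.
steps_per_epoch=$(( (max_train_samples + global_batch_size - 1) / global_batch_size ))
total_steps=$(( steps_per_epoch * num_train_epochs ))

# Steps left once the warmup steps are excluded from the throughput measurement.
measured_steps=$(( total_steps - throughput_warmup_steps ))

echo "global batch size:      ${global_batch_size}"
echo "steps per epoch:        ${steps_per_epoch}"
echo "total steps:            ${total_steps}"
echo "steps after warmup:     ${measured_steps}"
```

If `measured_steps` comes out at zero or below, the run is shorter than the warmup window, so consider lowering `--throughput_warmup_steps` or raising `--max_train_samples`.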

See the [documentation](https://huggingface.co/docs/optimum/habana/usage_guides/deepspeed) for more information on using DeepSpeed with Optimum Habana.

