Commit 5f1cd5f

Add cifar10 example using TE+DeepSpeed-Zero+MS-AMP (#143)
**Description**

Add cifar10 example using TE+DeepSpeed-Zero+MS-AMP.

There is a 1P customer who uses only DeepSpeed ZeRO with a transformer-based model. Simply replacing Linear with FP8Linear does not yield a performance gain, so we suggest they use TransformerEngine for speed and MS-AMP for memory savings. This example is a good reference for them.

**Major Revision**

- Add example
- Add document
- Fix a bug in mnist_ddp example
1 parent bf6f01a commit 5f1cd5f

9 files changed: +502 -6 lines changed

.gitignore

+2
@@ -138,3 +138,5 @@ dmypy.json
 
 # Cython debug symbols
 cython_debug/
+
+examples/data

docs/getting-started/run-msamp.md

+6
@@ -40,4 +40,10 @@ deepspeed cifar10_deepspeed.py --deepspeed --deepspeed_config ds_config_msamp.js
 deepspeed cifar10_deepspeed.py --deepspeed --deepspeed_config ds_config_zero_msamp.json
 ```
 
+### 4. Run cifar10 using deepspeed-ZeRO + TE with msamp enabled
+
+```bash
+deepspeed cifar10_deepspeed_te.py --deepspeed --deepspeed_config ds_config_zero_te_msamp.json
+```
+
 For more comprehensive examples, please go to [MS-AMP-Examples](https://github.com/Azure/MS-AMP-Examples).

docs/user-tutorial/usage.md

+3 -1
@@ -31,11 +31,13 @@ For enabling MS-AMP in DeepSpeed, add one line of code `from msamp import deepsp
 ```json
 "msamp": {
 "enabled": true,
-"opt_level": "O1|O2|O3"
+"opt_level": "O1|O2|O3",
+"use_te": false
 }
 ```
 
 "O3" is designed for FP8 in ZeRO optimizer, so please make sure ZeRO is enabled when using "O3".
+"use_te" is designed for Transformer Engine, if you have already used Transformer Engine in your model, don't forget to set "use_te" to true.
 
 ## Usage in Megatron-DeepSpeed and Megatron-LM
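
Note on the `msamp` section in docs/user-tutorial/usage.md: the sketch below shows one way to generate a DeepSpeed config that carries the three documented keys (`enabled`, `opt_level`, `use_te`) and enforces the stated rule that "O3" needs ZeRO. The contents of `ds_config_zero_te_msamp.json` are not shown on this page, so the helper name `build_ds_config`, the batch size, and the ZeRO stage are illustrative assumptions, not the values used by this commit.

```python
import json


def build_ds_config(opt_level="O3", use_te=True, zero_stage=2):
    """Build a DeepSpeed config dict with an "msamp" section.

    Hypothetical helper: only the keys inside "msamp" come from the
    documentation diff; the other values are placeholders.
    """
    if opt_level not in {"O1", "O2", "O3"}:
        raise ValueError(f"unsupported opt_level: {opt_level!r}")
    # The docs state that "O3" targets FP8 in the ZeRO optimizer,
    # so reject it when ZeRO is disabled (stage 0).
    if opt_level == "O3" and zero_stage == 0:
        raise ValueError('opt_level "O3" requires ZeRO to be enabled')
    return {
        "train_batch_size": 32,  # placeholder value
        "zero_optimization": {"stage": zero_stage},  # placeholder stage
        "msamp": {
            "enabled": True,
            "opt_level": opt_level,
            # Set to True when the model already uses Transformer Engine.
            "use_te": use_te,
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_ds_config(), indent=2))
```

The printed JSON could be saved to a file and passed via `--deepspeed_config`, as in the `deepspeed cifar10_deepspeed_te.py` command documented in run-msamp.md.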
