Commit 1787a6d

Add Write Models Docs (Oneflow-Inc#203)
* refine layer docs
* refine model docs
* add more layer docs
* refine layer docs
* add write models
* refine
* refine docs
* fix tokenizer docs
* update tokenization docs
* refine and update docs
* update docs
* fix comments
* make format
* fix conflict
* update README and changelog
* update link and merge main
1 parent ce95b9b commit 1787a6d

14 files changed, +183 -68 lines

README.md (+2 -2)

```diff
@@ -57,11 +57,11 @@ LiBai is a large-scale open-source model training toolbox based on OneFlow. The
 
 ## Installation
 
-See [Installation instructions](https://libai.readthedocs.io/en/latest/tutorials/Installation.html).
+See [Installation instructions](https://libai.readthedocs.io/en/latest/tutorials/get_started/Installation.html).
 
 ## Getting Started
 
-See [Getting Started](https://libai.readthedocs.io/en/latest/tutorials/Getting_Started.html) for the basic usage of LiBai.
+See [Quick Run](https://libai.readthedocs.io/en/latest/tutorials/get_started/quick_run.html) for the basic usage of LiBai.
 
 ## Documentation
```

README_zh-CN.md (+2 -2)

```diff
@@ -55,10 +55,10 @@ LiBai is a large-scale open-source model training toolbox based on OneFlow; the main branch
 </details>
 
 ## Installation
-Please refer to the [LiBai installation documentation](https://libai.readthedocs.io/en/latest/tutorials/Installation.html) for installation.
+Please refer to the [LiBai installation documentation](https://libai.readthedocs.io/en/latest/tutorials/get_started/Installation.html) for installation.
 
 ## Getting Started
-Please refer to the [quick start documentation](https://libai.readthedocs.io/en/latest/tutorials/Getting_Started.html) to learn the basic usage of LiBai; richer tutorials and a complete user guide will follow.
+Please refer to the [quick start documentation](https://libai.readthedocs.io/en/latest/tutorials/get_started/quick_run.html) to learn the basic usage of LiBai; richer tutorials and a complete user guide will follow.
 
 ## Documentation
 Please refer to the [LiBai documentation](https://libai.readthedocs.io/en/latest/index.html) for the usage of the interfaces in LiBai.
```

changelog.md (+2 -2)

```diff
@@ -22,5 +22,5 @@
 - Support 3D parallel [T5](https://arxiv.org/abs/1910.10683) model
 - Support 3D parallel [Vision Transformer](https://arxiv.org/abs/2010.11929)
 - Support Data parallel [Swin Transformer](https://arxiv.org/abs/2103.14030) model
-- Support finetune task in [projects](/projects/)
-- Support text classification task in [projects](/projects/)
+- Support finetune task in [QQP project](/projects/QQP/)
+- Support text classification task in [text classification project](/projects/text_classification/)
```

docs/source/modules/libai.layers.rst (+1)

```diff
@@ -8,6 +8,7 @@ libai.layers
         VocabEmbedding,
         SinePositionalEmbedding,
         PatchEmbedding,
+        drop_path,
         DropPath,
         build_activation,
         Linear,
```

docs/source/modules/libai.tokenizer.rst (+3 -1)

```diff
@@ -3,7 +3,9 @@ libai.tokenizer
 
 .. currentmodule:: libai.tokenizer
 .. automodule:: libai.tokenizer
+    :member-order: bysource
     :members:
         BertTokenizer,
         GPT2Tokenizer,
-        GoogleT5Tokenizer
+        GoogleT5Tokenizer,
+        PreTrainedTokenizer,
```

docs/source/tutorials/basics/Distributed_Configuration.md (+1 -1)

````diff
@@ -45,7 +45,7 @@ from .common.train import train
 train.dist.pipeline_parallel_size = 8
 ```
 
-**Note:** For models that have been configured with pipeline parallelism (e.g., BERT, GPT-2, T5 and ViT), you can simply update the distributed config to execute pipeline parallel training on them. If you need to train your own model with a pipeline parallel strategy, please refer to [Write Models]() for more details about configuring your own model with pipeline parallelism.
+**Note:** For models that have been configured with pipeline parallelism (e.g., BERT, GPT-2, T5 and ViT), you can simply update the distributed config to execute pipeline parallel training on them. If you need to train your own model with a pipeline parallel strategy, please refer to [Write Models](https://libai.readthedocs.io/en/latest/tutorials/basics/Write_Models.html) for more details about configuring your own model with pipeline parallelism.
 
 #### **Data Parallel + Tensor Parallel for 2D Parallel Training on 8 GPUs**
````

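As an illustration of the 2D-parallel heading in the diff context above, here is a minimal sketch of such a config on 8 GPUs. Only ``pipeline_parallel_size`` appears in this diff; the ``data_parallel_size`` and ``tensor_parallel_size`` field names are assumptions.

```python
# config.py -- a hedged sketch of 2D (data + tensor) parallelism on 8 GPUs
from .common.train import train

train.dist.data_parallel_size = 2      # assumed field: 2 data-parallel groups
train.dist.tensor_parallel_size = 4    # assumed field: split tensors over 4 GPUs per group
train.dist.pipeline_parallel_size = 1  # field shown in the diff above; no pipeline stages here
```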
docs/source/tutorials/basics/Write_Models.md (new file, +59)

@@ -0,0 +1,59 @@

# Write Models

In this section, we will introduce how to implement a new model entirely from scratch and make it compatible with LiBai.

## Construct Models in LiBai

LiBai uses [LazyConfig](https://libai.readthedocs.io/en/latest/tutorials/Config_System.html) for a more flexible config system, which means you can simply import your own model in your config file and train it with LiBai.

For an image classification task, the input data is usually a batch of images and labels. The following code shows how to build a toy model for this task:
```python
# toy_model.py
import oneflow as flow
import oneflow.nn as nn


class ToyModel(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(64, num_classes)
        self.loss_func = nn.CrossEntropyLoss()

    def forward(self, images, labels=None):
        x = self.features(images)
        x = self.avgpool(x)
        x = flow.flatten(x, 1)
        x = self.classifier(x)

        if labels is not None and self.training:
            losses = self.loss_func(x, labels)
            return {"losses": losses}
        else:
            return {"prediction_scores": x}
```

**Note:**
- For classification models, the ``forward`` function must take ``images`` and ``labels`` as arguments, which correspond to the output of ``__getitem__`` in LiBai's built-in datasets. Please refer to [imagenet.py](https://github.com/Oneflow-Inc/libai/blob/main/libai/data/datasets/imagenet.py) for more details about the dataset.
- This toy model returns ``losses`` during training and ``prediction_scores`` during inference; both must be returned as a ``dict``, which means you should implement the loss function inside your model, e.g., ``self.loss_func = nn.CrossEntropyLoss()`` as in the ``ToyModel`` above (a minimal usage sketch follows this note).
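As a quick, illustrative sketch of the dict-style outputs described in the note (shapes and class count are arbitrary, not part of the committed file):

```python
# illustrative only: exercise ToyModel's dict-style outputs
import oneflow as flow

from toy_model import ToyModel

model = ToyModel(num_classes=10)
images = flow.randn(4, 3, 32, 32)     # a fake batch of 4 RGB images
labels = flow.randint(0, 10, (4,))    # fake class indices

model.train()
print(model(images, labels))          # {"losses": <scalar tensor>}

model.eval()
print(model(images))                  # {"prediction_scores": tensor of shape (4, 10)}
```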
## Import the Model in Your Config

With the ``LazyConfig`` system, you can simply import the model in your config file. The following code shows how to use ``ToyModel`` in a config file:
```python
# config.py
from libai.config import LazyCall
from toy_model import ToyModel

model = LazyCall(ToyModel)(
    num_classes=1000
)
```

docs/source/tutorials/basics/index.rst (+1)

```diff
@@ -10,4 +10,5 @@ Basics
     Training.md
     Train_and_Eval_Command_Line.md
     Build_New_Project_on_LiBai.md
+    Write_Models.md
     Distributed_Configuration.md
```

libai/layers/activation.py (+4)

```diff
@@ -47,6 +47,10 @@ def forward(self, x: flow.Tensor) -> flow.Tensor:
 
 
 def build_activation(activation: Optional[Activation]):
+    """
+    Fetch an activation layer by name, e.g.,
+    ``build_activation("gelu")`` returns an ``nn.GELU()`` module.
+    """
     if not activation:
         return Passthrough()
```

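A small usage sketch of the helper documented above, assuming ``build_activation`` is importable from ``libai.layers`` (as listed in the updated ``libai.layers.rst``):

```python
import oneflow as flow
from libai.layers import build_activation

act = build_activation("gelu")    # per the docstring above, returns an nn.GELU() module
out = act(flow.randn(2, 8))       # use it like any other nn.Module
print(out.shape)                  # oneflow.Size([2, 8])
```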
libai/layers/droppath.py (+2)

```diff
@@ -18,6 +18,8 @@
 
 
 def drop_path(x, drop_prob: float = 0.5, training: bool = False):
+    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks)."""
+
     if drop_prob == 0.0 or not training:
         return x
     keep_prob = 1 - drop_prob
```

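A brief, hedged sketch of the functional and module forms of stochastic depth; only ``drop_path``'s signature appears in this diff, so the ``DropPath`` constructor argument is an assumption:

```python
import oneflow as flow
from libai.layers import DropPath, drop_path

x = flow.randn(4, 197, 768)                      # e.g. ViT tokens: (batch, tokens, dim)

# functional form, signature as shown in the diff above
y = drop_path(x, drop_prob=0.1, training=True)

# module form, typically placed on a residual branch;
# the positional drop-probability argument is an assumption
layer = DropPath(0.1)
layer.train()
z = layer(x)
```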
libai/layers/transformer_layer.py (+18 -15)

```diff
@@ -53,21 +53,6 @@ class TransformerLayer(nn.Module):
             https://arxiv.org/pdf/1909.08053.pdf.
             Default: ``False``.
         layer_idx: the layer index, which determines the placement.
-
-    Inputs:
-        * **hidden_states**: [bsz, seq_length, hidden_size], (S(0), B).
-        * **attention_mask**: [bsz, 1, seq_length, seq_length], (S(0), B),
-          the combination of key padding mask and casual mask of hidden states.
-        * **encoder_states**: [bsz, seq_length, hidden_size], (S(0), B), encoder output,
-          this will be used in cross attention.
-        * **encoder_attention_mask**: [bsz, 1, seq_length, seq_length],
-          (S(0), B) key padding mask of encoder states.
-        * **past_key_value**: tuple of key and value, each shape is
-          [src_len, bsz, num_heads, head_size], For decoder layer,
-          the past_key_value contains the states both from
-          self attention and cross attention.
-        * **use_cache**: it will be set to `True`, when the model is in the inference phase and
-          used for incremental decoding.
     """
 
     def __init__(
@@ -149,6 +134,24 @@ def forward(
         past_key_value=None,
         use_cache=False,
     ):
+        """
+        Args:
+            hidden_states: shape is (batch_size, seq_length, hidden_size),
+                sbp signature is (S(0), B).
+            attention_mask: the combination of key padding mask and causal mask of hidden states,
+                with shape (batch_size, 1, seq_length, seq_length) and sbp
+                signature (S(0), B).
+            encoder_states: encoder output with shape (batch_size, seq_length, hidden_size)
+                and sbp signature (S(0), B), which will be used in cross attention.
+            encoder_attention_mask: key padding mask of encoder states with shape
+                (batch_size, 1, seq_length, seq_length) and sbp signature (S(0), B).
+            past_key_value: tuple of key and value, each of shape
+                (seq_length, bsz, num_heads, head_size). For a decoder layer,
+                past_key_value contains the states from both self attention
+                and cross attention.
+            use_cache: it will be set to `True` when the model is in the inference phase and
+                used for incremental decoding.
+        """
         # Change placement for pipeline parallelism
         hidden_states = hidden_states.to_global(placement=dist.get_layer_placement(self.layer_idx))
```

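To make the documented shapes concrete, a shape-only sketch of the forward inputs; local tensors are used here for illustration, whereas in LiBai these are global tensors carrying the sbp signatures noted above, and the layer's constructor is not shown in this diff:

```python
import oneflow as flow

batch_size, seq_length, hidden_size = 4, 128, 768

# (batch_size, seq_length, hidden_size); sbp (S(0), B) in the global view
hidden_states = flow.randn(batch_size, seq_length, hidden_size)

# (batch_size, 1, seq_length, seq_length); combined key padding + causal mask
attention_mask = flow.ones(batch_size, 1, seq_length, seq_length)

# layer(hidden_states, attention_mask=attention_mask) would then run one transformer block
```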
libai/models/build.py (+1 -1)

```diff
@@ -26,7 +26,7 @@
 
 
 def build_model(cfg):
-    """Build the whole model architecture, defined by ``cfg.model.model_name``.
+    """Build the whole model architecture, defined by ``cfg.model``.
     Note that it does not load any weights from ``cfg``.
     """
     if "_target_" in cfg:  # LazyCall
```

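As a hedged illustration of how the ``model`` field of a LazyConfig is materialized, reusing the ``ToyModel`` from Write_Models.md above; the ``libai.models`` re-export of ``build_model`` is an assumption (the function itself lives in ``libai/models/build.py``):

```python
from libai.config import LazyCall
from libai.models import build_model   # assumed export location

from toy_model import ToyModel         # the toy model defined in Write_Models.md

model_cfg = LazyCall(ToyModel)(num_classes=1000)  # what `model = ...` looks like in a config
model = build_model(model_cfg)                    # instantiates ToyModel(num_classes=1000); no weights loaded
```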
libai/tokenizer/__init__.py (+1 -1)

```diff
@@ -14,7 +14,7 @@
 # limitations under the License.
 
 from .build import TOKENIZER_REGISTRY, build_tokenizer
-from .tokenization_base import PreTrainedTokenizer
 from .tokenization_bert import BertTokenizer
 from .tokenization_gpt2 import GPT2Tokenizer
 from .tokenization_t5 import GoogleT5Tokenizer
+from .tokenization_base import PreTrainedTokenizer
```
