diff --git a/model_cards/huawei-noah/DynaBERT_MNLI/README.md b/model_cards/huawei-noah/DynaBERT_MNLI/README.md
new file mode 100644
index 000000000000..934cfbefb774
--- /dev/null
+++ b/model_cards/huawei-noah/DynaBERT_MNLI/README.md
@@ -0,0 +1,20 @@
+## DynaBERT: Dynamic BERT with Adaptive Width and Depth
+
+* DynaBERT can flexibly adjust its size and latency by selecting an adaptive width and depth,
+and its sub-networks perform competitively with other compressed models of similar size.
+DynaBERT is trained in two stages: first a width-adaptive BERT is trained, and then both
+the width and the depth are made adaptive using knowledge distillation.
+
+* The code is adapted from the Hugging Face [Transformers v2.1.1](https://github.com/huggingface/transformers/tree/v2.1.1) repository and is released on [GitHub](https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/DynaBERT).
+
+### Reference
+Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Qun Liu.
+[DynaBERT: Dynamic BERT with Adaptive Width and Depth](https://arxiv.org/abs/2004.04037). NeurIPS, 2020.
+```bibtex
+@inproceedings{hou2020dynabert,
+  title = {DynaBERT: Dynamic BERT with Adaptive Width and Depth},
+  author = {Hou, Lu and Huang, Zhiqi and Shang, Lifeng and Jiang, Xin and Liu, Qun},
+  booktitle = {NeurIPS},
+  year = {2020}
+}
+```
diff --git a/model_cards/huawei-noah/DynaBERT_SST-2/README.md b/model_cards/huawei-noah/DynaBERT_SST-2/README.md
index a0b963b52c27..934cfbefb774 100644
--- a/model_cards/huawei-noah/DynaBERT_SST-2/README.md
+++ b/model_cards/huawei-noah/DynaBERT_SST-2/README.md
@@ -1,9 +1,41 @@
-# DynaBERT: Dynamic BERT with Adaptive Width and Depth
+## DynaBERT: Dynamic BERT with Adaptive Width and Depth
 
 * DynaBERT can flexibly adjust its size and latency by selecting an adaptive width and depth,
 and its sub-networks perform competitively with other compressed models of similar size.
 DynaBERT is trained in two stages: first a width-adaptive BERT is trained, and then both
 the width and the depth are made adaptive using knowledge distillation.
 
-* This code is modified based on the repository developed by Hugging Face: [Transformers v2.1.1](https://github.com/huggingface/transformers/tree/v2.1.1)
-* The results in the paper are produced by using single V100 GPU.
+* The code is adapted from the Hugging Face [Transformers v2.1.1](https://github.com/huggingface/transformers/tree/v2.1.1) repository and is released on [GitHub](https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/DynaBERT).
+
+### Reference
+Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Qun Liu.
+[DynaBERT: Dynamic BERT with Adaptive Width and Depth](https://arxiv.org/abs/2004.04037). NeurIPS, 2020.
+```bibtex
+@inproceedings{hou2020dynabert,
+  title = {DynaBERT: Dynamic BERT with Adaptive Width and Depth},
+  author = {Hou, Lu and Huang, Zhiqi and Shang, Lifeng and Jiang, Xin and Liu, Qun},
+  booktitle = {NeurIPS},
+  year = {2020}
+}
+```
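+
+### How to use
+A minimal sketch of running inference with this checkpoint, assuming a recent `transformers`
+release and that the released weights load with the stock BERT classes (the model identifier
+below is inferred from this card's path); the width- and depth-adaptive inference utilities
+live in the GitHub repository linked above.
+```python
+import torch
+from transformers import BertForSequenceClassification, BertTokenizer
+
+# Assumed model id, inferred from this card's path; the checkpoint is a BERT
+# classifier fine-tuned on SST-2, so the stock classes should be able to load it.
+tokenizer = BertTokenizer.from_pretrained("huawei-noah/DynaBERT_SST-2")
+model = BertForSequenceClassification.from_pretrained("huawei-noah/DynaBERT_SST-2")
+model.eval()
+
+inputs = tokenizer("a charming and often affecting journey", return_tensors="pt")
+with torch.no_grad():
+    logits = model(**inputs).logits  # shape: (1, num_labels)
+print(logits.argmax(dim=-1).item())  # 1 is usually the positive label for SST-2
+```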