Transformer-kernel - supporting any arbitrary sequence-length #587

RezaYazdaniAminabadi · 2020-12-08T01:48:13Z

No description provided.

jeffra

Looks good to me as long as convergence checks are fine.

RezaYazdaniAminabadi · 2020-12-11T01:10:30Z

Thanks Jeff. I don't think the part I changed will impact convergence much; it just covers more cases for the Transformer Kernel. By the way, I made one change to the transformer API (https://github.com/microsoft/DeepSpeed/pull/587/files#diff-05e444aa64c2739a8357e712df27aaf32a95c7b54479994d9741008dd226d793L21) so that it no longer needs to be given the sequence length. I can use the same strategy to remove batch_size from the config too. These last two changes will potentially help smooth our op-injection line of work!
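Editorial sketch (not code from this PR): the commit titles in this PR, "remove seq-len from transformer config" and "pad seq-len to be 16-aligned", suggest that the sequence length is read from the input at run time and rounded up to a multiple of 16. The helper names `padded_length` and `pad_batch` below are hypothetical and are not part of DeepSpeed's API; the snippet only illustrates the arithmetic under that assumption.

```python
# Illustrative only: hypothetical helpers, not DeepSpeed's API.
ALIGNMENT = 16  # assumed from the commit title "pad seq-len to be 16-aligned"


def padded_length(seq_len: int, alignment: int = ALIGNMENT) -> int:
    """Round seq_len up to the next multiple of alignment."""
    return ((seq_len + alignment - 1) // alignment) * alignment


def pad_batch(batch, pad_value=0, alignment=ALIGNMENT):
    """Right-pad every sequence in a batch (a list of equal-length lists).

    The sequence length is taken from the input itself rather than
    from a configuration value.
    """
    seq_len = len(batch[0])
    target = padded_length(seq_len, alignment)
    return [row + [pad_value] * (target - seq_len) for row in batch]


if __name__ == "__main__":
    batch = [[1] * 50, [2] * 50]   # two sequences of length 50
    padded = pad_batch(batch)
    print(len(padded[0]))          # length after padding to a multiple of 16
```

In a real attention layer the mask would also need to be extended so that the padded positions are ignored; that step is omitted here.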

RezaYazdaniAminabadi · 2020-12-11T01:14:28Z

I also need to make a PR for the DeepSpeed-Example branch to consolidate these changes so that our bing-bert example doesn't crash!

jeffra · 2020-12-11T20:41:02Z

> I also need to make a PR for the DeepSpeed-Example branch to consolidate these changes so that our bing-bert example doesn't crash!

Sounds good. Once we have an updated DSE, let's update the submodule here and we can merge this.

zmxdream · 2021-02-24T03:18:04Z

Hi, I took a look at softmax_kernel.cu. Is the code customized for sequence lengths that are powers of 2, such as 128, 256, 512? It doesn't seem to apply to a sequence length of 50, right?

Transformer-kernel - supporting any arbitrary sequence-length

eb4700b

RezaYazdaniAminabadi requested review from ShadenSmith, arashashari, awan-10, cli99, conglongli, eltonzheng, jeffra, minjiaz, niumanar, samyam and tjruwase as code owners December 8, 2020 01:48

Reza Yazdani added 2 commits December 8, 2020 02:38

remove seq-len from transformer config

0674348

pad seq-len to be 16-aligned

0659f92

This was referenced Dec 8, 2020

Issues using deepspeed transformer kernel #589

Closed

hidden_dim constraint in transformer cuda kernel #491

Open

Reza Yazdani and others added 2 commits December 9, 2020 03:08

resolve the issue with softmax forward when sequence is low

cb15de6

Merge branch 'master' into transformer/support-arbitrary-seqlen

9981c21

jeffra approved these changes Dec 11, 2020

jeffra and others added 3 commits December 11, 2020 12:41

Merge branch 'master' into transformer/support-arbitrary-seqlen

2a3b3d2

make the padding more efficient

3b34bcc

Merge branch 'master' into transformer/support-arbitrary-seqlen

57b01e4

jeffra mentioned this pull request Dec 17, 2020

Modify scripts for fixing the deepspeed-transformer interface deepspeedai/DeepSpeedExamples#69

Merged

bump DSE to support this PR

23c70a3

jeffra merged commit fd2f970 into master Dec 17, 2020

zmxdream mentioned this pull request Feb 24, 2021

issue with softmax_kernel.cu #785

Closed

mrwyattii deleted the transformer/support-arbitrary-seqlen branch July 7, 2023 02:41
