[v0.10.x] Softmax optimization & bertpass refactor #1565
Conversation
LGTM
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1565/d13d37d19e549bb13984e855b9f3e6cb24a4bbc6/index.html
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1565/ad9185846d06baf328878fa7b37a5356a6439c89/index.html
@szha Can you help with CI? I'm not sure why it's failing, and I don't know how to rerun it.
Hi @bgawrych, could you try merging with v0.10.x? We ported the new CI settings from v0.x to v0.10.x. Thanks!
@barry-jin, @szha there is still an issue with the notebook, which is a little strange since the bert.md file hasn't been changed for a year - should I fix it, or is it a CI issue?
@barry-jin I see the following error in the log, pointing out that there's a CI issue:
Hi @bgawrych, we have ported the changes in bert.md from the v0.x branch; you could try merging with the current v0.10.x.
Thanks, I will fix this issue. |
Will be fixed in #1575 |
The documentation website for preview: http://gluon-nlp-staging.s3-accelerate.dualstack.amazonaws.com/PR1565/1644e8fe25e66b05300042510ced0603d1dd4098/index.html
Merged. Thanks @bgawrych! |
I'm surprised that the elimination of the 24x mask tensor creation gave you any speedup (as opposed to using masked softmax, which should) - MXNet already has a common expression elimination pass (I wrote it: apache/mxnet#15657). Does that not work for you?
@ptrendx I didn't know about this feature, but I wrote this small graph pass to test it:
The overhead from these operators is negligible, but it seems that it doesn't work in this case:
This PR adds a graph pass to optimize the CPU softmax in BERT.
Currently, for BERT-large, the length tensor is created by the following sequence of operations: expand_dims -> broadcast_axis -> Reshape,
and there are 24 such tensor creations. This pass replaces softmax (with length) with a regular softmax (but with masked input) - the mask is created only once and then passed to elemwise_sum to mask the input. Applying the pass in the scripts is optional.
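The replacement works because adding a large negative bias (the mask) to the padded positions drives their softmax weights to zero, so a regular softmax over the masked input matches a softmax restricted to the valid length. A minimal NumPy sketch of the equivalence (illustrative only; these function names are hypothetical and this is not the actual graph-pass code):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def softmax_with_length(x, length):
    # original form: only the first `length` positions participate
    out = np.zeros_like(x)
    out[:length] = softmax(x[:length])
    return out

def masked_softmax(x, length, neg=-1e18):
    # replacement form: add a large negative bias to the padded
    # positions, then run a regular softmax over the whole row
    mask = np.where(np.arange(x.shape[-1]) < length, 0.0, neg)
    return softmax(x + mask)

scores = np.random.randn(8)
a = softmax_with_length(scores, 5)
b = masked_softmax(scores, 5)
assert np.allclose(a, b)  # padded positions underflow to exactly 0
```

Since the mask depends only on the length tensor, it can be built once and reused by every layer's elemwise_sum instead of being rebuilt for each of the 24 attention blocks.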
Original:
Masked softmax:
Throughput in samples/s:
Accuracy:
There is also a bug fix in the interleaved MHA pass.
Accuracy without mha_interleave bug fix: {'exact_match': 79.62157048249763, 'f1': 87.75497143592598}