[PERFORMANCE] [v1.x] Layer normalization code from Marian for CPU #19601
Commits on Nov 30, 2020
-
Layer normalization code from Marian
Kenneth Heafield committed Nov 30, 2020 (2f87167)
-
Remove MKL version of LayerNorm.
Experiment with OMP_NUM_THREADS=4, times in s, c5.12xlarge

| batch x channel | New code (s) | MKL (s) |
|---|---|---|
| 1x32 | 0.0000288 | 0.0000278 |
| 128x32 | 0.0000308 | 0.0000311 |
| 2560x32 | 0.0000712 | 0.0000672 |
| 4096x32 | 0.0000946 | 0.0000910 |
| 8192x32 | 0.0001597 | 0.0001523 |
| 16384x32 | 0.0002905 | 0.0002619 |
| 1x64 | 0.0000264 | 0.0000256 |
| 128x64 | 0.0000339 | 0.0000330 |
| 2560x64 | 0.0000829 | 0.0000972 |
| 4096x64 | 0.0001137 | 0.0001356 |
| 8192x64 | 0.0002027 | 0.0002435 |
| 16384x64 | 0.0003715 | 0.0004639 |
| 1x128 | 0.0000262 | 0.0000263 |
| 128x128 | 0.0000325 | 0.0000389 |
| 2560x128 | 0.0001074 | 0.0001580 |
| 4096x128 | 0.0001505 | 0.0002336 |
| 8192x128 | 0.0002861 | 0.0004481 |
| 16384x128 | 0.0005648 | 0.0008613 |
| 1x256 | 0.0000273 | 0.0000276 |
| 128x256 | 0.0000390 | 0.0000431 |
| 2560x256 | 0.0001533 | 0.0002811 |
| 4096x256 | 0.0002258 | 0.0004300 |
| 8192x256 | 0.0004300 | 0.0008464 |
| 16384x256 | 0.0010436 | 0.0017613 |
| 1x512 | 0.0000256 | 0.0000302 |
| 128x512 | 0.0000408 | 0.0000551 |
| 2560x512 | 0.0002444 | 0.0005225 |
| 4096x512 | 0.0003828 | 0.0008147 |
| 8192x512 | 0.0008832 | 0.0017192 |
| 16384x512 | 0.0058463 | 0.0074497 |
| 1x768 | 0.0000252 | 0.0000308 |
| 128x768 | 0.0000450 | 0.0000676 |
| 2560x768 | 0.0003440 | 0.0007719 |
| 4096x768 | 0.0005890 | 0.0013346 |
| 8192x768 | 0.0014946 | 0.0026145 |
| 16384x768 | 0.0089495 | 0.0113557 |
| 1x1024 | 0.0000285 | 0.0000308 |
| 128x1024 | 0.0000487 | 0.0000786 |
| 2560x1024 | 0.0004614 | 0.0010190 |
| 4096x1024 | 0.0008083 | 0.0017376 |
| 8192x1024 | 0.0059020 | 0.0075588 |
| 16384x1024 | 0.0116553 | 0.0146855 |

Benchmark program

```python
import mxnet as mx
import time

def time_procedure(shape, count):
    data = mx.nd.random_uniform(shape=shape, low=-1.0, high=1.0)
    factors = mx.nd.random_uniform(shape=(shape[-1],))
    mx.nd.waitall()
    begin = time.time()
    for i in range(0, count):
        out = mx.nd.LayerNorm(data, factors, factors)
        mx.nd.waitall()
    return (time.time() - begin) / count

count = 200

for channel in [32, 64, 128, 256, 512, 768, 1024]:
    for batch in [1, 128, 2560, 4096, 8192, 16384]:
        s = (batch, channel)
        timing = time_procedure(s, count)
        print("{:5d}x{:5d} | {:.7f}".format(s[0], s[1], timing))
```
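The relative speed of the two implementations is easier to compare as a ratio. The following is a minimal sketch, assuming a handful of rows copied by hand from the timing table above; the names `timings`, `speedup`, and `summarize` are illustrative and are not part of the PR or of MXNet.

```python
# Selected rows copied from the timing table:
# (batch, channel) -> (new code seconds, MKL seconds)
timings = {
    (1, 32): (0.0000288, 0.0000278),
    (16384, 32): (0.0002905, 0.0002619),
    (2560, 512): (0.0002444, 0.0005225),
    (16384, 1024): (0.0116553, 0.0146855),
}


def speedup(new_seconds, mkl_seconds):
    """Ratio of MKL time to new-code time; above 1 means the new code is faster."""
    return mkl_seconds / new_seconds


def summarize(rows):
    """Map each (batch, channel) shape to its speedup, rounded to two decimals."""
    return {shape: round(speedup(new, mkl), 2) for shape, (new, mkl) in rows.items()}


if __name__ == "__main__":
    for (batch, channel), ratio in summarize(timings).items():
        print("{:5d}x{:5d} | {:.2f}x".format(batch, channel, ratio))
```

A ratio below 1 marks a shape where MKL was faster in this run, which in the table happens mainly for the 32-channel rows.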
Kenneth Heafield committed Nov 30, 2020 (95efe8f)
Commits on Dec 4, 2020
-
Enable pragma omp simd on MSVC
Kenneth Heafield committed Dec 4, 2020 (40d3326)
-
Merge branch 'v1.x' into layernorm
Kenneth Heafield committed Dec 4, 2020 (c6b653e)
Commits on Dec 7, 2020
-
Fix MSVC error C3016: 'j': index variable in OpenMP 'for' statement must have signed integral type
Kenneth Heafield committed Dec 7, 2020 (3605226)
-
Try to make MSVC happy since it doesn't have ssize_t
Kenneth Heafield committed Dec 7, 2020 (dcb61aa)
-
Change gcc 8 PPA to ppa:jonathonf/gcc
Kenneth Heafield committed Dec 7, 2020 (a11dc7e)
-
Kenneth Heafield committed Dec 7, 2020 (5ae7ae4)
Commits on Dec 21, 2020
-
Merge branch 'v1.x' into layernorm
Kenneth Heafield committed Dec 21, 2020 (606759d)
-
Option to use MKL version requested by @samskalicky
Kenneth Heafield committed Dec 21, 2020 (2d2a91e)
-
Fix order if MKL override is on
Kenneth Heafield committed Dec 21, 2020 (e5093eb)
Commits on Dec 28, 2020
-
Have CI test MKL layer norm in build_ubuntu_cpu_mkl
Kenneth Heafield committed Dec 28, 2020 (a566558)
-
Merge branch 'v1.x' of https://github.com/apache/incubator-mxnet into layernorm
Kenneth Heafield committed Dec 28, 2020 (eb3d9d9)