Warning: Could not find working CUDA device
Note: Google Test filter = OMP_TUNING.EvaluateTuneTestFloat
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from OMP_TUNING
[ RUN      ] OMP_TUNING.EvaluateTuneTestFloat
******************************
Operators: relu, for type: float
******************************
NeverOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
Auto
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
AlwaysOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
[1,1,28,28] lhs=784 items (Forward) Auto: 0.00022 ms (NeverOMP) NeverOMP: 0.00022 ms AlwaysOMP: 0.0118 ms
[1,3,28,28] lhs=2,352 items (Forward) Auto: 0.00034 ms (NeverOMP) NeverOMP: 0.00034 ms AlwaysOMP: 0.01208 ms
[50,1,18,32] lhs=28,800 items (Forward) Auto: 0.00574 ms (NeverOMP) NeverOMP: 0.00648 ms AlwaysOMP: 0.01562 ms
[25,3,64,64] lhs=307,200 items (Forward) AlwaysOMP: 0.02288 ms Auto: 0.03416 ms (AlwaysOMP) NeverOMP: 0.12644 ms
[10,3,128,128] lhs=491,520 items (Forward) Auto: 0.02286 ms (AlwaysOMP) AlwaysOMP: 0.02442 ms NeverOMP: 0.19794 ms
[20,3,128,128] lhs=983,040 items (Forward) Auto: 0.02636 ms (AlwaysOMP) AlwaysOMP: 0.07012 ms NeverOMP: 0.38514 ms
[30,3,128,128] lhs=1,474,560 items (Forward) AlwaysOMP: 0.08456 ms Auto: 0.36562 ms (NeverOMP) NeverOMP: 0.63766 ms *** WARNING: Wrong OMP state selected ***
[30,3,256,128] lhs=2,949,120 items (Forward) Auto: 0.1412 ms (AlwaysOMP) AlwaysOMP: 0.15476 ms NeverOMP: 1.37026 ms
[1,1,28,28] lhs=784 items (Backward) Auto: 0.00032 ms (NeverOMP) NeverOMP: 0.00034 ms AlwaysOMP: 0.01186 ms
[1,3,28,28] lhs=2,352 items (Backward) Auto: 0.0005 ms (NeverOMP) NeverOMP: 0.0005 ms AlwaysOMP: 0.01204 ms
[50,1,18,32] lhs=28,800 items (Backward) Auto: 0.0043 ms (NeverOMP) NeverOMP: 0.00442 ms AlwaysOMP: 0.01318 ms
[25,3,64,64] lhs=307,200 items (Backward) AlwaysOMP: 0.01714 ms Auto: 0.0177 ms (AlwaysOMP) NeverOMP: 0.15054 ms
[10,3,128,128] lhs=491,520 items (Backward) Auto: 0.0198 ms (AlwaysOMP) AlwaysOMP: 0.02256 ms NeverOMP: 0.22596 ms
[20,3,128,128] lhs=983,040 items (Backward) Auto: 0.02912 ms (AlwaysOMP) AlwaysOMP: 0.04018 ms NeverOMP: 0.49106 ms
[30,3,128,128] lhs=1,474,560 items (Backward) AlwaysOMP: 0.05578 ms Auto: 0.25118 ms (AlwaysOMP) NeverOMP: 0.78454 ms
[30,3,256,128] lhs=2,949,120 items (Backward) Auto: 0.11424 ms (AlwaysOMP) AlwaysOMP: 0.16688 ms NeverOMP: 2.17438 ms
******************************
Operators: sigmoid, for type: float
******************************
NeverOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
Auto
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
AlwaysOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
[1,1,28,28] lhs=784 items (Forward) AlwaysOMP: 0.00934 ms Auto: 0.00968 ms (NeverOMP) NeverOMP: 0.00982 ms
[1,3,28,28] lhs=2,352 items (Forward) AlwaysOMP: 0.01088 ms NeverOMP: 0.02722 ms Auto: 0.02734 ms (NeverOMP) *** WARNING: Wrong OMP state selected ***
[50,1,18,32] lhs=28,800 items (Forward) Auto: 0.02106 ms (AlwaysOMP) AlwaysOMP: 0.02126 ms NeverOMP: 0.323 ms
[25,3,64,64] lhs=307,200 items (Forward) AlwaysOMP: 0.15196 ms Auto: 0.15606 ms (AlwaysOMP) NeverOMP: 3.43368 ms
[10,3,128,128] lhs=491,520 items (Forward) Auto: 0.23134 ms (AlwaysOMP) AlwaysOMP: 0.23526 ms NeverOMP: 5.49212 ms
[20,3,128,128] lhs=983,040 items (Forward) Auto: 0.45208 ms (AlwaysOMP) AlwaysOMP: 0.4828 ms NeverOMP: 10.9786 ms
[30,3,128,128] lhs=1,474,560 items (Forward) AlwaysOMP: 0.68502 ms Auto: 0.70412 ms (AlwaysOMP) NeverOMP: 16.6016 ms
[30,3,256,128] lhs=2,949,120 items (Forward) AlwaysOMP: 1.38424 ms Auto: 1.42146 ms (AlwaysOMP) NeverOMP: 33.1566 ms
[1,1,28,28] lhs=784 items (Backward) Auto: 0.00032 ms (NeverOMP) NeverOMP: 0.00032 ms AlwaysOMP: 0.00964 ms
[1,3,28,28] lhs=2,352 items (Backward) Auto: 0.00048 ms (NeverOMP) NeverOMP: 0.00048 ms AlwaysOMP: 0.01178 ms
[50,1,18,32] lhs=28,800 items (Backward) Auto: 0.00464 ms (NeverOMP) NeverOMP: 0.00746 ms AlwaysOMP: 0.01378 ms
[25,3,64,64] lhs=307,200 items (Backward) Auto: 0.02116 ms (AlwaysOMP) AlwaysOMP: 0.02122 ms NeverOMP: 0.14452 ms
[10,3,128,128] lhs=491,520 items (Backward) AlwaysOMP: 0.0236 ms Auto: 0.02434 ms (AlwaysOMP) NeverOMP: 0.22712 ms
[20,3,128,128] lhs=983,040 items (Backward) Auto: 0.03558 ms (AlwaysOMP) AlwaysOMP: 0.04054 ms NeverOMP: 0.46418 ms
[30,3,128,128] lhs=1,474,560 items (Backward) AlwaysOMP: 0.05364 ms Auto: 0.05486 ms (AlwaysOMP) NeverOMP: 0.75992 ms
[30,3,256,128] lhs=2,949,120 items (Backward) Auto: 0.15996 ms (AlwaysOMP) AlwaysOMP: 0.1616 ms NeverOMP: 2.19816 ms
******************************
Operators: sqrt, for type: float
******************************
NeverOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
Auto
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
AlwaysOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
[1,1,28,28] lhs=784 items (Forward) NeverOMP: 0.00366 ms Auto: 0.00374 ms (NeverOMP) AlwaysOMP: 0.00948 ms
[1,3,28,28] lhs=2,352 items (Forward) AlwaysOMP: 0.00924 ms NeverOMP: 0.01576 ms Auto: 0.01634 ms (NeverOMP) *** WARNING: Wrong OMP state selected ***
[50,1,18,32] lhs=28,800 items (Forward) Auto: 0.01538 ms (AlwaysOMP) AlwaysOMP: 0.01574 ms NeverOMP: 0.32708 ms
[25,3,64,64] lhs=307,200 items (Forward) AlwaysOMP: 0.13882 ms Auto: 0.1427 ms (AlwaysOMP) NeverOMP: 3.49724 ms
[10,3,128,128] lhs=491,520 items (Forward) AlwaysOMP: 0.2119 ms Auto: 0.21518 ms (AlwaysOMP) NeverOMP: 5.61336 ms
[20,3,128,128] lhs=983,040 items (Forward) AlwaysOMP: 0.41486 ms Auto: 0.4157 ms (AlwaysOMP) NeverOMP: 11.1994 ms
[30,3,128,128] lhs=1,474,560 items (Forward) AlwaysOMP: 0.6451 ms Auto: 0.6476 ms (AlwaysOMP) NeverOMP: 16.9544 ms
[30,3,256,128] lhs=2,949,120 items (Forward) Auto: 1.30356 ms (AlwaysOMP) AlwaysOMP: 1.61614 ms NeverOMP: 33.8389 ms
[1,1,28,28] lhs=784 items (Backward) Auto: 0.00034 ms (NeverOMP) NeverOMP: 0.00036 ms AlwaysOMP: 0.00962 ms
[1,3,28,28] lhs=2,352 items (Backward) NeverOMP: 0.00064 ms Auto: 0.00078 ms (NeverOMP) AlwaysOMP: 0.00996 ms
[50,1,18,32] lhs=28,800 items (Backward) Auto: 0.00616 ms (NeverOMP) NeverOMP: 0.00654 ms AlwaysOMP: 0.0114 ms
[25,3,64,64] lhs=307,200 items (Backward) Auto: 0.0175 ms (AlwaysOMP) AlwaysOMP: 0.01754 ms NeverOMP: 0.1422 ms
[10,3,128,128] lhs=491,520 items (Backward) AlwaysOMP: 0.02234 ms Auto: 0.02252 ms (AlwaysOMP) NeverOMP: 0.22748 ms
[20,3,128,128] lhs=983,040 items (Backward) AlwaysOMP: 0.03518 ms Auto: 0.03606 ms (AlwaysOMP) NeverOMP: 0.46182 ms
[30,3,128,128] lhs=1,474,560 items (Backward) AlwaysOMP: 0.05342 ms Auto: 0.05422 ms (AlwaysOMP) NeverOMP: 0.76676 ms
[30,3,256,128] lhs=2,949,120 items (Backward) AlwaysOMP: 0.1534 ms Auto: 0.15758 ms (AlwaysOMP) NeverOMP: 2.2617 ms
******************************
Operators: elemwise_add, _backward_add for type: float
******************************
NeverOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
Auto
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
AlwaysOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
[1,1,28,28] lhs=784 items (Forward) NeverOMP: 0.0003 ms Auto: 0.00032 ms (NeverOMP) AlwaysOMP: 0.01056 ms
[1,3,28,28] lhs=2,352 items (Forward) NeverOMP: 0.00048 ms Auto: 0.00052 ms (NeverOMP) AlwaysOMP: 0.01028 ms
[50,1,18,32] lhs=28,800 items (Forward) NeverOMP: 0.0062 ms Auto: 0.0063 ms (NeverOMP) AlwaysOMP: 0.01642 ms
[25,3,64,64] lhs=307,200 items (Forward) AlwaysOMP: 0.0352 ms Auto: 0.03674 ms (AlwaysOMP) NeverOMP: 0.14758 ms
[10,3,128,128] lhs=491,520 items (Forward) Auto: 0.02808 ms (AlwaysOMP) AlwaysOMP: 0.02842 ms NeverOMP: 0.2267 ms
[20,3,128,128] lhs=983,040 items (Forward) AlwaysOMP: 0.07456 ms Auto: 0.08664 ms (AlwaysOMP) NeverOMP: 0.5293 ms
[30,3,128,128] lhs=1,474,560 items (Forward) Auto: 0.13088 ms (AlwaysOMP) AlwaysOMP: 0.13124 ms NeverOMP: 0.86746 ms
[30,3,256,128] lhs=2,949,120 items (Forward) AlwaysOMP: 0.21186 ms Auto: 0.22716 ms (AlwaysOMP) NeverOMP: 2.1614 ms
[1,1,28,28] lhs=784 items (Backward) Auto: 0.00034 ms (NeverOMP) NeverOMP: 0.00034 ms AlwaysOMP: 0.02238 ms
[1,3,28,28] lhs=2,352 items (Backward) NeverOMP: 0.00072 ms Auto: 0.00074 ms (NeverOMP) AlwaysOMP: 0.02108 ms
[50,1,18,32] lhs=28,800 items (Backward) NeverOMP: 0.0072 ms Auto: 0.00738 ms (NeverOMP) AlwaysOMP: 0.0254 ms
[25,3,64,64] lhs=307,200 items (Backward) Auto: 0.0273 ms (AlwaysOMP) AlwaysOMP: 0.03092 ms NeverOMP: 0.18666 ms
[10,3,128,128] lhs=491,520 items (Backward) Auto: 0.03154 ms (AlwaysOMP) AlwaysOMP: 0.03346 ms NeverOMP: 0.3 ms
[20,3,128,128] lhs=983,040 items (Backward) Auto: 0.04456 ms (AlwaysOMP) AlwaysOMP: 0.04724 ms NeverOMP: 0.64402 ms
[30,3,128,128] lhs=1,474,560 items (Backward) AlwaysOMP: 0.06784 ms Auto: 0.07268 ms (AlwaysOMP) NeverOMP: 1.05224 ms
[30,3,256,128] lhs=2,949,120 items (Backward) AlwaysOMP: 0.16028 ms Auto: 0.1608 ms (AlwaysOMP) NeverOMP: 2.80582 ms
******************************
Operators: elemwise_mul, _backward_mul for type: float
******************************
NeverOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
Auto
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
AlwaysOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
[1,1,28,28] lhs=784 items (Forward) Auto: 0.0003 ms (NeverOMP) NeverOMP: 0.0003 ms AlwaysOMP: 0.00924 ms
[1,3,28,28] lhs=2,352 items (Forward) NeverOMP: 0.00046 ms Auto: 0.00048 ms (NeverOMP) AlwaysOMP: 0.01122 ms
[50,1,18,32] lhs=28,800 items (Forward) NeverOMP: 0.00584 ms Auto: 0.00644 ms (NeverOMP) AlwaysOMP: 0.01956 ms
[25,3,64,64] lhs=307,200 items (Forward) AlwaysOMP: 0.03396 ms Auto: 0.03698 ms (AlwaysOMP) NeverOMP: 0.14916 ms
[10,3,128,128] lhs=491,520 items (Forward) AlwaysOMP: 0.02624 ms Auto: 0.02692 ms (AlwaysOMP) NeverOMP: 0.22936 ms
[20,3,128,128] lhs=983,040 items (Forward) Auto: 0.0719 ms (AlwaysOMP) AlwaysOMP: 0.0737 ms NeverOMP: 0.59254 ms
[30,3,128,128] lhs=1,474,560 items (Forward) Auto: 0.0669 ms (AlwaysOMP) AlwaysOMP: 0.06698 ms NeverOMP: 0.89104 ms
[30,3,256,128] lhs=2,949,120 items (Forward) AlwaysOMP: 0.21424 ms Auto: 0.23394 ms (AlwaysOMP) NeverOMP: 2.13982 ms
[1,1,28,28] lhs=784 items (Backward) NeverOMP: 0.00046 ms Auto: 0.00048 ms (NeverOMP) AlwaysOMP: 0.01952 ms
[1,3,28,28] lhs=2,352 items (Backward) NeverOMP: 0.00094 ms Auto: 0.00108 ms (NeverOMP) AlwaysOMP: 0.02082 ms
[50,1,18,32] lhs=28,800 items (Backward) NeverOMP: 0.00998 ms Auto: 0.01022 ms (NeverOMP) AlwaysOMP: 0.02198 ms
[25,3,64,64] lhs=307,200 items (Backward) AlwaysOMP: 0.03154 ms Auto: 0.03278 ms (AlwaysOMP) NeverOMP: 0.282 ms
[10,3,128,128] lhs=491,520 items (Backward) AlwaysOMP: 0.03334 ms Auto: 0.03664 ms (AlwaysOMP) NeverOMP: 0.45714 ms
[20,3,128,128] lhs=983,040 items (Backward) Auto: 0.05764 ms (AlwaysOMP) AlwaysOMP: 0.05954 ms NeverOMP: 1.06712 ms
[30,3,128,128] lhs=1,474,560 items (Backward) AlwaysOMP: 0.1036 ms Auto: 0.10514 ms (AlwaysOMP) NeverOMP: 1.91428 ms
[30,3,256,128] lhs=2,949,120 items (Backward) AlwaysOMP: 0.36086 ms Auto: 0.38696 ms (AlwaysOMP) NeverOMP: 5.02986 ms
******************************
Operators: elemwise_div, _backward_div for type: float
******************************
NeverOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
Auto
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
AlwaysOMP
Timing: 5 iterations of 10 calls, shape = [1,1,28,28] = 784 items
Timing: 5 iterations of 10 calls, shape = [1,3,28,28] = 2,352 items
Timing: 5 iterations of 10 calls, shape = [50,1,18,32] = 28,800 items
Timing: 5 iterations of 10 calls, shape = [25,3,64,64] = 307,200 items
Timing: 5 iterations of 10 calls, shape = [10,3,128,128] = 491,520 items
Timing: 5 iterations of 10 calls, shape = [20,3,128,128] = 983,040 items
Timing: 5 iterations of 10 calls, shape = [30,3,128,128] = 1,474,560 items
Timing: 5 iterations of 10 calls, shape = [30,3,256,128] = 2,949,120 items
[1,1,28,28] lhs=784 items (Forward) NeverOMP: 0.00036 ms Auto: 0.00042 ms (NeverOMP) AlwaysOMP: 0.01212 ms
[1,3,28,28] lhs=2,352 items (Forward) NeverOMP: 0.00062 ms Auto: 0.00064 ms (NeverOMP) AlwaysOMP: 0.01218 ms
[50,1,18,32] lhs=28,800 items (Forward) NeverOMP: 0.00596 ms Auto: 0.00716 ms (NeverOMP) AlwaysOMP: 0.01656 ms
[25,3,64,64] lhs=307,200 items (Forward) Auto: 0.02738 ms (AlwaysOMP) AlwaysOMP: 0.02802 ms NeverOMP: 0.14842 ms
[10,3,128,128] lhs=491,520 items (Forward) Auto: 0.02306 ms (AlwaysOMP) AlwaysOMP: 0.02316 ms NeverOMP: 0.22746 ms
[20,3,128,128] lhs=983,040 items (Forward) AlwaysOMP: 0.06104 ms Auto: 0.06242 ms (AlwaysOMP) NeverOMP: 0.5872 ms
[30,3,128,128] lhs=1,474,560 items (Forward) AlwaysOMP: 0.06408 ms Auto: 0.06498 ms (AlwaysOMP) NeverOMP: 0.85152 ms
[30,3,256,128] lhs=2,949,120 items (Forward) AlwaysOMP: 0.15306 ms Auto: 0.2167 ms (AlwaysOMP) NeverOMP: 2.14322 ms
[1,1,28,28] lhs=784 items (Backward) NeverOMP: 0.0006 ms Auto: 0.00062 ms (NeverOMP) AlwaysOMP: 0.02624 ms
[1,3,28,28] lhs=2,352 items (Backward) NeverOMP: 0.00118 ms Auto: 0.00148 ms (NeverOMP) AlwaysOMP: 0.02484 ms
[50,1,18,32] lhs=28,800 items (Backward) Auto: 0.01208 ms (NeverOMP) NeverOMP: 0.01306 ms AlwaysOMP: 0.03136 ms
[25,3,64,64] lhs=307,200 items (Backward) Auto: 0.0348 ms (AlwaysOMP) AlwaysOMP: 0.0355 ms NeverOMP: 0.3277 ms
[10,3,128,128] lhs=491,520 items (Backward) AlwaysOMP: 0.04238 ms Auto: 0.04294 ms (AlwaysOMP) NeverOMP: 0.53778 ms
[20,3,128,128] lhs=983,040 items (Backward) AlwaysOMP: 0.06548 ms Auto: 0.06602 ms (AlwaysOMP) NeverOMP: 1.33208 ms
[30,3,128,128] lhs=1,474,560 items (Backward) AlwaysOMP: 0.11048 ms Auto: 0.11306 ms (AlwaysOMP) NeverOMP: 2.29048 ms
[30,3,256,128] lhs=2,949,120 items (Backward) AlwaysOMP: 0.29378 ms Auto: 0.39178 ms (AlwaysOMP) NeverOMP: 5.6523 ms
Success rate for type float: 0.96875
[       OK ] OMP_TUNING.EvaluateTuneTestFloat (235243 ms)
[----------] 1 test from OMP_TUNING (235243 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (235243 ms total)
[  PASSED  ] 1 test.
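
The reported success rate looks consistent with the fraction of result records that carry no "Wrong OMP state selected" warning: the log has 6 operator groups x 8 shapes x 2 directions = 96 records and 3 warnings, and (96 - 3) / 96 = 0.96875. That reading is an inference from the numbers, not something the log states. The Python sketch below recomputes the rate under that assumption; the names `RECORD`, `TIMING`, `parse_records`, and `success_rate` are hypothetical helpers, not part of the test binary, and `SAMPLE` is a four-record excerpt copied from the relu results above.

```python
import re

# One result record: "[shape] lhs=N items (Forward|Backward)" followed by
# three timings in any order. The body runs until the next record, a row of
# asterisks (operator banner), the summary line, or the end of the text.
RECORD = re.compile(
    r"\[(?P<shape>[\d,]+)\]\s+lhs=(?P<items>[\d,]+)\s+items\s+"
    r"\((?P<direction>Forward|Backward)\)"
    r"(?P<body>.*?)(?=\[[\d,]+\]\s+lhs=|\*{5,}|Success rate|$)",
    re.S,
)

# One timing entry; only the Auto entry is followed by the mode it selected.
TIMING = re.compile(
    r"(Auto|NeverOMP|AlwaysOMP):\s+([\d.]+)\s+ms"
    r"(?:\s+\((NeverOMP|AlwaysOMP)\))?"
)


def parse_records(log_text):
    """Return one dict per result record found in the log text."""
    records = []
    for match in RECORD.finditer(log_text):
        body = match.group("body")
        times, chosen = {}, None
        for mode, ms, selected in TIMING.findall(body):
            times[mode] = float(ms)
            if mode == "Auto":
                chosen = selected
        records.append({
            "shape": match.group("shape"),
            "direction": match.group("direction"),
            "times_ms": times,
            "chosen": chosen,
            "warned": "Wrong OMP state selected" in body,
        })
    return records


def success_rate(records):
    """Fraction of records without a warning (assumed definition)."""
    warned = sum(1 for r in records if r["warned"])
    return (len(records) - warned) / len(records)


# Excerpt copied from the relu results in this log.
SAMPLE = """
[20,3,128,128] lhs=983,040 items (Forward) Auto: 0.02636 ms (AlwaysOMP) AlwaysOMP: 0.07012 ms NeverOMP: 0.38514 ms
[30,3,128,128] lhs=1,474,560 items (Forward) AlwaysOMP: 0.08456 ms Auto: 0.36562 ms (NeverOMP) NeverOMP: 0.63766 ms *** WARNING: Wrong OMP state selected ***
[30,3,256,128] lhs=2,949,120 items (Forward) Auto: 0.1412 ms (AlwaysOMP) AlwaysOMP: 0.15476 ms NeverOMP: 1.37026 ms
[1,1,28,28] lhs=784 items (Backward) Auto: 0.00032 ms (NeverOMP) NeverOMP: 0.00034 ms AlwaysOMP: 0.01186 ms
"""

if __name__ == "__main__":
    sample_records = parse_records(SAMPLE)
    print(len(sample_records), success_rate(sample_records))
```

Because the patterns match on whitespace rather than on line breaks, the same functions should work on the log whether it is stored one record per line or as a single flattened run of text.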