This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Move Windows CI build to a 64-bit toolchain to fix 'out of heap space'. #15882

Status: Closed · wants to merge 18 commits · showing changes from 12 commits
4 changes: 2 additions & 2 deletions ci/build_windows.py
@@ -36,8 +36,8 @@
 from util import *

 KNOWN_VCVARS = {
-    'VS 2015': r'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\x86_amd64\vcvarsx86_amd64.bat',
-    'VS 2017': r'C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvarsx86_amd64.bat'
+    'VS 2015': r'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\vcvars64.bat',
+    'VS 2017': r'C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvars64.bat'
 }
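To illustrate the mechanism behind this change: the old x86_amd64 entries selected MSVC cross tools hosted as a 32-bit process, so cl.exe could exhaust its roughly 4 GB address space on large translation units (MSVC fatal error C1060, "compiler is out of heap space"); the amd64/vcvars64.bat entries select the 64-bit-hosted compiler, which has no such cap. A hypothetical sketch of how a CI script like ci/build_windows.py might use this table (make_build_cmd is an invented helper, not the actual code):

```python
# Sketch only: the vcvars lookup table after this PR. Each batch file sets up
# the MSVC environment; the amd64/vcvars64.bat variants run the compiler as a
# 64-bit host process, avoiding the 32-bit heap-space cap.
KNOWN_VCVARS = {
    'VS 2015': r'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\vcvars64.bat',
    'VS 2017': r'C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvars64.bat',
}

def make_build_cmd(vs_version, build_dir):
    """Compose a single shell command: initialize the MSVC env, then build.
    (Hypothetical helper for illustration.)"""
    vcvars = KNOWN_VCVARS[vs_version]
    return '"{}" && cmake --build {} --config Release'.format(vcvars, build_dir)

print(make_build_cmd('VS 2017', 'build'))
```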
4 changes: 2 additions & 2 deletions tests/python/unittest/test_operator.py
@@ -6844,7 +6844,7 @@ def test_laop_6():
     atol_fw = 1e-9
     num_eps = 1e-6
     rtol_bw = 1e-4
-    atol_bw = 1e-6
+    atol_bw = 5e-5

     data = mx.symbol.Variable('data')

@@ -6853,7 +6853,7 @@ def test_laop_6():
                                 atol=atol_fw, dtype=dtype)
     check_grad = lambda sym, location:\
         check_numeric_gradient(sym, location, numeric_eps=num_eps, rtol=rtol_bw,
-                               atol=atol_bw, dtype=dtype)
+                               atol=atol_bw, dtype=dtype, use_approx_grad=False)

     ## det(I + dot(v, v.T)) = 1 + dot(v.T, v) >= 1, so it's always invertible;
     ## det is away from zero, so the value of logdet is stable
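For context on why atol_bw was loosened from 1e-6 to 5e-5: a numeric gradient check compares a central-difference estimate against the analytic gradient, and the finite-difference error (truncation plus rounding noise scaled by 1/eps) can legitimately exceed a very tight absolute tolerance. A simplified standalone sketch of the idea, assuming a much-reduced version of what MXNet's check_numeric_gradient does internally:

```python
import numpy as np

def numeric_gradient(f, x, eps=1e-6):
    """Central-difference estimate of the gradient of scalar-valued f at x.
    Simplified sketch; the real check_numeric_gradient has more machinery."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy(); xp.flat[i] += eps
        xm = x.copy(); xm.flat[i] -= eps
        grad.flat[i] = (f(xp) - f(xm)) / (2 * eps)
    return grad

def check_grad(f, analytic_grad, x, rtol=1e-4, atol=5e-5, eps=1e-6):
    """Pass iff |numeric - analytic| <= atol + rtol * |analytic| elementwise,
    the comparison shape numpy.allclose uses. atol dominates wherever the
    analytic gradient is near zero, which is where loosening it matters."""
    num = numeric_gradient(f, x, eps)
    return np.allclose(num, analytic_grad(x), rtol=rtol, atol=atol)

x = np.array([0.5, -1.0, 2.0])
f = lambda v: float(np.sum(v ** 2))   # f(x) = sum(x_i^2)
g = lambda v: 2 * v                   # analytic gradient
assert check_grad(f, g, x)
```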
15 changes: 8 additions & 7 deletions tests/python/unittest/test_random.py
@@ -893,14 +893,15 @@ def compute_expected_prob():
 def test_shuffle():
     def check_first_axis_shuffle(arr):
         stride = int(arr.size / arr.shape[0])
-        column0 = arr.reshape((arr.size,))[::stride].sort()
+        column0 = arr.reshape((arr.size,))[::stride]
         seq = mx.nd.arange(0, arr.size - stride + 1, stride, ctx=arr.context)
-        assert (column0 == seq).prod() == 1
-        for i in range(arr.shape[0]):
-            subarr = arr[i].reshape((arr[i].size,))
-            start = subarr[0].asscalar()
-            seq = mx.nd.arange(start, start + stride, ctx=arr.context)
-            assert (subarr == seq).prod() == 1
+        assert (column0.sort() == seq).prod() == 1

Contributor:
Curious to know what difference it makes to move the sort?

Contributor Author (@DickJC123, Aug 15, 2019):
There are no random shapes in test_shuffle, so its runtime, even given the sort, should be fairly consistent. My only explanation for the runtime variation is that maybe there are multiple CPU runners on the same machine. Looking for confirmation from @marcoabreu.

Moving the sort was just a style preference, not performance driven. The biggest runtime savings came from introducing the if stride > 1: clause to avoid needless work for the last big 1D test input. The rest was just some further perf polishing.

I keep stumbling on flaky tests with this PR, so my last commit fixes laop_6. A bit of bad luck, I would say.

Contributor:
We run up to 4 jobs per windows-cpu machine, 1 job per windows-gpu machine.

+        # Check for ascending flattened-row sequences for 2D or greater inputs.
+        if stride > 1:
+            ascending_seq = mx.nd.arange(0, stride, ctx=arr.context)
+            equalized_columns = arr.reshape((arr.shape[0], stride)) - ascending_seq
+            column0_2d = column0.reshape((arr.shape[0], 1))
+            assert (column0_2d == equalized_columns).prod() == 1

# This tests that the shuffling is along the first axis with `repeat1` number of shufflings
# and the outcomes are uniformly distributed with `repeat2` number of shufflings.
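The rewritten check above can be read as two vectorized assertions: the leading element of each row, once sorted, must be exactly 0, stride, 2*stride, ...; and (for 2D-or-greater inputs) every row must be an ascending run starting at its leading element, replacing the deleted per-row loop. A standalone NumPy analogue of that logic (hypothetical, mirroring the mx.nd code rather than reproducing it):

```python
import numpy as np

def check_first_axis_shuffle(arr):
    """NumPy analogue of the test's check: arr must be np.arange(arr.size)
    reshaped to (rows, stride) with only its rows reordered."""
    stride = arr.size // arr.shape[0]
    flat = arr.reshape(arr.size)
    column0 = flat[::stride]                       # leading element of each row
    # Sorted leading elements must be exactly 0, stride, 2*stride, ...
    seq = np.arange(0, arr.size - stride + 1, stride)
    assert np.array_equal(np.sort(column0), seq)
    if stride > 1:
        # Every row must be its leading element plus 0..stride-1, the
        # vectorized equivalent of checking each row against an arange.
        expected = column0[:, None] + np.arange(stride)
        assert np.array_equal(arr.reshape(arr.shape[0], stride), expected)

arr = np.arange(12).reshape(4, 3)
arr = arr[np.random.permutation(4)]   # shuffle along the first axis only
check_first_axis_shuffle(arr)         # passes: rows intact, order changed
```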