Add benchmark suite #85
Initially, I'm not sure that a traditional performance-over-time benchmark suite is what we need. Instead, I had in mind something that compares the performance of `MArray` against `numpy.ma`:

```python
import time, json

import numpy as xp
from marray import numpy as mxp
from test_marray import (arithmetic_unary, elementwise_unary, elementwise_binary,
                         statistical_array, utility_array, searching_array,
                         comparison_binary, get_arrays)

seed = 64379182864537915
data = {}


class Timeit:
    """Context manager that records elapsed wall time (ns) into `data`."""
    def __init__(self, f_name, array_type):
        self.f_name = f_name
        self.array_type = array_type
        data[f_name] = data.get(f_name, {})

    def __enter__(self):
        self.tic = time.perf_counter_ns()

    def __exit__(self, type_, value, traceback):
        self.toc = time.perf_counter_ns()
        data[self.f_name][self.array_type] = self.toc - self.tic


# Functions exposed as dicts of name -> callable
for n, fdict in [(1, arithmetic_unary), (2, comparison_binary)]:
    for f_name, f in fdict.items():
        marrays, masked_arrays, seed = get_arrays(n, shape=(1000, 1000), ndim=2,
                                                  dtype='float64', xp=xp, seed=seed)
        with Timeit(f_name, 'MArray'):
            res = f(*marrays)
        with Timeit(f_name, 'masked_array'):
            ref = f(*masked_arrays)

# Functions looked up by name on each namespace
for n, flist in [(1, elementwise_unary), (2, elementwise_binary)]:
    for f_name in flist:
        marrays, masked_arrays, seed = get_arrays(n, shape=(1000, 1000), ndim=2,
                                                  dtype='float64', xp=xp, seed=seed)
        f = getattr(mxp, f_name)
        with Timeit(f_name, 'MArray'):
            res = f(*marrays)
        f = getattr(xp, f_name)
        with Timeit(f_name, 'masked_array'):
            ref = f(*masked_arrays)

# Reductions and sorts, applied along the last axis
for f_name in (statistical_array + utility_array + searching_array
               + ["sort", "argsort"]):
    marrays, masked_arrays, seed = get_arrays(1, shape=(1000, 1000), ndim=2,
                                              dtype='float64', xp=xp, seed=seed)
    f = getattr(mxp, f_name)
    with Timeit(f_name, 'MArray'):
        res = f(*marrays, axis=-1)
    f = getattr(xp, f_name)
    with Timeit(f_name, 'masked_array'):
        ref = f(*masked_arrays, axis=-1)

# print(json.dumps(data, indent=4))
for fun_name, d in data.items():
    print(f"{fun_name}: {d['MArray'] / d['masked_array']}")
```

And we get:
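The timing pattern above can be illustrated standalone, without the `test_marray` helpers. This is just a sketch of the same `Timeit` context manager applied to a single operation, comparing `np.sum` on a plain ndarray against `np.ma.sum` on a masked array; the array shapes and mask density here are arbitrary choices, not the repo's actual test setup:

```python
import time

import numpy as np

data = {}


class Timeit:
    """Context manager that records elapsed wall time (ns) into `data`."""
    def __init__(self, f_name, array_type):
        self.f_name = f_name
        self.array_type = array_type
        data.setdefault(f_name, {})

    def __enter__(self):
        self.tic = time.perf_counter_ns()

    def __exit__(self, type_, value, traceback):
        data[self.f_name][self.array_type] = time.perf_counter_ns() - self.tic


rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 1000))
mask = rng.random((1000, 1000)) > 0.5          # mask roughly half the elements
mx = np.ma.masked_array(x, mask=mask)

with Timeit('sum', 'ndarray'):
    np.sum(x, axis=-1)
with Timeit('sum', 'masked_array'):
    np.ma.sum(mx, axis=-1)

# Ratio > 1 means the masked-array version was slower
for f_name, d in data.items():
    print(f"{f_name}: {d['masked_array'] / d['ndarray']:.2f}x")
```

A single run like this is noisy; the full script amortizes that somewhat by covering many functions, but repeating each call (as `timeit` does) would give steadier ratios.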
In some cases the masked array performance is significantly better (e.g. [...]). I didn't include [...]. The only other thing that stands out to me is [...]. Should we include something like this in the repo, or should this just be used to look for bottlenecks?
I don't see a strong reason to include it in the repo. If someone reports a performance issue and you notice that a regression occurred, then maybe it would be worth including it from then onwards. But hopefully we can avoid introducing any significant regressions.