Add benchmark suite #85

Open
mdhaber opened this issue Jan 28, 2025 · 2 comments

mdhaber (Owner) commented Jan 28, 2025

No description provided.

mdhaber (Owner, Author) commented Feb 1, 2025

Initially, I'm not sure that a traditional performance-over-time benchmark suite is what we need. Instead, I had in mind something that compares the performance of marray.numpy against np.ma. I threw together a script that follows worst practices for timing, but I think it's good enough for a rough comparison.

import numpy as xp
from marray import numpy as mxp
import time, json
from test_marray import (arithmetic_unary, elementwise_unary, elementwise_binary,
                         statistical_array, utility_array, searching_array,
                         comparison_binary, get_arrays)

seed = 64379182864537915

data = {}

class Timeit:
    """Context manager that records the wall-clock time of its body in `data`."""
    def __init__(self, f_name, array_type):
        self.f_name = f_name
        self.array_type = array_type
        data.setdefault(f_name, {})

    def __enter__(self):
        self.tic = time.perf_counter_ns()

    def __exit__(self, type_, value, traceback):
        self.toc = time.perf_counter_ns()
        data[self.f_name][self.array_type] = self.toc - self.tic

# Operators: the callables from the test suite apply the operator directly.
for n, fdict in [(1, arithmetic_unary), (2, comparison_binary)]:
    for f_name, f in fdict.items():
        marrays, masked_arrays, seed = get_arrays(n, shape=(1000, 1000), ndim=2,
                                                  dtype='float64', xp=xp, seed=seed)

        with Timeit(f_name, 'MArray      '):
            res = f(*marrays)
        with Timeit(f_name, 'masked_array'):
            ref = f(*masked_arrays)

# Elementwise functions: look each function up by name in the two namespaces.
for n, flist in [(1, elementwise_unary), (2, elementwise_binary)]:
    for f_name in flist:
        marrays, masked_arrays, seed = get_arrays(n, shape=(1000, 1000), ndim=2,
                                                  dtype='float64', xp=xp, seed=seed)

        f = getattr(mxp, f_name)
        with Timeit(f_name, 'MArray      '):
            res = f(*marrays)

        f = getattr(xp, f_name)
        with Timeit(f_name, 'masked_array'):
            ref = f(*masked_arrays)

# Reductions and sorting functions, applied along the last axis.
for f_name in (statistical_array + utility_array + searching_array
               + ["sort", "argsort"]):
    marrays, masked_arrays, seed = get_arrays(1, shape=(1000, 1000), ndim=2,
                                              dtype='float64', xp=xp, seed=seed)

    f = getattr(mxp, f_name)
    with Timeit(f_name, 'MArray      '):
        res = f(*marrays, axis=-1)

    f = getattr(xp, f_name)
    with Timeit(f_name, 'masked_array'):
        ref = f(*masked_arrays, axis=-1)

# print(json.dumps(data, indent=4))
for fun_name, d in data.items():
    print(f"{fun_name}: {d["MArray      "]/d["masked_array"]}")

And we get (MArray time divided by masked_array time, so values below 1.0 mean MArray is faster):

+x: 0.9838730911005793
-x: 1.0474080401473493
abs: 1.2828895849647612
x < y: 0.9012968299711815
x <= y: 0.7426002248032971
x > y: 0.7703748216029445
x >= y: 0.7435935973547757
x == y: 0.37816048448145345
x != y: 0.39428777439512896
acos: 0.7645574162915288
acosh: 0.7382654813789811
asin: 0.7515523869629358
asinh: 1.0087446985265183
atan: 1.1043581144194075
atanh: 0.7536458398717836
ceil: 1.405304228619635
conj: 1.3806090922465795
cos: 1.0500133540403391
cosh: 1.0663130170823165
exp: 1.0963265755031713
expm1: 1.0764397343405954
floor: 1.2214679260133805
imag: 0.3570015447573105
isfinite: 1.7262647262647262
isinf: 2.023948836576405
isnan: 2.0267314702308625
log: 0.693747527251395
log1p: 1.0310941309235633
log2: 0.7363454922065926
log10: 0.6914929221476402
logical_not: 1.435344172618143
negative: 1.3840529026514912
positive: 1.082298433320596
real: 0.3277566120037546
round: 0.936799285643417
sign: 1.1576075371957197
signbit: 1.671334431630972
sin: 1.0721085368227976
sinh: 1.016884504255182
square: 1.425114130135856
sqrt: 0.20736540900128134
tan: 0.3768382246050252
tanh: 1.052388229808838
trunc: 1.2194134491952207
add: 1.2221301826742816
atan2: 1.03907157234652
copysign: 1.0840591843634784
divide: 0.3149407017972857
equal: 1.3578900839840873
floor_divide: 0.8377731558513588
greater: 1.2551762288604393
greater_equal: 1.1378729915837797
hypot: 1.075618209691395
less: 1.3647252582721965
less_equal: 1.1959523809523809
logaddexp: 1.0095855814487429
logical_and: 1.3190449438202247
logical_or: 1.1736838762938973
logical_xor: 1.166481569622931
maximum: 1.1349277528649726
minimum: 1.2298706917271547
multiply: 1.182573738651249
not_equal: 1.3796389556054853
pow: 1.0043532645583413
remainder: 0.837134005921702
subtract: 1.2727363432277377
cumulative_sum: 2.7529326666037677
max: 1.0779800003129842
mean: 1.1520927166868065
min: 1.0162735849056603
prod: 0.9765995531572965
std: 0.9688674599917931
sum: 1.0198767260738688
var: 0.9740312224714611
all: 0.8646909999691462
any: 0.9109962991462387
argmax: 1.0314792381721516
argmin: 1.0189545362151908
sort: 0.6495999554596337
argsort: 0.8117180126801875

In some cases masked array performance is significantly better (e.g. isfinite, isinf, isnan). I think these are quite fast operations and the ufunc machinery is saving the day or something, because the operation is just as fast on a masked array as on the unmasked arrays. We can try to reduce overhead, but we can't eliminate it entirely the way np.ma can in these cases.
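
For what it's worth, here's a minimal sketch of how to check that claim, timing the same ufunc on a plain ndarray and on a masked array (shape, mask, and repeat count are arbitrary choices; numbers will vary by machine):

import timeit
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 1000))
xm = np.ma.masked_array(x, mask=x < 0)

# If the ufunc machinery handles the mask for free, these two times should
# be about equal; any gap is the per-call overhead np.ma adds.
t_plain = timeit.timeit(lambda: np.isfinite(x), number=100)
t_masked = timeit.timeit(lambda: np.isfinite(xm), number=100)
print(f"plain: {t_plain:.4f} s, masked: {t_masked:.4f} s")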

I didn't include @ or matmul here because NumPy's masked array matmul behavior is just wrong; you have to use np.ma.dot instead. Performance there is comparable, but we're slightly slower (~20%).
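
To illustrate what I mean, a small example (at least with current NumPy, np.ma.dot with its default strict=False treats masked entries as zero, while @ falls through to plain matmul on the underlying data):

import numpy as np

a = np.ma.masked_array([[1.0, 2.0], [3.0, 4.0]],
                       mask=[[False, True], [False, False]])
b = np.ma.masked_array(np.eye(2))

print(np.ma.dot(a, b))  # masked entry treated as 0
print(a @ b)            # the masked entry's underlying value (2.0) leaks through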

The only other thing that stands out to me is cumulative_sum. I haven't looked into why we're slower there.
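
If anyone wants to dig in, a one-off timing of just that function should roughly reproduce the 2.75 ratio above. A sketch reusing get_arrays from the test suite (any integer seed should do):

import timeit
import numpy as xp
from marray import numpy as mxp
from test_marray import get_arrays

marrays, masked_arrays, _ = get_arrays(1, shape=(1000, 1000), ndim=2,
                                       dtype='float64', xp=xp, seed=0)

t_marray = timeit.timeit(lambda: mxp.cumulative_sum(marrays[0], axis=-1), number=10)
t_ma = timeit.timeit(lambda: xp.cumulative_sum(masked_arrays[0], axis=-1), number=10)
print(f"MArray / masked_array: {t_marray / t_ma:.2f}")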

Should we include something like this in the repo, or should this just be used to look for bottlenecks?

lucascolley (Collaborator) commented

I don't see a strong reason to include it in the repo. If someone reports a performance issue and you notice that a regression occurred, then maybe it would be worth including it from then onwards. But hopefully we can avoid introducing any significant regressions.
