Create new Benchmark Repository for mathics core #1563
From what I have understood, each function would need to be benchmarked manually.
Yes, but I suspect it would be good to form groups of builtin functions that are roughly comparable, for example the calculator-like functions. So as a side thing, we may publish a table of the relative costs of various built-in functions. Note that although it is well known that in machine code multiplication is much slower than addition, from the Mathics standpoint we want to group "Plus" and "Times" together and separate them from functions like "N" and "Plot3D", so that we can see both the difference in speed between "Plus" and "Times" and the difference between that group and "Plot3D", which I suspect will be at least an order of magnitude slower. I also suspect one can get a rough sense of which functions are comparable, or should be grouped together, by looking at the Wolfram Mathematica "Guide" sections such as https://reference.wolfram.com/language/guide/ArithmeticFunctions.html
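A sketch of what such a grouping might look like, purely as an illustration: the group names and members below are assumptions drawn loosely from the Wolfram guide pages, not a proposal for the actual taxonomy.

```python
# Sketch of grouping builtins into roughly comparable cost classes.
# Group names and members are illustrative only; a real table would be
# driven by the Wolfram "Guide" pages and by measurement.
BENCHMARK_GROUPS = {
    "arithmetic": ["Plus", "Times", "Subtract", "Divide", "Power"],
    "numeric-evaluation": ["N", "Round", "Chop"],
    "plotting": ["Plot", "Plot3D", "ListPlot"],
}

def group_of(builtin_name: str) -> str:
    """Return the cost group a builtin belongs to, or 'ungrouped'."""
    for group, members in BENCHMARK_GROUPS.items():
        if builtin_name in members:
            return group
    return "ungrouped"
```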
OK, first question: what are we trying to compare? The master branch against previous versions? The master branch against certain branches with PRs? Once we have that, the first thing I would like to benchmark is the time and memory that loading everything at startup requires.
These aspects impose the largest performance handicap on the core; specific functions (like …) are a secondary concern.
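As a rough illustration of how that first measurement might be taken, here is a minimal sketch using only the standard library. The module name "mathics.builtin" is an assumption about where most of the startup cost lives; substitute whatever is actually being measured.

```python
# Minimal sketch: measure the wall-clock time and peak memory of a single
# heavyweight import. "mathics.builtin" is an assumed target, not a
# statement about where the cost actually is.
import importlib
import time
import tracemalloc

tracemalloc.start()
t0 = time.perf_counter()
importlib.import_module("mathics.builtin")  # assumed heavyweight import
elapsed = time.perf_counter() - t0
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"load time: {elapsed:.3f}s, peak memory: {peak / 1e6:.1f} MB")
```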
Yes - the master branch versus PRs, especially the ones that purport to make things faster. And in those benchmarks, it is okay to add more benchmarks that highlight the specific kinds of improvements. The benchmarks, however, may be scrutinized on their own merits for validity. For example, suppose I add a custom rule for 7 * 8 = 42. I may get some sort of improvement here, but the likelihood that it comes up is rare. Insight is also an important aspect of benchmarking, so any benchmark that gives insight is also important.
I've looked into this and the short answer is no. (This is not a problem specific to Mathics; it is something that others have looked at for Python more generally, and no one has found a magic bullet.) For now, the simplest thing for you to do here is to run using Pyston 2.3. At some later date we'll address the long startup-time loading problem, but that is a separate problem.
Great - create any and all benchmarks you want.
(The difference between arbitrary-precision addition and multiplication in the MMA system is not a factor of two but more like a low-order polynomial, but that is beside the point. The point was to be able to understand the cost of things, which includes the different kinds of calls.) There are many areas from which to attack the problem. If you want to follow one path and someone else wants to follow another, that's okay. One of the facets I am exploring is avoiding pattern matching in function calls altogether by caching the method that was found on the last call. Initially I am contemplating this only for things inside the Python code, but depending on how that goes it could be expanded to other things.
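A minimal sketch of that last-call caching idea, under the assumption that rule lookup is otherwise a linear scan over candidate rules. The Rule class here is a hypothetical stand-in, not the mathics-core API; only the shape of the fast path versus the full scan is the point.

```python
# Hedged sketch of "cache the rule that matched on the last call".
# Rule below is a toy stand-in for whatever mathics-core uses internally.

class Rule:
    """Toy rule: a match predicate plus a rewrite function."""
    def __init__(self, matches, apply):
        self.matches = matches
        self.apply = apply

class RuleCache:
    def __init__(self):
        self._last_hit = {}  # head name -> rule that matched most recently

    def apply(self, head, expr, rules):
        cached = self._last_hit.get(head)
        if cached is not None and cached.matches(expr):
            return cached.apply(expr)      # fast path: skip the scan
        for rule in rules:                 # slow path: full pattern scan
            if rule.matches(expr):
                self._last_hit[head] = rule
                return rule.apply(expr)
        return expr                        # no rule applied

# Usage: repeated calls with the same kind of argument hit the cache.
cache = RuleCache()
int_rule = Rule(lambda e: isinstance(e, int), lambda e: e + 1)
print(cache.apply("Plus", 41, [int_rule]))  # full scan, then caches int_rule
print(cache.apply("Plus", 1, [int_rule]))   # served from the cache
```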
Before and as we improve performance, it is crucial to have a way to measure what is going on, how bad things are, and where we need to improve most.
Key to this is having an understanding of the cost of each of the thousand or so primitives.
At first I thought, hey we could hook this into the doctests we already have since those iterate over each primitive. (Note that the doctest portion of Mathics core is slated to get put in its own repository.)
But this is a bad idea because the focus of doctests is documentation examples, not tests. And benchmark tests are not generic tests either. A big mistake we've seen over and over is trying to cram everything into a single piece of code or code base.
So instead, I propose an entirely new repository for this. The benchmark tests could be at several levels.
For example, I understand MMA uses a program called Benchmarking.m. But to start out with, at the lowest level we should benchmark individual built-in functions for each of the kinds of signatures and patterns that each has.
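As an illustration of what per-signature cases might look like for a single builtin, here is a hypothetical list for Plus; the expressions are examples, not an exhaustive enumeration of its patterns.

```python
# Illustrative only: benchmark each kind of call signature of one builtin
# separately, since costs can differ a lot between them.
PLUS_SIGNATURE_CASES = [
    "1 + 2 + 3",            # machine integers
    "10^50 + 10^60",        # big integers
    "1.5 + 2.25 + 3.75",    # machine reals
    "a + b + a + c",        # symbolic terms (ordering and merging)
]
```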
Some built-in functions are very slow, like some of the plotting routines. So as part of the test data (the expression to check) we probably also want to store the number of times to run the expression. For built-in functions like Times, this benchmark iteration number will be high, while for operations like Plot3D (which in fact is probably calling a number of other builtins) this number will be low. In fact it might be useful to separate the really primitive built-in functions, e.g. Times, from the complex, compound (and slow) ones, e.g. Plot3D.
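A rough sketch of such a lowest-level harness, with the iteration count stored alongside each test expression. MathicsSession and session.evaluate are assumed here to be the evaluation entry point, and the expressions and counts are placeholders.

```python
# Sketch of a per-builtin benchmark loop: each case carries its own
# iteration count, high for cheap builtins, low for expensive ones.
import time

CASES = [
    ("2 * 3 * 5 * 7", 10000),                       # cheap: Times
    ("Plot3D[Sin[x y], {x, 0, 3}, {y, 0, 3}]", 3),  # expensive: Plot3D
]

def run_benchmarks(evaluate):
    results = {}
    for expression, iterations in CASES:
        start = time.perf_counter()
        for _ in range(iterations):
            evaluate(expression)
        elapsed = time.perf_counter() - start
        results[expression] = elapsed / iterations  # mean seconds per run
    return results

if __name__ == "__main__":
    from mathics.session import MathicsSession  # assumed entry point
    session = MathicsSession()
    for expr, mean in run_benchmarks(session.evaluate).items():
        print(f"{mean * 1000:10.3f} ms  {expr}")
```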
Where we save data from runs is a whole issue in and of itself.
Also, having programs to graph the performance over time is a topic for discussion in and of itself.