Performance and load tests for codespeed #3585
This sounds great to me. Codespeed requires some metadata about tests when uploading (what units are being reported, whether "less" is better for that metric, etc.), which motivated me to have each test module report that metadata when running the tests. I have two examples uploaded right now, but I'm open to alternative designs.
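As an illustration of what per-test metadata might look like, here is a minimal Julia sketch. The function name `run_benchmark`, the field names, and the CSV line are assumptions for this example, not the actual perf-suite or codespeed API.

```julia
# Hypothetical sketch: a benchmark that reports its own codespeed-style
# metadata alongside its timing. Field names are illustrative only.
function run_benchmark()
    t = @elapsed sum(rand(10^6))      # time the operation of interest
    return (name = "vector_sum",
            value = t,
            units = "seconds",        # what the reported number means
            lessisbetter = true)      # direction codespeed should assume
end

r = run_benchmark()
println("$(r.name),$(r.value),$(r.units),$(r.lessisbetter)")
```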
Fairly obvious, but there's no point in benchmarking the non-Julia code in perf over and over again. We should only be running the Julia benchmarks through codespeed.
Definitely. If you like, however, we can measure the non-Julia code once and provide it as a baseline for certain benchmarks.
Yes, that seems like a good idea. We really only need to know how we're doing relative to C, which is nice because that's the benchmark that takes the least time. We certainly don't want to be running the Octave benchmark every time.
I think a second, less glamorous and somewhat tedious task is to work back through closed performance-related issues and make sure we have test coverage for them, to avoid reintroducing performance regressions.
@StefanKarpinski is the C code you're referring to just what is contained within the …?
We don't actually have C versions of all of these benchmarks. Maybe that's ok, or maybe we should write C versions of everything. Generally, I like to use C as the gold standard, but that's a lot of tedious work.
So when you refer to the C code, the Octave benchmark, etc., what exactly do you mean?
I meant the home page microbenchmarks that we have versions of in seven languages. The Octave benchmark takes a really long time to run.
I have been thinking of having a base comparison, which can be either C or Matlab. It is not fun to write the cat benchmark in C, and it would not even be a meaningful comparison. All the shootout benchmarks do have C versions available. I have to check whether they have Matlab versions too.
I don't think it's critical to have C versions of all of the benchmarks, although some would be nice. The main point of this is to prevent performance regressions and track improvements, no?
Yes, that's true. The main point of having C versions is to know how good we could get if we really nail it. Certainly for things like hcat and vcat, it doesn't make much sense to have C versions.
The way it's set up right now, the number of branches getting run in codespeed is pretty crazy. The "Changes" and "Timeline" views can only track …. Does it make sense to maintain a whitelist of branches we will track? Something like only ….
Yes, we should only track ….
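For illustration, here is a minimal sketch of how a branch whitelist could gate the codespeed upload step. The `TRACKED_BRANCHES` list, the `should_upload` helper, and the `BUILD_BRANCH` environment variable are assumptions for this example, not part of the actual setup or what was decided in this thread.

```julia
# Hypothetical sketch of gating the codespeed upload on a branch whitelist.
const TRACKED_BRANCHES = ["master"]        # assumed; adjust to the real list

should_upload(branch::AbstractString) = branch in TRACKED_BRANCHES

branch = get(ENV, "BUILD_BRANCH", "")      # e.g. provided by the CI job
if should_upload(branch)
    println("uploading results for branch $branch to codespeed")
else
    println("skipping codespeed upload for untracked branch $branch")
end
```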
I was trawling through julia-users trying to find examples of code that would make for good tests, but was struggling a bit to navigate all the existing tests to avoid duplication, especially the micro ones. Perhaps another to-do is to have a catalog of the tests?
I'm going to go ahead and close this; we can discuss further codespeed work on the mailing list or in more focused issues.
Now that we have codespeed integration for performance testing coming into place, we should start running a larger number of codes. To start with, we should make all the stuff in perf, perf2, and load run uniformly and produce consistent output. Over time, we can even start including some packages as part of the performance measurement. @staticfloat, what do you think?