
Time to run N tests grows faster than linear in N: codegen problem? #15346

Closed
dpsanders opened this issue Mar 3, 2016 · 21 comments · Fixed by #19513
Labels: kind:regression (Regression in behavior compared to a previous version), performance (Must go faster)
Milestone: 0.5.x

Comments

@dpsanders
Contributor

The time to run N tests using FactCheck (which internally uses anonymous functions heavily) grows significantly faster than linear in N, causing long generated test suites to hang.
This is apparently due to #13412 (tested by comparing that commit against the previous one; thanks to @andreasnoack). The growth is approximately linear on 0.4.

I have written a simple file that generates test suites consisting of N tests of the form @fact i-->i:
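(The gist itself isn't reproduced on this page; the following is a rough reconstruction of that kind of generator, not the original code. The file name and the value of N are placeholders.)

```julia
# Reconstruction sketch (not the original gist): write a file containing N
# FactCheck tests of the form `@fact i --> i`, then time how long it takes
# to include (parse, lower, compile, and run) the generated suite.
function write_factcheck_tests(path::AbstractString, N::Integer)
    open(path, "w") do io
        println(io, "using FactCheck")
        println(io, "facts(\"Generated tests\") do")
        for i in 1:N
            println(io, "    @fact $i --> $i")
        end
        println(io, "end")
    end
end

write_factcheck_tests("generated_tests.jl", 500)
@time include("generated_tests.jl")   # requires the FactCheck package to be installed
```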

This gives the following run-times (in seconds!) on 0.5:
[Plot: test run times on 0.5]

[This seems to grow faster than a power law but slower than exponentially. Raw data here.]

@IainNZ
Member

IainNZ commented Mar 4, 2016

OT, but I liked this so much I ran this on 0.4.1 versus BaseTestNext:
Last 5 FactCheck times: 2.42, 2.63, 2.82, 2.98, 3.13
Last 5 BaseTestNext times: 1.33, 1.47, 1.70, 1.76, 1.94
I don't have a Julia 0.5 build to test BTN on, but I've added a comment to the linked gist with my code.

@IainNZ
Member

IainNZ commented Mar 4, 2016

Running it up to 1000 on Julia 0.4.1
[Plot: run times up to 1000 tests on Julia 0.4.1]

@dpsanders
Contributor Author

Thanks, that was useful. Here are the discouraging results for Base.Test on 0.5.
There is definitely an underlying problem floating around.
[Plot: Base.Test run times on 0.5]

@StefanKarpinski
Sponsor Member

@JeffBezanson: wild guess but could this have to do with the issue you were having with @testset on the jb/functions branch?

@vtjnash
Sponsor Member

vtjnash commented Mar 21, 2016

This appears to be partially related to #15262 (the nonlinear behavior of the lowering pass is much stronger than that of the other passes).

@JeffBezanson
Sponsor Member

I suspect the cl-convert pass. For each local method definition, it scans the whole enclosing function body looking for other definitions with the same name. This should instead be done with a pre-pass that populates a lookup table.
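(As an illustration only, not the actual flisp lowering code: the difference described here is a per-definition rescan, which is quadratic in the body size, versus a single pre-pass that builds a lookup table.)

```julia
# Illustration of the algorithmic difference (not the actual compiler code).
# `defs` stands in for the local method definitions in one function body.
defs = [(name = Symbol("f", i % 50), id = i) for i in 1:1000]

# Quadratic: for every definition, rescan the whole body for same-named ones.
same_named_quadratic(defs) = [count(d -> d.name == x.name, defs) for x in defs]

# Near-linear: one pre-pass builds a lookup table, so each later query is O(1).
function same_named_table(defs)
    table = Dict{Symbol,Int}()
    for d in defs
        table[d.name] = get(table, d.name, 0) + 1
    end
    return [table[x.name] for x in defs]
end

same_named_quadratic(defs) == same_named_table(defs)   # true
```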

@JeffBezanson
Sponsor Member

Ah, if this happens with lots of comprehensions then it's probably something other than cl-convert.

vtjnash added a commit that referenced this issue Apr 5, 2016
vtjnash added a commit that referenced this issue Apr 5, 2016
vtjnash added a commit that referenced this issue Apr 5, 2016
vtjnash added a commit that referenced this issue Apr 6, 2016
vtjnash added a commit that referenced this issue Apr 6, 2016
@dpsanders
Contributor Author

I was hoping that #16634 would fix this, but apparently not.

@StefanKarpinski added this to the 0.5.x milestone Sep 8, 2016
@StefanKarpinski added and then removed the status:help wanted label Oct 27, 2016
@vtjnash removed the compiler:lowering (syntax lowering, compiler front end, 2nd stage) label Nov 9, 2016
vtjnash added a commit that referenced this issue Nov 9, 2016
when merging two frames (such as when handling exception frames),
most of the time new is old, but the computation here is
exponential in the number of variables so we want the inner computation
here to be ultra-fast in the common case

fix #15346
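(For context, a rough sketch of the fast path that commit message describes; this is illustrative Julia, not the actual inference code, and `lattice_join` is a hypothetical stand-in for the real type join.)

```julia
# Illustrative sketch only (not the actual compiler code): when merging the
# per-variable states of two inference frames, the new state is usually
# identical to the old one, so a cheap identity check lets the common case
# skip the expensive lattice join entirely.
lattice_join(a, b) = Union{a, b}   # hypothetical stand-in for the real type join

function merge_states!(old::Vector{Any}, new::Vector{Any})
    changed = false
    for i in eachindex(old)
        new[i] === old[i] && continue          # ultra-fast common case: unchanged
        old[i] = lattice_join(old[i], new[i])  # slow path: actually merge the states
        changed = true
    end
    return changed
end

# Only the second slot differs, so only one join is performed.
merge_states!(Any[Int, Int, String], Any[Int, Float64, String])   # returns true
```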
@vtjnash removed the status:help wanted label Nov 16, 2016
vtjnash added a commit that referenced this issue Nov 17, 2016
fix #15346
@andreasnoack
Member

I think there is still an issue here.

[Screenshot: timing plot, Nov 30 2016]

@andreasnoack reopened this Nov 30, 2016
@StefanKarpinski
Sponsor Member

How does this compare to 0.5? I.e. did this at least improve it somewhat?

@andreasnoack
Member

It seems to have improved a lot relative to 0.5, but my 0.5 results are quite a bit worse than those previously reported in this issue; e.g., 500 tests take 385 seconds.
[Screenshot: timing plot comparing versions, Nov 30 2016]

@StefanKarpinski
Sponsor Member

Thanks, @andreasnoack. Good to know where we stand on all three versions. Seems like backporting the frontend speedup PR to 0.5 might be a good idea, and that some more improvement here would be even better.

vtjnash added a commit that referenced this issue Dec 1, 2016
fix #15346
vtjnash added a commit that referenced this issue Dec 3, 2016
fix #15346 (cherry-picked from ae680e8, PR #19388)
vtjnash added a commit that referenced this issue Dec 3, 2016
fix #15346 (cherry-picked from ae680e8, PR #19346)
@vtjnash
Sponsor Member

vtjnash commented Dec 4, 2016

In my measurements, https://github.com/JuliaLang/julia/tree/jn/frontend-cl-closure-perf is another ~10x faster. Does that close the performance gap for you on this test case?

@andreasnoack
Member

Unfortunately, I don't see a difference relative to master.

@vtjnash
Sponsor Member

vtjnash commented Dec 4, 2016

Maybe I'm looking at the wrong test case. Which one are you using?

@andreasnoack
Member

The one from the top post of this issue https://gist.github.com/anonymous/52b6599f6549c390b0e9.

@vtjnash
Sponsor Member

vtjnash commented Dec 4, 2016

I am seeing a net 2x speedup:

~/julia/zipyard$ time ../julia tests.jl
Facts
500 facts verified.

real	0m43.855s
user	0m42.506s
sys	0m0.668s

~/julia/zipyard$ time ../julia tests.jl
Facts
500 facts verified.

real	0m22.240s
user	0m21.681s
sys	0m0.455s

@andreasnoack
Member

I ran @IainNZ's updated version (in the same gist), except that I used Base.Test on 0.5 and 0.6. Have you tried that version?

@vtjnash
Sponsor Member

vtjnash commented Dec 6, 2016

Ah, OK. That one doesn't have any problem with lowering performance, but it does have an issue with inference performance for code with branches. In that case, #19513 should give us a pretty good speedup.
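(To make the "code with branches" point concrete: roughly speaking, each `@test` wraps evaluation of its expression in a try/catch so a thrown exception can be recorded as a result rather than aborting the test set, so a file with N generated tests contains on the order of N such blocks for inference to work through. The exact expansion is version-dependent; you can inspect it yourself.)

```julia
using Test
# Inspect the (version-dependent) expansion of a single @test.
macroexpand(Main, :(@test 1 == 1))
```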

fcard pushed a commit to fcard/julia that referenced this issue Feb 28, 2017
fix JuliaLang#15346
vtjnash added a commit that referenced this issue May 8, 2017
fix #15346 (cherry-picked from ae680e8, PR #19346)
@dpsanders
Contributor Author

On the latest master, the time for N tests is still rather nonlinear in N.
It grows faster than linearly; a power-law fit gives roughly N^1.6.

Gist with new version that writes out test times to a data file:
https://gist.github.com/dpsanders/200c10be8ede3259d37cd5000d15d014
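(The gist isn't inlined here; the sketch below is a reconstruction of that kind of driver, not the linked code. File names and the range of N are placeholders.)

```julia
# Reconstruction sketch (not the linked gist): generate files containing N
# `@test`s, time each include, and write "N seconds" pairs to a data file.
function write_base_tests(path, N)
    open(path, "w") do io
        println(io, "using Test")
        println(io, "@testset \"generated\" begin")
        for i in 1:N
            println(io, "    @test $i == $i")
        end
        println(io, "end")
    end
end

open("test_times.dat", "w") do out
    for N in 100:100:1000
        file = "generated_$(N).jl"
        write_base_tests(file, N)
        t = @elapsed include(file)
        println(out, "$N $t")
    end
end
```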

[Screenshot: timing plot on latest master, Feb 26 2018]

@KristofferC
Sponsor Member

I tried this now, running up to 1000 tests, and things still look very linear (and much, much faster than the original examples).
