Precompile nested functions #29
Probably we should take a look at … Numba is doing more damage than giving advantages... @scarrazza @scarlehoff do you have any advice?
Without knowing which functions you are actually trying to compile: as a general rule, try to have a single function that takes as arguments as many things as you can, compile that one, and pass everything as an argument. Another possible pitfall is having intermediate objects that do not allow numba to fully compile (basically, complicated Python objects, where "complicated" I think starts at dictionaries; I can't remember right now, though). For more information I guess I'd have to read the code you are trying to numba.
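A minimal sketch of that advice (the function and variable names here are hypothetical, not from this codebase): flatten the nesting into one jitted kernel that receives everything as plain arrays and scalars, instead of a function that reaches into closures or globals. The fallback decorator just makes the sketch runnable without numba installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to plain Python if numba is absent
    def njit(f):
        return f

# Everything the kernel needs arrives as a simple argument, so numba
# only ever sees arrays and scalars it can fully type.
@njit
def convolve_point(weights, values, norm):
    total = 0.0
    for i in range(weights.shape[0]):
        total += weights[i] * values[i]
    return total / norm

w = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
res = convolve_point(w, v, 2.0)  # (4 + 10 + 18) / 2 = 16.0
```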
Thanks, we thought that this could be the issue, and we will try to address it in some way. The main problem is that we are not really calling …
Ah, right; there the structure/organization of what is compiled, and when, was designed with eko's specific case in mind, which might be highly suboptimal here.
Well, I think not really: we're dealing with …
A quick way is checking where … (when used as a decorator it wouldn't be much of a problem, because I think we only use it like that in a few functions that are very basic; nothing fancy there).
@scarlehoff we made an attempt and discovered that we are compiling once per …
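If the problem is recompiling the same function once per process or run, numba's on-disk cache may help: `cache=True` is a real `njit` option that persists the compiled machine code between runs. A sketch with a made-up kernel (the fallback decorator is only there so the sketch runs without numba):

```python
try:
    from numba import njit
except ImportError:  # no-op stand-in when numba is absent
    def njit(*args, **kwargs):
        def wrap(f):
            return f
        return wrap

# cache=True writes the compiled code to disk, so subsequent runs (or
# subsequent worker processes, if they share the cache dir) skip the
# expensive compilation step.
@njit(cache=True)
def kernel(x):
    return x * x + 1.0

kernel(2.0)  # 5.0
```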
I think I put a flag to avoid compilation? Or maybe I just thought about putting it... I did it for vegasflow for sure. In any case, if compilation is a problem, parallelizing just recreates the problem in every thread; the right thing is not to compile. (Maybe I'm failing to understand the issue, but looking at the benchmark you linked before, it looks like what you want is to avoid compiling: lowering the time from 1 minute will create more problems than it solves.)
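One common shape for such a flag (this is a sketch: the `NODOJIT_DISABLE` variable name and `maybe_njit` wrapper are invented for illustration, though note that numba itself also honors the real `NUMBA_DISABLE_JIT=1` environment variable):

```python
import os

try:
    from numba import njit as _njit
except ImportError:
    _njit = None

# Hypothetical switch: set NODOJIT_DISABLE=1 during development to skip
# compilation entirely, trading slow execution for fast startup; leave it
# unset in tests so the compiled path stays exercised.
def maybe_njit(func):
    if _njit is None or os.environ.get("NODOJIT_DISABLE") == "1":
        return func  # leave the pure-Python function untouched
    return _njit(func)

@maybe_njit
def rosen(x, y):
    return (1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2

rosen(1.0, 1.0)  # 0.0 at the minimum
```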
The relevant timings are: …
So our current setup is …
Or, more generally: if and only if n_obs/n_bf > 5/3, it is worth running numba.
The common number of data points in a DIS-only PDF fit is ~3000.
Ok, so you are "losing" time in the trial benchmark, but "in real life" you would prefer to compile. Is that it? If that's the case, I would just avoid compilation during development (but keep it in tests to ensure it works).
Ok, if all those points are going to be included in the relevant fits, our threshold for the xgrid's size is about 2000, so we can safely say that if you are going to use an xgrid with fewer than 1000 points it is worth letting numba compile.
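Encoding the break-even rule quoted above as a tiny helper (hypothetical code; the 5/3 ratio and the point counts come from this thread's benchmark, so treat them as indicative rather than universal):

```python
# Compilation pays off iff n_obs / n_bf exceeds the measured break-even
# ratio (5/3 in the benchmark discussed in this thread).
def numba_worth_it(n_obs, n_bf, threshold=5.0 / 3.0):
    return n_obs / n_bf > threshold

worth_small_grid = numba_worth_it(3000, 1000)  # True: ratio 3 > 5/3
worth_big_grid = numba_worth_it(3000, 2000)    # False: ratio 1.5 < 5/3
```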
We updated the convolution function, using some clever caching to speed it up. Because of this the trade-off value also moved, mainly because we optimized the number of calls to … The updated trade-off report:
Then the new trade-off is: n_obs/n_bf > 5. Conclusion: still convenient, but exactly 3 times less so...
Ok, the discussion on compiling … We concluded that with the current structure of the code it is not possible to compile the coefficient functions inside Python, because they depend on external objects (…). @scarlehoff, if you have any idea about compiling sums and products of functions with …
Because it's Python, not JS! In any case, as I said, if timing is not a problem, over-optimizing is not worth it.
* Personally I'm ignoring it for now, but making eko run on GPU is in the "cool things to do" part of my TODO list (cc @scarrazza; they claim to be compatible with AMD cards).
** @felixhekhorn (I'm using GitHub as a Slack chat at this point) about eko: maybe we need to have a chat at some point to decide priorities, whether we want to keep it on the side for now or whether we want to start pushing forward to have it for the summer meeting (I'll be busy for a few weeks on n3fit-related stuff but will be much freer afterwards).
Probably, if in the end speed turns out to be an issue, the alternative to inlining lambdas is working out a proper computational graph with a suitable library. Does this mean …
Since we are using a bunch of lambdas and function calls inside other functions (cf. the full DistributionVec API), at some point this may slow down the execution.
Instead of finding a way to avoid this nesting and so on, rewriting stuff in a proper fashion with even more tricks, it would be nice to find a way to precompile the functions before using them (e.g. for integration in scipy.integrate.quad, where a lot of function calls are involved). Of course, if needed, this can involve rewriting the function definitions in a proper fashion, adding suitable decorators or whatsoever, but the goal is to avoid critically changing the structure because of this.
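On the scipy.integrate.quad point specifically: one known way to cut the per-evaluation overhead is to hand quad a compiled C callback via scipy.LowLevelCallable built from a numba cfunc, so the integrator calls native code instead of crossing the Python layer on every function evaluation. A sketch with a toy Gaussian integrand (not a function from this codebase); the plain-Python fallback keeps it runnable without numba:

```python
import math

from scipy import LowLevelCallable
from scipy.integrate import quad

try:
    from numba import cfunc

    # Compile the integrand once to a C function with the
    # "double (double)" signature that quad understands natively.
    @cfunc("float64(float64)")
    def _gauss_c(x):
        return math.exp(-x * x)

    integrand = LowLevelCallable(_gauss_c.ctypes)
except ImportError:
    # Pure-Python fallback: same result, with per-call overhead.
    def integrand(x):
        return math.exp(-x * x)

value, _err = quad(integrand, 0.0, 10.0)  # ≈ sqrt(pi)/2
```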
My idea of precompiling: of course I'm not an expert, otherwise I would have already figured out the proper way, but in the worst scenario it means defining a function that takes another one as input and goes through the calls, collecting everything in a single expanded expression, to be evaluated again as Python code (hopefully an external library can do this for us; I don't know whether numba or something else).
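As a note on the precompiling idea: numba can already compile closures when the captured values are simple scalars, so a factory that freezes the outer values into the jitted inner function is one cheap approximation of "expanding" the nesting before the hot loop runs. A hypothetical sketch (names invented, fallback decorator only so it runs without numba):

```python
try:
    from numba import njit
except ImportError:  # plain-Python stand-in when numba is absent
    def njit(f):
        return f

# The captured a and b are plain floats, so numba can freeze them into
# the compiled code; each call to the factory yields one specialized,
# jitted function with the nesting already resolved.
def make_coefficient(a, b):
    @njit
    def coefficient(x):
        return a * x * x + b
    return coefficient

f = make_coefficient(2.0, 1.0)
f(3.0)  # 2*9 + 1 = 19.0
```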