# fix: Memory leak #282

## Conversation
```python
import ConfigSpace.hyperparameters as CSH
import numpy as np

rnd = np.random.RandomState(19937)

# This gets to the for loop before hanging
# a = CSH.UniformIntegerHyperparameter('a', lower=1, upper=2147483647, log=True)

# This hangs before the prints
a = CSH.NormalIntegerHyperparameter('a', mu=10, sigma=500, lower=1, upper=2147483647, log=True)

print(a, flush=True)
print(rnd, flush=True)

for i in range(1, 10000):
    a.get_neighbors(0.031249126501512327, rnd, number=8, std=0.05)
```

See #283 for the cause.
Possibly unrelated error. It doesn't cause memory to explode, but it gets stuck in an endless loop that can't be killed with a KeyboardInterrupt (Ctrl+C).

```python
import ConfigSpace.hyperparameters as CSH
import numpy as np

rnd = np.random.RandomState(19937)
a = CSH.NormalIntegerHyperparameter('a', mu=10, sigma=500, lower=1, upper=1000, log=True)

for i in range(1, 10000):
    a.get_neighbors(0.031249126501512327, rnd, number=8)
```
Back to the original memory overflow:

```python
number = 5  # slow but fine
for i in range(1, 10000):
    a.get_neighbors(0.031249126501512327, rnd, number=number, std=0.05)

number = 6  # Suddenly blows up memory and unkillable process
for i in range(1, 10000):
    a.get_neighbors(0.031249126501512327, rnd, number=number, std=0.05)
```
When querying a large range for a UniformIntegerHyperparameter with a small std. deviation and a log scale, this could cause an infinite loop: the reachable neighbors are quickly exhausted, yet rejection sampling keeps sampling until some arbitrary termination criterion. Why this was causing a memory leak, I'm not entirely sure. The solution now is that if we have seen a sampled value before, we simply take the one "next to it".
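As a rough sketch of that idea (the name and signature here are hypothetical, not the PR's actual code), the dedup strategy could look like:

```python
# Hypothetical sketch of the "take the one next to it" strategy;
# not the PR's actual implementation.
def nearest_unseen_neighbor(value: int, seen: set, lower: int, upper: int) -> int:
    """Walk outwards from `value` until we hit an integer in [lower, upper]
    that has not been sampled yet."""
    for offset in range(1, upper - lower + 1):
        for candidate in (value - offset, value + offset):
            if lower <= candidate <= upper and candidate not in seen:
                return candidate
    raise ValueError("all integers in [lower, upper] have been seen")
```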
Replaced usages of arange with a chunked version to prevent the memory blowup. However, this is still incredibly slow and needs a more refined solution, as a huge number of values must be computed for what could possibly be derived analytically.
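For illustration, a chunked arange can be sketched as a generator; the real `arange_chunked` in this PR may differ in details such as chunk size and negative-range handling:

```python
import numpy as np

def arange_chunked(start: int, stop: int, chunk_size: int = 1_000_000):
    """Yield np.arange(start, stop) in bounded-size pieces so the full
    range never has to be materialized in memory at once."""
    for chunk_start in range(start, stop, chunk_size):
        yield np.arange(chunk_start, min(chunk_start + chunk_size, stop))
```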
Codecov Report: Base: 67.64% // Head: 67.97% // Increases project coverage by +0.32%.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #282      +/-   ##
==========================================
+ Coverage   67.64%   67.97%   +0.32%
==========================================
  Files          24       25       +1
  Lines        1768     1786      +18
==========================================
+ Hits         1196     1214      +18
  Misses        572      572
```
☔ View full report at Codecov.
```python
def get_num_neighbors(self, value = None) -> int:
    return self.upper - self.lower
```

```python
def get_neighbors(
```
I like this new implementation. It is very lean and we should try to use it for the other hyperparameters, too (without the rounding of course).
I thought I would just deal with the memory issues for now; it's quite possible we could unite the neighbor-generating algorithms into one lean function rather than having many similar implementations.
Yes, I agree. This should be a separate PR. I will leave this open to remember to open an issue on this when this PR is done.
I fixed the compiler directives to actually be active.
mfeurer left a comment:
This now looks great and I'd be happy to merge it. Unfortunately, it's a bit slower generating neighbors for the auto-sklearn search space. Do you think you could re-add the cython annotations to make this fast again?
RE Windows/wraparound: could you change the flag for just that one file?
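For reference, Cython does support scoping a directive to a single file via a header comment, or setting it at build time; a sketch under the assumption that `wraparound` is the directive in question:

```python
# Option 1: a header comment at the very top of the one .pyx file:
#   # cython: wraparound=True

# Option 2: set it when building the extensions (setup.py sketch):
from setuptools import Extension, setup
from Cython.Build import cythonize

extensions = [Extension("ConfigSpace.util", ["ConfigSpace/util.pyx"])]
setup(ext_modules=cythonize(extensions, compiler_directives={"wraparound": True}))
```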
ConfigSpace/util.pyx (outdated):

```python
# OPTIM: To prevent large memory allocations, we place an upperbound on the maximum
# amount of neighbors to be sampled
MAX_NEIGHBORHOOD = 10_000
```
There was an issue here where the number of requested neighbors for a UniformIntegerHyperparameter was set to the full possible range. This would, firstly, blow up memory as it tried to generate every possible neighbor and, secondly, waste work since only a fraction of those neighbors were ever used.
This constant can be seen in use later on in this PR.
ConfigSpace/util.pyx (outdated):

```diff
  neighbors = finite_neighbors_stack.get(hp_name, [])
  if len(neighbors) == 0:
      if isinstance(hp, UniformIntegerHyperparameter):
+         _n_neighbors = min(n_neighbors_per_hp[hp_name], MAX_NEIGHBORHOOD)
          neighbors = hp.get_neighbors(
-             value, random,
-             number=n_neighbors_per_hp[hp_name], std=stdev,
+             value,
+             random,
+             number=_n_neighbors,
+             std=stdev
```
There's a line further below this block, `neighbors.pop()`. After fixing an issue with `UniformIntegerHyperparameter::get_num_neighbors`, this caused the benchmark to fail: there was some parameter with only 3 possible neighbors which must have been sampled a few times, causing it to run out of neighbors during this procedure.
Now it will just generate a new set of neighbors if the previous set has been exhausted by the `neighbors.pop()` calls.
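In other words, the pattern is roughly as follows (a sketch reusing the names from the diff above, not the exact code):

```python
# Sketch: refill the per-hyperparameter neighbor stack once it is exhausted,
# instead of popping from an empty list.
neighbors = finite_neighbors_stack.get(hp_name, [])
if len(neighbors) == 0:
    _n_neighbors = min(n_neighbors_per_hp[hp_name], MAX_NEIGHBORHOOD)
    neighbors = hp.get_neighbors(value, random, number=_n_neighbors, std=stdev)
    finite_neighbors_stack[hp_name] = neighbors
neighbor = neighbors.pop()
```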
```python
# If there is a value in the range, then that value is not a neighbor of itself
# so we need to remove one
if value is not None and self.lower <= value <= self.upper:
    return self.upper - self.lower - 1
else:
    return self.upper - self.lower
```
I wasn't sure how to handle this, but I thought it should act similar to Categorical in that the value would count against the number of neighbors. However, Categorical just implies this implicitly, regardless of the value passed in. Here, I've made it explicit that any value in the range cannot be a neighbor of itself.
See above for some more small bits and pieces. As for the timings, it seems to be back to its original speed for me, on average.
Let's talk about this on Monday, but the short answer is I'm not sure how to handle this properly. tl;dr: I think this is a pre-existing error for large integers on Windows that was silently eaten.
The truncnorm approach has some slight overhead due to how scipy generates its truncated normal distribution; however, this overhead is considered worth it for the sake of readability and understanding.
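For context, a minimal example of scipy's truncnorm parameterization (the values here are illustrative, echoing the repro script, not the PR's internals):

```python
import numpy as np
from scipy import stats

mu, sigma, lower, upper = 10, 500, 1, 1000
# scipy's truncnorm takes its bounds in units of standard deviations from loc
a, b = (lower - mu) / sigma, (upper - mu) / sigma
dist = stats.truncnorm(a=a, b=b, loc=mu, scale=sigma)
samples = dist.rvs(size=8, random_state=np.random.RandomState(19937))
```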
Squashed commits:

* test: Add reproducing test
* fix: Make sampling neighbors from uniform Int stable
* fix: Memory leak with UniformIntegerHyperparameter
* fix: Memory issues with Normal and Beta dists
* chore: Update flake8
* fix: flake8 version compatible with Python 3.7
* fix: Name generators properly
* fix: Test numbers
* doc: typo fixes
* perf: Generate all possible neighbors at once
* test: Add test for center_range and arange_chunked
* perf: Call transform on np vector from rvs
* perf: Use numpy `.astype(int)` instead of `int`
* doc: Document how to get flamegraphs for optimizing
* fix: Allow for negatives in arange_chunked again
* fix: Change build back to raw Extensions
* build: Properly set compiler_directives
* ci: Update makefile with helpful commands
* ci: Fix docs to install build
* perf: cython optimizations
* perf: Fix possible memory leak with UniformIntegerHyperparam
* fix: Duplicates as `list` instead of set
* fix: Convert to `long long` vector
* perf: Revert clip to truncnorm
* test: Test values not match implementation
* Intermediate commit
* Intermediate commit 2
* Update neighborhood generation for UniformIntegerHyperparameter
* Update tests
* Make the benchmark sampling script more robust
* Revert small change in util function
* Improve readability

Co-authored-by: Matthias Feurer <[email protected]>
A core issue addressed was the following setup: a UniformIntegerHyperparameter over a large range with a log scale, with `get_neighbors` called with a small std. deviation. This resulted in rejection sampling never finding enough valid neighbors in its Gaussian from which to sample.

The solution was to simply find the next closest neighbor that was not yet sampled, iteratively checking around that number until one was found. This also respects the boundaries.

I'm not sure if this is "correct" but I don't see another viable solution.
Another issue came up, as seen in the comment below: for the NormalIntegerHyperparameter (and, as later diagnosed, also the BetaIntegerHyperparameter), `_compute_normalization` and `get_max_density` require computing the pdf of every possible int value. This caused memory to blow up when the range of possible values was incredibly large, for example every possible int32 value.

To combat this, I implemented an `arange_chunked` which functions similarly to `arange` but yields sub-chunks. This was possible because the `sum` and `max` operations can be performed over partial chunks and do not require the full `arange` to be in memory at once.

This is still incredibly slow, as the `pdf` and max density are calculated over the entire range; an analytical solution is likely possible, since we are dealing with consecutive integers. This is documented in #283.
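A hypothetical illustration of how the normalization constant and max density decompose over chunks (the distribution, bounds, and chunk size here are made-up examples, not the PR's values):

```python
import numpy as np
from scipy import stats

dist = stats.norm(loc=10, scale=500)      # assumed example distribution
lower, upper, chunk_size = 1, 10_000_000, 1_000_000

normalization = 0.0
max_density = 0.0
for start in range(lower, upper + 1, chunk_size):
    chunk = np.arange(start, min(start + chunk_size, upper + 1))
    pdf = dist.pdf(chunk)
    normalization += pdf.sum()                   # sum decomposes over chunks
    max_density = max(max_density, pdf.max())    # so does max
```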
It's also quite difficult to work with this codebase, given that `.pyx` files don't allow editors to be very smart (for example, jump-to-definition). Using normal text search is also frustrating because all classes share method names and live in one file. Just voicing again that converting this back to pure Python and splitting up the files would make working with ConfigSpace easier. Any performance issues which originally motivated the switch are likely solvable within Python itself, as numpy can do the heavy lifting in C.