While evaluating MPO I got a strange error: raise ValueError("x0 violates bound constraints.").
It originates in this line. However, I have now implemented a "clamping" with np.max([self.η,1e-6]).
According to their code that checks the bound constraints, this should be totally fine.
But I keep getting this error from time to time, and training stops completely because it errors out.
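To make the expectation concrete, here is a minimal sketch of an element-wise bound check together with the clamp from the post. The helper names `violates_bounds` and `clamp_eta` are hypothetical, and the comparison is only an assumption about what such a check typically looks like, not a quote of the SciPy source.

```python
import numpy as np

def violates_bounds(x0, lb, ub):
    """Hypothetical helper: True if any element of x0 lies outside [lb, ub]."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    return bool(np.any((x0 < lb) | (x0 > ub)))

def clamp_eta(eta, floor=1e-6):
    # Same expression as in the post: np.max([self.η, 1e-6]).
    return np.max([eta, floor])

lb, ub = 1e-6, np.inf  # assumed bounds: positive floor, no upper limit

print(violates_bounds(clamp_eta(-0.3), lb, ub))  # negative value is raised to the floor
print(violates_bounds(clamp_eta(2.0), lb, ub))   # value already inside the bounds
print(violates_bounds(-0.3, lb, ub))             # unclamped negative value
print(np.isnan(clamp_eta(float("nan"))))         # np.max propagates NaN
```

For finite inputs the clamped value always satisfies the lower bound, which matches the reasoning in the post. The last line shows that the clamp does not sanitize NaN; whether a non-finite value ever appears in this training run is not known.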
Lines from the corresponding file scipy/optimize/_numdiff.py. Bounds are prepared like this:
def _prepare_bounds(bounds, x0):
    """
    Prepares new-style bounds from a two-tuple specifying the lower and upper
    limits for values in x0. If a value is not bound then the lower/upper bound
    will be expected to be -np.inf/np.inf.

    Examples
    --------
    >>> _prepare_bounds([(0, 1, 2), (1, 2, np.inf)], [0.5, 1.5, 2.5])
    (array([0., 1., 2.]), array([ 1., 2., inf]))
    """
    lb, ub = [np.asarray(b, dtype=float) for b in bounds]
    if lb.ndim == 0:
        lb = np.resize(lb, x0.shape)

    if ub.ndim == 0:
        ub = np.resize(ub, x0.shape)

    return lb, ub
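As a small illustration of what the quoted function does with scalar bounds, the sketch below copies its logic into a standalone function. The name `prepare_bounds` and the sample values are assumptions for illustration only.

```python
import numpy as np

def prepare_bounds(bounds, x0):
    # Standalone copy of the quoted logic (renamed, hypothetical):
    # scalar bounds are broadcast to the shape of x0.
    lb, ub = [np.asarray(b, dtype=float) for b in bounds]
    if lb.ndim == 0:
        lb = np.resize(lb, x0.shape)
    if ub.ndim == 0:
        ub = np.resize(ub, x0.shape)
    return lb, ub

x0 = np.array([0.5, 1.5, 2.5])

# Scalar lower/upper bound: each is repeated once per element of x0.
lb, ub = prepare_bounds((1e-6, np.inf), x0)
print(lb.shape, ub.shape)

# Per-element bounds are passed through unchanged.
lb2, ub2 = prepare_bounds(([0, 1, 2], [1, 2, np.inf]), x0)
print(lb2, ub2)
```

Note that x0 has to be an array here, since the scalar branch reads `x0.shape`.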
Any ideas about this? It is not really an algorithm-related question, but it seems strange to me.
Please tell me the following.
What versions of scipy and numpy are you using? In my case, scipy==1.6.3 and numpy==1.20.3.
Does it occur in the LunarLanderContinuous-v2 example? https://github.com/vinerich/mpo
I think if you want to clamp, it should be something like np.clip(self.η, -1e-6, 1e-6), not np.max([self.η,1e-6]).
I was wrong, np.max([self.η,1e-6]) makes sense.
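A quick sketch of why the correction holds: np.max([η, 1e-6]) is a one-sided lower clamp, whereas np.clip(η, -1e-6, 1e-6) would squeeze η into a tiny interval around zero. The function names and the constant `FLOOR` are hypothetical; η is a plain float here instead of `self.η`.

```python
import numpy as np

FLOOR = 1e-6  # lower bound used in the thread

def clamp_with_max(eta):
    # Expression from the thread: keeps eta unless it is below the floor.
    return np.max([eta, FLOOR])

def clamp_with_clip(eta):
    # Equivalent one-sided clamp; a_max=None leaves the upper side unbounded.
    return np.clip(eta, FLOOR, None)

def symmetric_clip(eta):
    # The variant first suggested: forces eta into [-FLOOR, FLOOR].
    return np.clip(eta, -FLOOR, FLOOR)

for eta in (0.5, -1.0):
    print(eta, clamp_with_max(eta), clamp_with_clip(eta), symmetric_clip(eta))
```

The symmetric version destroys any useful value of η and can even return a negative number, so it would not satisfy a positive lower bound.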
I looked over both issues mentioned, and I frequently see the warning described in scipy/scipy#13277. So it seems to be working.
But sometimes it still gives me the error above. Sadly, I can't reproduce it, as I didn't have proper logging set up and it only occurs roughly once every ~4 million timesteps.
I will try the LunarLanderContinuous example, let it run for a day or so, and report back.
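Since the failure is rare, one way to capture it is to wrap the optimizer call and log the inputs whenever a ValueError is raised. This is only a sketch: `guarded_solve`, the logger name, and the `solve` callable (standing in for whatever calls the SciPy optimizer) are hypothetical and not part of the MPO code.

```python
import logging
import numpy as np

logger = logging.getLogger("mpo.dual")  # hypothetical logger name

def guarded_solve(solve, x0, lower=1e-6):
    """Call solve(x0); on ValueError, log the offending input and re-raise."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    try:
        return solve(x0)
    except ValueError:
        # Record everything needed to replay the failing call later.
        logger.exception(
            "solver rejected x0=%r (all finite: %s, min - lower: %r)",
            x0, bool(np.isfinite(x0).all()), float(np.min(x0) - lower),
        )
        raise
```

Re-raising keeps the current behavior (training still stops), but the log then contains the exact x0 that triggered the error, which should make the once-in-millions case reproducible.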
Hey again.