Question: Minimization of dual function #10

Open
vinerich opened this issue May 17, 2021 · 2 comments

@vinerich

Hey again.

While evaluating MPO I got a strange ValueError("x0 violates bound constraints.").
It originates in this line. However, I have now implemented a "clamping" with np.max([self.η, 1e-6]).

According to SciPy's code that checks the bound constraints, this should be totally fine.
But I still get this error from time to time, and training stops completely when it is raised.

Lines from the corresponding file scipy/optimize/_numdiff.py:

    if np.any((x0 < lb) | (x0 > ub)):
        raise ValueError("`x0` violates bound constraints.")
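To see which inputs trip this check, here is a minimal sketch that applies the same predicate to a few values. The helper name `violates_bounds` and the lower bound of 1e-6 are assumptions for illustration, and the float32 case only shows how a value that looks equal to the bound can still fall below it; it is not a diagnosis of the error reported here.

```python
import numpy as np


def violates_bounds(x0, lb, ub):
    # Same predicate as the quoted check: True if any entry lies outside [lb, ub].
    return bool(np.any((x0 < lb) | (x0 > ub)))


# Assumed bounds for illustration: lower bound 1e-6, no upper bound.
lb = np.array([1e-6])
ub = np.array([np.inf])

on_bound = violates_bounds(np.array([1e-6]), lb, ub)      # exactly on the bound
below = violates_bounds(np.array([9e-7]), lb, ub)         # slightly below the bound
nan_case = violates_bounds(np.array([np.nan]), lb, ub)    # NaN comparisons are False
# float32 cannot represent 1e-6 exactly; the nearest value is slightly smaller.
f32_case = violates_bounds(np.array([1e-6], dtype=np.float32), lb, ub)

print(on_bound, below, nan_case, f32_case)
```

Note that a NaN in x0 passes this particular check, because every comparison with NaN evaluates to False.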

Bounds are prepared like this:

    def _prepare_bounds(bounds, x0):
        """
        Prepares new-style bounds from a two-tuple specifying the lower and upper
        limits for values in x0. If a value is not bound then the lower/upper bound
        will be expected to be -np.inf/np.inf.

        Examples
        --------
        >>> _prepare_bounds([(0, 1, 2), (1, 2, np.inf)], [0.5, 1.5, 2.5])
        (array([0., 1., 2.]), array([ 1.,  2., inf]))
        """
        lb, ub = [np.asarray(b, dtype=float) for b in bounds]
        if lb.ndim == 0:
            lb = np.resize(lb, x0.shape)

        if ub.ndim == 0:
            ub = np.resize(ub, x0.shape)

        return lb, ub
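As a rough illustration of what this helper does with scalar bounds, the sketch below copies its body into a standalone function (renamed `prepare_bounds` here so it is not mistaken for the SciPy-internal one) and calls it with a single-element x0. The bound values (1e-6, np.inf) are assumed for the example.

```python
import numpy as np


def prepare_bounds(bounds, x0):
    # Standalone copy of the quoted helper, for illustration only.
    lb, ub = [np.asarray(b, dtype=float) for b in bounds]
    if lb.ndim == 0:
        # A scalar lower bound is repeated to match the shape of x0.
        lb = np.resize(lb, x0.shape)
    if ub.ndim == 0:
        # Same for a scalar upper bound.
        ub = np.resize(ub, x0.shape)
    return lb, ub


x0 = np.array([0.5])
lb, ub = prepare_bounds((1e-6, np.inf), x0)
print(lb, ub)  # both arrays now have the same shape as x0
```

The bounds come back as float64 arrays, so the later comparison against x0 is done in float64 regardless of how the bounds were written.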

Any idea what causes this? It's not really an algorithm-related question, but it seems strange to me.

@daisatojp
Owner

daisatojp commented May 18, 2021

Hmm, it seems this should not happen in the latest version of scipy, as discussed in scipy/scipy#11403 and scipy/scipy#13277.

Please tell me the following:
What versions of scipy and numpy are you using? In my case scipy==1.6.3 and numpy==1.20.3.
Does it occur with the LunarLanderContinuous-v2 example? https://github.com/vinerich/mpo

I think if you want to clamp, this should be np.clip(self.η, -1e-6, 1e-6), not np.max([self.η, 1e-6]).
I was wrong; np.max([self.η, 1e-6]) makes sense.
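The difference between the two expressions can be sketched as follows. The function names `clamp_lower` and `clamp_both` and the sample values are made up for illustration; `eta` stands in for self.η.

```python
import numpy as np


def clamp_lower(eta, floor=1e-6):
    # np.max over the pair acts as a lower bound only: values above floor pass through.
    return np.max([eta, floor])


def clamp_both(eta, bound=1e-6):
    # np.clip with (-bound, bound) squeezes every value into a tiny interval around 0.
    return np.clip(eta, -bound, bound)


for eta in (-0.3, 0.0, 0.5):
    print(eta, clamp_lower(eta), clamp_both(eta))

# A NaN input is not repaired by the clamp: np.max propagates NaN.
nan_result = clamp_lower(float("nan"))
print(nan_result)
```

So np.max([η, 1e-6]) keeps η positive without capping it from above, while np.clip(η, -1e-6, 1e-6) would discard almost any useful value of η.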

@vinerich
Author

scipy=1.6.3
numpy=1.20.2

I looked over both issues mentioned, and I frequently see the warning described in scipy/scipy#13277. So that seems to be working.

But sometimes it still gives me the above error. Sadly, I can't reproduce it, as I didn't have proper logging set up and it only occurs roughly once every ~4 million timesteps.

I will check with the LunarLanderContinuous example, let it run for a day or so, and report back.
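For an error this rare, one option is to wrap the optimizer call so that the starting point and bounds are logged at the moment it fails. The following is a minimal sketch under assumptions: `call_with_diagnostics`, the logger name, and the `solver` callable are all hypothetical and not part of the MPO code or SciPy.

```python
import logging

import numpy as np

logger = logging.getLogger("mpo.dual")  # hypothetical logger name


def call_with_diagnostics(solver, x0, lower, upper):
    """Call solver(x0); on ValueError, log x0 and the bounds, then re-raise."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    try:
        return solver(x0)
    except ValueError:
        # repr of the list keeps full float precision, which matters near a bound.
        logger.error(
            "solver failed: x0=%r lower=%r upper=%r all_finite=%s",
            x0.tolist(), lower, upper, bool(np.all(np.isfinite(x0))),
        )
        raise


# Stand-in solvers so the sketch runs without an actual dual function.
def failing_solver(x):
    raise ValueError("`x0` violates bound constraints.")


result = call_with_diagnostics(lambda x: float(x[0]) * 2, 0.5, 1e-6, np.inf)
print(result)
```

In real use, `solver` might be something like `lambda x: minimize(dual, x, method="SLSQP", bounds=[(1e-6, None)])` with `minimize` from scipy.optimize and `dual` being the dual function, which is not shown in this thread.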
