
Reparameterizing compositional linear equality constraints as linear inequality constraints introduces Sobol sampling bias #903

Closed

sgbaird opened this issue Apr 8, 2022 · 10 comments

sgbaird commented Apr 8, 2022

I've been struggling with this concept for a week or two and decided to surface this to the Ax devs in a fresh issue to get some suggestions and as a sanity check.

@bernardbeckerman offered a very useful comment related to removing degenerate search space dimensions that I've been implementing:

Brief drive-by comment regarding the constraints in this problem: @sgbaird correct me if I'm wrong, but you really only have 4 free parameters in this case, since the requirement that they sum to 1 removes a degree of freedom. For this reason, it might be useful for you to set a constraint:

"filler_A + filler_B + resin_A + resin_B <= 1"

and then, for each parameterization, compute resin_C = 1 - (filler_A + filler_B + resin_A + resin_B).
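
For concreteness, that suggestion might look like the following in the Service API (a sketch; the objective name is illustrative):

```python
from ax.service.ax_client import AxClient

ax_client = AxClient()
ax_client.create_experiment(
    name="composition_demo",
    parameters=[
        {"name": "filler_A", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "filler_B", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "resin_A", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "resin_B", "type": "range", "bounds": [0.0, 1.0]},
    ],
    parameter_constraints=["filler_A + filler_B + resin_A + resin_B <= 1"],
    objective_name="strength",  # illustrative objective
)

params, trial_index = ax_client.get_next_trial()
resin_C = 1 - sum(params.values())  # recover the removed component
```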

I described the issue related to a bias in Sobol sampling in #727 (comment):

I realized that there is an issue with using Sobol sampling together with the one-fewer-degrees-of-freedom style of compositional constraint, and wanted to make a record of it in this issue.

Take the case of:

x1+x2+x3 == 1

reparameterized to:

x1+x2 <= 1

where x3 == 1 - x1 - x2. When the Sobol candidates are generated, the generator is completely unaware of the original representation involving x3. If the bounds on x1 and x2 are [0, 1], then it will preferentially favor exploring higher values of x1 and x2 than of the unseen x3, if I'm not mistaken.

To illustrate, the bias in Sobol sampling towards the first two parameters might look something like the following made-up data:

0.60, 0.30, 0.10
0.30, 0.60, 0.10
0.40, 0.40, 0.20
...
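
To probe this numerically, one could Sobol-sample (x1, x2), enforce the constraint, derive x3, and compare the three marginal means (a sketch; it assumes the constraint is enforced by rejection, which is one common strategy):

```python
import numpy as np
from scipy.stats import qmc

pts = qmc.Sobol(d=2, seed=0).random(2**12)  # Sobol over the unit square
pts = pts[pts.sum(axis=1) <= 1.0]           # reject infeasible points
x3 = 1.0 - pts.sum(axis=1)                  # the "unseen" third component
print(pts.mean(axis=0), x3.mean())          # inspect the three marginal means
```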

In #727 (comment), I mentioned that retaining the original linear equality parameterization is probably the only way to prevent a bias in the Sobol sampling (and, to some extent, possibly in the Bayesian iterations); however, this requires dropping down to BoTorch directly and manually implementing the underlying transforms (#769 (comment), @dme65).

For simple use cases where everything is allowed to range over [0, 1], it might be OK to ignore the transforms. For the more complicated cases I'm dealing with, where parameters have different bounds (e.g., [0.1, 0.25]) or the constraint isn't relative to 1.0 (e.g., x1 + x2 <= 0.25 and x1 + x2 >= 0.15), I wonder if this will start causing issues.
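
In the Service API, that last pair of constraints would presumably be expressed as two one-sided constraint strings (a sketch; parameter names illustrative):

```python
parameter_constraints = ["x1 + x2 <= 0.25", "x1 + x2 >= 0.15"]
```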


sgbaird commented Apr 12, 2022

xref: #786

@Balandat

I went back to the issue and it seems what you're really trying to do is sample from the unit simplex here? The bias in the sampling is a known issue with the kind of approach you tried out. In BoTorch we actually implement proper sampling from the d-simplex: https://github.com/pytorch/botorch/blob/main/botorch/utils/sampling.py#L270. If the parameters don't appear in other constraints, then you can just sample the other dimensions independently and combine the samples (note that this will destroy the quasi-random low-discrepancy structure, but that's probably fine for initialization; doing this kind of QMC sampling properly on non-box domains turns out to be hard/unsolved). If they do appear in other constraints, then you can build a box from samples of the other parameters by adding those dimensions to the simplex components, and then do rejection sampling based on the constraints.
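
Minimal usage of that utility looks like this (a sketch; the component count d is illustrative):

```python
import torch
from botorch.utils.sampling import sample_simplex

# 8 quasi-random points on the standard 3-simplex; each row sums to 1.
X = sample_simplex(d=3, n=8, qmc=True, seed=0)
assert torch.allclose(X.sum(dim=-1), torch.ones(8, dtype=X.dtype))
```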

Hooking this up in Ax would be somewhat challenging and require a good amount of effort, since we would essentially need either to automatically parse the constraints to infer that this is what the random initial step should be doing, or to introduce some new high-level abstractions for specifying these constraints (in terms of the ParameterConstraint object and the language used to define constraints in the Service API).

I think the easiest short-term fix would be to just manually generate the initial random points using this sample_simplex utility and then add them as custom arms to the experiment. Then you can use a standard generation strategy after that.
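
Concretely, that short-term fix might look like the following (a sketch; it assumes an AxClient experiment whose free parameters are x1 and x2, with x3 implied, as in the example above):

```python
from botorch.utils.sampling import sample_simplex

# Draw points on the 3-simplex; the first two coordinates automatically
# satisfy x1 + x2 <= 1, and x3 is implied by the sum-to-one structure.
for row in sample_simplex(d=3, n=8, seed=0):
    _, trial_index = ax_client.attach_trial(
        parameters={"x1": row[0].item(), "x2": row[1].item()}
    )
    # ...then evaluate and report via ax_client.complete_trial(trial_index, ...)
```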

Does this make sense?

@lena-kashtelyan lena-kashtelyan added the question Further information is requested label Apr 12, 2022

sgbaird commented Apr 14, 2022

@Balandat This is really helpful, thank you! sample_simplex looks like it will help. One clarification: I'm trying to sample from a simplex embedded in one higher dimension (for 3 components, as in the example, this means a triangle embedded in 3D space). I've dealt with PCA transformations and the like before (albeit in a different language), so I should be able to find the transformation from the vertices of the (d-1)-simplex (i.e., the hyperplane) embedded in d dimensions to the vertices of the simplex in d-1 dimensions that gets sampled via sample_simplex, and then map the sampled points back onto the hyperplane embedded in d dimensions.

It's nice to see sample_polytope as well, thank you for linking to this. Let me know if there is an easier way than what I described above.

There are cases in materials science where it may not necessarily be a unit simplex (e.g., applying domain knowledge that x1 should have tighter/different bounds than x2), at which point I imagine I would convert the vertices to matrix-form inequality constraints and pass them to sample_polytope.
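
That conversion might look like this (a sketch with illustrative bounds, using BoTorch's HitAndRunPolytopeSampler, a higher-level wrapper around sample_polytope that accepts the equality constraint directly):

```python
import torch
from botorch.utils.sampling import HitAndRunPolytopeSampler

# A @ x <= b: a tighter box on x1, nonnegativity on x2 and x3
# (upper bounds on x2, x3 are implied by the equality constraint).
A = torch.tensor(
    [
        [-1.0, 0.0, 0.0],   # -x1 <= -0.10  (i.e., x1 >= 0.10)
        [1.0, 0.0, 0.0],    #  x1 <=  0.25
        [0.0, -1.0, 0.0],   # -x2 <=  0    (i.e., x2 >= 0)
        [0.0, 0.0, -1.0],   # -x3 <=  0    (i.e., x3 >= 0)
    ]
)
b = torch.tensor([[-0.10], [0.25], [0.0], [0.0]])
# C @ x = d: the composition sums to one.
C = torch.tensor([[1.0, 1.0, 1.0]])
d = torch.tensor([[1.0]])

sampler = HitAndRunPolytopeSampler(
    inequality_constraints=(A, b), equality_constraints=(C, d)
)
samples = sampler.draw(n=16)
```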

@Balandat

You can also give [DelaunayPolytopeSampler](https://github.com/pytorch/botorch/blob/21ce6c7fa9fa907674c37b849e2e5dc683ca2682/botorch/utils/sampling.py#L697-L715) a try. This uses a pretty cool algorithm to uniformly sample from a general polytope (it supports equality constraints as well) by subdividing the whole polytope into primitive shapes and then using their volumes to build a two-stage sampling process. There is some expensive upfront computation (computing the convex hull), but if you need lots of samples this can be worth it. Note, though, that the complexity grows quickly here, so if you are in higher dimensions or have lots of complex constraints this can quickly become intractable.
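
Reusing the constraint tensors from the sketch above, that would be (a sketch):

```python
from botorch.utils.sampling import DelaunayPolytopeSampler

# Pays an upfront convex-hull computation, then draws exact uniform samples.
sampler = DelaunayPolytopeSampler(
    inequality_constraints=(A, b), equality_constraints=(C, d)
)
samples = sampler.draw(n=1024, seed=0)
```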

@lena-kashtelyan

Closing this issue as inactive; @sgbaird, please reopen it if you do follow up!


sgbaird commented May 25, 2023

@Balandat, a follow-up question: any recommendations for the case where there are categorical parameters in the search space? Draw them randomly? Convert them to some numerical representation with equal distance between choices?

@Balandat

Hmm, good question. Sobol (and QMC methods in general) don't really play well with non-box bounds and non-continuous parameters. I would recommend just drawing those parameters independently at random and then stitching them together with the Sobol samples of the rest of the space (or just drawing everything uniformly at random if you don't care too much about the low-discrepancy space-filling properties of Sobol).
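
A sketch of the stitching approach (parameter names and category levels are illustrative):

```python
import numpy as np
from scipy.stats import qmc

n = 16
continuous = qmc.Sobol(d=2, seed=0).random(n)  # Sobol for the continuous dims
rng = np.random.default_rng(0)
levels = np.array(["matA", "matB", "matC"])    # illustrative categories
cats = rng.choice(levels, size=n)              # independent uniform draws
# Stitch: row i pairs continuous[i] with cats[i]. The joint low-discrepancy
# structure is lost, which is usually acceptable for initialization.
```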

You can also map the categoricals to integers and then use the approach from https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.qmc.Sobol.integers.html#scipy.stats.qmc.Sobol.integers to get QMC-ish integer draws, and then map those back to the categorical parameters.
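
For example (a sketch; levels illustrative):

```python
from scipy.stats import qmc

levels = ["matA", "matB", "matC"]
engine = qmc.Sobol(d=1, seed=0)
# Quasi-random integer indices in [0, len(levels)), mapped back to levels.
idx = engine.integers(l_bounds=0, u_bounds=len(levels), n=16).ravel()
draws = [levels[i] for i in idx]
```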


sgbaird commented May 30, 2023

@Balandat thanks! I wasn't aware of the integers method from scipy.stats. After looking at the source code, it seems to essentially just apply np.floor(...). In terms of setting up the search space and using the existing machinery in Ax, maybe it's as simple as the following (a sketch with illustrative names):
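
```python
# A sketch: encode each categorical as an integer range parameter
# in the Service API (names and level count illustrative).
parameters = [
    {"name": "x1", "type": "range", "bounds": [0.0, 1.0]},
    {"name": "cat1", "type": "range", "bounds": [0, 2], "value_type": "int"},
]
```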

@Balandat

So one thing to note is that if you define the parameter as an integer in Ax itself, we'll try to model it as a numerical parameter; i.e., we'll assume that distance in parameter values is related to distance in function values. That is probably not what you want for the modeling if your parameters are truly categorical (and unordered).

So you could generate your custom initial random arms externally (this doesn't involve the surrogate modeling) and then go from there in Ax. Alternatively, you could just use random sampling rather than Sobol sampling if the parameter constraints would otherwise bias the sampling.


sgbaird commented May 30, 2023

if you define the parameter as an integer in Ax itself, we'll try to model it as a numerical parameter; i.e., we'll assume that distance in parameter values is related to distance in function values. That is probably not what you want for the modeling if your parameters are truly categorical (and unordered).

Good catch, thanks!

So you could generate your custom initial random arms externally (this doesn't involve the surrogate modeling) and then go from there in Ax. Alternatively, you could just use random sampling rather than Sobol sampling if the parameter constraints would otherwise bias the sampling.

External generation and attaching manually seems like the way to go. Thank you!

There's one thing I want to clarify: can the parameter constraints bias the Sobol sampling because the Sobol points are generated within the unconstrained design space and then restricted to the feasible design space via a rejection strategy (meaning you might get large, systematic gaps along certain "faces")?

I imagine that the DelaunayPolytopeSampler you mentioned in a previous comment would circumvent the bias, since the parameter constraints are factored into the polytope that gets sampled from.
