Implement flexible shape/dims/size API #4625
Conversation
Force-pushed fa43a1a → bd348fb
This introduces a considerable amount of complexity into `Distribution.__new__`; the shape logic really needs its own function/method, at the very least.

Also, this PR unnecessarily reinstates the use of `shape` internally (e.g. numerous changes from `size` to `shape` in the tests), which is the opposite of what we set out to do.
Caution: The API from Aesara:

```python
x = aesara.shared(np.array([10, 20, 30, 40]))
n = atr.normal(loc=x, scale=0.1, size=(2, 4))  # size=(2,) does not work!
```

In PyMC3 this will be equivalent to the "inflexible" use of `shape`:

```python
with pm.Model():
    x = aesara.shared(np.array([10, 20, 30, 40]))
    pm.Normal("n1", mu=x, sd=0.1, shape=(2, 4))
    pm.Normal("n2", mu=x, sd=0.1, size=(2,))
```
Force-pushed f6c4602 → d3fde48
With the latest push the
Passing a non-Ellipsis shape like A
No
Before we invest too much time into
💯
To be clear, this is not a change that we should make in Aesara. Dimension names are completely unrelated to the concerns of Aesara and we absolutely must keep the objects in that library as simple and efficient as possible.

That does not mean that we cannot consider creating a custom

For instance, why can't we just use
These are superfluous and oddly strict. NumPy doesn't even complain about this; why should PyMC?

I agree, I don't think a warning is required here.
Force-pushed da76513 → 4c60766
Okay, this leaves us with no
Force-pushed 66d5fa7 → 5da6006
These improvements on `dims` are amazing, and I personally think most user code will soon use mostly `dims` instead of either `shape` or `size`.
Force-pushed e478bce → 9afc5ef
Also refactor into properties to add docstrings and type annotations. And no longer allow InferenceData conversion without a Model on stack. Co-authored-by: Oriol Abril Pla <[email protected]>
Force-pushed 5b3acb5 → f4d8246
There are a lot of unnecessary `shape` keyword changes in unrelated tests that need to be reverted. As well, there seem to be some similarly unrelated non-`shape` test changes that need to be reverted.

Otherwise, the logic in `Distribution.[__new__|dist]` is still more complicated than it needs to be and should be better organized and distributed between the `Model` and `Distribution` classes.

Under these changes, `Distribution` is taking on steps that are—and were—performed in `Model`, and this is likely the cause of the current failures. Without an unusually compelling reason for this previously `Model`-specific logic to be relocated to `Distribution`, these changes are not sound.

For instance, the steps in `Distribution` that are performed based on the value of `observed` are actually `Model`-exclusive, because observations are a `Model`-only concept and feature. By moving—or adding—parts of that logic into `Distribution`, we fragment our logic and violate the abstractions that underlie these classes. The end result is code that's harder to follow—because the underlying abstractions are much less relevant to the actual implementation—and exceedingly more difficult to maintain and extend, due to a less manageable back-and-forth between the two classes.

In the end, these changes should only require a single call in `Distribution.__new__` to a function that—at maximum—takes the `size`, `shape`, `dims`, `rv_op`, and distribution parameters as arguments, determines the appropriate `size` to be passed to `Distribution.dist`, and that's all.

This function could be a method provided by `Model`, which would make it reasonable to also perform any relevant `Model` updates that involve `dims` within it.

The only step that seems like it could/should be in `Distribution` is the application of `specify_shape` when `shape` is used, but I'm not seeing that in this diff.
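For concreteness, a rough sketch of the single-call helper described above (hypothetical name and signature, not existing PyMC API; `ndim_implied` stands in for the `rv_op`-plus-parameters arguments):

```python
def determine_size(size, shape, dims, ndim_implied, model):
    """Determine the `size` to pass on to `Distribution.dist` (sketch only)."""
    if size is not None:
        return size
    if shape is not None and Ellipsis not in shape:
        # Entries of `shape` beyond the parameter-implied dims become `size`.
        return tuple(shape)[: len(shape) - ndim_implied]
    if dims is not None:
        # Leading dim names beyond the implied dims determine the resize.
        n_resize = len(dims) - ndim_implied
        return tuple(model.dim_lengths[dname] for dname in dims[:n_resize])
    return None
```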
```python
if dims is not None:
    if Ellipsis in dims:
        # Auto-complete the dims tuple to the full length
        dims = (*dims[:-1], *[None] * rv_out.ndim)
```
Is the `RandomVariable` created first, and then resized, only because `rv_out.ndim` might be needed in this step?

Regardless, `rv_out.ndim` is obtainable without first creating the underlying `RandomVariable`.
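For illustration, one way this could be computed (a sketch; it relies on Aesara's `RandomVariable` op attributes `ndim_supp` and `ndims_params`, and assumes `atr` is `aesara.tensor.random` as elsewhere in this thread):

```python
import aesara.tensor as at
import aesara.tensor.random as atr
import numpy as np

def rv_ndim(rv_op, *dist_params):
    # Output ndim = support ndim + broadcasted batch ndim of the parameters,
    # so no RV node has to be built.
    params = [at.as_tensor_variable(p) for p in dist_params]
    batch_ndim = max(
        (p.ndim - core_ndim for p, core_ndim in zip(params, rv_op.ndims_params)),
        default=0,
    )
    return rv_op.ndim_supp + batch_ndim

assert rv_ndim(atr.normal, np.ones(3), 0.1) == 1
```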
`pymc3/distributions/distribution.py`:
```python
if "shape" in kwargs:
    raise DeprecationWarning("The `shape` keyword is deprecated; use `size`.")

transform = kwargs.pop("transform", UNSET)

if dims is not None:
    n_resize = len(dims) - n_implied

    # All resize dims must be known already (numerically or symbolically).
    unknown_resize_dims = set(dims[:n_resize]) - set(model.dim_lengths)
    if unknown_resize_dims:
        raise KeyError(
            f"Dimensions {unknown_resize_dims} are unknown to the model and cannot be used to specify a `size`."
        )

    # The numeric/symbolic resize tuple can be created using model.RV_dim_lengths
    resize = tuple(model.dim_lengths[dname] for dname in dims[:n_resize])
elif observed is not None:
    if not hasattr(observed, "shape"):
        observed = pandas_to_array(observed)
    n_resize = observed.ndim - n_implied
    resize = tuple(observed.shape[d] for d in range(n_resize))

if resize:
    # A batch size was specified through `dims`, or implied by `observed`.
    rv_out = change_rv_size(rv_var=rv_out, new_size=resize, expand=True)

if dims is not None:
    # Now that we have a handle on the output RV, we can register named implied dimensions
    # that were not yet known to the model, such that they can be used for size further downstream.
    for di, dname in enumerate(dims[n_resize:]):
        if dname not in model.dim_lengths:
            model.add_coord(dname, values=None, length=rv_out.shape[n_resize + di])
```
Again, this logic needs to be factored out into a simple and clear function that—for instance—computes an appropriate value of `size` (to be passed to `Distribution.dist`) using values from `dims` or `shape`.

This is especially important given that this code also updates the `model`'s coordinates, making it easily more relevant as a method in `Model`.

In other words, it's best to keep the `Model`-specific logic in `Model`, and not spread out across `Distribution` and `Model`.

Also, the steps related to `observed` are apparently duplicating some of the steps taken later in `Model.make_obs_var`. This might be yet another good reason for moving this logic into `Model`.
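For example, the coordinate registration from the diff above could become a `Model`-owned helper along these lines (a sketch with a hypothetical name, so that `Distribution` never touches `model.add_coord` directly):

```python
def register_implied_dims(model, rv_out, dims, n_resize):
    """Register implied dim names that the model does not know yet (sketch)."""
    for di, dname in enumerate(dims[n_resize:]):
        if dname not in model.dim_lengths:
            model.add_coord(dname, values=None, length=rv_out.shape[n_resize + di])
```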
```python
# Create the RV without specifying size or testval.
# The size will be expanded later (if necessary) and only then the testval fits.
rv_native = cls.rv_op(*dist_params, size=None, **kwargs)
```
This demonstrates a way to obtain the symbolic shape for a `RandomVariable` without actually creating one.
```diff
@@ -982,7 +982,7 @@ def test_linalg(self, caplog):
     a = Normal("a", size=2, testval=floatX(np.zeros(2)))
     a = at.switch(a > 0, np.inf, a)
     b = at.slinalg.solve(floatX(np.eye(2)), a)
-    Normal("c", mu=b, size=2, testval=floatX(np.r_[0.0, 0.0]))
+    Normal("c", mu=b, shape=(2,), testval=floatX(np.r_[0.0, 0.0]))
```
Normal("c", mu=b, shape=(2,), testval=floatX(np.r_[0.0, 0.0])) | |
Normal("c", mu=b, size=2, testval=floatX(np.r_[0.0, 0.0])) |
Same as with other examples: the way `size` was used before, the `testval` shape does not match. Like with other examples, this was likely an unintended change of behavior from the original test when it was switched to `size`.
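To spell out the mismatch being discussed (a sketch based on this PR's stated semantics, not actual test code):

```python
# `b` has shape (2,), which already implies one dimension, so:
#   Normal("c", mu=b, size=2)      -> output shape (2, 2); a (2,)-shaped testval no longer fits
#   Normal("c", mu=b, shape=(2,))  -> output shape (2,); matches testval np.r_[0.0, 0.0]
```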
```diff
     def build_model(self, distfam, params, *, size=None, shape=None, transform=None, testval=None):
         if testval is not None:
             testval = pm.floatX(testval)
         with pm.Model() as m:
-            distfam("x", size=size, transform=transform, testval=testval, **params)
+            distfam("x", size=size, shape=shape, transform=transform, testval=testval, **params)
```
All of the changes in this module need to be reverted; they have nothing to do with the `shape` or `dims` feature, nor should they.
All of the changes in this module are because many of these tests are actually parametrized by `shape`. Consider this test case from before the change:
```python
@pytest.mark.parametrize(
    "a,b,size",
    [
        (1.0, 1.0, (2,)),  # scalar is implied; 2 replicate dimensions
        (np.ones(3), np.ones(3), (4, 3)),  # (3,) is implied; (4, 3) replicate dimensions
    ],
)
def test_beta_ordered(self, a, b, size):
    testval = np.sort(np.abs(np.random.rand(*size)))  # but the testval is at most 2-dimensional
    model = self.build_model(
        pm.Beta,
        {"alpha": a, "beta": b},
        size=size,  # in the second parametrize case the RV is expanded to (4, 3, 3)
        testval=testval,  # but the testval is only (4, 3)
        transform=tr.Chain([tr.logodds, tr.ordered]),
    )
    self.check_vectortransform_elementwise_logp(model, vect_opt=0)
```
Comparing with `master` we can see that the test was up to 2-dimensional to begin with. The change to `size` had changed the behaviour of the test - I'm just reverting it to what was originally intended & tested.

https://github.com/pymc-devs/pymc3/blob/2dee9dafb3541043ae23b66fe9262e70fbd2769a/pymc3/tests/test_transforms.py#L405-L421
Yes, the test could be refactored to use `size`, by either replicating the `shape`←→`size` logic, or parametrizing the test with both `shape` and `size` manually. That would be a lot of unnecessary work and make the tests harder to work with!

Since this PR is about restoring `shape` as a backwards-compatible and fully supported way to parametrize RVs, I do not see the point in making our lives harder by deliberately avoiding it in tests like this one.
I checked the `testval`s for this test in `v3` and `v4` and they're both the same. The symbolic types for the resulting variables are also the same between `v3` and `v4`, so I don't understand what's changed here—aside from the switch to `size`.

More importantly, this is a scalar variable, so `size` and `shape` should be the same. If they're not, then that's an altogether different issue that needs to be addressed separately.

Regardless, the change here is still an unnecessary reversion that forces these tests to use an irrelevant and newly added feature.
Always treats `size` as being in addition to dimensions implied by RV parameters. All resizing beyond parameter-implied dimensionality is done from:
- `shape` or `size` in `Distribution.dist()`
- `dims` or `observed` in `Distribution.__new__`

and only in those two places. Closes pymc-devs#4552.
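As an illustration of those two entry points (a sketch assuming the semantics described in this PR, not code from the diff):

```python
import numpy as np
import pymc3 as pm

mu = np.zeros(3)  # the parameter implies one dimension of length 3

rv = pm.Normal.dist(mu=mu, size=(4,))  # Distribution.dist(): resized via `size` (or `shape`)

data = np.random.normal(size=(4, 3))
with pm.Model():
    pm.Normal("y", mu=mu, observed=data)  # Distribution.__new__(): leading (4,) from `observed`
```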
The test about #4652 It shouldn't have done that - looks like https://github.com/pymc-devs/pymc3/blob/v4/pyproject.toml#L2 isn't working.

With respect to the other changes... Reasons for why some tests were changed are thoroughly commented above (see threads).

Regarding other options to obtain implied dimensionality information, I'd like to quote @brandonwillard:

So that's how I implemented it. There might be other ways, but given that it's cheap, that it works, and that it restores the important backwards compatibility while bringing more flexibility, I don't think we should delay this PR any further.

I think the shape/dims/size/size_from_observed logic from the two

So from my point of view this PR is all good. cc @brandonwillard @OriolAbril @twiecki
I mostly agree with @michaelosthege's last comment, but my opinion on model/distribution topics probably should not matter much, as I only have a loose understanding of their internals.
I do want to add an extra point on that however, as a relative sideliner. I feel that without a clear/detailed roadmap (that I know of) and with things like pymc-devs/pytensor#352 in the air, I am not sure it's possible at this point to know what is the correct level of abstraction for each thing. There seem to be arguments to move `dims` in either direction: towards `dist` and a better integration with Aesara, or towards `Model`.
And last but definitely not least, I want to say that the more I see of the capabilities that `v4` is improving/adding/extending, the more amazed I am with it. Thanks to everyone that is pushing hard on this and leading the effort.
I agree, I think this looks great. Agree with @OriolAbril that at this point it's not clear what the right level of abstraction is and I don't see a clear winner. We can always revisit when we find out later that we can improve things.
Terminology
- `shape`, `dims`, `size`, `observed`.
- `size` of an RV are all dimensions in addition to (preceding) the implied dimensions.

Key Takeaways
- `shape` can be numeric/symbolic and supports `Ellipsis` in the last position. Without `Ellipsis` a `SpecifyShape` `Op` is added automatically.
- `dims` can be `None` and are defined at first mention. `Ellipsis` is also supported. In combination with `pm.Data` the dimensions specified through `dims` are re-sizeable.
- `size` does not care about implied dimensions in any way and just adds new dimensions to the RV. Internally `size` is inferred from `shape` and `dims`.
- `Model.make_obs_var` because it leaves code in `Distribution.__new__` not knowing the output shape.

ToDo
- `Distribution.dist()` and `Distribution.__new__` are similar in use (`dims` and `observed` only through `__new__`).
- Use `size`/`shape` appropriately -- some tests illegally used `size` to specify size and implied dimensions.

ToDo (optional)
- `dims` in combination with `size`. We should just require a single source of truth (dim names must be undefined for dimensions that `size` integers apply to).
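Putting the summary above together, a sketch of what user code could look like under this API (assuming the `coords`/`Ellipsis` semantics described above; exact behavior may differ from the final implementation):

```python
import numpy as np
import pymc3 as pm

mu = np.zeros(3)  # implies one dimension of length 3
with pm.Model(coords={"batch": range(4)}):
    a = pm.Normal("a", mu=mu, shape=(4, 3))         # full shape; SpecifyShape added
    b = pm.Normal("b", mu=mu, shape=(4, ...))       # Ellipsis: only leading dims pinned
    c = pm.Normal("c", mu=mu, size=(4,))            # size: in addition to implied dims
    d = pm.Normal("d", mu=mu, dims=("batch", ...))  # dims: resizeable via coords/pm.Data
```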