Stan code "canonicalizer" #262

seantalts · 2019-08-16T17:42:03Z

We have a great pretty printer (thanks @palmerlao and @VMatthijs!) for Stan code that attempts to preserve most of the original syntax (well, sans comments for now, oops #93). But as we continue to evolve the Stan language, we deprecate some syntax, add new syntax, and generally discover better ways of writing models. We'd like to provide users with the option to pretty print a canonicalized version of their program. Here are some initial transformations we can provide:

Upgrading deprecated syntax
- <- becomes =
- get_lp() becomes target()
- increment_log_prob(x) becomes target += x, etc
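
As a rough illustration of the three rewrites listed above, here is a small program as it might look after canonicalization, with the deprecated form of each statement kept in a comment. The model itself (a single parameter mu and a local variable lp) is made up for the example and is not taken from the canonicalizer's actual output.

```stan
parameters {
  real mu;
}
model {
  real lp;
  // was: increment_log_prob(-0.5 * square(mu));
  target += -0.5 * square(mu);
  // was: lp <- get_lp();
  lp = target();
}
```

The last statement shows two rewrites at once: the assignment operator and the accessor for the accumulated log density.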

Adding inferred priors
We can automatically add any inferred priors to parameters that don't have one specified. For example, if you declare real<lower=0, upper=2> sigma; without giving it a prior, we can add sigma ~ uniform(0, 2); to the model block. Hopefully this helps folks notice when they've forgotten a prior (the implied uniform is often a troublesome prior, but it's the default behavior).
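
A minimal sketch of what that output could look like for the sigma example, assuming the canonicalizer simply appends the statement to the model block (the exact placement and any accompanying comment are assumptions):

```stan
parameters {
  real<lower=0, upper=2> sigma;
}
model {
  // Added by the canonicalizer: the prior implied by the declared bounds.
  // With constant bounds this only shifts target by a constant,
  // so the posterior is the same as without the statement.
  sigma ~ uniform(0, 2);
}
```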

Eventually, we will also start breaking backwards compatibility for Stan 3. Then we'll need to keep a copy of the Stan 2 parser and AST around so we can transform it to a Stan 3 AST and pretty print that. The canonicalizer will be helpful for that.

nhuurre · 2019-11-18T18:55:27Z

Does it matter whether the canonicalizer operates on the typed or untyped AST? --auto-format doesn't type-check and one deprecated construct (explicit _lpdf suffix in a sampling statement) causes a typing error. Is there anything in the typed AST that helps when canonicalizing?

The AST contains explicit parentheses. Should they be "normalized" in some way?

Inferred priors are a cool idea, but what about all the cases where they're improper? Infinite bounds are allowed (they just mean unbounded), so thoughtless addition of ~ uniform(L, U) could break a model. Then there are the unit_vector and cholesky_factor_corr types, whose implicit priors are proper but not expressible as built-in Stan Math distributions.

There are uncomputably many ways inferring priors doesn't work. For instance:

parameters {
  real<lower=0> x;
  real<lower=0,upper=x> y;
} model {
  x ~ std_normal();
}

Here y is not drawn from uniform(0, x). Rather, the model behaves as if y were drawn from uniform(0, inf) independent of x but only keeping the joint draw if y happened to be less than x.
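
To make the difference concrete, here is a sketch of what a naive "add uniform from the bounds" rule would emit for the model above. Because the upper bound x is itself a parameter, the added statement is not a no-op: it contributes a -log(x) term to target, so this program defines a different posterior from the original rather than spelling out its implicit prior.

```stan
parameters {
  real<lower=0> x;
  real<lower=0, upper=x> y;
}
model {
  x ~ std_normal();
  // Naively inferred from the bounds on y. Since x is a parameter,
  // uniform(0, x) adds -log(x) to target, which changes the marginal
  // of x compared with the model that omits this statement.
  y ~ uniform(0, x);
}
```

In the original model the joint density is proportional to the half-normal density of x on the region 0 < y < x, so integrating out y weights x by an extra factor of x. The added statement cancels that factor.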

seantalts · 2019-11-18T19:30:00Z

> Is there anything in the typed AST that helps when canonicalizing?

My gut feeling is that some type information could help with certain kinds of canonicalizing transformations. For example, perhaps we want to partially evaluate some expressions and replace them with simpler versions, e.g. in indexing.

> The AST contains explicit parentheses. Should they be "normalized" in some way?

I think the way to go for all of these decisions would be for someone to put in work towards some mildly configurable set of rules that they think is reasonable and then get feedback from the community. For this one I might expect us to remove doubled parentheses but leave any single ones specified by the user. Alternatively, we could remove any that are unnecessary according to operator precedence rules, but that could make some code harder to read.
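
The two options can be compared on a small example. The variable names are made up, and the commented results are what each rule would be expected to print, not output from an existing implementation.

```stan
transformed data {
  real x = 1;
  real y = 2;
  real z = 3;
  // As the user wrote it.
  real a = ((x + y)) * z;
  real b = (x * y) + z;
  // Option 1, drop only doubled parentheses:
  //   a = (x + y) * z;   b = (x * y) + z;
  // Option 2, drop whatever operator precedence makes redundant:
  //   a = (x + y) * z;   b = x * y + z;
}
```

Both options agree on a, where the inner pair is needed for correctness. They differ on b, where the parentheses are redundant but may have been written for readability.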

> There are uncomputably many ways inferring priors doesn't work.

None of these rules need 100% coverage to be valuable for the community. If special cases like these become important to someone, they can leverage the canonicalization framework to add a special case rule for this that does actually print out some code.

One issue here is that Stan Math for some reason doesn't define uniform with either of the bounds being non-finite. If we added support for that, would there be a way to express that implicit behavior as Stan code?

seantalts added the "good first issue" label Aug 16, 2019

nhuurre mentioned this issue Nov 19, 2019: Stan code canonicalizer #390 (merged)

rok-cesnovar closed this as completed Jun 3, 2020
