
Recursion problems in make_initial_point_expression #5168

Closed
michaelosthege opened this issue Nov 10, 2021 · 7 comments · Fixed by #5170

Comments

@michaelosthege

michaelosthege commented Nov 10, 2021

Description of your problem

I have a rather big model that takes ages to compile the initial point function, eventually running into a MemoryError after ~45 minutes.

The model involves a latent GP and lots of subindexing.

UPDATE: It also happens without the GP.

Please provide a minimal, self-contained, and reproducible example.
#5168 (comment)

Please provide the full traceback.

Complete error traceback
c:\users\osthege\repos\pymc-main\pymc\sampling.py in sample(draws, step, init, n_init, initvals, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, **kwargs)
    480             # By default, try to use NUTS
    481             _log.info("Auto-assigning NUTS sampler...")
--> 482             initial_points, step = init_nuts(
    483                 init=init,
    484                 chains=chains,

c:\users\osthege\repos\pymc-main\pymc\sampling.py in init_nuts(init, chains, n_init, model, seeds, progressbar, jitter_max_retries, tune, initvals, **kwargs)
   2191     ]
   2192 
-> 2193     initial_points = _init_jitter(
   2194         model,
   2195         initvals,

c:\users\osthege\repos\pymc-main\pymc\sampling.py in _init_jitter(model, initvals, seeds, jitter, jitter_max_retries)
   2060     """
   2061 
-> 2062     ipfns = make_initial_point_fns_per_chain(
   2063         model=model,
   2064         overrides=initvals,

c:\users\osthege\repos\pymc-main\pymc\initial_point.py in make_initial_point_fns_per_chain(model, overrides, jitter_rvs, chains)
    102         # Only one function compilation is needed.
    103         ipfns = [
--> 104             make_initial_point_fn(
    105                 model=model,
    106                 overrides=overrides,

c:\users\osthege\repos\pymc-main\pymc\initial_point.py in make_initial_point_fn(model, overrides, jitter_rvs, default_strategy, return_transformed)
    165     overrides = convert_str_to_rv_dict(model, overrides or {})
    166 
--> 167     initial_values = make_initial_point_expression(
    168         free_rvs=model.free_RVs,
    169         rvs_to_values=model.rvs_to_values,

c:\users\osthege\repos\pymc-main\pymc\initial_point.py in make_initial_point_expression(free_rvs, rvs_to_values, initval_strategies, jitter_rvs, default_strategy, return_transformed)
    318     for i in range(n_variables):
    319         outputs = [initial_values_clone[i], initial_values_transformed_clone[i]]
--> 320         graph = FunctionGraph(outputs=outputs, clone=False)
    321         graph.replace_all(zip(free_rvs_clone[:i], initial_values), import_missing=True)
    322         initial_values.append(graph.outputs[0])

~\AppData\Local\Continuum\miniconda3\envs\CARenv\lib\site-packages\aesara\graph\fg.py in __init__(self, inputs, outputs, features, clone, update_mapping, memo, copy_inputs, copy_orphans)
    165 
    166         for output in outputs:
--> 167             self.import_var(output, reason="init")
    168         for i, output in enumerate(outputs):
    169             self.clients[output].append(("output", i))

~\AppData\Local\Continuum\miniconda3\envs\CARenv\lib\site-packages\aesara\graph\fg.py in import_var(self, var, reason, import_missing)
    335         # Imports the owners of the variables
    336         if var.owner and var.owner not in self.apply_nodes:
--> 337             self.import_node(var.owner, reason=reason, import_missing=import_missing)
    338         elif (
    339             var.owner is None

~\AppData\Local\Continuum\miniconda3\envs\CARenv\lib\site-packages\aesara\graph\fg.py in import_node(self, apply_node, check, reason, import_missing)
    379         # input set.  (The functions in the graph module only use the input set
    380         # to know where to stop going down.)
--> 381         new_nodes = io_toposort(self.variables, apply_node.outputs)
    382 
    383         if check:

~\AppData\Local\Continuum\miniconda3\envs\CARenv\lib\site-packages\aesara\graph\basic.py in io_toposort(inputs, outputs, orderings, clients)
   1160             else:
   1161                 todo.append(cur)
-> 1162                 todo.extend(i.owner for i in cur.inputs if i.owner)
   1163         return order
   1164 

MemoryError: 

Please provide any additional information below.

Interestingly I can do .eval() on the likelihoods in the model just fine.

Any ideas?

Versions and main components

  • PyMC/PyMC3 Version: latest main
  • Aesara/Theano Version: 2.2.6
  • Python Version: 3.8.10
  • Operating system: Windows 10
@ricardoV94

Can you try to compile the initial point function in FAST_RUN?

@michaelosthege

michaelosthege commented Nov 10, 2021

It doesn't even get to compilation; the error already occurs while creating the expression.

The model also involves a set_subtensor. Could it be that the dependency between these nodes is not properly resolved? The model_to_graphviz output is also missing the links from P0 and P_in_R to P.

mask_RinRID  # a boolean numpy array
P_in_R       # a matrix random variable
P0           # a vector pm.Data

P = at.empty(shape=(190, 5), dtype=aesara.config.floatX)
P = at.set_subtensor(P[mask_RinRID, :], P_in_R)
P = at.set_subtensor(P[~mask_RinRID, :], P0[~mask_RinRID, None])
P = pm.Deterministic("P", P)
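For reference, the masked-assignment pattern above corresponds to this plain-NumPy sketch. The shapes, seed, and mask here are made-up placeholders, not the actual model data:

```python
import numpy as np

# Hypothetical stand-ins for the symbolic variables in the snippet above:
# mask_RinRID selects which of the 190 rows come from P_in_R;
# the remaining rows are filled column-wise from the P0 vector.
rng = np.random.default_rng(42)
mask_RinRID = rng.random(190) < 0.5                  # boolean row mask
P_in_R = rng.normal(size=(int(mask_RinRID.sum()), 5))
P0 = rng.normal(size=190)

P = np.empty((190, 5))
P[mask_RinRID, :] = P_in_R
P[~mask_RinRID, :] = P0[~mask_RinRID, None]          # broadcast vector across columns
```

The `[:, None]`-style indexing turns the selected slice of `P0` into a column, so one scalar per row is broadcast across all 5 columns, matching what the symbolic set_subtensor does.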

Until yesterday I was running this model with PyMC main from c06c2f4, which was before #4983 was merged.
The set_subtensor code was already part of the model and worked there.
Other than making the model slightly more complex, that is the only relevant difference I can find right now.

@ricardoV94
Copy link
Member

ricardoV94 commented Nov 10, 2021

By the way, can you also do prior predictive sampling?

@michaelosthege

Yes, prior predictive sampling works just fine.

@michaelosthege

michaelosthege commented Nov 10, 2021

Here's an example.

import aesara
import pymc as pm

with pm.Model() as pmodel:
    one = pm.LogNormal("one", mu=0, sd=0.01)
    two = pm.LogNormal("two", mu=one + 0.1, sd=0.01)
    three = pm.LogNormal("three", mu=two + 0.1, sd=0.01)

The problematic cases are characterized by a nonsensical recursion in the first element of the resulting initial values graph:

ipx = pm.initial_point.make_initial_point_expression(
    free_rvs=pmodel.free_RVs,
    rvs_to_values=pmodel.rvs_to_values,
    initval_strategies=pmodel.initial_values,
    jitter_rvs={},
    default_strategy="prior",
    return_transformed=False,
)
aesara.dprint(ipx)
Elemwise{exp,no_inplace} [id A] ''   
 |Elemwise{log,no_inplace} [id B] ''   
   |Elemwise{exp,no_inplace} [id A] ''   
Elemwise{exp,no_inplace} [id C] ''   
 |Elemwise{log,no_inplace} [id D] ''   
   |lognormal_rv{0, (0, 0), floatX, False}.1 [id E] 'two'   
     |RandomStateSharedVariable(<RandomState(MT19937) at 0x2031C634C40>) [id F]
     |TensorConstant{[]} [id G]
     |TensorConstant{11} [id H]
     |Elemwise{add,no_inplace} [id I] ''   
     | |Elemwise{exp,no_inplace} [id A] ''   
     | |TensorConstant{0.1} [id J]
     |TensorConstant{0.01} [id K]
Elemwise{exp,no_inplace} [id L] ''   
 |Elemwise{log,no_inplace} [id M] ''   
   |lognormal_rv{0, (0, 0), floatX, False}.1 [id N] 'three'   
     |RandomStateSharedVariable(<RandomState(MT19937) at 0x2031C634340>) [id O]
     |TensorConstant{[]} [id P]
     |TensorConstant{11} [id Q]
     |Elemwise{add,no_inplace} [id R] ''   
     | |Elemwise{exp,no_inplace} [id C] ''   
     | |TensorConstant{0.1} [id S]
     |TensorConstant{0.01} [id T]
ipx[0].eval()  # RecursionError
Traceback
---------------------------------------------------------------------------

RecursionError                            Traceback (most recent call last)

<ipython-input-7-55b33d0fc512> in <module>
----> 1 ipx[0].eval()

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/graph/basic.py in eval(self, inputs_to_values)
    548         inputs = tuple(sorted(inputs_to_values.keys(), key=id))
    549         if inputs not in self._fn_cache:
--> 550             self._fn_cache[inputs] = function(inputs, self)
    551         args = [inputs_to_values[param] for param in inputs]
    552 

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/__init__.py in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
    335         # note: pfunc will also call orig_function -- orig_function is
    336         #      a choke point that all compilation must pass through
--> 337         fn = pfunc(
    338             params=inputs,
    339             outputs=outputs,

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
    473     extended_outputs = out_list + additional_outputs
    474 
--> 475     output_vars = rebuild_collect_shared(
    476         extended_outputs,
    477         in_variables,

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in rebuild_collect_shared(outputs, inputs, replace, updates, rebuild_strict, copy_inputs_over, no_default_updates)
    217         for v in outputs:
    218             if isinstance(v, Variable):
--> 219                 cloned_v = clone_v_get_shared_updates(v, copy_inputs_over)
    220                 cloned_outputs.append(cloned_v)
    221             elif isinstance(v, Out):

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in clone_v_get_shared_updates(v, copy_inputs_over)
     88             if owner not in clone_d:
     89                 for i in owner.inputs:
---> 90                     clone_v_get_shared_updates(i, copy_inputs_over)
     91 
     92                 clone_d[owner] = owner.clone_with_new_inputs(

... last 1 frames repeated, from the frame below ...

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in clone_v_get_shared_updates(v, copy_inputs_over)
     88             if owner not in clone_d:
     89                 for i in owner.inputs:
---> 90                     clone_v_get_shared_updates(i, copy_inputs_over)
     91 
     92                 clone_d[owner] = owner.clone_with_new_inputs(

RecursionError: maximum recursion depth exceeded
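The dprint above shows the problem directly: node [id A] is exp(log([id A])), i.e. its own input. This toy sketch (plain Python, not Aesara's actual classes) shows why any naive recursive traversal of such a self-referential graph, like the clone_v_get_shared_updates walk in the traceback, must blow the stack:

```python
import math

# Minimal stand-in for a symbolic graph node (not Aesara's real Apply/Variable).
class Node:
    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = list(inputs)

def eval_node(node):
    # Naive recursive evaluation with no cycle detection,
    # analogous to walking owner.inputs in the traceback above.
    return node.op(*(eval_node(i) for i in node.inputs))

a = Node(math.exp)
b = Node(math.log, [a])
a.inputs = [b]  # cycle: exp(log(exp(log(...)))), like [id A] in the dprint

try:
    eval_node(a)
except RecursionError as e:
    print(type(e).__name__)  # prints RecursionError
```

The first output should have been exp(log(one_rv)) for a fresh draw of "one", but the clone/replace step wired the exp output back into its own log input instead.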

(updated with the findings @ricardoV94 and I got so far)

@michaelosthege

(screenshot attachment)

@michaelosthege

@aseyboldt we tried different things without luck so far. Can you help us look into this?

@michaelosthege michaelosthege changed the title MemoryError in make_initial_point_expression Recursion problems in make_initial_point_expression Nov 10, 2021