
Recursion problems in make_initial_point_expression #5168

Closed
michaelosthege opened this issue Nov 10, 2021 · 7 comments · Fixed by #5170

Comments

@michaelosthege

michaelosthege commented Nov 10, 2021

Description of your problem

I have a rather big model that takes ages to compile the initial point function, eventually running into a MemoryError after ~45 minutes.

The model involves a latent GP and lots of subindexing.

UPDATE: It also happens without the GP.

Please provide a minimal, self-contained, and reproducible example.
#5168 (comment)

Please provide the full traceback.

Complete error traceback
c:\users\osthege\repos\pymc-main\pymc\sampling.py in sample(draws, step, init, n_init, initvals, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, **kwargs)
    480             # By default, try to use NUTS
    481             _log.info("Auto-assigning NUTS sampler...")
--> 482             initial_points, step = init_nuts(
    483                 init=init,
    484                 chains=chains,

c:\users\osthege\repos\pymc-main\pymc\sampling.py in init_nuts(init, chains, n_init, model, seeds, progressbar, jitter_max_retries, tune, initvals, **kwargs)
   2191     ]
   2192 
-> 2193     initial_points = _init_jitter(
   2194         model,
   2195         initvals,

c:\users\osthege\repos\pymc-main\pymc\sampling.py in _init_jitter(model, initvals, seeds, jitter, jitter_max_retries)
   2060     """
   2061 
-> 2062     ipfns = make_initial_point_fns_per_chain(
   2063         model=model,
   2064         overrides=initvals,

c:\users\osthege\repos\pymc-main\pymc\initial_point.py in make_initial_point_fns_per_chain(model, overrides, jitter_rvs, chains)
    102         # Only one function compilation is needed.
    103         ipfns = [
--> 104             make_initial_point_fn(
    105                 model=model,
    106                 overrides=overrides,

c:\users\osthege\repos\pymc-main\pymc\initial_point.py in make_initial_point_fn(model, overrides, jitter_rvs, default_strategy, return_transformed)
    165     overrides = convert_str_to_rv_dict(model, overrides or {})
    166 
--> 167     initial_values = make_initial_point_expression(
    168         free_rvs=model.free_RVs,
    169         rvs_to_values=model.rvs_to_values,

c:\users\osthege\repos\pymc-main\pymc\initial_point.py in make_initial_point_expression(free_rvs, rvs_to_values, initval_strategies, jitter_rvs, default_strategy, return_transformed)
    318     for i in range(n_variables):
    319         outputs = [initial_values_clone[i], initial_values_transformed_clone[i]]
--> 320         graph = FunctionGraph(outputs=outputs, clone=False)
    321         graph.replace_all(zip(free_rvs_clone[:i], initial_values), import_missing=True)
    322         initial_values.append(graph.outputs[0])

~\AppData\Local\Continuum\miniconda3\envs\CARenv\lib\site-packages\aesara\graph\fg.py in __init__(self, inputs, outputs, features, clone, update_mapping, memo, copy_inputs, copy_orphans)
    165 
    166         for output in outputs:
--> 167             self.import_var(output, reason="init")
    168         for i, output in enumerate(outputs):
    169             self.clients[output].append(("output", i))

~\AppData\Local\Continuum\miniconda3\envs\CARenv\lib\site-packages\aesara\graph\fg.py in import_var(self, var, reason, import_missing)
    335         # Imports the owners of the variables
    336         if var.owner and var.owner not in self.apply_nodes:
--> 337             self.import_node(var.owner, reason=reason, import_missing=import_missing)
    338         elif (
    339             var.owner is None

~\AppData\Local\Continuum\miniconda3\envs\CARenv\lib\site-packages\aesara\graph\fg.py in import_node(self, apply_node, check, reason, import_missing)
    379         # input set.  (The functions in the graph module only use the input set
    380         # to know where to stop going down.)
--> 381         new_nodes = io_toposort(self.variables, apply_node.outputs)
    382 
    383         if check:

~\AppData\Local\Continuum\miniconda3\envs\CARenv\lib\site-packages\aesara\graph\basic.py in io_toposort(inputs, outputs, orderings, clients)
   1160             else:
   1161                 todo.append(cur)
-> 1162                 todo.extend(i.owner for i in cur.inputs if i.owner)
   1163         return order
   1164 

MemoryError: 

Please provide any additional information below.

Interestingly I can do .eval() on the likelihoods in the model just fine.

Any ideas?

Versions and main components

  • PyMC/PyMC3 Version: latest main
  • Aesara/Theano Version: 2.2.6
  • Python Version: 3.8.10
  • Operating system: Windows 10
@ricardoV94

Can you try to compile the initial point function in FAST_RUN?

@michaelosthege

michaelosthege commented Nov 10, 2021

It doesn't even get to compilation; the error already occurs while creating the expression.

The model also involves a set_subtensor. Could it be that the dependency between these nodes is not properly resolved? The model_to_graphviz output is also missing the links from P0 and P_in_R to P.

mask_RinRID  # a boolean numpy array
P_in_R       # a matrix random variable
P0           # a vector pm.Data

P = at.empty(shape=(190, 5), dtype=aesara.config.floatX)
P = at.set_subtensor(P[mask_RinRID, :], P_in_R)
P = at.set_subtensor(P[~mask_RinRID, :], P0[~mask_RinRID, None])
P = pm.Deterministic("P", P)
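For reference, the masked-assignment pattern above corresponds to this plain-NumPy sketch. The shapes, seed, and mask here are made-up placeholders, not the actual model data:

```python
import numpy as np

# Hypothetical stand-ins for the symbolic variables in the snippet above:
# mask_RinRID selects which of the 190 rows come from P_in_R;
# the remaining rows are filled column-wise from the P0 vector.
rng = np.random.default_rng(42)
mask_RinRID = rng.random(190) < 0.5                  # boolean row mask
P_in_R = rng.normal(size=(int(mask_RinRID.sum()), 5))
P0 = rng.normal(size=190)

P = np.empty((190, 5))
P[mask_RinRID, :] = P_in_R
P[~mask_RinRID, :] = P0[~mask_RinRID, None]          # broadcast vector across columns
```

The `[:, None]`-style indexing turns the selected slice of `P0` into a column, so one scalar per row is broadcast across all 5 columns, matching what the symbolic set_subtensor does.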

Until yesterday I was running this model with PyMC main from c06c2f4, which was before #4983 was merged.
The set_subtensor code was already part of the model and worked there.
Other than making the model slightly more complex, that is the only relevant difference I can find right now.

@ricardoV94
Copy link
Member

ricardoV94 commented Nov 10, 2021

By the way, can you also do prior predictive sampling?

@michaelosthege

Yes, prior predictive sampling works just fine.

@michaelosthege

michaelosthege commented Nov 10, 2021

Here's an example.

import aesara
import pymc as pm

with pm.Model() as pmodel:
    one = pm.LogNormal("one", mu=0, sd=0.01)
    two = pm.LogNormal("two", mu=one + 0.1, sd=0.01)
    three = pm.LogNormal("three", mu=two + 0.1, sd=0.01)

The problematic cases are characterized by a nonsensical recursion in the first element of the resulting initial values graph:

ipx = pm.initial_point.make_initial_point_expression(
    free_rvs=pmodel.free_RVs,
    rvs_to_values=pmodel.rvs_to_values,
    initval_strategies=pmodel.initial_values,
    jitter_rvs={},
    default_strategy="prior",
    return_transformed=False,
)
aesara.dprint(ipx)
Elemwise{exp,no_inplace} [id A] ''   
 |Elemwise{log,no_inplace} [id B] ''   
   |Elemwise{exp,no_inplace} [id A] ''   
Elemwise{exp,no_inplace} [id C] ''   
 |Elemwise{log,no_inplace} [id D] ''   
   |lognormal_rv{0, (0, 0), floatX, False}.1 [id E] 'two'   
     |RandomStateSharedVariable(<RandomState(MT19937) at 0x2031C634C40>) [id F]
     |TensorConstant{[]} [id G]
     |TensorConstant{11} [id H]
     |Elemwise{add,no_inplace} [id I] ''   
     | |Elemwise{exp,no_inplace} [id A] ''   
     | |TensorConstant{0.1} [id J]
     |TensorConstant{0.01} [id K]
Elemwise{exp,no_inplace} [id L] ''   
 |Elemwise{log,no_inplace} [id M] ''   
   |lognormal_rv{0, (0, 0), floatX, False}.1 [id N] 'three'   
     |RandomStateSharedVariable(<RandomState(MT19937) at 0x2031C634340>) [id O]
     |TensorConstant{[]} [id P]
     |TensorConstant{11} [id Q]
     |Elemwise{add,no_inplace} [id R] ''   
     | |Elemwise{exp,no_inplace} [id C] ''   
     | |TensorConstant{0.1} [id S]
     |TensorConstant{0.01} [id T]
ipx[0].eval()  # RecursionError
Traceback
---------------------------------------------------------------------------

RecursionError                            Traceback (most recent call last)

<ipython-input-7-55b33d0fc512> in <module>
----> 1 ipx[0].eval()

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/graph/basic.py in eval(self, inputs_to_values)
    548         inputs = tuple(sorted(inputs_to_values.keys(), key=id))
    549         if inputs not in self._fn_cache:
--> 550             self._fn_cache[inputs] = function(inputs, self)
    551         args = [inputs_to_values[param] for param in inputs]
    552 

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/__init__.py in function(inputs, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input)
    335         # note: pfunc will also call orig_function -- orig_function is
    336         #      a choke point that all compilation must pass through
--> 337         fn = pfunc(
    338             params=inputs,
    339             outputs=outputs,

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in pfunc(params, outputs, mode, updates, givens, no_default_updates, accept_inplace, name, rebuild_strict, allow_input_downcast, profile, on_unused_input, output_keys)
    473     extended_outputs = out_list + additional_outputs
    474 
--> 475     output_vars = rebuild_collect_shared(
    476         extended_outputs,
    477         in_variables,

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in rebuild_collect_shared(outputs, inputs, replace, updates, rebuild_strict, copy_inputs_over, no_default_updates)
    217         for v in outputs:
    218             if isinstance(v, Variable):
--> 219                 cloned_v = clone_v_get_shared_updates(v, copy_inputs_over)
    220                 cloned_outputs.append(cloned_v)
    221             elif isinstance(v, Out):

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in clone_v_get_shared_updates(v, copy_inputs_over)
     88             if owner not in clone_d:
     89                 for i in owner.inputs:
---> 90                     clone_v_get_shared_updates(i, copy_inputs_over)
     91 
     92                 clone_d[owner] = owner.clone_with_new_inputs(

... last 1 frames repeated, from the frame below ...

~/Documents/Projects/pymc3-venv/lib/python3.8/site-packages/aesara/compile/function/pfunc.py in clone_v_get_shared_updates(v, copy_inputs_over)
     88             if owner not in clone_d:
     89                 for i in owner.inputs:
---> 90                     clone_v_get_shared_updates(i, copy_inputs_over)
     91 
     92                 clone_d[owner] = owner.clone_with_new_inputs(

RecursionError: maximum recursion depth exceeded
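The dprint above shows the problem directly: node [id A] is exp(log([id A])), i.e. its own input. This toy sketch (plain Python, not Aesara's actual classes) shows why any naive recursive traversal of such a self-referential graph, like the clone_v_get_shared_updates walk in the traceback, must blow the stack:

```python
import math

# Minimal stand-in for a symbolic graph node (not Aesara's real Apply/Variable).
class Node:
    def __init__(self, op, inputs=()):
        self.op = op
        self.inputs = list(inputs)

def eval_node(node):
    # Naive recursive evaluation with no cycle detection,
    # analogous to walking owner.inputs in the traceback above.
    return node.op(*(eval_node(i) for i in node.inputs))

a = Node(math.exp)
b = Node(math.log, [a])
a.inputs = [b]  # cycle: exp(log(exp(log(...)))), like [id A] in the dprint

try:
    eval_node(a)
except RecursionError as e:
    print(type(e).__name__)  # prints RecursionError
```

The first output should have been exp(log(one_rv)) for a fresh draw of "one", but the clone/replace step wired the exp output back into its own log input instead.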

(updated with the findings @ricardoV94 and I got so far)

@michaelosthege

(screenshot attachment)

@michaelosthege

@aseyboldt we tried different things without luck so far. Can you help us look into this?

@michaelosthege michaelosthege changed the title MemoryError in make_initial_point_expression Recursion problems in make_initial_point_expression Nov 10, 2021