
report [and proposed fix] for autoreload error caused by explicit module.super(ChildClass, self).__init__() #1670

Closed
mkarikom opened this issue Sep 13, 2022 · 3 comments

mkarikom (Contributor) commented Sep 13, 2022

I noticed that autoreload fails when updating class methods while running api_overview.ipynb, and that this can be fixed by removing the explicit child-class and instance arguments from the super().__init__() call. In my case, I was debugging an updated TrainingPlan, but I suspect the same fix will work for other classes as well.
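
For illustration, here is a condensed before/after sketch of the change in TrainingPlan.__init__ (the import, parent class, and trimmed-down signature are my placeholders; only the two super() lines reflect the actual code, which appears in the tracebacks below):

    import pytorch_lightning as pl  # assumption: TrainingPlan subclasses pl.LightningModule

    class TrainingPlan(pl.LightningModule):
        def __init__(self, module, **loss_kwargs):
            # before (breaks once %autoreload 2 has swapped in edited code):
            # super(TrainingPlan, self).__init__()
            # after (the proposed fix):
            super().__init__()
            self.module = module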

I can submit a PR for this but wanted to make sure it wasn't already being addressed somehow.

1. add a cell in api_overview.ipynb with:
%load_ext autoreload
%autoreload 2
2. instantiate and train the model, for example (a minimal sketch follows):
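
A minimal sketch of what this step looks like, assuming the SCVI model used in the tutorial (the dataset and setup call here are placeholders chosen to keep the example self-contained; only model.train(max_epochs=4) is taken from the tracebacks below):

    import scvi

    adata = scvi.data.synthetic_iid()     # placeholder dataset; the tutorial loads its own AnnData
    scvi.model.SCVI.setup_anndata(adata)  # register the AnnData with scvi-tools
    model = scvi.model.SCVI(adata)
    model.train(max_epochs=4)             # the call that appears in the tracebacks below
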
3. introduce a bug (e.g. in train.TrainingPlan, which the already-instantiated model uses) like:
    def training_step(self, batch, batch_idx, optimizer_idx=0):
        if "kl_weight" in self.loss_kwargs:
            self.loss_kwargs.update({"kl_weight": self.kl_weight})
        _, _, scvi_loss = self.forward(batch, loss_kwargs=self.loss_kwargs)
        self.log("train_loss", scvi_loss.loss_not_defined, on_epoch=True)  # this will cause an error
        self.compute_and_log_metrics(scvi_loss, self.train_metrics, "train")
        return scvi_loss.loss

which tries to log an attribute that does not exist on the returned LossRecorder and gives the following error:

/workspaces/scvi_work_pip/scvi-tools/scvi/model/base/_training_mixin.py:67: UserWarning: max_epochs=4 is less than n_epochs_kl_warmup=400. The max_kl_weight will not be reached during training.
  warnings.warn(
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
Epoch 1/4:   0%|          | 0/4 [00:00<?, ?it/s]/workspaces/scvi_work_pip/scvi-tools/scvi/distributions/_negative_binomial.py:64: UserWarning: Specified kernel cache directory could not be created! This disables kernel caching. Specified directory is /home/vscode/.cache/torch/kernels. This warning will appear only once per process. (Triggered internally at  ../aten/src/ATen/native/cuda/jit_utils.cpp:860.)
  + torch.lgamma(x + theta)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/workspaces/scvi_work_pip/scvi-tools/docs/tutorials/notebooks/api_overview.ipynb Cell 22 in <cell line: 1>()
----> 1 model.train(max_epochs=4)

File /workspaces/scvi_work_pip/scvi-tools/scvi/model/base/_training_mixin.py:142, in UnsupervisedTrainingMixin.train(self, max_epochs, use_gpu, train_size, validation_size, batch_size, early_stopping, plan_kwargs, **trainer_kwargs)
    131 trainer_kwargs[es] = (
    132     early_stopping if es not in trainer_kwargs.keys() else trainer_kwargs[es]
    133 )
    134 runner = TrainRunner(
    135     self,
    136     training_plan=training_plan,
   (...)
    140     **trainer_kwargs,
    141 )
--> 142 return runner()

File /workspaces/scvi_work_pip/scvi-tools/scvi/train/_trainrunner.py:74, in TrainRunner.__call__(self)
     71 if hasattr(self.data_splitter, "n_val"):
     72     self.training_plan.n_obs_validation = self.data_splitter.n_val
---> 74 self.trainer.fit(self.training_plan, self.data_splitter)
     75 self._update_history()
     77 # data splitter only gets these attrs after fit

File /workspaces/scvi_work_pip/scvi-tools/scvi/train/_trainer.py:188, in Trainer.fit(self, *args, **kwargs)
...
--> 327 self.log("train_loss", scvi_loss.loss_not_defined, on_epoch=True)
    328 self.compute_and_log_metrics(scvi_loss, self.train_metrics, "train")
    329 return scvi_loss.loss

AttributeError: 'LossRecorder' object has no attribute 'loss_not_defined'
4. now fix the bug, which would normally reload the class method but instead causes the following error:
/workspaces/scvi_work_pip/scvi-tools/scvi/model/base/_training_mixin.py:67: UserWarning: max_epochs=4 is less than n_epochs_kl_warmup=400. The max_kl_weight will not be reached during training.
  warnings.warn(
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/workspaces/scvi_work_pip/scvi-tools/docs/tutorials/notebooks/api_overview.ipynb Cell 22 in <cell line: 1>()
----> 1 model.train(max_epochs=4)

File /workspaces/scvi_work_pip/scvi-tools/scvi/model/base/_training_mixin.py:128, in UnsupervisedTrainingMixin.train(self, max_epochs, use_gpu, train_size, validation_size, batch_size, early_stopping, plan_kwargs, **trainer_kwargs)
    119 _check_warmup(plan_kwargs, max_epochs, n_cells, batch_size)
    121 data_splitter = DataSplitter(
    122     self.adata_manager,
    123     train_size=train_size,
   (...)
    126     use_gpu=use_gpu,
    127 )
--> 128 training_plan = TrainingPlan(self.module, **plan_kwargs)
    130 es = "early_stopping"
    131 trainer_kwargs[es] = (
    132     early_stopping if es not in trainer_kwargs.keys() else trainer_kwargs[es]
    133 )

File /workspaces/scvi_work_pip/scvi-tools/scvi/train/_trainingplans.py:154, in TrainingPlan.__init__(self, module, lr, weight_decay, eps, optimizer, n_steps_kl_warmup, n_epochs_kl_warmup, reduce_lr_on_plateau, lr_factor, lr_patience, lr_threshold, lr_scheduler_metric, lr_min, max_kl_weight, min_kl_weight, **loss_kwargs)
    133 def __init__(
    134     self,
    135     module: BaseModuleClass,
   (...)
    152     **loss_kwargs,
...
--> 154     super(TrainingPlan, self).__init__()
    155     self.module = module
    156     self.lr = lr

TypeError: super(type, obj): obj must be an instance or subtype of type
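
For context on why the explicit two-argument form breaks under autoreload (my best understanding, illustrated with a standalone reproduction rather than the real autoreload machinery): after %autoreload 2 swaps in the edited code, the class object that the name TrainingPlan resolves to inside __init__ and the class of the instance being constructed can end up as two different objects, and super(type, obj) requires obj to be an instance or subtype of type. The class names below are hypothetical stand-ins:

    class TrainingPlanBeforeReload:   # stand-in for the class object before the reload
        pass

    class TrainingPlanAfterReload:    # stand-in for the rebound class object after the reload
        pass

    obj = TrainingPlanBeforeReload()
    try:
        super(TrainingPlanAfterReload, obj)
    except TypeError as err:
        print(err)  # super(type, obj): obj must be an instance or subtype of type

The zero-argument super() resolves the parent class from the __class__ cell created at class-definition time rather than by looking the name up in the module namespace, which is presumably why the implicit form survives the reload.
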
5. update train/_trainingplans.py:154 so that train.TrainingPlan.__init__() calls the parent class's __init__() implicitly, like:
        super().__init__()
6. without re-instantiating the model, re-run model.train(max_epochs=4), and notice that the fix from step 5 is reloaded successfully:
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]
Epoch 4/4: 100%|██████████| 4/4 [00:14<00:00,  3.64s/it, loss=297, v_num=1]

Versions:

'0.17.2'

mkarikom added the bug label on Sep 13, 2022
mkarikom changed the title from "bug and possible fix for autoreload error caused by explicit module.super(ChildClass, self).__init__()" to "report [and proposed fix] for autoreload error caused by explicit module.super(ChildClass, self).__init__()" on Sep 13, 2022
adamgayoso (Member) commented:
I'm not sure I understand why this is the fix, but I do agree we should use super().__init__() everywhere. A PR replacing all super calls codebase-wide with just super() would be appreciated.
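
One throwaway way to locate the explicit two-argument calls (a sketch, not part of scvi-tools; assumes the repository root as the working directory and uses an approximate pattern):

    import pathlib
    import re

    # Match calls of the form super(SomeClass, self)
    pattern = re.compile(r"super\(\s*\w+\s*,\s*self\s*\)")

    for path in pathlib.Path("scvi").rglob("*.py"):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if pattern.search(line):
                print(f"{path}:{lineno}: {line.strip()}")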

mkarikom (Contributor, Author) commented Sep 13, 2022

done, please see #1671 and #76

adamgayoso (Member) commented:
This was closed by #1671
